Open Collections

UBC Theses and Dissertations

The structure and evolution of the bovine prothrombin gene Irwin, David Michael 1986


THE STRUCTURE AND EVOLUTION OF THE BOVINE PROTHROMBIN GENE

by

DAVID MICHAEL IRWIN
B.Sc.(Hons.), University Of Guelph, 1982

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
Department Of Biochemistry (Genetics Programme)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
December 1986
© David Michael Irwin, 1986

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Biochemistry
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

Date: 16 December 1986

Abstract

The gene for bovine prothrombin is 15.6 Kbp in length and encodes an mRNA of 2025 nucleotides plus a poly(A) tail. The prothrombin gene is composed of 14 exons separated by 13 introns, all of which vary in size. The positions of the introns found within the prothrombin gene provide some insight into the evolution of prothrombin and provide evidence on the origin of introns. Within the activation peptide and leader sequence of precursor prothrombin, some of the introns appear to separate structural and functional protein domains.
Introns are found to separate certain domains, including the pre-peptide, the pro-peptide and Gla region, and each of the kringles. This organization of exons may reflect the evolution of the prothrombin gene as the result of the fusion of exon(s) containing protein domains by exon shuffling. The activation peptide appears to be constructed from four domains: a pre-peptide, a pro-peptide and Gla region, and two kringles.

On comparison of the exon organization of the serine protease domain of prothrombin and other serine protease genes, it was found that none of the introns of the prothrombin gene are shared with any of the other serine protease genes. This absence of shared introns is in contrast to the shared introns found for the shared domains of the activation peptide and leader. The positions of the introns of the serine protease domain of serine protease genes do not appear to reflect the evolution of the serine protease from protein domains, but rather the result of intron insertion into the serine protease coding regions. Intron insertion would also explain the origin of the few introns of the activation peptide that do not appear to separate protein domains.

In conclusion, the organization of the exons and introns of the gene for prothrombin reflects both the origin of introns by insertion events, and the use of introns in exon shuffling. The insertion of introns, and the subsequent possibility of exon shuffling, appear to have been essential for the evolution of the multidomainal proteins, such as prothrombin, which are essential for vertebrate life.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgement
List of Abbreviations

Introduction
A. Physiology of Blood Coagulation
   1. Hemostasis
   2. Discovery of Coagulation Factors
   3. Post-Translational Modifications
   4. Initiation of Blood Coagulation
   5. Non-Enzymatic Functions of Thrombin
B. Biochemistry of Blood Coagulation
   1. Fibrin Clot Formation
   2. Coagulation Factors as Zymogens
   3. Two Pathways of Blood Coagulation
C. Structure of the Prothrombin Molecule
   1. Structure of Plasma Prothrombin
   2. Post-Translational Modifications
   3. Precursor Prothrombin
   4. Gamma Carboxyglutamic Acid Domain
   5. Kringle Domain
   6. Thrombin Domain
   7. Three Dimensional Structure
D. Functions of Prothrombin
   1. Action on Fibrinogen
   2. Other Enzymatic Functions
   3. Non-Enzymatic Functions
E. Blood Coagulation in Non-Mammals
   1. Blood Coagulation in the Vertebrates
   2. Blood Coagulation in the Invertebrates
F. Structure of Serine Proteases
   1. Three Dimensional Structure
   2. Limited Substrate Specificity of the Coagulation Factors
G. Homologies Within Serine Protease Zymogens
   1. Families of Serine Proteases
   2. Roles of Serine Proteases in Physiology
   3. Homologous Domains Within the Activation Peptide of Prothrombin
   4. Homologous Domains Found in Serine Protease Zymogens Other than Prothrombin
H. Structure of Eukaryotic Structural Genes
   1. The Gene
   2. Exons
   3. Introns
   4. Promoters
   5. Transcription and Processing
I. Evolution of Amino Acid and DNA Sequence
   1. Molecular Clock
   2. Gene Duplications
J. Evolution of the Structure of Proteins and Genes
   1. Internal Duplications Within a Gene
   2. Gene Fusions
K. Function of Introns
   1. Distribution
   2. Position of Introns Within Genes and Proteins
   3. Intron Sliding
L. Origin of Introns
   1. Metabolic Enzymes
   2. The Triose-Phosphate Isomerase Gene
   3. Intron Mobility
   4. Models of Intron Origin
M. Serine Protease Genes
   1. Sequence of Serine Proteases
   2. Genes for Serine Proteases
N. The Evolution of the Serine Protease Genes

Materials and Methods
A. Materials
B. Strains, Vectors, and Media
   1. Bacterial Strains
   2. Vectors
   3. Media
C. Basic Molecular Biology Techniques
D. Isolation of DNA
   1. Isolation of Plasmid DNA
   2. Isolation of Phage DNA
   3. Genomic DNA Isolation
E. DNA Subcloning
   1. Production of DNA Fragments for Ligation
   2. Ligation of DNA into pUC13 or M13 Vectors
   3. Transformation of DNA into Bacteria
F. Isolation of RNA
   1. Isolation of Total Cellular RNA
   2. Isolation of Poly A+ RNA
G. Labeling of DNA
   1. Nick Translation
   2. Klenow Labeling
H. Blot Hybridization
   1. Genomic Southern Blot Analysis
   2. Southern Blot Analysis to Detect Repetitive DNA
   3. Northern Blot Analysis
I. DNA Sequence Analysis
   1. Construction of M13 Clones
   2. Screening of M13 Clones
   3. M13 DNA Isolation
   4. DNA Sequencing
   5. Computer Analysis of DNA Sequence Data
J. Heteroduplex Analysis
K. Screening Phage Libraries
   1. Plating Phage Libraries
   2. Screening of Phage Filters
L. Screening Plasmid Libraries
M. Mapping the End of a mRNA Transcript
   1. Nuclease S1 Mapping
   2. Primer Extension

Results
A. Isolation of the Bovine Prothrombin Gene
   1. Southern Blot Analysis of the Bovine Prothrombin Gene
   2. Cloning of the Bovine Prothrombin Gene
   3. Analysis of the Size of the Bovine Prothrombin mRNA
B. Heteroduplex Mapping
   1. Method
   2. Exons and Introns
   3. Repetitive DNA
C. DNA Sequence Analysis of the Bovine Prothrombin Gene
D. Mapping the Site of mRNA Initiation
   1. Nuclease S1 Analysis
   2. Primer Extension
E. Mapping Repetitive DNA
F. Isolation of a Human Prothrombin cDNA
G. Partial DNA Sequence of pIIH13
H. Isolation of the Human Prothrombin Gene
   1. Isolation of Genomic Clones
   2. Partial DNA Sequence Analysis of the Human Prothrombin Gene
I. Isolation of cDNA Clones for Chicken Prothrombin
   1. Conditions of Screening
   2. DNA Sequence of pCIII
J. Isolation of Longer Chicken Prothrombin cDNAs
K. Size Analysis of Chicken Prothrombin mRNA

Discussion
A. Characterization of the Bovine Prothrombin Gene
   1. Isolation of the Bovine Prothrombin Gene
   2. Size Analysis of the Bovine Prothrombin mRNA
   3. Sequence of the Bovine Prothrombin Gene
   4. Site of mRNA Initiation
   5. Intron Positions in the Coding Region
B. Characterization of a Human Prothrombin cDNA
C. Characterization of cDNAs for Chicken Prothrombin
   1. Sequence of the Chicken Prothrombin cDNAs
   2. Alternative Sites of Polyadenylylation
D. Comparison of Prothrombin Sequences
   1. Conserved Sequences
   2. Deletions/Insertions
   3. mRNA Structure
E. Comparison of the Bovine and Human Prothrombin Genes
F. Comparison of Serine Protease Genes
   1. Leader and Gla Region
   2. Kringle Region
   3. Serine Protease Region
G. Origin of Introns and Exon Shuffling
   1. Origin of Introns
   2. Exon Shuffling
H. Evolution of the Active Site Serine Codon
I. Model of the Evolution of the Vitamin K-Dependent Coagulation Factors
J. Evolution of the Blood Coagulation System

Literature Cited

List of Tables

I. DNA Sequencing Mixes
II. A Comparison of the Sizes of Exons Determined Both by DNA Sequence Analysis and Heteroduplex Analysis
III. A Comparison of the Sizes of Introns Determined Both by DNA Sequence Analysis and Heteroduplex Analysis
IV. Length and Location of Inverted Repeat Sequences Observed Within the Introns of the Bovine Prothrombin Gene
V. Nucleotide Sequences at the Intron-Exon Junctions of the Bovine Prothrombin Gene
VI. Frequencies of Nucleotides at Intron-Exon Junctions

List of Figures

1. The Blood Coagulation Cascade
2. The Prothrombin Molecule
3. Homologies in Coagulation Factor Zymogens
4. Southern Blot Analysis of the Bovine Prothrombin Gene
5. Restriction Map of the Bovine Prothrombin Gene
6. Northern Blot Analysis of Bovine Prothrombin mRNA
7. Heteroduplex Analysis of the Bovine Prothrombin Gene
8. Partial Restriction Map and Sequencing Strategy for the Bovine Prothrombin Gene
9. Partial DNA Sequence of the Bovine Prothrombin Gene
10. Nuclease S1 Mapping of the Prothrombin mRNA
11. Primer Extension Analysis of Prothrombin mRNA
12. Southern Blot Analysis of Repetitive DNA Within the Bovine Prothrombin Gene
13. Map of Repetitive DNA in the Bovine Prothrombin Gene
14. Restriction Endonuclease Map of the Human Prothrombin cDNAs
15. Nucleotide Sequence of the 5' End of pIIH13
16. Restriction Map of the Human Prothrombin Gene
17. Southern Blot Analysis of the Human Prothrombin Gene
18. DNA Sequence of Chicken Prothrombin cDNAs
19. Restriction Map of Chicken Prothrombin cDNAs
20. Northern Blot Analysis of Chicken Prothrombin mRNA
21. Introns in the Prothrombin Molecule
22. Alignment of the Bovine and Human Prothrombin mRNA Sequences
23. Homologies in the Prothrombin Sequences
24. Comparison of the Organization of the Exons of the Leader Peptide and Gla Domain
25. Comparison of the Organization of the Exons of the Kringle Domain
26. Comparison of the Organization of the Exons of the Serine Protease Domain
27. A Model for the Evolution of the Vitamin K-Dependent Coagulation Factors

Acknowledgement

I would like to thank my supervisor, Dr. Ross MacGillivray, for providing the space and opportunity for me to do this work. I also thank the members of my supervisory committee, Drs. Caroline Astell, Tom Grigliatti, Rob McMaster, and Mike Smith, for their helpful comments and suggestions. I thank Drs. Kevin Ahern and George Pearson of Oregon State University for the heteroduplex analysis of the bovine prothrombin gene, which aided my work with the sequencing of the gene. I thank all the members of the lab, especially Enriqueta Guinto, Marion Fung, Debbie Cool, and Colin Hay, for the many helpful suggestions, comments, methods, and materials. Thank you also to all the members of the Biochemistry department, especially Jeff Leung and Craig Newton, who have made my stay here very enjoyable. I would like to thank Drs. T. Maniatis, F. Rottman, S. Orkin, and T. Kirshgessner for providing genomic and cDNA libraries used to isolate some of the clones described in this thesis. I would like to acknowledge NSERC and the University Graduate Fellowship committee for their financial support.
List of Abbreviations

A        Adenosine
ATP      Adenosinetriphosphate
bp       Base Pair(s)
BSA      Bovine Serum Albumin
C        Cytidine
Ca2+     Calcium ions
dNTP     Deoxyribonucleosidetriphosphate
ddNTP    Dideoxyribonucleosidetriphosphate
DNA      Deoxyribonucleic Acid
DNase    Deoxyribonuclease
DTT      Dithiothreitol
EDTA     Ethylenediaminetetraacetic Acid
EtBr     Ethidium Bromide
G        Guanosine
Gla      γ-Carboxyglutamic Acid
GuHCl    Guanidine Hydrochloride
hnRNA    Heterogeneous Nuclear RNA
IPTG     Isopropyl-β-D-Thiogalactopyranoside
Kbp      Kilobase Pair(s)
Krpm     Thousand Revolutions Per Minute
LB       Luria Broth
mA       Milliamps
min      minute(s)
mRNA     Messenger RNA
N        Any Nucleoside (G, A, T, or C)
OD       Optical Density
pfu      Plaque forming unit
R        Purine (A or G)
RNA      Ribonucleic Acid
RNase    Ribonuclease
rRNA     Ribosomal RNA
TEMED    N,N,N',N'-Tetramethylethylenediamine
Tris     Tri(hydroxymethyl)aminomethane
tRNA     Transfer RNA
U        Uridine
UV       Ultra Violet
V        Volts
T        Thymidine
W        Watts
X-Gal    5-Bromo-4-Chloro-3-Indolyl-β-D-Galactopyranoside
Y        Pyrimidine (T or C)

INTRODUCTION

A. PHYSIOLOGY OF BLOOD COAGULATION

1. Hemostasis

In the vertebrates, a closed circulatory system is essential for nutrient transport, waste removal, hormonal regulation, immune response, and other physiological functions. This closed system of blood vessels (arteries, veins, and capillaries) is prone to injuries which lead to loss of blood fluid. Several interacting physiological mechanisms or systems exist to maintain blood volume and flow, a process known as hemostasis. In mammals, four systems interact to stop fluid loss and repair damage in response to injury (Guyton, 1977). These four systems or mechanisms are: (1) vascular contraction upon injury reduces blood flow in the damaged vessel, and thus limits fluid loss, (2) platelet aggregation results in the formation of a platelet plug that acts as a physical blockage to fluid loss (in non-mammalian vertebrates, a nucleated blood cell replaces the mammalian platelet (Engle and Woods, 1960)); this platelet plug is often enough to prevent fluid loss from small blood vessels, (3) blood coagulation results in the formation of a fibrin blood clot which acts as a mechanical block to fluid loss, and (4) invasion of the blood clot by fibrous tissue and dissolution of the fibrin clot during cell and vessel wall repair (Guyton, 1977).

2. Discovery Of Coagulation Factors

Blood coagulation proteins represent only one component of hemostasis (Jackson and Nemerson, 1980). The complete hemostatic mechanism is far from understood, with the blood coagulation system perhaps the best, but still incompletely, understood process (Davie et al., 1979; Jackson and Nemerson, 1980). Elucidation of the process of blood coagulation has been slow and complicated (MacFarlane, 1960; Ratnoff, 1977; Zur and Nemerson, 1981).

It was found in the mid 19th century that an extract from tissue (especially brain) was a potent activator of blood coagulation (see Ratnoff, 1977; Zur and Nemerson, 1981 for historical reviews). These early experiments led to the first model for blood coagulation, in which a tissue factor would convert prothrombin to thrombin in the presence of calcium ions (Ca2+). Thrombin could then convert fibrinogen to fibrin. Almost immediately this model was shown to be inadequate as it could not explain many of the known bleeding disorders. Indeed, the majority of the blood coagulation factors were identified by description of their absence in patients with bleeding tendencies (Bloom, 1981). These deficiencies led to the discovery of factor V (Quick, 1943), factor VII (Owen and Bollman, 1948), factor VIII (Patek and Taylor, 1937; Brinkhous, 1947; Quick, 1947), factor IX (Biggs et al., 1952), factor X (Telfer et al., 1956; Hougie et al., 1957), factor XI (Rosenthal et al., 1953), and factor XII (Ratnoff and Colopy, 1955). With the discovery of these factors, a cascade, or waterfall, model of coagulation was developed (MacFarlane, 1964; Davie and Ratnoff, 1964) (see Fig. 1). This coagulation cascade has had further modifications (see below) due to the discovery of additional factors. Some of these proteins were initially characterized biochemically, and subsequently found to be associated with specific hematological disorders, e.g. protein C (Griffin et al., 1981) and protein S (Comp et al., 1984; Schwarz et al., 1984).

3. Post-Translational Modifications

Nutritional studies in the chicken led to the discovery of another aspect of the blood coagulation system. Specific defined diets fed to chicks led to a bleeding tendency, and vitamin K was postulated to be the missing essential vitamin (Dam, 1935). The bleeding tendency was shown to be due to the production of an abnormal prothrombin (Dam et al., 1936).
Subsequently, it has been shown that vitamin K is essential in both the mammals and the birds (Suttie, 1985), and for the formation of normal prothrombin as well as factors VII, IX, X, and proteins C, S, and Z (Suttie, 1985). It has been demonstrated that vitamin K is a necessary cofactor in the formation of γ-carboxyglutamic acid (Gla) residues found at the amino-terminal regions of the vitamin K-dependent coagulation factors (Suttie, 1985). The Gla residues are formed by the carboxylation of specific glutamic acid residues by a vitamin K-dependent carboxylase (Suttie, 1985). Coumarin drugs, e.g. warfarin, inhibit the carboxylation reaction and thus impair blood coagulation.

Figure 1: The Blood Coagulation Cascade

Outline of the mammalian blood coagulation cascade with the intrinsic pathway (left) and extrinsic pathway (right) converging at the activation of factor X to factor Xa, and ending with the formation of the insoluble fibrin clot. Bars represent the polypeptide chains (proportional to polypeptide chain length) with molecular weights indicated below. Intramolecular disulphide bridges are indicated by lines between the two chains. X-linked fibrin represents the crosslinked fibrin clot formed by the action of factor XIIIa. (From Neurath, 1984).

The γ-carboxyglutamic acid residues allow the vitamin K-dependent coagulation factors to form Ca2+ bridges to phospholipid membranes (Suttie, 1985). These phospholipid membranes are probably provided by the platelets (Suttie and Jackson, 1977), providing an example of the interaction between the different physiological processes in hemostasis. The interaction of the vitamin K-dependent coagulation factors with phospholipid in the presence of Ca2+ was also found to be dependent on protein cofactors (factors V and VIII, and tissue factor). An absence of these factors would also impair coagulation (Bloom, 1981).

4. Initiation Of Blood Coagulation

Initially, tissue factor was thought to be essential for initiation of blood coagulation (see Ratnoff, 1977; Zur and Nemerson, 1981 for historical reviews). However, it was later observed that coagulation could be initiated without tissue factor (Ratnoff and Colopy, 1955). An intrinsic initiation system appeared to exist, which led to the development of the idea of two pathways of initiation of coagulation, the intrinsic and the extrinsic (as discussed later). The intrinsic initiation system is still not completely understood but does require factor XII, prekallikrein, high molecular weight kininogen and a negatively charged surface (Griffin, 1981).

5. Non-Enzymatic Functions Of Thrombin

Prothrombin was found to have functions other than the conversion of soluble fibrinogen to insoluble fibrin (see next section). It was discovered that thrombin (activated prothrombin) interacted with platelets and endothelial cells, resulting in the formation of activated platelets and inducing wound repair, thus aiding hemostasis (Fenton, 1981; Fenton and Bing, 1986). Thrombin is also a chemotactic agent attracting some cells of the immune system, e.g. neutrophils (Fenton, 1981; Fenton and Bing, 1986), which may function to prevent entry of foreign material by way of the injured blood vessel. The mechanisms of many of these additional functions of thrombin are not completely understood (see Fenton, 1981 for a review).

B. BIOCHEMISTRY OF BLOOD COAGULATION

1. Fibrin Clot Formation

Formation of the fibrin blood clot requires the participation of at least 14 plasma proteins, a tissue protein, phospholipid membranes, Ca2+, and platelets (Davie et al., 1979; Jackson and Nemerson, 1980). It is the formation of the fibrin blood clot that is the best characterized and understood process of hemostasis (Davie et al., 1979; Jackson and Nemerson, 1980). The blood clot is formed by the polymerization of fibrin monomers into a network which incorporates the platelet plug, thrombin and other proteins and cells into a mechanical plug to prevent fluid loss (Doolittle, 1984). Fibrin is formed by limited proteolysis of fibrinogen to fibrin as indicated in Fig. 1 (Doolittle, 1984). Fibrinogen is a plasma protein of 340,000 molecular weight comprised of 6 polypeptide chains: 2 Aα, 2 Bβ, and 2 γ chains (Jackson and Nemerson, 1980; Doolittle, 1984). Thrombin cleaves four peptide bonds in each fibrinogen monomer, one in each of the Aα and Bβ chains, releasing 2 fibrinopeptides A, 2 fibrinopeptides B, and fibrin monomer (Doolittle, 1984). The fibrin monomers can then polymerize spontaneously to form insoluble fibrin polymers (Doolittle, 1984). The fibrin network is further strengthened by the formation of covalent crosslinks between monomers by the transglutaminase factor XIIIa (see Fig. 1) (Curtis, 1981). Factor XIII is found in plasma as an inactive protein that is activated to factor XIIIa by thrombin (Davie et al., 1979; Jackson and Nemerson, 1980).

2. Coagulation Factors As Zymogens

Many of the enzymatic steps of the blood coagulation cascade consist of the conversion of inactive zymogens to active serine proteases, such as the activation of prothrombin to thrombin by factor Xa (see Fig. 1) (Davie et al., 1979; Jackson and Nemerson, 1980). As shown in Fig. 1, the zymogen forms of the coagulation factors VII, IX, X, XI, XII, and prothrombin are activated to the corresponding serine proteases (factors VIIa, IXa, Xa, XIa, XIIa, and thrombin, respectively) by limited proteolysis (Davie et al., 1979; Jackson and Nemerson, 1980). Many of these proteolytic reactions require a protein cofactor such as factor V, factor VIII, high molecular weight kininogen, or tissue factor (Davie et al., 1979; Jackson and Nemerson, 1980).
In addition to the protein cofactors, the vitamin K-dependent coagulation proteins (factors VII, IX, X, and prothrombin in Fig.1) also require phospholipid and Ca²⁺ (Davie et al., 1979; Jackson and Nemerson, 1980). As discussed earlier, the vitamin K-dependent coagulation factors interact with phospholipid through Ca²⁺ bridges with γ-carboxyglutamic acid residues found at the amino-terminal regions of these proteins (Suttie, 1985). In the vitamin K-dependent coagulation factors, all glutamate residues in the first 45 residues of the amino-terminus of these proteins are γ-carboxylated (Jackson and Nemerson, 1980).

3.  Two Pathways Of Blood Coagulation

Blood coagulation is initiated by either or both of the two pathways shown in Fig.1 (Davie et al., 1979; Jackson and Nemerson, 1980). The extrinsic pathway is initiated by the release of tissue factor (the extrinsic factor) from damaged tissue (Davie et al., 1979; Jackson and Nemerson, 1980). Tissue factor, as a protein cofactor, accelerates the activation of factor X by factor VIIa (or VII) (see Fig.1). Factor VII can be activated by many of the coagulation factors including factors XIIa, Xa, and thrombin (Jackson and Nemerson, 1980). Factor VII appears to have partial proteolytic activity without activation, but is unable to initiate blood coagulation in the absence of tissue factor (Jackson and Nemerson, 1980). Upon injury, release of tissue factor will initiate blood coagulation; however, the production of factor VIIa will increase and sustain the coagulation response (Jackson and Nemerson, 1980).

The intrinsic pathway (see Fig.1) differs in that the protease responsible for the first proteolytic cleavage necessary for the initiation of coagulation has not been identified (Jackson and Nemerson, 1980; Griffin, 1981). Factors XII and XI, prekallikrein and high molecular weight kininogen participate in the initial events, but their individual roles are not completely understood (Davie et al., 1979; Jackson and Nemerson, 1980; Griffin, 1981). Initiation is induced by the contact of a plasma factor(s) (intrinsic) with a negatively-charged surface created by injury to the vessel wall (Griffin, 1981). Once initiated, the cascade (Fig.1) can proceed to fibrin formation to cover the exposed surface (Davie et al., 1979; Jackson and Nemerson, 1980).

In the past twenty-five years, most of the coagulation factors have been purified from plasma, allowing characterization of their structures and functions (Davie et al., 1979; Jackson and Nemerson, 1980; for comparison see MacFarlane, 1960). Recently, the amino acid sequences of the plasma and precursor forms of the coagulation factors have become available due to advances in molecular biology techniques.

Two important features of the blood coagulation cascade are illustrated by Fig.1. First, the existence of a cascade allows rapid amplification of the response to injury (MacFarlane, 1964; Davie and Ratnoff, 1964) because each activated zymogen is able to activate catalytically a large number of zymogens in the next step of the cascade (see Fig.1) (Davie et al., 1979; Jackson and Nemerson, 1980). This amplification allows the rapid response to injury essential for hemostasis (Jackson and Nemerson, 1980). Secondly, because a large number of different protease inhibitors are found in plasma (Jackson and Nemerson, 1980), the multiple steps provide a large number of opportunities to regulate the cascade (Davie et al., 1979; Jackson and Nemerson, 1980). This prevents coagulation beyond the site of injury and allows termination of coagulation once the mechanical plug preventing fluid loss is in place.

C.  STRUCTURE OF THE PROTHROMBIN MOLECULE

1.  Structure Of Plasma Prothrombin

Prothrombin is the circulating zymogen of thrombin, the serine protease responsible for the limited proteolysis of fibrinogen to produce fibrin (Davie et al., 1979; Jackson and Nemerson, 1980). Both bovine and human plasma prothrombin are glycoproteins of approximately 70,000 molecular weight (Davie et al., 1979; Jackson and Nemerson, 1980). Prothrombin in other mammalian species has a similar molecular weight (Walz et al., 1974). The complete amino acid sequences of both bovine (Magnusson et al., 1975) and human (Walz et al., 1977; Butkowski et al., 1977) prothrombin have been determined. Prothrombin from the chicken has also been partially characterized, and was shown to have both a similar molecular weight (Walz et al., 1974) and amino acid composition (Walz et al., 1974) to the mammalian prothrombins. The N-terminal amino acid sequence of chicken prothrombin has been determined (Walz, 1978). Based on molecular weight, amino acid composition and partial amino acid sequence, it has been concluded that avian and mammalian prothrombins are probably similar in structure and in function (Walz, 1978).

2.  Post-Translational Modification

Prothrombin, which is synthesized in the liver as are many of the blood coagulation factors (Anderson and Barnhart, 1964), undergoes glycosylation and γ-carboxylation during its biosynthesis (Swanson and Suttie, 1985). These biosynthetic processes are many and complex. Several precursors of plasma prothrombin have been identified in liver tissue, though their structures have not been characterized (Graves et al., 1980a,b; Swanson and Suttie, 1985). Bovine and human cDNA copies of the mRNA for prothrombin have been isolated from liver cDNA libraries (MacGillivray et al., 1980; Degen et al., 1983; MacGillivray and Davie, 1984) and have allowed the prediction of the complete amino acid sequence of the precursor of prothrombin. The amino acid sequence of the bovine prothrombin precursor is shown in Figure 2 (MacGillivray and Davie, 1984).
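As a minimal sketch of how an amino acid sequence can be predicted from a cDNA sequence, the following Python fragment translates an open reading frame using the standard genetic code. The function name `translate` and the short example sequence are illustrative assumptions only; the sequence is hypothetical and is not taken from the prothrombin cDNA.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cdna: str) -> str:
    """Translate a coding-strand cDNA, read in frame from its first base,
    into one-letter amino acid codes, stopping at the first stop codon."""
    cdna = cdna.upper()
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for pos in range(0, len(cdna) - 2, 3):
        residue = CODON_TABLE[cdna[pos:pos + 3]]
        if residue == "*":  # a stop codon ends the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical 15-base example (not prothrombin): ATG GCG CGT ATC TAA
print(translate("ATGGCGCGTATCTAA"))  # prints MARI
```

In practice the reading frame must first be established, for example by matching the translation to protein sequence already determined chemically; the sketch simply assumes the sequence starts in frame.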
The precursor to bovine prothrombin contains an amino-terminal extension of 43 amino acid residues (MacGillivray and Davie, 1984), while the human prothrombin precursor has an extension of at least 36 residues (Degen et al., 1983).

3.  Precursor Prothrombin

The leader peptide (43 amino acids) of both bovine and human prothrombin is cleaved at an Arg-Ala bond prior to secretion from the liver (Magnusson et al., 1975; Walz et al., 1977; Degen et al., 1983; MacGillivray and Davie, 1984) (Fig.2). Signal peptidase, the proteolytic enzyme which removes signal (pre-) peptides from secreted proteins, typically cleaves after small aliphatic amino acid side chains (e.g. alanine) (von Heijne, 1983, 1985), and not large basic residues such as arginine. The site cleaved to produce mature plasma prothrombin (Fig.2) is more similar to pro-peptide cleavage sequences, such as that of prepro-albumin (Steiner et al., 1980), than to signal peptidase cleavage sequences. It has been suggested that prothrombin is synthesized as a prepro-protein and contains both a pre- (signal) and a pro-peptide in the prepro-leader sequence (Degen et al., 1983; MacGillivray and Davie, 1984).

Similar prepro-leader peptides have been found in other vitamin K-dependent coagulation factors (Kurachi and Davie, 1982; Jaye et al., 1983; Fung et al., 1984, 1985; Long et al., 1984; Beckmann et al., 1985; Hagen et al., 1986). In factor IX, the site of signal peptidase cleavage probably precedes amino acid residue Thr⁻¹⁸, producing a 21 (or 25) residue pre-peptide and an 18 residue pro-peptide (Bently et al., 1986). Based on this cleavage site in factor IX and the specificity of signal peptidase, it has been suggested that the site of signal peptidase cleavage in prothrombin is between amino acid residues His⁻²⁰ and Gln⁻¹⁹ (see Fig.2) (Bently et al., 1986), producing a 24 residue pre-peptide and a 19 residue pro-peptide. While the function of the pro-peptide is unknown, this pro-peptide has high homology with the pro-peptides of other vitamin K-dependent coagulation factors (Fung et al., 1984) and the pro-peptide of the vitamin K-dependent bone protein osteocalcin (Pan and Price, 1985; Pan et al., 1985). Because of this homology, it has been suggested that the pro-peptide may have a role in the γ-carboxylation of the vitamin K-dependent proteins (Fung et al., 1984, 1985; Pan and Price, 1985; Pan et al., 1985).

Figure 2: The Structure of the Prothrombin Molecule. Schematic representation of the structure of bovine prothrombin and of the prepro-peptide predicted from the cDNA sequence (MacGillivray and Davie, 1984). Amino acid residues are indicated by the single letter code. Residues of the prepro-peptide are numbered backwards from the site of cleavage that produces plasma prothrombin. Disulphide bridges are placed according to Magnusson et al. (1975). His-366, Asp-422, and Ser-528 are the three residues that constitute the active site catalytic triad. Y denotes γ-carboxyglutamic acid residues; glycosylated residues, the putative site of signal peptidase cleavage, and the putative site of propeptidase cleavage are also marked. [Diagram not reproduced.]

4.  Gamma Carboxyglutamic Acid Domain

The N-terminal 47 amino acid residues of plasma prothrombin, the Gla region (see Fig.2) (Magnusson et al., 1975), contain all of the Gla residues (see above), which allow the formation of Ca²⁺ bridges to phospholipid membranes. These interactions are essential for the efficient activation of prothrombin (Jackson, 1981). Descarboxyprothrombin, found in the plasma of vitamin K deficient cows and humans, is poorly activated as a result of the absence of the γ-carboxyglutamic acid residues (Suttie and Jackson, 1977).

5.  Kringle Domain

Following the Gla region are the structures known as kringles (Magnusson et al., 1975). Kringles are composed of about 80 amino acid residues containing six invariant cysteine residues which form three internal disulphide bridges (Magnusson et al., 1975) (see Fig.2). The function(s) of the kringles are not clear, but the second kringle of prothrombin has been reported to bind to factor Va (Esmon and Jackson, 1974), which is the essential protein cofactor in the prothrombin activation complex.

6.  Thrombin Domain

The C-terminal half of the prothrombin molecule contains the serine protease catalytic region (Magnusson et al., 1975). Factor Xa cleaves the polypeptide chain in two places (see Fig.2), releasing the amino-terminal activation peptide (with Gla and kringle domains) from the two chain thrombin molecule (Magnusson et al., 1975). Bovine thrombin consists of an A chain (50 amino acid residues) linked to the B chain (259 amino acid residues) by a disulphide bridge (see Fig.2). The function of the A chain of thrombin is unknown (Jackson, 1981). The B chain of thrombin shares amino acid sequence homology with many serine proteases, including the invariant histidine³⁶⁶, aspartate⁴²², and serine⁵²⁸ residues that comprise the catalytic triad (see Fig.2) (Magnusson et al., 1975). Homologies to trypsin at the amino-terminus of the B chain and around Asp⁵²⁷ suggest that the mechanism of activation of prothrombin is similar to that of trypsinogen (see below) (Jackson and Nemerson, 1980). Upon alignment of the prothrombin and trypsinogen sequences, homology is also observed at the substrate binding pockets, with Asp⁵⁰⁶ giving thrombin a trypsin-like specificity (see section F-2 for further discussion). However, thrombin has a more limited substrate specificity than the pancreatic serine proteases (see section F-2).

7.  Three Dimensional Structure

Three dimensional structures of thrombin or prothrombin have not been elucidated; however, the three dimensional structure of one of the kringles of bovine prothrombin has been determined (Tulinsky et al., 1985; Park and Tulinsky, 1986). This structure was obtained from the proteolytic fragment 1 of bovine prothrombin (amino acid residues 1 to 156, Fig.2) (Tulinsky et al., 1985; Park and Tulinsky, 1986). The disulphide bridges between Cys⁸⁷ and Cys¹²⁷ and between Cys¹¹⁵ and Cys¹³⁹ (Fig.2) are found near the middle of the folded structure, with the loops of the kringle sequence surrounding this nucleus in a disc-like manner (Park and Tulinsky, 1986). The Gla region is also contained in prothrombin fragment 1 (see Fig.2), but the structure of the first 35 amino acid residues could not be resolved, due to lack of uniform structure (Park and Tulinsky, 1986). It was suggested that some flexibility in the Gla region may be required for membrane binding (Park and Tulinsky, 1986). The sequence from Ser³⁶ to Ala⁴⁶ of the Gla region could be resolved, and contains some stacked aromatic residues, suggesting a possible function as a receptor recognition site (Park and Tulinsky, 1986).

Although the three dimensional structure of the thrombin domain of prothrombin is unknown, the amino acid sequence of thrombin shares considerable homology with trypsin (Furie et al., 1982). This sequence homology has allowed the development of a three dimensional model for thrombin based on the known crystal structure of trypsin (Furie et al., 1982; see section [...]).

D.  FUNCTIONS OF PROTHROMBIN

1.  Action On Fibrinogen

As outlined above, prothrombin is the circulating zymogen form of the protease thrombin. The primary function of thrombin in the coagulation cascade is the conversion of fibrinogen to fibrin (see Fig.1) (Davie et al., 1979; Jackson and Nemerson, 1980). Fibrinogen is converted to fibrin monomer by limited proteolysis in which thrombin cleaves fibrinogen in each of the two Aα and two Bβ chains to release two of each of the fibrinopeptides, A and B (Doolittle, 1984). Fibrin monomers can then spontaneously polymerize to form insoluble fibrin polymers that form the basis of the blood clot (Doolittle, 1984). Only one peptide bond in each of the Aα and Bβ chains of fibrinogen is susceptible to the action of thrombin, demonstrating the limited substrate specificity of this enzyme (Doolittle, 1984). Impairment of thrombin, the physiological cause of fibrin formation, thus directly impairs blood clot formation (Fenton, 1981; Fenton and Bing, 1986).

2.  Other Enzymatic Functions

Thrombin is also able to cleave a limited number of peptide bonds in a few other plasma proteins, with important physiological consequences. Thrombin can activate both factors V and VIII, producing factors Va and VIIIa (Davie et al., 1979; Jackson and Nemerson, 1980). These proteins are essential cofactors in the activation complexes of prothrombin and factor X, respectively (Davie et al., 1979; Jackson and Nemerson, 1980). In the presence of the endothelial membrane protein thrombomodulin, thrombin will activate protein C (Stenflo, 1976; Esmon, 1983). The resulting activated protein C (APC) inactivates factors Va and VIIIa, thereby repressing the coagulation cascade (Stenflo, 1976; Esmon, 1983). Thus, thrombin has roles in both the initiation and termination of the coagulation cascade, and as such is an important regulatory protease (Fenton, 1981; Fenton and Bing, 1986).

Thrombin also acts to activate factor XIII by limited proteolysis, producing factor XIIIa (see Fig.1). Factor XIIIa is a transglutaminase which catalyses the formation of covalent crosslinks between glutamine and lysine residues in the γ chains of adjacent fibrin monomers (Davie et al., 1979; Jackson and Nemerson, 1980). This crosslinking strengthens the blood clot to assist in the formation of an insoluble mechanical blockage to fluid loss (Davie et al., 1979; Jackson and Nemerson, 1980). Thrombin has been implicated as the activator of other blood coagulation factors, e.g. factor VII as discussed above, but these roles may not be important in vivo (Zur and Nemerson, 1981).

3.  Non-Enzymatic Functions

As mentioned above, thrombin also interacts with other components of the hemostatic response to injury. Thrombin, by incompletely understood mechanisms, will stimulate many different cell types, leading to mitogenesis, arachidonic acid metabolism and the secretion of proteins (see Fenton and Bing, 1986 for a review).
Although reactivity may vary, all mammalian tissue or cell types (except erythrocytes) are responsive to thrombin, especially endothelial cells, nerve cells, smooth muscle cells, leucocytes, and cultured fibroblast cells (Fenton and Bing, 1986). Thrombin action on platelets is well studied and, upon activation, involves a change in cell shape and secretion of proteins into plasma (Milis, 1981). Thrombin has a hormone-like action upon cells of the immune system (Fenton, 1981), and thus may assist in the prevention of invasion of the body by foreign agents by way of injured blood vessels.

E.  BLOOD COAGULATION IN NON-MAMMALS

1.  Blood Coagulation In The Vertebrates

Blood coagulation appears to occur in all vertebrates, but has been best characterized within the mammals (see above) (Davie et al., 1979; Jackson and Nemerson, 1980). The blood coagulation cascade as shown in Fig.1 was developed for the bovine and human systems, but has been found to be similar in other mammalian species (Davie et al., 1979; Jackson and Nemerson, 1980), whereas coagulation systems in non-mammalian vertebrates have been less well characterized. Conversion of fibrinogen to fibrin by a thrombin-like enzyme is the basis of blood clot formation in all vertebrates (Doolittle, 1984). In many of the vertebrate classes, the existence of other coagulation factors has not been investigated in detail. The chicken appears to have the best characterized coagulation system in non-mammals (Didisheim et al., 1959; Walz et al., 1975).
In the chicken, most of the mammalian coagulation factors, including the partially characterized prothrombin (see section C-1), have been identified (Didisheim et al., 1959; Walz et al., 1974, 1975). Prothrombin has also been partially purified from lamprey (Doolittle et al., 1962; Doolittle, 1965). Activated lamprey prothrombin is able to coagulate bovine fibrinogen (Doolittle et al., 1962). Lamprey plasma contains at least one protein which contains γ-carboxyglutamic acid (Zytkovicz and Nelsestuen, 1976). Lamprey prothrombin, like Gla containing proteins (Zytkovicz and Nelsestuen, 1976), can be adsorbed to barium salts (Doolittle et al., 1962; Doolittle, 1965). Thus, lamprey prothrombin is most likely a Gla containing protein. Other structural information about lamprey prothrombin is not known.

The remainder of the coagulation factors have been less well characterized (Didisheim et al., 1959; Doolittle and Surgenor, 1962). Attempts to demonstrate the existence of surface activation of coagulation in birds and fish failed to conclusively identify this process (Engle and Woods, 1960; Doolittle and Surgenor, 1962), while extrinsic initiation has been observed in all vertebrates examined (Didisheim et al., 1959; Doolittle and Surgenor, 1962), suggesting that intrinsic initiation of coagulation may be a mammalian adaptation to the extrinsic system of blood coagulation.

2.  Blood Coagulation In The Invertebrates

Blood coagulation is not limited to vertebrates (Engle and Woods, 1960). Hemostasis of some type has been observed in many other phyla, including Coelenterata, Annelida, Mollusca, Arthropoda, and Echinodermata (Engle and Woods, 1960; MacFarlane, 1960). In many of these invertebrate species, hemostasis is simply the result of aggregation of blood cells at the site of injury (Engle and Woods, 1960; MacFarlane, 1960), which may be analogous to the formation of a platelet plug in mammals (MacFarlane, 1960). There are fewer cases of plasma proteins being involved in a coagulation scheme (MacFarlane, 1960).

The best characterized invertebrate coagulation protein is the fibrinogen molecule from the spiny lobster (Fuller and Doolittle, 1971a,b). In this animal, clot formation is caused by the polymerization of a plasma fibrinogen (which is unlike vertebrate fibrinogen; Fuller and Doolittle, 1971a) by a Ca²⁺ dependent transglutaminase (Engle and Woods, 1960; Fuller and Doolittle, 1971b). In the horseshoe crab, a second coagulation scheme exists in which a coagulogen is polymerized after limited proteolysis (Solum, 1973; Cheng et al., 1986). The complete amino acid sequence of the precursor of coagulogen has been determined (Cheng et al., 1986), and has no similarity to either vertebrate or spiny lobster fibrinogens (Cheng et al., 1986). The clotting enzyme responsible for the limited proteolysis has been purified and partially characterized (Seid and Liu, 1980; Liang and Liu, 1982). The clotting enzyme is Ca²⁺ dependent, and appears to be activated by endotoxins (Seid and Liu, 1980; Liang and Liu, 1982).

F.  STRUCTURE OF SERINE PROTEASES

1.  Three Dimensional Structure

Many of the activated blood coagulation factors, including prothrombin, are serine proteases (Davie et al., 1979; Jackson and Nemerson, 1980). The most obvious function of these coagulation factors is as proteases (see Fig.1) for either the activation or inactivation of other coagulation factors or plasma proteins (Jackson and Nemerson, 1980). Three dimensional structures of the coagulation factor serine proteases (or zymogens) have not been determined, but due to their homology to the digestive serine proteases, models of the structures of several of the coagulation factors have been proposed (Furie et al., 1982; Cool et al., 1985). These models assume that the coagulation factor serine proteases have similar three dimensional structures to the digestive serine proteases and function with similar catalytic mechanisms (Furie et al., 1982; Cool et al., 1985). All of the coagulation factor serine proteases contain the catalytically important histidine, aspartate and serine residues in homologous locations (Davie et al., 1979; Jackson and Nemerson, 1980).
Each of the coagulation factors also contains an aspartate residue in a homologous location to the aspartate of the substrate binding pocket of trypsin (Kraut, 1977; Stryer, 1981), which may account for the (limited) trypsin-like specificity of the coagulation factors (Davie et al., 1979; Jackson and Nemerson, 1980). Trypsinogen is activated to trypsin by limited proteolysis, removing an amino-terminal activation peptide and creating a new amino-terminus. This new amino-terminal isoleucine then forms a new salt bridge with the aspartate residue adjacent to the active site serine, resulting in a conformational change (Stroud et al., 1977; Stryer, 1981). In the coagulation factors, a homologous activation cleavage in a conserved sequence (Jackson and Nemerson, 1980) may cause a similar conformational change resulting in serine protease activity (Davie et al., 1979; Jackson and Nemerson, 1980).

2.  Limited Substrate Specificity Of The Coagulation Factors

The mechanism for the limited proteolytic action of the coagulation factors on specific substrates is not completely understood (Furie et al., 1982). Studies of the structure of trypsin have allowed a greater understanding of the catalytic mechanism, together with a basis for the substrate specificity (e.g. see Craik et al., 1985), which may by analogy help explain the mechanism of the coagulation factor serine proteases.
The extreme substrate specificity in the coagulation factors may be due in part to changes surrounding the substrate binding pocket and the influence of the additional polypeptide chain present in many of the coagulation factors (Furie et al., 1982). This limited substrate specificity is essential for the amplification of the coagulation cascade (see Davie et al., 1979; Jackson and Nemerson, 1980).

G.  HOMOLOGIES WITHIN SERINE PROTEASE ZYMOGENS

1.  Families Of Serine Proteases

The development of the catalytic mechanism of the serine proteases has occurred at least twice during the evolution of life on earth (Neurath, 1984). Two families of serine proteases have been identified which share a similar mechanism (Neurath, 1984). The subtilisin type family, although it shares the same catalytic mechanism (including the catalytic triad of residues), does not share amino acid sequence or three dimensional structural homology with the trypsin-like serine proteases (Neurath, 1984). The trypsin-like family appears to be a larger family and more widespread in Nature. Trypsin-like serine proteases are found in both eukaryotes and prokaryotes (Delbaere et al., 1975), while the subtilisins are found only within the Bacilli (Kraut, 1977). Existence of the trypsin-like serine proteases in both prokaryotes and eukaryotes is an indication of the age of these proteins. They must have been in existence since early in the evolution of life (i.e. >1 × 10⁹ years ago) (Neurath, 1984).

2.  Roles Of Serine Proteases In Physiology

Serine proteases have a role in a large number of essential physiological processes (Neurath and Walsh, 1976; Neurath, 1984), including blood coagulation and digestion as well as such diverse processes as the complement cascade, neuropeptide processing, fibrinolysis, and fertilization of germ cells (Neurath and Walsh, 1976). All of these serine proteases share amino acid sequence homology. Fig.3 illustrates some of the amino acid homologies within the coagulation and fibrinolytic serine protease zymogens (Young et al., 1978; Hewett-Emmett et al., 1981).

The catalytic regions of the blood coagulation factors share approximately 40% amino acid identity with trypsinogen and also with each other (Katayama et al., 1979; Hewett-Emmett et al., 1981) in their serine protease domain regions (see Fig.3). The blood coagulation factors, and many of the other serine proteases (e.g. complement factor B), differ from trypsinogen and the other digestive serine proteases in possessing long amino-terminal activation peptides (see Fig.3) (Jackson and Nemerson, 1980). The activation peptide in trypsinogen is only 6 amino acid residues long, while in prothrombin and plasminogen the activation peptide is longer than the catalytic region (see Fig.3) (Jackson and Nemerson, 1980). All of these serine proteases appear to have acquired unique (but see below) amino-terminal extensions in addition to a common serine protease domain (Jackson and Nemerson, 1980).
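A figure such as "approximately 40% amino acid identity" is obtained by counting identical residues at aligned positions. The sketch below assumes the two sequences have already been aligned (gaps written as "-") and, as one common convention among several, divides by the number of positions at which neither sequence has a gap. The function name `percent_identity` and the toy sequences are hypothetical and are not real protease sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned positions at which the two residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":  # skip positions with a gap in either sequence
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences): 5 gap-free positions, 2 identical
print(percent_identity("ACDEFG", "ACQ-WY"))  # prints 40.0
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give somewhat different values, so the denominator should be stated whenever identities from different studies are compared.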
The amino-terminal extensions have important roles in the regulation and activation of the serine proteases, and may have roles independent of the serine protease enzymatic function (Jackson and Nemerson, 1980).

3.  Homologous Domains Within The Activation Peptide Of Prothrombin

When the amino-terminal extensions of many serine proteases are compared, several homologous domains are observed (see Fig.3) (Jackson and Nemerson, 1980; Zur and Nemerson, 1981; Doolittle, 1985). As mentioned previously, prothrombin contains two kringle structures that are 80 amino acid residues long (K in Fig.3) (Magnusson et al., 1975). Kringles have also been identified within factor XII (Cool et al., 1985; McMullen and Fujikawa, 1985), and the fibrinolytic zymogens plasminogen (Sottrup-Jensen et al., 1978), tissue-type plasminogen activator (Pennica et al., 1983) and urokinase-type plasminogen activator (Verde et al., 1984). Also shown in Fig.3 is the Gla domain (cross hatched). As mentioned previously, this region is found in other vitamin K-dependent coagulation proteins including factor VII (Hagen et al., 1986), factor IX (Kurachi and Davie, 1982; Jaye et al., 1983), factor X (Fung et al., 1984, 1985; Leytus et al., 1984), and protein C (Long et al., 1984; Foster and Davie, 1984; Beckmann et al., 1985), all of which also contain a prepro leader peptide (see Fung et al., 1985). Not shown in Fig.3 are protein S, which contains both the Gla region and prepro leader (Dahlback et al., 1986), and protein Z, which contains at least the Gla region (Hojrup et al., 1985).

Figure 3: Amino Acid Sequence Homologies in Coagulation Factor Zymogens. Comparison of the structures of coagulation and fibrinolytic zymogens to trypsinogen. The solid bar represents the catalytic region in the proteases, the cross hatched region represents the Gla region, K represents the kringles, E represents regions homologous to epidermal growth factor precursor, 1 and 2 represent regions homologous to the type I and type II homologies of fibronectin, and A represents the homologous regions found in factor XI and prekallikrein. The lengths of the bars are approximately proportional to the lengths of the polypeptide chains. Arrows represent the locations of peptide bonds that are cleaved during activation of the zymogens. Solid lines below the proteins represent disulphide bridges and do not necessarily represent their true locations. (See text for details).

[Figure 3 diagram not reproduced. Proteins shown: prothrombin, factor VII, factor IX, factor X, protein C, factor XI, prekallikrein, factor XII, plasminogen, tissue-type plasminogen activator, urokinase, trypsinogen.]

4.  Homologous Domains Found In Serine Protease Zymogens Other Than Prothrombin

Additional domains are found in other protease zymogens (Fig.3) which are not present in prothrombin.
One of these, as noted by Doolittle et al.(1984) and Bloomquist et al.(1984), is a region of homology to epidermal growth factor (EGF), which has been identified in factor VII (Hagen et al.,1986), factor IX (Kurachi and Davie,1982; Jaye et al.,1983), factor X (Fung et al.,1984,1985; Leytus et al.,1984), protein C (Long et al.,1984; Foster and Davie,1984; Beckmann et al.,1985), protein S (Dahlback et al.,1986), protein Z (Hojrup et al.,1985), factor XII (Cool et al.,1985; McMullen and Fujikawa,1985), and tissue-type plasminogen activator (Pennica et al.,1983). These EGF-like domains are found not only in serine proteases but also in other proteins such as the LDL receptor (Sudhoff et al.,1985a,b). In addition, type I and type II homologies of fibronectin (Peterson et al.,1983) are found in factor XII (Cool et al.,1985; McMullen and Fujikawa,1985), with a type I homology also found in tissue-type plasminogen activator (Pennica et al.,1983). If the amino-terminal extensions of other serine proteases are compared to other protein sequences, more homologous domains are found, including a homologous domain in complement factor B (Morley and Campbell,1984) and the interleukin-2 receptor (Leonard et al.,1985), and the four repeats (A in Fig.3) shared by factor XI (Fujikawa et al.,1986) and prekallikrein (Chung et al.,1986).

Thus the serine protease family illustrates several different modes of protein evolution. Not only are there changes in the amino acid sequences of the catalytic regions, but there are also gains and/or losses of additional protein domains, and in some cases duplication of these domains within a protein.

H. STRUCTURE OF EUKARYOTIC STRUCTURAL GENES

1. The Gene

Genes for proteins found in the vertebrates are of a complex structure (Breathnach and Chambon,1981). Typically, the genes are composed of a split structure of exons and introns. Transcription initiation sequences, including promoters and other regulatory sequences, are found in the 5' flanking sequence and transcription termination sequences are found in the 3' flanking sequence (Breathnach and Chambon,1981; Nevins,1983). Transcription of these genes requires a large number of different processing steps (see below) to produce a mature mRNA capable of being translated into a protein (Breathnach and Chambon,1981; Nevins,1983).

2. Exons

Since their discovery, introns have been found in almost all vertebrate protein coding genes, i.e. those transcribed by RNA polymerase II (Breathnach and Chambon,1981; Gilbert,1985). Introns separate the exons, which are spliced together to form the translatable mRNAs. Exons have been found to vary greatly in size, although specific size classes appear to be preferred (Naora and Deacon,1982). A relationship between mRNA transcript length (coding size) and number of exons has been observed (Blake,1983a,b). The average exon size is about 140 bp, which corresponds to the most abundant of the observed size classes (Naora and Deacon,1982).

3. Introns

Introns separate the exons of a gene and must be removed to produce a mRNA transcript (Breathnach and Chambon,1981). Introns, like exons, vary greatly in size (Naora and Deacon,1982). At the 5' and 3' ends of introns, specific conserved sequences can be found (Mount,1982; Keller and Noon,1984) which appear to be essential for the removal of introns (Wieringa et al.,1984). Within the introns an additional conserved sequence was found (Keller and Noon,1984). Deletion of this sequence, though, has no consequence for intron splicing (Wieringa et al.,1984). Subsequently, it has been shown that this sequence is involved in branch formation during intron splicing, and can be replaced with other intronic sequences (Keller,1984). The minimum size of introns appears to be about 80 bp, which may be due to constraints caused by the intron splicing mechanism (Wieringa et al.,1984).

4. Promoters

Upstream of the site of mRNA initiation, promoter sequences can be found (Breathnach and Chambon,1981; Nevins,1983). Comparison of DNA sequences of these regions shows the presence of several conserved sequences, including the "TATA" and "CAAT" sequences (Breathnach and Chambon,1981). The "TATA" sequence (Goldberg-Hogness box) is usually found about 30 bp 5' to the site of mRNA initiation, and is essential for the precision of mRNA initiation (Breathnach and Chambon,1981; McKnight and Kingsbury,1982). Approximately 80 bp 5' to the site of mRNA initiation, a second conserved sequence is usually
found, the "CAAT" sequence (Breathnach and Chambon,1981). The function of the "CAAT" sequence is unknown (Breathnach and Chambon,1981), but often this "CAAT" sequence is flanked by G/C rich inverted repeats (McKnight and Kingsbury,1982). The G/C rich inverted repeats appear to be essential for efficient promoter activity, but not for precision of initiation (McKnight and Kingsbury,1982). Other DNA sequences flanking the site of mRNA initiation also affect the regulation of promoter activity (Nevins,1983). Some of these sequences are orientation and distance specific, while others, such as enhancers, function independently of distance or orientation (Gluzman,1985). Enhancers and promoter elements overlap both physically and functionally, such that the distinction between them is becoming blurred (Gluzman,1985).

5. Transcription And Processing

Expression of a gene to produce a protein product involves many processes including transcription of the gene, capping, polyadenylylation and splicing of the heterogeneous nuclear RNA, transport of the RNA to the cytoplasm, and finally translation (Nevins,1983). Capping of the 5' end of the RNA is essential for both efficient splicing (Grabowski et al.,1985) and translation (Shatkin,1985). Splicing of the introns from the RNA is essential to produce the contiguous translatable mRNA as discussed above. The site and mechanism of termination of RNA transcription are unknown (Birnstiel et al.,1985).

Most of the genes transcribed by RNA polymerase II are polyadenylylated (Perry,1976; Nevins,1983). Poly(A) is added to the RNA after removal of the 3' end of the transcript by a nuclease (Breathnach and Chambon,1981). This cleavage event occurs approximately 20 bp 3' to a conserved AAUAAA sequence found in the mRNAs (Proudfoot and Brownlee,1976). This AAUAAA sequence is essential for the cleavage reaction, but not for the poly(A) addition reaction (Montell et al.,1983). The mechanism of poly(A) addition is not completely understood (McDevitt et al.,1984). After the capping, splicing, and polyadenylylation (not necessarily in that order) of the RNA transcript, the mature RNA is transported from the nucleus to the cytoplasm where it is translated into protein (Nevins,1983).

I. EVOLUTION OF AMINO ACID AND DNA SEQUENCE

1. Molecular Clock

If a protein is isolated from several different species, differences in the amino acid sequence are usually found (Zuckerkandl and Pauling,1965). A greater difference is usually found between sequences from species which have a more ancient common ancestor (Zuckerkandl and Pauling,1965; Wilson et al.,1977). It appears as if there is constant change in the sequence of a protein through time (Zuckerkandl and Pauling,1965; Wilson et al.,1977). There is some evidence that most of the changes have occurred at a nearly uniform rate over time, such that the changes in sequence act as a molecular clock. This can be a useful tool in the resolution of the phylogeny of species (Wilson et al.,1977; Li et al.,1985b). Once the function of a protein changes (as can occur to one product of a gene duplication), the rate of evolution of the protein is likely to change (Wilson et al.,1977) as the protein will now be under a different collection of selective pressures. To complicate matters, the rate of evolution of a protein, even though it maintains the same function, can change due to changes in the organism's environment, or even the cellular or molecular environment (Wilson et al.,1977).

The apparent reason for the often near uniform rate of evolution of a protein is that the mutation rate of DNA has been nearly uniform (Wilson et al.,1977; Li et al.,1985b). The difference in evolutionary rates between different proteins is not due to differing rates of mutation of DNA, but primarily due to selection and the ability of a protein to tolerate change (Wilson et al.,1977). Even within a protein, different regions can evolve at different rates, as in the insulin molecule (Wilson et al.,1977; Li et al.,1985b). Thus, comparison of the sequence of the same protein from different species may demonstrate functionally important regions by their reduced rate of change (Wilson et al.,1977).

2. Gene Duplications

There are many gene families of structurally and functionally similar proteins such as the globins (Edgell et al.,1983). These families represent proteins which function in a similar fashion and often complement each other (Edgell et al.,1983). Other families, such as the lysozyme-lactalbumin family (Hall et al.,1982) or to a lesser extent the immunoglobulin superfamily (Hood et al.,1985), function very differently. In either case, gene duplication events were essential for the formation of these different proteins (Li,1983). Often the gene duplication events have occurred several times, e.g. the globins (Edgell et al.,1983; Hardies et al.,1984), immunoglobulins (Hood et al.,1985), or the fibrinogen genes (Crabtree et al.,1985). Within the serine protease gene family similar gene duplications have been responsible for expansion of this family (Young et al.,1978; Hewett-Emmett et al.,1981). The serine proteases differ greatly from the example of the globins in that they have also altered the structure of their proteins (Hewett-Emmett et al.,1981; Patthy,1985).

J. EVOLUTION OF THE STRUCTURE OF PROTEINS AND GENES

1. Internal Duplications Within A Gene

Proteins have not only changed in sequence but also in size (Doolittle,1985). Many proteins have increased greatly in size compared to their ancestral forms (Doolittle,1985). Different mechanisms appear to function to increase the size of a protein, the most obvious of which is the duplication of all or just part of a protein (Li,1983). It is easy to imagine that the five kringles of plasminogen (see Fig.3) are the result of internal duplications (Kurosky et al.,1980). In other cases nearly the entire molecule is duplicated, as in the case of streptokinase (Neurath,1984). Internal duplications result not only in homologous amino acid sequence, but also in a homologous three dimensional structure (McLachlan,1979). In trypsinogen, it has been observed that by rotating the molecule 180°, it is possible to produce a similar three dimensional structure of the molecule (McLachlan,1979). This has been interpreted to imply that trypsinogen, and thus all other serine proteases, have been formed by duplication events to result in four similar structural domains making up the serine protease domain (McLachlan,1979). Today no amino acid homology is visible from these ancient duplication events (McLachlan,1979).

2. Gene Fusions

Gene duplications cannot completely explain the evolution of some of the larger proteins found today, such as the blood coagulation factors (Doolittle,1985). In these proteins, it appears that protein domains from several different sources have been combined to create new proteins by some gene fusion type event (see next section for possible mechanisms) (Doolittle,1985). In some cases, the gene fusions have been very complicated, such as with the large number of different protein domains found in factor XII (Cool et al.,1985; Neurath,1985). Duplication events appear to occur together with these gene fusion events, similar to transposition of repetitive DNA elements (Calos and Miller,1980), retaining the protein domain in the donor protein.

K. FUNCTION OF INTRONS

1. Distribution

With the discovery of introns (Berget et al.,1977; Chow et al.,1977), the paradox of the number of genes and genome size was partially if not completely resolved (Gilbert,1979). The size of genes was found to be unrelated to the size of the protein, and thus the size of the genome is unrelated to the number of genes within the genome (Cavalier-Smith,1978; Gilbert,1979). Introns are regions of DNA which are not found in the functional RNA product (mRNA, rRNA, or tRNA) of a gene (Breathnach and Chambon,1981). Removal of introns from hnRNA by RNA splicing (Cech,1983) joins the exons of a gene to form a functional RNA product. Introns are found in nuclear and organellar genes of eukaryotes, some genes of archebacteria, and some viral genes of prokaryotes (Darnell and Doolittle,1986). If the mechanisms of splicing of introns found in various genes and species are compared, at least three different types of RNA splicing are observed (Cech,1983; Sharp,1985), suggesting the possibility of multiple origins of introns (see below).

2. Position Of Introns Within Genes And Proteins

When the positions of introns in genes were mapped to the positions in the translated protein products, it was observed that many of the introns separated protein domains (Artymiuk et al.,1981; Blake,1978,1983a,b,1985).
Subsequently it was demonstrated that in some genes, the introns separated domains of three dimensional structure (which may not necessarily be functional domains) (Go,1981,1983). An additional observation was that the position of introns often mapped to the surface of a protein (Craik et al.,1982a,b,1983). From the early observations, Gilbert(1978,1979) postulated that introns allowed the shuffling of protein domains, a mechanism he called exon shuffling. It should be noted that exon shuffling is not an explanation of the function of introns, but explains how they have been used during evolution (Blake,1985; Gilbert,1985; Rogers,1985).

3. Intron Sliding

The discovery of introns also provided a possible explanation for the many insertions and deletions observed between related proteins. These insertions or deletions between proteins were often observed at or near intron-exon junctions of at least one gene within a gene family. Thus, changes in the site of RNA splicing could create mRNAs containing insertions and/or deletions of a few amino acid residues (Craik et al.,1982a,b,1983). Only changes of 3 bp or multiples of 3 bp would result in such observations, as any other type of change would alter the reading frame of the mRNA, thus completely altering the sequence of the protein.

L. ORIGIN OF INTRONS

1. Metabolic Enzymes

Many of the metabolic enzymes bind nucleotides as cofactors and probably constitute one of the most ancient gene families (Rogers,1985; Gilbert,1985). This gene family diverged into the different metabolic enzymes prior to the eukaryotic-prokaryotic-archebacterial divergence, and thus the divergence occurred in the progenote (Gilbert,1985; Marchionni and Gilbert,1986). Hence, it may be possible to determine if introns were present in the genes of the progenote by comparing the gene organization of the different members of the metabolic enzyme family (Gilbert,1985). Genes for several members of the metabolic enzyme family have been isolated and characterized, including alcohol dehydrogenase (Benyajati et al.,1981,1983; Dennis et al.,1984,1985; Duester et al.,1986), glyceraldehyde phosphate dehydrogenase (Stone et al.,1985a,b), lactate dehydrogenase (Li et al.,1985a), pyruvate kinase (Lonberg and Gilbert,1985), and triose-phosphate isomerase (Brown et al.,1985; Straus and Gilbert,1985; Marchionni and Gilbert,1986; McKnight et al.,1986). Comparison of the organization of some of these genes (Cornish-Bowden,1985; Duester et al.,1986) shows that the introns do tend to cluster in similar locations. Duester et al.(1986) concluded that introns were present before these genes duplicated, suggesting the existence of introns since the beginning of life.
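Comparing intron positions between genes, as in the studies cited above, depends on where each intron interrupts the reading frame. The following is a minimal sketch of that arithmetic, not a method taken from any of the cited studies: it computes the phase of an intron within its codon and checks whether two intron positions differ by a whole number of codons, which is the only kind of displacement the intron sliding argument allows. The function names and the example offsets are hypothetical and serve only as illustration.

```python
# Hypothetical helper functions; positions are 0-based offsets, in bp,
# into the coding sequence at which an intron interrupts the gene.

def intron_phase(coding_offset: int) -> int:
    """Return 0 if the intron lies between codons, 1 if it follows the
    first base of a codon, and 2 if it follows the second base."""
    return coding_offset % 3


def shift_preserves_frame(shift_bp: int) -> bool:
    """A change in splice site keeps the reading frame only if it moves
    the junction by 3 bp or a multiple of 3 bp."""
    return shift_bp % 3 == 0


def alignable_by_sliding(offset_a: int, offset_b: int) -> bool:
    """Two intron positions can be related by frame-preserving sliding
    only if they are separated by a whole number of codons."""
    return shift_preserves_frame(offset_a - offset_b)


# Illustrative offsets only (not data from any real gene).
intron_in_gene_1 = 300   # phase 0
intron_in_gene_2 = 306   # phase 0, two codons away
intron_in_gene_3 = 304   # phase 1, a fraction of a codon away

same_phase = alignable_by_sliding(intron_in_gene_1, intron_in_gene_2)
different_phase = alignable_by_sliding(intron_in_gene_1, intron_in_gene_3)
```

In this sketch the first pair of positions differs by 6 bp and could be related by sliding, while the second pair differs by 4 bp, so aligning them would require moving an intron by a fraction of a codon.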
When the sequences of these genes are compared, it is found that none of the introns in the different genes is shared (Cornish-Bowden,1985; Straus and Gilbert,1985); indeed, it has been acknowledged that it would be difficult to move introns by the fractions of a codon often required to align their positions from different genes (Straus and Gilbert,1985, see intron sliding above). No clear example of intron sliding of a fraction of a codon has been demonstrated. Thus clustering of the introns does not suggest that there was an intron from the beginning; it is possible or even probable that these introns are due to independent insertions (Rogers,1985).

2. The Triose-Phosphate Isomerase Gene

The gene for triose-phosphate isomerase has been characterized from a number of species, including E. coli (Pichersky et al.,1984), Saccharomyces cerevisiae (Alber and Kawasaki,1982), Schizosaccharomyces pombe (Russell,1985), chicken (Straus and Gilbert,1985), man (Brown et al.,1985), maize (Marchionni and Gilbert,1986), and Aspergillus nidulans (McKnight et al.,1986). The genes in E. coli, S. cerevisiae, and S. pombe do not contain introns, while the remainder contain up to eight introns (McKnight et al.,1986; Gilbert et al.,1986). Comparison of these genes in the four species with introns shows that only one of the introns is shared by all species (McKnight et al.,1986; Gilbert et al.,1986). Of the five introns found in Aspergillus, only intron B is found in the other species. Introns A and E are displaced by a non-integer number of codons (and therefore cannot be easily explained by intron sliding), and introns C and D are unique (McKnight et al.,1986). The intron organization of the human and chicken genes is identical (Gilbert et al.,1986), and six of the eight introns of maize are shared with the chicken (Marchionni and Gilbert,1986; Gilbert et al.,1986).

Gilbert et al.(1986) concluded that these observations are best explained by introns being present in the progenitor species (and therefore since the beginning of life). Unfortunately, no known mechanism will move an intron a fraction of a codon (Rogers,1985; Straus and Gilbert,1985); therefore, introns which interrupt the sequence in different phases do not necessarily have a common origin. Intron invasion at preferred sites cannot be excluded as an alternative explanation for the observations within the triose-phosphate isomerase genes, and indeed may be the more probable explanation.

3. Intron Mobility

Data on the triose-phosphate isomerase genes show that little change in number of introns has occurred since the divergence of plants and animals one billion or more years ago (Marchionni and Gilbert,1986; Gilbert et al.,1986). Marchionni and Gilbert(1986) concluded that the difference in the number of introns in the maize and chicken gene for triose-phosphate isomerase was due to intron loss in animals. However, it is not possible to exclude the possibility of intron insertion in plants with these data. Similar observations with the globin genes have been made (Darnell and Doolittle,1986). The structure of the triose-phosphate isomerase gene for Aspergillus indicated that some introns are at least 1.2 billion years old (McKnight et al.,1986; Gilbert et al.,1986). Differences observed between the plants and animals and Aspergillus may easily be due to intron insertions between 1.2 and 1 billion years ago.

Gene structures of the metabolic enzymes do not clearly demonstrate the presence of introns prior to their duplication in the progenote because intron insertion cannot be ruled out (Cornish-Bowden,1985). McKnight et al.(1986) observed that intron insertion occurs at preferred sites and that an explanation for this is not apparent. One proposed source of the preferred sites of insertion is based on chromatin structure (Cavalier-Smith,1985). Data from the metabolic enzyme genes appear to support intron insertion rather than introns being present from the beginning of life. As discussed above, the organization of the genes for triose-phosphate isomerase implies that intron insertion was taking place at least 1.2 billion years ago with little further activity in the last 1 billion years, especially in animals (McKnight et al.,1986; Gilbert et al.,1986).

4. Models Of Intron Origin

The usefulness of introns is clear (Gilbert,1985; Blake,1985) but their origin is not (Rogers,1985). Splicing of RNA and thus introns have been proposed to have existed early in the evolution of life (Doolittle,1978; Darnell and Doolittle,1986), since this splicing was necessary for the evolution of the larger proteins essential for present day life (Darnell and Doolittle,1986). Subsequently, prokaryotes and unicellular eukaryotes lost most (if not all) introns by selection for smaller genome size (Doolittle,1978; Gilbert,1985; Darnell and Doolittle,1986). Multicellular eukaryotes are postulated not to have this selection pressure for smaller genome size (Gilbert,1985). If this is true, intron loss has been a major force in gene evolution, although this does not account for the mechanisms of RNA splicing associated with intron removal (Cech,1983).

A second possibility is that introns have invaded the genome of the eukaryotes (Cavalier-Smith,1978,1985; Crick,1979). Intron RNA can be self-splicing, without the requirement for proteins, as a ribozyme (Kruger et al.,1982; Zaug et al.,1986), possibly implying that introns were mobile virus-like elements that may have invaded genes (Sharp,1985). The three splicing mechanisms (Cech,1983; Sharp,1985) thus could be the result of the conversion of three different invading viral-like elements into the three types of introns found today. Additional support for a recent invasion of genes by introns is found when the family of flavin-containing metabolic enzymes is compared (see above). The preference for introns to separate protein domains (Go,1981,1983; Blake,1978,1983a,b,1985), and their location corresponding to the surface of proteins (Craik et al.,1982a,b,1983) have not been explained. It is clear that if introns inserted they were not mutagenic (Rogers,1985; Duester et al.,1986), so the obvious selective pressure for clustering of introns is not valid. Mechanisms such as the insertion of introns at the junction of linker sequences and nucleosome core particles can account for the observed size classes of introns (Cavalier-Smith,1985), but as yet there is no evidence to support this mechanism. A yet unknown mechanism may explain the site preference of introns. Investigation of other gene families may provide insight into both the origin and function of introns. The serine protease gene family is an excellent family for such an investigation, as the gene duplication events are scattered throughout the evolution of the eukaryotes (Young et al.,1978).

M. SERINE PROTEASE GENES

1. Sequence Of Serine Proteases

The amino acid sequences of a large number of serine protease zymogens have been determined (see Young et al.,1978; Hewett-Emmett et al.,1981), and a large number of others have been partially sequenced (Hewett-Emmett et al.,1981).
With the advent of molecular biological techniques, isolation of cDNAs has allowed the prediction of the amino acid sequences of many more serine protease zymogens, together with their precursor sequences. For the coagulation proteins, the complete amino acid sequences of prothrombin (MacGillivray et al.,1980; Degen et al.,1983; MacGillivray and Davie,1984), factor VII (Hagen et al.,1986), factor IX (Kurachi and Davie,1982; Jaye et al.,1983), factor X (Fung et al.,1984,1985; Leytus et al.,1984), factor XI (Fujikawa et al.,1986), factor XII (Cool et al.,1985), prekallikrein (Chung et al.,1986), protein C (Long et al.,1984; Foster and Davie,1984; Beckmann et al.,1985), and protein S (Dahlback et al.,1986) have been determined from the corresponding cDNA sequences. In addition, cDNAs for many of the fibrinolytic zymogens including plasminogen (Malinowski and Davie,1983; Malinowski et al.,1984), tissue-type plasminogen activator (Pennica et al.,1983), and urokinase (Verde et al.,1984) have been isolated and characterized. These sequences have allowed a better understanding of how these proteins are related to each other and other proteins (see Patthy,1985).

2. Genes For Serine Proteases

Genes for many of the serine proteases have also been characterized including trypsinogen (Craik et al.,1984), chymotrypsinogen (Bell et al.,1984), proelastase (Swift et al.,1984), nerve growth factor subunits α and γ (Evans and Richards,1985), peptide processing kallikreins of the maxillary gland (Mason et al.,1983) and kidney (van Leeuwen et al.,1986), complement factor B (Campbell and Porter,1983; Campbell et al.,1984), the fibrinolytic zymogens tissue type plasminogen activator (Ny et al.,1984; Fisher et al.,1985; Degen et al.,1986) and urokinase (Nagamine et al.,1984; Riccio et al.,1985), the blood coagulation proteins factor IX (Anson et al.,1984; Yoshitake et al.,1985) and protein C (Foster et al.,1985; Plutzky et al.,1986), and the plasma protein haptoglobin (a non-serine protease homologue) (Maeda et al.,1984). Partial gene structures of plasminogen (Malinowski et al.,1984; Sadler et al.,1985), and prothrombin (Degen et al.,1983,1985; Davie et al.,1983) have also been reported. The structure of a serine protease gene from the invertebrate Drosophila melanogaster (Davis et al.,1985) has also been reported.

N. THE EVOLUTION OF THE SERINE PROTEASE GENES

By characterizing the gene for bovine prothrombin and comparing the gene structure of prothrombin to the different genes listed in the previous section, it may be possible to obtain insight into the origin of the structural domains found within the prothrombin molecule. Such a study of the evolution of one member of the serine protease family may also shed light on the evolutionary history of introns, and possibly their function. Finally, comparison of the sequence for prothrombin from a number of different species may help to identify the functional importance of the different regions of prothrombin shared within these species.

MATERIALS AND METHODS

A. MATERIALS

Yeast extract, casamino acids, bacto-tryptone, and bacto-agar were Difco grade from the Grand Island Biological Company. NZ-amine type A was from Humko Sheffield Chemical Co. Agarose, acrylamide, bisacrylamide, urea, ammonium persulphate, and TEMED (N,N,N',N'-tetramethylethylenediamine) were from Bio-Rad Laboratories. Nitrocellulose sheets and circles (82 and 132 mm) were 0.45 µm pore size from Millipore or Schleicher and Schuell. 32P-labeled nucleotides were from New England Nuclear or Amersham. Phenol was from British Drug Houses Ltd. and was redistilled before use. The fraction distilled at 179°C was collected and frozen in aliquots at -20°C. Deoxy-, dideoxyribonucleotides, and random hexadeoxyribonucleotides (p(dN)6) were from PL-Pharmacia. Isopropyl-β-D-thiogalactopyranoside (IPTG), 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal), ethidium bromide (EtBr), dimethylsulphoxide (DMSO), 3-(N-morpholino)propanesulphonic acid (MOPS), yeast transfer RNA (tRNA), ampicillin, tetracycline, chloramphenicol and ribonuclease A were from Sigma. Cesium chloride was from Cabot Berylco Ltd. Ultrogel AcA54 was from LKB. Oligodeoxyribonucleotides were synthesized on an Applied Biosystems 380A DNA Synthesizer (by Tom Atkinson, Dept.
of Biochemistry) and purified by denaturing polyacrylamide gel electrophoresis prior to use (Atkinson and Smith,1984). All other chemicals were of reagent grade or better and were purchased from either Sigma Chemical Co., Fisher Scientific, or British Drug Houses Ltd. Restriction endonucleases, T4 DNA ligase, T4 DNA polymerase, T4 polynucleotide kinase and BSA (nuclease free) were from New England Biolabs, Bethesda Research Laboratories, or PL-Pharmacia. Nuclease S1 and deoxyribonuclease I were from Boehringer-Mannheim. Avian myeloblastosis virus reverse transcriptase was from Life Sciences Inc. or New England Nuclear. DNA polymerase I and DNA polymerase I Klenow fragment were from Boehringer-Mannheim or PL-Pharmacia. Day old chicks were obtained from Western Hatcheries, Abbotsford. Adult chicken livers were obtained from Dr. P. March, Dept. of Poultry Science, UBC. Bovine liver was obtained from Intercontinental Packers, Vancouver.

B. STRAINS, VECTORS, AND MEDIA

1. Bacterial Strains

E. coli K802 (hsdR⁺, hsdM⁺, gal⁻, met⁻, supE) (Maniatis et al.,1982) was host for screening and isolation of DNA of clones in λCh4A vector (Blattner et al.,1977). E. coli Q359 (hsdR⁻, hsdM⁺, supF, φ80) (Karn et al.,1980) was host for screening and isolation of DNA from clones in λ1059 vector (Karn et al.,1980). E. coli JM83 (ara, Δlacpro, strA, thi⁻, φ80, lacZΔM15) (Vieira and Messing,1982) was host for transformation and DNA isolation from clones in pUC13 vector (Vieira and Messing,1982). E. coli JM101
(Δlacpro, supE, thi⁻, F', traD36, proAB, lacIq, lacZΔM15) and JM103 (Δlacpro, supE, thi⁻, strA, sbcB15, endA, hsdR⁻, F', traD36, proAB, lacIq, lacZΔM15) (Messing,1983) were hosts for transformation and DNA isolation of clones in M13 mp7, 8, 9, 10, and 11 vectors (Messing,1983). E. coli RY1088 (ΔlacU169, supE, supF, hsdR⁻, hsdM⁺, metB, trpR, tonA21, proC::Tn5(pMC9), pMC9 is pBR322-lacIq) (Young and Davis,1983a,b) was host for screening and isolation of DNA from clones in the λgt11 vector (Young and Davis,1983a,b).

2. Vectors

For DNA sequence analysis the M13 vectors mp7, 8, 9, 10, and 11 (Messing,1983) were used as cloning vectors. DNA for restriction endonuclease mapping and DNA sequencing was initially subcloned in pUC13 (Vieira and Messing,1982) (pUC13 was obtained from Dr. Mark Zoller, Dept. of Biochemistry, UBC).

3. Media

The medium for growth and screening of λ clones and hosts was NZYC (Maniatis et al.,1982) (10g NZamine type A, 5g NaCl, 5g Yeast Extract, 1g Casamino Acids, and 2g MgCl2 per liter, pH7.5 with NaOH). For screening phage λ libraries, the phage were plated on NZYC-agar (1.5%,w/v) plates with an overlay of NZYC-agarose (0.75%,w/v). For titering of phage λ stocks, the overlay consisted of NZYC-agar in place of the NZYC-agarose. The medium for the transformation and growth of bacteria containing pUC plasmid derivatives was Luria broth (Maniatis et al.,1982) (5g Yeast Extract, 10g Bacto-Tryptone, and 10g NaCl per liter).
For the selection of pUC-containing bacteria, clones were plated on LB-agar (1.5%,w/v) plates supplemented with 50µg/ml ampicillin. This same medium was used for screening the human cDNA library in pKT218 except that tetracycline (12.5µg/ml) replaced the ampicillin. Bacteria containing M13 clones were grown in YT medium (Maniatis et al.,1982) (5g Yeast Extract, 8g Bacto-Tryptone, and 5g NaCl per liter). Phage M13 transformants were plated on YT-agar (1.5%,w/v) plates overlayed with YT containing 0.75% agar. E. coli JM101 and JM103, hosts for M13 vectors, were maintained on minimal medium plates (Messing,1983), which were made up as follows: 3g of agar in 160ml H2O was autoclaved, cooled to 55°C, and was mixed with 40ml 5X Salts (2.1g K2HPO4, 0.9g KH2PO4, 0.2g (NH4)2SO4, 0.1g Na Citrate·7H2O per 40ml), 2ml 20% glucose, 0.2ml 20% MgSO4·7H2O, and 0.1ml 10mg/ml thiamine. Each of these solutions was sterilized by autoclaving except the thiamine which was filter-sterilized. Bacteria for large scale plasmid preparations were grown in M9 minimal medium (Maniatis et al.,1982) which was made up as 840ml H2O, 100ml 10X Salts (7g Na2HPO4, 3g KH2PO4, 0.5g NaCl, 1g NH4Cl per 100ml), 10ml MgSO4·7H2O, 20ml 20% glucose, 10ml 0.01M CaCl2, 20ml 20% Casamino Acids, 0.2ml 10mg/ml thiamine and 0.2g uridine. Each of the solutions was autoclaved separately except the thiamine and uridine which were filter-sterilized.

C. BASIC MOLECULAR BIOLOGY TECHNIQUES

DNA fragments were separated according to size by electrophoresis in agarose or polyacrylamide gels.
The buffer for agarose gel electrophoresis was 1XTAE (50XTAE buffer is 2M Tris base, 1M Glacial Acetic Acid, 0.1M EDTA) (Maniatis et al.,1982). DNA fragments in these gels were visualized either by UV fluorescence or autoradiography. For detection of DNA by UV fluorescence, agarose gels were prepared containing 10µg/ml EtBr, and the DNA was visualized by irradiation under UV light (260nm). If the DNA fragments were visualized by autoradiography, the gels were dried under vacuum using a Bio-Rad gel drier at 60°C for one hour. The dried gel was then exposed to Kodak XK-1 film, with or without an intensifying screen (Lightning Plus, Dupont). If intensifying screens were used, the films were exposed at -20°C or -70°C; otherwise, the film was exposed at room temperature. Polyacrylamide gels were used with 1XTBE buffer (10XTBE buffer is 0.89M Tris base, 0.89M Boric Acid, 25mM EDTA, pH 8.3) (Maniatis et al.,1982). Polyacrylamide gels were either denaturing or nondenaturing, due to the presence or absence of urea as a denaturant. For nondenaturing gels, acrylamide (added to the appropriate concentration from a stock of 29:1 acrylamide:bisacrylamide) and buffer were mixed with the appropriate volume of water, and degassed using a water aspirator. Polymerization was initiated by the addition of ammonium persulphate and TEMED to final concentrations of 0.066%(w/v) and 0.04%(w/v), respectively.
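The working-strength buffers and catalyst concentrations given above follow from simple dilution arithmetic. As a minimal sketch (the function names and the 1 L and 50 ml batch volumes are illustrative assumptions, not values taken from the text), the volume of concentrated stock needed for a 1X buffer and the mass of ammonium persulphate needed for a given gel volume could be computed as:

```python
def stock_volume_ml(final_volume_ml, stock_strength_x, final_strength_x=1.0):
    """Volume of concentrated stock needed, from C1 * V1 = C2 * V2."""
    return final_volume_ml * final_strength_x / stock_strength_x


def solute_mass_mg(volume_ml, percent_w_v):
    """Mass of solute for a % (w/v) solution; 1% (w/v) is 1 g per 100 ml."""
    grams = volume_ml * percent_w_v / 100.0
    return grams * 1000.0


# Hypothetical batch sizes, chosen only for illustration.
tae_stock = stock_volume_ml(1000, 50)   # ml of 50X TAE for 1 L of 1X TAE
tbe_stock = stock_volume_ml(1000, 10)   # ml of 10X TBE for 1 L of 1X TBE
aps_mass = solute_mass_mg(50, 0.066)    # mg ammonium persulphate in a 50 ml gel

print(f"50X TAE stock: {tae_stock:.0f} ml")
print(f"10X TBE stock: {tbe_stock:.0f} ml")
print(f"Ammonium persulphate: {aps_mass:.0f} mg")
```

The same two helpers apply to any of the stock solutions listed in this section, provided the stock strength and the final volume are known.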
DNA fragments in these gels were visualized by staining the gels with 10µg/ml EtBr in water for 10 minutes, followed by irradiation under UV light (260nm). Denaturing polyacrylamide gels in TBE buffer contained urea (8.3M). Acrylamide (added to the appropriate concentration from a 38:2 acrylamide:bisacrylamide stock) and buffer were mixed with the appropriate volume of water, and degassed using a water aspirator. Polymerization was initiated by the addition of ammonium persulphate and TEMED to final concentrations of 0.066%(w/v) and 0.024%(w/v), respectively. DNA in denaturing gels was visualized by autoradiography after drying under vacuum in a Bio-Rad gel drier at 80°C for 20-30 minutes, and exposing to Kodak XK-1 film, with or without intensifying screens.

D. ISOLATION OF DNA

1. Isolation Of Plasmid DNA

Small amounts of plasmid DNA were prepared by a modification of the alkaline lysis method of Birnboim and Doly (1979) (Maniatis et al.,1982). An aliquot (1.5ml) of an overnight culture of the clone of interest was placed in a microfuge tube (Eppendorf), and the bacteria were collected by centrifugation for 1 minute in an Eppendorf microfuge. The pellet was resuspended in 100µl of an ice cold solution containing 50mM glucose, 10mM EDTA, 25mM Tris-HCl pH8.0, and 4mg/ml lysozyme. The suspension was incubated for 5 minutes at room temperature, and 200µl of a solution containing 0.2N NaOH-1% SDS was added. The mixture was incubated at 4°C for 5 minutes, and 150µl of potassium acetate solution pH4.8 (60ml 5M KOAc, 11.5ml Glacial Acetic Acid, 28.5ml H2O) was added. After mixing by vortexing, the suspension was incubated at 4°C for 5 minutes. Cellular debris was removed by centrifugation in an Eppendorf centrifuge for 5 minutes at 4°C. The supernatant was removed and extracted with an equal volume of phenol:chloroform (1:1,v/v). Nucleic acids were precipitated by the addition of 2 volumes of ethanol at room temperature. After centrifugation in an Eppendorf centrifuge for 5 minutes, the supernatant was discarded and the nucleic acid pellet was washed with 1ml of 70% ethanol. The pellet was air dried and resuspended in 50µl TE buffer (10mM Tris-HCl pH8.0, 1mM EDTA).

Two different procedures were used for large scale plasmid isolation. The Triton lysis procedure (Katz et al.,1973,1977) was used for large preparations of plasmid in either the pBR322 or the pKT218 cloning vectors. An aliquot (5ml) of an overnight culture of bacteria was used to inoculate 1L of M9 medium at 37°C, with shaking at approximately 200 rpm. When the OD600nm of the culture was 0.6-0.7, 250mg chloramphenicol was added, and the culture was shaken at 37°C for 12-16 hours. Cells were collected by centrifugation at 6Krpm in a GS-3 rotor for 10 minutes and frozen at -20°C for at least two hours. The cells were then resuspended at 4°C in 6.25 ml of a solution containing 25%(w/v) sucrose, and 50mM Tris-HCl pH8.0. Lysozyme (1.5 ml of a 10mg/ml solution in 25% sucrose-50mM Tris-HCl pH8.0) was added, and the solution was continuously mixed by swirling on ice for 5 minutes. EDTA (1.25 ml of a 0.5M solution, pH8.0) was added and mixed on ice by swirling for an additional 5 minutes. Triton solution (10ml of a solution comprising 10 ml 10%(w/v) Triton X-100, 125 ml 0.5M EDTA pH8.0, 50 ml 1M Tris-HCl pH8.0, 800 ml H2O) was added, and mixed for an additional 5 minutes. Debris was removed by centrifugation at 19Krpm in an SS-34 rotor for 30 minutes at 4°C. Plasmid DNA was separated from chromosomal DNA and RNA by isopycnic centrifugation using cesium chloride gradients. CsCl/EtBr solutions were produced by direct addition of 3.9g CsCl and 0.3ml EtBr (10mg/ml) to 3.8 ml of the supernatant. These volumes were scaled up for tubes for the larger rotors. Centrifugation times varied with the rotor used. With the vTi65 rotor, centrifugation was either 4 hours at 65Krpm or 20 hours at 50Krpm. For the Ti70.1 rotor, centrifugation was for 20 hours at 50Krpm at 20°C.

The large scale alkali lysis procedure (Maniatis et al.,1982) was scaled up from the small scale preparation described above with the following modifications. After addition of potassium acetate, the debris was removed by centrifugation at 35Krpm in a Ti60 rotor for 30 minutes at 4°C. The supernatant was immediately mixed with 0.6 volumes of isopropanol, and incubated at room temperature for 15 minutes. Nucleic acids were collected by centrifugation at 9Krpm in an HB-4 rotor for 30 minutes at room temperature. Plasmid DNA was purified by isopycnic centrifugation as described above.

Double stranded M13 DNA (replicative form) was isolated as described by Messing (1983). A single plaque was mixed with 10µl of an overnight culture of uninfected host cells (JM101 or JM103) in 1ml YT for 6 hours. Concurrently, a colony of host bacteria from a minimal medium plate was grown up in 10 ml YT medium for 6 hours. The two cultures were then added to 1L of YT medium at 37°C and grown for 4 hours. The DNA was isolated from these cells by the alkali lysis procedure, as described above. All DNA used as vectors for cloning experiments was subjected to two rounds of purification through CsCl/EtBr density gradients.

2. Isolation Of Phage DNA

For large scale preparations of phage λ DNA (Maniatis et al.,1982), 10¹⁰ host bacterial cells were collected by centrifugation and resuspended in 3 ml of SM buffer (5.8g NaCl, 2g MgSO4·7H2O, 50 ml 1M Tris-HCl pH7.5, 5 ml 2% gelatin per L). Phage λ (5X10⁷-5X10⁸ pfu) were added to the cells, and the phage were allowed to attach to the cells by incubation at 37°C for 10 minutes. This mixture was used to inoculate 0.5L of prewarmed NZYC medium and the culture was incubated at 37°C until lysis. Chloroform (10ml) was added and incubation at 37°C continued for 10 minutes in order to lyse the remainder of the cells. Bacterial debris was removed by centrifugation at 7Krpm in a GSA or GS-3 rotor for 10 minutes. Phage particles were precipitated
by the addition of 0.3 volumes of 50% polyethylene glycol 6000 (Carbowax 8000) and 0.15 volumes of 5M NaCl, and incubation at 4°C overnight. Phage particles were collected by centrifugation at 7Krpm in a GSA or GS-3 rotor for 15 minutes at 4°C. After removal of all the PEG/NaCl solution, the phage particles were gently resuspended in 10 ml DNase I buffer (50 mM Tris-HCl pH7.5, 5 mM MgCl2, 0.5 mM CaCl2) to which 100µl 1mg/ml DNase I and 200µl RNase A were added, and the solution was incubated at 37°C for 30 minutes. Debris was removed by centrifugation at 10Krpm in an SS-34 rotor for 5 minutes. Phage were purified using CsCl gradients. Gradients were made by the addition of 0.75g CsCl per ml of phage solution. Centrifugation was for 16-20 hours at 20°C in a Ti70.1 rotor at 50Krpm. Phage were removed from the gradient after localization with a light source (e.g. with a flashlight, the phage appear as a blue band), and CsCl was removed by dialysis against DNase I buffer (see above) for at least one hour at 4°C. SDS was added to 1%(w/v), EDTA to 5 mM, and proteinase K to 50µg/ml, and the solution was incubated at 68°C for 1 hour. Phage DNA was purified by extraction with phenol:chloroform (1:1,v/v) followed by 3 extractions with chloroform. DNA was precipitated by the addition of 0.1 volume of 3M NaOAc pH4.8 and 2 volumes of ethanol.

Small scale λ preparations were scaled down from the large preparation described above (Maniatis et al.,1982). Eluted phage from one phage plaque (or 3X10⁶ pfu from phage stock) were attached to 100µl of host cells at 37°C for 10 minutes and used to inoculate 20 ml of NZYC medium. DNA isolation was as above with the omission of the CsCl gradient. Phage were digested with proteinase K immediately after DNase I and RNase A digestion. DNA was precipitated by the addition of one volume of isopropanol instead of ethanol and the pellet was resuspended in 100µl of TE buffer.

3.
Genomic DNA Isolation

Bovine genomic DNA was prepared by Ross MacGillivray by the method of Blin and Stafford (1976), which was the same method used for the purification of DNA from human livers. Liver tissue was ground to a fine powder in liquid nitrogen, either with a Waring blendor or with a mortar and pestle. Liver powder was dissolved in a buffer (10ml/g tissue) consisting of 0.5M EDTA pH8.0, 0.5% SDS, and 100µg/ml proteinase K, and was digested overnight at 50°C. The solution was gently extracted three times with equal volumes of phenol and dialyzed against buffer (50mM Tris-HCl pH8.0, 10mM NaCl, 10mM EDTA) until the OD270nm of the dialysate was below 0.05. RNase (DNase free) was added to a concentration of 100µg/ml and the solution was incubated at 37°C for one hour. The solution was gently extracted three times with equal volumes of phenol:chloroform (1:1,v/v), and then dialyzed against TE buffer. Insoluble material was removed by centrifugation at 14Krpm in an SS-34 rotor at 4°C for 10 minutes. DNA was precipitated by the addition of Gilbert Salts (5X Salts is 2.5M NH4OAc, 100mM MgCl2, 1mM EDTA) to 1X followed by the addition of two volumes of ethanol. After collection by centrifugation, the DNA pellet was allowed to rehydrate for at least two days. Insoluble material was removed by centrifugation as described above. The genomic DNA was resuspended at a final concentration of approximately 0.5 mg/ml in TE buffer.

E. DNA SUBCLONING

1. Production Of DNA Fragments For Ligation

DNA fragments for ligation into either pUC13 or M13 vectors were produced by several methods including sonication (Deininger,1983), or by restriction endonuclease digestion. Fragments that were produced by restriction endonuclease digestion were digested under the conditions suggested by the manufacturer of the enzyme. Both mixtures and gel purified DNA fragments were ligated into vectors. If mixtures of fragments from a restriction endonuclease digestion were to be ligated, the mixture was heated at 68°C for 10 minutes to inactivate the restriction endonuclease before ligation. Purified DNA fragments were isolated from agarose or polyacrylamide gels by electroelution (Maniatis et al.,1982).

Random DNA fragments were produced by sonication (Deininger,1983), using a Heat Systems Sonifier. DNA (10-20µg in a 500µl solution of 0.5M NaCl, 0.1M Tris-HCl pH7.4, 10mM EDTA) was sonicated by five pulses of about 5 seconds at an output level of 2. The solution was cooled on ice between pulses. The resulting DNA fragments were made blunt-ended by incubation in 50µl of 33mM Tris-OAc pH7.8, 66mM KOAc, 10mM MgOAc, 5mM DTT, 100mg/ml BSA, 0.2mM of each deoxynucleotidetriphosphate and 6u T4 DNA polymerase. The mixture was phenol extracted, and the blunt-ended DNA fragments were separated by gel electrophoresis in 5% non-denaturing polyacrylamide gels. DNA fragments of 300-600 bp were isolated by electroelution, followed by phenol extraction and ethanol precipitation as above, and were resuspended at 10µg/µl in TE buffer (10mM Tris-HCl pH8.0, 1mM EDTA).

2. Ligation Of DNA Into pUC13 Or M13 Vectors

DNA fragments were ligated to vector DNA in small volumes (10-15µl) of a ligation buffer consisting of 66mM Tris-HCl pH7.5, 5mM MgCl2, 5mM DTT, and 0.4-1.0mM ATP. For pUC13 ligations approximately 100ng vector DNA was ligated to a three fold molar excess of insert DNA, while for M13 ligations 10-20ng vector DNA was ligated with a 1-5 fold molar excess of insert DNA. T4 DNA ligase was added (1 unit for blunt-ended ligations and 0.1u for sticky-ended ligations, Maniatis et al.,1982), and ligation was allowed to proceed overnight at 15°C. If not used immediately, ligation mixtures were stored at -20°C until used.

3. Transformation Of DNA Into Bacteria

Host bacteria for pUC13 and M13 transformations were made competent by treatment with calcium chloride (Messing,1983). Fifty milliliters of YT (for JM101 or 103) or L broth (for JM83) were inoculated with host cells and incubated at 37°C with shaking until the OD600nm of the culture was 0.5-0.6.
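The molar excesses of insert over vector quoted for the ligations in the preceding section can be converted into masses of DNA, since for double-stranded DNA the number of moles is proportional to mass divided by length. The following is a minimal sketch only: the function name is hypothetical, and the vector sizes (about 2700 bp for pUC13 and about 7200 bp for an M13mp vector) and the 450 bp insert are assumptions used for illustration, not values stated in the text.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_excess):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    Moles of double-stranded DNA scale as mass / length, so
    insert_ng / insert_bp = molar_excess * vector_ng / vector_bp.
    """
    return molar_excess * vector_ng * insert_bp / vector_bp


PUC13_BP = 2700  # assumed approximate size of pUC13
M13_BP = 7200    # assumed approximate size of an M13mp vector
INSERT_BP = 450  # assumed insert, mid-range of the 300-600 bp fragments

# 100 ng pUC13 with a three fold molar excess of insert
puc13_insert = insert_mass_ng(100, PUC13_BP, INSERT_BP, 3)
# 20 ng M13 vector with a five fold molar excess of insert
m13_insert = insert_mass_ng(20, M13_BP, INSERT_BP, 5)

print(f"pUC13 ligation: {puc13_insert:.2f} ng insert")
print(f"M13 ligation: {m13_insert:.2f} ng insert")
```

Under these assumed sizes, only tens of nanograms of insert or less are needed per ligation, which is consistent with the small reaction volumes described.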
Cells were collected by centrifugation (2.5Krpm in an HB-4 rotor, 4°C, 5 minutes) and gently resuspended in one half of the starting volume of ice cold 50mM CaCl2. Cells were incubated on ice for 30-60 minutes and were again collected by centrifugation (2.5Krpm in a HB-4 rotor, 4°C, five minutes). Bacteria were gently resuspended in one tenth of the starting volume of ice cold 50mM CaCl2. Highest transformation efficiency was typically seen if these competent cells were stored at 4°C for 24 hours (Dagert and Ehrlich,1979). However, cells were normally used without this 24 hour storage. Aliquots (0.3 ml) of competent cells were typically transformed with 2-3µl of ligated DNA (see previous section). Cells were incubated with DNA in 13X100mm glass tubes at 4°C for 40-60 minutes and then heat shocked at 42°C for 2 minutes. M13 DNA transformed cells were mixed with 10µl 100mM IPTG, 35-50µl X-Gal (10mg/ml in dimethylformamide), 0.2ml host cells, and 3-5ml soft YT agar (42°C), and poured onto YT plates. Heat shocked pUC13 transformants were rescued with the addition of 0.7 ml of L broth, followed by incubation at 37°C for one hour. Rescued cells (100µl) were spread with 50µl X-Gal on LB plates supplemented with ampicillin (50µg/ml). All plates with transformed cells were incubated overnight at 37°C, and recombinants with all vectors were detected as colourless colonies or clear plaques in the presence of X-Gal (Messing,1983).

F. ISOLATION OF RNA

1.
Isolation Of Total Cellular RNA

a) Bovine RNA

All glassware, pipets and solutions were autoclaved to destroy endogenous ribonucleases. Bovine RNA was isolated by the method of Chirgwin et al.(1979). Powdered bovine liver tissue was added to a buffer (10ml/g tissue) consisting of 7.5M guanidine hydrochloride (GuHCl) pH7.5, 25mM sodium citrate pH7.0, and 0.1M DTT. The liver tissue suspension was disrupted by using a polytron homogenizer. N-lauryl sarcosine was added to 0.5% (w/v) and the insoluble matter was removed by centrifugation (5Krpm for 30 minutes, 4°C, HB-4 rotor). RNA was precipitated by the addition of ethanol to 33% followed by incubation overnight at -20°C. RNA was collected by centrifugation (5Krpm in an SS-34 rotor, 4°C, 30 minutes). The RNA pellet was resuspended in half of the starting volume of GuHCl buffer. Insoluble material was removed as before. RNA was precipitated as before, and resuspended in one fourth of the starting volume of GuHCl buffer. Insoluble material was removed, and RNA was precipitated as before. Small RNAs (e.g. tRNA and 5S rRNA) and DNA were removed by selective precipitation of large RNAs with LiCl (Barlow et al.,1963). The RNA pellet was resuspended in 4 ml of 0.1M NaOAc pH7.0, and an equal volume of 4M LiCl, 0.1M NaOAc pH7.0 was added. The mixture was incubated at -20°C for 30 minutes followed by incubation at 0°C for an additional 30 minutes. RNA was collected by centrifugation in an SS-34 rotor at 5Krpm for 30 minutes at 4°C. The RNA pellet was washed twice by resuspending in 8 ml of 2M LiCl, 0.1M NaOAc pH7.0 and was collected by centrifugation in a HB-4 rotor at 9Krpm for 12 minutes at 4°C. RNA was then dissolved in 5 ml of 0.1M NaOAc pH5.0 and insoluble material was removed by centrifugation at 9Krpm in a HB-4 rotor at 4°C for 5 minutes. RNA was precipitated by the addition of two volumes of ethanol and incubation overnight at -20°C. RNA was collected by centrifugation as above and dissolved in a small volume of H2O. The concentration of RNA was determined by assuming that a 1mg/ml solution had an OD260nm of 20. RNA was stored as ethanol precipitates in small aliquots at -20°C.

b) Chicken RNA

RNA yields with the GuHCl method from chicken livers were very low so a second RNA isolation procedure using phenol and SDS (Lizardi,1983) was used with this tissue. Livers from day old chicks were homogenized in 30 volumes of SET buffer (10mM Tris-HCl pH7.5, 5mM EDTA, 1% SDS). Proteinase K was added to 50µg/ml and the homogenate was incubated at 50°C for one hour. After digestion, Triton X-100 and sodium deoxycholate were each added to 1% (v/v and w/v, respectively), and NaCl to 0.1M. The homogenate was extracted three times with equal volumes of phenol:chloroform (1:1,v/v). RNA was precipitated by the addition of two volumes of ethanol, followed by incubation at -20°C. RNA was collected by centrifugation in a SS-34 rotor at 10Krpm for 10 minutes at 4°C. The RNA was washed in 66% ethanol, 0.1M NaOAc pH5.0 and collected by centrifugation as above. Small RNAs and DNA were removed by precipitation with LiCl as previously described.

2.
Isolation Of Poly A⁺ RNA

Poly A⁺ RNA was isolated by chromatography on a column of oligo-dT cellulose (Edmonds et al.,1971; Aviv and Leder,1972). Total chicken RNA in a small volume of 0.4M NaOAc pH7.5, 0.1% SDS was applied to the column. The unbound RNA fraction was reapplied to the column three times. The column was then washed with 0.4M NaOAc pH7.5, 1mM EDTA and 0.1% SDS until the OD260nm of the eluate was below 0.05. Poly A⁺ RNA was eluted from the column with 1mM EDTA, 0.1% SDS. Fractions containing RNA were identified by their OD260nm and were pooled. RNA was precipitated by the addition of 0.1 volumes of 3M NaOAc pH4.8 and two volumes of ethanol. RNA was resuspended in H2O at a concentration of 2 mg/ml and stored at -70°C.

G. LABELING OF DNA

1. Nick Translation

DNA for use as hybridization probes was labeled by nick translation (Maniatis et al.,1975). Typically, 500ng of DNA was labeled in 50µl of 50mM Tris-HCl pH7.5, 5mM MgCl2, 0.05mg/ml BSA, 10 mM β-mercaptoethanol, 20µM dGTP, 20µM dTTP, 1.4µM dATP, 1.4µM dCTP, 1.4µCi/µl α-³²P dATP (3000Ci/mMole), 1.4µCi/µl α-³²P dCTP (3000Ci/mMole), 0.2mM CaCl2, 1 pg/µl DNase I, and 0.4u/µl E. coli DNA polymerase I (Kornberg). The reaction mixture was incubated for 60-120 minutes at 15°C. The reaction was terminated by the addition of three volumes of 1% SDS-10mM EDTA, containing 25µg tRNA, followed by heating to 68°C for 10 minutes. After allowing the reaction mixture to cool to room temperature, the unincorporated labeled nucleotides were removed by chromatography on an Ultrogel AcA54 column. Labeled DNA was eluted from the column with 10mM Tris-HCl pH7.5, 200mM NaCl, 0.25mM EDTA.
Typically, labeled DNA had a specific activity of 0.5-1.0×10⁸ cpm/µg. Labeled DNA was denatured by boiling for 10 minutes immediately before use.

2. Klenow Labeling

DNA was also labeled by the method of Feinberg and Vogelstein (1983). Typically, a reaction mixture contained 200-300ng of DNA in a volume of 50µl. DNA in 20µl was denatured by boiling for three minutes, and was cooled to 37°C for 15-30 minutes. Labeling occurred in a final volume of 50µl of 50mM Tris-HCl pH8.0, 10mM MgCl₂, 10mM β-mercaptoethanol, 20µM dCTP, 20µM dGTP, 20µM dTTP, 1µCi/µl α-³²P dATP (3000Ci/mMole), 200mM HEPES pH6.6, 60 OD260nm/ml p(dN9), 0.4 mg/ml BSA, and 0.1u/µl E. coli DNA polymerase I Klenow fragment. Extension was allowed to occur overnight at 37°C. The reaction was terminated and labeled DNA was separated from unincorporated labeled nucleotides as described for nick translation (see above). Typically the specific activity of a Klenow labeled probe was 2×10⁸ cpm/µg. Labeled DNA was denatured as above prior to use.

H. BLOT HYBRIDIZATIONS

1. Genomic Southern Blot Analysis

Genomic DNA for Southern blots was transferred to nitrocellulose essentially as described by Southern (1975), and blots were hybridized and washed as described by Kan and Dozy (1978). Genomic DNA (10µg) was digested with restriction endonucleases (20-30u) in a volume of 40µl under conditions recommended by the enzyme manufacturers. DNA was separated by electrophoresis for 16-24 hours at 20-25 mA in submerged agarose gels. DNA in the gels was denatured for 30 minutes in 0.5N NaOH, 0.6M NaCl and was then neutralized by twice treating for 45 minutes with 1M Tris-HCl pH7.5, 0.6M NaCl. DNA was transferred to nitrocellulose membranes with 10XSSC (1XSSC is 0.15M NaCl, 0.015M Na Citrate pH7) for 36-48 hours. After transfer, the nitrocellulose filter was washed in 3XSSC to remove any agarose, air dried, and then baked at 68°C for 6 hours.

DNA fragments were detected by hybridization to ³²P labeled probes. The nitrocellulose filter was first wetted with 3XSSC and then prehybridized for 1-16 hours, in a solution containing 50% formamide, 6XSSC, 1mM EDTA, 0.1% SDS, 10mM Tris-HCl pH7.5, 10X Denhardt's solution (1X Denhardt's solution is 0.02% BSA, 0.02% Ficoll, 0.02% polyvinylpyrrolidone), 0.05% sodium pyrophosphate, 100µg/ml denatured herring sperm DNA, and 25µg poly(A). Hybridizations were carried out in the same buffer with the addition of denatured labeled probe to at least 1×10⁶ cpm/ml. Hybridization was for 36-48 hours at 37°C. After hybridization, blots were washed for one hour at room temperature in 2XSSC, 1X Denhardt's, and then washed twice for 90 minutes at 50°C in 0.1XSSC, 0.1% SDS. Blots were then rinsed twice at room temperature in 0.1XSSC, 0.1% SDS, followed by 4 rinses at room temperature in 0.1XSSC. After air drying, blots were exposed to Kodak XK-1 film with intensifying screen for 1-7 days at -70°C.

2.
Southern Blot Analysis To Detect Repetitive DNA

Blots to detect the presence of repetitive DNA were performed in a similar way to the genomic Southern blots. Cloned genomic DNA fragments were separated on agarose gels and transferred to nitrocellulose as described above. These blots were probed with nick translated genomic DNA instead of specific DNA probes. Blots were washed as before and exposed to Kodak XK-1 film for 1-3 hours without intensifying screens.

3. Northern Blot Analysis

Two methods were used to determine the size of mRNAs using either glyoxal (Thomas,1980) or formaldehyde (Maniatis et al.,1982) as the denaturing agent. All buffers for Northern blot analysis were autoclaved to destroy endogenous ribonucleases. For glyoxal gels, RNA was denatured at 50°C for 60 minutes in a total volume of 16µl with 2.7µl 6M glyoxal, 8.0µl DMSO, and 1.6µl 0.1M NaH₂PO₄ pH7.0 with up to 20µg RNA. Denatured RNA was separated by electrophoresis on 1% agarose gels for 6 hours at 100V using a 10mM NaH₂PO₄ pH7.0 buffer. RNA was then transferred to nitrocellulose in 20XSSC buffer for 16 hours. After transfer, the nitrocellulose blot was air dried and baked at 80°C in a vacuum oven for 3-4 hours. Specific mRNA species were detected by hybridization to specific labeled DNA probes as described for Southern blots (see above).

For Northern blots using formaldehyde as the denaturing agent (Lehrach et al.
,1977; Goldberg,1980), RNA was denatured in a total volume of 20µl with 2µl 5X Gel buffer (0.2M MOPS pH7.0, 50mM NaOAc, 5mM EDTA), 3.5µl formaldehyde, 10µl formamide, and up to 20µg RNA at 55°C for 15 minutes. RNA was separated by electrophoresis in agarose gels containing 1X Gel buffer (40mM MOPS pH7.0, 10mM NaOAc, 1mM EDTA) and 2.2M formaldehyde, at 100 V for 4-6 hours. Prior to transfer, the gels were washed with H₂O for 5 minutes, denatured with 50mM NaOH, 10mM NaCl for 45 minutes, neutralized for 45 minutes with 0.1M Tris-HCl pH7.5 and soaked in 20XSSC for 60 minutes. RNA was then transferred to a nitrocellulose filter overnight (16-24 hours) in 20XSSC. After transfer, blots were washed with 3XSSC and baked for 6 hours at 68°C. Specific mRNA species were detected by hybridization and washing as described for Southern blots (see above).

I. DNA SEQUENCE ANALYSIS

1. Construction Of M13 Clones

DNA was sequenced by the chain termination method (Sanger et al.,1977) using M13 sequencing vectors (Messing et al.,1981; Messing,1983). DNA to be cloned into M13 vectors for sequencing was produced by restriction endonuclease digestion (Messing,1983), or by sonication and end repair (Deininger,1983) (see above).

2. Screening Of M13 Clones

Typically, mixtures of DNA fragments were cloned into M13 vectors. To identify recombinant M13 clones containing exon encoding sequences, the M13 plaques were screened by plaque hybridization (Benton and Davis,1977).
Replicas of the plaques were transferred to nitrocellulose filters, and the DNA was denatured by treatment with 0.5N NaOH, 1.5M NaCl for 5 minutes. The nitrocellulose filters were neutralized by treatment with 1M Tris-HCl pH7.5 for 5 minutes followed by treatment with 0.5M Tris-HCl pH7.5, 1.5M NaCl for 5 minutes. After air drying, the filters were baked at 68°C for two hours. Recombinant M13 phage of interest were detected by hybridization to labeled probes and autoradiography. Prior to hybridization, filters were washed with 6XSSC, and then prehybridized in 6XSSC, 2X Denhardt's solution at 68°C for 1-4 hours. Filters were then hybridized overnight at 68°C in 6XSSC, 2X Denhardt's, 1mM EDTA, 0.5% SDS, and denatured labeled probe (at least 1×10⁶ cpm/ml, specific activity >0.5×10⁸ cpm/µg). After hybridization, filters were washed twice at room temperature in 2XSSC followed by three washes at 68°C in 1XSSC, 0.5% SDS for 30-40 minutes, and finally rinsed in 1XSSC at room temperature. After air drying, the filters were exposed to Kodak XK-1 film overnight at -70°C with intensifying screens.

3. M13 DNA Isolation

DNA from clones of interest (see previous section) was prepared as described by Messing (1983). M13 clones were grown as 2 ml cultures in YT medium in 15ml Falcon 2059 tubes using one plaque and 20µl of host bacteria (JM101 or 103) as inoculum. The cultures were incubated at 37°C for 6-16 hours (clones known to contain large inserts were grown for the shorter time period).
Host cells were removed by centrifugation in a 1.5ml microfuge tube (Eppendorf). Phage particles in 1.3ml of supernatant were precipitated by the addition of 0.3ml of 20% PEG, 2.5M NaCl, and incubation at room temperature for 15 minutes. M13 phage were collected by centrifugation in an Eppendorf centrifuge for 5 minutes. After removal of all the supernatant, the phage particles were resuspended in 200µl of low tris buffer (50mM NaCl, 10mM Tris-HCl pH7.5, 1mM EDTA). DNA was purified by successive extractions with phenol, phenol:chloroform (1:1,v/v), and chloroform. DNA was precipitated twice by the addition of 0.1 volume of 3M NaOAc and 2 volumes of ethanol. The final DNA pellet was washed in 70% ethanol and resuspended in 50µl of low tris buffer.

4. DNA Sequencing

DNA in M13 clones was sequenced by the chain termination method (Sanger et al.,1977) as modified for phage M13 templates (Messing et al.,1981). Sequencing reactions were carried out using the dideoxy- and deoxyribonucleotide concentrations shown in Table I. Sequencing was performed by hybridizing 4µl of template (from above) with 1µl primer (0.03 OD260nm/ml, 17-mer: 5'-GTAAAACGACGGCCAG-3'), 1µl H₂O, and 2µl 10XHin buffer (600mM NaCl, 100mM Tris-HCl pH7.5, 70mM MgCl₂) at 68°C for 10 minutes. The hybridization mix was allowed to cool to room temperature (20-30 minutes), and 1µl of 15µM dATP, 1.0-1.5µl of α-³²P dATP (10µCi/µl, 3000 Ci/mMole) and 2µl of 1u/µl DNA polymerase I Klenow fragment were added. An aliquot (2.5µl) of this template/primer mix was added to 1.5µl of the appropriate deoxy/dideoxy mix (see Table I).
After 15-20 minutes of incubation at room temperature, 1µl of 0.5mM dATP was added. After 15-20 minutes of incubation at room temperature, 5µl of stop-dye mix (98% formamide, 10mM EDTA pH8.0, 0.02% Xylene Cyanole, 0.02% Bromphenol Blue) was added. The extended products were denatured by heating to 92°C for three minutes and 1-2µl of these products were analyzed on 6% and 8% thin (0.35 mm), denaturing polyacrylamide gels (50cm long) at 52W in 1XTBE. After electrophoresis, the gels were dried at 80°C with a Bio-Rad gel drier for 20-30 minutes, and autoradiographed to Kodak XK-1 film overnight at room temperature.

Table I: Sequencing Mixes

Nucleotide    d/ddG    d/ddA    d/ddT    d/ddC
dG              7.9    109.4    158.7    157.9
dT            157.6    109.4      7.9    157.9
dC            157.6    109.4    158.7     10.5
ddG           157.4      -        -        -
ddA             -      116.7      -        -
ddT             -        -      550.3      -
ddC             -        -        -      191.6

The concentrations of the dideoxy- and deoxy-ribonucleotide triphosphates used in the sequencing mixes for M13 DNA sequencing. Concentrations are µM. Concentrations were determined empirically by Dr. Joan McPherson, Dept. of Botany, UBC.

5. Computer Analysis Of DNA Sequence Data

The DNA sequences deduced from the sequencing gels (see above) were analyzed using the computer programs of Staden (1982) and Delaney (1982).

J. HETERODUPLEX ANALYSIS

To assist in determining the size and position of exons and introns in the bovine prothrombin gene, heteroduplex analysis was conducted by Dr. Kevin Ahern and Dr.
George Pearson, Oregon State University. Heteroduplexes were formed between EcoRI and PstI cut bovine prothrombin cDNAs (pBII111 or pBII102, MacGillivray and Davie,1984) and DNA containing bovine genomic sequences either from the λ clones (λBII1, λBII2, or λBII3) or from appropriately cleaved subclones of the bovine genomic sequences. An aliquot (100ng) of each DNA to be analyzed by heteroduplex analysis was denatured together in 10µl of 80% formamide by heating to 70°C for 10 minutes. Hybridization occurred at 37°C for one hour in a reaction mixture volume of 20µl of 50% formamide, 200mM NaCl. DNA spreading conditions were essentially as described by Chow and Broker (1981). The entire duplex mixture was spread as hyperphase in a volume of 40µl of 50% formamide, 100mM NaCl, 5mM EDTA, 100ng of DNA length standard and cytochrome c (40µg/ml). The DNA protein film was adsorbed to a parlodion coated grid, stained with uranyl acetate, and rotary shadowed with platinum-palladium. Grids were examined with a Zeiss EM-10A electron microscope operating at 60kV. Molecular lengths were measured using a Videoplan II image analysis system. Single stranded DNA measurements were converted to double stranded lengths using the factor 1.16 to correct for compression during spreading.

K. SCREENING PHAGE LIBRARIES

1. Plating Phage Libraries

Genomic and cDNA libraries in a variety of different λ vectors were screened by the procedure of Benton and Davis (1977). These libraries were initially screened at a high density of 10⁴ plaques per 100mm petri dish or 5×10⁴ per 150mm petri dish. Appropriate dilutions of phage were incubated with host cells at 37°C for 10 minutes (to allow attachment of the phage) and then plated on NZYC plates with addition of soft NZYC agarose. Plates were incubated at 37°C until the phage plaques were visible but not touching each other, and the plates were placed at 4°C for one hour. Replicas of the plaques were transferred to nitrocellulose circles and incubated inverted on fresh NZYC plates at 37°C overnight to allow amplification of phage plaques. Master plates were stored at 4°C. For screens other than the first high density screen, this amplification step was omitted. DNA on the nitrocellulose filters was denatured, neutralized and baked as described for M13 screens (see above).

2. Screening Of Phage Filters

Various different stringencies for hybridization and washing of filters were used depending on the homology of the probe to the desired sequences within the library. When the probe and the library were from the same species, the filters were hybridized and washed at high stringency, as described for screening M13 filters (see above). Cross hybridization between species required conditions of reduced stringency for hybridization and washing. Reduced stringency was obtained by reducing the temperature of the hybridization, increasing the NaCl concentration, and/or reducing the temperature of the washes. Cross hybridization between human and chicken DNA fragments was obtained by hybridization at 50°C and washing in 6XSSC at 45°C. Conditions for autoradiography varied due to conditions of hybridization and washing, λ vector, type of library, and specific activity of the probe. The conditions varied from 4 hours at -20°C with intensifying screens to 3 days at -70°C with intensifying screens.

L. SCREENING PLASMID LIBRARIES

A human liver cDNA library was screened by the method of Benton and Davis (1977). The human cDNA library in pKT218 (Prochownik et al.,1983) was plated by Marion Fung. Approximately 10⁴ clones per 100mm petri dish were spread on LB plates supplemented with tetracycline. Plates were incubated at 37°C until colonies were 1-2mm in diameter. At this time, replicas were made on to nitrocellulose filters. The master plates were stored at 4°C, while the replica filters were grown on LB tetracycline plates until the colonies were 3-4mm in diameter. The nitrocellulose filters were then transferred to LB plates supplemented with chloramphenicol (25 µg/ml) and incubated overnight at 37°C. Colonies were lysed and the DNA was denatured by treating the nitrocellulose replica filters twice with 0.5N NaOH, 1.5M NaCl for 20 minutes. Nitrocellulose replicas were neutralized by treating with 1M Tris-HCl pH7.5 for 20 minutes followed by treatment with 0.5M Tris-HCl pH7.5, 1.5M NaCl for 20 minutes.
After air drying, the filters were baked at 68°C for two hours. The human cDNA library was screened with a bovine cDNA probe so that conditions of reduced stringency were needed to detect the corresponding human cDNA. Prior to hybridization, the nitrocellulose filters were washed three times in 6XSSC to remove cell debris and prehybridized in 6XSSC, 2X Denhardt's at 68°C for two hours. Filters were hybridized and washed as described for screening M13 clones except that hybridization was at 60°C and washes were at 60°C and in 6XSSC. Positive clones were detected by autoradiography.

M. MAPPING THE END OF A mRNA TRANSCRIPT

1. Nuclease S1 Mapping

Uniformly labeled single stranded DNA probes for S1 analysis were produced as described by Nasmyth (1983). Oligodeoxyribonucleotide primers, either the M13 sequencing primer (see above) or a primer complementary to the prothrombin mRNA (5'-CCTCGGACGCGCGCCAT-3'), were used to prime DNA synthesis to produce single stranded probes complementary to the bovine prothrombin mRNA. Primer DNA (2.5µl of 0.03 OD260nm/ml) was mixed with 2.5µl of appropriate M13 clone template, 1.25µl 10XHin buffer (as above), and 1.25µl of H₂O, and was incubated at 68°C for 10 minutes and allowed to cool to room temperature (20-30 minutes). Nucleotides (1.25µl containing 0.5mM dCTP, 0.5mM dGTP, and 0.5mM dTTP), 2.5µl α-³²P dATP (10µCi/µl, 3000Ci/mMole), and 1.25µl (0.625u) DNA polymerase I Klenow fragment were added and the mixture was incubated at 15°C for 60 minutes. The reaction was stopped by heating to 68°C for 10 minutes.
DNA of a specific size was produced by digestion with the restriction endonuclease EcoRI for 60 minutes. After digestion, the reaction was stopped by the addition of an equal volume of sequencing stop-dye mix (see above) and denatured by heating to 92°C for 5 minutes. The probe fragment was separated on a denaturing 6% polyacrylamide gel. The fragment was recovered by electroelution (Maniatis et al.,1982), phenol extracted, and precipitated with ethanol.

Approximately 10⁵ cpm of labeled probe was mixed with 100µg total bovine liver RNA in 30µl of 80% formamide, 40mM PIPES pH6.4, 400mM NaCl, 1mM EDTA. The mixture was incubated at 85°C for 5 minutes, followed by incubation at 42°C overnight. Nuclease S1 digestion was performed by the addition of 300µl of nuclease S1 buffer (0.28M NaCl, 50mM NaOAc pH4.8, 4.5mM ZnSO₄, 20µg/ml denatured herring sperm DNA) containing 2000u/ml nuclease S1. The reaction was incubated at 37°C for 60 minutes, followed by phenol extraction. Nuclease S1 protected DNA fragments were recovered by addition of NH₄OAc to 0.7M, 10µg tRNA and an equal volume of isopropanol. The precipitate was recovered by centrifugation and redissolved in a small volume of sequencing stop-dye mix (see above). Products were separated on an 8% denaturing polyacrylamide gel, after denaturing the DNA at 92°C for 3 minutes. Protected DNA fragments were detected by autoradiography on the dried gel.

2. Primer Extension

Primer extension was performed essentially as described by Law and Brewer (1984). Six picomoles of 5' end labeled oligodeoxyribonucleotide (same oligo as used above for nuclease S1 mapping, specific activity was 3×10⁶ cpm/pMole) were resuspended with 5µg total bovine liver RNA in 5µl TE pH7.4 buffer (10mM Tris-HCl pH7.4, 1mM EDTA). The mixture was denatured by boiling for 3 minutes, and cooled in ice water. In a total volume of 10µl, KCl and Tris-HCl pH8.3 were added to 200mM and 10mM, respectively, and kept on ice for 10 minutes. Each deoxyribonucleotide triphosphate was added to 1mM, Tris-HCl pH8.3 to 50mM, KCl to 50mM, MgCl₂ to 10mM, actinomycin D to 40µg/ml, and β-mercaptoethanol to 30mM in a total volume of 40µl. Avian reverse transcriptase (50u) was added and the reaction was incubated at 37°C for 90 minutes. The reaction was terminated by the addition of 3µl of 0.5M EDTA pH8.0, and 3µl was mixed with 3µl of sequencing stop-dye mix (see above). After denaturation at 92°C for 3 minutes, the products were separated on an 8% denaturing polyacrylamide gel. Products were detected by autoradiography of the dried gel using Kodak XK-1 film.

RESULTS

A. ISOLATION OF THE BOVINE PROTHROMBIN GENE

1.
Southern Blot Analysis Of The Bovine Prothrombin Gene

As an initial step toward the characterization of the bovine prothrombin gene, bovine liver DNA was digested with several restriction endonucleases, and the resulting fragments were separated by agarose gel electrophoresis. After denaturation, the DNA fragments were transferred to nitrocellulose and analyzed with ³²P-labeled hybridization probes derived from cloned bovine prothrombin cDNAs. Several bovine prothrombin cDNA clones have been described (MacGillivray and Davie,1984) including pBII111 (that contains DNA coding for 5 bp of 5'-untranslated sequence and DNA coding for residues -43 to 579 of prothrombin) and pBII102 (that contains DNA coding for residues 69 to 582, a stop codon, 119 nucleotides 3' untranslated sequence, and a poly(A) tail). When the Southern blots of bovine genomic DNA were analyzed with the cDNA inserts of both pBII111 and pBII102 as hybridization probes, several fragments were detected with each of the restriction enzymes used (Fig.4A). The intensities of bands were similar to those found when pBII111 DNA was included in the blot at a concentration equivalent to a single copy gene (data not shown). When the 5' or 3' ends of the cDNA were used as hybridization probes, single restriction fragments were detected with many of the enzymes used (Fig.4B, 4C), suggesting that the bovine genome contains a single gene coding for prothrombin.
From these blots, it was estimated that the prothrombin gene was at least 10 Kbp in length.

Fig.4: Southern Blot Analysis of the Bovine Prothrombin Gene

Southern blot analysis of the bovine prothrombin gene. High molecular weight bovine liver DNA was digested with various restriction endonucleases and electrophoresed in a 0.7% agarose gel. After denaturation, the DNA was transferred to nitrocellulose and hybridized to prothrombin cDNA as indicated in part D. In each blot, lane M represents ³²P-labeled size markers (λ DNA cleaved with HindIII). Blot A: Bovine DNA cleaved with BamHI (lane 1), EcoRI (lane 2), HindIII (lane 3), PstI (lane 4), BglII (lane 5), SstI (lane 6). The complete cDNA inserts of pBII111 and pBII102 (MacGillivray and Davie,1984) were used as hybridization probes. Blot B: Bovine DNA was cleaved with BamHI (lane 1), HindIII (lane 2), EcoRI (lane 3), SstI (lane 4), BglII (lane 5), PstI (lane 6). The PstI-XhoI fragment of pBII111 was used as a hybridization probe. Blot C: Bovine DNA was cleaved with HindIII (lane 1), EcoRI (lane 2), BamHI (lane 3). The BamHI-PstI fragment of pBII102 was used as a hybridization probe. D: The restriction map of the cDNA clones pBII102 and pBII111 with 5' and 3' probes indicated (MacGillivray and Davie,1984), cDNA clones are flanked by PstI restriction sites.

2. Cloning Of The Bovine Prothrombin Gene

To study the bovine prothrombin gene more thoroughly, a bovine genomic phage library was constructed by Ross MacGillivray using bovine liver DNA cloned into the BamHI site of λ1059. One million phage from this library were screened by Ross MacGillivray by using the cDNA insert of pBII102 as a hybridization probe. Two independent positives were isolated, λBII1 and λBII2. Restriction endonuclease mapping and Southern blot analysis showed that these phage contained overlapping DNA and represented 25 Kbp of contiguous bovine genomic DNA (Fig.5). Southern blot analysis showed that these phage contained most of the prothrombin gene but lacked the 3' region. The λ1059 library was subsequently rescreened using the 3' BamHI-PstI fragment of pBII102 as a hybridization probe, but these screens only resulted in the reisolation of λBII1.

To isolate the 3' end of the prothrombin gene, 10⁶ phage of a second bovine liver genomic library (in λCharon 28 from Dr. Fritz Rottman, Case Western Reserve University) were screened by using the BamHI-PstI fragment of pBII102 as a probe. Three different clones, λBII3, λBII4, and λBII5, were identified and plaque purified. Restriction enzyme mapping showed that these phage clones overlapped λBII1 and λBII2 at positions that were 3' to the mapped prothrombin gene (Fig.5). λBII3 and λBII4 contained restriction fragments that were consistent with those detected in the genomic Southern blots with the 3' probe.
λBII5 did not contain the 3'-most exons (see Fig.5) but was isolated because it contained exon 12, a part of which is contained in the BamHI-PstI fragment used as a probe. A total of 42.4 Kbp of contiguous genomic DNA was represented by the five phage (λBII1-λBII5). This region contained all restriction enzyme fragments detected in the genomic Southern blot analysis (see Fig.4). The prothrombin gene maps to 15 Kbp in the middle of this cloned DNA (see sections B and C).

Fig.5: Restriction Map of the Bovine Prothrombin Gene

The restriction map was determined by analysis of the five recombinant phage λBII1-5 and subclones derived from them. The location of the prothrombin gene within this region is indicated (see section B). Exons are represented by black boxes and have been numbered from the 5' end of the gene. The scale at the top represents nucleotides in kilobase pairs.

[Map rows: SCALE (KB), EXONS, GENE, BamHI, EcoRI, HindIII, SstI, BglII, XhoI, XbaI, SalI, KpnI, CLONES (λBII1 to λBII5)]

3. Analysis Of The Size Of The Bovine Prothrombin mRNA

To determine the size of the mRNA for bovine prothrombin, total bovine liver RNA was denatured with glyoxal and separated by size on an agarose gel (Thomas,1980).
After transfer to nitrocellulose, the mRNA for bovine prothrombin was detected by hybridization to the 32P-labeled cDNA insert of pBII111 as shown in Fig.6. Autoradiography of the blot revealed a single band which was 2150 ± 100 nucleotides in size (see Fig.6). The prothrombin cDNAs pBII111 and pBII102 contain 1998 nucleotides of coding sequence plus 3' untranslated sequence (MacGillivray and Davie,1984). As poly(A) tails are usually 180-200 nucleotides in length (Perry,1976), this indicated that <50 nucleotides of prothrombin mRNA 5' flanking sequences were absent from the cloned cDNAs. Thus the 5' end of pBII111 must be very near to the site of mRNA initiation.

Fig.6: Northern Blot Analysis of Bovine Prothrombin mRNA
The size of the mRNA of bovine prothrombin was determined after denaturing 20 μg bovine liver RNA with glyoxal and electrophoresis on an agarose gel. The RNA was transferred to nitrocellulose and was hybridized to 32P-labeled pBII111. The molecular weight markers represent the position of λ-HindIII DNA fragments. KB refers to kilobase pairs.
[Marker positions (kb): 9.96, 6.67, 4.25, 2.25, 1.96.]

B. HETERODUPLEX MAPPING

1. Method

Heteroduplex analysis of the cloned bovine prothrombin gene was undertaken by Dr. Kevin Ahern in Dr. George Pearson's laboratory at Oregon State University. These heteroduplex data were useful in determining the sizes of the introns and exons, as well as indicating the possible presence of repetitive DNA elements. Examples of the heteroduplexes of prothrombin cDNAs (pBII111 and pBII102) to genomic clones (λBII1, λBII2, or λBII3, or subclones) are shown in Fig.7.

2. Exons And Introns

The sizes of the exons and introns determined by heteroduplex analysis, and a comparison of these data to the sizes determined by DNA sequence data are shown in Tables II and III. The sizes of all exons were determined both by heteroduplex analysis and by DNA sequence analysis, and were found to be in excellent agreement with each other (Table II). The size of all but two introns could be determined by heteroduplex analysis (Table III). Two of the introns (G and M) are too short to be accurately measured, but were visible (Fig.7) (Irwin et al.,1985). The possibility of other small introns in the gene could not be discounted from the heteroduplex data (Irwin et al.,1985); however, DNA sequence data demonstrated that all introns were detected by heteroduplex analysis. As shown in Table III, there were some differences for those introns which were sized both by heteroduplex analysis and DNA sequence analysis.

Fig.7: Heteroduplex Analysis of the Bovine Prothrombin Gene
Electron micrographs of heteroduplexes formed between cloned bovine genomic DNA (λBII3) and cloned prothrombin cDNA (pBII111). Three representative heteroduplexes are shown together with interpretive drawings below each photograph. The thin line is single stranded DNA, the thick line is double stranded DNA. The bar in each panel represents 1 Kbp. Introns are lettered A through M starting at the 5' end of the gene where intron A is flanked by exons 1 and 2 (see Fig.8). Stem refers to an inverted repeat sequence found in intron F. IR indicates an inverted repeat sequence shared by introns I and L, where a-d locate the position of the IR within each intron (see Table IV). (From Irwin et al.,1985).

Table II: A Comparison of the Sizes of Exons Determined Both by DNA Sequence Analysis and Heteroduplex Analysis

EXON   SIZE FROM DNA     SIZE FROM              REGION³
       SEQUENCE (bp)     HETERODUPLEX (bp)²
1¹      94                98(14)                -43 to -17
2      164               168(18)                -17 to 38
3       25                28(8)                  39 to 47
4       51                53(13)                 47 to 64
5      106               103(13)                 64 to 99
6      137               139(15)                 99 to 145
7      315               317(26)                145 to 250
8      135               137(15)                250 to 295
9      127               117(16)                295 to 337
10     168               170(19)                337 to 393
11     174               159(19)                393 to 451
12     182               160(17)                451 to 511
13      71                65(10)                511 to 535
14     266               227(17)                536 to 582

¹, Exon 1 is measured to the 5' end of pBII111 to allow comparison.
², In the heteroduplex analysis listing the mean length and standard deviation, in parentheses, of the exons in base pairs is shown.
³, Region represents the amino acid residues of prothrombin encoded by each exon.
Heteroduplex analysis data are taken from Irwin et al.(1985).
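The agreement reported in Table II can be checked with a few lines of arithmetic. The following is a minimal Python sketch, assuming the values exactly as transcribed from Table II; the function names and the two-standard-deviation cutoff are illustrative choices and are not part of the original analysis.

```python
# Rows transcribed from Table II:
# (exon, size from DNA sequence, heteroduplex mean, heteroduplex SD), all in bp.
TABLE_II = [
    (1, 94, 98, 14), (2, 164, 168, 18), (3, 25, 28, 8), (4, 51, 53, 13),
    (5, 106, 103, 13), (6, 137, 139, 15), (7, 315, 317, 26), (8, 135, 137, 15),
    (9, 127, 117, 16), (10, 168, 170, 19), (11, 174, 159, 19), (12, 182, 160, 17),
    (13, 71, 65, 10), (14, 266, 227, 17),
]


def deviations(rows):
    """Return {exon: (heteroduplex mean - sequence size) / heteroduplex SD}."""
    return {exon: (het - seq) / sd for exon, seq, het, sd in rows}


def outside_threshold(rows, threshold=2.0):
    """List exons whose two size estimates differ by more than `threshold` SD.

    The default of 2.0 is an arbitrary cutoff chosen for illustration.
    """
    return [exon for exon, z in deviations(rows).items() if abs(z) > threshold]


if __name__ == "__main__":
    for exon, z in deviations(TABLE_II).items():
        print(f"exon {exon:2d}: {z:+.2f} SD")
    print("beyond 2 SD:", outside_threshold(TABLE_II))
```

With these values, only exon 14 differs from its sequence-derived size by more than two standard deviations of the heteroduplex measurement.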
Table III: A Comparison of the Sizes of Introns Determined both by DNA Sequence Analysis and Heteroduplex Analysis

INTRON   SIZE FROM DNA     SIZE FROM              LOCATION¹
         SEQUENCE (bp)     HETERODUPLEX (bp)³
A        342                261(46)               -17
B        ND²                601(62)               38-39
C        227                170(39)               47
D        ND                1504(73)               64
E         98                112(19)               99
F        ND                1381(99)               145
G        293                235(23)               250
H         75               <100                   295
I        ND                1055(94)               337
J        ND                 397(46)               393
K        242                216(29)               451
L        ND                6940(255)              516
M        135               <100                   535-536

¹, Location is the amino acid residue(s) at the intron-exon junction.
², ND, not determined.
³, Mean length with standard deviation in parentheses of the introns in base pairs is listed.
Heteroduplex analysis data are taken from Irwin et al.(1985).

In general the shorter introns (see Table III) were overestimated in size by the heteroduplex analysis relative to DNA sequence analysis, possibly due to difficulties in measuring the short intron loops (see Fig.7). Sizes of the larger introns were in good agreement with sizes predicted from the restriction endonuclease map. The total size of the bovine prothrombin gene was estimated by heteroduplex analysis as 14.9 Kbp (Irwin et al.,1985). This is in close agreement with the size of 15.6 Kbp indicated by DNA sequencing and restriction endonuclease mapping (see section C).

3. Repetitive DNA

Heteroduplex analysis detected the presence of repeated sequences within the genomic clones (see Fig.7). These repeated DNA sequences were mapped to within introns F, I, and L. As shown in Table IV, the sizes and positions of some of these repeated sequences could be determined.
One such element was found as an inverted repeat sequence within intron F, and a second was found as a homologous sequence in introns I and L (Fig.7). The presence of two homologous DNA sequences within the same genomic clone implies that these may be a type or types of repetitive DNA elements.

Table IV: Length and Location of Inverted Repeat Sequences Observed Within the Introns of the Bovine Prothrombin Gene

FEATURE¹    LENGTH²
stem         119(26)
ir           378(27)
a            586(57)
b            129(23)
c           4456(186)
d           2117(109)
loop³       5692(234)

¹, Features are from Fig.7.
², Lengths of DNA expressed as mean with standard deviation in parentheses.
³, Separation between ir sequences.

C. DNA SEQUENCE ANALYSIS OF THE BOVINE PROTHROMBIN GENE

To characterize the gene at the nucleotide level, small fragments of λBII1, λBII2, and λBII4 (or appropriate subclones) were cloned into M13 vectors, and exon-containing M13 phage were identified by plaque hybridization using prothrombin cDNA fragments as hybridization probes. DNA sequences of these exon-containing M13 phage were determined by the chain termination method. The nucleotide sequence of a total of 6.6 Kbp of genomic DNA was determined (Figs.8 and 9). Comparison of this sequence with the prothrombin cDNA sequence (MacGillivray and Davie,1984) allowed the identification of intron and exon sequences, as shown in Fig.9. The 5' end of the mRNA was mapped to nucleotide 1 in Fig.9 (see section D). The nucleotide sequence of 583 bp of 5' flanking sequence was determined in addition to the sequence of each of the 14 exons, and 145 bp of 3' flanking sequence (Figs.8 and 9).
The complete nucleotide sequences of 7 of the 13 introns were determined, although the nucleotide sequence of only the intron/exon boundaries of the larger introns was analyzed. A total of 20 Kbp of DNA sequence data was obtained with the sequence of each nucleotide determined an average of 3 times. All intron-exon junctions were obtained using at least two different M13 clones. All exon sequence was determined at least twice except for a short portion of exon 7. Parts of the intron sequences, however, were determined only once. The DNA sequence confirmed earlier heteroduplex results (see section B) on the number and sizes of exons and introns, as shown in Tables II and III. The positions of the exons in the genomic clones and the sizes of the larger introns were confirmed by the presence of restriction enzyme sites in the DNA sequence that matched the previously determined restriction map (Fig.5). From the DNA sequence data shown in Fig.9 and the sizes of the larger introns as determined by heteroduplex analysis, the total size of the prothrombin gene is approximately 15.6 Kbp. Within experimental error, this value is in excellent agreement with the size of the gene determined by heteroduplex analysis (14.9 Kbp). The sequences found at the intron-exon junctions are given in Table V, and the frequency of occurrence of nucleotides at each position around the junctions is given in Table VI. The sequences agree well with the splice junction consensus sequence found in other genes transcribed by RNA polymerase II (Mount,1982). All introns follow the GT/AG rule of Breathnach and Chambon(1981) except for the donor sequence of intron L that has the sequence GC. The sequence of this region of intron L was determined on two separate alleles of the bovine prothrombin gene (cloned from the two different phage libraries as described in section A). Both alleles gave an identical sequence except that nucleotide 8288 (Fig.9) in the intron was T in one allele and C in the other.

Fig.8: Partial Restriction Map and Sequencing Strategy for the Bovine Prothrombin Gene
Abbreviations used are: B - BamHI; Bg - BglII; E - EcoRI; H - HindIII; K - KpnI; P - PstI; X - XhoI; Xm - XmaI. Exons (1-14) are shown as black boxes under the restriction map. Introns are shown as single lines joining the exons, and are lettered A-M. The direction of transcription is 5' to 3'. The arrows below the gene indicate the orientation and amount of DNA sequence obtained from independent M13 clones. The scale represents kilobase pairs.

Fig.9: Partial DNA Sequence of the Bovine Prothrombin Gene
The sequence was determined by analysis of the M13 clones indicated in Fig.8. The predicted amino acid sequence of bovine prothrombin is given above the nucleotide sequence. The site of transcription initiation is given as nucleotide 1 (G); the 5' flanking sequence is numbered backwards from this point. Possible promoter elements in the 5' flanking sequence include an inverted repeat sequence (marked), a CCAT sequence (boxed) and an ATTAA sequence (boxed) - see text for details. Intron/exon junctions are denoted by vertical arrows. The sizes of the larger introns have been taken from the heteroduplex analysis (Fig.7, and Table III). The putative polyadenylylation signals AATAAA (nucleotides 15,563-15,568) and CAGTG (nucleotides 15,599-15,603) are boxed, and the two polyadenylylation sites are denoted by the solid diamonds. In the protein coding region, the cleavage site giving rise to plasma prothrombin and the two sites of activation of prothrombin by factor Xa are marked.
[Sequence listing: nucleotide sequence with predicted amino acid sequence, covering 583 bp of 5' flanking sequence, exons 1-14, the sequenced portions of introns A-M, and 145 bp of 3' flanking sequence.]

D. MAPPING THE SITE OF mRNA INITIATION

1. Nuclease S1 Mapping

The mRNA initiation site in the first exon was determined by nuclease S1 mapping using a probe that contained part of the 5' flanking sequence, the entire first exon, and part of the first intron. This analysis showed that the size of the first exon was about 100 nucleotides (data not shown).
To determine the precise site of mRNA initiation, a more specific probe was made using a synthetic oligonucleotide to prime DNA synthesis from a genomic DNA fragment cloned into M13. The 32P-labeled probe DNA was released from the M13 DNA with restriction endonucleases, and was isolated by denaturing polyacrylamide gel electrophoresis. The probe consisted of nucleotides -212 to 41 of the bovine prothrombin gene (Fig.9). The probe DNA was hybridized to bovine liver mRNA, and then treated with nuclease S1. The size of the nuclease S1-resistant DNA was analyzed by denaturing polyacrylamide gel electrophoresis (Fig.10). A major DNA fragment was observed together with several minor fragments that were larger than the major fragment. This type of pattern has been observed by others, and may be the result of steric hindrance of the nuclease S1 by the mRNA cap structure (see Weaver and Weissmann,1979). The size of the major band was estimated by comparing its mobility to a chain termination sequencing ladder (Fig.10). This ladder was generated by DNA sequence analysis of the same M13 clone/oligonucleotide that was used to construct the nuclease S1 resistant probe. The major band from the nuclease S1 analysis corresponds to the G at position 1 in Fig.9.

Table V: Nucleotide Sequences at the Intron-Exon Junctions of the Bovine Prothrombin Gene
Upper case letters are exon sequence, lower case are intron sequence. Codon phase refers to the position within codons interrupted by introns: 0 - between codons, I - after the first nucleotide of a codon, II - after the second nucleotide of a codon. Numbers at the intron-exon junctions indicate the position of the intron in the mRNA sequence.

EXON     5' SPLICE DONOR      INTRON   3' SPLICE ACCEPTOR                CODON
NUMBER                                                                   PHASE
1        CATGgtaagg  103      A        cagcctcctcccccctgcagTGTT  104     I
2        CACGgtgagg  267      B        tactcagccttgtttttcagGATG  268     0
3        ACAGgtgaac  292      C        gtgtctctgggtctttctagCTTG  293     I
4        GAAGgtgagg  343      D        ctggactggggtctccgcagGAAA  344     I
5        CAGAgtgagt  449      E        tgagatgctttctattccagAATC  450     II
6        TGCGgtgaga  586      F        tctctctcctcacccaccagGCCA  587     I
7        TGCGgtgaga  901      G        ccgtgtctgggtccctgcagAGGA  902     I
8        GCCGgtaagg  1036     H        cggtcccgcttgccccttagACTG  1037    I
9        CCTGgtgcgt  1163     I        cttccggcttcccgcctcagGCAG  1164    II
10       CCAGgtcgga  1331     J        ctctgctggggtctgcacagGTAT  1332    II
11       CCAAgttggg  1505     K        tccttctcccttccccaaagGCTG  1506    II
12       GCCGgcaagt  1687     L        cactgcggttctctctcaagGTTA  1688    I
13       GAAGgtaagc  1758     M        accccatatttcttcctcagAGCC  1759    0

Table VI: Frequencies of Nucleotides at Intron-Exon Junctions
The frequencies of the different nucleotides at the intron-exon junctions of the bovine prothrombin gene are compared to the consensus (CON) of Mount(1982). Splice junctions are between -1 and +1.

DONOR FREQUENCIES

       -4   -3   -2   -1   +1   +2   +3   +4   +5   +6
G       4    2    1   11   13    0    7    2   12    5
A       1    5    5    2    0    0    4   10    1    2
T       2    0    2    0    0   12    1    0    0    3
C       6    6    5    0    0    1    1    1    0    3
CON     N  A/C    A    G    G    T    R    A    G    T

ACCEPTOR FREQUENCIES

      -20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1  +1  +2  +3  +4
G       1   2   5   2   3   1   4   4   4   4   4   2   0   1   1   0   3   0   0  13   7   3   1   5
A       1   3   1   0   2   2   0   1   0   0   0   1   0   1   0   1   2   2  13   0   4   3   3   4
T       4   4   2   7   2   4   5   2   5   7   5   6   3   6   3   6   4   2   0   0   1   3   7   2
C       7   4   5   4   6   6   4   6   4   2   4   4  10   5   9   6   4   9   0   0   1   4   2   2
CON     Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   Y   N   Y   A   G   G   N   N   N
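The kind of tabulation behind Tables V and VI can be sketched in a few lines of Python. This is a minimal illustration, assuming the 5' splice donor sequences exactly as transcribed in Table V (four exon nucleotides followed by six intron nucleotides); the names DONORS, donor_frequencies and non_gt_donors are hypothetical and are not taken from the original work.

```python
from collections import Counter

# 5' splice donor sequences transcribed from Table V, keyed by intron letter.
# Upper case = last 4 exon nucleotides, lower case = first 6 intron nucleotides.
DONORS = {
    "A": "CATGgtaagg", "B": "CACGgtgagg", "C": "ACAGgtgaac", "D": "GAAGgtgagg",
    "E": "CAGAgtgagt", "F": "TGCGgtgaga", "G": "TGCGgtgaga", "H": "GCCGgtaagg",
    "I": "CCTGgtgcgt", "J": "CCAGgtcgga", "K": "CCAAgttggg", "L": "GCCGgcaagt",
    "M": "GAAGgtaagc",
}

# Position labels relative to the splice junction (between -1 and +1).
POSITIONS = [-4, -3, -2, -1, 1, 2, 3, 4, 5, 6]


def donor_frequencies(donors):
    """Return {position: Counter of nucleotides} summed over all donor sites."""
    return {
        position: Counter(seq[index].upper() for seq in donors.values())
        for index, position in enumerate(POSITIONS)
    }


def non_gt_donors(donors):
    """Return the introns whose first two intron nucleotides are not 'gt'."""
    return [name for name, seq in donors.items() if seq[4:6] != "gt"]


if __name__ == "__main__":
    freq = donor_frequencies(DONORS)
    print("pos    G   A   T   C")
    for position in POSITIONS:
        row = " ".join(f"{freq[position][base]:3d}" for base in "GATC")
        print(f"{position:+3d}  {row}")
    print("donors not starting with GT:", non_gt_donors(DONORS))
```

As transcribed, every donor has G at +1, and intron L is the only donor that does not begin with GT, in line with the GC donor noted in section C.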
2. Primer Extension

As an alternative method of analyzing the 5' end of the bovine prothrombin gene, bovine liver RNA was reverse transcribed using the synthetic oligonucleotide (same as above) as a primer. The primer extension products were then analyzed by denaturing polyacrylamide gel electrophoresis. Two DNA fragments were observed - a major band corresponding to nucleotide 10 in Fig.9 and a minor band corresponding to nucleotide 2 in Fig.9 (Fig.11). Nucleotide 1 (Fig.9) corresponds to a consensus mRNA initiation site (a purine flanked by pyrimidines; Breathnach and Chambon,1981) suggesting that this is the true start site of prothrombin mRNA. In that case, the size of prothrombin mRNA would be 2025 nucleotides which, with a poly(A) tail, agrees well with the size of the mRNA determined by Northern blot analysis (2150 ± 100 nucleotides, section A-3). The primer extension product terminating at position 10 may be the result of stalling of the reverse transcriptase due to secondary structure in the mRNA.

Fig.10: Nuclease S1 Mapping of the Prothrombin mRNA
Autoradiograph of protected DNA fragments separated by electrophoresis after digestion of a labeled single stranded DNA probe complementary to nucleotides -212 to 41 (Fig.9) by nuclease S1. The DNA sequence of the probe is shown beside the protected DNA; the sequence is complementary to that in Fig.9. The major band corresponds to mRNA initiation at nucleotide 1.

Fig.11: Primer Extension Analysis of Prothrombin mRNA
Autoradiograph of products of extension of bovine prothrombin mRNA with avian reverse transcriptase with an oligodeoxyribonucleotide complementary to nucleotides 25 to 41 (Fig.9). The DNA sequence of the 5' end of the bovine prothrombin gene (see Fig.9) is shown in parallel with the extended products. The DNA sequence is complementary to that in Fig.9. The major termination site with avian reverse transcriptase was at nucleotide 10, with a minor site at nucleotide 2.

E. MAPPING REPETITIVE DNA

The presence of repetitive DNA within the genomic clones was detected by hybridization of labeled genomic DNA to the cloned DNA. The hybridization signal detected by autoradiography for each cloned restriction endonuclease fragment will then be proportional to the number of copies of that sequence found within the bovine genome; therefore, fragments containing repetitive DNA sequences will be detectable upon the shortest exposure of the autoradiogram. Figure 12 demonstrates one of these blots, together with the corresponding gel stained with ethidium bromide. With these blots (Fig.12) repetitive DNA elements could be mapped to several locations within, and flanking, the bovine prothrombin gene (Fig.13). As indicated in the previous section, repetitive DNA elements were identified within some of the genomic clones by heteroduplex analysis (Fig.7, Table IV) (Irwin et al.,1985).
These inverted repeats (Table IV) were also detected by Southern blot analysis (Fig.13), confirming the presence of repetitive DNA.

Fig.12: Southern Blot Analysis of Repetitive DNA Within the Bovine Prothrombin Gene
DNA (lanes 1-5, λBII2; lanes 6-10, λBII3; lanes 11-15, λBII4) was cut with various restriction endonucleases and separated on an agarose gel. A, Ethidium bromide stained agarose gel. B, Autoradiograph (100 minutes) of the DNA from A after hybridizing to nick translated bovine genomic DNA (1 x 10⁸ cpm/µg). Lane 1, EcoRI; 2, HindIII; 3, EcoRI-HindIII; 4, EcoRI-BamHI; 5, SstI-BamHI; 6, HindIII; 7, SstI-BamHI; 8, XbaI-BamHI; 9, EcoRI-BamHI; 10, HindIII-BamHI; 11, SstI; 12, BglII; 13, EcoRI; 14, XbaI; 15, XbaI-HindIII; M, marker, λ digested with HindIII. Marker sizes shown on the figure: 23.4, 9.96, 6.67, 4.25, 2.25, 1.96, 0.59.

Fig.13: Map of Repetitive DNA in the Bovine Prothrombin Gene
The restriction map from Fig.5 is shown with areas containing repetitive DNA sequences indicated above the restriction endonuclease cut sites as solid bars. Map rows: scale (Kb), exons 1 to 14, gene, clones, and cut sites for EcoRI, HindIII, SstI, BglII, XhoI, XbaI, SalI, KpnI, and BamHI.

Nucleotides 6,390-6,700 in Fig.9 represent the approximate location of one of the repeated DNAs from heteroduplex analysis (Table IV).
This sequence, by hybridization analysis, was shown to contain a repetitive DNA element. Comparison of the DNA sequence in Fig.9 (especially nucleotides 6,390-6,900) to known bovine repetitive DNA elements (Watanabe et al., 1982; Richardson et al., 1986) failed to find any homology. Thus, the location and identity of the repetitive DNA elements within the bovine prothrombin gene are unknown.

F.  ISOLATION OF A HUMAN PROTHROMBIN cDNA

Degen et al. (1983) used the bovine prothrombin cDNA as a hybridization probe to isolate human prothrombin cDNAs. The hybridizations were performed under conditions of reduced stringency to allow for mismatches between the bovine and human sequences. Three of the positives were characterized. The longest clone, pHII3, contained DNA coding for part of a leader peptide of 36 amino acids as well as the entire coding region of the plasma protein, a 3' untranslated sequence of 97 bp and a poly(A) tail (Fig.14). To isolate a human prothrombin cDNA clone for the remainder of the leader peptide, a different cDNA library (Prochownik et al., 1983) was screened using pBII111 as a hybridization probe. One hundred and twenty thousand colonies were screened by colony hybridization (Benton and Davis, 1977) using the same conditions as Degen et al. (1983). Eight of the positives were characterized further. By restriction endonuclease mapping, pIIH13 appeared to be a full-length cDNA for human prothrombin.

Fig.14: Restriction Endonuclease Map of the Human Prothrombin cDNAs
Restriction endonuclease map of the human prothrombin cDNAs pIIH13 and pHII-3 (Degen et al., 1983). cDNA inserts are flanked by PstI sites created by the cloning procedure. Open bars correspond to the plasma prothrombin coding region, solid bars correspond to the prepro-leader, and hatched bars correspond to the 5' and 3' untranslated sequences. Arrows below pIIH13 refer to M13 clones used for DNA sequence analysis.

G.  PARTIAL DNA SEQUENCE OF pIIH13

The DNA sequence of the 5' region of pIIH13 was determined on both strands, as shown in Fig.15, using the chain termination method (Sanger et al., 1977). Translation of the cDNA sequence using the standard genetic code showed that pIIH13 did indeed contain DNA coding for human prothrombin. Nucleotides 157-327 (Fig.15) encode amino acid residues 1-57 of plasma prothrombin. Nucleotides 49-156 (Fig.15) encode the part of the leader sequence found in pHII3 as reported by Degen et al. (1983). Upstream of nucleotide 49 is an ATG codon (nucleotides 28-30, Fig.15) that is in the same position as the initiator methionine found in bovine prothrombin (MacGillivray and Davie, 1984). Six nucleotides upstream of this ATG codon is a TGA stop codon (nucleotides 19-21, Fig.15), strongly suggesting that the ATG at nucleotides 28-30 encodes the initiator methionine for human prothrombin mRNA. In that case, human prothrombin is synthesized as a precursor containing a leader peptide of 43 amino acid residues. This is the same length as the bovine prothrombin leader peptide.

H.  ISOLATION OF THE HUMAN PROTHROMBIN GENE

1.  Isolation Of Genomic Clones

To isolate DNA coding for the human prothrombin gene, approximately 10⁶ clones of the partial HaeIII/AluI fetal human liver genomic library in λCh4A (Lawn et al., 1977) were screened using ³²P-labeled pIIH13 as a hybridization probe. Three different λ clones were identified and plaque purified. The DNA contained in these clones was characterized by restriction endonuclease mapping (Fig.16).

Fig.15: Nucleotide Sequence of the 5' End of pIIH13
The predicted amino acid sequence of human prepro-prothrombin is shown above the cDNA sequence. The leader peptide has been numbered backwards from the site of cleavage that gives rise to plasma prothrombin.

nt 1-75 (translation begins at nt 28; residues -43 to -28)
Met Ala Arg Ile Arg Gly Leu Gln Leu Pro Gly Cys Leu Ala Leu Ala
CCC TAG TGA CCC AGG AGC TGA CAC ACT ATG GCC CGC ATC CGA GGC TTG CAG CTG CCT GGC TGC CTG GCC CTG GCT

nt 76-150 (residues -27 to -3)
Ala Leu Cys Ser Leu Val His Ser Gln His Val Phe Leu Ala Pro Gln Gln Ala Arg Ser Leu Leu Gln Arg Val
GCC CTG TGT AGC CTT GTG CAC AGC CAG CAT GTG TTC CTG GCT CCT CAG CAA GCA CGG TCG CTG CTC CAG CGG GTC

nt 151-225 (residues -2 to +23)
Arg Arg Ala Asn Thr Phe Leu Glu Glu Val Arg Lys Gly Asn Leu Glu Arg Glu Cys Val Glu Glu Thr Cys Ser
CGG CGA GCC AAC ACC TTC TTG GAG GAG GTG CGC AAG GGC AAC CTG GAG CGA GAG TGC GTG GAG GAG ACG TGC AGC

nt 226-300 (residues 24 to 48)
Tyr Glu Glu Ala Phe Glu Ala Leu Glu Ser Ser Thr Ala Thr Asp Val Phe Trp Ala Lys Tyr Thr Ala Cys Glu
TAC GAG GAG GCC TTC GAG GCT CTG GAG TCC TCC ACG GCT ACG GAT GTG TTC TGG GCC AAG TAC ACA GCT TGT GAG

nt 301-327 (residues 49 to 57)
Thr Ala Arg Thr Pro Arg Asp Lys Leu
ACA GCG AGG ACG CCT CGA GAT AAG CTT
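The reading-frame argument made for pIIH13 in section G can be checked directly against the sequence printed in Fig.15. The following is a minimal sketch in Python, not part of the original analysis: it translates nucleotides 28-156 with the standard genetic code and looks for the nearest in-frame stop codon 5' of the proposed initiator ATG. The function names (`translate`, `nearest_upstream_stop`) are illustrative only, and the sequence is the first 159 nucleotides as transcribed from Fig.15.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Nucleotides 1-159 of pIIH13 as printed in Fig.15 (codons joined into one string).
SEQ = "".join("""
CCC TAG TGA CCC AGG AGC TGA CAC ACT ATG GCC CGC ATC CGA GGC TTG CAG CTG CCT
GGC TGC CTG GCC CTG GCT GCC CTG TGT AGC CTT GTG CAC AGC CAG CAT GTG TTC CTG
GCT CCT CAG CAA GCA CGG TCG CTG CTC CAG CGG GTC CGG CGA GCC
""".split())


def translate(seq, start, end):
    """Translate seq from start to end (1-based, inclusive) in that frame."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(start - 1, end, 3))


def nearest_upstream_stop(seq, atg_start):
    """Return the 1-based start of the closest in-frame stop codon 5' of the ATG."""
    for pos in range(atg_start - 3, 0, -3):
        if CODON_TABLE[seq[pos - 1:pos + 2]] == "*":
            return pos
    return None


stop = nearest_upstream_stop(SEQ, 28)
leader = translate(SEQ, 28, 156)
print(stop)                  # 19
print(len(leader), leader)   # 43 MARIRGLQLPGCLALAALCSLVHSQHVFLAPQQARSLLQRVRR
```

Under these assumptions the closest in-frame stop codon starts at nucleotide 19 (TGA, nucleotides 19-21), leaving six nucleotides (22-27) before the ATG, and the leader peptide is 43 residues long, in agreement with the text.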
One of these clones (λHII1) contained a 5.0 Kbp insert and was identical to the previously isolated genomic clone λ10 (Degen et al., 1983). The other two clones (λHII2, λHII3) overlapped this sequence and contained a total of 23 Kbp of human genomic DNA (Fig.16). Part of the human prothrombin gene has been located in this region by Degen et al. (1983), as shown in Fig.16. The gene has been estimated to be greater than 20 Kbp in size (unpublished results quoted in Nagamine et al., 1984). In that case, the cloned DNA shown in Fig.16 does not contain the complete human prothrombin gene. In addition, the restriction map of the genomic clones shown in Fig.16 failed to account for all the restriction endonuclease fragments detected by genomic Southern blot analysis (Fig.17). Thus, it appears that these genomic clones do not contain the 3' end of the human prothrombin gene.

2.  Partial DNA Sequence Analysis Of The Human Prothrombin Gene

To prove that the genomic clones isolated contained the gene for human prothrombin, partial DNA sequence analysis of a 1.0 Kbp HindIII-EcoRI restriction endonuclease fragment of λHII1 was undertaken (see Fig.16). This fragment was found to contain exons 10 and 11 of the human prothrombin gene, as was expected from the restriction endonuclease map of Degen et al. (1983) (see Fig.16).

Fig.16: Restriction Map of the Human Prothrombin Gene
The restriction map was derived from the three clones λHII1, λHII2, and λHII3. Genomic DNA fragments are flanked by EcoRI restriction sites (E). The exons are indicated as solid boxes, and introns as the thin line; both exons and introns have been placed using data from Degen et al. (1983, 1985) and Davie et al. (1983).

Fig.17: Southern Blot Analysis of the Human Prothrombin Gene
Human genomic DNA (10 µg) was digested with various restriction endonucleases and electrophoresed in an agarose gel. After denaturation, the DNA was transferred to nitrocellulose and hybridized to ³²P-labeled pIIH13. Lane M represents ³²P-labeled size markers comprised of λ DNA cleaved with HindIII. Human DNA was cleaved with HindIII (lane 1), BamHI (lane 2), EcoRI (lane 3), SstI (lane 4), BglII (lane 5), and PstI (lane 6). Marker sizes shown on the figure: 23.4, 9.96, 6.67, 4.25, 2.25, 1.96, 0.59.

I.  ISOLATION OF cDNA CLONES FOR CHICKEN PROTHROMBIN

1.  Conditions Of Screening

To initiate studies of the prothrombin gene in other species, a chicken liver cDNA library (generously provided by Dr. Todd Kirshgessner, UCLA) was screened at low stringency using a ³²P-labeled human prothrombin cDNA (pIIH13) as a hybridization probe. The library was screened on duplicate filters at low stringency in an attempt to detect any weak cross hybridization signal between the human and chicken sequences. Duplicate filters were necessary to detect positive clones due to the high background. From the initial 30,000 recombinant clones
screened, 10 positives were identified, two of which were studied further. One of these, pCII1, contained a 950 bp insert.

2.  DNA Sequence Of pCII1

The entire DNA sequence of pCII1 was determined (nucleotides 650 to 1569, Fig.18). One of the potential translation products of this DNA sequence was found to have approximately 70% amino acid sequence identity with both bovine and human prothrombin in the serine protease domain. This high amino acid identity suggested that this was chicken prothrombin. Amino acid sequence data (generously provided by Dr. Dan Walz, Wayne State Univ.) confirmed that the sequence corresponded to the chicken prothrombin gene. Amino acid sequence data was available for two regions of chicken thrombin: the amino-terminal 27 amino acid residues of the B chain, and a 29 amino acid residue long section within the B chain of thrombin (383 to 411, Fig.18). Of these 56 residues, two differences were observed between the protein sequence predicted by the cDNA and that determined by Walz. Position 310 (Fig.18) was assigned as glutamate by amino acid sequence analysis and histidine by DNA sequence analysis. Position 326 was a phenylalanine by amino acid sequence analysis, while the DNA sequence indicated that it was a tyrosine. Overall, it is clear that this cDNA does code for chicken prothrombin.

Fig.18: DNA Sequence of the Chicken Prothrombin cDNAs
The predicted amino acid sequence is shown above the DNA sequence. The two polyadenylation sites are indicated by triangles, with the polyadenylation signals AATAAA underlined. The catalytic triad residues are His³⁵⁰, Asp⁴⁰⁶, and Ser⁵¹¹. Solid arrows indicate the two factor Xa cleavage sites, and the open arrow indicates the site of cleavage by thrombin.

nt 1-105, residues 93-127
Lys Tyr Pro His Ile Pro Lys Phe Asn Ala Ser Ile Tyr Pro Asp Leu Thr Glu Asn Tyr Cys Arg Asn Pro Asp Asn Asn Ser Glu Gly Pro Trp Cys Tyr Thr
AAA TAT CCA CAT ATA CCT AAA TTT AAT GCC TCC ATT TAT CCT GAC CTC ACT GAG AAC TAC TGC AGG AAC CCA GAC AAC AAC TCA GAA GGT CCA TGG TGC TAC ACA

nt 106-210, residues 128-162
Arg Asp Pro Thr Val Glu Arg Glu Glu Cys Pro Ile Pro Val Cys Gly Gln Glu Arg Thr Thr Val Glu Phe Thr Pro Arg Val Lys Pro Ser Thr Thr Gly Gln
CGA GAC CCA ACA GTG GAA CGG GAA GAG TGC CCC ATT CCA GTA TCT GGT CAA GAA AGG ACA ACA GTT GAG TTC ACT CCG CGG GTC AAA CCA TCA ACC ACA GGG CAG

nt 211-315, residues 163-197
Pro Cys Glu Ser Glu Lys Gly Met Leu Tyr Thr Gly Thr Leu Ser Val Thr Val Ser Gly Ala Arg Cys Leu Pro Trp Ala Ser Glu Lys Ala Lys Ala Leu Leu
CCT TGT GAA TCA GAG AAA GGA ATG CTT TAT ACA GGG ACG CTT TCA GTC ACT GTA TCT GGG GCT AGG TGC CTG CCA TGG GCC TCA GAG AAG GCC AAA GCA TTG CTC

nt 316-420, residues 198-232
Gln Asp Lys Thr Ile Asn Pro Glu Val Lys Leu Leu Glu Asn Tyr Cys Arg Asn Pro Asp Ala Asp Asp Glu Gly Val Trp Cys Val Ile Asp Glu Pro Pro Tyr
CAA GAC AAA ACC ATT AAC CCA GAA GTG AAG CTG CTG GAG AAT TAC TGT CGG AAC CCT GAT GCA GAT GAT GAG GGT GTC TGG TGT GTA ATA GAT GAA CCA CCA TAC

nt 421-525, residues 233-267
Phe Glu Tyr Cys Asp Leu His Tyr Cys Asp Ser Ser Leu Glu Asp Glu Asn Glu Gln Val Glu Glu Ile Ala Gly Arg Thr Ile Phe Gln Glu Phe Lys Thr Phe
TTT GAA TAC TGT GAC CTG CAT TAC TGC GAC AGC TCG CTC GAG GAT GAG AAT GAA CAG GTG GAG GAA ATA GCG GGA CGT ACC ATC TTT CAA GAG TTC AAA ACC TTC

nt 526-630, residues 268-302
Phe Asp Glu Lys Thr Phe Gly Glu Gly Glu Ala Asp Cys Gly Thr Arg Pro Leu Phe Glu Lys Lys Gln Ile Thr Asp Gln Ser Glu Lys Glu Leu Met Asp Ser
TTC GAT GAA AAA ACT TTT GGT GAA GGT GAA GCA GAC TGT GGA ACT CGC CCT TTA TTC GAA AAG AAA CAG ATA ACA GAC CAA AGT GAG AAG GAG CTG ATG GAC TCC

nt 631-735, residues 303-337
Tyr Met Gly Gly Arg Val Val His Gly Asn Asp Ala Glu Val Gly Ser Ala Pro Trp Gln Val Met Leu Tyr Lys Lys Ser Pro Gln Glu Leu Leu Cys Gly Ala
TAC ATG GGA GGC AGA GTT GTA CAC GGG AAC GAT GCA GAA GTT GGA AGC GCC CCC TGG CAG GTG ATG CTC TAC AAA AAG AGT CCT CAA GAG CTG CTG TGT GGT GCC

nt 736-840, residues 338-372
Ser Leu Ile Ser Asn Ser Trp Ile Leu Thr Ala Ala His Cys Leu Leu Tyr Pro Pro Trp Asp Lys Asn Leu Thr Thr Asn Asp Ile Leu Val Arg Met Gly Leu
AGC CTC ATC AGT AAC AGC TGG ATC CTC ACT GCT GCT CAT TGC CTT CTT TAT CCA CCC TGG GAC AAG AAC TTA ACT ACA AAT GAC ATC TTG GTG CGG ATC GGC TTG

nt 841-945, residues 373-407
His Phe Arg Ala Lys Tyr Glu Arg Asn Lys Glu Lys Ile Val Leu Leu Asp Lys Val Ile Ile His Pro Lys Tyr Asn Trp Lys Glu Asn Met Asp Arg Asp Ile
CAT TTC AGC GCA AAA TAC GAA AGG AAT AAA GAG AAA ATT GTT CTG TTG GAT AAA GTC ATC ATC CAT CCT AAG TAC AAC TGG AAA GAG AAC ATG GAC CGA GAT ATT

nt 946-1050, residues 408-442
Ala Leu Leu His Leu Lys Arg Pro Val Ile Phe Ser Asp Tyr Ile His Pro Val Cys Leu Pro Thr Lys Glu Leu Val Gln Arg Leu Met Leu Ala Gly Phe Lys
GCA CTC CTG CAC CTG AAG CGA CCG GTC ATC TTC AGC GAC TAC ATC CAT CCT GTC TGC TTG CCT ACC AAG GAG CTT GTG CAG AGG CTG ATG CTG GCA GGT TTT AAA

nt 1051-1155, residues 443-477
Gly Arg Val Thr Gly Trp Gly Asn Leu Lys Glu Thr Trp Ala Thr Thr Pro Glu Asn Leu Pro Thr Val Leu Gln Gln Leu Asn Leu Pro Ile Val Asp Gln Asn
GGG CGG GTA ACT GGC TGG GGA AAT CTG AAA GAA ACG TGG GCC ACT ACC CCA GAA AAC CTG CCA ACA GTT CTG CAA CAG CTC AAT CTG CCC ATT GTA GAC CAA AAC

nt 1156-1260, residues 478-512
Thr Cys Lys Ala Ser Thr Arg Val Lys Val Thr Asp Asn Met Phe Cys Ala Gly Tyr Ser Pro Glu Asp Ser Lys Arg Gly Asp Ala Cys Glu Gly Asp Ser Gly
ACC TGC AAG GCA TCC ACC AGG GTT AAA GTC ACA GAC AAT ATG TTC TGT GCT GGT TAC ACT CCT GAA GAC TCA AAG AGA GGA GAT GCT TGT GAA GGG GAC AGT CGG

nt 1261-1365, residues 513-547
Gly Pro Phe Val Met Lys Asn Pro Asp Asp Asn Arg Trp Tyr Gln Val Gly Ile Val Ser Trp Gly Glu Gly Cys Asp Arg Asp Gly Lys Tyr Gly Phe Tyr Thr
GGG CCT TTT GTA ATG AAG AAC CCA GAT GAC AAC CGC TGG TAT CAA GTG GGA ATA GTT TCA TGG CGA GAA GGC TGT GAC CGA GAT GGC AAA TAT GGA TTT TAC ACT

nt 1366-1470, residues 548-564 followed by the stop codon
His Val Phe Arg Leu Lys Lys Trp Met Arg Lys Thr Ile Glu Lys Gln Gly STOP
CAC GTA TTC CGC CTG AAA AAA TGG ATG CGA AAA ACC ATT GAA AAA CAA GGA TAG AAG AGA GCT TCC CTT GCT TGT TCT CAG TTC TGC TAC AAT ACT CCA CTT CTT

nt 1471-1575
AAA AAC ATA CAC ATT GAA CAA ATC TTG AAG TGG AAG TTA AAT CCC TGC AAC TTG ACA AAG GAA CGT GTT CCT CCT TGA AAA TAA AAG TTC TCA ACC ATC TTC CTC

nt 1576-1680
CTT GTG TTC ATG CTA AGC TGA ACA CCA CCT GAA TCC ATG CCA TCA CAA TAG CTA GCA GCA CCA ACA CAA CAG CAC CTG CAG TAC TGC TAG TTA AGA TGC TGC CCT

nt 1681-1785
TCA AGT GTT CTC CTC TAC TCT ATC AGC AGT AAC AAT CAA CAG ATT TTA GAC TTC AGA TGA TGG ACT TCA GTC ACA GTA AGC AAG ACG TCC CTT GGA CAC TGT CCA

nt 1786-1890
TTC CCC CCT TCA ACT AAA TTC ATT TTC TGT TCT AGA AAT CTG AAA GGA TAA CAA GCT GGA GAT ACC TAC CCA CCT TAC AAG AAC TGT AGC ATT ATT CAA AAT GCC

nt 1891-1995
ACA TCA AGA CTA AAG CAA CTA TAG CCT TTG TTG ATA AGA CAG ACA TTG TTC TCA GCC ACA ACA GCA GCA ACA AAA TAC CAT CTG TGC TTC TTA CAA AGT TAG TGT

nt 1996-2100
CTT AAG TTA CAG ATG TCA TCT ATG TGC AAC TTA ATG AGG TAC AGA AAT AGG GGG TTT GAA TAG ATG AAG TAA CAC ACG CAT TTC TGC ATA GCA GTA ACT TTC TAT

nt 2101-2205
ATG GCC AAG TAC TGC TGG GAC TTG AAA GTA TAT TTT CCA CTG GCA TAA CTA GAT TCA GAA GGA AGC ACT TCG TAC ACA CAA TTT TCA AAG GTC TTC CAA AGG GCA

nt 2206-2310
GCA TCC GTC ACT GTA CCT ATT TTG TTC TTA TAA AAC TGT TTA GGA TTC ACC CTT AAA AGA AGC CCC ACT TCT TTC ATG AAC TCT TCA GCA AAG ACA CAG AAG TAC

nt 2311-2415
AAT ACT ATT ATA TAG ACT GGC CAA TCT GTT CAG ACC AGT TTT CTC TCA AAC TAA AGA GGG ATT TGG AAG CTA TCT TTG CTC CCC AAA ACA TCA TTC TCA AAT CCC

nt 2416-2520
TCA TCC CTC ACA GTG CCA TCA ACT TAC AGA AAC AAC CAA TAG ACA AAA GTT CTT CCT CCT TAA ATG GAG TAT TAA AGG ACA ATC ACT TCA AAA AAG ATG CTA CAG

nt 2521 to the 3' end
AGA ACT ATC CAA AAT TTG TTG GAA [unreadable] AGT TAT TAA TC

J.  ISOLATION OF LONGER CHICKEN PROTHROMBIN cDNAS

In an attempt to characterize the entire mRNA for chicken prothrombin, 250,000 recombinants of the chicken liver cDNA library were screened with ³²P-labeled pCII1 as a hybridization probe. A total of twenty additional chicken prothrombin cDNAs were identified and plaque purified. This low number of
prothrombin cDNA clones detected in the cDNA library suggests that the abundance of prothrombin mRNA in the chicken liver is lower than in the bovine liver (0.01% of the mRNA in chicken versus 1% of the mRNA in bovine) (see next section). All cDNA clones appeared to include a poly(A) tail, indicating that they had been primed from the 3' end by oligo(dT). cDNA clones greater than 1.0 Kbp in length were mapped for restriction endonuclease sites (see Fig.19), and those shown in Fig.19 were used for further DNA sequence analysis (Fig.18). Two of the clones appeared to have a different 3' end (pCII203, Fig.19, and a similar clone pCII205, not shown). These two cDNA clones contained an extra 1000 nucleotides of 3' untranslated sequence (nucleotides 1570 to 2561, Fig.18), which suggests that an alternative polyadenylylation site is used by the chicken prothrombin gene (Fig.18).

Fig.19: Restriction Map of Chicken Prothrombin cDNAs
cDNA inserts are flanked with EcoRI restriction sites from the cloning procedure. The protein coding region is shown as a solid bar, indicating the approximate length of 5' end sequences. All cDNA clones end with poly(A) tails.

None of the cDNAs contained a full length copy of the prothrombin mRNA, with pCII201 extending the most 5' (see Fig.19). The three cDNAs provided a total of 2565 bp of cDNA sequence (Fig.18). The cDNA sequence allowed the prediction of the sequence of 471 amino acid residues of chicken prothrombin.
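The abundance estimate and the length of the 3' extension quoted above follow from simple arithmetic on the screening numbers and the Fig.18 coordinates. The sketch below, written in Python for illustration only, recomputes both. It assumes that the frequency of prothrombin clones in the library approximates the fraction of prothrombin mRNA in liver poly A+ RNA, which holds only if the library is representative; the function name `clone_frequency_percent` is hypothetical.

```python
def clone_frequency_percent(positives, screened):
    """Percentage of screened recombinants that hybridized to the probe."""
    return 100.0 * positives / screened


# Second screen of the chicken liver cDNA library with pCII1 as probe.
chicken_percent = clone_frequency_percent(20, 250_000)

# Extra 3' untranslated sequence in pCII203 and pCII205 (Fig.18 coordinates,
# both end points included).
extra_3prime_nt = 2561 - 1570 + 1

print(round(chicken_percent, 3))  # 0.008
print(extra_3prime_nt)            # 992
```

A frequency of 0.008% rounds to the 0.01% quoted for chicken liver, and 992 nucleotides is consistent with the "extra 1000 nucleotides" of 3' untranslated sequence.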
Based on Northern blot analysis (see next section) and analogy to the mammalian prothrombin mRNAs, it appears that about 450 nucleotides of chicken prothrombin mRNA are not represented by these cDNAs (see Fig.19). A second liver cDNA library was constructed and screened with a 5' chicken prothrombin cDNA probe. None of the 320,000 randomly primed recombinant clones contained the missing 5' end of the chicken prothrombin sequence.

K.  SIZE ANALYSIS OF CHICKEN PROTHROMBIN mRNA

The size of the chicken mRNA for prothrombin was determined by denaturing chicken liver poly A+ RNA with formaldehyde, separating it on formaldehyde-agarose gels, and transferring the denatured RNA to nitrocellulose. When these blots were hybridized with ³²P-labeled chicken prothrombin cDNA (pCII1), two mRNAs were detected (Fig.20). These mRNAs were about 2200 and 3200 nucleotides in length (Fig.20). This supports the suggestion that two different polyadenylylation signals are used in the chicken liver (see Figs.18 and 19), creating two different 3' ends. Greater than 90% of the mRNA for chicken prothrombin appears to use the first polyadenylylation signal (see Fig.20), as suggested by the isolation of 20 of the 22 cDNAs with this poly(A) tail. Chicken prothrombin mRNA could not be easily detected with total liver RNA, in contrast to bovine prothrombin mRNA (see Fig.6). This suggests that prothrombin mRNA in the chicken liver is much less abundant than in either the bovine or human liver, where total RNA could be used in Northern blot analysis (see Fig.6).

Fig.20: Northern Blot Analysis of Chicken Prothrombin mRNA
Chicken liver poly A+ RNA (20 µg) was denatured with formaldehyde, separated by electrophoresis, and blotted onto nitrocellulose. The blot was hybridized to the chicken prothrombin cDNA pCII1. The two mRNAs for chicken prothrombin are indicated by the arrows, and are approximately 3200 and 2200 nucleotides in length. Marker sizes shown on the figure (lane M): 6.67, 4.25, 2.25, 1.96.

DISCUSSION

A.  CHARACTERIZATION OF THE BOVINE PROTHROMBIN GENE

1.  Isolation Of The Bovine Prothrombin Gene

Preliminary characterization of the bovine prothrombin gene by Southern blot analysis using cloned bovine prothrombin cDNAs as hybridization probes demonstrated that there is probably a single gene for prothrombin in the bovine genome, and that this gene is at least 10 Kbp in length (Fig.4). When the cDNAs were used as hybridization probes to screen bovine genomic λ libraries, a total of five different λ clones were isolated (Fig.5). The DNA in these five clones overlapped and represented a total of 42.4 Kbp of contiguous bovine genomic DNA (Fig.5). These clones contained genomic DNA from only one location, again suggesting that there is only a single gene for prothrombin in the bovine genome. Southern blotting experiments indicated that the bovine prothrombin gene resided in approximately 15 Kbp in the middle of the cloned genomic DNA (Fig.5).

2.  Size Analysis Of The Bovine Prothrombin mRNA

The size of the mRNA for bovine prothrombin was determined by Northern blot analysis (Fig.6).
Prothrombin mRNA was detected by hybridization to labeled bovine prothrombin cDNA, pBII111. These blots demonstrated the presence of a single bovine prothrombin mRNA species of 2150 ± 100 nucleotides in length in liver tissue. This size of the mRNA indicated that the bovine prothrombin cDNAs isolated by MacGillivray and Davie (1984) included nearly the entire mRNA sequence, but were probably lacking about 50 bp at the 5' end of the mRNA.

3.  Sequence Of The Bovine Prothrombin Gene

Further characterization of the bovine prothrombin gene was undertaken by partial DNA sequence analysis. Comparison of the DNA sequence presented in Fig.9 to the cDNA sequence of bovine prothrombin (MacGillivray and Davie, 1984) demonstrates that the bovine prothrombin gene is made up of 14 exons separated by 13 introns. The gene covers approximately 15.6 Kbp of the bovine genome, and is processed into a mRNA of 2025 nucleotides plus a poly(A) tail. As shown in Tables V and VI, all DNA sequences at the intron-exon junctions match the consensus sequence of Mount (1982) except the splice donor of intron L. The splice donor of intron L has GC (nucleotides 8170-71, Fig.9) instead of the consensus GT at its intron-exon junction. This rare variant has also been observed at splice junctions in a few other genes (e.g. Wieringa et al., 1984; Dush et al., 1985). This GC sequence has been observed in two different alleles of the bovine prothrombin gene (isolated from the two different genomic phage libraries). It therefore probably does not represent a cloning or sequencing artifact, suggesting that this rare splice signal is functional in the bovine prothrombin gene.

Comparison of the DNA sequence of the exons of the prothrombin gene to that of the previously isolated cDNAs for bovine prothrombin (MacGillivray et al., 1980; MacGillivray and Davie, 1984) shows a total of 7 nucleotide differences. One of the differences is a deletion of an A residue in the genomic sequence in comparison to the cDNA sequence (between positions 15,482 and 15,484 (Fig.9)) within the 3' untranslated region. Of the remaining six differences, four are changes in the third position of the codons for amino acid residues 157, 180, 182, and 281. None of these results in a change in amino acid residue, and they are probably functionally silent polymorphisms of the DNA sequence resulting in (presumably) neutral changes. The other two differences in the DNA sequence are in the codon for amino acid residue 188 (see Fig.9), and result in a change from the cDNA determined residue histidine (CAC) to the genomic coding sequence for serine (AGC). This residue is one of the amino acid differences between the predicted amino acid sequence determined by cDNA sequence analysis (MacGillivray and Davie, 1984) and amino acid sequence analysis (Magnusson et al., 1975), while genomic sequence analysis confirms the amino acid sequence result for this residue. This amino acid difference at residue 188 may represent an amino acid residue polymorphism, as the human prothrombin amino acid sequence (Degen et al., 1983) has a histidine at this position, which is the same as the bovine prothrombin cDNA sequence (MacGillivray and Davie, 1984). Thus the histidine residue may represent the ancestral residue at this position, which has changed to a serine residue in some cattle.

Heterogeneity also occurs at the 3' end of the bovine prothrombin mRNAs, where there are at least two sites of polyadenylylation. These sites were detected by the comparison of the DNA sequences of several independent bovine prothrombin cDNA clones (MacGillivray et al., 1980; MacGillivray and Davie, 1984). The consensus polyadenylylation sequence AATAAA (Proudfoot and Brownlee, 1976) is found at positions 15,563-15,568 (Fig.9) of the bovine prothrombin gene. This AATAAA sequence is 16 and 18 bp 5' to the two sites of polyadenylylation, a distance similar to that found in other eukaryotic genes (Proudfoot and Brownlee, 1976; Birnstiel et al., 1985). A second possible sequence, CAYTG, which may be involved in polyadenylylation has been observed 3' to the site of polyadenylylation of some genes (Berget, 1984).
A similar sequence, CAGTG, is found 13 and 15 bp 3' of the sites of polyadenylylation in the bovine prothrombin gene (nucleotides 15,599-15,603, Fig.9). Thus, the 3' end of the prothrombin mRNA is at nucleotide 15,584 or 15,586, although termination of transcription probably occurs further 3' at an unknown site.

4.  Site Of mRNA Initiation

Nuclease S1 and primer extension analysis (Figs.10 and 11) both indicate that the 5' end of the bovine prothrombin mRNA is located at or near nucleotide position 1 in Fig.9. The DNA sequence of this site of mRNA initiation corresponds to the consensus start site of a purine flanked by pyrimidines that is found in many genes transcribed by RNA polymerase II (Breathnach and Chambon, 1981). Therefore, this is the most probable mRNA initiation site, although alternate mRNA initiation sites cannot be discounted. An intron in the 5' flanking untranslated region is unlikely, as there is no consensus splice acceptor sequence (Mount, 1982) in or near the 5' flanking sequences. No obvious "TATA" sequence can be seen immediately 5' to the site of mRNA initiation, but an AT rich sequence, ATTAA, is found at the expected distance for a "TATA" sequence (nucleotides -28 to -24, Fig.9), and may function as the "TATA" sequence. Often a "CAAT" sequence is found approximately 100 bp 5' to the site of mRNA initiation (Breathnach and Chambon, 1981). In the prothrombin gene, the sequence CCAT is found at nucleotides -100 to -97.
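The spacings quoted for the 3' end signals in section 3, and the degree to which CAGTG and CCAT resemble their consensus sequences, can be recomputed from the Fig.9 coordinates given in the text. The following Python sketch is illustrative only. The helper `mismatches` and the small IUPAC table are assumptions introduced here (Y denotes a pyrimidine, C or T), not part of the original analysis.

```python
# Subset of the IUPAC nucleotide codes needed for the motifs discussed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG", "N": "ACGT"}


def mismatches(consensus, observed):
    """Return the 1-based positions at which observed does not fit the consensus."""
    if len(consensus) != len(observed):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (c, o) in enumerate(zip(consensus, observed))
            if o not in IUPAC[c]]


# Coordinates quoted in the text (Fig.9 numbering).
AATAAA = (15563, 15568)
POLYA_SITES = (15584, 15586)
CAGTG = (15599, 15603)

# Distance from the last base of AATAAA to each polyadenylylation site.
signal_to_site = [site - AATAAA[1] for site in POLYA_SITES]
# Distance from each polyadenylylation site to the first base of CAGTG.
site_to_cagtg = [CAGTG[0] - site for site in POLYA_SITES]

print(signal_to_site)                 # [16, 18]
print(site_to_cagtg)                  # [15, 13]
print(mismatches("CAYTG", "CAGTG"))   # [3]
print(mismatches("CAAT", "CCAT"))     # [2]
```

With these coordinates the AATAAA signal lies 16 and 18 bp 5' of the two polyadenylylation sites and CAGTG lies 15 and 13 bp 3' of them, as stated. CAGTG differs from CAYTG at a single position (a purine, G, where the consensus has a pyrimidine), and CCAT differs from CAAT at a single position, which is why both are described as similar to, rather than matching, the consensus.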
Like the "CAAT" sequence, the CCAT sequence is flanked by an inverted repeat (Kingsbury and McKnight, 1982) which is G/C rich (nucleotides -121 to -104 and -81 to -63). Thus promoter-like sequences can be found at the appropriate distances from the site of mRNA initiation in the 5' flanking sequence of the bovine prothrombin gene. However, further experiments must be performed to identify the region(s) of the 5' flanking sequence that are involved in the regulation and expression of the bovine prothrombin gene.

5. Intron Positions In The Coding Region

It has been observed in a number of genes that introns are positioned between protein domains (Blake, 1978, 1983a,b; Gilbert, 1978, 1979; Go, 1981, 1983). When the introns of the bovine prothrombin gene are mapped to the amino acid sequence of the protein molecule as shown in Fig.21, some of the introns appear to separate protein domains, especially within the activation peptide. The site of signal peptidase cleavage in precursor prothrombin has not yet been determined, but has been postulated to occur at Gln-19 (Bentley et al., 1986). Intron A (Figs.9 and 21) interrupts the sequence of the prepro-peptide at residue -17, appearing to separate the pre- and pro-peptides. The Gla domain has been identified as extending from amino acid residue 1 to 47 (Jackson and Nemerson, 1980; Patthy, 1985), and is flanked at residue 47 by intron C (Figs.9 and 21). No intron separates the Gla domain and the pro-peptide, further linking the pro-peptide to a functional role in the formation of the Gla region (see Fung et al., 1985; Pan et al., 1985). The two kringles of prothrombin are flanked by introns D, F, and G (Figs.9 and 21), which separate the kringles from each other and from the remainder of the protein molecule. It appears that the N-terminal activation peptide has been constructed of exon domains for a signal peptide, a pro-peptide and Gla region, and two separate kringle domains. Some of these domains are further interrupted by introns (the Gla domain by intron B, and the first kringle domain by intron E (Figs.9 and 21)), which do not appear to separate obvious structural or functional domains.

Fig.21: Introns in the Prothrombin Molecule
The relative positions of introns within the prothrombin amino acid sequence are indicated.

The definition of protein domains, either structural or functional, within the catalytic region of prothrombin is not as clear as for the activation peptide. As shown in Fig.21, introns separate the catalytically important His365, Asp422, and Ser528 residues, as well as separating the two factor Xa cleavage sites from each other and the remainder of the protein. These may represent some form of functional domains. The three dimensional structure of thrombin is unknown, but a proposed model of the structure exists (Furie et al., 1982). Using this model, the introns of the thrombin domain are found to map to the surface of the molecule, as has been observed in other proteins (Craik et al., 1982a,b, 1983).

B. CHARACTERIZATION OF A HUMAN PROTHROMBIN cDNA

The amino acid sequence of human prothrombin has been determined (Butkowski et al., 1977; Walz et al., 1977). Characterization of cDNA clones confirmed this sequence, and demonstrated that precursor prothrombin has a leader sequence of at least 36 amino acid residues (Degen et al., 1983). With the isolation of a new human prothrombin cDNA, pIIH13, the complete sequence of precursor prothrombin has been determined (Fig.14). The sequence of this cDNA shows that precursor human prothrombin, like the bovine precursor, has a prepro-peptide of 43 amino acid residues. Stop codons are observed 5' to Met-43 in the same reading frame (Fig.14), suggesting that Met-43 is the initiating methionine. Optimal alignment of the human cDNA sequence with the bovine genomic sequence (Fig.22) places the first nucleotide of pIIH13 (Fig.14) at the initiating nucleotide of the bovine gene. This indicates that pIIH13 may be a full length cDNA initiating near the first nucleotide of the human prothrombin gene. Comparison of the sequence of pIIH13 with previously isolated cDNA clones (Degen et al., 1983) shows only one nucleotide difference. The codon for residue 13 was CTG compared to CTA (Degen et al., 1983). This represents a silent mutation.

Fig.22: Alignment of the Bovine and Human Prothrombin mRNA Sequences
The nucleotide sequence of the 5' untranslated sequence and prepro-leader sequence of the bovine prothrombin gene (B, nucleotides 1 to 153 of mRNA sequence, Fig.9) is aligned to the cDNA sequence of pIIH13 (H, nucleotides 1 to 156, Fig.16). Gaps (-) are placed to maximize homology of the two DNA sequences. Numbering is from pIIH13 (Fig.16); stars indicate identical nucleotides. The initiator methionine (residue -43) is encoded by nucleotides 28-30, and Arg-1 of the leader sequence by nucleotides 154-156.

B:  GCAGAGTG--CC-GGAGCGGATACACCATGGCGCGCGTCCGAGGCCCGCGGCTGCCTGGC
     *  ****  ** ***** ** **** ***** *** ********  ** **********
H:  CCCTAGTGACCCAGGAGCTGACACACTATGGCCCGCATCCGAGGCTTGCAGCTGCCTGGC
    1            15             30             45             60

B:  TGCCTGGCCCTGGCTGCCCTGTTCAGCCTCGTGCACAGCCAGCATGTGTTCCTGGCCCAT
    **********************  ***** ************************** * *
H:  TGCCTGGCCCTGGCTGCCCTGTGTAGCCTTGTGCACAGCCAGCATGTGTTCCTGGCTCCT
                 75             90            105            120

B:  CAGCAAGCATCCTCGCTGCTCCAGAGGGCCCGCCGT
    *********   ************ *** *** **
H:  CAGCAAGCACGGTCGCTGCTCCAGCGGGTCCGGCGA
                135            150

C. CHARACTERIZATION OF cDNAS FOR CHICKEN PROTHROMBIN

1. Sequence Of The Chicken Prothrombin cDNAs

A total of 22 prothrombin cDNA clones were isolated from a chicken liver cDNA library. Three of these cDNA clones provided 2561 nucleotides of sequence of chicken prothrombin mRNA (Fig.19). From this DNA sequence, the amino acid sequence of 472 residues of chicken prothrombin could be predicted (Fig.18), with approximately 92 of the N-terminal amino acid residues missing (see below).
Three portions of the predicted amino acid sequence of chicken prothrombin could be aligned with amino acid sequence data (D. Walz, unpublished results), at positions 155-185, 308-334, and 381-409 (Fig.18). Differences in amino acid assignment were found at positions 168, 310, and 326, with Lys, Glu, and Phe in the cDNA and Gly, His, and Tyr in the amino acid sequence analysis, respectively (see Fig.18). The differences found at positions 310 and 326 were observed in at least two of the cDNA clones, and are therefore unlikely to be cloning artifacts. The difference at position 168 was observed in only one cDNA clone and may be a cloning artifact. These three differences may represent polymorphisms within the chicken prothrombin sequence. The amino acid sequence data clearly demonstrate that the cloned cDNAs code for chicken prothrombin, or an extremely closely related protein, such as a recently duplicated gene product.

The sequence shown in Fig.18 indicates that chicken prothrombin has a very similar structure to that of the mammalian prothrombins. DNA sequence data demonstrate that the chicken prothrombin molecule is probably made up of a two chain thrombin, and contains two kringles in the activation peptide. The existence of a Gla domain in chicken prothrombin had been demonstrated previously by amino acid sequence analysis (Walz, 1978). The structure of the leader peptide is unknown at present.

2. Alternative Sites Of Polyadenylylation

Northern blot analysis (Fig.20) of chicken liver mRNA demonstrated the existence of two mRNA species for chicken prothrombin. DNA sequence analysis of cDNA clones for chicken prothrombin demonstrated that the difference between these two mRNAs (Figs.18 and 19) is due to the use of two different polyadenylylation signals. The two sites of polyadenylylation were approximately 1000 nucleotides apart (Fig.18), accounting for the difference in size of the mRNAs (Fig.20). The use of these two sites of polyadenylylation does not alter the protein coding region of the mRNAs, but only changes the length of the 3' untranslated sequences.

The poly(A) tail of most mRNAs is 180-200 nucleotides in length (Perry, 1976). Thus, the coding regions of the chicken prothrombin mRNAs are about 3000 and 2000 nucleotides long. To date, 2561 bp of chicken prothrombin cDNA sequence have been determined, indicating that about 450 bp of sequence are absent from the isolated cDNA clones (Fig.19). Approximately 92 amino acid residues of amino acid sequence of plasma prothrombin are absent from the chicken prothrombin cDNA sequence (see below), together with the leader sequence and 5' untranslated sequences.
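The length bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below is only an illustration: the input values are the rounded figures quoted in the text, while the function and constant names are hypothetical and introduced here for clarity.

```python
# Length bookkeeping for the longer chicken prothrombin mRNA.
# Inputs are the rounded figures quoted in the text; names are illustrative.
MRNA_LENGTH_NT = 3000     # longer mRNA species, poly(A) tail excluded
SEQUENCED_CDNA_NT = 2561  # cDNA sequence determined to date
MISSING_RESIDUES = 92     # N-terminal residues of plasma prothrombin not covered
NT_PER_CODON = 3


def unsequenced_budget(mrna_nt, sequenced_nt, missing_residues):
    """Return (absent_nt, nt_for_missing_residues, nt_left_over)."""
    absent = mrna_nt - sequenced_nt
    coding = missing_residues * NT_PER_CODON
    return absent, coding, absent - coding


absent, coding, left_over = unsequenced_budget(
    MRNA_LENGTH_NT, SEQUENCED_CDNA_NT, MISSING_RESIDUES
)
print(absent, coding, left_over)
```

With these inputs the absent sequence comes to 439 nucleotides, which the text rounds to about 450, and the 92 missing residues account for 276 of them. The remainder is what is available for the leader sequence and the 5' untranslated sequence.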
After accounting for the missing 92 amino acid residues, there are about 170 nucleotides of mRNA sequence remaining, which would be adequate to encode a prepro-peptide similar to the mammalian prothrombins (43 amino acid residues correspond to 129 nucleotides, in addition to 40 nucleotides of 5' untranslated sequences). Thus, it appears that there may be only minimal differences between chicken prothrombin and the mammalian prothrombins.

D. COMPARISON OF PROTHROMBIN SEQUENCES

1. Conserved Sequences

An alignment of the amino acid sequences of bovine prothrombin (MacGillivray and Davie, 1984), human prothrombin (Degen et al., 1983; Fig.15) and chicken prothrombin (Walz, 1978; Walz, unpublished results; Fig.18) is shown in Fig.23. Gaps and insertions have been placed to allow for maximum homology with the minimum of deletions and/or insertions, but with retention of common structural features (see Fig.23). There is 87% amino acid identity between the precursor forms of bovine and human prothrombin, and 68% and 65% identity between bovine and chicken, and human and chicken prothrombins, respectively. The most conserved regions between these prothrombins are the Gla region and the thrombin domain. However, the A chain of thrombin is much less conserved than the B chain. In addition, the kringles are much less conserved, at about 60% identity between chicken and the mammals. The least conserved regions are the connecting regions: the region connecting the Gla region and the kringles, the region connecting the kringles, and the region connecting the kringle and the thrombin domain (see Fig.23).

Fig.23: Homologies in Prothrombin Sequences
An alignment of bovine prothrombin (MacGillivray and Davie, 1984), human prothrombin (residues -43 to -37 from Fig.16, residues -36 to 579 from Degen et al., 1983) and chicken prothrombin (residues 1 to 45 from Walz, 1978, residues 56 to 90 from Walz, unpublished results, residues 93 to 564 from Fig.18). The sequences are aligned to give the minimum of insertions and/or deletions. Symbols in the figure mark the sites of factor Xa cleavage, the site of thrombin cleavage, and the active site residues. Dashes (---) represent gaps in the amino acid sequence to allow maximum homology between the sequences; ??? represents uncharacterized amino acid residues which are predicted to exist by analogy to the mammalian prothrombins (deletions and/or insertions may exist).

[Three-row alignment of the bovine (B), human (H) and chicken (C) amino acid sequences, residues -43 to 582; not reproduced.]

This homology implies that the Gla and thrombin B chain are the regions most essential for the common function of the chicken and mammalian prothrombins. The kringles play a somewhat less essential role, and the connecting regions may only function to separate the different domains.

2. Deletions/Insertions

A number of deletions and/or insertions are required for maximal alignment of the prothrombin sequences for chicken, human, and bovine (Fig.23).
Like the regions of low amino acid conservation, many of these deletions and/or insertions are also found in the connecting regions (Fig.23). Other deletions are found throughout the prothrombin molecule. In the human prothrombin sequence, a deletion exists at amino acid residue 4 (Fig.23). This same deletion is found in some of the other vitamin K-dependent coagulation factors (Jackson and Nemerson, 1980), but its significance is unknown. In the kringle regions, deletions of two and one amino acid residues are observed in the chicken sequence (positions 107 and 240, Fig.23). Deletions have been observed in other kringles, and their influence on function is unknown (Jackson and Nemerson, 1980). Two single amino acid deletions occur within the thrombin domain of chicken prothrombin, at positions 475 and 582 (Fig.23). The deletion at position 582 removes the C-terminal amino acid residue; however, the length of this C-terminal region is not conserved between serine proteases (Jackson and Nemerson, 1980). The second deletion at position 475 also occurs at a position of length variability in coagulation factors (Jackson and Nemerson, 1980), as well as being found on the surface of the three dimensional model of thrombin (Furie et al., 1982). These two deletions probably have little effect on the structure and/or function of thrombin.
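How deletions and identities are read off a gapped alignment can be illustrated with a short sketch. The three strings below are the first 17 aligned positions of the mature proteins in one-letter code, transcribed as far as they can be read from Fig.23, and should be treated as illustrative rather than authoritative; γ-carboxyglutamic acid positions are written as E. The function names are hypothetical, and excluding gapped columns from the identity calculation is an assumption, since the counting convention behind the percentages quoted in this chapter is not stated.

```python
GAP = "-"

# First 17 aligned positions of the mature proteins (illustrative transcription).
ALIGNED = {
    "bovine":  "ANKGFLEEVRKGNLERE",
    "human":   "ANT-FLEEVRKGNLERE",
    "chicken": "ANKGFLEEMIKGNLERE",
}


def gap_columns(seq):
    """Return the 1-based alignment columns at which seq carries a gap."""
    return [i for i, aa in enumerate(seq, start=1) if aa == GAP]


def percent_identity(a, b):
    """Percent identity over the columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if GAP not in (x, y)]
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


print(gap_columns(ALIGNED["human"]))
print(percent_identity(ALIGNED["bovine"], ALIGNED["human"]))
print(percent_identity(ALIGNED["bovine"], ALIGNED["chicken"]))
```

In this window the only gap is the human deletion at residue 4. Counting gapped columns as mismatches instead would lower the bovine-human value slightly, which is why the convention needs to be fixed before identities from different alignments are compared.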
Other deletions, as mentioned above, occur in the connecting regions: deletions of two residues in the human sequence at position 164, and deletions of 5, 7 and 1 residue at positions 255, 266, and 270 in the chicken sequence.

None of the deletions/insertions found between the three prothrombin sequences occurs at intron-exon junctions in the bovine (Fig.9) or human (Degen et al., 1983, 1985; Davie et al., 1983) prothrombin genes. Therefore, it appears that none of these insertions and/or deletions were produced by intron sliding (Craik et al., 1982a,b, 1983; see section I). These deletions and/or insertions were probably produced by deletion and/or insertion of short pieces of DNA sequence.

3. mRNA Structure

Prothrombin from bovine, human, and chicken can be encoded by a mRNA transcript of about 2200 nucleotides (Fig.6; Degen et al., 1983; Fig.20), of which 2000 nucleotides are of coding sequence. As discussed above, the mRNAs from the three species probably have similar lengths of 5' untranslated sequences. As the three prothrombin polypeptide chains are of similar lengths (see Fig.23), the length of protein coding region in each of the mRNA transcripts must be similar. However, the lengths of the 3' untranslated regions do differ. Chicken prothrombin mRNA differs from the other two species by using two different sites of polyadenylylation, with each having a separate polyadenylylation signal. In the chicken, the 5' polyadenylylation signal corresponding to the shorter 3' untranslated sequence appears to be equivalent to the polyadenylylation sites of the mammalian prothrombins. Comparison of these three 3' untranslated sequences demonstrates a great deal of length variation: 97 nucleotides in human prothrombin (Degen et al., 1983), 122 nucleotides in bovine prothrombin (Fig.9; MacGillivray and Davie, 1984), and 150 nucleotides in chicken prothrombin (Fig.18). To account for this length variation, a large number of deletions and insertions appear to have occurred. These deletions and insertions complicate the comparison of the sequences of the 3' untranslated regions; indeed, only the AATAAA polyadenylylation signal is clearly conserved. It appears that the 3' untranslated region has no other role in the prothrombin transcripts.

E. COMPARISON OF THE BOVINE AND HUMAN PROTHROMBIN GENES

The gene for human prothrombin has been isolated and partially characterized (Degen et al., 1983, 1985; Davie et al., 1983; Fig.16). It is therefore possible to make some comparisons of the structure and organization of the bovine and human prothrombin genes. The gene for human prothrombin has been reported as >20 Kbp in length (unpublished results quoted in Nagamine et al., 1984), while the bovine gene is only 15.6 Kbp (Fig.9). The increase in size of the human prothrombin gene is visible in the increase in the size of some of the restriction fragments of the human prothrombin gene (for possibly conserved restriction sites, see Figs.5 and 16). The difference in size of restriction endonuclease fragments is also observed in genomic Southern blots (Figs.4 and 17).

The number and size of the exons of the human gene for prothrombin (Degen et al., 1983, 1985; Davie et al., 1983) are the same as for the bovine gene, with all intron-exon junctions at identical locations. The difference in the size of the two genes is due to the presence of larger introns within the human prothrombin gene. Not all of the introns of the human gene are larger. For example, introns E, G, and H (Figs.9 and 16) are of similar length in both genes. In general, it appears that only the larger introns differ in length between the two species. Many of the large introns of the bovine (Fig.13) and human (Degen et al., 1983; Davie et al., 1983) prothrombin genes contain repetitive DNA elements. Alu elements, which are typically 300 bp in length (Jelinek and Schmid, 1982), have been identified within the introns of the human prothrombin gene (Degen et al., 1983; Davie et al., 1983). The major repetitive DNA of the bovine genome is only 120 bp in length (Watanabe et al., 1982). Therefore, if all bovine repetitive DNA elements have been replaced with Alu elements, there would be an increase in the size of the introns between the bovine and human prothrombin genes. Another possible mechanism to increase the size of introns would be to change the number of repetitive DNA elements found within introns. Insertion and deletion of unique DNA sequences could also change the size of introns.
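The scale of the first mechanism can be gauged with a rough calculation. The sketch below assumes a one-for-one replacement of a 120 bp bovine element by a 300 bp Alu element and uses the lower bound of 20 Kbp for the human gene. The element counts passed to the functions are hypothetical inputs, since the number of repetitive elements in either gene is not given here, and the function names are illustrative.

```python
ALU_BP = 300             # typical Alu element length quoted in the text
BOVINE_REPEAT_BP = 120   # major bovine repetitive element length quoted in the text
HUMAN_GENE_BP = 20_000   # lower bound (">20 Kbp") reported for the human gene
BOVINE_GENE_BP = 15_600  # bovine gene length (Fig.9)


def growth_from_replacement(n_elements):
    """Net size increase if n bovine elements were each replaced by one Alu."""
    return n_elements * (ALU_BP - BOVINE_REPEAT_BP)


def replacements_needed(size_gap_bp):
    """Smallest number of one-for-one replacements that covers size_gap_bp."""
    per_element = ALU_BP - BOVINE_REPEAT_BP
    return -(-size_gap_bp // per_element)  # ceiling division


size_gap = HUMAN_GENE_BP - BOVINE_GENE_BP
print(size_gap)
print(growth_from_replacement(10))
print(replacements_needed(size_gap))
```

Each replacement adds only 180 bp, so replacement alone would require on the order of 25 elements to account for a 4.4 Kbp difference. This is consistent with the suggestion that changes in element number and in unique sequence may also contribute.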
In general, it appears that the gene for prothrombin has evolved both in DNA and in amino acid sequence in the 80 million years since mammalian radiation (Culbert,1980). The number and positions of exons and introns have been stable for this 80 million year period, as has been observed in the organization of the porcine and human genes for the urokinase-type plasminogen activator (Nagamine et al.,1985). Thus, any differences found in the organization of serine protease genes within mammals probably reflect changes that occurred during the evolution of the gene rather than the evolution of the species. As a large number of serine protease genes have been characterized, they can be compared to understand the evolution of this gene family.

F. COMPARISON OF SERINE PROTEASE GENES

1. Leader And Gla Region

Several of the coagulation factors (prothrombin, factor IX, factor X, factor VII, protein C, protein S, and protein Z) require vitamin K for their biosynthesis. These proteins undergo a post-translational modification at several glutamic acid residues by a membrane bound, vitamin K-dependent carboxylase. The resulting carboxylated protein binds calcium ions which facilitate the anchoring of the proteins to membranes at the site of injury (see Suttie,1985 for a recent review). The cDNA sequences of prothrombin (Degen et al.,1983; MacGillivray and Davie,1984), factor X (Fung et al.,1984,1985; Leytus et al.,1984), factor IX (Kurachi and Davie,1982; Jaye et al.,1983), factor VII (Hagen et al.,1986), protein C (Long et al.,1984; Foster and Davie,1984; Beckmann et al.,1985), and protein S (Dahlback et al.,1986) have shown that each of these proteins is synthesized as a precursor containing a prepro-leader sequence. As the vitamin K-dependent bone protein osteocalcin is synthesized with a prepro-leader peptide that is homologous to the coagulation factors, it has been suggested that this region may be involved in the carboxylation process (Pan and Price,1985; Pan et al.,1985).

The organization of this region of the bovine prothrombin gene (Fig.9), human factor IX gene (Anson et al.,1984; Yoshitake et al.,1985), and the human protein C gene (Foster et al.,1985; Plutzky et al.,1986) is shown in Fig.24. In the prothrombin and factor IX genes the first three introns are at precisely the same locations (to the same nucleotide), while only the location of the second intron of protein C (corresponding to the first intron of the factor IX and prothrombin genes, see Fig.24) differs, by being shifted upstream (5') by 6 bp, probably by intron sliding (see Fig.24). Intron sliding is a process whereby an insertion or a deletion of coding sequence occurs because of a change in the site of mRNA splicing (Craik et al.,1982a,b,1983). This is caused by the formation or utilization of an alternate splice donor or acceptor sequence within an intron or an exon, which replaces the pre-existing site. This process does not involve the deletion or insertion of DNA sequence within a gene, but does result in a length difference of the final mRNA and protein product. In the protein C gene it appears that a new splice acceptor site was produced 6 bp upstream of the original site (the probable pre-existing splice acceptor AG is still present in the genomic DNA sequence and now is part of the coding sequence 6 bp 3' to the present splice acceptor site) (Foster et al.,1985; Plutzky et al.,1986).

Fig.24: Comparison of the Organization of Exons in the Leader Peptide and Gla Domain
The organization of the leader peptide and Gla exons of the factor IX, protein C, and prothrombin genes. Exons are represented by open bars; 5' untranslated regions are represented by the slashed bars. Codons for the residues at the site of cleavage giving rise to the plasma proteins are denoted by the vertical arrow. Codons for γ-carboxyglutamic acid residues are denoted by the inverted solid triangles. Intron phases are 0, intron between the codons, I, intron after the first nucleotide of the codon, II, intron after the second nucleotide of a codon. The sizes of the exons are indicated by the scale representing 50 bp. The sizes of the introns are not to scale. The direction of transcription is 5' to 3'.
[Figure: exon diagrams for Factor IX, Protein C, and Prothrombin; scale bar 50 bp]

Another example of intron sliding is observed within the family of serine proteases. In the porcine gene for urokinase, two different splice donor sites are used for one intron (Nagamine et al.,1985), only one of which is used in the human gene (Nagamine et al.,1985; Riccio et al.,1985), resulting in a 9 amino acid residue (27 bp) insertion. This may represent an intermediate in intron sliding, where the choice between the two different splice sites has not been made yet. Mutations similar to these changes in splice site in the globin genes account for some of the thalassemias (Busslinger et al.,1981). Often these are caused by frame shifts due to the new splice site. The protein C and urokinase mutations maintain the reading frame of the spliced mRNAs. Note that it is not necessary for every successful intron sliding event to maintain the reading frame, although if the reading frame is changed, the new protein coding region C-terminal to this change will have no homology to the pre-existing protein. This change in reading frame would be similar to the results of some differential splicing, for example at the 3' end of the γ fibrinogen gene, which produces γ and γ' fibrinogens with different C-terminal sequences (Crabtree and Kant,1982). As mentioned above, often these changes in reading frame are deleterious as in some thalassemias (Busslinger et al.,1981). The mutations in the protein C and urokinase genes presumably do not interfere with the protein folding or functions of these proteins.

The three exons containing amino acid coding sequences encode the prepro-leader peptide, and the entire Gla region. Bentley et al.(1986) have characterized an abnormal factor IX gene that results in defective pro-peptide processing.
Amino acid sequence analysis of the pro-factor IX that accumulates in the plasma of such individuals showed that signal peptidase cleaves the factor IX prepro-leader peptide between amino acid residues -19 and -18. By analogy, Bentley et al.(1986) suggested that signal peptidase cleaves the prothrombin prepro-leader peptide between residues -20 and -19, and in a similar position in protein C. In that case, the signal peptide is encoded by a single exon in the prothrombin, factor IX and protein C genes. Interestingly, most of the pro-region and Gla region is encoded by the next exon. Differences exist between factor IX, protein C and prothrombin in the length of the first exon, including the presence of an intron in the protein C gene, and the location of the (presumed) initiator methionine residue. These differences in the first exon are not unexpected as signal peptides often have little homology, even if they have a common ancestor (Rogers,1985).

Overall, the leader and Gla regions of the three genes appear to have evolved from a common ancestor. The Gla region is not a recent addition to prothrombin; this region appears to exist in lamprey prothrombin as this protein can be adsorbed to barium salts (Doolittle et al.,1962; Zytkovicz and Nelsestuen,1976). The observations indicate that the Gla region is at least 450 million years old, suggesting that vitamin K-dependent carboxylation of the glutamate residues of the protein predates the differences found in the remainder of the protein.
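The intron phases defined for Fig.24, and the reading-frame consequences of the intron sliding events described earlier in this section (the 6 bp shift in protein C and the 27 bp insertion in urokinase), both reduce to arithmetic modulo 3. The following is a minimal sketch of that arithmetic; the function names and the convention of counting coding nucleotides from the first nucleotide of the initiator codon are assumptions made for illustration.

```python
def intron_phase(coding_nt_before_intron: int) -> str:
    """Phase of an intron, given the number of coding nucleotides that
    precede it (counted from the first nucleotide of the initiator codon).

    "0"  : intron lies between two codons
    "I"  : intron lies after the first nucleotide of a codon
    "II" : intron lies after the second nucleotide of a codon
    """
    return ("0", "I", "II")[coding_nt_before_intron % 3]


def sliding_effect(shift_bp: int) -> dict:
    """Consequence of moving a splice site by shift_bp of coding sequence.

    The reading frame downstream of the new junction is kept only when
    the shift is a multiple of three; in that case shift_bp / 3 residues
    are gained or lost. Otherwise the frame is shifted and no residue
    count is reported.
    """
    in_frame = shift_bp % 3 == 0
    return {
        "in_frame": in_frame,
        "residues_changed": abs(shift_bp) // 3 if in_frame else None,
    }


print(sliding_effect(6))    # protein C splice acceptor, 6 bp shift
print(sliding_effect(27))   # urokinase alternate donor, 27 bp
print(sliding_effect(4))    # hypothetical shift that breaks the frame
```

Under these assumptions the 6 bp and 27 bp events correspond to in-frame changes of 2 and 9 residues respectively, whereas a hypothetical 4 bp shift would alter the reading frame of all downstream coding sequence.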
Some type of correction event (e.g. gene conversion) may be responsible for maintaining the organization of the leader-Gla region (see section I).

2. Kringle Region

The protein structures known as kringles have been found in several proteins including prothrombin (Magnusson et al.,1975), plasminogen (Sottrup-Jensen et al.,1978), tissue-type plasminogen activator (Pennica et al.,1983), urokinase-type plasminogen activator (Verde et al.,1984), and factor XII (McMullen and Fujikawa,1985; Cool et al.,1985). Genes for several of these proteins have been isolated and characterized, allowing a comparison of the organization of the kringle regions (Fig.25). In each case, the kringles are separated from each other and from the remainder of the protein molecule by introns. All of the introns that separate the kringles from the remainder of the protein or from each other interrupt the reading frame of the mRNAs in the same phase (see Fig.25); in all cases, the intron occurs after the first nucleotide of a codon. One consequence of this is that by duplicating the exon(s) encoding a kringle, duplication of the protein domain occurs because the new spliced product maintains the reading frame. Although this is not common, it is found in some exon-encoded domains such as the epidermal growth factor homologies found in the genes for factor IX (Anson et al.,1984; Yoshitake et al.,1985), protein C (Foster and Davie,1985; Plutzky et al.,1986), and such non-proteases as the LDL receptor (Sudhoff et al.,1985a,b).

Fig.25: Comparison of the Organization of Exons in the Kringle Domain
The organization of the kringle exons in the tissue-type plasminogen activator, urokinase, plasminogen, and prothrombin genes. Details are as in Fig.24 except that the six invariant cysteine residues are denoted by a C above the exons.
[Figure: exon diagrams for tPA #1, tPA #2, Urokinase, Plasminogen #4, Plasminogen #5, Prothrombin #1, and Prothrombin #2]

In the prothrombin, tissue-type plasminogen activator, and urokinase-type plasminogen activator genes, the intron found at the C-terminus of the kringles occurs at about the same nucleotide (see Figs.9 and 25). The small differences in the positions of these flanking introns are probably due to an intron sliding process, as discussed above for the protein C leader sequence. Many of the kringles have internal introns (Fig.25) and the position of this intron varies. The second kringle in prothrombin lacks an intron, while the first kringle contains a single intron (see Fig.9). The kringles found in the tissue-type and the urokinase-type plasminogen activators have an intron at exactly the same location (Ny et al.,1984; Degen et al.,1986; Nagamine et al.,1984; Riccio et al.,1985), which differs from the location of the intron in the prothrombin gene (see Fig.25). Part of the plasminogen gene has been characterized (Malinowski et al.,1984; Sadler et al.,1985) including parts of the fourth and fifth kringles (Fig.25). The organization of each of these plasminogen kringles differs from each other and from other kringles (Fig.25). These differences observed in location of introns cannot be accounted for by an intron sliding process as the differences in intron location are not associated with insertion or deletion of coding sequences.

These differences in intron location can either be explained by the loss of introns from an original gene that contained at least four introns per kringle, or alternatively, by the insertion of introns into kringle-encoding genes. The second possibility of intron insertion appears more likely, because if at least four introns were originally present in the kringle gene, this would result in some extremely small exons (e.g. 6 bp). In addition, there is an absence of any characterized kringle containing more than one of the four introns. This proposal of intron invasion is also supported by data from the serine protease domain (see next section). It has been noted previously that the first kringle of prothrombin is more homologous to the third kringle of plasminogen than to the second kringle of prothrombin (Kurosky et al.,1980).
Because of this homology between the kringles it has been proposed that prothrombin acquired the first kringle from the ancestor to the third kringle of plasminogen (Kurosky et al.,1980), rather than as the result of a duplication of the kringle domain within the prothrombin gene. As discussed previously, the position of the intron of the first kringle of prothrombin differs from those of all other kringle containing genes (Fig.25). Thus it may be possible to use this intron as a marker to follow the evolution and movement of this kringle.

The gene structures shown in Fig.25 suggest that the exons coding for kringles have a common ancestor as a single exon. This exon duplicated several times to form the ancestral exon for the plasminogen activators, plasminogen, and the second kringle of prothrombin (Young et al.,1978; Kurosky et al.,1980; Patthy,1985). After multiple duplication events to form the five kringles of plasminogen, a copy of the third kringle of plasminogen was inserted into the prothrombin gene to become the first kringle found in prothrombin today (Kurosky et al.,1980; Patthy,1985). This is supported by the proposal that introns have invaded some of the kringle exons after the initial duplications, but in some cases, prior to the final duplications.

3. Serine Protease Region

A comparison of the exon organization of the catalytic regions of the prothrombin gene, several serine protease genes, and the haptoglobin gene is shown in Fig.26.
Fig.26: Comparison of the Organization of Exons in the Serine Protease Domain
The organization of the serine protease exons in the haptoglobin, trypsinogen, chymotrypsinogen, proelastase, kallikrein, α and γ subunits of nerve growth factor, tissue-type plasminogen activator, urokinase, complement factor B, factor IX, protein C, and prothrombin genes. Intron phases are as in Fig.24. The scale represents 100 bp. Codons for the residues at the site of activation of the zymogens are denoted by the vertical arrows; complement factor B and the γ subunit of nerve growth factor are not activated in this way. The codons for the active site residues histidine, aspartate, and serine are denoted by H, D, and S respectively; in haptoglobin, however, the corresponding codons code for lysine (K), aspartate (D), and alanine (A) residues. The 3' end of the haptoglobin gene has not been characterized. The 3'-most exons of factor IX, tissue-type plasminogen activator, and urokinase have been abbreviated; they are 1935 bp, 914 bp, and 1119 bp in size respectively. The exons coding 5' untranslated regions are indicated by the dotted boxes, and the 3' untranslated regions by the slashed bars. A unique coding region of complement factor B is indicated by the solid box.
[Figure: exon diagrams for Haptoglobin, Trypsinogen, Chymotrypsinogen, Proelastase, Kallikrein, αNGF, γNGF, tPA, Urokinase, Factor B, Factor IX, Protein C, and Prothrombin; transcription 5' to 3'; scale bar 100 bp]

As in most serine proteases, the catalytic triad residues His366, Asp422, and Ser528 of prothrombin are located on separate exons (see Fig.26). However, none of the introns in the prothrombin gene are in similar positions to any other gene reported (see Fig.26). The serine protease genes can be divided into five different types based on the intron positions shown in Fig.26. The first group consists of the haptoglobin gene where no introns interrupt the catalytic region. The second group comprises the genes for the pancreatic protease zymogens trypsinogen, chymotrypsinogen, and proelastase, the maxillary gland and kidney kallikreins, the α and γ subunits of nerve growth factor, and the tissue-type and urokinase-type plasminogen activators. Although there are also differences (see Fig.26), each of these genes contains (i) an intron just 3' of the codon for the active site histidine, (ii) an intron 3' to the codon for the active site aspartate, and (iii) an intron 5' to the codon for the active site serine. All of these introns interrupt the coding sequences at identical locations, in the same phase, in each of the genes (Fig.26).
The third group consists of the complement factor B gene which contains 7 introns within the catalytic region (Fig.26). The fourth group consists of the factor IX and protein C genes which have two introns resulting in a large exon that contains both the active site aspartate and serine residues. Lastly, the prothrombin gene constitutes the fifth group as it is different from all the other genes discussed (Fig.26).

This grouping is not only representative of the similar gene organizations but is also consistent with amino acid sequence homologies (Young et al.,1978; Hewett-Emmett et al.,1981), suggesting that the ancestral genes for each of these five types duplicated early in the evolution of the serine proteases. The ancestral gene probably duplicated early in the evolution of the eukaryote (Young et al.,1978), and certainly prior to the emergence of the first vertebrates 600 million years ago. Therefore, either enough time has passed to hide the ancestral gene organization by movement of introns, or introns have entered these genes after their divergence, and are therefore found at different locations in the different genes.

As in the kringle domain, differences in the intron positions are most likely due to intron insertion. Many introns
the  the  trypsinogen  As d i s c u s s e d  d i f f e r e n c e s are p r o b a b l y not l i k e l y the  of the genes, but  due  and  p a i r s , but  the  intron insertions.  i n t r o n of or the  f a c t o r IX and  r e t e n t i o n of a n c e s t r a l a l s o be due  Some of  the  the  second  This could  the  be  i n t r o n s by these gene  to h o r i z o n t a l t r a n s f e r of  mechanism such as gene c o n v e r s i o n  similar  are more  second i n t r o n of f a c t o r IX and  i n t r o n between the genes a f t e r d u p l i c a t i o n and  possible  these  shared between genes from the d i f f e r e n t first  i t may  complement  previously,  second i n t r o n of complement f a c t o r B ( F i g . 2 6 ) . by  in  to i n t r o n s l i d i n g but  independent  i n t r o n of p r o e l a s t a s e ,  explained  often  the  divergence by a  (Sharp,1985) .  It i s also  that these i n t r o n s were both i n s e r t e d by chance i n very locations.  insertion  In t o t a l ,  the evidence p o i n t s  in order to e x p l a i n  some i n t r o n l o s s may  to i n t r o n  the observed d i f f e r e n c e s , although  have o c c u r r e d  a f t e r some of the  introns  were i n s e r t e d . Genes from i n v e r t e b r a t e introns  (Gilbert,1985;  accounted f o r e i t h e r by species,  or by  species  g e n e r a l l y have fewer  G i l b e r t et al.,1986). l o s s of  This c o u l d  i n t r o n s in the  l e s s i n s e r t i o n of  invertebrate  i n t r o n s within  these  species.  A gene for a s e r i n e p r o t e a s e homologous to trypsinogen isolated  from the  et a l . , 1 9 8 5 ) . represent  invertebrate  T h i s gene l a c k s  has  D r o s o p h i l i a melanogaster i n t r o n s and  be  therefore  been  (Davis  may  a copy of the a n c e s t r a l , e a r l y eukaryote i n t r o n - l e s s  s e r i n e protease gene.  
duplication to form the five families of serine protease genes provided the distinctive organizations seen today (Fig.26). Duplications during the intron invasion process would result in genes sharing some introns, but differing in others, as is observed in the trypsinogen-like genes (Fig.26).

G. ORIGIN OF INTRONS AND EXON SHUFFLING

1. Origin Of Introns

It has been proposed that introns have been present since the beginnings of life (Blake,1978; Doolittle,1978; Darnell and Doolittle,1986; Gilbert,1985; Gilbert et al.,1986) but present evidence from the flavin-containing enzymes does not support this (Longby; Stone et al.,1985; Rogers,1985; Duester et al.,1985; McKnight et al.,1986; see Introduction). Here, additional evidence is presented indicating that at least the majority of introns may have become inserted into the genes for serine proteases well after the origin of life. The presence of introns in the distantly separated branches of life (eukaryotes, prokaryotes, and archaebacteria; Darnell and Doolittle,1986) may possibly be due to multiple origins of introns and/or transfer of infective, ancestral introns between the kingdoms of life.

Despite the uncertainty of the origin of introns, it is clear that they have been invasive and mobile.
The invasion process started early in eukaryotic evolution, as shown by the common intron found in the fungal, plant and animal genes for triose-phosphate isomerase (McKnight et al.,1986; Gilbert et al.,1986). This process appears to have been completed at least 450 million years ago, perhaps because of loss of mobility. Evidence for the loss of mobility comes from comparison of genes which are known to have duplicated in the last several hundred million years, for example the globin genes (Edgell et al.,1983; Darnell and Doolittle,1986) and the insulin genes (Perler et al.,1980). Both intron sliding and intron loss have occurred (Perler et al.,1980), but these events can be explained by mechanisms unrelated to the mobilization of introns (intron insertion). Indeed, little change is observed in the gene organization of the triose-phosphate isomerase in plants and animals, a divergence of at least one billion years (Marchionni and Gilbert,1986; Gilbert et al.,1986). No gain of introns has been clearly demonstrated to have occurred during the last 450 million years in the vertebrates. The differences between the triose-phosphate isomerase genes of the vertebrates and plants (Marchionni and Gilbert,1986) could be due to intron insertion in the plant lineage or due to intron loss in the vertebrate lineage; present evidence cannot distinguish between these two possibilities. The flavin-containing enzymes, which duplicated prior to the divergence of the eukaryote, prokaryote, and archaebacteria lineages, do not share any introns though many appear in similar locations (Duester et al.,1986; see Introduction).

Other gene families duplicated later in the evolution of life, but early in the evolution of the eukaryote. These gene families, such as the fibrinogen genes (Crabtree et al.,1985) and the serine protease genes (see above), show varying degrees of intron sharing which is proportional to the time since the genes duplicated and diverged. The organization of the genes of these different gene families can best be explained by the invasion of introns into these genes over time rather than the movement and loss of introns. The time period of intron invasion probably initiated before the divergence of the filamentous fungi from plants and animals, as observed by the shared intron of the triose-phosphate isomerase gene of Aspergillus, maize, and chicken (McKnight et al.,1986). This divergence occurred more than 1.2 billion years ago (Gilbert et al.,1986), and intron invasion was completed prior to the divergence of the vertebrates, which occurred at least 450 million years ago.

2. Exon Shuffling

Exon shuffling as proposed by Gilbert (1978,1979) provides a role for introns in the evolution of genes, but not a role for introns themselves (Crick,1979; Rogers,1985; Cavalier-Smith,1985). Today, the processes of intron splicing are much better understood (Keller,1984; Ruskin and Green,1985), yet we still do not know the function of introns. Despite this, it is clear that introns have had a role in the evolution of many genes (Sudhoff et al.,1985b; Gilbert,1985; Gilbert et al.,1986). As discussed previously, various parts of the prothrombin molecule have homology to other proteins in both amino acid sequence and gene organization. This homology cannot be accounted for just by gene duplication events as only some, but not all, protein domains are shared by other individual proteins. Shuffling of exons would account for the observed patterns, especially as seen for the kringle structures (see above). The first kringle of prothrombin shares the highest amino acid homology not with the second kringle of prothrombin, but with the third kringle of plasminogen (Kurosky et al.,1980). This implies that prothrombin acquired the first kringle from plasminogen rather than from itself. The best mechanism for this is an exon shuffling event which copied the third kringle of plasminogen and inserted it as the first kringle of prothrombin. The Gla region appears to have a common ancestor for all the vitamin K-dependent coagulation factors, and its initial source is unknown.
It appears to have been gained by an exon shuffling type event with acquisition of the pro-peptide and Gla as one event, and even possibly acquisition of the pre-peptide as an additional event (or both together). Gene correction events appear to have had a role in maintaining the organization of the leader and Gla region in the face of intron insertion after the duplication of the prothrombin ancestor and the factor IX-like gene ancestor. This would then explain the identical gene structures in contrast to the differing organization of the serine protease domain (see Figs.24 and 26).

H. EVOLUTION OF THE ACTIVE SITE SERINE CODON

Amino acid sequence, DNA sequence, and gene structure data can be combined in an attempt to explain the evolutionary relationships within the family of serine proteases, and within the subfamily of vitamin K-dependent coagulation factors in particular. Amino acid sequence comparisons have produced several evolutionary trees of the serine proteases (Young et al.,1978; Hewett-Emmett et al.,1980; Patthy,1985). One common feature of these relationships is that the vitamin K-dependent coagulation factors are more closely related to each other than to other serine proteases. It has been suggested that the ancestor of the vitamin K-dependent serine proteases diverged from the digestive serine proteases very early in the history of the family of serine proteases (Young et al.,1978).
Serine proteases contain a conserved active site sequence of Gly-Asp-Ser-Gly-Gly, with the Ser being the active site serine residue. Serine has six possible codons: TCG, TCA, TCT, TCC, AGT, and AGC. These can be separated into two types: TCN, where N is G, A, T, or C, and AGY, where Y is T or C. Serine is unique in the genetic code in that it is not possible to go from one codon to all other codons by single base pair changes whilst still retaining the ability to code for the same amino acid residue. To change from the TCN type codon to an AGY type codon, at least two nucleotide changes are required, and if this occurs as single base pair changes, then an intermediate sequence will have to exist which does not code for serine. If such a change occurs at the active site of a serine protease, the protease would lose its catalytic activity due to the absence of the active site serine residue. The activity would be restored when the serine residue was restored.

Both TCN and AGY types of codons for the active site serine exist within the family of serine proteases, as determined by cDNA and gene sequence analysis. The AGY type codon is found in a small number of serine proteases including the vitamin K-dependent coagulation factor cDNAs characterized to date.
These include factor VII (Hagen et al., 1986), factor IX (Kurachi and Davie, 1982; Jaye et al., 1983), factor X (Fung et al., 1984, 1985; Leytus et al., 1984), protein C (Long et al., 1984; Foster and Davie, 1984; Beckmann et al., 1985), and prothrombin (MacGillivray et al., 1980; Degen et al., 1983; MacGillivray and Davie, 1984; Fig.18). The only other serine protease known to have the AGY type codon at its active site is plasminogen (Malinowski et al., 1984). All of the other serine proteases have the TCN type active site codon, including the digestive zymogens (Craik et al., 1985; Bell et al., 1985; Swift et al., 1985), fibrinolytic zymogens (Pennica et al., 1983; Verde et al., 1984), complement factors (Campbell et al., 1983), the protein processing proteases of the kallikrein family (Mason et al., 1984; Evans and Richards, 1985; Ashley and MacDonald, 1985; van Leeuwen et al., 1986), cytolytic proteases (Gershenfeld and Weissman, 1986; Lobe et al., 1986), and the non-vitamin K-dependent coagulation factors XII and XI and prekallikrein (Cool et al., 1985; Fujikawa et al., 1986; Chung et al., 1986). A gene for a digestive serine protease in Drosophila melanogaster has been isolated (Davis et al., 1985), and this gene also has the TCN type serine codon.

The distribution of the types of serine codons suggests that the ancestral serine codon for the serine protease gene was of the TCN type, which is now found in both vertebrates and invertebrates.
If this is true, then during the evolution of the vitamin K-dependent coagulation factors, the codon for serine changed from the TCN type to the AGY type, and if this occurred as two separate base pair changes (which appears to be more likely than a simultaneous double mutation), then an intermediate protein existed which had no serine protease activity. It is interesting to note that haptoglobin is homologous to serine proteases (Kurosky et al., 1980) but is inactive at least in part because of mutations in its active site. It is possible that haptoglobin is a descendant of the non-serine protease coagulation factor intermediate. This would be possible if the gene for the non-serine protease intermediate was duplicated prior to the second point mutation (the one to restore serine protease function) with one product becoming the coagulation factors, and the second product becoming haptoglobin.

The reason that plasminogen also has the AGY type serine codon is not clear. Amino acid sequence homology indicates that plasminogen is only distantly related to the vitamin K-dependent coagulation factors (Hewett-Emmett et al., 1980), and is the result of a separate duplication from the one giving rise to the digestive zymogen ancestor. This implies that the serine codon in plasminogen changed independently of the serine codon of the vitamin K-dependent coagulation factors.
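
The codon argument in this section can be checked by direct enumeration. The following is a minimal sketch, not part of the original analysis: it assumes the standard genetic code, and the helper names (`hamming`, `two_step_paths`) are illustrative only. It confirms that at least two base changes separate every TCN codon from every AGY codon, and lists the amino acids encoded by the possible single-step intermediates.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in T, C, A, G order at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SER = sorted(c for c, aa in CODE.items() if aa == "S")
TCN = [c for c in SER if c.startswith("TC")]   # TCA, TCC, TCG, TCT
AGY = [c for c in SER if c.startswith("AG")]   # AGC, AGT

def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))

def two_step_paths(start, end):
    """Intermediate codons (and their amino acids) one change from each end."""
    return [(mid, CODE[mid]) for mid in CODE
            if hamming(start, mid) == 1 and hamming(mid, end) == 1]

# Fewest base changes needed to go from any TCN codon to any AGY codon.
min_changes = min(hamming(a, b) for a in TCN for b in AGY)

# Amino acids encoded by the intermediates on the shortest routes.
intermediates = {aa for a in TCN for b in AGY if hamming(a, b) == min_changes
                 for _, aa in two_step_paths(a, b)}

print(min_changes, sorted(intermediates))   # prints: 2 ['C', 'T']
```

Under these assumptions the shortest routes (TCT to AGT, and TCC to AGC) pass through a threonine (ACT, ACC) or cysteine (TGT, TGC) codon, so any stepwise route replaces the active site serine in the intermediate.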
The rates of evolution of the amino acid sequence of most of the serine proteases are unknown, and therefore amino acid sequence homology may not provide the best description of the evolutionary relatedness of these proteins. Unfortunately, the gene structure of plasminogen is not completely known (Sadler et al., 1985), and cannot be used at this time to aid in solving its relationships to the vitamin K-dependent coagulation factors.

I. MODEL OF THE EVOLUTION OF THE VITAMIN K-DEPENDENT COAGULATION FACTORS

A model for the evolution of the vitamin K-dependent coagulation factors is shown in Fig.27. In this model, amino acid sequence homologies, change(s) in active site serine codon, and gene structural organization are all used in an attempt to describe the pathway of evolution of the coagulation factors. It is clear that the vitamin K-dependent coagulation factors are a separate branch of the family of serine proteases, as shown by their amino acid sequence homology (Hewett-Emmett et al., 1980), and their common active site serine codon (see above). Based on amino acid sequence homologies, the ancestor to the vitamin K-dependent coagulation factors diverged from the digestive protease zymogens, probably after a gene duplication. This probably occurred early in eukaryotic evolution and greater than one billion years ago (Young et al., 1978). It is also clear that the prothrombin and factor IX-like genes diverged early in eukaryotic evolution greater than 600 million years ago (Young et al., 1978), as is evident from the differences found in gene organization of the prothrombin and factor IX-like genes (see Fig.26). If the haptoglobin gene is also derived from the vitamin K-dependent coagulation factor ancestor, this would give this branch of the serine protease family a third type of gene organization, indicating the great age of this branch. The sharing of intron position between the genes for triose-phosphate isomerase between plants and animals (Marchionni and Gilbert, 1986; Gilbert et al., 1986) suggests that the different branches of the serine protease family diverged from each other more than one billion years ago. This ancient age of the duplications may explain the different gene organizations due to intron insertions. Amino acid homology (Young et al., 1978; Hewett-Emmett et al., 1980) would not disagree with the dates of these duplications.

Fig.27: A Model for the Evolution of the Vitamin K-Dependent Coagulation Factors. Rectangles represent the serine protease domain, with S for active serine protease, and X for altered active site serine residue. Triangles with γ represent the leader-Gla domain. Squares represent the kringles, numbered as in mammalian prothrombins. Circles with E represent the epidermal growth factor homologies (see text for details).

[Figure 27 not reproduced. Labels recoverable from the diagram: >1 x 10^9 yrs; duplication; point mutation; gene fusion; >250 x 10^6 yrs; duplications; TRYPSINOGEN ETC; HAPTOGLOBIN; PROTHROMBIN; FACTOR VII; FACTOR IX; FACTOR X; PROTEIN C; PROTEIN Z.]

Fig.
27 demonstrates the early gene duplication separating the coagulation factor ancestor from the digestive protease zymogen (e.g. trypsinogen) gene, probably more than one billion years ago. To change the active site serine codon, at least two point mutations are required. The first mutation preceded the duplication that led to the separation of haptoglobin and the coagulation factors. The second mutation restored serine protease function, but occurred in only one of the two products of the gene duplication. These two point mutations probably occurred close together in time, so that no other point mutations could alter essential parts of the protease domain and prevent possible function as a serine protease once the serine codon was restored. All vitamin K-dependent coagulation factors have a Gla region (Jackson and Nemerson, 1980), so acquisition of this region probably predates other changes in the molecules (though later exon shuffling events may also be involved with acquisition of this region in some genes). All Gla containing genes characterized to date have prepro-leaders (Fung et al., 1985; Pan et al., 1985), implying that the prepro-leader was acquired together with the Gla domain. Exon organization of this region (Fig.24) supports this proposal, though the pre-peptide may have been acquired at a separate time (the pre-peptide may have been part of the original protease gene to allow secretion, e.g. as in trypsinogen).
Exon shuffling may account for the acquisition of this domain, and this requires introns to be present so that the intron invasion process must have started (but not finished, see below).

Duplication of the Gla containing protease gene would then allow the formation of prothrombin and the factor IX-like genes. The Gla domain appears to be found in all prothrombin molecules isolated to date, including the lamprey (Doolittle et al., 1962); thus the Gla region must have been acquired at least 450 million years ago. After duplication of the Gla containing protease gene, intron invasion continued to produce the distinctive organizations of the serine protease domains (see Fig.26). The Gla region retained its particular organization while intron invasion occurred. This implies that a homogenization process of the Gla region may have been involved (e.g. gene conversion) to retain this organization, similar to the processes often seen with repeated DNA sequences (Dover, 1982).

Additional protein domains were acquired by both the prothrombin and the factor IX-like genes (prothrombin acquired the two kringles, and the factor IX-like genes acquired the epidermal growth factor homologies). In both genes these domains are found as discrete units made up of one or two exons. These domains are organized such that insertion of the exon(s) would not create frame shifts, but would result in a larger mRNA using the same reading frame (Fig.25).
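
The reading-frame point can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not data from the genes discussed: the sequences are hypothetical, and it assumes the inserted exon(s) are flanked by introns that interrupt codons at the same position, so that the inserted length is a multiple of three. In that case the downstream codons are read in the same triplets as before; otherwise the frame shifts.

```python
def codons(seq):
    """Split an mRNA-like string into successive triplets from position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def insert_exon(upstream, module, downstream):
    """Join the upstream exon, the inserted module, and the downstream exon."""
    return upstream + module + downstream

# Hypothetical toy exons; the intron falls after base 7, inside a codon.
upstream, downstream = "ATGGCTG", "AACCTTGA"

original = codons(upstream + downstream)
# Inserted module of 9 bases (a multiple of three): frame is kept.
in_frame = codons(insert_exon(upstream, "GGTTGCCAC", downstream))
# Inserted module of 8 bases: every downstream codon is shifted.
out_of_frame = codons(insert_exon(upstream, "GGTTGCCA", downstream))

print(original[-2:], in_frame[-2:], out_of_frame[-2:])
```

Only the codon spanning the insertion site changes in the in-frame case; the codons wholly downstream are unchanged, which corresponds to a larger mRNA read in the same frame.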
Exon shuffling appears to be the mechanism by which one copy of each of these domains was inserted into the respective gene. In both the prothrombin and factor IX-like genes, this new domain is found twice, and in both cases it does not appear that the two copies of the domain are the result of a partial gene duplication (Patthy, 1985). It appears that in both genes, a second copy of the same domain was inserted independently. However, two independent insertions of the same sequence into the same gene seem extremely unlikely. A possible mechanism for this occurrence is a partial gene duplication of this repeated domain followed at a later time by a gene conversion type event with an unrelated gene. This would mask the internal gene duplication event and increase the probability of an exon shuffling event. In prothrombin it appears that the first kringle acquired was kringle 2 followed by kringle 1, which was acquired from kringle 3 of plasminogen (Kurosky et al., 1980). In the factor IX-like genes, the order of the acquisition of the EGF homologies, or if both were acquired at the same time, is unknown.

Further amino acid substitutions and to a lesser extent insertions and deletions (see Fig.23) resulted in the prothrombin genes found today. As demonstrated by the conserved protein structure of prothrombin in mammals and birds, the structure of prothrombin found today was completed at least 250 million years ago. The factor IX-like gene has undergone many gene duplication events to produce factors VII, IX, X, protein C, and protein Z. As factors VII, IX, and X are found in both chickens and mammals (Didisheim et al., 1959; Walz et al., 1974), it appears that the structure of the factor IX-like genes was completed at least 250 million years ago, as were at least some of the gene duplication events. Further characterization of the genes in the family of vitamin K-dependent coagulation factors, and attempts to identify these factors in other classes of vertebrates (Jackson and Nemerson, 1980) and non-vertebrate chordates would assist in clarifying the pathways of evolution of the coagulation factors, and would date the steps in their evolution more precisely.

J. EVOLUTION OF THE BLOOD COAGULATION SYSTEM

It seems to be possible to follow the evolutionary histories of individual coagulation factors (see above and Fig.27), but the evolution of the coagulation system is more clouded. It is difficult to imagine a modern vertebrate without blood coagulation (the loss of one blood coagulation factor to cause hemophilia is damaging enough). Clearly, some type of coagulation system must have evolved prior to or with the emergence of vertebrates 600 million years ago. This pre-vertebrate form of life is unknown, and therefore presents difficulties in attempting to trace the origin and development of the blood coagulation system.

It has been proposed that the coagulation factors evolved from proteins which previously existed in plasma
(Doolittle, 1961) and which had functions unrelated to hemostasis. The first role of fibrinogen may have been to increase the viscosity of blood (Doolittle, 1961). Thrombin (or prothrombin) may have evolved from another plasma protease zymogen, after the acquisition of its ability to produce insoluble fibrin from fibrinogen. All blood coagulation systems yet described are much more complicated than the simple system described; indeed, such a system may no longer exist today. All the vitamin K-dependent coagulation factors are related to each other, probably as a result of gene duplications and other events (see above). Thus, the expansion of the blood coagulation cascade may be the direct result of these gene duplications with subsequent modification of the substrate specificity to produce the stepwise cascade of reactions. Accessory proteins are also required for efficient blood coagulation, and it appears that at least factors V and VIII are related to each other (Fass et al., 1985). In fact, the enzyme complexes for prothrombin and factor X activation are very similar (see Fig.2) and could easily be due to duplication of the entire complex and its genes.

The duplication of a prothrombin ancestor cannot account for all the serine proteases found in the mammalian coagulation cascade. The serine proteases involved in intrinsic coagulation initiation are not closely related to prothrombin, as indicated by the presence of the TCN type serine codon at their active site instead of AGY (see above). Factor XII appears more closely related to the fibrinolytic enzymes tissue-type and urokinase-type plasminogen activators (Cool et al., 1985; Neurath, 1985). Evidence for intrinsic blood coagulation is absent in the chicken and fish (Didisheim et al., 1959; MacFarlane, 1960; Doolittle et al., 1962), indicating that intrinsic initiation may be absent in these species. The factor XI and prekallikrein amino acid and nucleotide sequences are homologous (Chung et al., 1986; Fujikawa et al., 1986). It has been proposed that the genes for these two coagulation factors are the result of a recent gene duplication event that occurred approximately 250 million years ago (Chung et al., 1986). This duplication event may have occurred only in mammals, as the mammalian lineage diverged from the reptilian and avian lineages also about 250 million years ago (Culbert, 1980). Thus, this gene duplication in mammals may have provided the necessary proteases to allow the evolution of an intrinsic blood coagulation cascade. It is possible to reconcile the absence of intrinsic coagulation in the non-mammalian vertebrates with the possible existence of additional plasma proteases in the mammals.

The development and evolution of the mammalian blood coagulation system has thus involved many different types of gene evolution events. As shown in Fig.27, gene fusion events (mediated by exon shuffling) have been responsible for the construction of the various blood coagulation proteins.
Gene duplications have been involved in the supply of new proteases to allow the expansion of the cascade (the vitamin K-dependent proteins, see Fig.27). Gene duplications of distantly related proteases, which possibly had no role in coagulation, allowed the evolution of a variant of the blood coagulation cascade (the intrinsic pathway). Investigation of the structure of the genes of the mammalian blood coagulation proteins has helped in providing a clearer picture of the mechanisms which have been involved in the formation of this essential physiological process.

LITERATURE CITED

1. Alber, T., and Kawasaki, G. (1982). Nucleotide Sequence of the Triose Phosphate Isomerase Gene of Saccharomyces cerevisiae. J. Mol. Appl. Genet. 1; 419-434.
2. Anderson, G. F., and Barnhart, M. I. (1964). Intracellular Localization of Prothrombin. Proc. Soc. Exp. Biol. Med. 116; 1-16.
3. Anson, D. S., Choo, K. H., Rees, D. J. G., Giannelli, F., Gould, K., Huddleston, J. A., and Brownlee, G. G. (1984). The Gene Structure of Human Anti-Haemophilic Factor IX. EMBO J. 3; 1053-1060.
4. Artymiuk, P. J., Blake, C. C. F., and Sippel, A. E. (1981). Genes Pieced Together - Exons Delineate Homologous Structures of Diverged Lysozymes. Nature 290; 287-288.
5. Ashley, P. L., and MacDonald, R. J. (1985). Kallikrein-Related mRNAs of the Rat Submaxillary Gland: Nucleotide Sequence of Four Distinct Types Including Tonin. Biochemistry 24; 4512-4520.
6. Atkinson, T. and Smith, M. (1984). Solid Phase Synthesis of Oligodeoxyribonucleotides by the Phosphite-Triester Method, in Oligonucleotide Synthesis: A Practical Approach (Gait, M. J. Ed.), IRL Press, Oxford, pp. 35-81.
7. Aviv, H., and Leder, P. (1972). Purification of Biologically Active Globin Messenger RNA by Chromatography on Oligothymidylic Acid-Cellulose. Proc. Natl. Acad. Sci. USA 69; 1408-1412.
8. Barlow, J. J., Mathias, A. P., and Williamson, R. (1963). A Simple Method for the Quantitative Isolation of Undegraded High Molecular Weight Ribonucleic Acid. Biochem. Biophys. Res. Commun. 7; 61-66.
9. Beckmann, R. J., Schmidt, R. J., Santerre, R. F., Plutzky, J., Crabtree, G. R., and Long, G. L. (1985). The Structure and Evolution of a 461 Amino Acid Human Protein C Precursor and Its Messenger RNA Based Upon the DNA Sequence of Cloned Liver cDNA. Nucleic Acids Res. 13; 5233-5247.
10. Bell, G. I., Quinto, C., Quiroga, M., Valenzuela, P., Craik, C. S., and Rutter, W. J. (1984). Isolation and Sequence of a Rat Chymotrypsinogen B Gene. J. Biol. Chem. 259; 14265-14270.
11. Bentley, A. K., Rees, D. J. G., Rizza, C., and Brownlee, G. G. (1986). Defective Propeptide Processing of Blood Clotting Factor IX Caused by a Mutation of Arginine to Glutamine at Position -4. Cell 45; 343-348.
12. Benton, W. D., and Davis, R. W. (1977). Screening λgt Recombinant Clones by Hybridization in situ. Science 196; 180-182.
13. Benyajati, C., Place, A. R., Powers, D. A., and Sofer, W. (1981).
Alcohol Dehydrogenase Gene of Drosophila melanogaster: Relationship of Intervening Sequences to Functional Domains of the Protein. Proc. Natl. Acad. Sci. USA 78; 2717-2721.
14. Benyajati, C., Spoerel, N., Haymerle, H., and Ashburner, M. (1983). The Messenger RNA for Alcohol Dehydrogenase in Drosophila melanogaster Differs in Its 5' End in Different Developmental Stages. Cell 33; 125-133.
15. Berget, S. M. (1984). Are U4 Small Nuclear Ribonucleoproteins Involved in Polyadenylation? Nature 309; 179-182.
16. Berget, S. M., Moore, C., and Sharp, P. A. (1977). Spliced Segments at the 5' Terminus of Adenovirus 2 Late mRNA. Proc. Natl. Acad. Sci. USA 74; 1371-1375.
17. Biggs, R., Douglas, A. S., MacFarlane, R. G., Dacie, J. V., Pitney, W. R., Merskey, C., and O'Brien, J. R. (1952). Christmas Disease: A Condition Previously Mistaken for Haemophilia. Brit. Med. J. 2; 1378-1382.
18. Birnboim, H. C., and Doly, J. (1979). A Rapid Extraction Procedure for Screening Recombinant Plasmid DNA. Nucleic Acids Res. 7; 1513-1523.
19. Birnstiel, M. L., Busslinger, M., and Strub, K. (1985). Transcription Termination and 3' Processing: The End is in Site. Cell 349-359.
20. Blake, C. C. F. (1978). Do Genes-In-Pieces Imply Proteins-In-Pieces? Nature 273; 267.
21. Blake, C. (1983a). Exons - Present From the Beginning? Nature 306; 535-537.
22. Blake, C. (1983b). Exons and the Evolution of Proteins. Trends Biochem. Sci. 8; 11-13.
23. Blake, C. C. F. (1985). Exons and the Evolution of Proteins. Int. Rev. Cytol. 93; 149-185.
24. Blattner, F. R., Williams, B. G., Blechl, A. E., Denniston-Thompson, K., Faber, H. E., Furlong, L.
-A., Grunwald, D. J., Kiefer, D. O., Moore, D. D., Schumm, J. W., Sheldon, E. L., and Smithies, O. (1977). Charon Phages: Safer Derivatives of Bacteriophage Lambda for DNA Cloning. Science 196; 161-169.
25. Blin, N., and Stafford, D. W. (1976). A General Method for Isolation of High Molecular Weight DNA from Eukaryotes. Nucleic Acids Res. 3; 2303-2308.
26. Bloom, A. L. (1981). Inherited Disorders of Blood Coagulation, in Haemostasis and Thrombosis (Bloom, A. L., and Thomas, D. P. Eds.), Churchill Livingstone, Edinburgh, pp. 321-370.
27. Bloomquist, M. C., Hunt, L. T., and Barker, W. C. (1984). Vaccinia Virus 19-Kilodalton Protein: Relationship to Several Mammalian Proteins Including Two Growth Factors. Proc. Natl. Acad. Sci. USA 81; 7363-7367.
28. Breathnach, R., and Chambon, P. (1981). Organization and Expression of Eukaryotic Split Genes Coding for Proteins. Ann. Rev. Biochem. 50; 349-383.
29. Brinkhous, K. M. (1947). Clotting Deficiency in Haemophilia: Deficiency in a Plasma Factor Required for Platelet Utilization. Proc. Soc. Exp. Biol. Med. 66; 117-120.
30. Brown, J. R., Daar, I. O., Krug, J. R., and Maquat, L. E. (1985). Characterization of the Functional Gene and Several Processed Pseudogenes in the Human Triosephosphate Isomerase Gene Family. Mol. Cell. Biol. 5; 1694-1706.
31. Busslinger, M., Moschonas, N., and Flavell, R. A. (1981). β+ Thalassemia: Aberrant Splicing Results from a Single Point Mutation in an Intron. Cell 27; 289-298.
32. Butkowski, R. J., Elion, J., Downing, M. R., and Mann, K. G. (1977). Primary Structure of Human Prethrombin 2 and α-Thrombin. J.
Biol. Chem. 252; 4942-4957.
33. Calos, M. P., and Miller, J. H. (1980). Transposable Elements. Cell 20; 579-595.
34. Campbell, R. D., and Porter, R. R. (1983). Molecular Cloning and Characterization of the Gene Coding for Human Complement Protein Factor B. Proc. Natl. Acad. Sci. USA 80; 4464-4468.
35. Campbell, R. D., Bentley, D. R., and Morley, B. J. (1984). The Factor B and C2 Genes. Phil. Trans. R. Soc. Lond. B. 306; 367-378.
36. Cavalier-Smith, T. (1978). Nuclear Volume Control by Nucleoskeletal DNA, Selection for Cell Volume and Cell Growth Rate, and the Solution of the DNA C-Value Paradox. J. Cell. Sci. 34; 247-278.
37. Cavalier-Smith, T. (1985). Selfish DNA and the Origin of Introns. Nature 315; 283-284.
38. Cech, T. R. (1983). RNA Splicing: Three Themes with Variation. Cell 34; 713-716.
39. Cheng, S. -M., Suzuki, A., Zon, G. and Liu, T. -Y. (1986). Characterization of a Complementary Deoxyribonucleic Acid for the Coagulogen of Limulus polyphemus. Biochim. Biophys. Acta 868; 1-8.
40. Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979). Isolation of Biologically Active Ribonucleic Acid from Sources Enriched in Ribonuclease. Biochemistry 18; 5294-5299.
41. Chow, L. T., Gelinas, R., Broker, T. R., and Roberts, R. J. (1977). An Amazing Sequence Arrangement at the 5' Ends of Adenovirus 2 Messenger RNA. Cell 12; 1-8.
42. Chow, L. T., and Broker, T. R. (1981). Mapping RNA:DNA Heteroduplexes by Electron Microscopy, in Electron Microscopy in Biology (Griffith, J. D. Ed.), vol. 1, Wiley, New York, pp. 139-188.
43. Chung, D. W., Fujikawa, K., McMullen, B. A., and Davie, E. W. (1986). Human Plasma Prekallikrein, A Zymogen to a Serine Protease that Contains Four Tandem Repeats. Biochemistry 25; 2410-2417.
44. Comp, P. C., Nixon, R. R., Cooper, M. R., and Esmon, C. T. (1984). Familial Protein S Deficiency is Associated with Recurrent Thrombosis. J. Clin. Invest. 74; 2082-2088.
45. Cool, D. E., Edgell, C. -J. S., Louie, G. V., Zoller, M. J., Brayer, G. D., and MacGillivray, R. T. A. (1985). Characterization of Human Blood Coagulation Factor XII cDNA: Prediction of the Primary Structure of Factor XII and the Tertiary Structure of β-Factor XIIa. J. Biol. Chem. 260; 13666-13676.
46. Cornish-Bowden, A. (1985). Are Introns Structural Elements or Evolutionary Debris? Nature 313; 434-435.
47. Crabtree, G. R., and Kant, J. A. (1982). Organization of the Rat γ-Fibrinogen Gene: Alternate mRNA Splice Patterns Produce the γA and γB (γ') Chains of Fibrinogen. Cell 31; 159-166.
48. Crabtree, G. R., Comeau, C. M., Fowkes, D. M., Fornace, A. J., Malley, J. D., and Kant, J. A. (1985). Evolution and Structure of the Fibrinogen Genes: Random Insertion of Introns or Selective Loss? J. Mol. Biol. 185; 1-19.
49. Craik, C. S., Sprang, S., Fletterick, R., and Rutter, W. J. (1982a). Intron-Exon Splice Junctions Map at Protein Surfaces. Nature 299; 180-182.
50. Craik, C. S., Laub, O., Bell, G. I., Sprang, S., Fletterick, R., and Rutter, W. J. (1982b). The Relationship of Gene Structure to Protein Structure, in Gene Regulation (O'Malley, B., and Fox, C. F.
Eds.), Academic Press, New York, pp. 35-54.
51. Craik, C. S., Rutter, W. J., and Fletterick, R. (1983). Splice Junctions: Association with Variation in Protein Structure. Science 220; 1125-1129.
52. Craik, C. S., Choo, Q. -L., Swift, G. H., Quinto, C., MacDonald, R. J., and Rutter, W. J. (1984). Structure of Two Related Rat Pancreatic Trypsin Genes. J. Biol. Chem. 259; 14255-14264.
53. Craik, C. S., Largman, C., Fletcher, T., Roczniak, S., Barr, P. J., Fletterick, R., and Rutter, W. J. (1985). Redesigning Trypsin: Alteration of Substrate Specificity. Science 228; 291-297.
54. Crick, F. (1979). Split Genes and RNA Splicing. Science 204; 264-271.
55. Culbert, E. M. (1980). Evolution of the Vertebrates, John Wiley and Sons, New York.
56. Curtis, C. G. (1981). Plasma Factor XIII, in Haemostasis and Thrombosis (Bloom, A. L., and Thomas, D. P. Eds.), Churchill Livingstone, Edinburgh, pp. 192-197.
57. Dagert, M., and Ehrlich, S. D. (1979). Prolonged Incubation in Calcium Chloride Improves the Competence of Escherichia coli Cells. Gene 6; 23-28.
58. Dahlback, B., Lundwall, A., and Stenflo, J. (1986). Primary Structure of Bovine Vitamin K-Dependent Protein S. Proc. Natl. Acad. Sci. USA 83; 4199-4203.
59. Dam, H. (1935). The Antihaemorrhagic Vitamin of the Chick. Biochem. J. 29; 1273-1285.
60. Dam, H., Schonheyder, F., and Tage-Hansen, E. (1936). Studies on the Mode of Action of Vitamin K. Biochem. J. 30; 1075-1079.
61. Darnell, J. E., and Doolittle, W. F. (1986). Speculations on the Early Course of Evolution. Proc. Natl. Acad. Sci. USA 83; 1271-1275.
62. Davie, E.
W., and Ratnoff, O. D. (1964). Waterfall Sequence for Intrinsic Blood Clotting. Science 145; 1310-1312.
63. Davie, E. W., Fujikawa, K., Kurachi, K., and Kisiel, W. (1979). The Role of Serine Proteases in the Blood Coagulation Cascade. Adv. Enzymol. 48; 277-318.
64. Davie, E. W., Degen, S. J. F., Yoshitake, S., and Kurachi, K. (1983). Cloning of Vitamin K-Dependent Clotting Factors. Dev. Biochem. 25; 45-52.
65. Davis, C. A., Riddell, D. C., Higgins, M. J., Holden, J. J. A., and White, B. N. (1985). A Gene Family in Drosophila melanogaster Coding for Trypsin-Like Enzymes. Nucleic Acids Res. 13; 6605-6619.
66. Degen, S. J. F., MacGillivray, R. T. A., and Davie, E. W. (1983). Characterization of the Complementary Deoxyribonucleic Acid and Gene Coding for Human Prothrombin. Biochemistry 22; 2087-2097.
67. Degen, S. J. F., Rajput, B., Reich, E., and Davie, E. W. (1985). Coagulation and Fibrinolysis: Characterization of the Human Prothrombin and Tissue Plasminogen Activator Genes, in Protides of the Biological Fluids (Peeters, H. Ed.), vol. 33, Pergamon Press, Oxford, pp. 47-50.
68. Degen, S. J. F., Rajput, B., and Reich, E. (1986). The Human Tissue Plasminogen Activator Gene. J. Biol. Chem. 261; 6972-6985.
69. Deininger, P. L. (1983). Random Subcloning of Sonicated DNA: Application to Shotgun DNA Sequence Analysis. Anal. Biochem. 129; 216-223.
70. Delaney, A. D. (1982). A DNA Sequence Handling Program. Nucleic Acids Res. 10; 61-67.
71. Delbaere, L. T. J., Hucheon, W. L. B., James, M. N. G., and Thiessen, W. E. (1975).
Tertiary Structural Differences Between Microbial Serine Proteases and Pancreatic Serine Proteases. Nature 257; 758-763.
72. Dennis, E. S., Gerlach, W. L., Pryor, A. J., Bennetzen, J. L., Inglis, A., Llewellyn, D., Sachs, M. M., Ferl, R. J., and Peacock, W. J. (1984). Molecular Analysis of the Alcohol Dehydrogenase (ADH1) Gene of Maize. Nucleic Acids Res. 12; 3983-4000.
73. Dennis, E. S., Sachs, M. M., Gerlach, W. L., Finnegan, E. J., and Peacock, W. J. (1985). Molecular Analysis of the Alcohol Dehydrogenase 2 (ADH2) Gene of Maize. Nucleic Acids Res. 13; 727-743.
74. Didisheim, P., Hattori, K., and Lewis, J. H. (1959). Hematologic Coagulation Studies in Various Animal Species. J. Lab. Clin. Med. 53; 866-875.
75. Doolittle, R. F. (1961). The Comparative Biochemistry of Blood Coagulation, Ph.D. Thesis, Harvard Univ.
76. Doolittle, R. F. (1965). Differences in the Clotting of Lamprey Fibrinogen by Lamprey and Bovine Thrombin. Biochem. J. 94; 735-741.
77. Doolittle, R. F. (1984). Fibrinogen and Fibrin. Ann. Rev. Biochem. 53; 195-229.
78. Doolittle, R. F. (1985). The Genealogy of Some Recently Evolved Vertebrate Proteins. Trends Biochem. Sci. 10; 233-237.
79. Doolittle, R. F., and Surgenor, D. M. (1962). Blood Coagulation in Fish. Amer. J. Physiol. 203; 964-970.
80. Doolittle, R. F., Oncley, J. L., and Surgenor, D. M. (1962). Species Differences in the Interaction of Thrombin and Fibrinogen. J. Biol. Chem. 237; 3123-3127.
81. Doolittle, R. F., Feng, D. F., and Johnson, M. S. (1984). Computer-Based Characterization of Epidermal Growth Factor Precursor. Nature 307; 558-560.
82. Doolittle, W. F. (1978). Genes in Pieces: Were They Ever Together? Nature 272; 581-582.
83. Dover, G. (1982). Molecular Drive: A Cohesive Mode of Species Evolution. Nature 299; 111-117.
84. Duester, G., Jornvall, H., and Hatfield, G. W. (1986). Intron-Dependent Evolution of the Nucleotide Binding Domains Within Alcohol Dehydrogenase and Related Enzymes. Nucleic Acids Res. 14; 1931-1941.
85. Dush, M. K., Sikela, J. M., Kahn, S. A., Tischfield, J. A., and Stambrook, P. J. (1985). Nucleotide Sequence and Organization of the Mouse Adenine Phosphoribosyltransferase Gene: Presence of a Coding Region Common to Animal and Bacterial Phosphoribosyltransferases that has a Variable Intron/Exon Arrangement. Proc. Natl. Acad. Sci. USA 82; 2731-2735.
86. Edgell, M. H., Hardies, S. C., Brown, B., Voliva, C., Hill, A., Phillips, S., Comer, M., Burton, F., Weaver, S., and Hutchison III, C. A. (1983). Evolution of the Mouse β Globin Complex Loci, in Evolution of Genes and Proteins (Nei, M., and Koehn, R. K. Eds.), Sinauer Associates Inc., Sunderland, Mass., pp. 1-13.
87. Edmonds, M., Vaughn, M. H., and Nakazato, H. (1971). Polyadenylic Acid Sequences in the Heterogeneous Nuclear RNA and Rapidly-Labeled Polyribosomal RNA of HeLa Cells: Possible Evidence for a Precursor Relationship. Proc. Natl. Acad. Sci. USA 68; 1336-1340.
88. Engle, R. L., and Woods, K. R. (1960). Comparative Biochemistry and Embryology, in The Plasma Proteins (Putnam, F. W. Ed.), vol. 2, Academic Press, New York, pp. 184-266.
89. Esmon, C. T., and Jackson, C. M.
(1974). The Conversion of Prothrombin to Thrombin IV: The Function of Fragment 2 Region During Activation in the Presence of Factor V. J. Biol. Chem. 249; 7791-7797.
90. Esmon, C. T. (1983). Protein-C: Biochemistry, Physiology, and Clinical Implications. Blood 62; 1155-1158.
91. Evans, B. A., and Richards, R. I. (1985). The Genes for the α and γ Subunits of Mouse Nerve Growth Factor are Contiguous. EMBO J. 4; 133-138.
92. Fass, D. N., Hewick, R. M., Knutson, G. J., Nesheim, M. E., and Mann, K. G. (1985). Internal Duplication and Sequence Homology in Factor V and VIII. Proc. Natl. Acad. Sci. USA 82; 1688-1691.
93. Feinberg, A. P., and Vogelstein, B. (1983). A Technique for Radiolabeling DNA Restriction Endonuclease Fragments to High Specific Activity. Anal. Biochem. 132; 6-13.
94. Fenton II, J. W. (1981). Thrombin Specificity. Ann. N. Y. Acad. Sci. 370; 468-495.
95. Fenton II, J. W., and Bing, D. H. (1986). Thrombin Active-Site Regions. Semin. Thromb. Hemost. 12; 200-208.
96. Fisher, R., Waller, E. K., Grossi, G., Thompson, D., Tizard, R., and Schleuning, W.-D. (1985). Isolation and Characterization of the Tissue-Type Plasminogen Activator Structural Gene Including Its 5' Flanking Region. J. Biol. Chem. 260; 11223-11230.
97. Foster, D. C., and Davie, E. W. (1984). Characterization of a cDNA Coding for Human Protein C. Proc. Natl. Acad. Sci. USA 81; 4766-4770.
98. Foster, D. C., Yoshitake, S., and Davie, E. W. (1985). The Nucleotide Sequence of the Gene for Human Protein C. Proc. Natl. Acad. Sci. USA 82; 4673-4677.
99. Fujikawa, K., Chung, D. W., Hendrickson, L. E., and Davie, E. W. (1986).
Amino Acid Sequence of Human Factor XI, A Blood Coagulation Factor with Four Tandem Repeats That Are Highly Homologous with Plasma Prekallikrein. Biochemistry 25; 2417-2424.
100. Fuller, G. M., and Doolittle, R. F. (1971a). Studies of Invertebrate Fibrinogen I: Purification and Characterization of Fibrinogen from the Spiny Lobster. Biochemistry 10; 1305-1311.
101. Fuller, G. M., and Doolittle, R. F. (1971b). Studies of Invertebrate Fibrinogen II: Transformation of Lobster Fibrinogen to Fibrin. Biochemistry 10; 1311-1315.
102. Fung, M. R., Campbell, R. M., and MacGillivray, R. T. A. (1984). Blood Coagulation Factor X mRNA Encodes a Single Polypeptide Containing a Pre-Pro Leader Sequence. Nucleic Acids Res. 12; 4481-4492.
103. Fung, M. R., Hay, C. W., and MacGillivray, R. T. A. (1985). Characterization of an Almost Full-Length cDNA Coding for Human Blood Coagulation Factor X. Proc. Natl. Acad. Sci. USA 82; 3591-3595.
104. Furie, B., Bing, D. H., Feldmann, R. J., Robison, D. J., Burnier, J. P., and Furie, B. C. (1982). Computer-Generated Models of Blood Coagulation Factor Xa, Factor IXa, and Thrombin Based on Structural Homology with Other Serine Proteases. J. Biol. Chem. 257; 3875-3882.
105. Gershenfeld, H. K., and Weissman, I. L. (1986). Cloning of a cDNA for a T-Cell-Specific Serine Protease from a Cytotoxic T Lymphocyte. Science 232; 854-858.
106. Gilbert, W. (1978). Why Genes in Pieces? Nature 271; 501.
107. Gilbert, W. (1979). Introns and Exons: Playgrounds of Evolution, in Eukaryotic Gene Regulation (Axel, R., Maniatis, T., and Fox, C. F. Eds.), Academic Press, New York, pp. 1-12.
108. Gilbert, W. (1985). Genes-In-Pieces Revisited. Science 228; 823-824.
109. Gilbert, W., Marchionni, M., and McKnight, G. (1986). On the Antiquity of Introns. Cell 46; 151-154.
110. Gluzman, Y. (1985). Eukaryotic Transcription: The Role of cis- and trans-Acting Elements in Initiation, Cold Spring Harbor Publications, Cold Spring Harbor.
111. Go, M. (1981). Correlation of DNA Exonic Regions with Protein Structural Units in Haemoglobin. Nature 291; 90-92.
112. Go, M. (1983). Modular Structural Units, Exons, and Function in Chicken Lysozyme. Proc. Natl. Acad. Sci. USA 80; 1964-1968.
113. Goldberg, D. A. (1980). Isolation and Partial Characterization of the Drosophila Alcohol Dehydrogenase Gene. Proc. Natl. Acad. Sci. USA 77; 5794-5798.
114. Grabowski, P. J., Seiler, S. R., and Sharp, P. A. (1985). A Multicomponent Complex is Involved in the Splicing of Messenger RNA Precursors. Cell 42; 345-353.
115. Graves, C. B., Grabau, G. G., Olsen, R. E., and Munns, T. W. (1980a). Immunochemical Isolation and Electrophoretic Characterization of Precursor Prothrombins in H-35 Rat Hepatoma Cells. Biochemistry 19; 266-272.
116. Graves, C. B., Grabau, G. G., and Munns, T. W. (1980b). Biosynthesis and Processing of Precursor Prothrombins, in Vitamin K Metabolism and Vitamin K-Dependent Proteins (Suttie, J. W. Ed.), University Park Press, Baltimore, pp. 529-541.
117. Griffin, J. H. (1981). The Contact Phase of Blood Coagulation, in Haemostasis and Thrombosis (Bloom, A. L., and Thomas, D. P. Eds.), Churchill Livingstone, Edinburgh, pp. 84-97.
118. Griffin, J. H., Evatt, B., Zimmerman, T. S., and Kleiss, A. J. (1981). Deficiency of Protein C in Congenital Thrombotic Disease. J. Clin. Invest. 68; 1370-1373.
119. Guyton, A. C. (1977). Basic Human Physiology: Normal Function and Mechanisms of Disease, Second Edn., W. B. Saunders, Philadelphia.
120. Hagen, F. S., Gray, C. L., O'Hara, P., Grant, F. J., Saari, G. C., Woodbury, R. G., Hart, C. E., Insley, M., Kisiel, W., Kurachi, K., and Davie, E. W. (1986). Characterization of a cDNA Coding for Human Factor VII. Proc. Natl. Acad. Sci. USA 83; 2412-2416.
121. Hall, L., Craig, R. K., Edbrooke, M. R., and Campbell, P. N. (1982). Comparison of the Nucleotide Sequence of Cloned Human and Guinea-Pig Pre-α-Lactalbumin cDNA With That of Chicken Pre-Lysozyme cDNA Suggests Evolution From a Common Ancestral Gene. Nucleic Acids Res. 10; 3503-3515.
122. Hardies, S. C., Edgell, M. H., and Hutchison III, C. A. (1984). Evolution of the Mammalian β-Globin Gene Cluster. J. Biol. Chem. 259; 3748-3756.
123. Hewett-Emmett, D., Czelusniak, J., and Goodman, M. (1981). The Evolutionary Relationships of the Enzymes in Blood Coagulation and Haemostasis. Ann. N. Y. Acad. Sci. 370; 511-527.
124. Hood, L., Kronenberg, M., and Hunkapiller, T. (1985). T Cell Antigen Receptor and Immunoglobulin Supergene Family. Cell 40; 225-229.
125. Hougie, C., Barrow, E. M., and Graham, J. B. (1957). Stuart Clotting Defect I: Segregation of a Hereditary Hemorrhagic State from the Heterogeneous Group Heretofore Called "Stable Factor" (SPCA, Proconvertin, Factor VII) Deficiency. J. Clin. Invest. 36; 485-496.
126. Hojrup, P., Jensen, M. S., and Petersen, T. E. (1985). Amino Acid Sequence of Bovine Protein Z: A Vitamin K-Dependent Serine Protease Homolog. FEBS Lett. 184; 333-338.
127. Irwin, D. M., Ahern, K. G., Pearson, G. D., and MacGillivray, R. T. A. (1985). Characterization of the Bovine Prothrombin Gene. Biochemistry 24; 6854-6861.
128. Jackson, C. M. (1981). Biochemistry of Prothrombin Activation, in Haemostasis and Thrombosis (Bloom, A. L., and Thomas, D. P. Eds.), Churchill Livingstone, Edinburgh, pp. 140-162.
129. Jackson, C. M., and Nemerson, Y. (1980). Blood Coagulation. Ann. Rev. Biochem. 49; 765-811.
130. Jaye, M., de la Salle, H., Schamber, F., Balland, A., Kohli, V., Findeli, A., Tolstoshev, P., and Lecocq, J. P. (1983). Isolation of Anti-Haemophilic Factor IX cDNA Using a Unique 52-Base Synthetic Oligonucleotide Probe Deduced from the Amino Acid Sequence of Bovine Factor IX. Nucleic Acids Res. 11; 2325-2335.
131. Jelinek, W. R., and Schmid, C. W. (1982). Repetitive Sequences in Eukaryotic DNA and Their Expression. Ann. Rev. Biochem. 51; 813-844.
132. Kan, Y. W., and Dozy, A. M. (1978). Polymorphism of DNA Sequence Adjacent to Human β-Globin Structural Gene: Relationship to Sickle Mutation. Proc. Natl. Acad. Sci. USA 75; 5631-5635.
133. Karn, J., Brenner, S., Barnett, L., and Cesareni, G. (1980). Novel Bacteriophage λ Cloning Vector. Proc. Natl. Acad. Sci. USA 77; 5172-5176.
134. Katayama, K., Ericsson, L. H., Enfield, D. L., Walsh, K., Neurath, H., Davie, E. W., and Titani, K. (1979).
Comparison of Amino Acid Sequence of Bovine Coagulation Factor IX (Christmas Factor) with That of Other Vitamin K-Dependent Plasma Proteins. Proc. Natl. Acad. Sci. USA 76; 4990-4994.
135. Katz, L., Kingsbury, D. T., and Helinski, D. R. (1973). Stimulation By Cyclic Adenosine Monophosphate of Plasmid Deoxyribonucleic Acid Replication and Catabolic Repression of the Plasmid Deoxyribonucleic Acid-Protein Relaxation Complex. J. Bacteriol. 114; 577-591.
136. Katz, L., Williams, P. H., Sato, S., Laevitt, R. W., and Helinski, D. R. (1977). Purification and Characterization of Covalently Closed Replicative Intermediates of ColE1 DNA From Escherichia coli. Biochemistry 16; 1677-1683.
137. Keller, E. B., and Noon, W. A. (1984). Intron Splicing: A Conserved Internal Signal in Introns of Animal Pre-mRNA's. Proc. Natl. Acad. Sci. USA 81; 7417-7420.
138. Keller, W. (1984). The RNA Lariat: A New Ring to the Splicing of mRNA Precursors. Cell 34; 423-425.
139. Kraut, J. (1977). Serine Proteases: Structure and Mechanism of Catalysis. Ann. Rev. Biochem. 46; 331-358.
140. Kruger, K., Grabowski, P. J., Zaug, A. J., Sands, J., Gottschling, D. E., and Cech, T. R. (1982). Self-Splicing RNA: Autoexcision and Autocyclization of the Ribosomal RNA Intervening Sequence of Tetrahymena. Cell 31; 147-157.
141. Kurachi, K., and Davie, E. W. (1982). Isolation and Characterization of a cDNA Coding for Human Factor IX. Proc. Natl. Acad. Sci. USA 79; 6461-6464.
142. Kurosky, A., Barnett, D. R., Lee, T.-H., Touchstone, B., Hay, R. E., Arnott, M.
S., Bowman, B. H., and Fitch, W. M. (1980). Covalent Structure of Human Haptoglobin: A Serine Protease Homolog. Proc. Natl. Acad. Sci. USA 77; 3388-3392.
143. Law, S. W., and Brewer, H. B. (1984). Nucleotide Sequence and the Encoded Amino Acids of Human Apolipoprotein A-I mRNA. Proc. Natl. Acad. Sci. USA 81; 66-70.
144. Lawn, R. M., Fritsch, E. F., Parker, R. C., Blake, G., and Maniatis, T. (1978). The Isolation and Characterization of Linked δ- and β-Globin Genes From a Cloned Library of Human DNA. Cell 15; 1157-1174.
145. Lehrach, H., Diamond, D., Wozney, J. R., and Boedtker, H. (1977). RNA Molecular Weight Determination by Gel Electrophoresis under Denaturing Conditions, A Critical Reexamination. Biochemistry 16; 4743-4751.
146. Leonard, W. J., Depper, J. M., Kanehisa, M., Kronke, M., Peffer, N. J., Svetlik, P. B., Sullivan, M., and Greene, W. C. (1985). Structure of the Human Interleukin 2 Receptor Gene. Science 230; 633-639.
147. Leytus, S. P., Chung, D. W., Kisiel, W., Kurachi, K., and Davie, E. W. (1984). Characterization of a cDNA Coding for Human Factor X. Proc. Natl. Acad. Sci. USA 81; 3699-3702.
148. Li, S. S., Tiano, H. F., Fukasawa, K. M., Yagi, K., Shimizu, M., Sharief, S., Nakashima, Y., and Pan, Y. E. (1985). Protein Structure and Gene Organization of Mouse Lactate Dehydrogenase-A Isozyme. Eur. J. Biochem. 149; 215-225.
149. Li, W.-H. (1983). Evolution of Duplicate Genes and Pseudogenes, in Evolution of Genes and Proteins (Nei, M., and Koehn, R. K. Eds.), Sinauer Associates Inc., Sunderland, Mass., pp. 14-37.
150. Li, W.-H., Luo, C.-C., and Wu, C.-I. (1985).
Evolution of DNA Sequence, in Molecular Evolutionary Genetics (MacIntyre, R. J. Ed.), Plenum Press, New York, pp. 1-94.
151. Liang, S.-M., and Liu, T.-Y. (1982). Studies on the Limulus Coagulation System: Inhibition of Activation of the Proclotting Enzyme by Dimethyl Sulfoxide. Biochem. Biophys. Res. Commun. 105; 553-559.
152. Lizardi, P. M. (1983). Methods for the Preparation of Messenger RNA. Meth. Enzymol. 96; 24-38.
153. Lobe, C. G., Finlay, B. B., Paranchych, W., Paetkau, V. H., and Bleachley, R. C. (1986). Novel Serine Proteases Encoded by Two Cytotoxic T Lymphocyte-Specific Genes. Science 232; 858-861.
154. Lonberg, N., and Gilbert, W. (1985). Intron/Exon Structure of the Chicken Pyruvate Kinase Gene. Cell 40; 81-90.
155. Long, G. L., Balagaje, R. M., and MacGillivray, R. T. A. (1984). Cloning and Sequencing of Liver cDNA Coding for Bovine Protein C. Proc. Natl. Acad. Sci. USA 81; 5653-5656.
156. MacFarlane, R. G. (1960). The Blood Coagulation System, in The Plasma Proteins (Putnam, F. W. Ed.), vol. 2, Academic Press, New York, pp. 137-181.
157. MacFarlane, R. G. (1964). An Enzyme Cascade in the Blood Clotting Mechanism and Its Function as a Biological Amplifier. Nature 202; 498-499.
158. MacGillivray, R. T. A., Degen, S. J. F., Chandra, T., Woo, S. L. C., and Davie, E. W. (1980). Cloning and Analysis of a cDNA Coding for Bovine Prothrombin. Proc. Natl. Acad. Sci. USA 77; 5153-5157.
159. MacGillivray, R. T. A., and Davie, E. W. (1984). Characterization of Bovine Prothrombin mRNA and Its Translation Product. Biochemistry 23; 1626-1634.
160. Maeda, N., Yang, F., Barnett, D. R., Bowman, B. H., and Smithies, O. (1984). Duplication Within the Haptoglobin Hp2 Gene. Nature 309; 131-135.
161. Magnusson, S., Petersen, T. E., Sottrup-Jensen, L., and Claeys, H. (1975). Complete Primary Structure of Prothrombin: Isolation, Structure and Reactivity of Ten Carboxylated Glutamic Acid Residues and Regulation of Prothrombin Activation by Thrombin, in Proteases and Biological Control (Reich, E., Rifkin, B. D., and Shaw, E. Eds.), Cold Spring Harbor Laboratories, Cold Spring Harbor, pp. 123-149.
162. Malinowski, D. P., Sadler, J. E., and Davie, E. W. (1984). Characterization of a Complementary Deoxyribonucleic Acid Coding for Human and Bovine Plasminogen. Biochemistry 23; 4243-4250.
163. Maniatis, T., Jeffrey, A., and Kleid, D. G. (1975). Nucleotide Sequence of the Rightward Operator of Phage λ. Proc. Natl. Acad. Sci. USA 72; 1184-1188.
164. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982). Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor.
165. Marchionni, M., and Gilbert, W. (1986). The Triosephosphate Isomerase Gene From Maize: Introns Antedate the Plant-Animal Divergence. Cell 46; 133-141.
166. Mason, A. J., Evans, B. A., Cox, D. R., Shine, J., and Richards, R. I. (1983). Structure of Mouse Kallikrein Gene Family Suggests a Role in Specific Processing of Biologically Active Peptides. Nature 303; 300-307.
167. McDevitt, M. A., Imperiale, M. J., Ali, H., and Nevins, J. R. (1984). Requirement of a Downstream Sequence for Generation of a Poly(A) Addition Site. Cell 37; 993-999.
168. McKnight, S. L., and Kingsbury, R.
(1982). Transcriptional Control Signals of a Eukaryotic Protein Coding Gene. Science 217; 316-324.
169. McKnight, G. L., O'Hara, P. J., and Parker, M. L. (1986). Nucleotide Sequence of the Triosephosphate Isomerase Gene from Aspergillus nidulans: Implications for a Differential Loss of Introns. Cell 46; 143-147.
170. McLachlan, A. D. (1979). Gene Duplication in the Structural Evolution of Chymotrypsinogen. J. Mol. Biol. 128; 49-79.
171. McMullen, B. A., and Fujikawa, K. (1985). Amino Acid Sequence of the Heavy Chain of Human α-Factor XIIa (Activated Hageman Factor). J. Biol. Chem. 260; 5328-5341.
172. Messing, J. (1983). New M13 Vectors for Cloning. Meth. Enzymol. 101; 20-78.
173. Messing, J., Crea, R., and Seeburg, P. H. (1981). A System for Shotgun DNA Sequencing. Nucleic Acids Res. 9; 309-321.
174. Mills, D. C. B. (1981). The Basic Biochemistry of the Platelet, in Haemostasis and Thrombosis (Bloom, A. L., and Thomas, D. P. Eds.), Churchill Livingstone, Edinburgh, pp. 50-60.
175. Montell, C., Fisher, E. E., Caruthers, M. H., and Berk, A. J. (1983). Inhibition of RNA Cleavage But not Polyadenylation by a Point Mutation in mRNA Consensus Sequence AAUAAA. Nature 305; 600-608.
176. Morley, B. J., and Campbell, R. D. (1984). Internal Homologies of the Ba Fragment of Human Complement Component Factor B, A Class III MHC Antigen. EMBO J. 3; 153-157.
177. Mount, S. M. (1982). A Catalogue of Splice Junction Sequences. Nucleic Acids Res. 10; 459-472.
178. Nagamine, Y., Pearson, D., Atlus, M. S., and Reich, E. (1984). cDNA and Gene Sequence of Porcine Plasminogen Activator.
Nucleic Acids Res. 12; 9525-9541.
179. Nagamine, Y., Pearson, D., and Grattan, M. (1985). Exon-Intron Boundary Sliding in the Generation of Two mRNA's Coding For Porcine Urokinase-Like Plasminogen Activator. Biochem. Biophys. Res. Commun. 132; 563-569.
180. Naora, H., and Deacon, N. J. (1982). Relationship Between the Total Size of Exons and Introns in Protein-Coding Genes of Higher Eukaryotes. Proc. Natl. Acad. Sci. USA 79; 6196-6200.
181. Nasmyth, K. (1983). Molecular Analysis of a Cell Lineage. Nature 302; 670-676.
182. Neurath, H. (1984). Evolution of Proteolytic Enzymes. Science 224; 350-357.
183. Neurath, H. (1985). Proteolytic Enzymes, Past and Present. Fed. Proc. 44; 2907-2913.
184. Neurath, H., and Walsh, K. A. (1976). The Role of Proteases in Biological Regulation, in Proteolysis and Physiological Regulation (Robbins, D. W., and Brew, K. Eds.), Academic Press, New York, pp. 29-42.
185. Nevins, J. R. (1983). The Pathway of Eukaryotic mRNA Formation. Ann. Rev. Biochem. 52; 441-466.
186. Ny, T., Elgh, F., and Lund, B. (1984). The Structure of the Human Tissue-Type Plasminogen Activator Gene: Correlation of Intron and Exon Structures to Functional and Structural Domains. Proc. Natl. Acad. Sci. USA 81; 5355-5359.
187. Owen, C. A., and Bollman, J. L. (1948). Prothrombin Conversion Factor of Dicumarol Plasma. Proc. Soc. Exp. Biol. Med. 67; 231-234.
188. Pan, L. C., and Price, P. A. (1985). The Propeptide of Rat Bone γ-Carboxyglutamic Acid Protein Shares Homology With Other Vitamin K-Dependent Protein Precursors. Proc. Natl. Acad. Sci. USA 82; 6109-6113.
189. Pan, L. C., Williamson, M. K., and Price, P. A. (1985). Sequence of the Precursor to Rat Bone γ-Carboxyglutamic Acid Protein That Accumulates in Warfarin Treated Osteosarcoma Cells. J. Biol. Chem. 260; 13398-13401.
190. Park, C. H., and Tulinsky, A. (1986). Three-Dimensional Structure of the Kringle Sequence: Structure of Prothrombin Fragment 1. Biochemistry 25; 3977-3982.
191. Patek, A. J., and Taylor, F. H. L. (1937). Hemophilia II: Some Properties of a Substrate Obtained From Normal Plasma Effective in Accelerating the Coagulation of Hemophilic Blood. J. Clin. Invest. 16; 113-124.
192. Patthy, L. (1985). Evolution of the Proteases of Blood Coagulation and Fibrinolysis by Assembly From Modules. Cell 41; 657-663.
193. Pennica, D., Holmes, W. E., Kohr, W. J., Harkins, R. N., Vehar, G. A., Ward, C. A., Bennett, W. F., Yelverton, E., Seeburg, P. H., Heyneker, H. L., Goeddel, D. V., and Collen, D. (1983). Cloning and Expression of Human Tissue-Type Plasminogen Activator cDNA in E. coli. Nature 301; 214-221.
194. Perler, F., Efstratiadis, A., Lomedico, P., Gilbert, W., Kolodner, R., and Dodgson, J. (1980). The Evolution of Genes: The Chicken Preproinsulin Gene. Cell 20; 555-566.
195. Perry, R. P. (1976). Processing of RNA. Ann. Rev. Biochem. 45; 605-629.
196. Petersen, T. E., Thogersen, H. C., Shorstengaard, K., Vibe-Pedersen, K., Sahl, P., Sottrup-Jensen, L., and Magnusson, S. (1983). Partial Primary Structure of Bovine Plasma Fibronectin: Three Types of Internal Homology. Proc. Natl. Acad. Sci. USA 80; 137-141.
197. Pichersky, E., Gottlieb, L. D., and Hess, J. F. (1984). Nucleotide Sequence of the Triose Phosphate Isomerase Gene of E. coli. Mol. Gen. Genet. 195; 314-320.
198. Plutzky, J., Hoskins, J. A., Long, G. L., and Crabtree, G. R. (1986). Evolution and Organization of the Human Protein C Gene. Proc. Natl. Acad. Sci. USA 83; 546-550.
199. Prochownik, E. V., Markham, A. F., and Orkin, S. H. (1983). Isolation of a cDNA Clone for Human Antithrombin III. J. Biol. Chem. 258; 8389-8394.
200. Proudfoot, N. J., and Brownlee, G. G. (1976). 3' Non-Coding Region Sequences in Eukaryotic Messenger RNA. Nature 263; 211-214.
201. Quick, A. J. (1943). On the Constitution of Prothrombin. Amer. J. Physiol. 140; 212-220.
202. Quick, A. J. (1947). Studies on the Enigma of the Hemostatic Dysfunction of Hemophilia. Amer. J. Med. Sci. 214; 272-280.
203. Ratnoff, O. D. (1977). Blood Clotting Mechanisms: An Overview, in Haemostasis: Biochemistry, Physiology and Pathology (Ogston, D., and Bennett, B. Eds.), John Wiley and Sons, London, pp. 1-24.
204. Ratnoff, O. D., and Colopy, J. H. (1955). A Familial Hemorrhagic Trait Associated With a Deficiency of a Clot Promoting Fraction of Plasma. J. Clin. Invest. 34; 602-613.
205. Riccio, A., Grimaldi, G., Verde, P., Sebastue, G., Boast, S., and Blasi, F. (1985). The Human Urokinase-Plasminogen Activator Gene and Its Promoter. Nucleic Acids Res. 13; 2759-2771.
206. Richardson, K. K., Crosby, R. M., Good, P. J., Rosen, N. L., and Mayfield, J. E. (1986). Bovine DNA Contains a Single Major Family of Interspersed Repetitive Sequences. Eur. J. Biochem. 154; 349-354.
207. Rogers, J. (1985).
Exon Shuffling and Intron Invasion in Serine Protease Genes. Nature 315; 458-459.
208. Rosenthal, R. L., Dreskin, O. H., and Rosenthal, M. (1953). New Hemophilia-Like Disease Caused by Deficiency of a Third Plasma Thromboplastin Factor. Proc. Soc. Exp. Biol. Med. 82; 171-174.
209. Ruskin, B., and Green, M. R. (1985). Specific and Stable Intron-Factor Interactions Are Established Early During In Vitro Pre-mRNA Splicing. Cell 43; 131-142.
210. Russel, P. R. (1985). Transcription of the Triose Phosphate Isomerase Gene of Schizosaccharomyces pombe Initiates from a Start Point Different From That in Saccharomyces cerevisiae. Gene 40; 125-130.
211. Sadler, J. E., Malinowski, D. P., and Davie, E. W. (1985). Cloning and Structural Characterization of the Gene for Human Plasminogen, in Progress in Fibrinolysis (Davidson, J. F., Donati, M. B., and Coccheri, S. Eds.), vol. VII, Churchill Livingstone, Edinburgh, pp. 201-204.
212. Sanger, F., Nicklen, S., and Coulsen, A. R. (1977). DNA Sequencing With Chain-Terminating Inhibitors. Proc. Natl. Acad. Sci. USA 74; 5463-5467.
213. Schwarz, H. P., Fischer, M., Hopmeier, P., Batard, M. A., and Griffin, J. H. (1984). Plasma Protein S Deficiency in Familial Thrombotic Disease. Blood 64; 1297-1300.
214. Seid, R. C., and Liu, T.-Y. (1980). Purification and Properties of the Limulus Clotting Enzyme. Dev. Biochem. 10; 481-493.
215. Sharp, P. A. (1985). On the Origin of Splicing and Introns. Cell 42; 397-400.
216. Shatkin, A. J. (1985). mRNA Cap Binding Proteins: Essential Factors for Initiating Translation. Cell 40; 223-224.
217. Solum, N. O. (1973). The Coagulogen of Limulus polyphemus Hemocytes: A Comparison of the Clotted and Non-Clotted Forms of the Molecule. Thrombosis Res. 2; 55-70.
218. Sottrup-Jensen, L., Claeys, H., Zajdel, M., Petersen, T. E., and Magnusson, S. (1978). The Primary Structure of Human Plasminogen: Isolation of Two Lysine Binding Fragments and One "Mini-" Plasminogen (MW, 38,000) by Elastase-Catalyzed Specific Limited Proteolysis, in Progress in Chemical Fibrinolysis and Thrombolysis (Davidson, J. F., Rowan, R. M., Samana, M. M., and Desnoyer, P. C. Eds.), vol. 3, Raven Press, New York, pp. 191-209.
219. Southern, E. M. (1975). Detection of a Specific Sequence Among DNA Fragments Separated by Gel Electrophoresis. J. Mol. Biol. 98; 503-517.
220. Staden, R. (1982). Automation of the Computer Handling of Gel Reading Data Produced by the Shotgun Method of DNA Sequencing. Nucleic Acids Res. 10; 4731-4751.
221. Steiner, D. F., Quinn, P. S., Chan, S. J., Marsh, J., and Tager, H. S. (1980). Processing Mechanisms in the Biosynthesis of Proteins. Ann. N. Y. Acad. Sci. 343; 1-16.
222. Stenflo, J. (1976). A New Vitamin K-Dependent Protein: Purification From Bovine Plasma and Preliminary Characterization. J. Biol. Chem. 251; 355-363.
223. Stone, E. M., Rothblum, K. N., and Schwartz, R. J. (1985a). Intron-Dependent Evolution of Chicken Glyceraldehyde Phosphate Dehydrogenase Gene. Nature 313; 498-500.
224. Stone, E. M., Rothblum, K. N., Alevy, M. C., Kuo, T. M., and Schwartz, R. J. (1985). Complete Sequence of the Chicken Glyceraldehyde-3-Phosphate Dehydrogenase Gene. Proc. Natl. Acad. Sci. USA 82; 1628-1632.
225. Straus, D., and Gilbert, W. (1985). Genetic Engineering in the Precambrian: Structure of the Chicken Triosephosphate Isomerase Gene. Mol. Cell. Biol. 5; 3497-3506.
226. Stroud, R. M., Kossiakoff, A. A., and Chambers, J. L. (1977). Mechanisms of Zymogen Activation. Ann. Rev. Biophys. Bioeng. 6; 177-193.
227. Stryer, L. (1981). Francisco.
228. Sudhoff, T. C., Goldstein, J. L., Brown, M. S., and Russell, D. W. (1985a). The LDL Receptor Gene: A Mosaic of Exons Shared With Different Proteins. Science 228; 815-822.
229. Sudhoff, T. C., Russell, D. W., Goldstein, J. L., Brown, M. S., Sanchez-Pescador, R., and Bell, G. I. (1985b). Cassette of Eight Exons Shared by Genes for LDL Receptor and EGF Precursor. Science 228; 893-895.
230. Suttie, J. W. (1985). Vitamin K-Dependent Carboxylase. Ann. Rev. Biochem. 54; 459-477.
231. Suttie, J. W., and Jackson, C. M. (1977). Prothrombin Structure, Activation and Biosynthesis. Physiol. Rev. 57; 1-70.
232. Swanson, J. C., and Suttie, J. W. (1985). Prothrombin Biosynthesis: Characterization of Processing Events in Rat Liver Microsomes. Biochemistry 24; 3890-3897.
233. Swift, G. H., Craik, C. S., Stary, S. J., Quinto, C., Lahaie, R. G., Rutter, W. J., and MacDonald, R. J. (1984). Structure of the Two Related Elastase Genes Expressed in the Rat Pancreas. J. Biol. Chem. 259; 14271-14278.
234. Telfer, T. P., Denson, K. W., and Wright, D. R. (1956). 'New' Coagulation Defect. Brit. J. Haemat. 2; 308-316.
235. Thomas, P. S. (1980).
H y b r i d i z a t i o n of Denatured RNA and Small DNA Fragments T r a n s f e r r e d to N i t r o c e l l u l o s e . Proc. Natl. Acad. S c i . USA 77j_ 5201-5205.  236.  T u l i n s k y , A., Park, C. H., and Kydel, T. J . (1985). The S t r u c t u r e of Prothrombin Fragment 1 at 3. 5 A R e s o l u t i o n . J. Biol. Chem. 260; 10771-10778.  B i o c h e m i s t r y , Freman P r e s s , San  0  A  21 1  237.  van Leeuwen, B. H., Evans, B. A., Tregear, G. W., and R i c h a r d s , R. I . (1986). Mouse Glandular K a l l i k r e i n n Genes: I d e n t i f i c a t i o n , S t r u c t u r e , and Expression of the Renal K a l l i k r e i n Gene. J . B i o l . Chem. 261; 5529-5535.  238.  Verde, P., S t o p p e l l i , M. P., G a l e f f i , P., Di Nocera, P., and B l a s i , F. (1984). I d e n t i f i c a t i o n and Primary Sequence of an U n s p l i c e d Human Urokinase P o l y ( A ) RNA. Proc. Natl. Acad. S c i . USA 8J_^ 4727-4731. +  239.  V i e i r a , J . , and Messing, J , (1982), The pUC Plasmids, an M13mp7 D e r i v e d System f o r I n s e r t i o n Mutagenisis and Sequencing With S y n t h e t i c U n i v e r s a l Primers. Gene 1 9; 259-268.  240.  von H e i j n e , G. (1983). P a t t e r n s of Amino Acids Near Signal-Sequence Cleavage S i t e s . E u r . J . Biochem. 133; 17-21.  241.  von H e i j n e , G. (1985). S i g n a l Sequences: The L i m i t s of Variation. J . Mol. B i o l . 184; 99-105.  242.  Walz, D. A. (1978). Activation. Biblo.  243.  Walz, D. A., Kipfer*, R. K. , Jones, J . P., and Olsen, R. E. (1974). P u r i f i c a t i o n and P r o p e r t i e s of Chicken prothrombin. Arch. Biochem. Biophys. 164; 527535.  244.  Walz, D. A., K i p f e r , R. K., and Olsen, R. E. (1975). E f f e c t of V i t a m i n K D e f i c i e n c y , W a r f a r i n , and I n h i b i t o r s of P r o t e i n S y n t h e s i s Upon the Plasma L e v e l s of Vitamin K-Dependent C l o t t i n g F a c t o r s i n the Chick. J . Nutr. 105; 972-981.  245.  
Walz, D. A., Hewett-Emmett, D., and Seegers, W. H. (1977). Amino A c i d Sequence of Human Prothrombin Fragments 1 and 2. Proc. N a t l . Acad. S c i . USA 74j_ 1969-1972.  246.  Watanabe, Y., Tsukada, T., Notake, M., Nakanishi, S, and Numa, S. (1982). S t r u c t u r a l A n a l y s i s of R e p e t i t i v e DNA Sequences i n the Bovine C o r t i c o t r o p i n - / 3 - L i p o t r o p i n Precursor Gene Region. N u c l e i c A c i d s Res. 10; 14591469.  247.  Weaver, R. F., and Weissmann, C. (1979). Mapping of RNA by a M o d i f i c a t i o n of the Berk-Sharp Procedure: The 5' t e r m i n i of 15S /3-Globin mRNA P r e c u r s o r and Mature 10S 0-Globin mRNA Have I d e n t i c a l Map C o o r d i n a t e s . N u c l e i c A c i d s Res. 7j_ 1 175-1193.  248.  Wieringa, B., Hofer, E., and Weissmann, C. (1984).  C o m p a r i t i v e Aspects of Prothrombin Haemat. 44; 8-14.  A  212  Minimal Intron Length But No S p e c i f i c I n t e r n a l Sequence i s Required For S p l i c i n g the Large Rabbit 0-Globin I n t r o n . C e l l 37j_ 915-925. 249.  Wilson, A. C , C a r l s o n , S. S., and White, T. J . (1977). Biochemical E v o l u t i o n . Ann. Rev. Biochem. 46; 573639.  250.  Yoshitake, S., Schach, B. G., F o s t e r , D. C , Davie, E. W. and Kurachi, K. (1985). N u c l e o t i d e Sequence of the Gene for Human F a c t o r IX ( A n t i h e m p h i l i c F a c t o r B). Biochemistry 24; 3736-3750.  251.  Young, C. L., Barker, W. C , T o m a s e l l i , C. M. , and Dayhoff, M. 0. (1978). S e r i n e P r o t e a s e s , i n A t l a s of P r o t e i n S t r u c t u r e (Dayhoff, M. 0. Ed.), v o l . 5 ( s u p p l . 3), N a t i o n a l Biomedical Research Foundation, S i l v e r Spring, Maryland, pp. 73-93.  252.  Young, R. A., and D a v i s , R. W. (1983a). Efficient I s o l a t i o n of Genes by Using Antibody Probes. Proc. Acad. S c i . USA 80j_ 1194-1198.  253.  Young, R. A., and Davis, R. W. (1983b). 
Yeast RNA polymerase II Genes: I s o l a t i o n With Antibody Probes. Science 222; 778-782.  254.  Zaug, A. J . , and Cech, T. R. (1986). The I n t e r v e n i n g Sequence RNA of Tetrahymena i s an Enzyme. S c i e n c e 231; 470-475.  255.  Zuckerkandl, E., and P a u l i n g , L. (1965). E v o l u t i o n a r y Divergence and Convergence i n Plasma P r o t e i n s , i n E v o l v i n g Genes and P r o t e i n s (Bryson, V., and V o g e l , H. J . E d s . ) , Academic Press, New York, pp. 97-166.  256.  Zur, M., and Nemerson, Y. (1981). T i s s u e F a c t o r Pathways of Blood C o a g u l a t i o n , i n Haemostasis and Thrombosis (Bloom, A. L., and Thomas, D. P^ E d s . ) , Churchi11 L i v i n g s t o n e , Edinburgh, pp. 124-139.  257.  Z y t k o v i c z , T. H., and N e l s e s t u e n , G. L. (1976). 7-Carboxyglutamic A c i d D i s t r i b u t i o n . Biochem. Biophys. Acta 444; 344-348.  Natl.  

