Open Collections

UBC Theses and Dissertations

Prediction program of secondary structure from sequence of proteins according to the method of Chou and… Pham, Anne-Marie 1981

Full Text

PREDICTION PROGRAM OF SECONDARY STRUCTURE FROM SEQUENCE OF PROTEINS ACCORDING TO THE METHOD OF CHOU AND FASMAN

by

ANNE-MARIE PHAM
B.Sc., The University of Montpellier, 1978

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
(Department of Food Science)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
July 1981
(c) Anne-Marie Pham, 1981

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Food Science
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5
Date: July 7, 1981

ABSTRACT

Several methods have been proposed for predicting the secondary structure of proteins. The method of Chou and Fasman (1974a, 1974b, 1978a, 1978b) is relatively simple in theory and reasonably accurate. Unfortunately, the rules of Chou and Fasman are sometimes ambiguous and can be interpreted differently by researchers. Several attempts have been made to computerize the rules of Chou and Fasman (Argos et al., 1976; Chou and Fasman, 1978b; Dzionara et al., 1977). However, they compute only a portion of the protein secondary structure. The final assignment of the entire structure has to rely on the individual's manipulation.

In addition to three separate computer programs for prediction of the α-helix, β-sheet and β-turn structures, a fourth program was written for clarifying overlapping areas between α-helix and β-sheet. Although the predicted structures of 24 proteins with known conformation were in general satisfactory, there were a number of missing areas and different boundary values from X-ray diffraction patterns. In an attempt to improve the accuracy of prediction, the nucleation rules were modified to emphasize the importance of the type and positions of amino acid residues in the region. Furthermore, an extra step was added to the step for boundary search for α-helix and β-sheet regions. This adjustment step compared the importance of the boundary conformational parameters and the possible interference of the different conformations at the boundaries of the predicted regions. These modifications produced predicted secondary structures which were in good agreement with the X-ray diffraction patterns and the predicted patterns of Chou and Fasman (1974b, 1978b).

The Matthews' coefficient (C) values calculated for the α-helix and β-sheet were 0.39 or above, meaning that the prediction would be quite useful although there might be one or two helical regions missed or overpredicted. The paired-sample t-test revealed that the C values calculated for the present prediction were significantly improved (P < 0.05 and P < 0.01) from the values of Chou and Fasman. The computer-assisted technique described in this thesis, therefore, would decrease the discrepancy between the predicted data from different researchers due to the ambiguous interpretations of the rules of Chou and Fasman.

The second part of this study involved the application of the program to several food related proteins (bovine serum albumin, αs1-casein, β-casein, κ-casein, chymosin, α-lactalbumin, β-lactoglobulin, ovalbumin, pepsin and trypsinogen). Although references could not be found for all proteins tested, the results obtained for κ-casein and α-lactalbumin were comparable to those reported by other researchers.

Since conformational data have long been recognized as contributing to the information on protein and enzyme functionality, the computerization of the predictive method of Chou and Fasman will definitely be a tool for explaining the protein functionality in food processing.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Introduction
Literature Review
  Definition of the different conformational regions:
    A. Alpha-helix
    B. Beta-sheet
    C. Coil
    D. Beta-turn region
  Review of the various predictive methods
Materials and Methods
  The Chou and Fasman predictive method
    A. Search for helical regions
    B. Search for β-sheet regions
    C. Overlapping α- and β-regions
    D. Search for β-turns
    E. Evaluation of the predictive accuracy
  Amino acid sequence of proteins
  Programming
Results and Discussion
  Programming of the method
    A. Scheme for the search of helix and sheet regions
    B. Scheme for the search of β-turns
    C. Scheme for solving overlapping α- and β-areas
  Efficiency of the α-helix prediction
  Efficiency of the β-sheet prediction
  Efficiency of the β-turn prediction
  Efficiency of the resolution of overlapping α- and β-areas
  Comparison of the predictive accuracy
  Conformation of some food related proteins
Conclusions
Literature cited
Appendix

LIST OF TABLES

Table 1. Conformational parameters for α-helical and β-sheet residues based on 29 proteins
Table 2. Conformational parameters of helical boundary residues in 29 proteins
Table 3. Conformational parameters of β-sheet boundary residues in 29 proteins
Table 4. Frequency hierarchies of amino acids in the β-turns of 29 proteins
Table 5. Comparison of experimental (X-ray) and predicted helical regions obtained by Chou and Fasman and by our program before and after its refinement
Table 6. Comparison of experimental (X-ray) and predicted β-sheet regions obtained by Chou and Fasman and by our program before and after its refinement
Table 7. Agreement factors Qα, Cα obtained by Chou and Fasman (1974b), Argos et al. (1976) and our program
Table 8. Agreement factors Qβ, Cβ obtained by Chou and Fasman, Argos et al., and our program
Table 9. Percentages of helix, sheet, and turn of some food related proteins obtained with our program
Table 10. Helix, sheet, and turn regions of some food related proteins as predicted by our program

LIST OF FIGURES

Schematic diagram of the predicted secondary structure of bovine serum albumin
Schematic diagram of the predicted secondary structure of αs1-casein (bovine)
Schematic diagram of the predicted secondary structure of β-casein (bovine)
Schematic diagram of the predicted secondary structure of κ-casein (bovine)
Schematic diagram of the predicted secondary structure of chymosin (bovine)
Schematic diagram of the predicted secondary structure of α-lactalbumin (bovine)
Schematic diagram of the predicted secondary structure of β-lactoglobulin (bovine)
Schematic diagram of the predicted secondary structure of ovalbumin
Schematic diagram of the predicted secondary structure of pepsin (porcine)
Schematic diagram of the predicted secondary structure of trypsinogen (bovine)

ACKNOWLEDGEMENTS

The author wishes to express her sincere appreciation to her supervisor, Dr. S. Nakai, Professor, Department of Food Science, for his constant advice, help and encouragement throughout the course of this study, and in the preparation of the thesis.

She is also thankful to the members of her graduate committee:

Dr. R. C. Fitzsimmons, Associate Professor, Department of Poultry Science
Dr. W.D. Powrie, Professor and Head, Department of Food Science
Dr. B.J. Skura, Assistant Professor, Department of Food Science

for their interest in this research and for the review of this thesis.

INTRODUCTION

Broadly, the functional properties of proteins denote any physico-chemical property that affects the processing and behavior of proteins in food systems, as judged by the quality attributes of the final product. The functional properties are influenced by and vary according to: a) the source of proteins, b) the method of isolation and purification, c) the concentration of proteins, d) the type of modifications (enzymatic, acid, or alkaline hydrolysis), and e) environmental conditions (pH, temperature, and ionic strength). An extensive review of the various studies on protein functionality was published by Kinsella (1976). In general most of the changes in protein functionality have been found to be related to the degree of denaturation that proteins undergo. For example, in gelation a heat treatment is usually required to cause at least partial denaturation or unfolding of the polypeptide chains. Those unfolded chains will then gradually associate to form a gel matrix if attractive forces and thermodynamic conditions are suitable.

It is necessary to consider the hydrophobic, electronic and steric parameters of molecules to understand the mechanism of folding and unfolding of proteins, their biological activity, and to predict their behavior upon certain treatments (i.e., possible areas of denaturation, extent of unfolding, exposing of the hydrophobic core). The electronic parameters can be evaluated by electrophoresis, while a fluorometric method has been developed by Kato and Nakai (1980) for evaluation of hydrophobic parameters of more than sixty proteins. X-ray diffraction has been used to study the steric parameters. In addition, X-ray diffraction has contributed a great deal to our knowledge of protein-protein, protein-metal, protein-solvent interactions (Liljas and Rossman, 1974; Matthews, 1975b). However, these crystallographic analyses cannot be applied to most food proteins, or to many membrane and ribosomal proteins due to problems of crystallization. Furthermore, the X-ray technique is quite laborious, expensive and time-consuming.

Anfinsen et al. (1961), studying the reformation of reduced bovine ribonuclease, observed that the native structure of a protein is controlled by its amino acid sequence. This finding has become the motivation for many attempts to obtain patterns of protein structure from sequence data.

Although protein functionality depends on its unique three-dimensional topology, one can still learn much from prediction of its secondary structure. Nishikawa and Ooi (1972, 1973), in a study of protein tertiary structure, based their energy calculations (computation of sets of dihedral angles φ and ψ) on the conformation derived from the [...]. To fit the polypeptide chain of the tobacco mosaic virus protein (TMV) to a low resolution Fourier map, some information on the secondary structure was found to be desirable (Leberman, 1971).

Secondary structure predictions will provide useful information on areas of protein molecules where the X-ray pattern is not yet clearly resolved, especially at the N-terminal. For instance, areas predicted as helical by Chou and Fasman (1974b) for cytochrome b5 and ferricytochrome c had not been detected by X-ray diffraction at 2.8 Å resolution. Their results were later confirmed by X-ray at 2.45 Å and 2.0 Å resolution (Dickerson et al., 1971; Mathews et al., 1972; Takano et al., 1973).

Conformational information may also be used to design experimental models for checking the effects of conformational changes on hormonal or enzymatic activity (Dunn and Chaiken, 1975; Fink and Bodanszky, 1976; Perta et al., 1975). Some researchers (Deber et al., 1976; Kopple et al., 1975) considered study of the β-turn a good starting point for elucidating the influence of sequence and surroundings on protein conformation.
The β-turn structure is potentially identifiable and is simple enough to be characterized by experimental and predictive techniques (¹³C NMR, circular dichroism, conformational energy calculations). It also helps to explain the mode of activation of biologically active peptides (Bradbury et al., 1976).

Another application of the secondary structure prediction is the comparison of proteins of the same family which may maintain some conformational homology despite variations in sequence data, such as the case of proinsulins and proteinase inhibitors (Chou and Fasman, 1978b).

The method of Chou and Fasman (Chou and Fasman, 1978a, 1978b) has been frequently considered the least complicated in use for the prediction of the secondary structure of proteins. Yet, it possesses an overall accuracy higher than random guessing for a three-state model (α-helix, β-sheet and coil state). The percent of total residues correctly identified in a protein is 75 for this method versus 33 for random guessing. Furthermore, Chou and Fasman's work has improved on the earlier studies (Davies, 1964; Havsteen, 1966; Goldsack, 1969) since it takes into account combinations of residues that are α-helix, β-sheet and β-turn formers and breakers. The computed percentage of secondary structure obtained by their method agrees quite well with estimates based on CD studies (Kawauchi and Li, 1974; Garel et al., 1975; Garnier et al., 1975; Green, 1975; Matthews, 1975a; Scanu et al., 1975; Holladay and Puett, 1976; Munoz et al., 1976; Wallace, 1976).

With the exception of Argos et al. (1976), most of the laboratories that have applied the method of Chou and Fasman for specific investigations on protein structure have not yet reported a common computerized technique which can be used for other proteins. The objectives of this study were as follows: a) design a program which would provide similar results to those published by Chou and Fasman (1974b, 1978b) and b) if successful, extend this program to food related proteins so that possible correlation between protein functionality and conformational changes may be better understood.

LITERATURE REVIEW

Definition of the Different Conformational Regions

According to the IUPAC-IUB Commission on biochemical nomenclature (1970) the secondary structure of a segment of a polypeptide chain is the local spatial arrangement of its main chain atoms without regard to the conformation of its side chain or its relationship with other segments. The four typical conformations encountered in the secondary structure are the α-helix, the β-sheet, the β-turn (bend), and the random coil.

A. Alpha-Helix

The α-helix contains 3.6 amino acid residues per turn of the protein backbone, with the R groups of the amino acids extending outward from the axis of the helical structure. Hydrogen bonding can occur between the hydrogen of the NH group of one peptide bond and the oxygen of the CO group of another peptide bond four residues along the protein chain. The hydrogen bonds are nearly parallel to the axis of the helix, lending strength to the helical structure. Since natural amino acids exist in L-configuration, a right-handed helix is more stable than a left-handed helix. Therefore, if helical structures exist in proteins they are invariably right-handed helices (Anglemier and Montgomery, 1976). Since the α-helix has the lowest feasible free energy, formation of this structure is spontaneous, provided there are no interactions between charged R groups or steric hindrance by residues on the larger amino acids. Examples of protein types in which α-helix predominates are enzymes and respiratory proteins.

Taking into account the structural requirements that are specific to globular proteins, Lim (1974a) proposed a number of conditions necessary for a helix to exist along the peptide chain. Each separate helical region must have a hydrophobic side group or a group which would permit the helix to attach itself to the hydrophobic core of the globule. From the analysis of immersion of the hydrophobic side chains situated on the α-helix surface, Lim (1974a) emphasized the role of hydrophobic pairs, (1-5), and hydrophobic triplets, (1-2-5) or (1-4-5), in the attachment of the α-helix to the hydrophobic core. Hydrophobic-hydrophilic triplets, (1-2-5) and (1-4-5), are also important for helix stabilization.

Another way to describe the conformation of a protein chain is to measure the dihedral angles φ and ψ which correspond to rotations about the N-Cα and Cα-C bonds. The φ, ψ angles for residues in a regular right-handed helix are given by (-57°, -47°) (IUPAC-IUB, 1970). Since 3.6 residues are required to form a hydrogen bond in a single turn of the α-helix, all consecutive sequences of four or more residues having φ, ψ angles within 40° of (-60°, -50°) are considered to be helical. Some residues at the helical ends may have dihedral angles that fall outside the range -100° < φ < -20° and -90° < ψ ≤ -10° but are included as helical if they show hydrogen bonding. Based on the above criteria a total of 152 helical regions were identified in 29 proteins (Chou and Fasman, 1978a, 1978b).

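The dihedral-angle criterion for helical regions can be illustrated with a short sketch. It is not the program written for this thesis: the function name `helical_segments`, its parameters, and the sample angles are hypothetical, and "within 40°" is assumed to mean that φ and ψ each lie within 40° of the reference value, which matches the stated range of -100° to -20° for φ and -90° to -10° for ψ. The exception for hydrogen-bonded end residues is not modelled.

```python
from typing import List, Sequence, Tuple


def helical_segments(
    angles: Sequence[Tuple[float, float]],
    center: Tuple[float, float] = (-60.0, -50.0),
    tol: float = 40.0,
    min_len: int = 4,
) -> List[Tuple[int, int]]:
    """Return (start, end) residue indices (0-based, inclusive) of every run
    of at least `min_len` consecutive residues whose phi and psi angles each
    lie within `tol` degrees of `center`.

    Assumption: the window is a box in (phi, psi), not a circular distance.
    """

    def in_window(phi: float, psi: float) -> bool:
        return abs(phi - center[0]) <= tol and abs(psi - center[1]) <= tol

    segments: List[Tuple[int, int]] = []
    run_start = None  # index where the current in-window run began
    for i, (phi, psi) in enumerate(angles):
        if in_window(phi, psi):
            if run_start is None:
                run_start = i
        else:
            # The run (if any) ended at residue i - 1; keep it if long enough.
            if run_start is not None and i - run_start >= min_len:
                segments.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the last residue.
    if run_start is not None and len(angles) - run_start >= min_len:
        segments.append((run_start, len(angles) - 1))
    return segments


# Hypothetical (phi, psi) values in degrees, invented for illustration only.
example = [
    (-120.0, 130.0),  # outside the helical window
    (-57.0, -47.0),
    (-65.0, -40.0),
    (-60.0, -50.0),
    (-70.0, -35.0),
    (-90.0, 0.0),     # psi is 50 degrees from -50, so outside the window
    (-57.0, -47.0),
    (-57.0, -47.0),
    (-57.0, -47.0),   # only three in a row, too short to count
]
print(helical_segments(example))  # -> [(1, 4)]
```

The same function could be pointed at other reference angles by changing `center` and `min_len`. Angle wraparound at ±180° is not handled in this sketch.
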
Beta-Sheet In  this  conformation,  t h e p e p t i d e backbone  forms  a z i g - z a g p a t t e r n w i t h t h e R g r o u p s o f t h e amino a c i d s extending  above a n d b e l o w t h e p e p t i d e c h a i n .  p e p t i d e bonds a r e a v a i l a b l e conformation  f o r hydrogen bonding,  anti-parallel  p l e a t e d sheets  adjacent  Both p a r a l l e l and  are possible.  This  p r e d o m i n a t e s i n many f i b r o u s p r o t e i n s s u c h  insect  this  a l l o w s maximum c r o s s - l i n k i n g b e t w e e n  p e p t i d e c h a i n s a n d , t h u s , good s t a b i l i t y .  tion  Since a l l  conforma-  as s i l k a n d  fibres. According  c a n be d i v i d e d i n t o  t o L i m (1974a), 8 - s t r u c t u r a l t h r e e types by t h e i r  8  regions  relative  position  to  the s u r f a c e of the g l o b u l e : the  the s e m i - s u r f a c e structural be  type.  In order  requirements  i n t e r n a l , t h e surface,and  to e x i s t without  for globular proteins,each  . For  i n s t a n c e , e n t i r e l y hydrophobic  bic  r e g i o n s w i t h one  two  and/or l a s t  f a v o r the  o r two  two  t h e N-  and  g r o u p s . The of the  g r o u p s and  semi-surface  B-sheets.  so be v e r y  the o t h e r  t y p e may  residues of  critical.  f a c e t y p e must n o t  first will  l o c a t e d on  the  s i d e o f the band have  side only h y d r o p h i l i c  exist  i n peripheral regions  Gly.  Pro  c a n n o t be  stereochemistry  i n c l u d e d i n the  formation  molecules  can  bouring  w i t h the C  core.  l o o s e n h y d r o g e n bonds o f t h e p e p t i d e  on  a  the  side w i l l  i n the hydrophobic  9  and  fact impede  Water  g r o u p s neigh-  atoms o f G l y o r A l a when t h e s e  the h y d r o p h i l i c s i d e .  
sur-  s i d e or Gly  i s s t i p u l a t e d by  t h a t t h e p r e s e n c e o f G l y on t h e h y d r o p h o b i c the t i g h t p a c k i n g  B-structu-  o f i t s s i d e g r o u p . The  have G l y on t h e h y d r o p h o b i c  the h y d r o p h i l i c s i d e . T h i s  acids occur  i n the  p o s i t i o n o f c e r t a i n amino a c i d r e s i d u e s c a n a l -  re because of the  on  hydropho-  T h e s e r e g i o n s must h a v e o n l y h y d r o p h i l i c s i -  groups or m a i n l y The  or  C-terminal  t o be  s u r f a c e o f t h e g l o b u l e r e q u i r e s t h a t one  no  should  type.  c o n d i t i o n f o r a B-chain  only hydrophobic  regions  hydrophilic residues  p o s i t i o n s on  internal The  Ala  type  f o r m e d f r o m a c e r t a i n number o f h y d r o p h o b i c / h y d r o p h i l i c r e s -  idues  de  violating  two  ami-  The  angles  <j) ) JT  f o r residues i n a parallel-chain  8 - s h e e t and an a n t i p a r a l l e l and  8 - s h e e t h a v e v a l u e s o f (-119°,113°)  (-139°, 135°) , r e s p e c t i v e l y  (IUPAC-IUB',1970) . A- c o n s e c u t i v e  s e q u e n c e o f t h r e e o r more r e s i d u e s h a v i n g  <J> J a n g l e s JT  within  40°  o f (-120°,110°) o r (-140°,135°) a r e c o n s i d e r e d t o be i n  the  8-conformation,even  hydrogen bonding.  i f these r e s i d u e s are not i n v o l v e d i n  However,residues  a t t h e 8-ends t h a t h a v e d i -  hedral  a n g l e s o u t s i d e t h e r a n g e - 1 8 0 < cj) < -8.0  £  are included i n the 8-region  u  70°  least  i f they p a r t i c i p a t e  B-sheets  are not counted  residues but instead are assigned to the c o i l  i n at  as B -  c o n f o r m a t i o n and  the B-turn conformation. Chou a n d Fasman  gions,observed 131,  u  one h y d r o g e n b o n d . 
The two e n d r e s i d u e s t h a t a r e n o t hy-  drogen bonded i n a n t i p a r a l l e l  /or  a n d 1 7 5 ~ < i\>  (1978a,1978b),analyzing  3 two-residue  and f e r r o d o x i n 50-51),  B-segments  137 B - r e -  ( p a p a i n 111-112", 130-  10 t h r e e - r e s i d u e . B - s e g m e n t s , a n d  9 f o u r - r e s i d u e 8 - s e g m e n t s . T h i s number i n c r e a s e s t o 28 a n d 24 for  the f i v e - r e s i d u e  The  t h r e e l o n g e s t . B - r e g i o n s c o n t a i n 17 r e s i d u e s ( t h e r m o l y s i n  16-32),  and s i x - r e s i d u e B - s e g m e n t s , r e s p e c t i v e l y .  16 r e s i d u e s ( r i b o n u c l e a s e 9 6 - 1 1 1 ) ,  ( l a c t a t e dehydrogenase 280-294). (1978a,1978b) i d e n t i f i e d residues  i n 29 p r o t e i n s .  I n c o n t r a s t , Chou a n d Fasman  24 h e l i c a l  s e g m e n t s l o n g e r t h a n 17  The r e a s o n  10  a n d 15 r e s i d u e s  that helices  are longer  than  3 - s h e e t s may be b e c a u s e o f t h e g r e a t e r e a s e o f h e l i c a l  i n t r a c h a i n h y d r o g e n bond f o r m a t i o n c o m p a r e d t o 3 - s h e e t i n t e r c h a i n h y d r o g e n bond  C.  Coil  Regions  Residues to  formation.  i n the p r o t e i n that are not c l a s s i f i e d  be i n t h e h e l i x o r 3 - r e g i o n s  are assigned  c o n f o r m a t i o n , i r r e s p e c t i v e l y o f t h e cj>,  to the c o i l  angles  of the  r e s i d u e . Hence, t h r e e c o n s e c u t i v e r e s i d u e s h a v i n g conformation  o r two c o n s e c u t i v e r e s i d u e s h a v i n g  conformation  but without  be  i n the c o i l  longest c o i l s residues  s t a t e (Chou and Fasman, 1 9 7 8 b ) . The f o u r r e g i o n s found  among 29 p r o t e i n s c o n t a i n e d  ( t h e r m o l y s i n 181-234) , 51 r e s i d u e s  . completely  many 3 - t u r n s . 
54 (carboxypeptidase 123-173), 46 residues (ferredoxin 4-49) and 41 residues (rubredoxin 14-54). These coil regions cannot be considered structureless since they may contain hydrogen bonding. In the case of ferredoxin and rubredoxin, the flexibility of these coil regions is severely restricted by iron-sulfur coordinations.

D.  β-Turn Regions

The β-turn involves four consecutive residues in a protein where the polypeptide chain folds back on itself by nearly 180°. It is these regions of chain reversal that give a protein its globularity rather than linearity. Lewis et al. (1971) proposed that chain reversals play the important role of bringing distant parts of the peptide chain together, enabling interactions between helix-helix, antiparallel and parallel β-pleated sheet, or helix-β-sheet. Venkatachalam (1968) was the first to characterize three types of turns in a tetrapeptide where there is a hydrogen bond between the CO group of residue i and the NH group of residue (i+3). Most bends (80%) from 8 proteins contain at least one or more of the following residues: Ser, Thr, Asp, Asn and Pro. This supports the idea that these residues are responsible for bend stability and perhaps for bend formation. With the exception of Pro, which can occupy only a few backbone conformations, these residues have been shown to be capable of forming side chain-backbone hydrogen bonds with their own backbone (Lewis et al., 1973).

Using the X-ray atomic coordinates from 29 proteins, Chou and Fasman (1977) computed the Cα(i) - Cα(i+3) distances of 4651 tetrapeptides. Those whose distances were below 7 Å and not in a helical region were considered as β-turns. Of the 457 β-turns elucidated, 243 of them also have O(i) - N(i+3) distances < 3.5 Å and were considered to have hydrogen bonding. Chou and Fasman (1977) also assigned β-turns to 11 types similar to those of Lewis et al. (1973) based on the dihedral φ, ψ angles of the second and third residues of the bend.

Review of the Various Predictive Methods

Several researchers have attempted to predict the secondary structure of proteins from their sequence data. Szent-Gyorgyi and Cohen (1957), through their study with the KMEF proteins (keratin, myosin, epidermin and fibrinogen) and with collagen, demonstrated that the helix content determined by optical rotatory dispersion (ORD) is inversely proportional to the percentage of Pro residues distributed throughout the sequence. They concluded that less than 3 percent Pro distributed randomly in a chain permits more than 50 percent α-helix. About 8 percent Pro deforms the backbone into a random coil. Very high Pro content may favor a poly-L-proline helix type.

Davies (1964), using the b0 value of ORD, found a strong correlation between the helix content of fifteen proteins and the mole percentage of (Ser + Thr + Val + Ile + Cys), residues classified as "nonhelical-formers" by Blout et al. (1960) and Blout (1962). No strong correlation between the helix content and the mole percentage of any particular amino acid was observed (Davies, 1964). Furthermore, the correlation reported by Szent-Gyorgyi and Cohen (1957) was not sustained when the number of proteins was increased by Davies (1964). Therefore, the correlations previously mentioned should be applied with caution until they are supported by additional data.

Havsteen (1966) carried out a statistical analysis of the correlation between the content of certain amino acids in 40 proteins and their ORD parameter b0. A linear relationship was observed between -1/b0 and the percentage of (Ser + Thr + Pro), thus supporting the previous findings on the interactions between hydroxyl groups of Ser, Thr and peptide linkages which may interfere with the formation of α-helices. Pro residues tend to destabilize helices by requiring a 90° bend of the peptide chain. The presence of a β-form of left-handed helices also seems to markedly influence b0. The influence of the amino acid side chains on b0 justifies their classification as helix-favoring, helix-indifferent, and helix-inhibiting groups.

Goldsack (1969),
using the data of 107 proteins, demonstrated that the parameter b0 can be correlated to the total content of the so-called helix-forming amino acids (Ala + Arg + Asp + Cys + Glu + Leu + Lys), as well as to that of the nonhelix-forming group of amino acids (Gly + Phe + Pro + Ser + Thr + Trp + Tyr). On the other hand, using the a0 parameter, it seemed that no particular amino acid side chain grossly controls the amount of β-structure in a protein. Nevertheless, further ORD characterization of the different β-structures (intramolecular and intermolecular, parallel and antiparallel, as well as the cross-β structure) will be useful to elucidate the relationship between a0 and the amino acid composition.

These preliminary efforts in predicting protein conformation relied heavily on ORD data and amino acid composition. The X-ray analysis of protein structure was still at an early stage of development and the amino acid sequence was unknown for many proteins. Scheraga (1960) attempted to construct a three-dimensional model of ribonuclease on the basis of available data on the primary, secondary and tertiary structures. Its importance lies in the fact that it provides a basis to plan experiments for the investigation of side chain group interactions and it may also be of help in Fourier analysis of X-ray data on ribonuclease crystals.

On the basis of myoglobin, alpha- and beta-hemoglobin of known sequence and structure, Guzzo (1965) suggested that the presence of the four critical groups, Pro, Asp, Glu and His, may be a necessary condition for a section of proteins to be non-helical. Analyses of Pro replacement by Asp and Glu in mutant and variant proteins supported his theory. This was applied in an effort to predict the secondary structure of lysozyme and tobacco mosaic virus. Absence of hydrophobic bonding and weakening of interpeptide hydrogen bonding as a result of water competition in the vicinity of those polar residues might be the reason for the unfavorable effect of those residues on helix formation.

Prothero (1966) compared his results to those of Guzzo (1965) on six proteins and proposed a rule which seems to achieve a reasonable degree of fit with the known protein structures. The rule states: any region of five residues will be α-helical if at least three of its residues are comprised of Ala, Val, Leu, or Glu. Alternatively, any region of seven residues will be α-helical if at least three residues are comprised of Ala, Val, Leu, Glu and an additional residue includes Gln, Ile, or Thr. Using this rule, goodness of fit between 65 and 68% was obtained for α-, β- and γ-hemoglobin, lysozyme and myoglobin.

Periti et al. (1967) carried out a systematic statistical analysis of the available data for horse hemoglobin, and sperm whale myoglobin.
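The five-residue clause of Prothero's rule quoted above lends itself to a sliding-window scan. The following is only a minimal sketch under stated assumptions: the function name and the use of one-letter residue codes (A, V, L, E for Ala, Val, Leu, Glu) are illustrative choices, not part of the original rule, and the seven-residue alternative is deliberately left out because its exact wording admits more than one reading.

```python
# Residues counted by the five-residue clause: Ala, Val, Leu, Glu
# (one-letter codes are an assumption made for this sketch).
HELIX_FAVORING = set("AVLE")


def prothero_five_residue_rule(sequence):
    """Mark every residue lying in some five-residue window that holds
    at least three of Ala, Val, Leu or Glu."""
    seq = sequence.upper()
    helical = [False] * len(seq)
    for start in range(len(seq) - 4):
        window = seq[start:start + 5]
        if sum(residue in HELIX_FAVORING for residue in window) >= 3:
            for position in range(start, start + 5):
                helical[position] = True
    return helical


if __name__ == "__main__":
    flags = prothero_five_residue_rule("GGALEGGGGG")
    print("".join("H" if flag else "-" for flag in flags))
```

Marking the whole window, rather than only the favorable residues, follows the wording "any region of five residues will be α-helical".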
This led them to consideration of helical and anti-helical pairs of amino acid residues (1-2, 1-3, 1-4, ..., 1-7; 2-3, 2-4, ..., 2-8; 3-4, ...). Histograms for the recognition of helical segments of egg white lysozyme were constructed according to their method.

Finding that it was undesirable to represent helical segments by the usual linear way, Schiffer and Edmunson (1967) proposed a two-dimensional representation called the "helical wheel". The wheels are projections of the amino acid side chains onto a plane perpendicular to the axis of the helix. Side chain interactions and general characteristics of the helices can be better visualized. Using data from four proteins, it was observed that areas with hydrophobic residues located in the n±3, n, and n±4 positions have the greatest potential for helicity. Such hydrophobic arcs are absent in nonhelical wheels. Hence, the wheel representation may be of help to identify areas with helical potential. Among the six proteins chosen for testing the wheel method, the prediction of helical segments in insulin is the most accurate and closest to the X-ray data later proposed by Blundell et al. (1972).

Low et al. (1968) looked for sequence identities of length varying between that of di- and hexapeptides.
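The helical wheel projection described above can be illustrated numerically. This is a rough sketch, not the authors' procedure: it assumes an ideal α-helix with 3.6 residues per turn, so that successive residues are rotated by 100° about the helix axis (a value taken from standard helix geometry, not from the text), and the function name is hypothetical.

```python
import math

# 360 degrees / 3.6 residues per turn of an ideal alpha-helix (assumed).
DEGREES_PER_RESIDUE = 100.0


def wheel_positions(sequence):
    """Return (residue, angle in degrees, x, y) for each residue projected
    onto a unit circle perpendicular to the helix axis."""
    positions = []
    for index, residue in enumerate(sequence):
        angle = (index * DEGREES_PER_RESIDUE) % 360.0
        radians = math.radians(angle)
        positions.append((residue, angle, math.cos(radians), math.sin(radians)))
    return positions


if __name__ == "__main__":
    for residue, angle, x, y in wheel_positions("ABCDE"):
        print(f"{residue}: {angle:5.1f} deg  ({x:+.2f}, {y:+.2f})")
```

Under this assumption residues n+3 and n+4 fall at 300° and 40°, that is, within 60° on either side of residue n, which is why hydrophobic residues at the n, n±3 and n±4 positions appear as a single arc on the wheel.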
The theory behind their method is based on the assumption that if helix-forming sequences in which local interactions predominate can be recognized then their position along the polypeptide chain may be irrelevant. A computer program was written to locate sequence identities from available data. Although this method gives less over-prediction of helical regions compared to other methods, it results in more omissions. The authors recognized that the procedure needs to be improved by taking into account the effects of long-range interactions and that of non-helical sequences.

Kotelchuck and Scheraga (1969), from energy computations, formulated a set of rules in which various single peptide units were assigned as helix-forming (Ala, Val, Leu, Ile, Met, Thr, Gln, Glu, Phe, Cys, Trp, and Arg) or helix-breaking (Ser, Asn, Asp, His, Tyr, and Lys). Their designations were quite similar to those of previous studies (Prothero, 1966; Schiffer and Edmunson, 1967). This allowed correct identification of 61% of the helices and 78% of the total residues in four proteins: myoglobin, lysozyme, tosyl-α-chymotrypsin and ribonuclease A. They did attempt to define conditions for helix nucleation and termination, ruling that five or more peptide units constitute the minimum length for any helical area and that a sequence of two helix-breakers will stop the helix propagation. They agreed, however, that their model was not very accurate for smaller protein systems where long-range interactions may play an important role in helix nucleation and stabilization.

Using a combination of the Kotelchuck and Scheraga (1969) and the Schiffer and Edmunson (1967) schemes, Leberman (1971) succeeded in correctly assigning 82% of all residues in seven proteins as helical and nonhelical regions. The omission of observed regions was explained as an effect of the tertiary or even the quaternary structure, or the binding of a prosthetic group (e.g., human hemoglobin, myoglobin).

Lewis et al. (1970) based their method on the Zimm and Bragg (1959) σ and s parameters for helix initiation and elongation. The parameters were obtained from melting curves of random copolymers of amino acids. Helix probability profiles constructed for eleven proteins yield 68% accuracy. Correlation between the propensity of a residue to be a helical former in the denatured protein and its occurrence in a helical area in the corresponding native protein was suggested. The correlation supports the hypothesis that residues in the αR conformation may be involved in the nucleation of protein folding. A comparison was made of the conformational structure of denatured cytochrome c from various species (Lewis and Scheraga, 1971). They showed that, even though there were amino acid replacements in cytochrome c throughout evolution, there remains a conservation of the nature of the helix-forming power at each position in the chain.

Despite the progress in protein prediction there was still a lack of information on β-sheet structure. This was because the earliest proteins elucidated by X-ray diffraction were hemoglobin and myoglobin, which are devoid of β-sheet conformation. Hence, most of the researchers at that time often chose to ignore the β-sheet conformation in their calculations. Furthermore, it was difficult to obtain β-sheet in solution for spectrophotometric analysis. However, as more protein structures were elucidated by X-ray diffraction, it became increasingly apparent that the presence of β-sheet was as important as that of α-helix. Interpretation of an electron density map at 2 Å resolution indicates that the predominant conformation in concanavalin A is formed by two antiparallel β-sheets. Residues not included in the β structures are arranged in regions of random coil. One of the pleated sheets contributes extensively to the interactions among the monomers to form both dimers and tetramers (Edelman et al., 1972). X-ray analysis of tosyl-α-chymotrypsin revealed only a small fraction of α-helix but several adjacent, anti-parallel pleated sheets stabilized by hydrogen bonds (Birktoft and Blow, 1972).

Ptitsyn and Finkelshtein (1970) classified various amino acid residues tentatively as helical, antihelical, β-breaker or β-former according to their tendency of stabilizing the various β-structures. Amino acid residues with compact hydrocarbon side groups (Leu, Ala, Met) have a greater tendency to enter helical zones than the polar ones. Nonpolar amino acids, except Cys and Tyr, are assigned a positive β-potential, whereas those with charged side groups and Pro are considered as β-breakers. Although their classification takes into account only the local interactions of the side groups with the main chain backbone and not with each other, they obtained good agreement between their predictive method and X-ray data (Qα = 79% and Qβ = 79%) for nine proteins. This supports the suggestion that instead of competing with local interactions and dictating the secondary structure, distant interactions work in harmony with the local ones to help stabilize the conformation which mainly results from local interactions.

Nagano (1973) developed a computer method to predict helices, loops and β-structures from the primary structure. The basis of his method lies on the assumption that short-range interactions are due to amino acid residue pairs separated by m residues (m = 0, 1, 2, 3, ..., 6). Four prediction functions (helix, loop, random coil, and β-structure) were estimated by a linear combination of statistical quantities of different m values as a measure of the statistical constraint. The coefficients used in the combination were determined to make the number of correct assignments as large as possible. Very successful results were obtained (85.3% for helix prediction, 64.4% for loop, and 90.1% for β-structures).

On the basis of the influence of nearest neighbouring pairs of amino acids (n-1) and (n+1) on the conformation of amino acid (n), Kabat and Wu (1973a, 1973b) designed, then later revised, their 20x20 table of frequency of occurrences of various conformations tabulating three values: α-helix, β-sheet and neither. The frequencies were then used to locate helix-breaking positions in various proteins. Due to limited data on proteins with extensive β-sheet fragments, recognition of the β-sheet breaking regions was made on papain only. The regions between two β-sheet breaking residues would be permissively β-sheet regions. Application of the method on concanavalin A, which has many β-sheet regions, allows location of 10 out of the 13 β-sheet areas. Although no guidelines were given to prevent overprediction of α- and β-regions, the conjunction of this method with the helical wheel method or other schemes may lead to a higher degree of accuracy (Chou and Fasman, 1978b).

Lim (1974b) proposed another method that takes into account both quantitative evaluation of energy and qualitative stereochemical considerations. Based on the most characteristic features of globular proteins (compactness of form; presence of a tightly packed hydrophobic core; a polar shell) and the role of the different types of long-range interactions, different requirements were set up to find the most energetically advantageous conformations for the protein chain. Lim (1974a) also elaborated on the structural role of the different hydrophilic and hydrophobic side groups of the proteins in the stabilization of the tertiary structure. α-Helix and β-sheet are classified into various types according to their relative orientation to the globule surface. Regions which do not belong to helix or β-sheet type are classified as irregular regions. Through the use of specific helical and antihelical pairs and triplets at positions [1-2], [1-3], [1-4], [1-5], [1-2-5] and [1-4-5], Lim (1974b) developed a predictive algorithm for helices. The search for β-structural areas is only done on fragments of the chain not attributed to α-helical regions, because it is energetically more advantageous to have one long helix than several shorter β-regions. The accuracy of the predictive method applied to 25 proteins of known structure was 81% for α-helix and 85% for β-structure. The conformation of 25 unknown proteins was also tested with this method.

Chou et al. (1972), through CD studies of poly(N5-(3-hydroxypropyl)-L-glutamine) and of copolymers of hydroxypropyl-L-glutamine with L-leucine, reached the following conclusions. The helical content of the homopolymer and copolymers was found to increase with: a) decreasing temperature, b) increasing methanol concentration, and c) increasing molar ratios of Leu in the copolymers. A survey of the inner helical regions of eleven proteins reveals that of all the amino acids occurring in the helical regions, Leu occurs the most frequently. This suggests that Leu may be the strongest helical-forming amino acid residue in polypeptides, as well as in proteins (Chou and Fasman, 1973). For the first time, the helix and β-sheet conformational potential of all 20 amino acids were established in their hierarchical order.

Following this study, a more complete investigation (Chou and Fasman, 1974a) on the conformational parameters Pα, Pβ and Pc of each amino acid residue in 15 proteins served as the basis for a new predictive method (Chou and Fasman, 1974b). The major advantages of their method are its simplicity and its accuracy. Without recourse to complicated computer analysis, one can expediently locate the helix, β-sheet and coil regions of proteins with 70-80% accuracy (Chou and Fasman, 1978a, 1978b) by simply averaging the Pα, Pβ and Pt values of the residues in the segment under consideration.
Another way of locating the various conformations is to assign each residue as a former, indifferent, or a breaker based on its helix and β-sheet potential. The β-turn conformational parameter Pt was also computed, enabling the prediction of chain reversals and tertiary folding in proteins. The simplicity and effectiveness of the method are the main reasons for its wide use (Chou and Fasman, 1978b). Indeed, according to Argos et al. (1976), the complexity of some proposed algorithms is such that their computerization has not been developed. This problem may be the reason why these methods have limited popularity compared to the method of Chou and Fasman or other popular methods (Lim, 1974b; Kabat and Wu, 1973a).

MATERIALS AND METHODS

The Predictive Method of Chou and Fasman

Using the criteria of dihedral angles and hydrogen bond formation, Chou and Fasman (1978a, 1978b) first determined the different conformational states in 29 proteins. The frequency of all 20 amino acids in each conformation was then calculated by dividing their occurrence in the conformation under consideration by their total occurrence in the 29 proteins. The percentages of residues in the 29 proteins found in the helical, sheet, coil, and β-turn regions are respectively represented by average fractions <fα> = 0.38, <fβ> = 0.20, <fc> = 0.42, and <ft> = 0.20. Each residue is assigned to the α, β, or coil state so that <fα> + <fβ> + <fc> = 1.00.
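The frequency bookkeeping described above can be sketched as follows. This is a minimal illustration, not the authors' program: the residue counts are invented for the example, the state labels and function names are hypothetical, and the last step (dividing a residue's frequency by the average fraction of the conformation) anticipates the definition of the conformational parameters given in the text that follows.

```python
# Average fractions of residues per conformation in the 29 proteins,
# as quoted in the text above.
AVERAGE_FRACTION = {"alpha": 0.38, "beta": 0.20, "coil": 0.42, "turn": 0.20}


def conformational_frequency(count_in_state, total_count):
    """Fraction of a residue's occurrences that fall in one conformation."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return count_in_state / total_count


def conformational_parameter(frequency, state):
    """Normalize a residue's frequency by the average fraction of the state."""
    return frequency / AVERAGE_FRACTION[state]


# Hypothetical counts: a residue seen 200 times, 114 of them in helices.
f_alpha = conformational_frequency(114, 200)
p_alpha = conformational_parameter(f_alpha, "alpha")

if __name__ == "__main__":
    print(f"f_alpha = {f_alpha:.2f}, P_alpha = {p_alpha:.2f}")
```

A normalized value above 1 means the residue occurs in that conformation more often than an average residue does, and a value below 1 means it occurs there less often.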
The β-turn residue assignment is made independently. Each amino acid is then assigned as former, indifferent, or breaker according to its conformational parameters Pα and Pβ, which are obtained by dividing the frequency of its occurrence in a conformation by the respective average frequency (e.g., Pα = fα/<fα>, Pβ = fβ/<fβ>, Pt = ft/<ft>). The conformational parameters Pα and Pβ for the 20 amino acids are listed in Table 1 in hierarchical order along with their assignment as former, indifferent, or breaker.

Table 1.  Conformational parameters for α-helical and β-sheet residues based on 29 proteins (a).

α-Residues   Pα     Helical           β-Residues   Pβ     β-Sheet
                    assignment (b)                        assignment (c)

Glu(-)       1.51   Hα                Val          1.70   Hβ
Met          1.45   Hα                Ile          1.60   Hβ
Ala          1.42   Hα                Tyr          1.47   Hβ
Leu          1.21   Hα                Phe          1.38   hβ
Lys(+)       1.16   hα                Trp          1.37   hβ
Phe          1.13   hα                Leu          1.30   hβ
Gln          1.11   hα                Cys          1.19   hβ
Trp          1.08   hα                Thr          1.19   hβ
Ile          1.08   hα                Gln          1.10   hβ
Val          1.06   hα                Met          1.05   hβ
Asp(-)       1.01   Iα                Arg          0.93   iβ
His          1.00   Iα                Asn          0.89   iβ
Arg          0.98   iα                His          0.87   iβ
Thr          0.83   iα                Ala          0.83   iβ
Ser          0.77   iα                Ser          0.75   bβ
Cys          0.70   iα                Gly          0.75   bβ
Tyr          0.69   bα                Lys(+)       0.74   bβ
Asn          0.67   bα                Pro          0.55   Bβ
Pro          0.57   Bα                Asp(-)       0.54   Bβ
Gly          0.57   Bα                Glu(-)       0.37   Bβ

(a) Chou and Fasman (1978b).
(b) Helical assignments: Hα, strong α-former; hα, α-former; Iα, weak α-former; iα, α-indifferent; bα, α-breaker; Bα, strong α-breaker.
(c) β-Sheet assignments: Hβ, strong β-former; hβ, β-former; Iβ, weak β-former; iβ, β-indifferent; bβ, β-breaker; Bβ, strong β-breaker.

The symbols H and h may be thought of as strong and moderate hydrogen bonding, respectively, with the subscripts α, β denoting helical or β-sheet conformation. Each amino acid residue can also be characterized by its boundary conformational parameters (PαN, PαC, PnαN, PnαC, PβN, PβC, PnβN, PnβC) as listed in Tables 2 and 3.

When all the residues in a protein sequence have been classified, one can use the empirical rules discussed below to predict its secondary structure (Chou and Fasman, 1978a, 1978b).

Table 2.  Conformational Parameters of Helical Boundary Residues in 29 Proteins.

    PαN              PαC              PnαN             PnαC

Glu(-)  2.44     Lys(+)  1.83     Ser     1.55     His(+)  1.86
Asp(-)  2.02     His(+)  1.77     Asn     1.42     Asn     1.64
Pro     2.01     Met     1.57     Gly     1.41     Gly     1.64
Trp     1.47     Val     1.25     His(+)  1.22     Pro     1.58
Ala     1.29     Arg(+)  1.20     Pro     1.10     Lys(+)  1.49
Gln     1.22     Glu(-)  1.24     Thr     1.09     Arg(+)  1.24
Thr     1.08     Gln     1.22     Glu(-)  1.04     Asp(-)  1.06
Asn     0.81     Ala     1.20     Lys(+)  1.01     Phe     1.04
Gly     0.76     Leu     1.13     Tyr     0.99     Tyr     0.96
Ser     0.74     Cys     1.11     Asp(-)  0.98     Cys     0.94
His(+)  0.73     Phe     1.10     Phe     0.93     Ser     0.93
Met     0.71     Ile     0.98     Leu     0.85     Ile     0.87
Tyr     0.68     Ser     0.96     Met     0.83     Thr     0.86
Ile     0.67     Thr     0.75     Ile     0.78     Leu     0.84
Cys     0.66     Tyr     0.73     Gln     0.75     Gln     0.70
Lys(+)  0.66     Asp(-)  0.61     Val     0.75     Glu(-)  0.59
Phe     0.61     Asn     0.59     Ala     0.70     Ala     0.52
Val     0.61     Gly     0.42     Cys     0.65     Met     0.52
Leu     0.58     Trp     0.40     Trp     0.62     Val     0.32
Arg(+)  0.44     Pro     0.00     Arg(+)  0.34     Trp     0.16

Helix boundary residues include the three helical residues on both ends of a helical region and the three nonhelical residues adjacent to the helical end residues, a total of six residues on each end of the helix. PαN = normalized frequency of residues in the N-terminal helix region; PαC = normalized frequency of residues in the C-terminal helix region; PnαN = normalized frequency of residues in the N-terminal nonhelical region; PnαC = normalized frequency of residues in the C-terminal nonhelical region.

Table 3.  Conformational Parameters of β-Sheet Boundary Residues in 29 Proteins.

    PβN              PβC              PnβN             PnβC

Ile     1.94     Tyr     1.96     Asn     1.86     Pro     1.69
Val     1.69     Val     1.79     Pro     1.58     Gly     1.68
Gln     1.65     Phe     1.50     Gly     1.46     Trp     1.59
Phe     1.40     Ile     1.35     Ser     1.41     Ser     1.49
Trp     1.49     Leu     1.27     Asp(-)  1.39     Asp(-)  1.32
Met     1.43     Asn     1.21     Cys     1.34     Thr     1.16
Leu     1.30     Trp     1.19     Tyr     1.23     Asn     1.13
Thr     1.17     Cys     1.11     Lys(+)  1.09     Arg(+)  1.05
Tyr     1.07     Met     0.95     Gln     1.09     Tyr     1.01
Lys(+)  1.00     His(+)  0.90     Thr     1.09     His(+)  0.96
Arg(+)  0.90     Arg(+)  0.90     Glu(-)  0.92     Met     0.85
Cys     0.87     Asp(-)  0.85     Arg(+)  0.89     Glu(-)  0.85
Ala     0.86     Ser     0.79     His(+)  0.78     Lys(+)  0.82
Pro     0.66     Thr     0.75     Ala     0.67     Gln     0.77
Asn     0.66     Ala     0.75     Ile     0.59     Ala     0.74
Gly     0.63     Gly     0.74     Met     0.52     Val     0.59
Ser     0.63     Lys(+)  0.74     Trp     0.48     Leu     0.59
His(+)  0.54     Gln     0.65     Leu     0.46     Ile     0.53
Asp(-)  0.38     Glu(-)  0.55     Val     0.42     Cys     0.53
Glu(-)  0.35     Pro     0.40     Phe     0.30     Phe     0.44

β-Sheet boundary residues include the three residues on both ends of a β region and the three non-β residues adjacent to the β-sheet end residues, a total of six residues on each end of the β-sheet region. PβN = normalized frequency of residues in the N-terminal β region; PβC = normalized frequency of residues in the C-terminal β region; PnβN = normalized frequency of residues in the N-terminal non-β region; PnβC = normalized frequency of residues in the C-terminal non-β region.

A.  Search for Helical Regions

The search was carried out according to the method of Chou and Fasman (1978a, 1978b), which can be described as follows:

1.  Helix nucleation.  A cluster of four helical residues (hα or Hα) out of six residues along the protein sequence will initiate a helix. A weak helical residue (Iα) counts as 1/2 hα (i.e., three hα and two Iα residues out of six may also cause helix nucleation).

2.  Helix propagation.  Extend the helical segment in both directions as long as adjacent tetrapeptides are not helix breakers (see below). When overlapping segments all satisfy the helix nucleation rule, they are linked together into a long helix. The nucleated helix of six residues should contain at least two thirds h's, while the propagated helix should be comprised of one half or more helix formers. It is important to utilize the rule that a weak helical former (Iα) counts as 1/2 hα in the segment. Both the helix nucleation segments and the entire helix should have fewer than one third helix breakers (bα or Bα).

3.  Helix Termination.  The propagated helix is terminated on both sides by the following tetrapeptide breakers with <Pα> < 1.00: b4, b3i, b3h, b2i2, b2ih, b2h2, bi3, bi2h, bih2, and i4. Some tetrapeptides, such as hi3 and h2i2, may have <Pα> < 1.00 but are not listed as breakers since they allow helix propagation to continue. Once the helix is defined, some of the residues (h or i) in the above tetrapeptide breakers may be incorporated at the helical ends. For example, the hi of the breaker bbhi may be added to the predicted helix only at the N-terminal side, but the bb may not be included at either the N- or C-terminal helix. The notations i, b, h in the tetrapeptide breakers also include I, B, and H, respectively. Adjacent β-regions that have higher β- than α-potential (i.e., <Pβ> > <Pα>) also terminate helix propagation.

4.  Proline as Helix Breaker.  Pro cannot occur in the inner helix or at the C-terminal helical end but
I  the N - t e r m i n a l  r e s i d u e s ) i n t h e N-  P r o , Asp^ ^ , G l u ^ ^ helical  e n d , w h i l e His'- '', +  a s s i g n m e n t s a r e g i v e n t o P r o and A s p helix),  as w e l l  to s a t i s f y  as H a t t h e N - t e r m i n a l  1.  condition A . l . Glu i s s t i l l h e l i x w h i l e H i s and L y s a r e helix.  Any segment o f s i x r e s i d u e s o r l o n g e r i n  a n a t i v e p r o t e i n w i t h <P_> ^  satisfying  (near  a s , A r g (near the C - t e r m i n a l  h and I , r e s p e c t i v e l y , a t t h e C - t e r m i n a l  Rule  are  are i n c o r p o r a t e d i n t o the C-terminal  i f necessary  assigned still  the N-terminal  +  helix)  third  helix. 5.  +  turn (i.e.,  a  > 1.03 and <P > > <P„>, and —  a  c o n d i t i o n s A . l through  3-  A . 5 , i s p r e d i c t e d as  helical. B.  Search  f o r 3-sheet Regions  The s e a r c h o u t by a p p l y i n g  f o r 3-pleated  sheet  r e g i o n s was  carried  t h e s e t o f r u l e s o u t l i n e d by Chou and Fasman  ( 1 9 7 8 a , 1978b) as f o l l o w s : 1. 3-formers  3-sheet N u c l e a t i o n .  A sequence o f t h r e e  (hg o r Hg) o r a c l u s t e r o f t h r e e 3 - f o r m e r s o u t o f  f o u r o r f i v e r e s i d u e s a l o n g the p r o t e i n sequence initiate  a 3-sheet  32  will  2.  8-Sheet P r o p a g a t i o n .  segment i n b o t h peptides  directions  8-sheet  (see below).  ( b ^ o r B^) o r l e s s t h a n one  8-Sheet T e r m i n a t i o n .  Apply  c o n d i t i o n s A.3  f o r h e l i x t e r m i n a t i o n by u t i l i z i n g  peptide breakers pagation.  8-Sheet  formers. 3.  
outlined  tetra-  i f t h e e n t i r e segment c o n t a i n s one  t h i r d o r more 8 - s h e e t b r e a k e r s half  t h e 8-sheet  as l o n g a s a d j a c e n t  a r e n o t 3-sheet b r e a k e r s  formation i sunfavorable  Extend  w i t h <Pg>  Adjacent  8-potential  (i.e.,  <  a-regions <P >  >  a  <  1  t h e same  tetra-  f o r s t o p p i n g 8-sheet  pro-  t h a t h a v e h i g h e r a- t h a n  ^3 ) >  c  a  n  a l s o t e r m i n a t e 8-  propagation. 4. the into <P > a  Strong  8-Sheet B r e a k e r s .  s t r o n g e s t 8-sheet b r e a k e r s 8-sheets u n l e s s they occur <  <P > 8 C  incorporated into  C h a r g e d r e s i d u e s and  t o 8 - s h e e t f o r m a t i o n a n d s h o u l d n o t be  8-sheets u n l e s s they occur  <P > p D  Rule  2.  >  i n tetrapeptides  1.  A n y segment o f t h r e e r e s i d u e s o r l o n g e r  i n a n a t i v e p r o t e i n w i t h <Pg> and  i n tetrapeptides with  8-Sheet B o u n d a r i e s .  Pro a r e u n f a v o r a b l e  <  a n d s h o u l d n o t be i n c o r p o r a t e d  > 1.  5.  w i t h <P > ot  G l u and P r o a r e  >  1.05 a n d  s a t i s f y i n g c o n d i t i o n s B . l through  8-sheet.  33  < p  g  >  >  <  ^  > a  '  B.5 i s p r e d i c t e d as  C.  Overlapping  a-  and  g-RegTons  I n most cases,, u t i l i z a t i o n described  of the set of rules  a b o v e . was._ a d e q u a t e t o l o c a t e t h e s e c o n d a r y  structures of proteins.  However t h e r e w e r e r e g i o n s i n  p r o t e i n s c o n t a i n i n g b o t h a- a n d 3 r e s i d u e s w h e r e arose,  ambiguities  so t h a t a d d i t i o n a l measures were r e q u i r e d t o r e s o l v e  the dilemma.  Chou a n d Fasman  ( 1 9 7 8 a , 1978b) f o l l o w e d t h e  procedure d e s c r i b e d below t o determine whether the overl a p p i n g r e g i o n was p r e d o m i n a t e l y 1.  a  <Pg>  >  >  residues  a- a n d g - a s s i g n m e n t s .  
with  (^h^ib^  since there (b ) a  the region  i t i s g-sheet.  of the overlapping the  g.  C a l c u l a t e t h e <Pa> and <Pg> f o r t h e o v e r -  l a p p i n g r e g i o n ; i f <P > if  a or  The a-  i s helical,  and g - p o t e n t i a l  c a n a l s o be c o m p a r e d b y  grouping  Thus a r e g i o n o f s i x r e s i d u e s  and ( H h ^ i B ) ^  assignments should  a r e two s t r o n g a - f o r m e r s  c o m p a r e d t o one s t r o n g  g-former  be h e l i c a l ,  ( H ) a n d one a - b r e a k e r a  (Hg) a n d one s t r o n g  g-breaker (Hg). 2. helix  Use  Tables  2 a n d 3 on t h e f r e q u e n c y  of  and a - s h e e t b o u n d a r i e s t o d e l i n e a t e w h e t h e r t h e r e g i o n  is a or  g.  34  3.  than 3-sheets, a  Since h e l i c e s are longer  l o n g segment c o n t a i n i n g b o t h a- a n d B - p o t e n t i a l i s p r e d i c t e d as h e l i c a l  i f <P >  >  <  P  > C  >  even though  may  be a s m a l l e r f r a g m e n t , t h a t i s , f i v e r e s i d u e s  the  segment whose P g <  >  >  ^o^-  long h e l i x  region  within  Hence, i n t h e example  g i v e n above f o r c a r b o x y p e p t i d a s e , one  there  173-186 i s p r e d i c t e d as 173-178 and a 3-  instead o f a short h e l i x  179-183. R e g i o n s w i t h b o t h a- a n d 3 - p o t e n t i a l  4.  to a p r e d i c t e d 3-turn  adjacent  (see b e l o w ) a r e p r e d i c t e d t o be 3  s h e e t a s l o n g as t h e r e  _  3 - f o r m e r s on e a c h  are a t least three  s i d e o f t h e 3 - t u r n ; t h a t i s , t h e minimum 3 l e n g t h i s r e d u c e d from f i v e  t o t h r e e , w i t h t h e m i d d l e two r e s i d u e s  3-turn counting  as c o i l  residues.  
For example, regions 105-110 and 115-124 in ribonuclease have both α- and β-potential. However, the high probability of a β-turn at 113-115 easily allows the assignment of 105-110 (<P_β> = 1.31 > <P_α> = 1.10) and 115-124 (<P_β> = 1.13 > <P_α> = 1.02) as β-sheets rather than α-helices.

Rule 5. Any segment containing overlapping α- and β-residues is resolved through conformational boundary analysis (C.2) with <P_α> > <P_β> for the predicted α-region (C.1). β-Formers may be incorporated into a long helix if they are not helical tetrapeptide breakers (C.3). Helix propagation may be terminated by α residues if these same residues favor the formation of antiparallel β-sheets.

In summary, according to Chou and Fasman (1978b), there are only three basic rules for predicting protein secondary structure. While the α and β search conditions elaborated above seem to be quite extensive, they are given so that incorrect predictions will be minimized.

D. Search for β-Turns

At present, 408 β-turns from 29 proteins have been elucidated, and the frequency of occurrence of the 20 amino acids in those 408 turns, at positions i to i+3, as well as their P_t values (P_t = f_t/<f_t>) are given in Table 4 (Chou and Fasman, 1977, 1979). The probability of β-turn occurrence at residue i is computed from

$$p_t = (f_i)(f_{i+1})(f_{i+2})(f_{i+3})$$

with the aid of Table 4. The average probability of β-turn occurrence is <p_t> = 0.55 x 10^-4. Two cut-off values were selected: p_t = 1.0 x 10^-4 (a value approximately double that of the average) and p_t = 0.75 x 10^-4 (a value that is 1 1/2 times that of the average). According to Chou and Fasman (1979), the lower cut-off value predicts more bend residues correctly while the higher cut-off value predicts more non-bend residues correctly. However, it appears that the predictive accuracy is similar for the two values. The cut-off value of 0.75 x 10^-4 has been used by Chou and Fasman (1978b, 1979) in their search for β-turns in 29 proteins.

Table 4. Frequency Hierarchies of Amino Acids in the β-Turns of 29 Proteins.

i            i+1          i+2          i+3          P_t          P_t2
Asn  0.161   Pro  0.301   Asn  0.191   Trp  0.167   Asn  1.56    Pro  2.04
Cys  0.149   Ser  0.139   Gly  0.190   Gly  0.152   Gly  1.56    Gly  1.63
Asp  0.147   Lys  0.115   Asp  0.179   Cys  0.128   Pro  1.52    Asp  1.61
His  0.140   Asp  0.110   Ser  0.125   Tyr  0.125   Asp  1.46    Asp  1.56
Ser  0.120   Thr  0.108   Cys  0.117   Ser  0.106   Ser  1.43    Ser  1.52
Pro  0.102   Arg  0.106   Tyr  0.114   Gln  0.098   Cys  1.19    Lys  1.13
Gly  0.102   Gln  0.098   Arg  0.099   Lys  0.095   Tyr  1.14    Tyr  1.08
Thr  0.086   Gly  0.085   His  0.093   Asn  0.091   Lys  1.01    Arg  1.05
Tyr  0.082   Asn  0.083   Glu  0.077   Arg  0.085   Gln  0.98    Thr  0.98
Trp  0.077   Met  0.082   Lys  0.072   Asp  0.081   Thr  0.96    Cys  0.92
Gln  0.074   Ala  0.076   Thr  0.065   Thr  0.079   Trp  0.96    Gln  0.84
Arg  0.070   Tyr  0.065   Phe  0.065   Leu  0.070   Arg  0.95    Glu  0.80
Met  0.068   Glu  0.060   Trp  0.064   Pro  0.068   His  0.95    His  0.77
Val  0.062   Cys  0.053   Gln  0.037   Phe  0.065   Glu  0.74    Ala  0.64
Leu  0.061   Val  0.048   Leu  0.036   Glu  0.064   Ala  0.66    Phe  0.62
Ala  0.060   His  0.047   Ala  0.035   Ala  0.058   Met  0.60    Met  0.51
Phe  0.059   Phe  0.041   Pro  0.034   Ile  0.056   Phe  0.60    Trp  0.48
Glu  0.056   Ile  0.034   Val  0.028   Met  0.055   Leu  0.59    Val  0.43
Lys  0.055   Leu  0.025   Met  0.014   His  0.054   Val  0.50    Leu  0.36
Ile  0.043   Trp  0.013   Ile  0.013   Val  0.053   Ile  0.47    Ile  0.29

i, i+1, i+2, and i+3 represent the frequencies of the first, second, third, and fourth residues, respectively, in a reverse β-turn. P_t is the conformational potential of a residue in a β-turn based on all four positions of a reverse turn. P_t2 is the conformational potential of a residue in a β-turn based on the second and third positions of a reverse turn. This frequency table was based on 408 β-turns in 29 proteins.

Rule 4. Tetrapeptides with p_t > 0.75 x 10^-4 as well as <P_t> > 1.00 and <P_α> < <P_t> > <P_β> are selected as probable bends. Adjacent probable bends (i.e., 11-14, 12-15, 13-16) are compared pairwise, and the tetrapeptide with the highest p_t value is predicted as a β-turn.
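The test stated in Rule 4 can be illustrated with a short sketch. It is written in Python purely for readability (the thesis program itself is in Fortran), and the function names are hypothetical. The positional frequencies and P_t values in the example are read from Table 4; the P_α and P_β values are placeholders chosen for illustration only, since Tables 1 and 2 are not reproduced in this section.

```python
CUTOFF = 0.75e-4  # cut-off on p_t quoted in the text


def turn_probability(f_i, f_i1, f_i2, f_i3):
    """p_t of one tetrapeptide: product of the four positional frequencies."""
    return f_i * f_i1 * f_i2 * f_i3


def mean(values):
    """Arithmetic mean, used for <P_t>, <P_alpha> and <P_beta>."""
    return sum(values) / len(values)


def is_probable_bend(freqs, p_turn, p_alpha, p_beta, cutoff=CUTOFF):
    """Rule 4 test for a single tetrapeptide.

    freqs   : frequencies of the four residues at positions i .. i+3
    p_turn  : P_t of the four residues
    p_alpha : P_alpha of the four residues
    p_beta  : P_beta of the four residues
    """
    p_t = turn_probability(*freqs)
    avg_t, avg_a, avg_b = mean(p_turn), mean(p_alpha), mean(p_beta)
    # p_t above the cut-off, <P_t> > 1.00, and <P_alpha> < <P_t> > <P_beta>
    return p_t > cutoff and avg_t > 1.00 and avg_a < avg_t and avg_b < avg_t


# Tetrapeptide Asn-Pro-Gly-Trp, values taken from Table 4.
freqs = (0.161, 0.301, 0.190, 0.167)
p_turn = (1.56, 1.52, 1.56, 0.96)
# Placeholder values, NOT taken from the thesis tables.
p_alpha = (0.70, 0.60, 0.60, 1.10)
p_beta = (0.90, 0.60, 0.80, 1.40)

print(turn_probability(*freqs))
print(is_probable_bend(freqs, p_turn, p_alpha, p_beta))
```

The sketch treats one tetrapeptide at a time; the pairwise comparison of adjacent probable bends is a separate step.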
E. Evaluation of the Predictive Accuracy

To evaluate the success of any predictive scheme it is necessary to compare the predicted conformational state for each residue of a protein with the observed assignment based on X-ray diffraction. The percentage of the n_k residues correctly predicted in the conformational state k is given by:

$$\%_k = \frac{100\,(n_k - n_m)}{n_k} \tag{1}$$

where k represents the α-, β- or coil regions in the native protein structure as determined by X-ray crystallography and n_m is the number of incorrectly predicted residues in the state k. The percentage of overprediction is given by the criterion:

$$\%_{nk} = \frac{100\,(n_{nk} - n_o)}{n_{nk}} \tag{2}$$

where %_nk represents the percentage of correctly predicted residues not in the conformational state k, n_nk = N - n_k, and n_o is the number of k residues overpredicted. Hence the quality of prediction for a given type of conformational state can be expressed as the mean of %_k (eq. 1) and %_nk (eq. 2):

$$Q_k = \frac{\%_k + \%_{nk}}{2}$$

A value of 100% for %_k, %_nk, and Q_k indicates total agreement between observation and prediction, while 0% indicates total disagreement (Chou and Fasman, 1978a, 1978b).

Recently, Matthews (1975) introduced a correlation coefficient that indicates how much better a given prediction is than a random one:

$$C_\alpha = \frac{\left[(n_\alpha - \alpha_m)/N\right] - \left[(n_\alpha - \alpha_m + \alpha_o)/N\right](n_\alpha/N)}{\left\{\left[(n_\alpha - \alpha_m + \alpha_o)/N\right]\left[1 - (n_\alpha - \alpha_m + \alpha_o)/N\right](n_\alpha/N)\left[1 - (n_\alpha/N)\right]\right\}^{1/2}} \tag{3}$$

where α_m and α_o are the numbers of helical residues missed and overpredicted, respectively (n_m and n_o of equations 1 and 2 for k = α). The correlation coefficient for β-sheet and β-turn may be obtained by substituting β and t, respectively, for α in equation 3. A correlation of C = 1 indicates perfect agreement between prediction and observation, C = 0 indicates that a prediction is no better than random, and C = -1 indicates total disagreement or 0% accuracy. If C_α ≥ 0.6, the predicted structure is near that of the observed structure, with no helical regions generally missed but with N- and C-terminal points generally off by a few residues. If C_α ≥ 0.4, one or two helical regions might be missed or overpredicted; however, the prediction would still be quite useful. Similar statements can be made regarding sheet and turn (Argos et al., 1976).

Amino Acid Sequence of Proteins

The amino acid sequence of the various proteins used for testing our program comes from the Atlas of Protein Sequence and Structure (Dayhoff, 1972, 1973, 1976, 1978).

Programming

The program was written in Fortran language and tested at the UBC computing centre. The amino acid sequence of each protein was converted into a sequence of integers. The 20 amino acid residues were sorted alphabetically and each of them assigned a fixed number between 1 and 20. For instance: 1 -> Ala, 2 -> Arg, 19 -> Tyr, and 20 -> Val.
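The integer coding described above can be sketched as follows. This is an illustration in Python, not the Fortran input routine of the thesis. Only the four codes quoted in the text (Ala, Arg, Tyr, Val) are confirmed; the remaining codes are an assumption, obtained here by sorting the three-letter abbreviations alphabetically, and the names `CODE` and `encode` are hypothetical.

```python
# The 20 residues, identified by their three-letter abbreviations.
RESIDUES = ["Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
            "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val"]

# Alphabetical rank, starting at 1 (assumed ordering, see the note above).
CODE = {name: rank for rank, name in enumerate(sorted(RESIDUES), start=1)}


def encode(sequence):
    """Convert a list of three-letter residue names into integer codes."""
    return [CODE[name] for name in sequence]


print(encode(["Val", "Leu", "Ser", "Pro", "Ala"]))
```

With this ordering the example call yields the series 20, 11, 16, 15, 1, which is the kind of integer sequence the program expects as input.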
Hence, in order to use the program, one must convert the protein sequence into a corresponding series of integers. All the necessary details concerning the use of the program are given in the appendix.

RESULTS AND DISCUSSION

Programming of the method

Following the rules outlined by Chou and Fasman (1978a, 1978b), four different programs were written to predict α-helix, β-sheet, and β-turn, and to solve the overlapping areas between α-helix and β-sheet.

Each program consisted of the main program and several subroutines. In every case, the purpose of the main program was to read in the sequence of the protein under consideration, and then to assign to each amino acid residue the corresponding values of the conformational parameters (P_α, P_β, P_t) and the boundary conformational parameters (P_αN, P_αC, P_nαN, P_nαC, P_βN, P_βC, P_nβN, P_nβC). The subroutines were then called on to search for α-helix, β-sheet, or β-turn regions, and to solve overlapping areas.

A. Scheme for the Search of α-Helix and β-Sheet

In the case of α-helix and β-sheet prediction, once the whole sequence has been recorded, the first subroutine is called to detect the areas in the sequence with helix or β-sheet potential according to rule 1 or rule 2, respectively. Then, within the limits of those potential areas, the rules for nucleation, propagation and termination are applied to locate the different helix/β-sheet sections more accurately. Those rules were elaborated in the second and third subroutines. The various important factors such as strong helix or β-sheet breakers, and helix or β-sheet boundaries were also taken into account.

    Main Program
      - Read protein sequence
      - Assign P_α, P_β to each residue
          |
    Subroutine 1
      Rule 1 / Rule 2
      Search for potential helix or β-sheet areas
          |
    Subroutine 2
      Helix/β-sheet nucleation within those potential areas
          |
    Subroutine 3
      Helix/β-sheet propagation and termination
          |
    Print out helix/β-sheet sections

B. Scheme for the β-Turn Search

For the β-turn search, only one subroutine was needed to locate the different turns according to rule 3 and to compare the adjacent predicted turns so as to consider only the one with the highest probability of occurrence (p_t).

    Main Program
      - Read in sequence
      - Assignment of P_t and frequency of occurrence
          |
    Subroutine 1
      - Rule 3
      - Then comparison of adjacent turns
          |
    Print out position of turns
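The comparison of adjacent predicted turns in the scheme above can be sketched as follows. This is a hedged Python illustration rather than the thesis subroutine: it assumes that "adjacent" means tetrapeptides whose starting positions are consecutive (as in the example 11-14, 12-15, 13-16 given with the turn rule), and the function name and the p_t values in the example are hypothetical.

```python
def resolve_adjacent_turns(candidates):
    """Keep, in each run of adjacent candidates, the one with the highest p_t.

    candidates: list of (start, p_t) pairs, one per probable bend.
    Returns a list of (start, end, p_t) for the retained tetrapeptides.
    """
    selected = []
    run = []  # current run of candidates with consecutive start positions
    for start, p_t in sorted(candidates):
        if run and start - run[-1][0] > 1:
            # Gap found: close the current run and keep its best member.
            selected.append(max(run, key=lambda c: c[1]))
            run = []
        run.append((start, p_t))
    if run:
        selected.append(max(run, key=lambda c: c[1]))
    return [(start, start + 3, p_t) for start, p_t in selected]


# Illustrative p_t values only.
bends = [(11, 1.2e-4), (12, 2.5e-4), (13, 0.9e-4), (40, 1.1e-4)]
print(resolve_adjacent_turns(bends))
```

In the example, the three overlapping candidates starting at 11, 12 and 13 collapse to the single turn 12-15, while the isolated candidate 40-43 is kept as it is.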
C. Scheme for Solving Overlapping α- and β-Areas

In this case, the purpose of the main program was to record the whole sequence of each protein as well as the consecutive pairs of overlapping areas, which were formatted in the following manner:

    H1  H2  S1  S2  H3  H4  S3  S4  ...

    H1 - H2 : boundary values of the helical fragment
    S1 - S2 : boundary values of the β-sheet fragment

The first subroutine carried out the comparison of the average P_α and P_β of each fragment itself and that of the overlapping area. In case the β-sheet was contained within the α-helix, the overlapping area was the β-sheet itself. The results obtained at this step could already suggest whether the entire fragment (β-sheet/α-helix) and the overlapping area had a higher propensity to exist in one of the conformations than the other (i.e., in helical state if <P_α> > <P_β>, or in β-sheet conformation if <P_α> < <P_β>).

In the second subroutine, instead of assigning to each amino acid residue in the fragments under consideration the alphabetic representation of H_α, H_β, h_α, h_β, I_α, I_β, ..., B_α, B_β, the alphabetic representations were converted to numerical ones (i.e., H_α, H_β -> +2.00; h_α, h_β -> +1.00; I_α, I_β -> +0.50; i_α, i_β -> +0.25; b_α, b_β -> -0.50; B_α, B_β -> -1.00). Hence, instead of comparing sets of characters (H_α h_α I_α i_α b_α B_α) as did Chou and Fasman (1978a, 1978b), numerical values were used to represent the conformational potential of the regions under consideration. This "character analysis" was performed on the α-helix and β-sheet fragments, as well as on the overlapping area.

The "boundary analysis" was carried out in the third subroutine. This consisted of summing up the boundary conformational parameters of the three residues belonging to the fragment and those of the three residues adjacent to the fragment ends, using the values from Tables 2 and 3, e.g.

    P_αN(H1)    + P_αN(H1+1)  + P_αN(H1+2)
  + P_αC(H2)    + P_αC(H2-1)  + P_αC(H2-2)
  + P_nαN(H1-1) + P_nαN(H1-2) + P_nαN(H1-3)
  + P_nαC(H2+1) + P_nαC(H2+2) + P_nαC(H2+3)

Similar procedures were applied to the β-sheet fragment. The "boundary analysis" thus took into consideration the influence of the neighbouring residues at the boundaries. Hence, if a fragment has very high potential for the helical state, the neighbouring residues at the boundaries may participate in stabilization of the helix if they are favorable to its presence.
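The first two steps of this analysis can be sketched as follows. The sketch is in Python for illustration only and is not the thesis code. The numerical weights are those quoted in the text; summing them over a fragment is an assumption about how the "character analysis" score was formed, and the function names are hypothetical. The example strings encode the six-residue assignments (H2h2ib)_α and (Hh3iB)_β.

```python
# Numerical equivalents of the alphabetic assignments, as quoted in the text.
SCORE = {"H": 2.00, "h": 1.00, "I": 0.50, "i": 0.25, "b": -0.50, "B": -1.00}


def favoured_state(p_alpha, p_beta):
    """Step 1: compare <P_alpha> and <P_beta> over a fragment or overlap."""
    mean_a = sum(p_alpha) / len(p_alpha)
    mean_b = sum(p_beta) / len(p_beta)
    if mean_a > mean_b:
        return "helix"
    if mean_b > mean_a:
        return "sheet"
    return "undecided"


def character_score(assignments):
    """Step 2: sum of the numerical equivalents (summation is an assumption)."""
    return sum(SCORE[symbol] for symbol in assignments)


# (H2h2ib)_alpha versus (Hh3iB)_beta for the same six residues.
alpha_score = character_score("HHhhib")
beta_score = character_score("HhhhiB")
print(alpha_score, beta_score)
```

For this example the α score exceeds the β score, in line with the helical assignment reached by grouping the characters.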
    Main Program
      - Read in sequence
      - Read in different overlapping areas
      - Assignment of the conformational parameters
          |
    Subroutine 1
      Comparison of <P_α>, <P_β>
      - In each fragment itself
      - In the overlapping area
          |
    Subroutine 2
      Grouping of α- and β-assignments (H_α, H_β, ..., B_α, B_β)
      - In each fragment itself
      - In the overlapping area
          |
    Subroutine 3
      Boundary analysis (P_αN, P_αC, ..., P_nβN, P_nβC)
      - For each fragment itself

Efficiency of the α-helix prediction

When the rules established by Chou and Fasman (1978a, 1978b) were strictly followed, several areas were missed in the present prediction and the boundaries of the predicted regions were quite different from those of Chou and Fasman or from X-ray analysis (Table 5, p. 189).

However, when the results obtained for the various proteins were analyzed, the difference between the boundary values of the present study and those of Chou and Fasman or from X-ray analysis could be reduced by taking into account factors such as: a) the boundary conformational parameters (P_αN, P_αC, P_nαN, P_nαC) and b) the β-turn or β-sheet potential in the vicinity of the helical boundaries.
In other  going through the e n t i r e procedure  of h e l i c a l  i f a r e g i o n d e l i n e a t e d b y two v a l u e s J l a n d J 2  (Jl:  N - t e r m i n a l p f t h e p r e d i c t e d r e g i o n and J 2 : C - t e r m i n a l o f t h e same r e g i o n ) was p r e d i c t e d Jl  and P £ o f J 2 w o u l d a  as a - h e l i x t h e n t h e v a l u e s P ^ o f  be c o m p a r e d t o t h o s e o f t h e i r n e i g h -  b o u r i n g r e s i d u e s so t h a t t h e new b o u n d a r i e s J l +_ n , J 2 +_ n' (n and n ' : i n t e g e r s ) w o u l d for of  helix  have t h e most f a v o r a b l e P ^ and  stabilization.  the n o n h e l i c a l  The p a r a m e t e r s  P  n a N  a n d P -.  residues adjacent to the h e l i c a l  47  na(  bound-  a r i e s . were also'  important' i n t h i s  •When c o n s i d e r i n g t h e p o s s i b i l i t y lix  boundaries  t a t e d by RMJ1,  of B-turn presence  a t t h e he-  o r t h e o v e r l a p p i n g o f t h e end r e s i d u e s w i t h  a fragment p o s s e s s i n g h i g h necessary  "move o f t h e b o u n d a r i e s " .  B-sheet p o t e n t i a l ,  i t may a l s o be  t o move J l a n d J 2 t o new p o s i t i o n s w h i c h a r e d i c P - respectively  ( c f . s u b r o u t i n e s M0J1, M0J2,  a(  a n d RMJ2 f o r a - h e l i x p r e d i c t i o n ) . The  f o l l o w i n g a r e some e x a m p l e s o f a - h e l i x bound-  a r i e s adjustment to i l l u s t r a t e  the concept  o f "move o f bound-  aries". (1)  J l= J l - 2 a. q - H e m o g l o b i n :  Ala  Asp  I  !  Asp  8-17  Lys-  Thr .  I  Asn  I  8  *  1  (6). has t h e second h i g h e s t  ( 7 ) , and 1 I  a  area  !  10  v a l u e . As  t h e r e i s n e i t h e r B-turn n o r B-sheet p o t e n t i a l  Lys  i n this  region,  8-17 g a i n s  1 h^,  , A s p ( 6 ) . T h e r e i s no n e e d t o move  ther to the N-terminal  Val"  !  
6  by m o v i n g b a c k t o A s p ( 6 ) , t h e h e l i c a l  '  fur-  because t h e program has a l r e a d y  d i c t e d t h e a r e a 1-8 as h e l i c a l .  48  pre-  b\ M y o g l o b i n :  Asp __!  24-36  Val  Ala  Gly  His  Gly  Glu  !  t  T  t  I  t  20  22  24  Ala The  (22) has  a higher  i n c o r p o r a t i o n of a breaker,  that of A l a , a strong h e l i x ment h e l p s  (2)  former.  value than His (23)  i s balanced  (24). by  This boundary a d j u s t -  t o l i n k t h e h e l i c a l r e g i o n s 13-22  and  24-36.  Glu  J l = J l + 3 a. C a r b o x y p e p t i d a s e :  Lys  Tyr  Ala  i  !  !  169  (172)  presence as t h e  Asn  Ser  Glu  Val  I  f  !  171  ! _  173  175  t  b o u n d a r y a d j u s t m e n t i s j u s t i f i e d by  strong potential the very h i g h P  170-182  !  I  The  Ser  Gly  i _  26  I  t  Asp  8-turn a N  of Glu  the  o f t h e t e t r a p e p t i d e 169-172 and (173). Furthermore,  have the h i g h e s t P  n a I  i s f a v o r a b l e to Glu  N-boundary.  49  Asn  q v a l u e s , hence  (173)  i f this  (171)  by and  their  r e s i d u e i s chosen  i  b. P a p a i n :  47-60  Leu  Asn  Gin  45  Tyr  Ser  47  Glu  49  I  I  By m o v i n g t h e N - b o u n d a r y t o p o s i t i o n  50  (Glu),  a d v a n t a g e i s t a k e n o f t h e good P , o f G l u and a t t h e same T  t i m e t h e h e l i x b r e a k e r T y r (48) and t h e h e l i x : ' i n d i f f e r e n t Ser  (49) a r e removed. S e r (48) h a s h i g h p r o p e n s i t y t o be  found a t the n a n h e l i c a l  (3)  N-boundary.  J2 = J2 + 2 a. L y s o z y m e :  105-112  Val  Ala  Trp  Arg  Asn  Arg  Cys  Lys  I  t  I  I  I  I  I  |  110  112  114  i  1 formers  to  t h e i n c o r p o r a t i o n o f an e x t r a h e l i x b r e a k e r , A s n  ( 1 1 3 ) . 
And A r g (114) has good P ^ v a l u e w h i c h c o n s i d e r a t i o n as t h e new Lys  |__  116  The r e g i o n 105-112 has enough h e l i x balance  Gly  justifies i t s  C - b o u n d a r y . The r e s i d u e s Cys  (116) and G l y (117) a r e f a v o r a b l e t o t h e new  of J2.  50  (115),  position  b.Carboxypeptidase:  297-303  Met  Glu  His  Thr  Val  Asn  l  !  !  t  t  t  301  303  305  I  1  Despite the lower ?  a C  Asn t_  307  o f V a l (305) c o m p a r e d t o  t h a t o f H i s (303) , t h e move o f J 2 t o J 2 + 2 a l l o w s t h e a d d i t i o n o f an e x t r a i In f a c t  P  a c  ( T h r ) and H  (Val) to the r e g i o n .  o f V a l i s h i g h e r than the average  t h e two r e s i d u e s A s n ( 3 0 6 , 307) a r e l i s t e d P  v a l u e and  second  for their  naC'  (4)  J 2 = J 2 - 5: a .Papain: Asp  47-60  Cys  Asp  Arg  56  58 t  helical  a C  o f Asp  o d  P  Tyr  Gly 62  i  57-60 has 8 - t u r n p o t e n t i a l  (57) i s l o w e r t h a n t h a t o f Cys  c o n f o r m a t i o n a l parameter  t h a n t h a t o f Cys ( i ) . S°  Ser 60  :  The t e t r a p e p t i d e although P  Arg  o f Asp  51  ( 5 6 ) , the  ( I ) i s higher a  The r e s i d u e s A r g ( 5 8 , 59)  naC-  and  exhibit  b. g - C h y m o t r y p s i n : Ala T  172-186  Gly  Ala  Ser  Gly  Val  1  I  I  I  I  184  186  Ser J__  188  I  t The  t e t r a p e p t i d e 185-188 e x h i b i t s  B-turn p o t e n t i a l  and by m o v i n g J 2 t o J 2 - 3 , t h e r e a r e two a d v a n t a g e s . t h e b r e a k e r , G l y ( 1 8 4 ) , i s a v o i d e d and ary w i t h r e l a t i v e l y The  boundary  a d j u s t m e n t p r o c e d u r e s were e l a b o r a t e d RMJ1  s i m p l e and may  the boundary  bound-  A  and J 2 , r e s p e c t i v e l y .  not always  s e c o n d , a new  good P Q > A l a ( 1 8 3 ) , i s o b t a i n e d .  
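The core of the "move of boundaries" idea, comparing the boundary parameter of J1 (or J2) with those of its neighbours, can be sketched as follows. This is a deliberately simplified Python illustration and an assumption about the search, not a transcription of the subroutines M0J1, M0J2, RMJ1 and RMJ2: it ignores the β-turn and β-sheet checks and the non-helical parameters discussed in the examples above. The function name and the score values are hypothetical.

```python
def best_boundary(scores, j, n):
    """Return the position within j-n .. j+n with the highest boundary score.

    scores : list where scores[p - 1] is the boundary parameter of residue p
             (P_alphaN when adjusting J1, P_alphaC when adjusting J2)
    j      : current boundary position (1-based)
    n      : how many residues to look on either side
    Ties are resolved in favour of the position closest to j.
    """
    lo = max(1, j - n)
    hi = min(len(scores), j + n)
    return max(range(lo, hi + 1), key=lambda p: (scores[p - 1], -abs(p - j)))


# Illustrative scores only; real values would come from Table 2.
scores = [0.9, 1.2, 0.8, 1.5, 1.0, 0.7]
print(best_boundary(scores, 3, 2))
```

A complete implementation would accept the new position only when the residues gained or lost keep the helix within the former and breaker limits.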
The boundary adjustment procedures were elaborated in the subroutines M0J1, RMJ1 and M0J2, RMJ2 for the boundary residues J1 and J2, respectively. Although such considerations are simple and may not always give expected results because the conditions vary from fragment to fragment as well as from protein to protein, they still help to save part of the analysis of overlapping areas between helix and β-sheet and to avoid conflicts between β-turns and helices.

A certain number of segments were missed in the prediction by the present program due to one of the following reasons:

It was ruled that the nucleation segment should contain fewer than one third helix breakers (B_α or b_α). This condition strictly eliminated some segments which have two breakers out of six residues although they met the requirement of having at least two thirds h's (e.g. α-hemoglobin 23-28 and papain 120-126).

The nucleation segment did not have at least two thirds h's (e.g. carboxypeptidase 116-121 and lysozyme 80-85), although by incorporating an extra residue (carboxypeptidase 116-122), or by shifting the whole fragment by one position to the left (J1-1, J2-1; lysozyme 79-84), the new segment would respect the two thirds rule.

The position of the residue Pro at the N-terminal end is also critical. For example, in concanavalin A 83-88, although this fragment had the right number of h's, the presence of Pro as the fourth residue (position 86) impeded the fragment from being taken into account. In this case it was not possible to shift the fragment to position 84-89 because from position 89 the area had higher β-sheet than α-helix potential. Therefore we had attempted a preliminary consideration of the fragment 83-88 as a possible helix and then tried to shift it to avoid having Pro as the fourth residue (e.g. concanavalin A 80-85).

For Russell's Viper Venom 47-55, as there were more I_α, i_α and b_α than H_α and h_α in this part of the polypeptide chain, the nucleation search skipped the entire area because none of the combinations of six consecutive amino acid residues had at least two thirds h's. Nevertheless, the "one half h's" requirement was met by the entire fragment 47-55 (5 h's out of 9 amino acid residues). Hence this requirement could not be applied strictly in all cases, and it may be modified to such an extent that the final predicted segment will not be considered as having deviated from the normal criteria.

The gathering of two or three Pro residues in the same area was another possible reason for missing a fragment (e.g. bovine colostrum inhibitor 5-10). The fragment 1-10 of this protein contained three Pro at positions 4, 5, and 11, hence the program could only detect the segment 9-14 as most suitable to avoid having Pro in the inner helix. However, if the residue Pro 5 was considered acceptable at the N-terminal end (good P_αN, although not at the third position), then the fragment 5-10 would be a potential α-helix since it met the requirement for two thirds h's.

In summary, in order to obtain results which are closer to those of Chou and Fasman or to X-ray data, the nucleation rule was slightly modified (i.e., under specific circumstances the nucleation segment may have one third breakers, which then must be compensated by the addition of h's during the α-helix propagation, and the presence of Pro at the first 2 positions of the N-terminal end instead of at the third one does not always constitute an obstacle to α-helix formation). In addition, before asserting a segment of the polypeptide chain as α-helix, the search for its most suitable boundaries was carried out. This extra analysis was developed in two extra subroutines, the first one dealing with the N-terminal residue (cf. subroutines M0J1, RMJ1) and the second one with the C-terminal residue (cf. subroutines M0J2, RMJ2).

The following program, for the α-helix search, was eventually adopted.

C
C     MAIN PROGRAM OF HELIX PREDICTION
C
C     PURPOSE
C        TO READ IN THE SEQUENCE OF THE PROTEIN AND TO ASSIGN TO EACH
C        AMINO ACID RESIDUE ITS CONFORMATIONAL PARAMETERS (PA,PB,PT)
C        AND ITS BOUNDARY CONFORMATIONAL PARAMETERS (PAN,PAC,PNAN,PNAC,
C        PBN,PBC,PNBN,PBNC)
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,W,V3,V4,V5,V6,V7,V8,O
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,20),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,F,H,U,D,W,M,M1,M2,M3,M4,M5,M6,
     1L,I,K,L1,L2,NZ,NY,JA,JB,JC,JD,J1,J2,KM,N1,N2,NN,J,G,K3,V1,V2,V3,V4
     2,V5,V6,V7,O,HELLO,BYE,BALL,MOVE
C
C
C     DESCRIPTION OF PARAMETERS
C        S  - ARRAY RECORDING THE DIFFERENT CONFORMATIONAL PARAMETERS
C             FOR EACH AMINO ACID RESIDUE (K)
C             S(K,1) - PA
C             S(K,2) - PB
C             S(K,5) - PT
C             S(K,6) - PNAN
C             S(K,7) - PNAC
C             S(K,8) - PAN
C             S(K,9) - PAC
C        T1 - SUM OF PA OF N AMINO ACID RESIDUES
C        T2 - SUM OF PB OF N AMINO ACID RESIDUES
C        T5 - SUM OF PT OF N AMINO ACID RESIDUES
C        T3 - SUM OF THE ASSIGNMENTS AS FORMER,BREAKER,INDIFFERENT IN
C             THE NUCLEATION FRAGMENT
C        T4 - SUM OF THE ASSIGNMENTS AS FORMER,BREAKER,INDIFFERENT IN
C             THE ENTIRE PREDICTED HELICAL AREA
C        TT - ALLOWED NUMBER OF BREAKERS IN THE ENTIRE PREDICTED AREA
C             (EQUAL TO ONE THIRD OF THE LENGTH)
C        P  - FREQUENCY OF THE RESIDUES IN A REVERSE B-TURN
C             P(K,1) - FREQUENCY OF THE FIRST RESIDUE
C             P(K,2) - FREQUENCY OF THE SECOND RESIDUE
C             P(K,3) - FREQUENCY OF THE THIRD RESIDUE
C             P(K,4) - FREQUENCY OF THE FOURTH RESIDUE
C        M  - ARRAY RECORDING THE NUMERICAL ASSIGNMENT OF EACH AMINO
C             ACID RESIDUE
C        NN - TOTAL NUMBER OF RESIDUES OF THE PROTEIN
C        N  - NUMBER OF LINES USED TO ENTER THE WHOLE SEQUENCE (16
C             RESIDUES PER LINE)
C        D  - ARRAY RECORDING THE POSITION OF EACH AMINO ACID RESIDUE
C             ON THE NTH LINE
C             D(K,L) - AMINO ACID RESIDUE AT POSITION K ON LINE L
C
C     REMARK
C        SOME OF THE PARAMETERS WILL BE DESCRIBED IN THE SUBSEQUENT
C        SUBROUTINES SINCE THEIR DEFINITION MAY CHANGE FROM ONE
C        SUBROUTINE TO ANOTHER
C
      PRINT 100
  100 FORMAT('1',35X,'*********************************')
      PRINT 102
  102 FORMAT(' ',35X,'*',31X,'*')
      PRINT 103
  103 FORMAT(' ',35X,'*',4X,'ALPHA-HELIX PREDICTION',5X,'*')
      PRINT 102
      PRINT 104
  104 FORMAT(' ',35X,'*********************************'//)
      READ (5,106) NN,N
  106 FORMAT(6X,I4,6X,I4)
      WRITE (6,107) NN
  107 FORMAT('0','TOTAL NUMBER OF AA:',I7)
      WRITE (6,108) N
  108 FORMAT(' ','NUMBER OF DATA LINES: ',I5,/)
      PRINT 109
  109 FORMAT('0','PROTEIN SEQUENCE')
      PRINT 110
  110 FORMAT(' ',' '/)
      READ (5,111) ((D(J,K),K=1,16),J=1,N)
  111 FORMAT(16I5)
      WRITE (6,112) ((D(J,K),K=1,16),J=1,N)
  112 FORMAT(' ',16I5)
C
C     TO CHECK THE NUMERICAL ASSIGNMENT OF EACH AMINO ACID RESIDUE IN
C     THE SEQUENCE SO TO ASSIGN ITS CORRESPONDING CONFORMATIONAL
C     PARAMETERS
C
      I=1
      DO 21 J=1,N
      DO 22 K=1,16
      M(I)=D(J,K)
      IF (M(I).EQ.0) GO TO 999
      I=I+1
   22 CONTINUE
   21 CONTINUE
  999 DO 32 K=1,NN
      IF (M(K).EQ.1) GO TO 1
      IF (M(K).EQ.2) GO TO 2
      IF (M(K).EQ.3) GO TO 3
      IF (M(K).EQ.4) GO TO 4
      IF (M(K).EQ.5) GO TO 5
      IF (M(K).EQ.6) GO TO 6
      IF (M(K).EQ.7) GO TO 7
      IF (M(K).EQ.8) GO TO 8
      IF (M(K).EQ.9) GO TO 9
      IF (M(K).EQ.10) GO TO 10
      IF (M(K).EQ.11) GO TO 11
      IF (M(K).EQ.12) GO TO 12
      IF (M(K).EQ.13) GO TO 13
      IF (M(K).EQ.14) GO TO 14
      IF (M(K).EQ.15) GO TO 15
      IF (M(K).EQ.16) GO TO 16
      IF (M(K).EQ.17) GO TO 17
      IF (M(K).EQ.18) GO TO 18
      IF (M(K).EQ.19) GO TO 19
      IF (M(K).EQ.20) GO TO 20
      IF (M(K).EQ.25) GO TO 25
    1 S(K,1)=1.42
      S(K,2)=0.83
      S(K,5)=0.66
      S(K,6)=0.70
      S(K,7)=0.52
      S(K,8)=1.29
      S(K,9)=1.20
      P(K,1)=0.060
      P(K,2)=0.076
      P(K,3)=0.035
      P(K,4)=0.058
      GO TO 32
    2 S(K,1)=0.98
      S(K,2)=0.93
      S(K,5)=0.95
      S(K,6)=0.34
      S(K,7)=1.24
      S(K,8)=0.44
      S(K,9)=1.25
      P(K,1)=0.070
      P(K,2)=0.106
      P(K,3)=0.099
      P(K,4)=0.085
      GO TO 32
    3 S(K,1)=0.67
      S(K,2)=0.89
      S(K,5)=1.56
      S(K,6)=1.42
      S(K,7)=1.64
      S(K,8)=0.81
      S(K,9)=0.59
      P(K,1)=0.161
      P(K,2)=0.083
      P(K,3)=0.191
      P(K,4)=0.091
      GO TO 32
    4 S(K,1)=1.01
      S(K,2)=0.54
      S(K,5)=1.46
      S(K,6)=0.98
      S(K,7)=1.06
      S(K,8)=2.02
      S(K,9)=0.61
      P(K,1)=0.147
      P(K,2)=0.110
      P(K,3)=0.179
      P(K,4)=0.081
      GO TO 32
    5 S(K,1)=0.70
      S(K,2)=1.19
      S(K,5)=1.19
      S(K,6)=0.65
      S(K,7)=0.94
      S(K,8)=0.66
      S(K,9)=1.11
      P(K,1)=0.149
      P(K,2)=0.053
      P(K,3)=0.117
      P(K,4)=0.128
      GO TO 32
    6 S(K,1)=1.11
      S(K,2)=1.10
      S(K,5)=0.98
      S(K,6)=0.75
      S(K,7)=0.70
      S(K,8)=1.22
      S(K,9)=1.22
      P(K,1)=0.074
      P(K,2)=0.098
      P(K,3)=0.037
      P(K,4)=0.098
      GO TO 32
    7 S(K,1)=1.51
      S(K,2)=0.37
      S(K,5)=0.74
      S(K,6)=1.04
      S(K,7)=0.59
      S(K,8)=2.44
      S(K,9)=1.24
      P(K,1)=0.056
      P(K,2)=0.060
      P(K,3)=0.077
      P(K,4)=0.064
      GO TO 32
    8 S(K,1)=0.57
      S(K,2)=0.75
      S(K,5)=1.56
      S(K,6)=1.41
      S(K,7)=1.64
      S(K,8)=0.76
      S(K,9)=0.42
      P(K,1)=0.102
      P(K,2)=0.085
      P(K,3)=0.190
      P(K,4)=0.152
      GO TO 32
    9 S(K,1)=1.00
      S(K,2)=0.87
      S(K,5)=0.95
      S(K,6)=1.22
      S(K,7)=1.86
      S(K,8)=0.73
      S(K,9)=1.77
      P(K,1)=0.140
      P(K,2)=0.047
      P(K,3)=0.093
      P(K,4)=0.054
      GO TO 32
   10 S(K,1)=1.08
      S(K,2)=1.60
      S(K,5)=0.47
      S(K,6)=0.78
      S(K,7)=0.87
      S(K,8)=0.67
      S(K,9)=0.98
      P(K,1)=0.043
      P(K,2)=0.034
      P(K,3)=0.013
      P(K,4)=0.056
      GO TO 32
   11 S(K,1)=1.21
      S(K,2)=1.30
      S(K,5)=0.59
      S(K,6)=0.85
      S(K,7)=0.84
      S(K,8)=0.58
      S(K,9)=1.13
      P(K,1)=0.061
      P(K,2)=0.025
      P(K,3)=0.036
      P(K,4)=0.070
      GO TO 32
   12 S(K,1)=1.16
      S(K,2)=0.74
      S(K,5)=1.01
      S(K,6)=1.01
      S(K,7)=1.49
      S(K,8)=0.66
      S(K,9)=1.83
      P(K,1)=0.055
      P(K,2)=0.115
      P(K,3)=0.072
      P(K,4)=0.095
      GO TO 32
   13 S(K,1)=1.45
      S(K,2)=1.05
      S(K,5)=0.60
      S(K,6)=0.83
      S(K,7)=0.52
      S(K,8)=0.71
      S(K,9)=1.57
      P(K,1)=0.068
      P(K,2)=0.082
      P(K,3)=0.014
      P(K,4)=0.055
      GO TO 32
   14 S(K,1)=1.13
      S(K,2)=1.38
      S(K,5)=0.60
      S(K,6)=0.93
      S(K,7)=1.04
      S(K,8)=0.61
      S(K,9)=1.10
      P(K,1)=0.059
      P(K,2)=0.041
      P(K,3)=0.065
      P(K,4)=0.065
      GO TO 32
   15 S(K,1)=0.57
      S(K,2)=0.55
      S(K,5)=1.52
      S(K,6)=1.10
      S(K,7)=1.58
      S(K,8)=2.01
      S(K,9)=0.00
      P(K,1)=0.102
      P(K,2)=0.301
      P(K,3)=0.034
      P(K,4)=0.068
      GO TO 32
   16 S(K,1)=0.77
      S(K,2)=0.75
      S(K,5)=1.43
      S(K,6)=1.55
      S(K,7)=0.93
      S(K,8)=0.74
      S(K,9)=0.96
      P(K,1)=0.120
      P(K,2)=0.139
      P(K,3)=0.125
      P(K,4)=0.106
      GO TO 32
   17 S(K,1)=0.83
      S(K,2)=1.19
      S(K,5)=0.96
      S(K,6)=1.09
      S(K,7)=0.86
      S(K,8)=1.08
      S(K,9)=0.75
      P(K,1)=0.086
      P(K,2)=0.108
      P(K,3)=0.065
      P(K,4)=0.079
      GO TO 32
   18 S(K,1)=1.08
      S(K,2)=1.37
      S(K,5)=0.96
      S(K,6)=0.62
      S(K,7)=0.16
      S(K,8)=1.47
      S(K,9)=0.40
      P(K,1)=0.077
      P(K,2)=0.013
      P(K,3)=0.064
      P(K,4)=0.167
      GO TO 32
   19 S(K,1)=0.69
      S(K,2)=1.47
      S(K,5)=1.14
      S(K,6)=0.99
      S(K,7)=0.96
      S(K,8)=0.68
      S(K,9)=0.73
      P(K,1)=0.082
      P(K,2)=0.065
      P(K,3)=0.114
      P(K,4)=0.125
      GO TO 32
   20 S(K,1)=1.06
      S(K,2)=1.70
      S(K,5)=0.50
      S(K,6)=0.75
      S(K,7)=0.32
      S(K,8)=0.61
      S(K,9)=1.25
      P(K,1)=0.062
      P(K,2)=0.048
      P(K,3)=0.028
      P(K,4)=0.053
      GO TO 32
   25 S(K,1)=0.00
      S(K,2)=0.00
      S(K,5)=0.00
      S(K,6)=0.00
      S(K,7)=0.00
      S(K,8)=0.00
      S(K,9)=0.00
      P(K,1)=0.00
      P(K,2)=0.00
      P(K,3)=0.00
      P(K,4)=0.00
   32 CONTINUE
      PRINT 40
   40 FORMAT(12X,'PRELIMINARY SEARCH FOR REGIONS WITH HELIX POTENTIA
     1L - RULE 1')
      PRINT 41
   41 FORMAT(' ',12X,' ')
C
C     TO CALL SUBROUTINE ONE TO CARRY OUT THE PRELIMINARY SEARCH OF
C     HELICAL REGIONS
C
      CALL ONE
      STOP
      END

C
      SUBROUTINE ONE
C
C     PRELIMINARY SEARCH FOR HELICAL REGIONS
C
C     PURPOSE
C        PRELIMINARY SEARCH FOR HELICAL REGIONS BY APPLYING RULE 1 :
C        <PA> > 1.03 AND <PA> > <PB>
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,W,V3,V4,V5,V6,V7,V8,O
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,20),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,F,H,U,D,W,M,M1,M2,M3,M4,M5,M6,
     1L,I,K,L1,L2,NZ,NY,JA,JB,JC,JD,J1,J2,KM,N1,N2,NN,J,G,K3,V1,V2,V3,V4
     2,V5,V6,V7,O,HELLO,BYE,BALL,MOVE
C
C     DESCRIPTION OF PARAMETERS
C        H  - BOUNDARY RESIDUES OF A PREDICTED REGION
C             H(K)   - N-TERMINAL RESIDUE
C             H(K+1) - C-TERMINAL RESIDUE
C        J  - FIRST RESIDUE OF A SECTION TO BE CONSIDERED FOR THE
C             PRELIMINARY SEARCH BUT WILL CHANGE DURING N-PROPAGATION
C             (J-1)
C        JA - FIRST RESIDUE OF A SECTION TO BE CONSIDERED FOR THE
C             PRELIMINARY SEARCH BUT WILL CHANGE DURING C-PROPAGATION
C             (JA+1)
C        N1 - FIRST RESIDUE OF A SECTION TO BE CONSIDERED FOR THE
C             PRELIMINARY SEARCH
C        N2 - LAST RESIDUE OF A SECTION TO BE CONSIDERED FOR THE
C             PRELIMINARY SEARCH
C        A1 - AVERAGE <PA> OF A SECTION
C        A2 - AVERAGE <PB> OF A SECTION
C        I  - SWITCHING VALUE FOR DECISION MAKING
C             I=1 N-PROPAGATION
C             I=2 C-PROPAGATION
C        K  - COUNTER USED WITH THE ARRAY H TO STORE THE BOUNDARY
C             RESIDUES OF PREDICTED REGIONS
C
C     THE SEARCH WILL STOP WHEN THE LAST SEGMENT AT THE C-TERMINAL HAS
C     ONLY 5 AMINO ACID RESIDUES. IT IS NOT LONG ENOUGH FOR THE HELICAL
C     STATE
C
   10 K=2
      H(K)=0
      H(K-1)=0
      NZ=NN-5
      J=1
      JA=1
   15 I=0
   20 N2=JA+5
      HELLO=.FALSE.
      IF (J.EQ.H(K)) HELLO=.TRUE.
      IF (HELLO) N1=H(K)+1
      IF (.NOT.HELLO) N1=J
C
C     IF ARG OR CYS IS AT THE C-TERMINAL THEY CAN BE ADDED TO THE
C     POTENTIAL FRAGMENT BECAUSE OF THEIR GOOD PAC VALUE
C
      T1=0
      T2=0
      L=0
      LB=0
      L=N1+1+(N2-N1)/2
      DO 25 LN=L,N2
      IF (M(LN).EQ.2.OR.M(LN).EQ.5) S(LN,1)=1.00
   25 CONTINUE
C
C     TO CALCULATE THE AVERAGE <PA>,<PB> AND TO COUNT THE NUMBER OF
C     BREAKERS IN THE SECTION N1-N2
C
      DO 30 L=N1,N2
      T1=T1+S(L,1)
      T2=T2+S(L,2)
      IF (S(L,1).LE.0.69) LB=LB+1
   30 CONTINUE
      A1=T1/(N2-N1+1)
      A2=T2/(N2-N1+1)
C
C     IF <PA> < 1.03 TO START THE SEARCH AGAIN FROM NEXT POSITION J+1
C
      IF (A1.LT.1.03) GO TO 45
C
C     SPECIAL SITUATION WHERE THE SECTION MAY HAVE HELICAL POTENTIAL
C     EVEN THOUGH <PA> < <PB>
C
      IF (A1.GT.1.1100.AND.(A2-A1).LT.0.0640.AND.M(N1).EQ.4.AND.M(N2+1)
     1.EQ.2.AND.M(N1+3).EQ.1.AND.M(N1-1).EQ.11.AND.M(N1-2).EQ.8.AND.
     2S(N2,1).GT.1.01.AND.LB.EQ.0) GO TO 60
C
C     IF <PA> < <PB> EVEN IF <PA> > 1.03 TO START SEARCH AGAIN FROM
C     NEXT POSITION J+1 UNLESS THE LAST AMINO ACID RESIDUE HAS BEEN
C     REACHED
C
      IF (A1.LT.A2 .AND. N2.EQ.NN .AND.(N2+1-N1).EQ.6) GO TO 80
      IF (A1.LT.A2 .AND. N2.EQ.NN .AND.(N2+1-N1).GT.6) GO TO 70
      IF (A1.LT.A2 .AND. N2.NE.NN) GO TO 45
C
C     TO PROPAGATE AT THE C-TERMINAL SIDE WHEN THE SEARCH HAS NOT
C     REACHED THE LAST AMINO ACID RESIDUE YET (NN)
C
      IF (I.EQ.2 .AND. N2.EQ.NN) GO TO 55
      IF (I.EQ.2 .AND. N2.NE.NN) GO TO 50
C
C     TO START N-PROPAGATION WHEN <PA> > 1.03 AND <PA> > <PB> UNLESS
C     THE HELICAL SEGMENT STARTS FROM POSITION 1
C
      J=J-1
      I=1
      BYE=.FALSE.
      IF (J.EQ.H(K)) BYE=.TRUE.
      IF (BYE) I=2
      IF (BYE) GO TO 50
      IF (.NOT.BYE) GO TO 20
C
C     TO SWITCH FROM N-PROPAGATION TO C-PROPAGATION WHEN THE REMAINING
C     SECTION OF THE SEQUENCE HAS MORE THAN 5 RESIDUES
C
   45 J=J+1
      IF (I.EQ.2) GO TO 70
      IF (I.EQ.1) I=2
   50 JA=JA+1
      IF (JA.LE.NZ) GO TO 20
      IF (JA.GT.NZ) GO TO 80
C
C     TO PRINT OUT THE LAST HELIX POTENTIAL AREA H(K),H(K+1) AND THE
C     MAXIMUM VALUE OF THE COUNTER K WHICH WILL BE USED IN THE NEXT
C     SUBROUTINE
C
   55 K=K+1
      H(K)=N1
      K=K+1
      H(K)=N2
      PRINT 58,H(K-1),H(K)
   58 FORMAT('0',30X,I6,10X,I6)
      KM=K
      GO TO 80
C
C     TO PRINT OUT THE HELIX POTENTIAL AREAS H(K),H(K+1),THEN THE
C     PRELIMINARY SEARCH STARTS AGAIN FROM POSITION (H(K+1)+1)
C
   60 K=K+1
      H(K)=N1
      K=K+1
      H(K)=N2
      GO TO 75
   70 K=K+1
      H(K)=N1
      K=K+1
      H(K)=N2-1
   75 PRINT 78,H(K-1),H(K)
   78 FORMAT('0',30X,I6,10X,I6)
      J=H(K)
      JA=H(K)
      KM=K
      IF (JA.LE.NZ) GO TO 15
   80 PRINT 85,KM
   85 FORMAT('0',40X,'KM:',I4)
      K=2
      W=1
      PRINT 90
   90 FORMAT('-',12X,'SEARCH FOR ACTUAL HELICES FROM THE POTENTIAL REGIO
     1NS')
      PRINT 95
   95 FORMAT(' ',12X,' '//)
C
C     TO CALL SUBROUTINE TWO TO CARRY OUT THE NUCLEATION SEARCH ON
C     THOSE POTENTIAL AREAS
C
      CALL TWO
      RETURN
      END

C
      SUBROUTINE TWO
C
C     SEARCH FOR HELIX NUCLEATION
C
C     PURPOSE
C        SEARCH FOR NUCLEATING HELICAL REGIONS WHICH SHOULD CONTAIN AT
C        LEAST 4 FORMERS OUT OF 6 RESIDUES
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,W,V3,V4,V5,V6,V7,V8,O
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,20),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,F,H,U,D,W,M,M1,M2,M3,M4,M5,M6,
' 1L,I,K,L1,L2,NZ,NY,JA,JB,JC,JD, J 1,J2,KM,N1,N2,NN,J,G,K3,V1,V2,V3,V4 2, V5,V6,V7,0,HELLO,BYE,BALL.MOVE DESCRIPTION OF PARAMETERS J FIRST RESIDUE OF THE 6 RESIDUE PEPTIDE SUBJECTED TO THE NUCLEATION SEARCH JA - SIXTH RESIDUE OF THE 6 RESIDUE PEPTIDE SUBJECTED TO THE NUCLEATION SEARCH W - SWITCHING VALUE FOR DECISION MAKING W=1 THE CURRENT POTENTIAL AREA IS STILL LONG ENOUGH (> 6 RESIDUES) TO BE SUBJECTED TO THE NUCLEATION SEARCH W=2 THE CURRENT POTENTIAL AREA IS TOO SHORT FOR ANOTHER HELIX SO TO START WITH THE NEXT POTENTIAL AREA REMARKS UNLESS NOTIFIED THE OTHER PARAMETERS STILL HAVE THE SAME DEFINITION IF W = 2 THE NUCLEATION SEARCH WILL START ON A NEW POTENTIAL AREA SINCE THE PREVIOUS ONE HAS BEEN THOUROUGHLY ANALYZED. EACH TIME K INCREASES BY 1 THE NEXT POTENTIAL AREA IS SUBJECTED TO THE NUCLEA TION PROCEDURE  cn  51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100  C 10 15  C C C C C C C  20  TO COUNT THE DIFFERENT TYPES OF ASSIGNMENTS (T3) AND THE NUMBER OF BREAKERS (L) IN THE SEGMENT d~dA S(I,3) = 0.0 IF RESIDUE I IS A BREAKER OR AN INDIFFERENT S(I,3) = 0.5 IF RESIDUE I IS A WEAK FORMER S(I,3) = 1.0 IF RESIDUE I IS A FORMER  25 30 C C C C  T3 = 0 L =0 DO 25 I=J,dA S(I.3)=0 IF (S(I . 1).GE. 1 .00) S(I,3)=0.5 IF (S(I , 1),GE. 
1 .06) S(I,3) = 1.0 T3 = T3+S(I,3) IF (S(I, 1 ) .LE.0.69) L = L+1 CONTINUE PRINT 30,d,dA,T3,L FORMAT(' ',1OX,'J :',14,5X,'JA:',14,5X,'T3:',F7.4,5X,'L:',13,5X 1'HELIX NUCLEATION') IF CASE ARG IS AT THE C-TERMINAL IT MAY SWITCH FROM INDIFFERENT TO FORMER SO THAT THE NUCLEATION RULE CAN BE SATISFIED IF (T3.EO.3.5.AND.M(dA).EO.2) S(dA,1)=1.00 IF (T3.EQ.3.5.AND.M(dA).EQ.2) T3=4.0  C C C C C c  c  IF (W.E0.2) GO TO 20 K=K+ 1 IF (K.GT.KM) GO TO 170 N1=H(K) K = K+1 N2=H(K) IF (W.E0.1) d = N1 NY=N2-5 dA=d+5  LIST OF SPECIAL SITUATIONS WHERE THE NUCLEATION RULE AND THE TYPES OF RESIDUES IN THE SEGMENT SHOULD BE COMBINED TOGETHER SINCE THE NUCLEATION RULE BY ITSELF IS TOO DISCRIMINATIVE IF ((UA+2).GT.NN.OR.(d-2).LE.0) GO TO 35 IF (T3 . GE . 4 .0. AND . L . LE . 2 . AND .M(d).EQ.7. AND . M( <JA ) . E-Q . 1 . AND .M(dA1.EQ.7.AND.M(J+3).EO- 1 .AND.M(d~2).EQ. 1 .AND.S(JA+1, 1).GT. 1 . 16.AND 3S(dA+2,1).GT. 1.16) GO TO 90 35  IF ((JA+1).GT.NN.OR.(J-3).LE.0) GO TO 40 IF (T3.GE.4.5.AND.L.EQ.1.AND.M(U+3).EO.15.AND.M(J).EQ.4.AND. M (d 1.EO.11.AND.M(d+4).EQ.7.AND.M(OA).EQ.18.AND.S(d-3,8).GE.2.01.AND 2 S(d-2, 1).GT. 1 . 16.AND.S(dA+1 , 1 ).GT. 1 .01) GO TO 110  -J °  101 102 103 104 105 106 107 108 109 1 10 111 112 113 114 115 116 117 118 119 120 121 122 123 1 24 1 25 126 127 128 129 130 13 1 132 133 134 135 136 137 138 139 140 14 1 142 143 144 145 146 147 148 149 150  c  C  40  IF ((dA+2) .GT.NN.OR. (d-4) .LE.O) GO TO 45 IF (T3.GE.3.5.AND.L.EQ.0.AND.M(J).EO.14.AND.M(d+1).EO.1.AND.M(JA+1 1).E0.7.AND,M(dA-1).EO.9.AND.S(J-1,1).LT.0.67.AND. S(dA + 2,1).LE.0.69 2 .AND.5(J-2,8).LT.1.08.AND.S(d-3,8).GT.1.47.AND.S(d-4,8).LT.1.08) 3 GO TO 120  45  IF ((d-2).LE.O) GO TO 50 IF (T3.GE.4.0.AND.L.LE.2.AND.M(d 1).EO.6.AND.M(d+3).EO.1.AND.M(dA 1 - 1 ) .EO. 1 1 .AND.M(JA) .EO. 1 1 .AND.S(d,1).LE.0.69.AND.S(d-1, 1 ).LE.0.69 2.AND.S(J-2,8).GT.1.47) GO TO 110 +  C  C  C  C  50.  IF ((JA+2).GT.NN) GO TO 55 IF (T3.GE.4.0.AND.L.EO. 1 . AND.M(J).EO. 
15.AND.M(J+1).EQ.4.AND.M(dA) 1.E0.7.AND.M(d+3).EO.5.AND.M(dA+1).EO.15.AND.M(dA+2).EO.15) GO TO 2 130  55  IF ((dA+1).GT.NN.OR.(d-2).LE.0) GO TO 60 IF (T3.GE.3.5.AND.L.EQ.0.AND.M(J-1).EQ. 15.AND.5(d+2, 1 ) .GT . 1 .21 .AND 1 .S(d+3, 1).GT.1.16.AND.S(d+4, 1).GT. 1. 16.AND. S (dA , 9 ) . GT . 0 . 75 .AND.S( 2 dA+1.9).GT.0.75.AND.S(d-2,8).LT.1.08) GO TO 140  60  IF ((dA+2).GT.NN.OR.(d-1).LE.0) GO TO 65 IF (T3.GE.3.0.AND.L.EQ.3.AND.M(d).EQ. 15.AND.M(d+2).EQ. 1 .AND.M(J+3 ) 1 .EQ.20.AND.M(J + 4) .EQ. 13.AND.M(dA).EQ.8.AND.S(JA+1 ,2).LT.O.93.AND. 2S( JA+2 , 2) .LT.0.74.AND.S(J-1,2).LT.O.93) GO TO 130  65  IF ((JA + 7 ) .GT.NN) GO TO 70 IF (T3.EQ.2.5.AND.L.EQ.2.AND.M(d+2).EQ.14.AND.5(d+3,8).GT.2.02.AND 1 .S(d+4. 1).GT.0.7 7.AND.S(dA, 1 ) .GT.0.83.AND.S(JA+1,8).GT.2.01.AND. 2 S(JA + 5,9).GT. 1 . 10.AND.S(JA + 5, 1 ) .GT. 1 . 16.AND.M(JA + 2).NE. 15.AND.S( 3 JA+3,9).GT.1.10.AND.S(JA+4,9).GT.1.24.AND.S(JA+6,9).LT.1.10.AND.S 4 (JA + 7, 1 ) .LT. 1 .06) GO TO 150  70  IF ((JA+1).GT.NN.OR.(J-3).LE.0) GO TO 75 IF (T3.GE.4.0.AND.L.LE. 1 .AND.M(JA) .EQ. 15.AND.S(JA+1, 1 ) .LT.S(JA- 1 , 1 1 ) . AND.S(JA-1, 1 ) .GT. 1 . 16.AND.S(JA- 1,9).GT. 1. 10.AND.S(d,8).GT.2.01 2 .AND.S(d-1,8).GT.1.29.AND.S(d-2,6).GT.1.09.AND.S(J-3,8).LT.S(d-1 3.8).AND.S(d+1,1).GT.1.16.AND.S(d+2,9).GT.1.10.AND.S(d+3.9).GT.1. 4 20) GO TO 140  75  IF ((dA+3).GT.NN.OR.(d-3).LE.0) GO TO 80 IF (T3.GE.3.50.AND.L.LE.1.AND.M(d).EQ.15.AND.S(dA,9).GE.1.10.AND.S 1 (dA, 1 ) .GT. 1 .08.AND.S(dA+1, 1) . LE.0.69.AND.S(dA+2, 1 ) .LE.0.69.AND.S( 2 dA + 3,9) .LT.0.98.AND.S(d+1, 1 ) . GT. 1 . 16.AND.S(d+2, 1) .GT. 1 . 16.AND.S(d 3 +3,9).GE.1.57.AND.S(d+4,9).GT.0.75.AND.S(d-1,1).GT.1.16.AND.S(d-2 4 . 1 ) .GT. 1 . 13.AND.S(d-3, 1 ) .GT. 
1.01) GO TO 130  C  C  C C C  THE NUCLEATION RULE BY ITSELF IS THE CRITERIA FOR SELECTION IF NO NE OF THE ABOVE CONDITIONS IS SATISFIED  151 152 153 154 155 156 157 158 159 160 16 1 162 163 164 165 166 167 168 169 170 17 1 172 173 174 175 176 177 178 179 180 18 1 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200  C C C C C C  80  THE NUCLEATION SEARCH FAILED FOR THE SEGMENT J-JA. TO START AGAIN FROM NEXT POSITION d+1 d = d+1 IF (d.LE.NY) GO TO 20 GO TO 15  C C C C C  A VALID NUCLEATION SEGMENT ACCORDING TO RULE HELIX-4 SHOULD NOT HA VE PRO RESIDUE IN THE INNER HELIX 90  C C C C  95  DO 95 I=d,dA IF (M(I).EQ.15 .AND. I.E0.(d+2)) IF (M(I).E0.15 .AND. I.E0.(d+2)) IF (M(I).E0.15 .AND. I.NE.(d+2)) CONTINUE  d1=d GO TO 100 GO TO 105  TO CALL SUBROUTINE THREE FOR THE PROPAGATION OF THE VALID NUCLEATI NG SEGMENT 100  C C C C  IF (T3.GE.4.0.AND.L.LT.2 ) GO TO 90  CALL THRE GO TO 10  THE PRESENCE OF PRO IN THE INNER HELIX IS UNFAVORABLE TO THE NUCLE TION SO TO START THE SEARCH AGAIN FROM NEXT POSITION d+1 105  d = d+1 IF (d.LE.NY) GO TO 15  GO TO 20  C  c c c c c  c  TO PRINT OUT THE POSSIBLE HELICAL REGIONS WHICH ARE THEN SUBdECTED TO THE BOUNDARY ADdUSTMENT(SUBROUTINE M0d1). THOSE ARE ALSO SPECIAL CASES BECAUSE THE PROPAGATION PROCEDURE IS OMITTED 1 10 115  PRINT 115,d,dA FORMAT('0',10X,'PSEUDO HELIX FR0M',5X.'d :',I5,3X,'T0 JA:',I5.10X 1'SPECIAL CASE' / ) d1=d d2 = dA GO TO 160  120  d1=d - 3 d2=dA+1 PRINT 125,d1,d2 FORMAT('0',10X,'PSEUDO-HELIX FROM',5X,'d1:',I5,3X,'TO d2:',15,10X  125  201 202 203 C 204 205 206 207 C 208 209 2 10 * 21 1 2 12 213 C 214 2 15 216 217 2 18 c 2 19 c 220 c 22 1 c 222 c 223 c 224 c 225 226 227 228 229 230 c 231 232 233 234 End of F i l e  1 'SPECIAL CASE'/) GO TO 165 130  J1=d J2 = JA PRINT 125, U1 , J2 GO TO 165  140  J1=J-1 J2=JA-1 PRINT 125, J1 , J2 GO TO 165  150  J1= d+2 J2=JA+5 PRINT 125,01,02 GO TO 165  TO CALL SUBROUTINE M0J1 FOR THE BOUNDARY ADJUSTMENT OF THE PREDIC TED AREA. 
WHEN RETURNING FROM THAT PROCEDURE IF THE POTENTIAL AREA IS NOT LONG ENOUGH FOR ANOTHER HELIX THEN TO START ANALYZING THE NEXT POTENTIAL. AREA 160 165  CALL M0J1 IF (J2.LT. NY) IF (J2.LT. NY) IF (J2.GE. NY) GO TO 10  170 175  PRINT 175 FORMAT('-' , 'END OF PROGRAM' ) RETURN END  J=J2+1 W=2 W=1  co  1 2 3 4 5 6 7 8 9 10 1 1 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50  C C SUBROUTINE C C C C C C C C C C C C C C C C C C C  THRE  PROPAGATION OF THE  ALPHA-HELIX  PURPOSE TO ADD TO THE NUCLEATING FRAGMENT TETREPEPTIDES WHICH HAVE <PA> > 1 . 0 0 AND WHICH S A T I S F Y THE PROPAGATION SET OF RULES  REAL S , T 1 , T 2 , A 1 , A 2 ,T3,T4,T5,TT,P INTEGER G , F , H , U , D , V 1 , V 2 . W , V3,V4.V5,V6,V7,V8,0 LOGICAL H E L L O . B Y E ,BALL,MOVE DIMENSION S(1000,20),M(1000),H(1000),D(1000,16),P(1000,10) COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,F,H,U,D,W,M,M1,M2,M3,M4,M5,M6, 1L,I,K,L1,L2.NZ,NY,JA,JB,JC,JD,J1,J2,KM,N1,N2.NN,J,G,K3,V1,V2,V3,V4 2, V5.V6,V7,0,HELLO,BYE,BALL,MOVE C c c c  D E S C R I P T I O N OF PARAMETERS JB - WHETHER I T I S N - OR C-PROPAGATION JB W I L L ALWAYS BE THE F I R S T LEFT RESIDUE OF THE ADJACENT TETRAPEPTIDE JC - WHETHER I T I S N - OR C-PROPAGATION JC W I L L ALWAYS BE THE FOURTH RESIDUE OF THE ADJACENT TETRAPEPTIDE N1 - N - T E R M I N A L RESIDUE OF THE CURRENT P O T E N T I A L AREA U - SWITCHING VALUE FOR D E C I S I O N MAKING U=1 N-PROPAGATION U=2 C-PROPAGATION  c c c c  c c c c c c c c c  I F PRO OCCUPY THE F I R S T PROPAGATION I M M E D I A T E L Y D I N G TO RULE H E L I X - 4 10  M1=0 M2=0 M3=0 M4= 1  TURN OF THE NUCLEATING SEGMENT TO START C BECAUSE N-PROPAGATION I S NOT P O S S I B L E ACCOR  51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100  C C C  C C C C C  C C C  C  M6 = 0 IF (M(I).E0.15 .AND. 
I.E0.(J+2)) U= 1  AS LONG AS JB BELONGS TO THE CURRENT POTENTIAL AREA THE N-PROPAGA TION CAN BE CARRIED OUT 20 M1=M1+1 JB=J-(4*M1) IF (JB.GT.0.AND.JB.GE.N1) GO TO 30 IF (JB.LT.N1.AND.M1.EO.1) J1=J IF (JB.LT,N1 .AND.M1 .NE. 1) J1=J-4*(M1- 1) 25 U=2 M2=0 30 T3=0 IF (U.EO.1) GO TO 35 TO START C-PROPAGATION WHEN N-PROPAGATION HAS BEEN STOPPED AND AS LONG AS THE ADJACENT TETRAPEPTIDE IS WfTHIN THE LIMITS OF THE POT ENTIAL AREA IF (M2.NE.0) JB=JA+1+(4*M2) IF (M2.E0.O) JB=JA+1 M2=M2+1 IF (JB.GT.N2) GO TO 70 TO CALCULATE THE <PA> OF THE ADJACENT TETRAPEPTIDE (JB-JC) 35  JC=JB+3 IF (JCGT.N2 . AND . JB . LE . N2) DO 40 I=JB,JC T3 = T3 + S(I, 1 ) 40C0NTINUE 45  C C C C C C C C C C  GO TO 25  GO TO 70  PRINT 45,JB,JC,T3 FORMAT(' . ' , 10X, 'JB: ' . 14,5X, 'JC: ' ,I4,5X. 'T3: ' ,F7.4, 15X, 'HELIX PROPA 1GATI0N')  IF <PA> > 1.00 TO CHECK THE NUMBER OF BREAKERS AND FORMERS IN THE SECTION FORMED BY THE TETRAPEPTIDE AND THE TWO ADJACENT RESIDUES OF THE NUCLEATING FRAGMENT OR OF THE PROPAGATING ONE IF (T3.GE.4.0)  GO TO 190  TETRAPEPTIDES WITH <PA> <1.00 SHOULD NOT CONTAIN ANY BREAKER NOR ONLY 4 IA IN ORDER TO ALLOW HELIX PROPAGATION TO CONTINUE  50  DO 50 I=JB,JC IF ( S O , 1 ) . L E . 0 . 6 9 ) CONTINUE  GO TO 60  101 102 103 104 105 106 107 108 109 1 10 1 1 1 1 12 1 13 1 14 1 15 1 16 1 17 1 18 1 19 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 14 1 142 143 144 145 146 147 148 149 150  55 C C C  L=0 DO 55 I=dB,dC IF (S(I, 1 ).LE . 1 .01 .AND.S(I. 1).GE.0.70) CONTINUE IF (L.EQ.4) GO TO 60 IF (L.NE.4) GO TO 190  L = L+1  TO SWITCH TO C-PROPAGATION WHEN N-PROPAGATION HAS BEEN STOPPED 60  C C C C C C C C  BALL=.FALSE . IF (U.EO.1) BALL=.TRUE. IF (BALL) d1=dB+4 IF (BALL) U=2 IF (BALL) GO TO 30 BOTH N- AND C-PROPAGATIONS BY TETRAPEPTIDE ADDITION HAVE BEEN STOP PED . TO START ADDING ONE RESIDUE AT A TIME TO N-TERMINAL FIRST THEN TO C-TERMINAL OF THE PROPAGATING SECTION. 
WHEN ADDING IA TO EACH END TO CHECK IMMEDIATELY WHETHER THE RULE OF AT LEAST HALF OF FORMERS IS STILL SATISFIED OR NOT  70 75  80 85  C C C C C C c  IF (M(d1+2).EO. 15) GO TO 80 L1 =J 1 - 1 IF (L1.LT.(N1) .0R.L1.E0.0) GO TO 80 IF (M(L1).E0.4 .OR.M(L1).EO.17) S(L1,1)=1.00 IF (S(L1,1).GT.1.00) d1=L1 IF (S(L1 , 1 ) .LE . 1.00.AND.S(L1, 1),GE.0.70) d1=L1 IF (S(L1 , 1).GT. 1 .00) GO TO 75 d2=dB-1 L2=d2+1 IF (L2.GT.NN) GO TO 90 IF (L2.GT.(N2) ) GO TO 90 IF (M(L2).EO.2.OR.M(L2).E0.5) S(L2,1)=1.00 IF (S(L2,1).GT.1.00) d2 = L2 IF (S(L2,1).LE.1.00 .AND. S(L2,1).GE.0.70) d2=L2 IF (S(L2,1) .GT.1.00) GO TO 85 CHECK FOR THEffOF HELIX FORMERS IN THE ENTIRE HELIX... TO COMPARE THE ACTUAL NUMBER OF FORMERS (T4) TO ITS THEORITICAL ONE (TT: EQUAL TO AT LEAST HALF OF THE SECTION)  90  T4=0 DO 95 1=01 ,J2 S( I ,4)=0 IF ( S ( I , 1 ) GE. 1.00) S(I,4)=0.5 IF (S(I,1).GE.1.06) S(I,4)=1.0 T4=T4+S(I,4)  CTv  151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200  95 100 C C  c  c c c c c  TO CONTINUE ADDING ONE RESIDUE AT A TIME TO BOTH ENDS IF T4 > TT IF (T4.GE.TT .AND. S(J1-1.1) .LE.0.69) GO TO 110 IF (T4.GE.TT .AND. S(d1 - 1, 1) .GT.0.69.AND.L1 .GT.(N1)) GO TO 70 IF (L2.GT.NN) GO TO 170 1 10 IF (T4.GE.TT .AND. S(d2+1,1) .GT.0.69.AND.L2.LT.(N2)) GO TO 85 IF((T4.GE.TT.AND.S(d2+1,1).LE.0.69) .OR.(T4.GE.TT.AND.S(d2+1, 1 ) .GT 1.0.69.AND.L2.GE. N2)) GO TO 170 IF T4 < TT THEN TO WITHDRAW SOME BOUNDARY RESIDUES (ESPECIALLY BA,,IA) SO THAT T4 > TT 120  125 130 135 140  c c c c  CONTINUE TT=(02-01+1)/2.0 PRINT 100,0 1 ,02,T4,TT .' 4 ,F7 , FORMAT(' ' , 10X, '01 : ' ,I4,5X, ' 02 : ' ,I4,5X, 'T4: ' ,F7.4,5X, 'TT: 1 4X , ' ACTUAL AND THEORIT. H FORMERS FROM J1 TO L)2 ' )  IF (S(d2,1) .LT.1.00) GO TO 125 IF (S(d1 , 1 ) . LT . 1 .00 .AND.M(d1 + 2).NE. 
15) GO TO 130 IF (S(d2,1) .LT.1.06) GO TO 135 IF (S(d1,1).LT.1.06 .AND.M(d1+2),NE.15) GOTO 140 02=02-1 IF (S(d2+1.1).LT.1.00) GO TO 150 01=01+1 IF (S(01-1, 1 ) .LT . 1.00) GO TO 150 02=02-1 IF (S(d2+1,1).LT.1.06) GO TO 150 01=01+1 IF (S(d1-1, 1 ) . LT . 1 .06) GO TO 150  TO CHECK T4 AND TT EVERY TIME A BOUNDARY RESIDUE IS WITHDRAWN 150  155 160  170 175  T4=0 DO 155 1=01,02 S( I , 4)=0 IF (S(I,1).GE.1.00) S(I,4)=0.5 IF (S(I,1).GE.1.06) S(I,4)=1.0 T4=T4+S(1,4) CONTINUE TT=(d2-d1+1)/2.0 PRINT 160,d1,d2,T4,TT FORMAT(' ',1OX,'01:',I4,5X,'02:',I4,5X,'T4:',F7.4,5X,'TT:',F7 .4, 1 4X. 'ACTUAL AND THEORIT. H FORMERS FROM 01 TO'02') IF (T4.GE.TT) GO TO 170 IF (T4.LT.TT) GO TO 120 PRINT 175,01,02 FORMAT('0' , 10X, 'PSEUDO-HELIX FROM 01 : ' ,15,3X, 'TO 02:',15,/)  201 202 203 204 205 206 207 208 209 210 21 1 212 213 2 14 215 216 217 218 2 19 220 22 1 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 24 1 242 243 244 245 246 247 248 249 250  C C C C  C C C C C C C  TO CALL SUBROUTINE CALL M0J1 IF (J2.LT.NY) IF (J2.LT.NY) IF (J2.LT.NY) IF (J2.LT.NY) W= 1 180 RETURN  PRO I CAN ONLY EXIST AT THE FIRST TURN OF N-TERMINAL SIDE. ANY OTH ER POSITION ESPECIALLY AT THE C-TERMINAL WILL IMPEDE THE PROPAGA TION  200  215 218  c c  DO 200 I=JB,JC IF (M(I).EO.15.AND.I.EO.(JB+2).AND.U.EO.1) GO TO 210 IF (M(I).E0.15.AND.I.NE.(JB+2).AND.U.EO.1) GO TO 220 IF (M(I).EQ.15.AND.U.EO.2) GO TO 70 CONTINUE IF (U.EO.1) GO TO 210 IF (JB.EO.(JA+1)) JB=JA-1 IF (JB.NE.(JA-1 ) ) JB = JB-2  IF PRO IS NOT FOUND IN THE TETRAPEPTIDE THEN TO CHECK THE NUMBER OF FORMERS OF THE 6 RESIDUE UNIT (= TETRAPEPTIDE + 2 ADJACENT RESI DUES) 210  C C C  J=J2+1 W=2 N1=J2 RETURN  CHECK FOR THE NUMBER OF FORMERS IN THE 6 RESIDUE UNIT ...  190  C C C C C  M0J1 TO CARRY OUT THE BOUNDARY ADJUSTMENT  JC=JB+5 T4=0 DO 215 I=JB.JC S(I,4)=0 IF (S(I, 1 ) .GE. 
1 .00) S(I.4)=0.5 IF (S(I,1).GE.1.06) S(I.4)=1.0 T4 = T4 + S(I.4) CONTINUE PRINT 218,JB,JC,T4 FORMAT(' ' , 10X, 'JB: ' ,I4,5X, 'JC: ' ,I4,5X, 'T4: ' ,F7.4, 14X, ' HELIX FORM 1IN 6 OVERL. RESIDUES' ) IF (T4.GE.4.0) GO TO 240  IF THE 6 RESIDUE UNIT DOES NOT HAVE AT LEAST TWO THIRDS FORMERS THEN EITHER TO SWITCH FROM N-PROPAGATION TO C-PROPAGATION OR TO START ADDING ONE RESIDUE AT A TIME TO BOTH ENDS 220  IF (U.EO.2) U=2  GO TO 230  251 252 253 254 255 C 256 C 257 C 258 C 259 C C 260 261 C C 262 263 C 264 C 265 C 266 C 267 C 268 C 269 C 270 27 1 272 273 274 275 00 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 End of F i l e  230  J1=JB+4 GO TO 30 JB=JC-3 GO TO 70  ... TO CHECK THE NUMBER OF BREAKERS IN THE ENTIRE POLYPEPTIDE . . . DESCRIPTION OF PARAMETERS JB - N-TERMINAL RESIDUE OF THE HELICAL POLYPEPTIDE JD - C-TERMINAL RESIDUE OF THE HELICAL POLYPEPTIDE M3 - COUNTER M4 - COUNTER IF THE ACTUAL NUMBER OF BREAKERS (L) IS LESS THAN THE THEORITICAL ONE (M5: ONE THIRD OF THE SECTION) THEN THE REGION CAN KEEP ON PRO PAGATING. OTHERWISE EITHER TO SWITCH FROM N-PROPAGATION TO C-PROPA GATION OR TO START ADDING ONE RESIDUE AT A TIME 240 250  255 258  260  M5=0 IF (U.EO.1) GO TO 250 JB=JB-(4*M4 ) JD=JB+9+(4*M3) M3=M3+1 M4=M4+1 M5= (JD-JB+1)/3 L =0 DO 255 I=JB,JD IF (S(I.1).LE.0.69) L=L+1 CONTINUE PRINT 258,JB,JD,M5 ,L FORMAT(' ' , 10X, 'JB: ' , 14 , 5X, ' JD: ' , 14,5X, 'M5: ' , 17,5X, ' L: ' ,I 3,5X, 1 'THEORIT. AND ACTUAL # BREAKERS FROM JB TO JD') IF (L.LT.M5.AND.U.EO. 1 .AND.M(JB+2).EO. 15) GO TO 260 IF (L.LT.M5.AND.U.EO.1) GO TO 20 IF (L.LT.M5.AND.U.EO.2 ) GO TO 30 M6 = M2 IF (U.EO.2.AND.M6.EO.O) JB=JB+6 IF (U.E0.2.AND.M6.NE.O) JB=JB+6+(4*M6) IF (U.EO.2) GO TO 70 U=2 J1=JB+4 GO TO 30 J1=JB U=2 GO TO 30 END  1 2 3 4 5 G 7 8 9 10 11 12 13 14 15 16 17 18 19  C C C C C C C C C C C C C C C C .  
20  C  21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50  c C  C C C C C C C C C C C C C C C C C C C C  SUBROUTINE  MO<J1  BOUNDARY MOVE OF THE N-TERMINAL  PURPOSE TO FIND OUT THE MOST FAVORABLE N-BOUNDARY RESIDUE FOR THE PREDIC TED HELIX BASED ON THE BOUNDARY CONFORMATIONAL PARAMETERS OF THE ADJACENT RESIDUES  REAL S.T1,T2,A1 ,A2 , T3,T4.T5.TT,P INTEGER G,F,H,U,D,V1,V2,W, V3,V4,V5,VS,V7,V8,0 LOGICAL HELLO,BYE ,BALL,MOVE DIMENSION S(1000,20),M(1000),H(1000),D(1000,16),P(1O0O.10) COMMON S,T1 ,T2,T3.T4,T5,TT.A1.A2,P.F.H.U.D,W.M.M1.M2.M3,M4,M5,M6, 1L, I ,K,L1 ,L2.NZ,NY,JA,JB,JC, <JD , J 1 , J2 , KM , N 1 , N2 , NN ,J,G,K3,V1 .V2.V3.V4 2,V5.V6,V7.0.HELLO.BYE,BALL,MOVE DESCRIPTION OF PARAMETERS V1 - ACTUAL NUMBER OF BREAKERS IN THE PREDICTED HELIX (=L) V2 - COUNTER INDICATING THE POSITION OF THE ADJUSTMENT BECAUSE THE PROCEDURE CONTAINS SEVERAL DIFFERENT POSSIBILITIES OF AD JUSTMENT (COUNTER USED FOR N-TERMINAL ADJUSTMENT) J1 N-TERMINAL RESIDUE OF THE PREDICTED HELIX J2 - C-TERMINAL RESIDUE OF THE PREDICTED HELIX K3 - C-TERMINAL RESIDUE OF THE PREVIOUS PREDICTED HELIX SITUATION WITH J1 CLOSE TO ZERO TO TAKE INTO ACCOUNT THE POSITION OF J1 WHEN IT IS CLOSE TO THE NTERMINAL OF THE PROTEIN SINCE THERE IS LESS FREEDOM FOR MOVING IT TOWARDS THIS SIDE  V1=L V2=0 V3=0  51 52 53 54 55 56 57 58 59 60 61 62 63  00 O  65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 gO  91 92 93 94 95 96 97 98 99 100  C C Q  PRINT 5 FORMAT('O',30X,'BOUNDARY ANALYSIS OF THE N-TERMINAL')  5  -j  * * *  * * *  IF ( (d1-1).LE.0) GO TO 10 BALL=.FALSE. IF (d1.EQ.2.AND.S(J1,8).GT. 1.47.AND. S (d1 - 1,1).LE.0.69.AND.S(J 1 + 1 , 2 1) .GT. 1 .01 .AND,S(d1+2,8).GE.S(d1,8) .AND. M ( d 1 - 1) .NE. 15)BALL=.TRUE. IF (BALL) d1=d1 IF (BALL) V2=1 IF (BALL) GO TO 300  C 10  C C ,  C C  BALL=.FALSE. IF (d1.EQ.1.AND.S(J1,8) .GT. 1 .08.AND.S(J1 + 1 ,8).LT.S(d1,8).AND.S(d1 + 1 2,8) .LE.S(d1.8).AND.S(d1+3,8).LT. 1.08) BALL=.TRUE. 
      IF (BALL) J1=J1
      IF (BALL) V2=2
      IF (BALL) GO TO 300
C
C     *** 3 ***
      BALL=.FALSE.
      IF (J1.EQ.1.AND.S(J1,8).GT.1.08.AND.S(J1,1).GT.1.01.AND.S(J1+1,1).
     1LT.1.06.AND.S(J1+2,8).LT.S(J1,8).AND.S(J1+3,8).LT.1.08) BALL=
     2.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=3
      IF (BALL) GO TO 300
C
C     *** 4 ***
      BALL=.FALSE.
      IF (J1.EQ.1.AND.S(J1,1).GT.1.16.AND.S(J1+1,8).GT.2.02.AND.S(J1+2,
     1 8).GT.2.02.AND.S(J1+3,1).GT.1.11.AND.S(J1+4,1).GT.1.16) BALL=
     2 .TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=4
      IF (BALL) GO TO 300
C
C     *** 5 ***
      BALL=.FALSE.
      T1=0
      T2=0
      T5=0
      T1=S(J1+1,1)+S(J1+2,1)+S(J1+3,1)+S(J1+4,1)
      T2=S(J1+1,2)+S(J1+2,2)+S(J1+3,2)+S(J1+4,2)
      T5=S(J1+1,5)+S(J1+2,5)+S(J1+3,5)+S(J1+4,5)
      PRINT 2,T1,T2,T5
    2 FORMAT(' ',30X,'T1,T2,T5',3(F7.3),' STEP 5,MOJ1 CLOSE TO 0')
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J1,8).LT.1.08.AND.S(J1+4,8).LT.1.
     108.AND.S(J1+5,8).GT.1.47.AND.S(J1+6,8).GT.1.08.AND.S(J1,8).LT.1.0
     28.AND.S(J1+2,8).LT.0.66.AND.S(J1+2,6).LT.1.01.AND.S(J1+3,8).LT.1.
     308) BALL=.TRUE.
      IF (BALL) J1=J1+5
      IF (BALL) V2=5
      IF (BALL) GO TO 300
C
C     *** 6 ***
      BALL=.FALSE.
      IF (J1.EQ.1.AND.S(J1,8).LT.1.08.AND.S(J1+1,8).GE.1.08.AND.S(J1+2,
     18).LT.S(J1+1,8).AND.S(J1+3,8).LT.S(J1+1,8).AND.S(J1+4,8).LT.S(J1+
     2 1,8)) BALL=.TRUE.
      IF (BALL) J1=J1+1
      IF (BALL) V2=6
      IF (BALL) GO TO 300
C
C     *** 7 ***
      BALL=.FALSE.
      IF (J1.EQ.1.AND.S(J1,8).LT.1.08.AND.S(J1+1,8).LT.1.08.AND.S(J1+2,8
     1 ).GE.1.08.AND.S(J1+3,8).LT.1.08.AND.S(J1+4,8).LT.1.08) BALL=
     2 .TRUE.
      IF (BALL) J1=J1+2
      IF (BALL) V2=7
      IF (BALL) GO TO 300
C
C     *** 8 ***
      BALL=.FALSE.
      IF (J1.EQ.1.AND.S(J1,8).LT.1.08.AND.S(J1+1,8).LT.1.08.AND.S(J1+2,
     1 8).LT.1.08.AND.S(J1+3,8).GE.1.08) BALL=.TRUE.
      IF (BALL) J1=J1+3
      IF (BALL) V2=8
      IF (BALL) GO TO 300
C
C     *** 9 ***
      BALL=.FALSE.
      IF (K.EQ.3) K3=N2
      IF (S(J1,8).LT.1.08.AND.(J1-2).LT.(K3-1).AND.S(J1+1,8).LT.1.08.AND
     1 .S(J1+2,1).LE.0.69.AND.M(J1+2).NE.15.AND.S(J1+3,8).GE.1.08) BALL
     2 =.TRUE.
      IF (BALL) J1=J1+3
      IF (BALL) V2=9
      IF (BALL) GO TO 300
C
C     *** 10 ***
      IF ((J1-3).LE.0) GO TO 6
      BALL=.FALSE.
      IF (S(J1,2).GE.1.47.AND.S(J1-1,2).GE.1.47.AND.S(J1-2,2).GT.0.93.AN
     1D.S(J1+2,2).GE.1.47.AND.S(J1+3,8).GT.1.47.AND.S(J1-3,1).LT.1.06.AN
     2D.(S(J1+1,2).GT.0.75.OR.M(J1+2).EQ.1)) BALL=.TRUE.
      IF (BALL) J1=J1+3
      IF (BALL) V2=10
      IF (BALL) GO TO 300
C
C     *** 11 ***
    6 BALL=.FALSE.
      IF ((J1-2).LE.0) GO TO 15
      IF (J1.EQ.3.AND.S(J1,8).LT.1.08.AND.S(J1+1,8).LT.1.08.AND.S(J1+2,
     1 8).LT.1.08.AND.S(J1+3,8).LE.1.08.AND.S(J1+4,8).GT.1.08.AND.S(J1-
     2 1,8).LT.1.08.AND.S(J1-2,8).LT.1.08) BALL=.TRUE.
      IF (BALL) J1=J1+4
      IF (BALL) V2=11
      IF (BALL) GO TO 300
C
C ... TO REPEAT THE B-TURN CHECK
C
C     TO CHECK THE PRESENCE OF TURNS IN THE VICINITY OF THE HELIX
C     BOUNDARIES WHICH MAY FORCE THE PREDICTED BOUNDARIES TO BE MOVED
C     TO A NEW POSITION. WE CHECK IT FROM POSITION J1-3 (I=0) TO J1+3
C     (I=6)
C
   15 I=0
      LE=J1-3
   20 IF (LE.LE.0) GO TO 200
      LF=LE+3
      IF ((LE+3).GT.NN) GO TO 210
C
C     TO COMPARE PA (T1), PB (T2), AND PT (T5) AND TO CALCULATE THE
C     PROBABILITY OF B-TURN OCCURRENCE (TT) OF THE TETRAPEPTIDE LE-LF
C
      T1=0
      T2=0
      T5=0
      TT=0
      HELLO=.FALSE.
      DO 25 L=LE,LF
      T1=T1+S(L,1)
      T2=T2+S(L,2)
      T5=T5+S(L,5)
   25 CONTINUE
      TT=P(LE,1)*P(LE+1,2)*P(LE+2,3)*P(LE+3,4)
      PRINT 30,LE,T1,T2,T5,TT,I
   30 FORMAT(' ',10X,'LE,T1,T2,T5,TT,I',I5,3(F7.4,2X),F13.9,I4,3X,
     1 'B-TURN SEARCH AT N-TERMINAL')
      IF (T5.GT.T1.AND.T5.GT.T2.AND.TT.GT.0.000075000) HELLO=.TRUE.
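The tetrapeptide test that sets HELLO above can be read on its own: sum the helix, sheet, and turn parameters over the four residues LE to LF (T1, T2, T5), multiply the four positional bend frequencies (TT), and flag a probable turn when T5 exceeds both T1 and T2 and TT exceeds 0.000075. The sketch below restates that rule in free-form Fortran. The module and procedure names are hypothetical, and reading the P array as positional bend frequencies is an interpretation of the listing rather than something it states outright.

```fortran
! Sketch only. Names (bturn_sketch, turn_probability, ...) are hypothetical.
module bturn_sketch
  implicit none
  ! Cutoff on the tetrapeptide turn probability, as used in the listing.
  real, parameter :: turn_cutoff = 0.000075
contains

  ! TT: product of the four positional frequencies, mirroring
  ! P(LE,1)*P(LE+1,2)*P(LE+2,3)*P(LE+3,4).
  pure function turn_probability(f) result(tt)
    real, intent(in) :: f(4)
    real :: tt
    tt = product(f)
  end function turn_probability

  ! HELLO: true when the summed turn parameter (T5) exceeds the summed
  ! helix (T1) and sheet (T2) parameters and TT exceeds the cutoff.
  pure function turn_predicted(p_alpha, p_beta, p_turn, f) result(hello)
    real, intent(in) :: p_alpha(4), p_beta(4), p_turn(4), f(4)
    logical :: hello
    real :: t1, t2, t5
    t1 = sum(p_alpha)
    t2 = sum(p_beta)
    t5 = sum(p_turn)
    hello = t5 > t1 .and. t5 > t2 .and. turn_probability(f) > turn_cutoff
  end function turn_predicted

end module bturn_sketch
```

Keeping the three sums and the product separate makes it plain that a tetrapeptide with high turn parameters is still rejected when the positional product falls below the cutoff.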
*** 1 ***  IF (HELLO.AND.LE.EO.(d1-3).AND.S(d1 + 1,8).GE. 1.08.AND.S( d1 ,8) . LT. 1 1 .08.AND.S(U1+2,8).GT.1.08.AND.S(d1+3,8).GT.1.08) GO TO 101  00  201 202 203 204 205 206 207 208 209 210 211 212 . 213 2 14 215 216 217 2 18 219 220 221 222 2 23 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250  C C  ***2+ + * IF (HELLO.AND.LE.E0.(J1-2).AND.S(J1-2,5).GT.1.52.AND.S ( J1 - 1 . 5).GT. 1 1.52.AND.S(d1,5).GT.1.43.AND.S(J1,1).LT.1.06.AND.S(J1 1,1).GT.1.0 2 6.AND.S(d1 +1,8).GT.1.08.AND.S(J1+2.1).LT.S(J1+1,1)) GO TO 101 +  C C  C C  C C  C C  C C C C  C C C C  C C  *** 3 *** IF (HELLO.AND.LE.EO.(J1- 1).AND.S(01+2,8).GT. 1 .47.AND.S(J 1+4,8 ) . LT. 1 1 .08.AND.S(J1+5,8).LT. 1.08.AND.S(J 1+6,8).LT.S(d1+2.8).AND.S(d1 + 1 . 2 8).LT.S(J1+2,8)) GO TO 102 *+*4+** IF (HELLO.AND.5(J1+2,8).LT.1.08.AND.S(J1+3.8).LT.1.08.AND.LE.EO.(J 11-2).AND.S(J1+5, 1 ) .GT.1.13.AND.S(J1+6,8).LT. 1.08.AND.S(d1+7,8).LT 2 . 1 .08.AND. (S(d1+4,8 ) . LT. 1 .08.OR.S(J1+4, 1).LT. 1 .06)) GO TO 105 *** 5 *** IF (HELLO.AND.LE.EO.J1 .AND.S(J1+4, 1 ) .GT. 1 . 1 1 .AND.5(01 + 5, 1).GT. 1.21 1 .AND.S(J1+6, 1 ) .GT . 1 .21.AND.S(J1 + 3, 1 ) . LT . S ( J 1+4, 1 ) .AND.S(J1 + 3 .6) . 2 GT .1.01. AND .S(d1+2,6) . GT .1.22) GO TO 104 *** g *** IF (HELL0.AND.S(J1 + 3,8).GT. 1 . 47 . AND.S(d1+4,8).LT. 1.08.AND.LE.EO.( 1 J 1 - 1).AND.S(J1+5,8).LE.S(J1+3,8).AND.S(J1- 1 . 1 ) . LT. 1 . 16.AND . S(<J1 -2 2 , 1 ) .LE.0.69.AND.S(J1 + 2, 1).LT.0.98.AND.S(J1+2.6).GT. 1 .41 ) GOTO 3 102 **+ 7 *** IF (HELLO.AND.S(J1+5,8).GE.1.08.AND.S(J1+6,8).LT.1.08.AND.S(J1+7,8 1 ).LT.1.08.AND.LE.EQ.(d1+2)) GO TO 105 **+g**+ IF (HELLO.AND.LE. EQ.(J1-2).AND.S(J1+2.8).LT.1.08.AND,S(d1+3,8).GE. 1 1.08.AND.S(J1+3,1).GT.1.01.AND.S(d1+4,8).LE.S(d1+3,8).AND.S(d1,1) 2 .LT.0.83.AND.(S(J1+1,2).GT.1.47.OR.S(d1+1.1).LT.O.67).AND.(S(d1+2 3 ,2 ) .GT. 1 .47.OR.S(d1 + 2, 1).LT.0.83)) GO TO 103 ***  9  ***  IF (HELLO.AND.S(d1+4,8).GT.1.47.AND.S(d1+3,8).LE.S(d1+4,8).AND.LE. 
1 EO.d1.AND.S(d1+5,8).LT.1.08.AND.S(d1+6,8).LT.S(d1+4,8)) GO TO 104 *** 10 *** IF (HELLO.AND.LE.E0.d1.AND.S(d1+3,8).LT.1.08.AND.S(d1+4,8).LT.1.08 1 .AND.S(d1+5.8).LT.1.08.AND.S(d1+6.8).LT.1.08.AND.S(d1+4,1).GT.0. 2 69.AND.S(d1,8).LT. 1 .47.AND.S(d1 + 1, 1 ) . LT . 1 . 16.AND.S(d1+2, 1).LT . 1 . 3 21) GO TO 104 *** 11 *** IF ((J1-1).LE.0) GO TO 40 IF(HELLO.AND.S(d1+3,8 ) .LT. 1 .08.AND.S(d1+4,8).GE. 1.08.AND.LE.EO.(d1  251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273  274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294  295 296 297 298 299 300  C C  1 -1 ) .AND.(S(01+2,1).LE.0.69.OR.S(01+1,1).LE.0.69).AND.S(J1-1,1).LE 2.0.69) GO TO 104 40  C C  *** 13 *** IF (HELLO.AND,LE.EQ.(01+2).AND.S(J1+6.8).GT.1.08.AND.S(01+5,8).LT. 1 S(01+6,8) ) GO TO 106  C C  ***14*** IF (HELLO.AND.LE.EO.(J1+1).AND.S(01+6,8).GE.S(J1+5,8).AND. S(01+6, 1 1 ).GT.S(J1+5.1).AND.S(01+7,8).LT.S(J1+6,8).AND.S(01 + 4.8).LT.1.08) 2 GO TO 106  C C  *** 15 *** IF ((01-3).LE.0) GO TO 50 IF (S(01,8).LT.1.08.AND.S(01,2).GE.1.47.AND.S(01-1,2).GT.1.38.AND. 1S(01-3,1).LE.0.69.AND.S(01+1,8).GT.1.47.AND.S(01+2,2).GT.1.19.AND. 2S(01+3.8).LT.1.08.AND.HELLO.AND.LE.EQ.(01+3)) GO TO 1 0 7  C C 50 C C  C C  C C  IF (HELLO.AND.LE.EO.(01-3).AND.S(01+1,8).LT.1.08.AND.S(01+2,8).LT. 1 1.08.AND.S(01+3,8).LT. 1 .08.AND.S(01+4,8).GT. 1.08.AND.S(01+4, 1 ) . GT 2 . 1 .08.AND.S(01 + 5, 1 ) . LT . S(01 + 4, 1)) GO TO 104 ***-|7+** IF (HELLO.AND.S(01+3,8).LT.1.08.AND.S(01+4,8).LT.1.08.AND.S(01+5,8 1 ).GT.1.08.AND.S(01+5,1).GT.1.08.AND.S(01+6,8).LT.1.08.AND.S(01+5. 2 1).GT.S( 01+4,1).AND.LE.EO.(01-1)) GO TO 105  " C C  ***-|2*** IF (HELLO.AND.LE.EQ.(01+1).AND.S(01+4 ,8) .LE. 1 .08.AND.S(J1+ 5.8 ) . LT 1 . 1,08.AND.S(01+6,8).GT . 1.47) GO TO 106  *** 18 *** IF ((01+7).GT.NN) GO TO 80 . IF (HELLO.AND.S(01+4,8).LT.1.08.AND.5(01+5,8).LT.1.08.AND.S(01+6, 1 8) . LT. 1.08.AND.S(01+6, 1).LE.0.69.AND.S(01+7,8).GE. 1.08.AND.LE.EO. 
2 (01+1)) GO TO 107 80  *** 19 *** IF (HELLO.AND.LE.EO.01.AND.5(01+3,8).LT.1.08.AND.S(01+4.8).LT.1.08 1 .AND,S(01+5.1).GT.1.16.AND.S(01+6.8).LT.1.08.AND.(01+7).GE.02) 2 GO TO 105 +** 20 *** IF ((01+8).GT.NN.OR.(01-1).LE.0) GO TO 90 IF (HELLO.AND.LE.EO.01.AND.S(01+2,5).GT.0.74.AND.S(01+3,5).GT.1.52 1 .AND.S(01 + 1,5).GT.0.98.AND.S(01+4,2).GT. 1 .47.AND.S(01+5,2 ) .GT . 1 .6 2 O.AND.5(01+6,8).LT.0.58.AND.S(01+7, 1) . GT. 1 . 13.AND.S(01+7,8).GT.S( 301+8,8).AND.S(01+8, 1).GT. 1.01 .AND.S(01 - 1,8).LT. 1 .08.AND.5(01-1,1). 4 LE.0.69) GO TO 107  00  301 302 303 304 305 306 307 308 309 310 31 1 312 313 314 315 3 16 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 34 1 342 343 344 345 346 347 348 349 350  C C  *#*  90  C 100  C C C C C  IF IF IF IF IF IF IF  (I.EO.O) (I.EQ.1) (I.EQ.2) (I.EQ.3) (I.EQ.4) (I.EQ.5) (I.EQ.6)  GO GO GO GO GO GO GO  TO TO TO TO TO TO TO  200 200 200 200 200 200 210  MOVE OF N-BOUNDARY AS A CONSEQUENCE OF STRONG B-TURN POTENTIAL IN THE VICINITY OF THE PREDICTED HELIX 101 102 103 104 105 106 107 1 12  C  J1=J1+1 GO TO 110 J1=J1+2 GO TO 110 J1=J1+3 GO TO 110 J1=J1+4 GO TO 110 J1=J1+5 GO TO 110 J1=J1+6 GO TO 110 J1=J1+7 GO TO 110 J1=J1+12 GO TO 110  1 10  V2 = 80 GO TO 300  200  1=1+1 LE=LE+1 GO TO 20  C  C C C C C C  21 * * * IF ((J1+13).GT.NN) GO TO 100 IF (HELLO.AND.S(J1+2,8).LT.1.08.AND.LE.EQ.(J1-2).AND.S(J1+3,2) 1 0.93.AND.S(J1+4,2).GT.1.38.AND.S(J1+5,2).GT.1.19.AND.M(J1+6). 2 .AND.S(J1+7,8).LT . 1.08.AND.S(J1+8,2).GT. 1 .38.AND.S(J1+9,8).LT 3 8.AND.M(J1+9).EO.M(J1+10).AND.M(J1+11).EQ.M(J1+9).AND.S(J1+12 4 GT.0.81.AND.S(J1+13,8).GT.2.02) GO TO 112  .. 
B-TURN PROBLEMS OR OTHER PROBLEMS ADJUSTMENT OF N-BOUNDARY MAY ALSO BE CAUSED BY EITHER RANDOM COIL OR B-SHEET POTENTIAL OR BY THE LOW BOUNDARY CONFORMATIONAL PARAME  00 °^  351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400  C C C C  C C  C C  TER OF THE CURRENT BOUNDARY RESIDUE *** 12 *** 210 BALL=.FALSE. IF ( ( J 1 + 7).GT.NN) GO TO 230 LC=0 JN=J1+6 DO 215 L = J1.JN IF (S(L,2).LE.0.75) LC=LC+1 215 CONTINUE JN=J1+4 JM=J1+1 T1=0 T2=0 T5=0 DO 218 L=JM,JN T1=T1+S(L,1) T2=T2+S(L.2) T5=T5+S(L,5) 218 CONTINUE PRINT 220.T1,T2,T5,LC 220 FORMAT(' ' ,SOX, 'T1,T2,T5' ,3(F7 . 3),2X, 'LC: ' ,13, ' STEP 12 ,M0J1, 1 B-TURN PROBLEM') IF (LC.GT.2.AND.T5.GT.T1.AND.(T2-T5).LT.0.500.AND.S( J1+5,8).LT. 1 .0 1 8.AND.S(J1+6.8).GT. 1 .47.AND.S(J1+7.8).LT.2.01.AND.S(J1, 1 ) . LT. 1 . 21 2 .AND.S(J1+5,1).LT.1.21.AND.S(J1-1.6).LT.1.01) BALL=.TRUE. IF (BALL) J1=J1+6 IF (BALL) V2=12 IF (BALL) GO TO 300 *** 13 *** BALL=.FALSE. IF (J1.LE.K3.AND.(P(J1+1,1)*P(J1+2,2)*P(J1+3,3)*P(J1+4,4)).GT.0.00 1 0100.AND.P(J1 + 1, 1).GT.0. 120.AND.P(J1+2,2).GT.O. 139.AND.S(J1 ,5 ) . 2 GT.0.96.AND.S(J1+4,2).GT.1.19.AND.S(J1+4,8).LT.1.08.AND.S(J1+5,2) 3 .GT . 1 .47.AND . S(J1+5,8).LT.S(J1+6,8).AND.S(J1+7,8).LT.S ( J1+ 6,8).AN 4 D.S(J1+6.1).GT.0.69) BALL=.TRUE. IF (BALL) J1=J1+6 IF (BALL) V2=13 IF (BALL) GO TO 300 *** 14 *** 230 BALL=.FALSE. IF ((J1-2).LE.0) GO TO 250 T1=0 T2=0 T5=0 T1=S(J1-2, 1 ) + S(J1-1 , 1 ) + S(J1 . 1 ) + S(J1 + 1, 1 ) T2=S(J1-2,2)+S(J1-1,2)+S(J1.2)+S(J1+1,2)  00 —1  401 402 403 404 405 406 407 408 409 410 411 412 413 4 14 415 416 417 418 419 4 20 421 422 423 4 24 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450  235  C C  C C  C C  C C  C  T5 = S(d1-2,5)+S(d1-1,5) + S(d1,5) + S(d1 + 1 . 
5) PRINT 235.T1.T2.T5 FORMAT(' ' , 30X, 'T1,T2,T5' ,3(F 7.3), ' STEP 14,M0d1 B-T PROBL.') IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J1+1,8).GT.1.47.AND.S(J1+2,8).LT. 11 .08.AND.S(d1 + 3,8).LT,S(d1 + 1 , 8)) BALL= . TRUE. IF (BALL) J1=J1+1 IF (BALL) V2=14 IF (BALL) GO TO 300  *** 15 *** BALL=.FALSE. IF (S(J1,2).GT.1.37.AND.S(J1+1,2).GT.1.47.AND.S(d1+2,2).GT.1.47.AN 1D.S(d1+3,1).LE.0.69.AND.S(J1+3,8).LT.1.08.AND.S(J1+4,8).GT.1.08.A 2 ND.S(J1+4, 1).GT. 1 . 16.AND.S(J1- 1,2) .GT. 1 . 10.AND.S(J1-2,8) .LT. 1 .08. 3AND.S(d1+5, 1 ) . LT . S(d1 + 4, 1)) BALL=.TRUE. IF (BALL) d1=d1+4 IF (BALL) V2=15 . IF (BALL) GO TO 300 ** +  ++ BALL=.FALSE. IF ((U1+7).GT.NN) GO TO 240 IF ( ( P ( J 1 - 2 , 1)*P(d1- 1 ,2)*P(d1,3)*P(U1 + 1,4)).GT.0.000075.AND.S(d1 + 1 8,8).GE.1.08.AND,(P(d1+4.1)*P(d1+5,2)*P(d1+6,3)*P(d1+7,4)).GT. 2 0.000075) BALL=.TRUE. IF (BALL) d1=d1+8 IF (BALL) V2=16 IF (BALL) GO TO 300  +++17+++ 240 BALL=.FALSE. IF ((U1-3).LE.0) GO TO 250 IF(S(J1 , 1).LE.0.69.AND.M(J1).NE. 15.AND.S(J1-1, 1 ) .LE.O.69.AND.M(d1 1 1).NE. 15.AND.(S(d1-2, 1 ) .LE.0.69.OR.S(J1-2.8).LT. 1.08).AND.S(d1-3, 22).GT.0.93.AND.S(d1+1,8).GE. 1 . 08 .AND.S(d1+2,8).LT. 1 .08.AND.S(d1 + 3, 3 D.GT.1.01) BALL= . TRUE. IF (BALL) d1=d1+1 IF (BALL) V2=17 IF (BALL) GO TO 300 *** 18 *** BALL=.FALSE. IF (S(d1,8) .LT. 1 .08.AND.(P(d1-3,1)*P(d1-2,2)+P(d1-1,3)*P(d1,4)). 1GT.0.000075.AND.S(d1+1, 1 ) .GT. 1 . 1 1 .AND.S(d1+2, 1).GE. 1. 13.AND.S(d1 + 3 2 ,1).GT.0.69.AND.S(d1+4,1).GT.1.21.AND.(S(d1+5,1).GT.1.21.OR.S(d1+ 3 5,2).LE.0.75)) BALL=.TRUE. IF (BALL) d1=d1+1 IF (BALL) V2=18 IF (BALL) GO TO 300  45 1 452 453 454 455 456 457 458 459 460 46 1 462 463 464 465 466 467 468 469 470 47 1 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500  C  C C  C C  C C  *  c  1g *** BALL=.FALSE . IF (S(J1,8) . LT. 1 .08.AND.S(d1+1, 1).LE.0.69.AND.M(d1 + 1).NE. 15.AND.S( 1 d1+2,8).GE . 1 .08.AND.S(d1+3,8).LT. 
1.08.AND.S(d1+ 4,8).LE.S(d1 + 2,8) 2 .AND.(P(d1-3,1)*P(d1-2,2)*P(d1-1,3)*P(d1,4)).GE.0.000075) BALL 3 =.TRUE. IF (BALL) d1=d1+2 IF (BALL) V2=19 IF (BALL) GO TO 300  * * * 20 * * *  BALL=.FALSE. IF ((d1-4).LE.0) GO TO 250 IF (S(d1,8).LT.1.08.AND,S(d1,1).LE.0.69.AND.S(d1+1,1).GT.1.08.AND. 1 S(d1 + 2,8) .LT . 1.08.AND.S(d1 + 2, 1).GT.0.69.AND.S(d1+ 3,8).LT.2.01 .AND 2 .(P(d1-4,1)*P(d1-3,2)*P(d1-2,3)*P(d1-1,4)).GT.O.000075) BALL= 3 .TRUE. IF (BALL) d1=d1+1 IF (BALL) V2=20 IF (BALL) GO TO 300 * * * 21 * * *  BALL=.FALSE . IF (S(d1,8).LT. 1 .08.AND.S(d1 + 1,8).LT . 1.08.AND.(S(d1+2,8).GT. 1 . 47 . 1 OR.S(d1 + 2,8 ) .GT.S(d1+3,8)).AND.S(d1 + 3,8).LT.1.47.AND,S(d1 + 4,8).LT 2 .1.47.AND,(P(d1-4,1)*P(d1-3,2)+P(d1-2,3)*P(d1-1,4)).GE.0.000075.A 3 ND.S(d1+1,8).LT.0.73.AND.S(d1.2).GT.1.19.AND.S(d1+1,2).GT.1.47) 4 BALL=.TRUE. IF (BALL) d1=d1+2 IF (BALL) V2=21 IF (BALL) GO TO 300 * * * 22 * * *  245  C  * H<  BALL=.FALSE. T1=0 T2=0 T5=0 T1=S(d1-4,1)+S(d1-3,1)+S(d1-2,1)+S(d1-1,1) T2=S(d1-4,2)+S(d1-3,2)+S(d1-2,2)+S(d1-1,2) T5=S(d1-4,5)+S(d1-3,5)+S(d1-2,5)+S(d1-1,5) PRINT 245,T1,T2,T5 FORMAT(•' ' ,30X, 'T1 ,T2,T5' ,3(F7.3), ' STEP 22,M0d1 B-T PROBL.') IF (T5.GT.T1 . AND.T5.GT,T2.AND.S(d1,8).LT. 1.08.AND.S(d1, 1 ) .LT . 1 .01 . 1 AND.S(d1 + 1,8) . LT . 1.08.AND.S(d1 + 2,8).GT.1.08.AND.S(d1+3,8).LE.S(d1 2 +2,8)) BALL=.TRUE. IF (BALL) d1=d1+2 IF (BALL) V2=22 IF (BALL) GO TO 300  * * * 23 * * *  501 502 503 504 505 506 507 508 509 510  00 VD  511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 54 1 542 543 544 545 546 547 548 549 550  BALL=.FALSE. IF (S(d1, 1 ) . LT . 1 .06.AND.S(d1 - 1.2).GT. 1 .38.AND.S(d1-2,2).GT. 1.60.AN 1 D.S(d1-4,2).GT.1.38.AND.S(d1+1,2).GT.1.19.AND. S ( J1 + 1 , 8 ) . LT . 0 . 66 . A 2 ND.S(d1+2,8).GT.0.81.AND.S(d1+2,1).GT.0.77.AND.S(d1+3,8).LT.S(d1+ 3 2,8) .AND.S(d1 + 3, 1 ) .GT. 1 . 13) BALL=.TRUE. 
IF (BALL) d1=d1+2 IF (BALL) V2 = 23 IF (BALL) GO TO 300 C C  C C  '  C C  '  * * * 24 * * * BALL=.FALSE. IF ((J1-5).LE.0) GO TO 250 IF (S(d1,8).LT.1.08.AND.S(d1+2,8).GT.1.08.AND.S(d1+1,8).LT.S(d1+2, 1 8 ) . AND.S(d1- 1 ,8) . LT. 1.08.AND.(S(d1+3,8).LE.S(d1+2,8).0R.S(d1 + 3,8) 2 .LT.2.01) .AND.S(d1+2,1).GT.1.01.AND.(P(d1-5,1)*P(d1-4,2)*P(d1-3. 3 3)*P(d1-2,4)).GE.0.000075) BALL=.TRUE. IF (BALL) J1=d1+2 IF (BALL) V2=24 IF (BALL) GO TO 300 **+25*** BALL=.FALSE. IF ((d1-6).LE.O) GO TO 250 IF (S(J1 ,8) . LT . 1.08.AND.S(d1+4,8).GT. 1 .47.AND.S(d1-1, 1).LE.0.69 1 .AND.M(d1-1).NE.15.AND.S(d1-2,8).LE.S(d1+4,8).AND.S(d1+1,8).LT. 2 1.08.AND.S(d1+2,8).LT.1.08.AND.S(d1+2,1).LE.0.69.AND.S(d1+3,8).LT 3 .1.08.AND.(P(d1-6,1)*P(d1-5,2)*P(d1-4,3)*P(d1-3,4)).GT.0.000100) 4 BALL=.TRUE. IF (BALL) d1=d1+4 IF (BALL) V2 = 25 IF (BALL) GO TO 300 *** 26 *** 250 IF ((d1-1).LE.0) GO TO 260 BALL=.FALSE. T1=0 T2=0 T5=0 T1=S(J1-1,1)+S(d1,1)+S(d1+1,1)+S(d1+2,1) T2=S(d1-1,2)+S(d1,2)+S(d1+1,2)+S(d1+2.2) T5=S(d1-1,5)+S(d1,5)+S(d1+1.5)+S(d1+2.5) PRINT 255,T1,T2,T5 255 FORMAT(' ' ,30X, ' T 1 , T2 , T5' , 3(F7.3 ) , ' STEP 26,M0d1 B-T PROBL. ' ) IF (T5.GT.T 1 .AND.T5.GT.T2.AND.S(d1+3,8).GE. 1 .08.AND.S(d1+4,8) .LT. 1 1 .47.AND.S(d 1+5,8).LT. 1 .47.AND.((d1+6).GE.d2.OR.S(d1+6,8).LT.S(d1 2 + 3,8)).AND.S(d1,5).GT. 1 . 19.AND.S(d1+1,5).GT.0.74.AND.S(d1 + 1. 1 ) .LT . 3 1.21.AND.S(d1+3,1).GT.S(d1,1).AND.S(d1+2,2).GT.1.19.AND.S(d1+3,2) 4 .GT. 1.05.AND.S(d1+4,2).GT. 1 .05.AND.S(d1+5,2).GT.0.75.AND.S(d1 + 1 , 5 2).GT.0.89) BALL=.TRUE. IF (BALL) d1=d1+3  o  551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 57 1 572 573 574 575 576 577  End of F i l e  IF IF C C  C C C c c c  *** 260  (BALL) (BALL)  V2 = 26 GO TO 300  27 * * * BALL=.FALSE. IF ( S ( J 1 + 4 , 8 ) . G T . 1 . 4 7 . A N D . S ( J 1 + 5 , 8 ) . L E . S ( d 1 + 4 , 8 ) . A N D . S ( d 1 + 6 , 8 ) . L E . 1 S(J1+4,8).AND.(P(d1+1,1)*P(d1+2,2)*P(d1+3,3)*P(d1+4,4)).GT.0.0001 2 00.AND.S(d1+3,1 ).LT.S(d1+4 , 1 ) . A N D . 
     3 S(J1+2,8).LT.2.01.AND.S(J1+1,8)
     4 .LT.2.01) BALL=.TRUE.
      IF (BALL) J1=J1+4
      IF (BALL) V2=27
      IF (BALL) GO TO 300
C
C     IF NONE OF THE ABOVE CONDITIONS IS SATISFIED TO CALL THE NEXT
C     SUBROUTINE RMJ1 TO KEEP ON CHECKING FOR POSSIBILITIES OF
C     N-TERMINAL ADJUSTMENT
C
      CALL RMJ1
      RETURN
C
C     N-TERMINAL OF THE PREDICTED HELIX HAS BEEN ADJUSTED TO CALL
C     SUBROUTINE MOJ2 FOR C-TERMINAL ANALYSIS
C
  300 CALL MOJ2
      RETURN
      END
C
C
      SUBROUTINE RMJ1
C
C     RMJ1 = REMAINING OF MOVE OF J1
C
C     PURPOSE
C     TO KEEP ON CHECKING FOR THE BEST POSITION FOR N-BOUNDARY, THIS
C     SUBROUTINE IS A CONTINUATION OF MOJ1
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,W,V3,V4,V5,V6,V7,V8,O
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,20),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,F,H,U,D,W,M,M1,M2,M3,M4,M5,M6,
     1L,I,K,L1,L2,NZ,NY,JA,JB,JC,JD,J1,J2,KM,N1,N2,NN,J,G,K3,V1,V2,V3,V4
     2,V5,V6,V7,O,HELLO,BYE,BALL,MOVE
C
C     REMARKS
C     THE PARAMETERS DESCRIBED IN THE SUBROUTINE MOJ1 STILL KEEP THE
C     SAME DEFINITION IN THIS SUBROUTINE
C     THE DIFFERENT COMMENTS J1=J1+1,J1=J1+5 ... J1=J1-1 INDICATE THE
C     EVENTUAL POSITION OF J1 IF ITS ENVIRONMENT MEETS ONE OF THE
C     CONDITIONS DESCRIBED BELOW. IF NOT J1 WILL STILL REMAIN AT THE
C     SAME POSITION BECAUSE IT APPEARS TO BE THE MOST FAVORABLE ONE
C
C ... J1 = J1+7
C
C     *** 28 ***
      BALL=.FALSE.
      IF ((J1+8).GT.NN) GO TO 20
      IF (J1.LE.K3.AND.(P(J1+1,1)*P(J1+2,2)*P(J1+3,3)*P(J1+4,4)).GT.0.00
     1 007500.AND.S(J1+3,5).
     2 GT.1.43.AND.S(J1+4,5).GT.1.19.AND.S(J1+5,8)
     3 .LT.0.66.AND.S(J1+6,8).LT.1.08.AND.S(J1+7,8).GT.0.71.AND.S(J1+7,1
     4 ).GT.0.69.AND.S(J1+8,8).LT.S(J1+7,8).AND.S(J1+8,1).GT.1.01.AND.
     5 S(J1+6,1).LE.0.69) BALL=.TRUE.
      IF (BALL) J1=J1+7
      IF (BALL) V2=28
      IF (BALL) GO TO 300
C
C ... J1 = J1+5
C
C     *** 29 ***
   20 BALL=.FALSE.
      IF ((J1-2).LE.0) GO TO 300
      T1=0
      T2=0
      T5=0
      T1=S(J1-2,1)+S(J1-1,1)+S(J1,1)+S(J1+1,1)
      T2=S(J1-2,2)+S(J1-1,2)+S(J1,2)+S(J1+1,2)
      T5=S(J1-2,5)+S(J1-1,5)+S(J1,5)+S(J1+1,5)
      PRINT 25,T1,T2,T5
   25 FORMAT(' ',30X,'T1,T2,T5',3(F7.3),'  STEP 29, J1+5  , RMJ1')
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J1+1,8).LT.1.08.AND.S(J1+2,8).LT.
     11.08.AND.S(J1+3,8).LT.1.08.AND.S(J1+4,8).LT.1.08.AND.S(J1+5,8).GE.
     2 1.08.AND.S(J1+4,1).LT.1.06.AND.S(J1+3,1).LE.0.69.AND.S(J1+1,1).LT
     3 .1.06) BALL=.TRUE.
      IF (BALL) J1=J1+5
      IF (BALL) V2=29
      IF (BALL) GO TO 300
C
C ... J1 = J1+4
C
C     *** 30 ***
      BALL=.FALSE.
      IF ((J1-3).LE.0) GO TO 300
      IF (S(J1,8).LE.1.08.AND.S(J1+1,8).LT.1.08.AND.S(J1+2,8).LT.1.08.AN
     1 D.S(J1+4,8).GT.1.08.AND.S(J1-1,8).LT.1.08.AND.S(J1-2,8).LT.1.08
     2 .AND.(S(J1-3,8).LT.1.08.OR.S(J1-3,1).LE.0.69).AND.S(J1+3,8).LE.
     3 1.08.AND.(S(J1+4,8).GT.1.29.OR.(S(J1+4,8)-S(J1,8)).GT.0.65)) BALL
     4 =.TRUE.
      IF (BALL) J1=J1+4
      IF (BALL) V2=30
      IF (BALL) GO TO 300
C
C     *** 31 ***
      BALL=.FALSE.
      IF (S(J1,8).LE.1.08.AND.S(J1+4,8).GT.1.47.AND.S(J1+1,8).LT.1.08.AN
     1D.S(J1+2,8).LT.2.01.AND.S(J1+3,8).LT.2.01.AND.S(J1-1,8).LT.1.08
     2 .AND.(S(J1-2,8).LT.1.08.OR.S(J1-2,8).LT.S(J1+4,8)).AND.S(J1-3,8)
TRUE ., IF (BALL) 01=01+4 IF (BALL) V2=31 IF (BALL) GO TO 300  C C C . .. 01 = 01-5 C C C *** 32 *** IF ((01-6).LE.0) GO TO 30 BALL=.FALSE. AND.M(01 -2).EO.7.AND.S(J1-1,8 IF (S(01 ,8) .LT. 1 .08.AND.M(01 + 1).EO. 4. M(01-3) .E0.4.AND.S(01-4, 1).GT 1 ) . LT. 1.08.AND.S(01- 1, 1 ) . GT. 1 .01 .AND. 2.1.01.AND.S(01-4.8).LT.S(01-5,8).AND. S(01 -5,8).GE. 1.08.AND.S(01-6, 3 6) .GE. 1 . 22) BALL=.TRUE. IF (BALL) 01=01-5 IF (BALL) V2=32 IF (BALL) GO TO 300 C *** 33 *++ C BALL=.FALSE . IF (S(01.8).LT . 1 .08.AND.S(01 , 1 ) .GT. 01.AND.M(J1-5).EO.1.AND.S(01+ 1. 1 1,8).GE.1.08.AND.S(01+2.8).GE.1.08.AND.S(01+3,8).GE.1.08.AND.M(01 2 - 1).EO. 18.AND.S(01-2,8).GE.0.81.AND. S(01-3,1).GT.1.O1.AND.S(01-4, .S(01-4.8).LT.1.08.AND.S(01-6 3 1).GT. 1 .01 .AND.S(01-3,8 ) .LT. 1.08.AND 4 ,6).GT.1.04) BALL=.TRUE. IF (BALL) 01=01-5 IF (BALL) V2=33 IF (BALL) GO TO 300 C C . .. 01 = 01-4 C C * + * 34 * * * C 30 IF ((01-5).LE.0) GO TO 40 BALL=.FALSE. IF (S(J1,8) . LT . 1 .08.AND.S(01. 1).GT.01.AND.S(01-4,8).GT.1.47.AND. 1. 1 S(01-3,8).GE.2.01.AND.S(01- 1 ,8) . LT. .08.AND.S(J1-2.8).LT.2.01.AND 1 1.08.AND.S(01+1,8).LT.1.08. 2 .S(J1-2.1).GT.1.01.AND,S(01-5,8).LT. 3 AND.S(J1+2,8).LT.1.08.AND.S(01+2,1).GT.1.01.AND.S(01+3,8).LT.1.08 4 .AND.S(01 + 3 . 1).GT . 1 .01 .AND.S(01 + 1..GT. 1) 1.01) BALL=.TRUE . IF (BALL) 01=01-4 IF (BALL) V2=34 IF (BALL) GO TO 300 C *** 35 *** C BALL=.FALSE . IF (S(01,8) .LT. 1 .08.AND.S(01-4,8).GT.1.29.AND.S(01-5,6).GT.1.10. 1 AND.S(01-3,8).LT.1.08.AND.S(01-2,8). LT.1.08.AND.5(01-1.8).LT.1.08  151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 17 1 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200  2 .AND.S(d1-2,1).GT.1.01.AND.S(J1-3.1).GT.1.01.AND.S(d1+1.8).LT. 3 1.08.AND.S(d1+1,1).GT.1.01.AND.S(d1+2,8).GE.2.02.AND.S(d1+3,8).GT 4 .1.08.AND.S(d1+4,8).GT.1.08) BALL= . TRUE . IF (BALL) d1=d1-4 IF (BALL) V2 = 35 IF (BALL) GO TO 300 C C  +** 36 +** BALL=.FALSE. IF ( S(d1,8).LT. 
1 .08 .AND.S(d 1 , 1).GT. 1 .01.AND.S(d1 + 1,8).LT . 1 .08.AND. 1 S(d1 + 2,8).LE.S(d1-4,8).AND.S(d 1-4,8 ) .GT. 1.47.AND.S(J1-5,8).GT. 1 . 4 2 7.AND.S(d1-1, 1).GT. 1 .01.AND.S(d1-2, 1 ) .GT.0.69.AND.S(d1 -3, 1).GT. 1 . 3 01.AND.S(d1-1,8).LT.2.01.AND.S(d1-2,8).LT.2.01.AND.S(d1-3,8).LT. 4 2.01) BALL=.TRUE. IF (BALL) V2=36 IF (BALL) d1=d1-4 IF (BALL) GO TO 300  C . .. d1 = d1+3 C C * * * 37 * * * C 40 BALL=.FALSE. IF ((d1-3).LE.O) GO TO 300 IF (S(d1,1).LE.0.69.AND.S(d1,8).LT. 1 .08.AND.S(d1 - 1,8) .LT. 1 .08.AND. 1 S(d1-2,8).LT.1.08.AND.S(d1-3,8).LT.1.08.AND.S(d1+1,8).LT.1.08.AND 2 . S(d1+2,8).LT. 1.08.AND.S(d1+3,8).GE. 1.08.AND.S(d1+3, 1 ) .GT . 1.01) 3 BALL=.TRUE. IF (BALL) d1=d1+3 IF (BALL) V2=37 IF (BALL) GO TO 300 C *** 3g *** C BALL=.FALSE . IF (S(d1,8).LT. 1 .08.AND.S(d1 + 1,8).LT. 1.08.AND.S(d1 - 1,8).LT. 1 .08.AN 1D.S(d1-2,8).LT.1.08.AND.S(d1 -3,8).LT,1.08.AND.S(d1+3,8).GT.S(d1+2, 2 8).AND.S(d1+2.1).LT.S(d1+3,D) BALL=.TRUE. IF (BALL) d1=d1+3 IF (BALL) V2=38 IF (BALL) GO TO 300  c c  +**  3g ***  IF ((d1-4).LE.0) GO TO 50 BALL=. FALSE. IF (S(d1,8).LT.1.47.AND.S(d1+1,8).LT.1.08.AND.S(d1+1,1).LE.0.69.AN 1 D.S(d1 + 2,6).GT.1. 10.AND.S(d1+3,8).GT. 1.47.AND.S(d1 - 1,8) .LT. 1 .08.A 2 ND.S(d1-3,8).LT.1.08.AND.S(d1-2,8).LT.1.08.AND.S(d1-4,8).LT.1.08) 3 BALL=.TRUE. IF (BALL) d1=d1+3 IF (BALL) V2=39  201 202 203 204 205 206 207 208 209 210 21 1 212 213 214 215 216 217 2 18 2 19 220 22 1 222 223 224 225 226 227 228 229 230 23 1 232 233 234 235 236 237 238 239 240 24 1 242 243 244 245 246 247 248 249 250  IF C C  ***  C c  c c  GO TO 300  40 * * * BALL=.FALSE. IF (S( J1 ,8) . LT . 1 . 0 8 . A N D . S ( d 1 , 1 ) . LE .0 . 69 . AND . S (<J 1 - 1 , 1 ) . LE . 0 . 6 9 .AND 1 .M(J1 - 1 ) . N E . 1 5 . A N D . S ( J 1 - 2 , 8 ) . L T . 1 . 0 8 . A N D . S ( J 1 - 3 , 8 ) . L T . S ( d 1 + 3 , 8 ) . A 2 ND.S(J1-4,8).LT.1.08.AND.S(J1+1,8).LT.1.08.AND.S(d1+2,8).LT.1.08 3 . A N D . S ( J 1 + 3 , 8 ) . G T . 1 .08) BALL=.TRUE . 
IF ( B A L L ) d1=d1+3 IF ( B A L L ) V2=40 IF ( B A L L ) GO TO 300  *++ 41 *** BALL=.FALSE . IF ( S ( J 1 , 8 ) . G E . 1 . 0 8 . A N D . S ( J 1 + 3 , 8 ) . G T . S ( d 1 , 8 ) . A N D . S ( d 1 + 1 , 1 ) . L E . 0 . 6 9 1 .AND.M(d1 + 1 ) . N E . 15.AND.S(d1 + 2 , 8 ) . L T . 1 . 0 8 . A N D . S ( d 1 - 1 , 8 ) . L T . 1 . 0 8 . A N 2 D.S(d1-2,8).LT.1.08.AND.S(d1-3,8).LT.1.08.AND.(d1-4).LT.(K3-2)) 3 BALL=.TRUE. IF ( B A L L ) V2=41 IF ( B A L L ) d1=d1+3 IF ( B A L L ) GO TO 300  c c  c c c c c  (BALL)  * * + 42 * * * BALL=.FALSE. IF (S(d1+3,8).GT.1.47.AND.S(d1,8).LT.1.47.AND.S(d1+1,8).LT.1.08.AN 1 D.S(d1 + 2 , 8 ) . L T . S ( d 1 + 3 , 8 ) . A N D . S ( d 1 + 2, 1 ) . LT . 1 . 16 . A N D . S ( d 1 - 1 ,8 ) . LT . 2 1.08.AND.S(d1-2,8).LT.1.08.AND.S(d1-3,8).LT.1.08.AND.S(d1-4,8).LT 3 ' .1.08) BALL=.TRUE. IF ( B A L L ) d1=d1+3 IF ( B A L L ) V2=42 IF ( B A L L ) GO TO 300 . ..  J1 = d 1 - 3  * + * 43 * + * BALL=.FALSE . IF ( S ( d 1 , 8 ) . G T . 1 . 4 7 . A N D . S ( d 1 - 3 . 8 ) . G E . S ( d 1 , 8 ) . A N D . S ( d 1 - 1 , 8 ) . L T . 1.08 1 . A N D . S ( d 1 - 2 , 1 ) . G T . 1.01.AND.S(d1 - 2 , 8 ) . L T . S ( d 1 - 3 , 8 ) . A N D . S ( d 1 - 4 , 8 ) . 2 LT . 1 . 0 8 . A N D . S ( d 1 + 2 , 8 ) . L T . 1 . 0 8 . A N D . S ( d 1 + 1 , 8 ) . L T . 1 . 0 8 . A N D . S ( d 1 + 3 , 8 ) 3 .GT.1.47) BALL=.TRUE . IF ( B A L L ) d1=d1-3 IF ( B A L L ) V2=43 IF ( B A L L ) GO TO 300 * * * 44 * * * BALL=.FALSE. IF (S(d1,8).LT.1.08.AND,S(d1+1.8).LT.1.08.AND.S(d1+2,8).GT.1.29. 1 AND.S(d1-1,8).GE.0.81.AND,S(d1-2,8).GE.1.08.AND.(S(d1-2,1).GT.1.0 2 1.0R.S(d1-2.8).GT.2.01) .AND.S(d1-3,8).GT.1.08.AND.S(d1-4.8).LE.  i£> CTi  251 3 S(J1-3,8)) BALL=.TRUE. 252 IF (BALL) d1=d1-3 253 IF (BALL) V2=44 254 IF (BALL) GO TO 300 255 C 256 C *** 45 *** 257 BALL=.FALSE. 258 IF (S(<J1 ,8) . LT . 1 .08 . AND.S(d1 -3,8) . GE . 1 . 47 . AND . S ( J 1 - 1 , 8 ) . LT . 1 . 08 . AN 259 1 D . S(d1-2,8).LT. 1.08.AND.S(d1-4,8).LT. 1 .08.AND.S(d1+1,8).LT. 1.08 260 2 .AND.S(d1 + 2.8).LE. 
1.08.AND.S(d1+3,8).LT. 1 .08.AND.S(d1, 1).GT. 1 .01 261 3 .AND.S(d1+1, 1).GT. 1 .01 .AND.S(d1+3, 1 ) .GT. 1 .01) BALL=.TRUE. 262 IF (BALL) J1=d1-3 263 IF (BALL) V2 = 45 264 IF (BALL) GO TO 300 265 C 266 C *** 46 *** 267 BALL=.FALSE. 268 IF (M(d1).EO.1.AND.S(d1-3,8).GE.1.08.AND.M(d1-4).EO.8.AND.S(d1-1,8 269 1 ) . LT. 1 .08.AND.S(d1 - 1, 1 ) .GT. 1.01.AND.S(d1 -2,8).LT. 1.08.AND.S(d1-2, 270 2 1).GT.0.69.AND.S(d1+1,8).LT.1.08.AND,S(d1+2,8).LE.S(d1-3,8).AND. 271 3 S(d1+3,8).LT.1.08) BALL=.TRUE. 272 IF (BALL) d1=J1-3 273 IF (BALL) V2=46 274 IF (BALL) GO TO 300 275 C * 2 7 6 C * * * 4 7 * * * 277 BALL=.FALSE. 278 IF (S(d1 ,8) .LT. 1 .08.AND.M(d1 - 1).EO. 15.AND.S(d1 -2,8).LT.S(d1-3,8). 279 1 AND,M(d1-3).EO.1.AND.S(d1-4,8).LT.1.08.AND.S(d1+1,8).LT.1.08.AND. 280 2 M(d1+2).EQ.1.AND.M(d1+4).EQ.1) BALL=.TRUE . • . 281 IF (BALL) J1=d1-3 282 IF (BALL) V2=47 283 IF (BALL) GO TO 300 284 C 285 C *** 48 *** 286 BALL=.FALSE. 287 IF (M(d1).EO.1.AND.S(d1-3,8).GE.1.08.AND.S(d1-4,6).GE.1.22.AND. 288 1 S(d1-1,8).LT.1.08.AND.S(d1-1,1).GT.0.69.AND.S(d1-2,8).LT.1.08.AND 289 2 .S(d1-2,1).GT.1.01.AND.S(d1+1,8).LT.1.08.AND.S(d1+2,8).LT.S(d1-3 290 3 ,8) .AND.S(d1+3,8) .LT. 1.08) BALL=.TRUE. 291 IF (BALL) J1=d1-3 292 IF (BALL) V2=48 293 IF (BALL) GO TO 300 294 C 295 C *** 49 *** 296 BALL=.FALSE. 297 IF (S(d1,8).GE.1.08.AND,S(d1+1,8).GE.1.08.AND.S(d1+2,8).LT.1.08.A 298 1 ND.S(d1 + 3,8).LT. 1.08.AND.M(d1-3).E0.15.AND.S(d1 - 1,8).LT. 1 .08.AND. 299 2 S(J1-1, 1).GT. 1.01.AND.S(d1-2,8).LT. 1.08.AND.S(d1 -2, 1).GT.0.69.AND 300 3 .S(d1-4,8).LT.1.08) BALL=.TRUE.  *  301 302 303 304 305 306 307 308 309 310 31 1 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 34 1 342 343 344 345 346 347 348 349 350  IF (BALL) IF (BALL) IF (BALL)  C C 50  C C  d1=d1-3 V2=49 GO TO 300  * * * 50 * * * BALL=.FALSE. IF ( (J 1 - 3 ) .LE.O) GO TO 300 IF (M(d1).EO.1.AND,S(d1-3,8).GT.1.47.AND.S(d1-1,8).LT.1.08.AND.S 1 (d1-2,8) .LT. 
1 .08.AND.(M(d1 - 1).EQ.3.OR.M(d1 - 1).EQ. 16.OR.M(d 1 - 1 ) . EQ 2 . 8 ) . AND.S(d1 + 1,8).LT. 1.08.AND.S(d1+2,8).LT. 1.08.AND.S(d1+3,8).GE. 3 S(d1-3,8).AND.S(d1+4,8).GT.1.08) BALL=.TRUE. IF (BALL) d1=d1-3 IF (BALL) V2=50 IF (BALL) GO TO 300 *** 5 1 *+*  IF ((d1-5).LE.O) GO TO 60 BALL=.FALSE. IF (S(d1 ,8) .LT. 1 .08.AND.S(d1, 1 ) .GT. 1 .01 .AND.M(d1-3) .EO. 15.AND.S(d1 1 -1,2).LE.0.75.AND.S(d1-5,2).LE.0.75.AND.S(d1+1,2).LE.0.75.AND.S(d 2 1-2,8).LT.2.01.AND.S(d1+1,8).LT.2.01.AND.S(d1+2,8).LT.2.01.AND.S( 3 1+3 , 1 ) .GT. 1 .01 .AND.S(d1+4, 1 ) .GT. 1.08) BALL=.TRUE. IF (BALL) J1=d1-3 IF (BALL) V2=52 IF (BALL) GO TO 300  C C . .. d1 = d1+2 C C * * * 53 *** C 60 BALL=.FALSE. IF ((d1-3).LE.O) GO TO 300 IF (S(d1,8).LT. 1 .08.AND.S(d1 , 1 ) . LT . 1 .06.AND.S(d1+2,8) .GT. 1 .47.AND. 1 S(d1 + 1,8).LT. 1.08.AND.S(d1 - 1, 1).LE.0.69.AND.M(d1-1).NE.15.AND.S( 2 d1-2,8).LT.1 .08.AND.S(d1 -3,8).LT. 1.08.AND.S(d1-2, 1 ) . LT. 1.06.AND.S 3(d1-3, 1).LT. 1.06) BALL=.TRUE . IF (BALL) d1=d1+2 IF (BALL) V2=53 IF (BALL) GO TO 300 C * * * 54 * * * C BALL=.FALSE . IF (S(d1,1).LE.0.69.AND.M(d).NE.15.AND.S(d1+1.8).LT.1.08.AND.S(d1+ 1 2,8).GT. 1 .47.AND.S(d1-1, 1).LE.0.69.AND.M(d1 -1).NE. 15.AND.S(d1-2, .2.8).LT.S(d1 + 2,8).AND.S(d1- 3,8) .LT. 1 .08) BALL=.TRUE. IF (BALL) d1=d1+2 IF (BALL) V2=54 IF (BALL) GO TO 300 C *** 55 *** C  00  351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 38 1 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400  BALL=.FALSE. IF (S(J1,8).LT . 1 .08.AND.S(J1+1, 1).LE.0.69.AND.M(J1 + 1).NE. 15.AND. 1 S(J1+2.8).GT.1.08.AND.S(J1-1,8).LT.1.08.AND.S(J1-2,8).LT.S(J1+2,8 2 ).AND.S(<J 1-3,8).LE.S(J1+2,8).AND.S(d1-4,8),LE.S(d1+2,8)) BALL = 3 .TRUE. IF (BALL) d1=d1+2 IF (BALL) V2 = 55 IF (BALL) GO TO 300  C C  * * * 56 ** +  IF ((J1-4).LE.0) GO TO 70 BALL=.FALSE. IF (S(J1,8).LT.1.08.AND.S(J1,1).LT.1.06.AND.S(J1+1.8).LT.1.08.AND. 
1 S(d1+2,8).GT.1.08.AND.S(d1+2,1).GT.1.01.AND.S(J11 ,8) .LT. 1 .08.AND 2 .S(d1-1,2).GE.1.47.AND,S(J1-2,1).LE.0.69.AND.M(d1-3).EO.1.AND.S(d 3 1-4,2).GE.1.47) BALL=.TRUE. IF (BALL) d1=d1+2 IF (BALL) V2=56 IF (BALL) GO TO 300  C C  ***  70  75  57  ***  BALL=.FALSE . IF ((d1-3).LE.O) GO TO 300 T1=0 T2 = 0 T5=0 T1=S(d1-3.1)+S(d1-2,1)+S(d1-1,1)+S(d1.1) T2=S(d1-3,2)+S(d1-2.2)+S(d1-1,2)+S(d1,2) T5=S(d1-3,5)+S(d1-2.5)+S(d1-1.5)+S(d1.5) PRINT 75,T1,T2,T5 FORMATC ' ,30X, 'T1 ,T2,T5' ,3(F7 . 3) , ' STEP 57, d1 + 2 ,RMd1') IF (T5.GT.T 1 .AND.T5.GT.T2.AND.S(d1 + 1 ,8) .LT. 1 .08.AND.S(d1 + 1 ,2) .GT . 1 1.38.AND.S(d1+2,8).GT.0.8 1.AND.S(d1+3,8).LT.1.08.AND.S(d1+4,8).LT 2 .1.08.AND.S(d1+3,2).GT.1.10.AND.S(d1+4.2).GT.1.10.AND.S(d1,1).LT. 3 1.06.AND.S(d1,5).GT. 1 .43.AND.S(d1-1,5).GT. 1.52.AND.S(d1-2,5).GT. 4 1.52) BALL=.TRUE. IF (BALL) d1=d1+2 IF (BALL) V2=57 IF (BALL) GO TO 300  C C . . . d1 = d1-2 C C * * + 58 *** C BALL= . FALSE. IF (S(d1,8).LE.1.08.AND.S(d1 - 1, 1).GT. 1 .01.AND.S(d1-1.8).LT.S(d1-2, 1 8).AND.S(d1-2,8).GT. 1 .47.AND.S(d1 -3,8).LT.S(d1-2,8).AND.S(d1 + 1.8) 2 .LT. 1 .08.AND,S(d1 + 2,8).LT.1.08.AND.S(d1+3,8).LT. 1 .08) BALL= .TRUE . IF (BALL) d1=d1-2 IF (BALL) V2=58  vo  401 402 403 404 405 40G 407 408 409 410 411 4 12 413 414 4 15 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450  C C  C C  C C  C C  C C  C C  IF (BALL)  GO TO 300  * + * 59 +** BALL=.FALSE. IF (S(Jl,8).LT.1.08.AND,S(J1.1).GT.1.01.AND.S(J1-2.8).GE.S(J1,8) 1 .AND.S(J1-1,1).LT.S(J1-2,1).AND,S(J1+1,8).LT.1.47.AND.S(J1+2,8).L 2T.1.08.AND.(S(d 1 -3, 1).LE.0.69.OR.S(d1-3,8).LT. 1 .08).AND.(S(J1-2,1) 3 .GT. 1. 16.OR.M(J1- 1).NE. 15)) BALL= . TRUE . IF (BALL) J1=J1-2 IF (BALL) V2=59 IF (BALL) GO TO 300 ***  *** BALL=.FALSE. IF (S(J1,8).LT. 1.08.AND.S(J1-1,8).LT. 1.08.AND.S(J1-2,8).GT. 1.22 . 
AN 1D.S(J1+1,8).LT.1.08.AND.S(J1+2,8).GE.1.08.AND.S(J1+3,8).GT.2.01 2 .AND.S(J1-3.8).LT.S(J1-2,8).AND.S(J1-2,1).GT.0.69) BALL=.TRUE. IF (BALL) J1=d1-2 IF (BALL) V2=60 IF (BALL) GO TO 300 6  0  *** g1 * * * BALL=.FALSE. IF (S(J1.8).LT. 1.08.AND.S(J1. 1 ) .GT. 1.01 .AND.S(d1-2,8).GT.2.01 .AND. 1 S( J1-1,8).LT.S(J1-2.8).AND.S(J1-3,8).LE.S(J1-2,8).AND.S(J1+1,8). 2LE.S(J1-2,8).AND.S(J1+2.8).GE.1.08.AND.S(J1+3,8).GE.S(J1-2.8)) 3 BALL=.TRUE. IF (BALL) J1=J1-2 IF (BALL) V2 = 61 IF (BALL) GO TO 300 *** 62 *** BALL=.FALSE. IF(S(J1,8).GT.1.47.AND.S(J1-1,8).GT.1.47.AND.S(J1-2,8).GT.1.47.AN 1 D.S(J1+1,8).GT.1.47.AND.S(J1+2,8).LT.1.08.AND.S(J1+3,8).LT.1.08 2.AND.S(J1-3,8).LT. 1.08) BALL= . TRUE. IF (BALL) J1=J1-2 IF (BALL) V2=62 IF (BALL) GO TO 300 *** 63 *** BALL=.FALSE. IF(S(J1,8).GT.1.47.AND.M(J1-2) .EO. 1 .AND.S(J1 - 1 ,8) .LT. 1 .08.AND.S(J 1 1-3,8).LT. 1 .08.AND.S(J1 + 1,8).LT.S(J1-2,8).AND.S(J1+ 2 , 8).LT.S(d1-2 2 ,8) . AND.S(J1 + 3.8).LE.S(J1-2.8)) BALL=.TRUE. IF (BALL) J1=J1-2 IF (BALL) V2=63 IF (BALL) GO TO 300 *** 64 ***  451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 47 1 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500  BAL'L= . FALSE . IF (M(J1).EO.7.AND.S(d1-2,8).GT.1.47.AND.S(d1-3,1).GT.0.69.AND.S( 1 d1-1,8).LT.1.08.AND.S(d1+1,8).LT.1.08.AND.S(d1+2,1).GT.1.01.AND.S 2 (J1-3.8) .LT. 1 .47) BALL= . TRUE . IF (BALL) d1=d1-2 IF (BALL) V2=64 IF (BALL) GO TO 300  C C . .. J1 = d1+1 C C * * * 65 *** C BALL=.FALSE. "IF (S(d1,8).LE.1.08.AND.S(d1+1,8).GE.1.08.AND.S(d1+1,1).GT.1.01 1.AND.(S(d1+2,8).LT.S(d1 + 1 .8) .OR.S(d1 + 2.8) .LT. 1.47).AND.S(d1 + 3.8) 2.LT. 1 .08.AND.S(d1 - 1,8).LT. 1.08.AND.S(d1-2,8).LE. 1 .08.AND.S(d1-3.8 ) 3.LT. 1 .08.AND.S(d 1 +4,8 ) .LT . 1.08) BALL=.TRUE. IF (BALL) d1=d1+1 IF (BALL) V2=65 IF (BALL) GO TO 300 C * * * 66 *** C BALL=.FALSE. IF (S(d1 ,8) .LT. 1 .08.AND.S(d1 + 1,8).GT. 
1 .47.AND.S(d 1+2,8 ) .LT.S(d1+1 , 1 8).AND.S(d1+3,8).LT.S(d1+1,8).AND.S(d1-1,8).LT.S(d1+1,8).AND.S(d1 2 -2,8) .LT.S(d1 + 1,8).AND.S(d1-3,8) .LT.S(d1 + 1 ,8)) BALL=.TRUE. IF (BALL) d1=d1+1 IF (BALL) V2=66 IF (BALL) GO TO 300 C C C . .. . J1 = d1-1 C C * * * 67 *** C BALL=.FALSE. IF (S(d1,8) .LT . 1 .08.AND.S(d1-1,8).GT. 1.08.AND.S(d1-1, 1).GT. 1. 16.AN 1 D.S(d1-2,8).LT.S(d1-1,8).AND.S(d1+1,8).LT.1.08.AND.S(d1+2.8).LE. 2 S(d1-1,8).AND.S(d1-3,8) .LT. 1 .08) BALL=.TRUE. IF (BALL) d1=d1-1 IF (BALL) V2=67 IF (BALL) GO TO 300 C * * * 68 '*** C BALL= . FALSE . IF (S(d1 ,8).LT . 1 .08.AND.M(d1-1).EO. 15.AND.S(d1+1,8).LT.S(d1-1,8). 1 AND. S(d1+2,8) . LT . S(d1-1 ,8) . AND . S (d 1+3 . 8 ) . LT . S ( d 1 - 1 , 8 ) . AND.S(d1-"2, 2 8).LT.S(d1-1,8).AND,S(d1-3,8) .LT. 1.08.AND.S(d1, 1 ) .GT. 1 .01) BALL 4 =.TRUE. IF (BALL) d1=d1-1  501 IF (BALL) V2 = 68 502 IF (BALL) GO TO 300 503 C 504 C *** 69 *** ' 505 BALL=.FALSE. 506 IF (S(J1.8).GE.1.08.AND. S ( d 1 , 1 ) .GT.1.01.AND.S(d1,8).LT.2.44.AND.S( 507 1 d1 + 1,8).LE.1 .08.AND.S(d1 + 2,8).GE. 1.08.AND.S(d1+ 3,8).LT.S(d1 - 1,8) 508 2 .AND.M(d1-1).EO.15.AND.S(d1-2,1).LE.O.69 .AND.(S(d1-3,1).LE.0.69 509 3 .OR . S(d1-3,8).LT. 1.08)) BALL=.TRUE. 510 IF (BALL) d1=J1-1 511 IF (BALL) V2=69 512 IF (BALL) GO TO 300 513 C 5 1 4 C * * * 7 0 * * * 515 BALL=.FALSE. 516 IF (M(d1).E0.7.AND.M(d1-1).EO. 15 . AND . (S(d1-2, 1 ) .LE.0.69.OR.S(d1-2, 517 1 8).LT. 1.08).AND.(S(d1-3, 1).LE.0.69.OR.S( d 1-3,8).LT. 1 .08).AND.S(d 518 2 1+1,8).GE.1.08.AND.S(d1+2,8).LT.1.08.AND.S(d1+3,8).GE.1.08) BALL 519 3 =.TRUE. 520 IF (BALL) d1=d1-1 521 IF (BALL) V2 = 70 522 IF (BALL) GO TO 300 523 C 524 1-1  O  C  525 526 527 528 529 530 531 532  C  533  534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550  c  C C  '  ***7-|***  BALL=.FALSE. IF (S(d1,8) .LT. 1.08.AND.S(d1 + 1,8).LT. 1 .08.AND.S(d1 + 2,8).LT. 1.08.AN 1 D.S(d1+3,8).LT. 1 .08.AND.S(d1 - 1, 1).GT. 1. 16.AND.S(d1 - 1,8).GE. 1 .08. 2 AND.S(d1-2,8).LT. 1.08.AND.S(d1-3,8).LE.S(d1-1,8)) BALL= . 
TRUE.
      IF (BALL) J1=J1-1
      IF (BALL) V2=71
      IF (BALL) GO TO 300
C
C     *** 72 ***
C
      BALL=.FALSE.
      IF(S(J1,8).GE.1.08.AND.S(J1-1,8).GT.S(J1,8).AND.S(J1+1,8).LT.1.08
     1 .AND.S(J1+2,8).LT.1.08.AND.S(J1+3,8).LT.1.08.AND.S(J1-2,1).LE.
     2 0.69.AND.S(J1-3,8).LE.S(J1-1,8).AND.S(J1,1).GT.1.01) BALL=.TRUE.
      IF (BALL) J1=J1-1
      IF (BALL) V2=72
      IF (BALL) GO TO 300
C
C     *** 73 ***
C
      BALL=.FALSE.
      IF (S(J1,8).GT.1.08.AND.S(J1,1).GT.1.01.AND.S(J1+1,8).LE.1.08.AND.
     1 S(J1+2,8).LT.1.08.AND.S(J1+3,8).LT.S(J1-1,8).AND.M(J1-1).EQ.15.AN
     2 D.(S(J1-2,1).LE.0.69.OR.S(J1-2,8).LT.S(J1-1,8)).AND.(S(J1-3,1).LE
     3 .0.69.OR.S(J1-3,8).LT.1.08)) BALL=.TRUE.
      IF (BALL) J1=J1-1
      IF (BALL) V2=73
      IF (BALL) GO TO 300
C
C     END OF N-BOUNDARY ADJUSTMENT. TO CALL SUBROUTINE MOJ2 FOR
C     C-BOUNDARY ADJUSTMENT
C
  300 CALL MOJ2
      RETURN
      END
C
C
      SUBROUTINE MOJ2
C
C     BOUNDARY MOVE OF THE C-TERMINAL
C
C     PURPOSE
C        TO ADJUST THE C-TERMINAL RESIDUE BASED ON THE BOUNDARY
C        CONFORMATIONAL PARAMETERS AND ON THE POTENTIAL OF TURN OR
C        SHEET OF THE ADJACENT REGIONS
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,W,V3,V4,V5,V6,V7,V8,Q
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,20),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,F,H,U,D,W,M,M1,M2,M3,M4,M5,M6,
     1L,I,K,L1,L2,NZ,NY,JA,JB,JC,JD,J1,J2,KM,N1,N2,NN,J,G,K3,V1,V2,V3,V4
     2,V5,V6,V7,Q,HELLO,BYE,BALL,MOVE
C
C     DESCRIPTION OF PARAMETERS
C        V1 - NUMBER OF BREAKERS IN THE PREDICTED HELIX BEFORE THE
C             BOUNDARY ADJUSTMENT
C        V2 - COUNTER USED IN N-BOUNDARY ADJUSTMENT
C        V3 - COUNTER USED IN C-BOUNDARY ADJUSTMENT
C
C     V2=80 WHEN THE N-TERMINAL ADJUSTMENT IS DUE TO STRONG B-TURN
C     POTENTIAL (THROUGH THE PROCEDURE OF REPEATING THE B-TURN CHECK).
C     IF V2=0 NONE OF THE CONDITIONS LISTED IN THE N-TERMINAL ADJUSTMENT
C     FIT THE CURRENTLY TESTED SEGMENT. IN OTHER WORDS J1 HAS NOT
C     CHANGED
C
      PRINT 1
    1 FORMAT('0',30X,'BOUNDARY ANALYSIS OF THE C-TERMINAL')
      IF (.NOT.BALL.AND.V2.NE.80) V2=0
C
C     .... SITUATION WITH J2 CLOSE TO THE C-BOUNDARY
C
C     TO TAKE INTO ACCOUNT THE POSITION OF J2 WHEN IT IS CLOSE TO THE
C     C-TERMINAL OF THE PROTEIN SINCE THERE IS LESS FREEDOM TO MOVE IT
C     TOWARDS THIS END
C
C     *** 1 ***
C
      BALL=.FALSE.
      IF (J2.EQ.NN.AND.S(J2,9).GT.1.10.AND.S(J2,1).GT.1.01.AND.(S(J2,9)
     1 .GT.S(J2-1,9).OR.S(J2-1,1).LT.S(J2,1))) BALL=.TRUE.
      IF (BALL) J2=J2
      IF (BALL) V3=1
      IF (BALL) GO TO 300
C
C     *** 2 ***
C
      BALL=.FALSE.
      IF ((J2+3).GT.NN) GO TO 20
      IF (S(J2,9).GT.0.98.AND.S(J2,1).GT.0.69.AND.S(J2+1,1).LE.0.69.AND.
     1 S(J2+2,1).LE.0.69.AND.S(J2+3,9).LT.1.57.AND.S(J2-1,1).GT.0.69.AND
     2 .S(J2-2,9).GT.1.10.AND.S(J2-3,1).GT.1.16) BALL=.TRUE.
      IF (BALL) J2=J2
      IF (BALL) V3=2
      IF (BALL) GO TO 300
C
C     *** 3 ***
C
      BALL=.FALSE.
      T1=0
      T2=0
      T1=S(J2,1)+S(J2+1,1)+S(J2+2,1)+S(J2+3,1)
      T2=S(J2,2)+S(J2+1,2)+S(J2+2,2)+S(J2+3,2)
      PRINT 10,T1,T2
   10 FORMAT(' ',30X,'T1,T2 ',2(F7.3),7X,' STEP 3, J2 CLOSE TO 0')
      IF (T2.GT.T1.AND.S(J2,2).GT.1.38.AND.S(J2+2,2).GT.1.38.AND.S(J2+1,
     1 1).LE.0.69.AND.S(J2,9).GT.1.20.AND.S(J2-1,1).GT.1.16.AND.S(J2-2,
     2 2).LE.0.75) BALL=.TRUE.
      IF (BALL) J2=J2
      IF (BALL) V3=3
      IF (BALL) GO TO 300
C
C     *** 4 ***
C
      BALL=.FALSE.
      IF ((J2+4).GT.NN) GO TO 20
      IF (S(J2,9).GT.1.08.AND.S(J2-1,1).GT.1.16.AND.S(J2-2,1).GT.1.16.AN
     1 D.S(J2+1,1).GT.0.77.AND.S(J2+2,1).GT.1.01.AND.S(J2+2,9).GT.S(J2+1
     2,9).AND.S(J2+3,7).GT.1.58.AND.S(J2+4,7).GT.1.58) BALL=.TRUE.
      IF (BALL) J2=J2+2
      IF (BALL) V3=4
      IF (BALL) GO TO 300
C
C     *** 5 ***
C
   20 BALL=.FALSE.
I-O  101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 12 1 122 123 124 125 126 . 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150  IF ((02+ 1 ) .GT.NN) GO TO GO IF ((02+1).EQ.NN.AND.M(02+1).NE.15.AND.V1.LT.((02+1-01)/3).AND.S(0 1 2.9) .GT. 1 . 10.AND.S(J2-1,9).GT. 1. 10) BALL = . TRUE. IF (BALL) 02=02+1 IF (BALL) V3 = 5 IF (BALL) GO TO 300  C C  C C C C C C C C C C  *** g *** BALL=.FALSE. IF ((J2+2).GT.NN) GO TO 60 T1 =0 T2=0 T5=0 TT=0 T 1 = S(02-1 , 1) + S(02, 1 )+S(02+1 . 1)+S(02+2, 1 ) T2=S(02-1,2)+S(02,2)+S(02+1.2)+S(02+2,2) T5=S(J2-1,5)+S(02,5)+S(02+1,5)+S(J2+2,5) TT=P(02-1,1 )*P(J2,2 )*P(02+1,3)*P(02 + 2,4) PRINT 25,T1 ,T2,T5,TT 25 FORMAT( ' ' ,30X, 'T 1 ,T2,T5,TT' ,3(F7.3),F13.9, ' STEP 6, J2 CLOSE O') IF (T5.GT.T1 .AND.T5.GT.T2.AND.TT.GT.0.00007500.AND.S(02-1,9).GT. 1. 1 57 .AND.S(J2-1, 9 ) . GT . S(02-2,9).AND.(S(J2-3, 1 ) . GT. 1 . 16.OR.S(02-3,9) 2.GT.1.20)) BALL=.TRUE. IF (BALL) 02=02-1 IF (BALL) V3 = 6 IF (BALL) GO TO 300  THE DIFFERENT COMMENTS 02 =02,02 =J2+10,...,02=02-4 INDICATE THE EVE NTUAL POSITION OF J2 IF ITS ENVIRONMENT MEETS ONE OF THE CONDITIONS DESCRIBED BELOW :  . . . . 02 = 02-10  30  40  IF ((02-10).LE.0) IF ((02+3).GT.NN) 03=02 04=03+3 BALL=.FALSE. T1=0 T2=0 T5=0 TT=0 DO 40 N=03,04 T1=T1+S(N,1) T2=T2+S(N,2) T5=T5+S(N,5) CONTINUE  GO TO 50 GO TO 60  £ 5^  151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200  45  C C C C C  02 = 02+10  50  C C C  C C C C C C  TT = P(03.1 )*P(03+1,2)*P(03+2,3)*P(03+3,4) PRINT 45,T1,T2,T5,TT FORMAT(' ',30X,'T1,T2,T5,TT',3(F7.3),F13.9,' STEP7, 02-10 , M0J2' IF (T5.GT.T1.AND.T5.GT.T2.AND.TT.GE.0.00007500.AND.S(J3+1,1).LE.0. 1 69.AND.S(03+2,1).LE.0.69.AND.S(J3+3,1).LT.1.06.AND.S(03,1).LT.0. 
2 98.AND.S(03-1,1).LT.0.98) 03=02-7 IF (T5.GT.T1.AND.T5.GT.T2.AND.TT.GE.O.00007500.AND.5(J3+1,1).LE.0. 1 69.AND.S(03+2,1).LE.0.69.AND.S(J3 + 3, 1 ) .LT. 1.06.AND.S(03, 1).LT.0. 2 98.AND.S(J3-1,1).LT.0.98) GO TO 30 IF (T5.GT.T1.AND.T5.GT.T2.AND.TT.GE.O.00007500.AND.03.EO.(02-7).AN 1 O.S(J2-8,8) .,LT . 1 . 10.AND.S(02-8 . 1 ) . LT .0.98 . AND.S(02-9,9) . LT . 1 . 10. A 2ND.S(02-10,9).GT.1.25.AND.S(02-10,1).GT.1.16) BALL =.TRUE. IF (BALL) 02=02-10 IF (BALL) V3 = 7 IF (BALL) GO TO 300  *** 8 +** IF ((02+11).GT.NN) GO TO 60 BALL=. FALSE. IF (M(02+1).EO.16.AND.M(J2).EQ•16.AND.M(J2+3).EQ.16.AND.M(02+8).EO 1 . 16 . AND.(P(J2+8, 1)*P(J2 + 9,2)*P(02 +10,3)*P(02+11,4)).GT.0.000100.A 2 ND.S(02-1, 1).GT, 1 . 16.AND.S(02+2, 1 ) .GT. 1 .01.AND.S(02 + 5, 1).GT. 1 .01 3 .AND.S(02+6.1).GT.1.16.AND.S(02+4.1).GT.0.77.AND.S(02+7,1).GT.0. 4 77.AND.S(02-3,1).GT.1.13.AND.S(02-2,1).GT.1.11) BALL=.TRUE. IF (BALL) 02=02+8 IF (BALL) V3 = 8 IF (BALL) GO TO 300 *** g *** BALL=.FALSE. IF ((02+12).GT.NN) GO TO 60 IF (S(02,9).GT. 1 .57.AND . S(02-2, 1 ) .GT. 1 . 16.AND.S(02+10, 1 ) .GT. 1 . 16 1 .AND.S(02+11,7).GT. 1 .49.AND.S(02+12,7).GT. 1.58.AND.S(02 + 2, 1).GT. 1 2 . 1 6 . AND.S(02 + 3, 1).GT. 1. 16.AND.S(02+6, 1).GT. 1. 16.AND.S(02 + 7, 1 ) .GT. 3 1.21.AND.S(02+8,9).GT. 1.20.AND.S(02 + 8, 1 ) .GT. 1.01.AND.S(02+9,7).GT 4 . 1 . 57.AND.S(02+ 1 ,9).GT.0.98.AND.S(02+4,2).EQ.O.75.AND.S(02 + 5, 1). 5 GT.0.77) BALL=.TRUE. IF (BALL) 02=02+10 IF (BALL) V3 = 9 IF (BALL) GO TO 300 TO REPEAT THE B-TURN CHECK  TO CHECK THE PRESENCE OF TURNS IN THE VICINITY OF THE HELIX BOUNDA RIES WHICH MAY FORCE THE PREDICTED BOUNDARIES TO BE MOVED TO A NEW POSITION. THIS PROCEDURE STARTS FROM POSITION 02-4 (1=0) TO 02+2  201 202 203 204 205 206 207 208 209 210 21 1 212 213 214 215 216 217 2 18 219 220 22 1 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 24 1 242 243 244 245 246 247 248 249 250  C C  (1 =6) 60  1=0 LE=J2-4 LF=LE+3 IF ((LE+3).GT.NN) HELLO=.FALSE.  
70 C C C C  TO COMPARE PA (T1),PB (T2),AND PT (T5) AND TO CALCULATE THE PROBA BILITY OF B-TURN OCCURRENCE (TT) OF THE TETRAPEPTIDE LE-LF T1=0 T2=0 T5=0 TT=0 DO 75 L = LE,LF T1=T1+S(L,1) T2=T2+S(L,2) T5=T5+S(L,5) CONTINUE TT = P(LE, 1)*P(LE+1 ,2)*P(LE + 2,3)*P(LE + 3,4) PRINT 78 , LE , T 1 ,T2,T5,TT,I FORMAT(' ' , 10X , 'LE,T1,T2,T5.TT,I' . 15,3(F7.4,2X),F13.9,14,3X, 1 'B-TURN SEARCH AT C-TERMINAL') IF (T5.GT.T1.AND.T5.GT.T2.AND.TT.GE.0.0O0O7500) HELLO=.TRUE.  75 78 C C  ***  C C  *** 2 *+*  C C  GO TO 210  1 *+* IF ((J2+1).GT.NN) GO TO 80 IF (HELLO.AND.LE.EQ.(02+1).AND.S(J2,9).GT.1.10.AND.S(J2,1).GT.1.01 1 . AND.S(02+1 , 1 ) . LE . 0. 69 . AND .S(02 - 1 .9 ) . LE . S ( J2 , 9 ) . AND . S (<J2 - 1 , 1 ) . GT . 2 0.67.AND.((S(J2-2,5) + S(U2-1,5 ) + S(02,5 ) + S(02+ 1 ,5) ) .LT.(S(02-2, 1 ) + 3 S(02-1, 1) + S(02, 1) + S(02+1, 1) ) .OR.(P(d2-2, 1)*P(02-1,2)*P(02,3)*P(02 4 +1,4)).LT.0.00007500)) GO TO 10O IF (HELLO . AND. LE . EO . 02 . AND . S ( 02-1 , 9 ) . GT . 1 . 10. AND. S( 02- 1 , 1 ) . GT . 1 . 16 1 .AND.S(02,9).LT. 1 . 10.AND.S(02- 1, 1).GT.S(02, 1).AND.S(02+1, 1).LE.0. 2 69) GO TO 101  *** 3 ***  80  IF (HELLO.AND.LE.EO.(02-1).AND .M(02- 1 ) . EO. 16.AND.S(02-2,9 ) .GT . 1 . 10 1 .AND.S(02-3,1).GT.1.16.AND.S(02,5).GT.1.19.AND.S(02+2,5).GT.1.19) 2 GO TO 101  C C  *** 4 *+*  C C  * * * 5 ***  IF (HELLO.AND.LE.EO.(02-1).AND.S(02-2,9).GE.1.10.AND.(S(02-3,9).LT 1.S(02-2,9).OR.(S(02-3,9)-S(02-2,9)).LT.0.15).AND.S(02-1,9).LT.S(02 2 -2,9).AND.S(02+1,5).GT.1.19.AND.S(02+2,5).GT.1.19) GO TO 102  25 1 252 253 254 255 256 257 258 25g 260 261 262 263 264 265 266 267 268 269 270 271 272  273  00  274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300  IF (HELLO.AND.LE.EO•(J2-2).AND.S(02-2,9).GT.S(J2-3,9).AND.S(02 - 3,2 1 ).LE.0.75.AND. S(02-2,1).GT.O.69.AND.S(02-5,2).LE.0.75) GO TO 102 C C  +** g *** IF (HELLO.AND.LE.EO.(J2-2).AND. S(02-2,9).EO.0.98.AND.S(02 - 3, 1).GT 1 .1.16.AND.S(02-4, 1).GT. 1. 16.AND.5(J2-3,9).LT. 
1.57.AND.S(02- 1 ,9) . 2 LT . 5 ( 0 2 - 2 , 9 ) ) GO TO 102  C C  *** 7 *** IF (HELLO.AND.LE.EQ.02.AND.(S(02,9).LT.O.98.OR.S(J2.2).GT.1.38).AN 1 D.(S(02- 1,2) .GT. 1 .60.OR.(S(02 - 1,9).LT. 1 . 10.AND.S(02- 1,2).GT. 1.38) 2 ).AND.S(02-2,9).GT.1.10.AND.S(J2-2,1).GT.1.16) GO TO 102  C C  ***  3 *** IF (HELLO.AND.LE.EQ.(02-1).AND.S(02-1,9).LT.S(02-2 , 9).AND.S(02-2, 1 1 ) .GT. 1 . 16.AND.S(02-3, 1) .GT. 1 . 16.AND.S(02-1, 1) .LT.S(02-2, 1 ) ) GO 2 TO 102  C C  *** g *** IF (HELLO.AND.LE.EQ.(02-2).AND.S(02-2,9).GT.1.10.AND.S(02-3,9).LT. 1 1.10.AND.S(02-4,9).LT.5(02-2,9)) GO TO 102  C  *»*io***  C  IF ((02+4).GT.NN) GO TO 90 IF (HELLO.AND.LE.EQ.(02-2).AND.M(02-2).EO.16.AND.S(02-3,9).GT.0.98 1 .AND.S(02-4,9).GT. 1 . 10.AND.S(02 + 2,2) .GT. 1 .38.AND.S(02- 1 ,2).GT. 1 .O 2 5.AND.S(02+3,2).GT.0.75.AND.S(02+4,2).GT.1.10) GO TO 102  C C 90 C C  C C  C C C C  IF (HELLO.AND.LE.EQ.02.AND.(P(02-3, 1 ) * P ( 0 2 - 2 , 2 ) * P ( 0 2 - 1 , 3 ) * P ( 0 2 , 4 ) ) 1 .GT.0.00007500.AND.S(02-4,9).GT.5(02-3,9).AND.S(02-4 , 9).GT. 1. 10 2 .AND.S(02-4,9).GT.5(02-5,9)) GO TO 104 **+-12*** IF (HELLO.AND.LE.EO.(02-3).AND.S(02-4. 1) . LT . 1.00.AND.S(02-3, 1).GT. 1 O.98.AND.S(02-4,9).GT.0.98.AND.S(02-4,9).LT.1.57.AND.S(02-2,7).GT 2 .1.06) GO TO 103 *** 13 *** IF (HELLO.AND.LE.EQ.(02- 1).AND.S(02-1 , 1 ) . LE.0.69.AND.S(02-2. 1).GT. 1 1 .01 .AND.S(02-2,9).GT.0.98.AND.S(02-3,9).LT. 1.57.AND.S(02- 1,5).GT 2 . 1 . 19.AND.S(02,5).GT.0.98.AND.S(02+1,5).GE. 1 . 56) GO TO 102 *+ * i ; 4 * * * IF (HELLO.AND.LE.EQ.(02-1).AND.S(02-2,1).LE.O.69.AND.S(02-2,7).GT! 1 1.49.AND.S(02-3,9).GT.1.10.AND.S(02-3,1).GT.1.16) GO TO 103 *** 15 *** IF (HELLO.AND.LE.EO.(02-2).AND.S(02-3,9).GT.O.98.AND.S(d2-3,1).GT. 1 i .01.AND.S(02-4,9).LE.S(02-3,9).AND.(S(02-5,9)-S( 0 2 - 3 , 9 ) ) . L E . 0 . 1 6  301 2 .AND.S(02- 1,9) .LT. 1.77.AND.S(02-2,9).LT. 1 .77.AND.S(02-1,5).GT .1.1 302 3 9 . AND.S(02-2,5).GT. 1 . 19) GO TO 103 303 C 304 C *** 1 6 * * * 305 IF (HELLO.AND.LE.EO.(02-3).AND.S(02-3,9).GT.1.57.AND.S(02-4,1).GT. 
306 1 1 . 16.AND.S(02-2,7).GT. 1 . 24.AND.S(02-5, 1).GT. 1 . 16.AND.S(02- 1,1). 307 2 LE.0.69) GO TO 103 308 C 309 C *** *** 310 IF (HELLO.AND.LE.EO.(02-2).AND.S(02-3,9).LT. 1. 10.AND.S(02-3, 1 ) . LE . 311 1 O.69.AND.S(02-4,9).GT.0.98.AND.S(02-4, 1) .GT. 1 .01 .AND. (S(02~5, 1 ) . 312 2 LT.S(02-4, 1 ) .OR . (S(02-5,9)-S(02-4,9)).LE.0. 15)) GO TO 104 313 C 314 C +** 18 *** 3 15 IF (HELLO .AND . LE.EO.(02-4).AND.S(02-4 , 1).GT. 1. 16 . AND.S(02-4,9).GT. 3 16 1 0.98.AND.S(02-5, 1 ) . LT . S(02-4, 1 ) .AND.S(02-6, 1 ) . LT.S(02-4, 1).AND. 317 2 S(02-3,9).LT.0.98) GO TO 104 318 C 319 C *** 19 *** 320 IF (HELLO.AND.LE.EO.(02-4).AND.S( J2-4, 1).GT.0.98.AND.S ( 02-5 , 9).LT . 321 1 1.1O.AND.S(02-6,9).LT.1.10) GO TO 104 322 C 3 2 3 c * * * 2 0 * * * 324 IF (HELLO.AND.LE.EO.(02-4).AND.S(02-5,9).GT.1.25.AND.S(02-5,1).GT. 325 1 1 . 16.AND.S(02-4,9).LT.5(02-5,9 ) .AND . 5(02-4 , 1).LT.S(02-5 , 1).AND.S( 326 2 02-6,9).LT.S(02-5,9)) GO TO 105 327 C 328 C *** 21 *** 329 IF ((02+2).GT.NN) GO TO 95 330 IF (HELLO.AND.LE.EQ.(02+1).AND,(S(02-3,1)+S(02-1,1)+S(02-1,1)+5(02 331 1,1 )+S(02+1, 1 )+S(02 + 2,1)).LT.(S(02-3,2)+S(02-2,2)+S(02-1,2)+S(02,2 332 2 )+S(02+1;2)+S(02+2,2)).AND.S(02-4,1).GT.1.16.AND.S(02-4,9).GT.1. 
333 3 08.AND.S(02-5,9).GT.1.10) GO TO 104 334 C 335 C 336 95 IF (I.EO.O) GO TO 200 337 IF (I.EQ.1) GO TO 200 338 IF (I.EQ.2) GO TO 200 339 IF (I.EQ.3) GO TO 200 340 IF (I.EQ.4) GO TO 200 34 1 IF (I.EQ.5) GO TO 200 342 IF (I.EQ.6) GO TO 210 343 C 344 100 02=02 345 GO TO 110 346 101 02=02-1 347 GO TO 110 348 102 02=02-2 349 GO TO 110 350 103 02=02-3 1 7  o  351 352 353 354 355 356 357 358 359 360 36 1 362 363 364 365 366 367 368 369 370 37 1 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 39 1 392 393 394 395 396 397 398 399 400  104 105  GO TO 1 1 0 d2=d2-4 GO TO 1 1 0 d2=d2-5 GO TO 1 1 0  C 1 10  V3 = 80 GO TO 3 0 0  200  1=1+1 L E = LE+ 1 GO TO 7 0  C  C  c c c c c c c c c  c  c  c c  P O S I T I O N OF THE C U R R E N T N E E D TO A D J U S T IT d2  J2  MAY BE THE  MOST  FAVORABLE ONE,HENCE  I  = J2  * * * 1Q * * * BALL=.FALSE. IF ( ( J 2 + 4) .GT.NN) GO TO 2 2 0 T1=0 T2=0 T5=0 T1=S(d2+1. 1)+S(d2+2. 1)+S(d2+3. 1)+S(d2+4. 1 ) T2=S(d2+1,2)+S(d2+2,2)+S(d2+3.2)+S(d2+4,2) T5=S(d2+1,5)+S(d2+2,5)+S(d2+3.5)+S(d2+4,5) PRINT 2 1 5 , T 1 , T 2 , T 5 FORMAT(' ' , 3 0 X . ' T 1 , T 2 , T 5 ' , 3 ( F 7 . 3 ) , ' S T E P 1 0 , <J2=J2 , M 0 J 2 215 I F ( T 5 . G T .T 1 . AND . T 5 . GT . T 2 . AND . S ( v J 2 , 9 ) .GT . 0 . 9 8 . A N D . S (<J2 , 1 ) . GT 1 . A N D . S ( d 2 - 1 , 9 ) . G E . 1 . 5 7 . A N D . ( S ( J 2 - 2 . 1 ) .GT . 1 . 1 6 . O R . S ( J 2 - 2 , 9 ) . i 2 20)) BALL=.TRUE. 2 10  IF ( T 5 . G T . T 1 . A N D . T 5 . G T . T 2 . A N D . S ( d 2 + 1 , 1 ) . L E . 0 . 6 9 . A N D . S ( d 2 , 9 ) . I 1 5 7 . A N D . S ( J 2 - 1 , 9 ) . G T . 0 . 9 8 . A N D . S ( J 2 - 2 , 1 ) . G T . 1 . 16) BALL=.TRUE IF ( B A L L ) J2=J2 IF ( B A L L ) V3=10 IF ( B A L L ) GO TO 3 0 0 ***+ 1 1 *** BALL= . FALSE . IF ( T 5 . G T . T 1 . A N D . T 5 . G T . T 2 . A N D . S ( d 2 , 9 ) . L T . 0 . 7 3 . A N D . S ( J 2 - 1 , 9 ) J 1 5 7 . A N D . S ( d 2 - 1 , 1 ) .GT. 1 . 0 1 . A N D . S ( d 2 - 2 , 9 ) . G T . 1. 
10) BALL=.TRUE IF ( B A L L ) d2=d2-1 IF ( B A L L ) V3=11 IF ( B A L L ) GO TO 3 0 0 * * + 12  ***  h-i  H  401 402 403 404 405 406 407 408 409 410 411 412 413 414 4 15 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 4 35 436 437 438 439 440 44 1 442 443 444 445 446 447' " 448 449 450  C C  C C  C C  C  C  BALL=.FALSE. IF ((J2+6).GT.NN) GO TO 220 IF (S(J2 , 9).GT. 1. 10.AND.S(J2, 1 ) .GT. 1 .01.AND. S (<J2 - 1 . 9 ) . LT . S (<J2 , 9 ) . 1 AND . S(<J2 , 2) . GT .0. 93 . AND . S( J2+ 1 , 2 ) . GT . 1 . 05 . AND . S(J2+2 . 2 ) . GT . 1 . 38 . AN 2 D.S(J2+4,2).GT.O.75.AND.S(J2 + 5,2).GT. 1. 38.AND.S(J2+6,2).GT. 1 .05 ) 3 BALL=.TRUE. IF (BALL) J2=J2 IF (BALL) V3=12 IF (BALL) GO TO 300 ***13++* 220 IF ((J2+3).GT.NN) GO TO 230 BALL=.FALSE. IF (S(J2,9).GT.1.10.AND.S(U2,1).GT.1.16.AND.S(J2+1,2).GT.1.38.AND. 1 S(J2- 1 , 2 ) . GT . 1 . 38 . AND ,S(<J2-2,2) . GT .1.38. AND . S( J2 + 3 . 2 ) . GT . 1 . 10. AND 2 . S(J2-3,2).GT. 1 .05) BALL=.TRUE. IF (BALL) J2=J2 IF (BALL) V3=13 IF (BALL) GO TO 300 *** 14 *** 230 BALL=.FALSE. IF ((J2+1 ) .GT.NN) GO TO 270 IF (S(<J2.9).GT. 1 .57 . AND .M(J2+1).EO. 15.AND.S(J2-1,9).LT.S(J2,9) .AND 1 . S( J2-2 , 9 ) . GT . 1 . 10. AND . (S( J2-3, 9 ) . GT . 1 . 10.OR . S( J2-3 , 1 ) . GT .0. 69 ) ) 2 BALL=.TRUE. IF (BALL) J2 = «J2 IF (BALL) V3=14 IF (BALL) GO TO 300 *** 15 *** BALL=.FALSE. IF ((J2+5).GT.NN) GO TO 240 IF (M(J2).EO.16.AND.S(J2+1,1).LE.O.69.AND.S(U2+2,2).GT.1.38.AND. 1 S(J2+3,2).GT.1.38.AND.S(U2+4,2).GT.1.38.AND.S(J2+5.2).GT.1.38.AND 2 .S(J2-1,9).GT. 1 . 10.AND.S(J2-2,9).GT.1.10) BALL=.TRUE. IF (BALL) U2=J2 IF (BALL) V3=15 IF (BALL) GO TO 300  *** ng *** 240  245  BALL=.FALSE. IF ((J2+4).GT.NN) GO TO 250 T1=0 T2=0 T1=S(U2+1. 1 ) + S(U2 + 2. 1 ) + S(J2 + 3, 1 ) + S(J2 + 4, 1) T2 = S(U2+1 , 2 )+ S ( J2 + 2 . 2 )+ S ( J2+3 , 2)+ S ( J2 + 4', 2 ) PRINT 245,T1,T2 FORMATC ',30X,'T1,T2 ',2(F7.3),7X,' STEP 16 , J2=J2') IF (T2 GT .T1 .AND.S(U2. 1).GT. 1. 16.AND.S(U2+1,9).LT . 1 . 
10.AND.S(U2- 1  451 452 453 454 455 456 457 458 459 460 46 1 462 463 464 465 466 467 468 469 470 47 1 472 473 474 475 476 477 478 479 480 48 1 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500  C C  C C C C C C  C C  C C  1 , 1 ) . LT.S(J2, 1).AND. S(02+2.2) .GT.0.75.AND.S(J2 + 3,2).GT. 1 .38.AND. 2 S(02+4,2).GT. 1 .38) BALL=.TRUE. IF (BALL) 02 = 02 IF (BALL) V3=16 IF (BALL) GO TO 300 *** -|7 *** BALL=.FALSE. IF (S(02, 1 ) .GT. 1 . 16.AND.S(02,9).GT. 1. 10.AND,S(02-1,9).GT. 1. 10.AND. 1 S(02+1,1).LE.0.69.AND.(S(J2+2,9).LT.1.10.OR.S(02 + 2,1).LT.1.06).AN 3 D.S(J2+3.9).LT . 1 . 10.AND.(S(J2+4,9).LT. 1. 10.OR.M(02 + 3).EO. 15) ) BA 4 LL= .TRUE. IF (BALL) 02 = 02 IF (BALL) V3=17 IF (BALL) GO TO 300 02 = 02-1  250  18 *** IF ((J2+2).GT.NN) GO TO 260 BALL =.FALSE. IF ((P(J2 - 1 . 1)*P(02,2)*P(02+1,3)*P(J2 + 2.4)).GT.0.00007500.AND.S(02 1 -1,9).GT. 1 . 10.AND.S(02-1,1).GT. 1 . 16.AND.S(J2-2, 1).LT.S(02 -1 , 1 ) . AN 2 D . S(02-2,1).GT.0.69.AND.S(J2,1).LT.1.06.AND.S(02.7).GT.0.84.AND. 3 S(02+1.7).GE. 1 .64) BALL=.TRUE. IF (BALL) 02=02-1 IF (BALL) V3=18 IF (BALL) GO TO 300  *** -jg *** BALL=.FALSE. IF ((02+3).GT.NN) GO TO 260 T1=0 T2=0 T5 = 0 T1=S(02, 1) + S(02+ 1 , 1 ) + S(02 + 2, 1) + S(02 + 3, 1 ) T2 = S(02,2) + S(02+1.2) + S(02+2,2 ) + S(02 + 3.2) T5=S(02,5)+S(02+1,5)+S(02+2,5)+S(02+3,5) PRINT 255,T1.T2,T5 FORMAT(' ',30X, 'T1.T2.T5' ,3(F7.3), ' STEP 19 , 02-1 ,M002') 255 IF (T5.GT.T1 .AND.T5.GT.T2.AND.S(02-1,9).GE. 1.57.AND.S(02- 1 , 1).GT. 1 1.08.AND.S(02,9).LT.1.10.AND.S(02,2).GT.1.38) BALL=.TRUE. IF (T5.GT,T1.AND.T5.GT.T2.AND,S(02,1).LE.0.69.AND.S(02-1,1).GT.1.1 1 6.AND.S(02-1,9).GT.1.10) BALL=.TRUE. IF (T5.GT.T1 .AND.T5.GT ,T2.AND.S(02,9).LT. 1 . 10.AND.S(02, 1 ) . LT . 1 .01 1 .AND.S(02-1.9) .GE. 1 . 10.AND.S(02-1, 1).GT . 1 .01 ) BALL=.TRUE.  
      IF (BALL) J2=J2-1
      IF (BALL) V3=19
      IF (BALL) GO TO 300
C
C     *** 20 ***
C
      BALL=.FALSE.
      IF (S(J2,9).LT.1.10.AND.S(J2,2).GT.1.38.AND.S(J2+1,2).GT.1.38.AND.
     1 S(J2+2,2).GT.1.38.AND.S(J2+3,2).GT.0.75.AND.S(J2-1,9).GE.1.57.AND
     2 .S(J2-2,9).GT.1.10) BALL=.TRUE.
      IF (BALL) J2=J2-1
      IF (BALL) V3=20
      IF (BALL) GO TO 300
C
C     *** 21 ***
C
  260 BALL=.FALSE.
      IF ((J2+1).GT.NN) GO TO 270
      IF (S(J2,1).LE.0.69.AND.M(J2+1).EQ.15.AND.S(J2-1,1).GT.1.01.AND.S(
     1 J2-1,9).GT.1.10.AND.S(J2-2,9).GE.1.08) BALL=.TRUE.
      IF (BALL) J2=J2-1
      IF (BALL) V3=21
      IF (BALL) GO TO 300
C
C     *** 22 ***
C
      BALL=.FALSE.
      IF ((J2+3).GT.NN) GO TO 270
      IF (S(J2,9).LT.1.10.AND.S(J2-1,9).GT.1.10.AND.S(J2-1,1).GT.1.01.AN
     1 D.S(J2,2).GT.1.38.AND.S(J2+1,2).GT.1.38.AND.S(J2+2,2).GT.0.93.AND
     2 .S(J2+3,2).GT.1.10.AND.S(J2-2,2).GT.1.38) BALL=.TRUE.
      IF (BALL) J2=J2-1
      IF (BALL) V3=22
      IF (BALL) GO TO 300
C
C     .... J2 = J2+1
C
C     *** 23 ***
C
      BALL=.FALSE.
      IF ((J2+4).GT.NN) GO TO 270
      IF (S(J2,9).LT.1.10.AND.S(J2+1,9).GT.1.10.AND.S(J2+1,1).GT.1.01.AN
     1 D.S(J2,1).GT.1.01.AND.S(J2-1,1).GT.1.13.AND.S(J2-1,2).LE.0.75.AND
     2 .S(J2-2,2).LE.0.75.AND.(P(J2+1,1)*P(J2+2,2)*P(J2+3,3)*P(J2+4,4))
     3 .GT.0.000100.AND.S(J2-2,1).GT.1.13) BALL=.TRUE.
      IF (BALL) J2=J2+1
      IF (BALL) V3=23
      IF (BALL) GO TO 300
C
C     *** 24 ***
C
      BALL=.FALSE.
      IF ((J2+5).GT.NN) GO TO 270
      T1=0
      T2=0
      T5=0
      T1=S(J2+2,1)+S(J2+3,1)+S(J2+4,1)+S(J2+5,1)
      T2=S(J2+2,2)+S(J2+3,2)+S(J2+4,2)+S(J2+5,2)
      T5=S(J2+2,5)+S(J2+3,5)+S(J2+4,5)+S(J2+5,5)
      PRINT 265,T1,T2,T5
  265 FORMAT(' ',30X,'T1,T2,T5',3(F7.3),' STEP 24, J2+1 ,MOJ2')
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J2,9).GT.1.57.AND.S(J2+1,9).
     1 GT.1.20.AND.S(J2+1,1).GT.1.01) BALL=.TRUE.
      IF (BALL) J2=J2+1
      IF (BALL) V3=24
      IF (BALL) GO TO 300
C
C     TO CALL SUBROUTINE RMJ2 TO KEEP ON CHECKING FOR C-TERMINAL
C     ADJUSTMENT, RMJ2 IS A CONTINUATION OF THIS SUBROUTINE
C
  270 CALL RMJ2
      RETURN
C
C     THE C-TERMINAL HAS BEEN ADJUSTED ACCORDING TO ONE OF THE
C     SITUATIONS MENTIONED ABOVE. TO PRINT OUT THE FINAL VALUES FOR
C     J1,J2 AND TO RETURN TO SUBROUTINE ONE TO START THE WHOLE
C     PROCEDURE AGAIN
C
  300 K3=J2
      PRINT 301,J1,J2,V2,V3
  301 FORMAT('0',20X,'EVENTUAL HELIX FROM J1:',I5,5X,'TO J2:',I5,14X,
     1 ' *** V2,V3:',2(I5),' ***',//)
      RETURN
      END
C
C
      SUBROUTINE RMJ2
C
C     RMJ2 = REMAINING OF MOVE OF J2
C
C     PURPOSE
C        TO KEEP ON CHECKING FOR OTHER POSSIBILITIES OF ADJUSTING THE
C        C-BOUNDARY OF THE PREDICTED HELIX
C
C     REMARK
C        ALL THE PARAMETERS STILL HAVE THE SAME DEFINITION AS IN THE
C        PREVIOUS SUBROUTINES
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,W,V3,V4,V5,V6,V7,V8,Q
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,20),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,F,H,U,D,W,M,M1,M2,M3,M4,M5,M6,
     1L,I,K,L1,L2,NZ,NY,JA,JB,JC,JD,J1,J2,KM,N1,N2,NN,J,G,K3,V1,V2,V3,V4
     2,V5,V6,V7,Q,HELLO,BYE,BALL,MOVE
C
C     ....
C     J2 = J2-2
C
C     *** 25 ***
C
      BALL=.FALSE.
      IF ((J2+2).GT.NN) GO TO 20
      T1=0
      T2=0
      T5=0
      T1=S(J2-1,1)+S(J2,1)+S(J2+1,1)+S(J2+2,1)
      T2=S(J2-1,2)+S(J2,2)+S(J2+1,2)+S(J2+2,2)
      T5=S(J2-1,5)+S(J2,5)+S(J2+1,5)+S(J2+2,5)
      PRINT 5,T1,T2,T5
    5 FORMAT(' ',30X,'T1,T2,T5',3(F7.3),' STEP 25, J2-2 , RMJ2')
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J2-2,9).GE.1.57.AND.S(J2-2,1).GT.1
     1 .08.AND.((S(J2-1,1).LT.S(J2-2,1).AND.S(J2-1,9).LT.1.10).OR.S(J2-
     2 1,9).LE.S(J2-2,9)).AND.S(J2-1,5).GT.1.19.AND.S(J2+2,5).GT.1.43.AN
     3 D.S(J2+3,1).LT.1.06.AND.S(J2+4,1).LT.1.06) BALL=.TRUE.
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J2-1,9).LT.0.98.AND.S(J2-2,9).GT.1
     1 .10.AND.S(J2-2,1).GT.S(J2-1,1).AND.S(J2-3,9).GT.1.10.AND.S(J2,9)
     2 .LT.0.98) BALL=.TRUE.
      IF ((J2+3).GT.NN) GO TO 10
      IF (T1.LT.T2.AND.S(J2,2).GT.1.37.AND.S(J2-1,2).GT.1.37.AND.S(J2+1,
     1 2).GT.1.38.AND.S(J2-2,1).GT.1.16.AND.S(J2+3,1).LE.0.69) BALL=
     3 .TRUE.
   10 IF (BALL) J2=J2-2
      IF (BALL) V3=25
      IF (BALL) GO TO 300
C
C     *** 26 ***
C
      BALL=.FALSE.
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J2-3,9).GT.1.10.AND.S(J2-3,1).GT.1
     1 .16.AND.S(J2-2,9).LT.1.10.AND.S(J2-1,9).LT.S(J2-3,9).AND.S(J2-1,
     2 1).LT.1.01.AND.S(J2,5).GT.1.52.AND.S(J2+1,5).GT.1.46.AND.S(J2-1,5
     3 ).GT.1.14) BALL=.TRUE.
      IF (BALL) J2=J2-3
      IF (BALL) V3=26
      IF (BALL) GO TO 300
C
C     *** 27 ***
C
      BALL=.FALSE.
      IF ((J2+3).GT.NN) GO TO 20
      IF (S(J2-2,9).GT.S(J2,9).AND.S(J2-2,9).GT.S(J2-3,9).AND.S(J2-1,2)
     1 .GT.1.38.AND.S(J2+1,2).GT.1.38.AND.S(J2+2,2).GT.1.38.AND.S(J2+3,2
     2 ).GT.1.38.AND.S(J2-2,2).LE.0.75) BALL=.TRUE.
      IF (BALL) J2=J2-2
      IF (BALL) V3=27
      IF (BALL) GO TO 300
C
C     *** 28 ***
C
      BALL=.FALSE.
      T1=0
      T2=0
      T1=S(J2-1,1)+S(J2,1)+S(J2+1,1)+S(J2+2,1)+S(J2+3,1)
      T2=S(J2-1,2)+S(J2,2)+S(J2+1,2)+S(J2+2,2)+S(J2+3,2)
      PRINT 15,T1,T2
   15 FORMAT(' ',30X,'T1,T2 ',2(F7.3),7X,' STEP 28, J2-2, RMJ2')
      IF (T2.GT.T1.AND.S(J2-1,2).GT.1.38.AND.S(J2-1,9).LT.1.10.AND.S(J2-
     1 2,9).GT.0.98.AND.S(J2-2,1).GT.1.01.AND.S(J2-3,9).LT.S(J2-2,9).AND
     2 .S(J2,2).GT.1.38) BALL=.TRUE.
      IF (BALL) J2=J2-2
      IF (BALL) V3=28
      IF (BALL) GO TO 300
C
C     *** 29 ***
C
   20 BALL=.FALSE.
      IF ((J2+1).GT.NN) GO TO 60
      IF (S(J2,9).LT.0.98.AND.M(J2+1).EQ.15.AND.S(J2-2,9).GT.1.57.AND.S(
     1 J2-1,2).GT.1.38.AND.S(J2-3,9).LT.S(J2-2,9)) BALL=.TRUE.
      IF (BALL) J2=J2-2
      IF (BALL) V3=29
      IF (BALL) GO TO 300
C
C     *** 30 ***
C
      BALL=.FALSE.
      IF (S(J2-2,9).GE.1.10.AND.S(J2-1,2).GT.1.38.AND.S(J2-1,9).LT.1.10
     1 .AND.M(J2+1).EQ.15.AND.S(J2,2).GT.0.75) BALL=.TRUE.
      IF (BALL) J2=J2-2
      IF (BALL) V3=30
      IF (BALL) GO TO 300
C
C     .... J2 = J2+2
C
C     *** 31 ***
C
      BALL=.FALSE.
      IF ((K+1).GT.KM) GO TO 30
      IF (S(J2,9).GT.1.25.AND.S(J2+2,9).GT.1.20.AND.S(J2+2,1).GT.1.01.A
     1 ND.M(J2+1).NE.15.AND.S(J2-1,9).GE.S(J2,9).AND.(J2+3).GE.H(K+1))
     2 BALL=.TRUE.
      IF (BALL) J2=J2+2
      IF (BALL) V3=31
      IF (BALL) GO TO 300
C
C     *** 32 ***
C
      BALL=.FALSE.
      IF (S(J2,9).GT.1.10.AND.S(J2+2,9).GT.1.57.AND.M(J2+1).NE.15.AND.S
     1 (J2-1,9).LE.S(J2+2,9).AND.(J2+3).GE.H(K+1)) BALL=.TRUE.
      IF (BALL) J2=J2+2
      IF (BALL) V3=32
      IF (BALL) GO TO 300
C
C     *** 33 ***
C
   30 IF ((J2+6).GT.NN) GO TO 40
      BALL=.FALSE.
      IF (S(J2,9).LT.1.25.AND.S(J2+4,9).GT.1.57.AND.S(J2+5,7).GT.1.49.AN
     1 D.S(J2+1,1).GT.1.01.AND.S(J2+2,1).GT.1.16.AND.S(J2+3,1).GT.1.08
     2 .AND.S(J2+3,1).LT.1.57.AND.S(J2+2,2).LT.0.87.AND.S(J2+6,2).LT.0.7
     3 4.AND.S(J2-1,1).GT.1.16) BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=33
      IF (BALL) GO TO 300
C
C     *** 34 ***
C
   40 BALL=.FALSE.
      IF ((K+1).GT.KM) GO TO 50
      IF (S(J2,9).GT.0.98.AND.S(J2,1).GT.1.01.AND.S(J2+2,9).GT.S(J2,9)
     1 .AND.S(J2+2,1).GT.1.16.AND.M(J2+1).NE.15.AND.(J2+3).GE.H(K+1))
     2 BALL=.TRUE.
      IF (BALL) J2=J2+2
      IF (BALL) V3=34
      IF (BALL) GO TO 300
C
C     *** 35 ***
C
   50 BALL=.FALSE.
      IF ((J2+4).GT.NN) GO TO 60
      IF (S(J2,9).GE.1.57.AND.S(J2+2,9).GT.1.10.AND.S(J2+2,1).GT.1.16.AN
     1 D.S(J2+1,1).GT.1.01.AND.S(J2+1,9).GT.0.98.AND.S(J2+3,9).LT.1.10.
     2 AND.S(J2+4,1).LE.0.69) BALL=.TRUE.
      IF (BALL) J2=J2+2
      IF (BALL) V3=35
      IF (BALL) GO TO 300
C
C     *** 36 ***
C
      BALL=.FALSE.
      IF ((J2+6).GT.NN) GO TO 60
      IF (S(J2,9).GT.1.10.AND.S(J2+2,9).GT.1.10.AND.S(J2+2,1).GT.1.16.AN
     1 D.M(J2+1).NE.15.AND.(P(J2+3,1)*P(J2+4,2)*P(J2+5,3)*P(J2+6,4)).GT.
     2 0.000100.AND.S(J2-1,9).LT.1.10.AND.S(J2-2,9).GE.1.57) BALL=.TRUE.
      IF (BALL) J2=J2+2
      IF (BALL) V3=36
      IF (BALL) GO TO 300
C
C     *** 37 ***
C
      BALL=.FALSE.
      IF (S(J2,9).GE.1.10.AND.S(J2+2,9).GE.S(J2,9).AND.M(J2+1).NE.15.AND
     1 .S(J2-1,1).GT.1.06.AND.(P(J2+3,1)*P(J2+4,2)*P(J2+5,3)*P(J2+6,4))
     2 .GT.0.000100) BALL=.TRUE.
      IF (BALL) J2=J2+2
      IF (BALL) V3=37
      IF (BALL) GO TO 300
C
C     *** 38 ***
C
      BALL=.FALSE.
      IF ((P(J2+3,1)*P(J2+4,2)*P(J2+5,3)*P(J2+6,4)).GT.0.00007500.AND.M(
     1 J2+1).NE.15.AND.S(J2,9).GT.0.98.AND.S(J2+2,9).GT.1.24.AND.S(J2+2,
     2 1).GT.1.01.AND.S(J2-1,9).GT.1.57.AND.S(J2+3,9).LT.1.10) BALL=
     3 .TRUE.
      IF (BALL) J2=J2+2
      IF (BALL) V3=38
      IF (BALL) GO TO 300
C
C     .... J2 = J2-3
C
C     *** 39 ***
   60 BALL=.FALSE.
      IF (S(J2,9).LT.1.10.AND.M(J2-2).EQ.15.AND.S(J2-3,9).GT.0.98.AND.
     1S(J2-4,9).GT.0.98.AND.S(J2+1,9).LT.1.77.AND.S(J2-3,1).GT.1.16)
     2BALL=.TRUE.
      IF (BALL) J2=J2-3
      IF (BALL) V3=39
      IF (BALL) GO TO 300
C
C     *** 40 ***
      BALL=.FALSE.
      IF ((J2+1).GT.NN) GO TO 90
      IF (S(J2,2).GT.1.19.AND.S(J2-1,2).GT.1.19.AND.
     1S(J2-3,2).GT.1.38.AND.S(J2+1,2).GT.1.38.AND.S(J2,9).LT.1.24.AND.
     2S(J2-3,9).GT.1.57.AND.S(J2-4,9).GT.1.10) BALL=.TRUE.
      IF (BALL) J2=J2-3
      IF (BALL) V3=40
      IF (BALL) GO TO 300
C
C     *** 41 ***
      BALL=.FALSE.
      T1=0
      T2=0
      T5=0
      T1=S(J2-2,1)+S(J2-1,1)+S(J2,1)+S(J2+1,1)
      T2=S(J2-2,2)+S(J2-1,2)+S(J2,2)+S(J2+1,2)
      T5=S(J2-2,5)+S(J2-1,5)+S(J2,5)+S(J2+1,5)
      PRINT 65,T1,T2,T5
   65 FORMAT(' ',30X,'T1,T2,T5',3(F7.3),' STEP 41, J2-3 , RMJ2')
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J2-3,9).GT.0.98.AND.
     1S(J2-3,1).GT.1.01.AND.S(J2-4,9).LT.1.77.AND.S(J2-5,9).LT.1.77.AND.
     2S(J2-3,9).GT.S(J2-2,9).AND.S(J2+2,5).GT.0.96.AND.
     3S(J2,5).GT.0.96.AND.S(J2-1,5).GT.1.19) BALL=.TRUE.
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J2-3,9).GE.1.10.AND.
     1S(J2-3,1).GT.1.13.AND.S(J2-4,1).GT.0.69.AND.S(J2-5,1).GT.1.16.AND.
     2M(J2+1).EQ.15.AND.S(J2-2,7).GE.1.64.AND.S(J2-1,7).GT.1.24.AND.
     3S(J2,9).LT.1.10) BALL=.TRUE.
      IF (BALL) J2=J2-3
      IF (BALL) V3=41
      IF (BALL) GO TO 300
C
C     *** 42 ***
      BALL=.FALSE.
      IF ((J2+2).GT.NN) GO TO 90
      IF (S(J2,2).GT.1.30.AND.S(J2-1,2).GT.1.30.AND.M(J2-2).EQ.1.AND.
     1S(J2+1,2).GT.1.38.AND.M(J2+2).EQ.1.AND.S(J2-3,9).GT.1.10)
     2BALL=.TRUE.
      IF (BALL) J2=J2-3
      IF (BALL) V3=42
      IF (BALL) GO TO 300
C
C     .... J2 = J2 + 3
C
C     *** 43 ***
      BALL=.FALSE.
      IF ((J2+4).GT.NN) GO TO 80
      IF (S(J2,9).LT.1.10.AND.S(J2-1,9).LT.1.10.AND.
     1S(J2+1,9).LT.1.10.AND.S(J2+2,9).GT.0.98.AND.S(J2+3,9).GT.0.98.AND.
     2S(J2+3,1).GT.1.16.AND.S(J2+1,1).GT.0.69) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=43
      IF (BALL) GO TO 300
C
C     *** 44 ***
      BALL=.FALSE.
      IF (S(J2,9).GT.1.25.AND.S(J2+3,9).GT.0.98.AND.
     1S(J2+3,1).GT.1.16.AND.S(J2+1,9).LT.S(J2+3,9).AND.
     2S(J2+2,9).LT.S(J2+3,9).AND.S(J2+4,7).GT.1.58.AND.
     3S(J2+1,1).GT.0.67.AND.S(J2+2,1).GT.0.67) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=44
      IF (BALL) GO TO 300
C
C     *** 45 ***
      BALL=.FALSE.
      IF (S(J2,9).GT.1.20.AND.S(J2+3,9).GT.1.24.AND.
     1S(J2+3,1).GT.1.01.AND.S(J2+4,7).GT.1.58.AND.S(J2+5,7).GT.1.58.AND.
     2S(J2+1,1).LT.S(J2+3,1).AND.S(J2+2,9).LT.S(J2+3,9)) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=45
      IF (BALL) GO TO 300
C
C     *** 46 ***
      BALL=.FALSE.
      IF (S(J2,9).GT.0.98.AND.S(J2+3,9).GT.1.25.AND.
     1S(J2+3,1).GT.1.16.AND.S(J2+2,9).LT.S(J2+3,9).AND.
     2S(J2+1,9).LT.S(J2+3,9).AND.S(J2+4,7).GT.1.58.AND.
     3S(J2-1,9).LT.S(J2+3,9)) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=46
      IF (BALL) GO TO 300
C
C     *** 47 ***
      BALL=.FALSE.
      IF ((K+1).GT.KM) GO TO 70
      IF (S(J2,9).LT.1.10.AND.S(J2+3,9).GT.1.57.AND.
     1S(J2+2,9).GE.S(J2+3,9).AND.S(J2+1,9).LT.1.10.AND.M(J2+1).NE.
15.AND.S(J2+4,7).GT.0.96
     2.AND.S(J2-1,9).GT.0.98.AND.(J2+3).GE.H(K+1)) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=47
      IF (BALL) GO TO 300
C
C     *** 48 ***
      BALL=.FALSE.
      IF (S(J2,9).LT.1.25.AND.S(J2+3,9).GT.S(J2,9).AND.
     1S(J2+3,1).GT.1.16.AND.(J2+4).GE.H(K+1).AND.M(J2+1).NE.15.AND.
     2S(J2+2,1).GT.0.69.AND.S(J2-1,1).GT.1.01) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=48
      IF (BALL) GO TO 300
C
C     *** 49 ***
   70 BALL=.FALSE.
      IF ((J2+5).GT.NN) GO TO 80
      IF (S(J2,9).LT.1.25.AND.S(J2+3,9).GT.S(J2,9).AND.
     1S(J2+3,1).GT.1.08.AND.S(J2+2,9).LT.S(J2+3,9).AND.
     2S(J2+1,1).LT.S(J2+3,1).AND.S(J2+4,1).LE.0.69.AND.
     3S(J2+5,7).GT.1.58) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=49
      IF (BALL) GO TO 300
C
C     *** 50 ***
      BALL=.FALSE.
      IF ((K+1).GT.KM) GO TO 80
      IF (S(J2,9).GT.1.25.AND.S(J2+1,1).GT.1.01.AND.
     1S(J2+2,1).GT.1.06.AND.S(J2+3,1).GT.1.16.AND.S(J2+4,1).GT.1.13.AND.
     2S(J2-1,1).GT.1.16.AND.S(J2-2,1).GT.1.01.AND.S(J2-3,1).GT.1.16.AND.
     3(J2+5).GE.H(K+1)) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=50
      IF (BALL) GO TO 300
C
C     *** 51 ***
   80 BALL=.FALSE.
      IF ((J2+3).GT.NN) GO TO 90
      IF (S(J2,9).GT.1.57.AND.S(J2+3,9).GE.S(J2,9).AND.
     1S(J2+2,9).GE.S(J2,9).AND.S(J2+1,9).LT.1.57.AND.
     2S(J2+4,9).LT.1.57.AND.M(J2+1).NE.15.AND.S(J2+2,1).GT.1.16.AND.
     3S(J2+3,1).GT.1.16) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=51
      IF (BALL) GO TO 300
C
C     *** 52 ***
      BALL=.FALSE.
      IF ((K+1).GT.KM) GO TO 90
      IF (S(J2,9).LT.1.25.AND.S(J2+3,9).GT.S(J2,9).AND.
     1S(J2+3,1).GT.1.08.AND.S(J2+2,9).LT.S(J2+3,9).AND.S(J2+1,1).
LT.S(J2+3,1).AND.S(J2-1
     2,9).LT.S(J2+3,9).AND.(J2+3).GE.H(K+1)) BALL=.TRUE.
      IF (BALL) J2=J2+3
      IF (BALL) V3=52
      IF (BALL) GO TO 300
C
C     .... J2 = J2-4
C
C     *** 53 ***
   90 BALL=.FALSE.
      IF ((J2-6).LE.0) GO TO 100
      IF ((P(J2-4,1)*P(J2-3,2)*P(J2-2,3)*P(J2-1,4)).GT.0.00007500.AND.
     1S(J2-1,9).LT.0.98.AND.S(J2-2,9).LT.0.98.AND.M(J2-5).EQ.12.AND.
     2S(J2-4,9).LT.1.57.AND.S(J2-6,9).LT.1.77) BALL=.TRUE.
      IF (BALL) J2=J2-5
      IF (BALL) V3=53
      IF (BALL) GO TO 300
C
C     *** 54 ***
      BALL=.FALSE.
      T1=0
      T2=0
      T5=0
      T1=S(J2-4,1)+S(J2-3,1)+S(J2-2,1)+S(J2-1,1)
      T2=S(J2-4,2)+S(J2-3,2)+S(J2-2,2)+S(J2-1,2)
      T5=S(J2-4,5)+S(J2-3,5)+S(J2-2,5)+S(J2-1,5)
      PRINT 95,T1,T2,T5
   95 FORMAT(' ',30X,'T1,T2,T5',3(F7.3),' STEP 54, J2-4 ,RMJ2')
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J2-5,9).GT.0.96.AND.
     1S(J2-5,1).GT.1.01.AND.S(J2-6,9).LT.S(J2-5,9).AND.
     2S(J2-6,1).LT.S(J2-5,1)) BALL=.TRUE.
      IF (BALL) J2=J2-5
      IF (BALL) V3=54
      IF (BALL) GO TO 300
C
C     *** 55 ***
  100 IF ((J2-5).LE.0) GO TO 110
      BALL=.FALSE.
      T1=0
      T2=0
      T5=0
      T1=S(J2-3,1)+S(J2-2,1)+S(J2-1,1)+S(J2,1)
      T2=S(J2-3,2)+S(J2-2,2)+S(J2-1,2)+S(J2,2)
      T5=S(J2-3,5)+S(J2-2,5)+S(J2-1,5)+S(J2,5)
      PRINT 105,T1,T2,T5
  105 FORMAT(' ',30X,'T1,T2,T5',3(F7.3),' STEP 55, J2-4 ,RMJ2')
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J2-3,1).LE.0.69.AND.
     1S(J2-4,1).GT.1.16.AND.S(J2-4,9).GT.1.10.AND.S(J2-5,1).GT.1.01.AND.
     2S(J2-5,9).LT.S(J2-4,9)) BALL=.TRUE.
      IF (T5.GT.T1.AND.T5.GT.T2.AND.S(J2-3,1).LE.0.69.AND.S(J2-4,1).GT.
     11.01.AND.S(J2-4,9).GT.0.96.AND.S(J2-5,9).GT.1.10) BALL=.TRUE.
      IF (BALL) J2=J2-4
      IF (BALL) V3=55
      IF (BALL) GO TO 300
C
C     .... J2 = J2-5
C
C     *** 56 ***
      IF ((J2-7).LE.0) GO TO 110
      BALL=.FALSE.
      IF (S(J2,2).GT.1.47.AND.S(J2-1,2).GT.1.47.AND.
     1S(J2-2,2).GT.1.37.AND.S(J2-4,2).GT.1.47.AND.S(J2-3,2).GT.1.47.AND.
     2S(J2-5,1).GT.1.11.AND.S(J2-5,9).GT.1.01.AND.S(J2-5,2).LE.0.75.AND.
     3S(J2-6,2).LE.0.75.AND.S(J2-7,1).GT.1.11.AND.S(J2+1,1).LE.0.69.AND.
     4S(J2+2,1).LE.0.69.AND.S(J2+3,1).LE.0.69) BALL=.TRUE.
      IF (BALL) J2=J2-5
      IF (BALL) V3=56
      IF (BALL) GO TO 300
C
C     .... J2 = J2 + 4
C
  110 IF ((J2+5).GT.NN) GO TO 130
C
C     *** 57 ***
      BALL=.FALSE.
      IF (S(J2,9).LT.1.10.AND.S(J2,1).GT.0.69.AND.
     1S(J2+4,9).GT.S(J2,9).AND.S(J2+1,9).LT.S(J2+4,9).AND.
     2S(J2+2,9).LT.S(J2+4,9).AND.S(J2+5,9).LT.S(J2+4,9).AND.
     3S(J2+1,1).GT.1.01.AND.S(J2+2,1).GT.1.08.AND.M(J2+3).NE.15.AND.
     4S(J2-1,9).GT.1.10.AND.S(J2+3,9).LT.S(J2+4,9)) BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=57
      IF (BALL) GO TO 300
C
C     *** 58 ***
      BALL=.FALSE.
      IF (S(J2,9).LT.1.25.AND.S(J2+4,9).GT.S(J2,9).AND.
     1S(J2+4,1).GT.1.16.AND.S(J2+5,7).GT.1.58.AND.
     2S(J2+4,9).GT.S(J2+3,9).AND.S(J2+3,1).GT.1.01.AND.
     3S(J2+2,1).GT.0.69.AND.S(J2+1,1).GT.0.67.AND.S(J2-1,9).GT.1.10)
     4BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=58
      IF (BALL) GO TO 300
C
C     *** 59 ***
      BALL=.FALSE.
      IF (S(J2,9).LT.1.10.AND.S(J2,1).GT.0.69.AND.S(J2+4,9).GT.1.08.AND.
     1S(J2+4,1).GT.1.16.AND.S(J2+3,1).GT.0.69.AND.S(J2+2,1).GT.0.69.AND
     2.M(J2+1).NE.15.AND.S(J2+5,9).LT.S(J2+4,9).AND.S(J2+2,9).GT.
0.98.AN
     3D.S(J2+3,9).GT.0.98.AND.S(J2-1,9).GT.1.10) BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=59
      IF (BALL) GO TO 300
C
C     *** 60 ***
      IF ((K+1).GT.KM) GO TO 120
      BALL=.FALSE.
      T1=0
      T2=0
      T5=0
      TT=0
      T1=S(J2+1,1)+S(J2+2,1)+S(J2+3,1)+S(J2+4,1)
      T2=S(J2+1,2)+S(J2+2,2)+S(J2+3,2)+S(J2+4,2)
      T5=S(J2+1,5)+S(J2+2,5)+S(J2+3,5)+S(J2+4,5)
      TT=P(J2+1,1)*P(J2+2,2)*P(J2+3,3)*P(J2+4,4)
      PRINT 115,T1,T2,T5,TT
  115 FORMAT(' ',30X,'T1,T2,T5,TT',3(F7.3),F13.9,
     1' STEP 60, J2+4 , RMJ2')
      IF ((T5.LT.T1.OR.T5.LT.T2).AND.TT.LT.0.00007500.AND.
     1S(J2-1,9).GE.1.10.AND.S(J2,9).LT.1.10.AND.S(J2,1).GT.0.69.AND.
     2S(J2+4,9).GT.1.08.AND.S(J2+4,1).GT.1.16.AND.S(J2+3,1).GT.0.69.AND.
     3S(J2+2,1).GT.1.16.AND.M(J2+1).NE.15.AND.(J2+5).GE.H(K+1))
     4BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=60
      IF (BALL) GO TO 300
C
C     *** 61 ***
      BALL=.FALSE.
      IF (S(J2,9).LT.1.25.AND.S(J2+4,9).GT.S(J2,9).AND.
     1S(J2+4,1).GT.1.16.AND.S(J2+5,7).GT.1.24.AND.S(J2+1,1).GT.0.98.AND.
     2S(J2+2,1).GT.1.01.AND.S(J2+3,1).GT.1.01.AND.S(J2+2,7).LT.0.96.AND.
     3S(J2+3,7).LT.0.96.AND.(J2+4).GE.H(K+1).AND.S(J2-1,9).GT.1.10)
     4BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=61
      IF (BALL) GO TO 300
C
C     *** 62 ***
  120 BALL=.FALSE.
      IF ((J2+6).GT.NN) GO TO 130
      IF (S(J2,9).LT.1.25.AND.S(J2+4,9).GT.1.57.AND.
     1S(J2+5,7).GT.1.49.AND.S(J2+1,1).GT.1.01.AND.
     2S(J2+2,9).GE.S(J2+3,9).AND.S(J2+3,1).LE.0.69.AND.
     3M(J2+3).NE.15.AND.S(J2+6,7).GT.1.58) BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=62
      IF (BALL) GO TO 300
C
C     *** 63 ***
      BALL=.FALSE.
      IF (S(J2,9).GE.1.57.AND.S(J2,1).GT.1.16.AND.M(J2+1).NE.15.AND.
     1S(J2+2,9).GE.S(J2,9).AND.S(J2+3,1).GT.1.16.AND.
     2S(J2+4,1).GT.1.16.AND.S(J2+4,9).GT.1.10.AND.S(J2+5,7).GT.1.58.AND.
     3S(J2+6,1).LE.0.69) BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=63
      IF (BALL) GO TO 300
C
C     *** 64 ***
      BALL=.FALSE.
      IF (S(J2,1).GT.1.13.AND.S(J2,9).GT.1.10.AND.
     1S(J2-1,1).GT.1.16.AND.S(J2-2,1).GT.1.16.AND.S(J2-4,2).LT.0.55.AND.
     2S(J2+4,1).GT.1.16.AND.S(J2+5,1).LE.0.69.AND.S(J2+6,1).LE.0.69.AND.
     3S(J2+3,1).GT.1.16.AND.S(J2+2,1).GT.1.13.AND.S(J2+2,2).LT.0.75.AND.
     4M(J2+1).NE.15) BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=64
      IF (BALL) GO TO 300
C
C     *** 65 ***
  130 BALL=.FALSE.
      IF ((J2+4).GT.NN) GO TO 300
      IF (S(J2,9).LT.1.10.AND.S(J2+1,1).LE.0.69.AND.M(J2+1).NE.15.AND.
     1S(J2+2,1).LE.0.69.AND.M(J2+2).NE.15.AND.S(J2+3,9).GE.1.57.AND.
     2S(J2+4,1).GT.1.16.AND.S(J2+4,9).GT.0.98.AND.S(J2+3,1).GT.1.08.AND.
     3S(J2-1,1).GT.1.16) BALL=.TRUE.
      IF (BALL) J2=J2+4
      IF (BALL) V3=65
      IF (BALL) GO TO 300
C
C     .... J2 = J2+5
C
      BALL=.FALSE.
      IF ((J2+6).GT.NN) GO TO 300
C
C     *** 66 ***
      IF (S(J2,9).GT.0.98.AND.S(J2-1,9).LE.S(J2,9).AND.
     1S(J2-2,9).GE.S(J2,9).AND.S(J2,1).GT.1.16.AND.
     2S(J2+5,9).GT.S(J2,9).AND.S(J2+5,1).GT.1.01.AND.
     3S(J2+2,1).GT.1.16.AND.S(J2+3,1).GT.1.01.AND.S(J2+6,7).GE.1.58.AND.
     4S(J2+4,2).LE.0.75) BALL=.TRUE.
      IF (BALL) J2=J2+5
      IF (BALL) V3=66
      IF (BALL) GO TO 300
C
C     *** 67 ***
      IF ((J2+7).GT.NN) GO TO 300
      BALL=.FALSE.
      IF (S(J2,9).GT.0.98.AND.S(J2+5,9).GT.1.57.AND.
     1(J2+6).GE.H(K+1).AND.M(J2+1).NE.15.AND.M(J2+2).NE.15.AND.
     2S(J2-1,1).GT.1.16.AND.S(J2+3,1).GT.1.01.AND.S(J2+4,1).GT.1.08.AND.
     3S(J2+6,1).GT.1.16.AND.S(J2+7,1).GT.1.16) BALL=.TRUE.
      IF (BALL) J2=J2+5
      IF (BALL) V3=67
      IF (BALL) GO TO 300
C
C     *** 68 ***
      BALL=.FALSE.
      IF (S(J2,9).GT.1.10.AND.S(J2,1).GT.1.16.AND.
     1S(J2-2,1).GT.1.11.AND.M(J2+1).EQ.M(J2+2).AND.
     2S(J2+1,9).GT.1.24.AND.S(J2+5,1).GT.1.16.AND.S(J2+5,9).GT.1.10.AND.
     3S(J2+6,7).GT.1.58.AND.S(J2+4,1).GT.1.01.AND.S(J2+3,1).GT.0.77.AND.
     4S(J2+7,2).GT.1.38) BALL=.TRUE.
      IF (BALL) J2=J2+5
      IF (BALL) V3=68
      IF (BALL) GO TO 300
C
C     .... J2 = J2+6
C
      BALL=.FALSE.
      IF ((J2+7).GT.NN) GO TO 300
C
C     *** 69 ***
      IF (S(J2,1).GT.1.16.AND.S(J2,1).LT.1.25.AND.
     1S(J2+6,1).GT.1.00.AND.S(J2+2,9).GT.1.57.AND.S(J2+3,1).GT.1.16.AND.
     2S(J2+4,9).GT.1.24.AND.S(J2+5,1).GT.0.69.AND.S(J2+6,2).LT.0.74.AND.
     3S(J2+1,1).GT.0.69.AND.S(J2+7,1).LE.0.69) BALL=.TRUE.
      IF (BALL) J2=J2+6
      IF (BALL) V3=69
      IF (BALL) GO TO 300
C
C     *** 70 ***
C
C     V3=80 WHEN THE C-TERMINAL ADJUSTMENT IS DUE TO STRONG B-TURN
C     POTENTIAL (THROUGH THE PROCEDURE OF REPEATING THE B-TURN CHECK).
C
      BALL=.FALSE.
      IF (S(J2,9).GT.0.98.AND.S(J2,1).GT.1.01.AND.
     1S(J2+6,9).GT.S(J2,9).AND.S(J2+6,1).GT.1.08.AND.
     2S(J2+7,1).LE.0.69.AND.S(J2+2,9).GT.1.20.AND.S(J2+3,1).GT.1.16.AND.
     3S(J2+4,1).GT.1.16.AND.S(J2+1,1).GT.0.67.AND.S(J2+5,1).GT.0.69.AND.
     4S(J2+5,9).LT.S(J2+6,9).AND.S(J2-1,1).GT.1.01.AND.
     5S(J2-2,1).GT.1.13) BALL=.TRUE.
      IF (BALL) J2=J2+6
      IF (BALL) V3=70
      IF (BALL) GO TO 300
C
C     IF V3=0 NONE OF THE CONDITIONS LISTED IN THE SUBROUTINES M0J2 AND
C     RMJ2 FIT THE CURRENTLY TESTED SEGMENT. IN OTHER WORDS J2 HAS NOT
C     CHANGED.
C
  300 K3=J2
      IF (.NOT.BALL.AND.V3.NE.80) V3=0
C
C     TO PRINT OUT THE FINAL VALUES FOR J1,J2.
C     TO RETURN TO SUBROUTINE ONE TO START THE WHOLE PROCEDURE AGAIN.
C
      PRINT 301,J1,J2,V2,V3
  301 FORMAT('0',25X,'EVENTUAL HELIX FROM J1 : ',I5,5X,'TO J2:',I5,14X
     1,'*** V2,V3:',2(I5),' ***'//)
      RETURN
      END

Efficiency of the β-sheet prediction

As indicated in the previous section, strict adherence to Chou and Fasman's set of rules caused a certain number of regions to be missed and led to some differences between the boundaries of the areas predicted in this study and those of Chou and Fasman and of X-ray analysis (Table 6). Analysis of the results obtained showed that the observations made for the helical search could also be applied to the β-sheet. This involved moving the boundaries J1 and J2 to a more suitable position through consideration of the boundary conformational parameters P_βN, P_βC, P_nβN and P_nβC of the neighbouring residues.

Some examples of boundary adjustment for predicted β-sheet regions are listed below:

(1) J1 = J1 - 2: Concanavalin A: 49-57

Val (47) - Gly (48) - Thr (49) - Ala (50) - His (51) - Ile (52)

Val (47) is listed second for its P_βN, and it is a strong β-former. Hence, besides the fact that its presence balances
the breaker Gly (48), it also ensures a very stable N-boundary to the predicted β-sheet.

(2) J1 = J1 + 3: α-Chymotrypsin: 36-42

Gln (34) - Asp (35) - Lys (36) - Thr (37) - Gly (38) - Phe (39) - ...

Besides the good P_βN of the residue Phe (39), by moving J1 to position J1 + 3, Lys (36) and Gly (38), two β-sheet breakers, are avoided, as well as the tetrapeptide 35-38, which exhibits β-turn potential.

(3) J2 = J2 + 3: Cytochrome b5: 73-76

[Sequence diagram of residues 72-82; residue labels not recoverable from the scan.]

The region 73-76 contains enough β-sheet formers to balance the addition of two breakers, Gly (77) and Glu (78). Leu has been ranked fifth for its P_βC and the residues His (80), Asp (81) and Asp (82) possess good P_nβC.

(4) J2 = J2 - 3: Concanavalin A: 25-32

Asp (28) - Ile (29) - Lys (30) - Ser (31) - Val (32) - Arg (33) - Ser (34) - Lys (35)

Although Ile 29 has a lower P_βC than Val 32, the new region 25-29 still has a stable C-boundary and is itself more stable because of the elimination of the two breakers Lys 30 and Ser 31. In fact, region 25-32 has 4 breakers out of 8 residues.

(5) J2 = J2: Carboxypeptidase A: 277-281

Tyr (277) - Gly (278) - Phe (279) - Leu (280) - Leu (281) - Pro (282) - Ala (283) - Ser (284) - Gln (285)

Considering its neighbouring residues, Leu 281 appears to be a good choice for the C-boundary since it is ranked fifth for its P_βC and is a β-former. The residues Pro 282 and Ser 284 exhibit good P_nβC, which may favor the stabilization of the sheet C-terminal.
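The balancing argument of example (1) can be checked with a short calculation. The following is only a minimal sketch in Python, not part of the thesis program (which is written in FORTRAN): it averages the β-sheet conformational parameter over the two residues gained when J1 moves from 49 to 47. The P_β values are assumed from the table published by Chou and Fasman (1978) and may differ from the parameter table read by the program; the helper name `mean_p_beta` is hypothetical. The 1.05 cut-off is the one applied to <PB> in subroutine FIRS.

```python
# P_beta values assumed from Chou and Fasman (1978); the thesis program
# reads its own parameter table, which may differ from these numbers.
P_BETA = {
    "Val": 1.70, "Gly": 0.75, "Thr": 1.19,
    "Ala": 0.83, "His": 0.87, "Ile": 1.60,
}

SHEET_CUTOFF = 1.05  # cut-off applied to <PB> in subroutine FIRS


def mean_p_beta(residues):
    """Average P_beta over a list of three-letter residue names."""
    return sum(P_BETA[name] for name in residues) / len(residues)


# Residues 47 and 48 of concanavalin A, added by the shift J1 = J1 - 2.
extension = ["Val", "Gly"]
average = mean_p_beta(extension)
print(round(average, 3), average > SHEET_CUTOFF)
```

Under these assumed values the strong former Val more than offsets the breaker Gly, so the two-residue extension by itself stays above the cut-off.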
These boundary analyses for the β-sheet prediction were elaborated in two extra subroutines added to the end of the propagation procedure (subroutine FOUR deals with the N-boundary adjustment and subroutine FIVE with the C-boundary adjustment). Again, it was recognized that such analyses were quite tedious and did not always ensure completely satisfying results due to the complexity of protein arrangement.

The nucleation procedure was also subjected to some modifications to reduce the number of missing residues. In most cases, once an area with β-sheet potential was located, the nucleation search would start again from its (C-terminal + 1) residue to avoid repetition in the same area (cf. subroutine FIRS). However, for some proteins (e.g. bovine colostrum inhibitor, glucagon, Black Mamba Toxin K and Russell's Viper Venom), such a procedure resulted in the omission of some regions (Table 6, p. 205). The problem was solved by starting the search again every time from the (N-terminal + 1) residue of the previous fragment. The major drawback of such a procedure was the tedious repetition of the search for high molecular weight proteins. The proteins for which the new procedure improved the quality of the prediction were: bovine colostrum inhibitor, glucagon, Black Mamba Toxin K and Russell's Viper Venom.
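The difference between the two restart policies can be illustrated with a deliberately simplified scan. This is a sketch under stated assumptions, not the FIRS logic itself (which also propagates each segment in both directions): the window test borrows the counts used in subroutine SECO (at least three residues with P_β of 1.05 or more, fewer than two with P_β of 0.75 or less, in a five-residue window), the function names `nucleates` and `scan` are hypothetical, and the P_β profile is invented for illustration only.

```python
def nucleates(window, former=1.05, breaker=0.75):
    """Window test borrowed from SECO: >= 3 formers and < 2 breakers."""
    formers = sum(1 for p in window if p >= former)
    breakers = sum(1 for p in window if p <= breaker)
    return formers >= 3 and breakers < 2


def scan(p_beta, restart, size=5):
    """Return the 0-based (start, end) windows that pass the test.

    restart="c_terminal": resume just after the end of each hit.
    restart="n_terminal": resume one residue after the start of each hit.
    """
    hits = []
    g = 0
    while g + size <= len(p_beta):
        if nucleates(p_beta[g:g + size]):
            hits.append((g, g + size - 1))
            g = g + size if restart == "c_terminal" else g + 1
        else:
            g += 1
    return hits


# Hypothetical P_beta profile, invented for illustration only.
profile = [1.70, 1.60, 0.75, 1.38, 0.83, 1.30, 1.19, 0.54, 0.37, 0.75]
print(scan(profile, "c_terminal"))
print(scan(profile, "n_terminal"))
```

With this profile the C-terminal restart reports a single window, whereas the N-terminal restart also reports the overlapping windows that reach three residues further, at the cost of testing many more windows. This mirrors the trade-off between omitted regions and repeated searching described above.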
These proteins have molecular weights of 7,511, 3,483, 6,566 and 6,850, respectively, which are lower than those of the other proteins used in this study. Hence it is possible that in low-molecular-weight proteins, short-range interactions between adjacent residues may lead to the formation of β-sheets under circumstances not encountered in bigger proteins.

The requirement of less than one third of β-sheet breakers out of 5 residues may sometimes cause a section of a protein to be ignored (e.g. α-chymotrypsin 197-201, ribonuclease 116-124). In addition, the presence of Pro 198 and Pro 117 in α-chymotrypsin and ribonuclease, respectively, is unfavorable to β-sheet nucleation according to rule B.1. Therefore, in the modified program, two distinct decisions were made:

(1) for α-chymotrypsin, once segment 197-201 has been considered as a possible β-sheet, boundary analysis allows shifting of the entire fragment to the right. The final value 199-204, besides having the advantage of being closer to X-ray results (199-203), also has better P_βN (Leu) and P_βC (Asn).

(2) for ribonuclease, instead of shifting the fragment 116-120, the addition of an extra tetrapeptide (121-124) with acceptable β-sheet potential makes the presence of Pro 117 less unfavorable to β-sheet conformation and it eventually leads to the prediction of a β-sheet area with very favorable P_βN (Val) and P_βC (Val).

As Lys
did not occur often at the N-terminal of a β-sheet section, the change of its assignment from a β-sheet breaker to a β-sheet former could not readily be made because of the possible result of erroneous predictions. However, in the case of ribonuclease 61-65 it was necessary to have Lys 61 equivalent to a h_β (P_βN(Lys) = ...) so that this section did not violate the requirement of two thirds of h_β's.

The presence of a strong β-sheet breaker such as Asp could interrupt the preliminary search of areas with β-sheet potential (e.g. papain 4-9). This disruption resulted in an inability to start any nucleation procedure on the two fragments arising from this disruption (papain 3-6, 7-10). These two fragments could not by themselves meet the requirement of two thirds of h_β's. Hence in such conditions the nucleation rule was slightly modified so that eventually, with the combination of boundary analysis, one could still locate an appropriate β-sheet area. A somewhat similar situation was encountered with subtilisin 44-51. The two adjacent fragments 42-45 and 49-52 could not be a starting point for β-sheet formation. They were separated by a section with quite low β-sheet potential (Gly-Gly-Ala: <P_β> = 0.76). Nevertheless the entire section 44-51 was detected as β-sheet by Chou and Fasman (1974b) and by X-ray diffraction (Chou and Fasman, 1974b). It
also has good end residues, Val (44) and Val (51), and <P_β> is greater than <P_α> (1.045 versus 1.040).

In summary, by taking into account the important contribution of the boundary conformational parameters (subroutines FOUR and FIVE) and the necessity of allowing more flexibility to the nucleation rule (subroutine SECO) under the specific conditions previously mentioned, the following program was adopted for the β-sheet search. Only the different subroutines are presented here since the main program for β-sheet prediction is identical to the one used for α-helix search, except that the β-sheet boundary conformational parameters replace those pertaining to α-helix characterization.

      SUBROUTINE FIRS
C
C     PRELIMINARY SEARCH FOR B-SHEET REGIONS
C
C     PURPOSE
C        PRELIMINARY SEARCH FOR B-SHEET REGIONS BY APPLYING RULE 2 :
C        <PB> > 1.05 AND <PA> < <PB>
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,V3,V4,V5,V6,V7,V8,Q
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,10),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,V4,V5,V6,V7,V8,Q,G,F,H,U,D,NN,
     1NW,KX,MA,MB,MC,MD,L,I,L1,L2,L3,J1,J2,N,K1,K2,V1,V2,IM,M,K3,K4,V3,
     2BYE,BALL,HELLO,MOVE
C
C     DESCRIPTION OF PARAMETERS
C        H      - BOUNDARY RESIDUES OF A PREDICTED REGION
C        H(I)   - N-TERMINAL RESIDUE
C        H(I+1) - C-TERMINAL RESIDUE
C        MB     - FIRST RESIDUE OF A SECTION TO BE CONSIDERED FOR THE
C                 PRELIMINARY SEARCH BUT WILL CHANGE DURING
C                 N-PROPAGATION (MB-1)
C        MA     - FIRST RESIDUE OF A SECTION TO BE CONSIDERED FOR THE
C                 PRELIMINARY SEARCH BUT WILL CHANGE DURING
C                 C-PROPAGATION (MA+1)
C        K1     - FIRST RESIDUE OF A SECTION TO BE CONSIDERED FOR THE
C                 PRELIMINARY SEARCH
C        K2     - LAST RESIDUE OF A SECTION TO BE CONSIDERED FOR THE
C                 PRELIMINARY SEARCH
C        A1     - AVERAGE <PA> OF A SECTION
C        A2     - AVERAGE <PB> OF A SECTION
C        N      - SWITCHING VALUE FOR DECISION MAKING
C                 N=1 N-PROPAGATION
C                 N=2 C-PROPAGATION
C        I      - COUNTER USED WITH THE ARRAY H TO STORE THE BOUNDARY
C                 RESIDUES OF PREDICTED REGIONS
C
C     THE SEARCH WILL STOP WHEN THE LAST SEGMENT AT THE C-TERMINAL HAS
C     ONLY 2 AMINO ACID RESIDUES. IT IS NOT LONG ENOUGH FOR THE B-SHEET
C     STATE
C
   10 I=2
      H(I)=0
      H(I-1)=0
      NW=NN-2
      MB=1
      MA=1
      LP=1
   20 N=0
   25 K2=MA+2
      HELLO=.FALSE.
      IF (MB.EQ.0) HELLO=.TRUE.
      IF (HELLO) K1=H(I)+1
      IF (.NOT.HELLO) K1=MB
C
C     TO CALCULATE <PA>,<PB> FOR A POLYPEPTIDE CHAIN STARTING AT
C     POSITION K1 AND ENDING AT POSITION K2
C
      T1=0
      T2=0
      DO 30 MC=K1,K2
      T1=T1+S(MC,1)
      T2=T2+S(MC,2)
   30 CONTINUE
      A2=T2/(K2-K1+1)
      A1=T1/(K2-K1+1)
C
C     IF <PB> IS LESS THAN 1.05 THEN TO START THE SEARCH AGAIN FROM
C     NEXT POSITION K1+1
C
      IF (A2.LT.1.05-1.E-6) GO TO 35
C
C     TO START THE SEARCH AGAIN FROM NEXT POSITION MB+1 WHEN
C     <PB> < <PA> EVEN IF <PB> > 1.05. THE SEARCH IS STOPPED WHEN THE
C     LAST AMINO ACID RESIDUE HAS BEEN REACHED
C
      IF (A1.GT.A2.AND.K2.EQ.NN.AND.(K2+1-K1).EQ.3) GO TO 70
      IF (A1.GT.A2.AND.K2.EQ.NN.AND.(K2+1-K1).GT.3) GO TO 55
      IF (A1.GT.A2.AND.K2.NE.
NN) GO TO 35
C
C     IF <PB> > <PA> AND <PB> > 1.05 TO CONTINUE THE PROPAGATION AT
C     EITHER N- OR C-TERMINAL SIDE (N=1 INDICATES N-TERMINAL
C     PROPAGATION, N=2 C-TERMINAL PROPAGATION) UNLESS WE REACHED THE
C     LAST RESIDUE OF THE SEQUENCE (NN)
C
      IF (N.EQ.2.AND.K2.EQ.NN) GO TO 45
      IF (N.EQ.2.AND.K2.NE.NN) GO TO 40
C
C     TO START N-TERMINAL PROPAGATION WHEN <PB> > 1.05 AND <PB> > <PA>
C
      MB=MB-1
      N=1
C
C     AS LONG AS THE N-TERMINAL PROPAGATED PEPTIDE DOES NOT OVERLAP
C     WITH THE PREVIOUS SHEET THE PEPTIDE CAN BE ELONGATED ON THAT
C     SIDE, OTHERWISE TO SWITCH TO C-TERMINAL PROPAGATION
C
      BYE=.FALSE.
      IF (MB.EQ.H(I)) BYE=.TRUE.
      IF (BYE) N=2
      IF (BYE) GO TO 40
C
C     N-TERMINAL PROPAGATION IS STOPPED WHEN MB OR K1 = 1, TO SWITCH
C     THEN TO C-TERMINAL PROPAGATION
C
      BALL=.FALSE.
      IF (MB.LE.H(I-1)) BALL=.TRUE.
      IF (BALL) MB=MB+1
      IF (BALL) MA=MA+1
      IF (BALL) N=2
      IF (MA.GT.NW) GO TO 45
      IF (BALL) GO TO 25
      IF (MB.GT.H(I-1)) GO TO 25
C
C     TO START C-TERMINAL PROPAGATION WHEN IT IS STOPPED AT THE
C     N-TERMINAL SIDE.
C     IF BOTH SIDES CANNOT BE ELONGATED ANYMORE THEN THE SEGMENT BEING
C     ANALYZED SO FAR IS RECOGNIZED AS HAVING SHEET POTENTIAL
C
   35 MB=MB+1
      IF (N.EQ.2) GO TO 55
      IF (N.EQ.1) N=2
   40 MA=MA+1
      IF (MA.LE.NW) GO TO 25
      IF (MA.GT.NW) GO TO 70
C
C     AFTER PRINTING OUT THE AREA WITH SHEET POTENTIAL THE SEARCH IS
C     STOPPED BECAUSE WE GOT TO THE LAST RESIDUE IN THE SEQUENCE
C
   45 I=I+1
      H(I)=K1
      I=I+1
      H(I)=K2
      PRINT 50,H(I-1),H(I)
   50 FORMAT('0',30X,I6,10X,I6)
      IM=I
      GO TO 70
C
C     TO PRINT OUT THE AREA WITH SHEET POTENTIAL (H(I-1),H(I))
C
   55 I=I+1
      H(I)=K1
      I=I+1
      H(I)=K2-1
      PRINT 60,H(I-1),H(I)
   60 FORMAT('0',30X,I6,10X,I6)
C
C     TO START THE SEARCH AGAIN EITHER FROM (H(I-1)+1) OR (H(I)+1)
C
      MB=H(I)+1
      MA=H(I)+1
      IM=I
      IF (MA.LE.NW) GO TO 20
C
C     TO PRINT OUT THE LAST VALUE OF THE COUNTER I (IM) WHICH WILL BE
C     USED IN THE NEXT SUBROUTINE
C
   70 PRINT 75,IM
   75 FORMAT('0',40X,'IM',I4)
      PRINT 90
   90 FORMAT(12X,'SEARCH FOR ACTUAL SHEETS FROM THE POTENTIAL ',
     1'REGIONS')
      PRINT 95
   95 FORMAT(' ',12X,' 1.'//)
      I=2
      Q=1
      CALL SECO
      RETURN
      END

      SUBROUTINE SECO
C
C     SEARCH FOR SHEET NUCLEATION
C
C     PURPOSE
C        SEARCH FOR NUCLEATING REGIONS WHICH SHOULD CONTAIN THREE
C        BETA-FORMERS OUT OF FIVE RESIDUES
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,V3,V4,V5,V6,V7,V8,Q
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,10),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,V4,V5,V6,V7,V8,Q,G,F,H,U,D,NN,
     1NW,KX,MA,MB,MC,MD,L,I,L1,L2,L3,J1,J2,N,K1,K2,V1,V2,IM,M,K3,K4,V3,
     2BYE,BALL,HELLO,MOVE
C
C     DESCRIPTION OF PARAMETERS
C        G  - FIRST RESIDUE OF THE 5 RESIDUE PEPTIDE SUBJECTED TO THE
C             NUCLEATION SEARCH
C        MA - FIFTH RESIDUE OF THE 5 RESIDUE PEPTIDE SUBJECTED TO THE
C             NUCLEATION SEARCH
C        Q  - SWITCHING VALUE FOR DECISION MAKING
C             Q=1 THE CURRENT POTENTIAL AREA IS STILL LONG ENOUGH
C                 (> 3 RESIDUES) TO BE SUBJECTED TO THE NUCLEATION
C                 SEARCH
C             Q=2 THE CURRENT POTENTIAL AREA IS TOO SHORT FOR ANOTHER
C                 SHEET SO TO START WITH THE NEXT POTENTIAL AREA
C
C     REMARKS
C        UNLESS NOTIFIED THE OTHER PARAMETERS STILL HAVE THE SAME
C        DEFINITION
C
C     IF Q=2 THE NUCLEATION SEARCH WILL START ON A NEW POTENTIAL AREA
C     SINCE THE PREVIOUS ONE HAD BEEN THOROUGHLY SCANNED THROUGH. EACH
C     TIME THAT I INCREASES BY 1 THE NEXT POTENTIAL AREA WILL BE
C     ANALYZED
C
      IF (Q.EQ.2) GO TO 25
      I=I+1
      IF (I.GT.IM) GO TO 180
      K1=H(I)
   25 I=I+1
      K2=H(I)
      G=K1
      KX=K2-3
      MA=G+4
      IF (MA.GT.K2) MA=MA-1
C
C     THE RESIDUE ASN CAN BE CONSIDERED AS A B-FORMER AT THE C-TERMINAL
C     OF THE PEPTIDE CHAIN BECAUSE OF ITS GOOD P.BC VALUE
C
      N=G+1+(MA-G)/2
      DO 30 L=N,MA
      IF (M(L).EQ.3) S(L,2)=1.05
   30 CONTINUE
C
C     TO COUNT THE DIFFERENT TYPES OF ASSIGNMENTS (T3) AND THE NUMBER
C     OF BREAKERS (N) IN THE SECTION G-MA
C
      T3=0
      N=0
      DO 35 L=G,MA
      S(L,3)=0
      IF (S(L,2).GE.1.05) S(L,3)=1.0
      T3=T3+S(L,3)
      IF (S(L,2).LE.0.75) N=N+1
   35 CONTINUE
      PRINT 36,G,MA,T3,N
   36 FORMAT(' ',10X,'G : ',I4,5X,'MA: ',I4,5X,'T3 : ',F7.4,5X,'N :',I3,
     18X,'SHEET NUCLEATION')
C
C     IF THERE IS AT LEAST 3 HB AND LESS THAN 2 BB, THE NUCLEATION RULE
C     IS SATISFIED. WE STILL HAVE TO CHECK FOR THE PRESENCE OF PRO OR
C     GLU IN THE NUCLEATING SEGMENT (THEY ARE STRONG B-BREAKERS)
C
      IF (T3.GE.3.0.AND.N.LT.2) GO TO 60
C
C     SOME MODIFICATIONS OF THE RULE WHICH TAKE INTO ACCOUNT THE
C     PRESENCE OF NEIGHBORING RESIDUES FAVORABLE TO SHEET NUCLEATION
C     ALTHOUGH THE SEGMENT MAY CONTAIN MORE THAN ONE THIRD OF
C     SHEET-BREAKERS
C
      IF (T3.GE.3.0.AND.N.GE.2.AND.S(G,2).GE.1.05.AND.S(MA,9).GE.1.50
     1.AND.S(MA-1,9).GE.1.50) GO TO 100
C
      IF (T3.GE.3.0.AND.N.GE.2.AND.M(MA).EQ.10.AND.M(G).EQ.10) GO TO 100
C
      IF (T3.GE.3.0.AND.N.GE.2.AND.M(MA).EQ.19.AND.S(G,2).GE.1.05.AND.
     1(M(G+1).EQ.20.OR.M(G+2).EQ.20.OR.M(G+3).EQ.20)) GO TO 100
C
      IF (T3.GE.3.0.AND.N.GE.2.AND.S(MA,2).LE.0.75) GO TO 75
C
      IF (S(G,2).LE.0.75.AND.T3.GE.3.0.AND.N.EQ.2.AND.M(G+1).EQ.15.AND.
     1M(MA).EQ.5.AND.M(MA-1).EQ.20.AND.M(MA-2).EQ.11) GO TO 100
C
      IF (T3.GE.3.0.AND.N.GE.2.AND.S(G,2).LE.0.75) GO TO 90
C
      IF (T3.GE.2.0.AND.N.EQ.1.AND.M(G+2).EQ.20.AND.M(G+4).EQ.5.AND.M(G)
     1.EQ.12) GO TO 120
C
      IF (M(G).EQ.10.AND.M(MA+1).EQ.20.AND.S(G+1,2).GE.0.93.AND.S(G+2,2)
     1.GE.0.75.AND.S(G+3,2).GE.0.75.AND.M(G-1).EQ.1.AND.M(G-2).EQ.1)
     3GO TO 130
C
      IF ((G-4).LE.0.AND.(MA+2).GT.NN) GO TO 45
      IF (T3.GE.2.0.AND.N.EQ.1.AND.S(G-1,2).GE.0.54.AND.M(G-1).NE.15.AND
     1.S(G-2,2).GE.1.60.AND.S(G-3,2).GE.1.47.AND.S(G-4,2).LE.0.74.AND.S
     2(MA,2).LE.0.74.AND.S(MA-1,2).GT.0.93.AND.S(MA-3,2).GT.1.30.AND.S(
     3MA+1,2).LE.0.75.AND.S(MA+2,2).LE.0.83) GO TO 160
C
45    IF ((G+10).GT.NN.AND.(G-2).LE.0) GO TO 50
      IF (T3.GE.2.0.AND.N.EQ.1.AND.S(G+1,2).LE.0.74.AND.S(G+2,8).GE.1.6
     19.AND.S(G,8).LT.S(G+2,8).AND.M(G+3).EQ.1.AND.S(G+4,2).GT.0.74.AND
     2.S(G+5,2).GT.0.74.AND.M(G+6).EQ.1.AND.S(G+7,2).GT.0.74.AND.S(G+8,
     32).GT.0.93.AND.S(G+9,2).GE.1.60.AND.S(G+10,2).LE.0.74.AND.S(G-1,
     42).LE.0.74.AND.S(G-2,2).LE.0.74) GO TO 170
C
50    IF ((G-5).LE.0.AND.(MA+2).GT.NN) GO TO 55
      IF (T2.GE.2.0.AND.N.EQ.2.AND.S(G-3,2).GE.1.60.AND.S(G-4,2).LE.0.75
     1.AND.S(G-5,2).LE.0.75.AND.S(G-2,2).GE.1.60.AND.M(G-1).EQ.1.AND.S(
     2G,2).GE.1.60.AND.S(G+1,2).GE.0.75.AND.M(G+2).EQ.1.AND.S(G+3,2).GE
     3.1.60.AND.S(MA,2).LE.0.55.AND.S(MA+1,2).LE.0.75.AND.S(MA+2,2).LE.
     40.75) GO TO 160
C
      IF (T3.GE.2.0.AND.N.LT.2.AND.M(G).EQ.20.AND.M(G+2).EQ.14) GO TO
     1120
C
C     IF THE SEGMENT UNDER CONSIDERATION CANNOT SATISFY ANY OF THE
C     ABOVE CONDITIONS THEN THE SEARCH WILL START AGAIN FROM NEXT
C     POSITION G+1
C
55    G=G+1
      IF (G.LE.KX) GO TO 25
      GO TO 20
C
C     IF THERE IS NO GLU NOR PRO IN THE NUCLEATING SEGMENT THEN
C     SUBROUTINE THIR IS CALLED TO CARRY OUT THE PROPAGATION PROCEDURE
C
60    DO 61 L=G,MA
      IF (M(L).EQ.7.OR.M(L).EQ.15) GO TO 65
61    CONTINUE
      CALL THIR
      GO TO 10
C
C     IN SOME INSTANCES, DESPITE THE PRESENCE OF PRO OR GLU THE
C     NUCLEATING AREA REMAINS STABLE BECAUSE OF STRONG B-FORMER
C     RESIDUES
C
65    IF (T3.GE.3.0.AND.N.EQ.1.AND.S(G,2).GE.1.30.AND.M(G).EQ.M(G+2)
     1.AND.M(G).EQ.M(G+3).AND.(G-2).EQ.K3.AND.S(G-1,2).GE.0.75) GO TO
     2150
      IF ((G+8).GT.NN) GO TO 70
      IF (T3.GE.3.0.AND.N.EQ.1.AND.S(G,8).GE.1.65.AND.S(G+1,2).GE.1.19.
     1AND.S(G-1,2).LE.0.75.AND.S(G+2,2).GE.1.30.AND.S(G+4,9).GE.1.50.AND
     2.S(G+5,9).GT.0.79.AND.S(G+6,9).GT.1.79.AND.S(G+7,2).LE.0.75.AND.
     3S(G+8,2).LE.0.75) GO TO 140
C
C     NUCLEATION SEARCH STARTS AGAIN FROM NEXT POSITION G+1
C
70    G=L+1
      IF (G.LE.KX) GO TO 25
      GO TO 20
C
C     TO START N-TERMINAL PROPAGATION WHEN THE PRESENCE OF A
C     SHEET-BREAKER AT THE C-TERMINAL (MA) IMPEDES THE ELONGATION ON
C     THAT SIDE
C
75    MV=MA-1
      DO 76 L=G,MV
      IF (M(L).EQ.7.OR.M(L).EQ.15) GO TO 65
76    CONTINUE
80    BALL=.FALSE.
      IF ((G-1).LE.K3) GO TO 85
      IF (S(G-1,2).GE.1.05) BALL=.TRUE.
      IF (BALL) G=G-1
      IF (BALL) GO TO 80
85    PRINT 86,G,MA
86    FORMAT('0',10X,'PSEUDO-SHEET FROM G TO MA-1',5X,'G:',I5,5X,'MA:',
     1I5/)
      J1=G
      J2=MA-1
      GO TO 115
C
C     TO START C-TERMINAL PROPAGATION WHEN THE PRESENCE OF A
C     SHEET-BREAKER AT THE N-TERMINAL (G) IMPEDES THE ELONGATION ON
C     THAT SIDE
C
90    MU=G+1
      DO 92 L=MU,MA
      IF (M(L).EQ.7.OR.M(L).EQ.15) GO TO 65
92    CONTINUE
      NV=MA+1
      NU=MA+4
      DO 94 L=NV,NU
      IF (S(L,2).GE.1.05) MA=MA+1
      IF (S(L,2).LT.1.05) GO TO 95
94    CONTINUE
95    PRINT 96,G,MA
96    FORMAT('0',10X,'PSEUDO-SHEET FROM G+1 TO MA',5X,'G:',I5,5X,'MA:',
     1I5/)
      J1=G+1
      J2=MA
      GO TO 115
C
C     N- THEN C-TERMINAL PROPAGATION BY ADDING ONE RESIDUE AT A TIME TO
C     THE NUCLEATING SEGMENT. IT IS DIFFERENT FROM THE PROCEDURE IN
C     SUBROUTINE THIR WHERE TETRAPEPTIDES INSTEAD OF SINGLE RESIDUES
C     ARE CONSIDERED FOR ELONGATING THE SEGMENT
C
100   BALL=.FALSE.
      IF ((G-1).LE.K3) GO TO 110
      IF (S(G-1,2).GE.1.05) BALL=.TRUE.
      IF (BALL) G=G-1
      IF (BALL) GO TO 100
C
110   HELLO=.FALSE.
      IF (S(MA+1,2).GE.1.05) HELLO=.TRUE.
      IF (HELLO) MA=MA+1
      IF (HELLO) GO TO 110
      J1=G
      J2=MA
      PRINT 112,J1,J2
112   FORMAT('0',10X,'PSEUDO-SHEET FROM J1:',I5,5X,'TO J2:',I5/)
      GO TO 115
C
C     WHEN THE PROPAGATION HAS BEEN STOPPED ON BOTH SIDES THEN
C     SUBROUTINE FOUR IS CALLED FOR ADJUSTING THE BOUNDARIES TO THEIR
C     MOST FAVORABLE POSITIONS.
C     WHEN RETURNING FROM THE BOUNDARY ANALYSIS IF THE CURRENT
C     POTENTIAL AREA IS NOT LONG ENOUGH FOR ANOTHER SHEET FRAGMENT THEN
C     THE NEXT POTENTIAL AREA WILL BE ANALYZED (Q=1)
C
115   CALL FOUR
118   IF (J2.LT.KX) G=J2+1
      IF (J2.LT.KX) Q=2
      IF (J2.GE.KX) Q=1
      GO TO 10
C
C     TO PRINT OUT THE NUCLEATING SEGMENTS WHICH DO NOT FOLLOW THE
C     COMMON NUCLEATION RULE. SUBROUTINE FOUR IS THEN CALLED TO CARRY
C     OUT THE BOUNDARY ADJUSTMENT
C
120   J1=G
      J2=MA
      PRINT 125,J1,J2
125   FORMAT('0',10X,'PSEUDO-SHEET FROM J1:',I5,5X,'TO J2:',I5/)
      GO TO 115
C
130   J1=G-2
      J2=MA+1
      PRINT 125,J1,J2
      GO TO 115
C
140   J1=G
      J2=MA+3
      PRINT 125,J1,J2
      GO TO 115
C
C     TO CHECK THE NUMBER OF B-BREAKERS (JC) WHICH SHOULD BE LESS THAN
C     ONE THIRD OF THE LENGTH OF THE SEGMENT (JCC)
C
150   JC=0
      J2=MA
      DO 155 L=J1,MA
      IF (S(L,2).LT.0.83) JC=JC+1
155   CONTINUE
      JCC=(MA+1-J1)/3
      IF (JC.LE.2.AND.JC.LT.JCC) PRINT 125,J1,J2
      GO TO 115
C
160   J1=G-3
      J2=MA-1
      PRINT 125,J1,J2
      GO TO 118
C
170   J1=G+2
      J2=MA+6
      PRINT 125,J1,J2
      GO TO 118
C
180   PRINT 185
185   FORMAT('0','END OF PROGRAM')
      RETURN
      END
C
      SUBROUTINE THIR
C
C     PROPAGATION OF THE BETA-SHEET
C
C     PURPOSE
C        TO ADD TO THE NUCLEATING FRAGMENT TETRAPEPTIDES WHICH HAVE
C        <PB> > 1.00 AND WHICH SATISFY THE PROPAGATION SET OF RULES
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,V3,V4,V5,V6,V7,V8,Q
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,10),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,V4,V5,V6,V7,V8,Q,G,F,H,U,D,NN,
     1NW,KX,MA,MB,MC,MD,L,I,L1,L2,L3,J1,J2,N,K1,K2,V1,V2,IM,M,K3,K4,V3,
     2BYE,BALL,HELLO,MOVE
C
C     DESCRIPTION OF
C     PARAMETERS
C        MB - WHETHER IT IS N- OR C-PROPAGATION MB WILL ALWAYS BE THE
C             FIRST LEFT RESIDUE OF THE ADJACENT TETRAPEPTIDE
C        MC - WHETHER IT IS N- OR C-PROPAGATION MC WILL ALWAYS BE THE
C             FOURTH RESIDUE OF THE ADJACENT TETRAPEPTIDE
C        K1 - N-TERMINAL RESIDUE OF THE CURRENT POTENTIAL AREA
C        K2 - C-TERMINAL RESIDUE OF THE CURRENT POTENTIAL AREA
C        F  - SWITCHING VALUE FOR DECISION MAKING
C             F=1 N-PROPAGATION
C             F=2 C-PROPAGATION
C        V1 - COUNTER
C        V2 - COUNTER
C
C     AS LONG AS MB BELONGS TO THE CURRENT POTENTIAL AREA THE
C     N-PROPAGATION CAN BE CARRIED OUT
C
10    V1=0
      V2=0
      V3=0
      V4=0
      V7=0
      F=1
15    V1=V1+1
      MB=G-(4*V1)
      IF (MB.GT.0.AND.MB.GE.K1) GO TO 20
C
C     TO SWITCH TO C-TERMINAL PROPAGATION WHEN THE N-TERMINAL SIDE IS
C     STOPPED. THE VARIABLE MB THEN BECOMES THE FIRST RESIDUE IN THE
C     TETRAPEPTIDE ADDED TO THE C-TERMINAL SIDE
C
      IF (V1.EQ.1) J1=G
      IF (V1.NE.1) J1=G-4*(V1-1)
      F=2
20    T2=0
      T1=0
      IF (F.EQ.1) GO TO 25
      IF (V2.NE.0) MB=MA+1+(4*V2)
      IF (V2.EQ.0) MB=MA+1
      V2=V2+1
      IF (MB.GT.K2) GO TO 50
25    MC=MB+3
      IF (MC.GT.K2.AND.MB.LE.K2) GO TO 50
C
C     CALCULATION OF THE PA,PB OF THE TETRAPEPTIDE MB-MC
C
      DO 30 L=MB,MC
      T1=T1+S(L,1)
      T2=T2+S(L,2)
30    CONTINUE
      PRINT 35,MB,MC,T1,T2
35    FORMAT(' ',10X,'MB:',I4,5X,'MC:',I4,5X,'T1:',F7.4,5X,'T2:',F7.4,
     1'SHEET PROPAGATION')
C
C     IF PA > PB THEN TO SWITCH TO C-PROPAGATION IF N-PROPAGATION HAS
C     BEEN CARRIED OUT, OTHERWISE TO START ELONGATING BOTH SIDES BY ONE
C     RESIDUE AT A TIME
C
      IF (T1.GT.T2) GO TO 45
C
C     IF PB > PA AND <PB> < 1.00 TO TAKE INTO CONSIDERATION THE TYPES
C     OF SHEET RESIDUES IN THE SEGMENT SINCE IT MAY STILL BE VALID FOR
C     THE PROPAGATION
C
      IF (T2.LT.4.0000) GO TO 130
C
C     IF PRO OR GLU OCCURS IN THE PROPAGATED TETRAPEPTIDE THEN EITHER
C     TO SWITCH FROM N-PROPAGATION TO C-PROPAGATION (F=2) OR TO START
C     ELONGATING BOTH SIDES BY ONE RESIDUE AT A TIME
C
      DO 40 L=MB,MC
      IF (M(L).EQ.15.OR.M(L).EQ.7) GO TO 45
40    CONTINUE
      GO TO 150
45    BALL=.FALSE.
      IF (F.EQ.1) BALL=.TRUE.
      IF (BALL) J1=MB+4
      IF (BALL) F=2
      IF (BALL) GO TO 20
C
C     ADDITION OF ONE RESIDUE (HB OR IB) AT A TIME TO THE N-TERMINAL
C     SIDE. WHEN ADDING IB TO EACH END TO CHECK IMMEDIATELY WHETHER THE
C     RULE OF AT LEAST HALF OF FORMERS IS STILL SATISFIED OR NOT
C
50    L1=J1-1
      IF (L1.LT.(G-4)) GO TO 55
      IF (M(L1).EQ.12) S(L1,2)=1.05
      IF (S(L1,2).GE.1.05) J1=L1
      IF (S(L1,2).LE.0.93.AND.S(L1,2).GE.0.83) J1=L1
      IF (S(L1,2).GE.1.05) GO TO 50
C
C     ADDITION OF ONE RESIDUE (HB OR IB) AT A TIME TO THE C-TERMINAL
C     SIDE
C
55    J2=MB-1
60    L2=J2+1
      IF (L2.GT.(MA+4)) GO TO 65
      IF (M(L2).EQ.3) S(L2,2)=1.05
      IF (S(L2,2).GE.1.05) J2=L2
      IF (S(L2,2).LE.0.93.AND.S(L2,2).GE.0.83) J2=L2
      IF (S(L2,2).GE.1.05) GO TO 60
C
C     TO COUNT THE NUMBER OF SHEET-FORMERS IN THE ENTIRE SHEET AREA
C     TO COMPARE THE ACTUAL NUMBER OF FORMERS (T4) TO ITS THEORETICAL
C     ONE (TT : EQUAL TO AT LEAST ONE HALF OF THE SECTION)
C
65    T4=0
      DO 70 L=J1,J2
      S(L,4)=0
      IF (S(L,2).GE.1.05) S(L,4)=1.0
      T4=T4+S(L,4)
70    CONTINUE
      TT=(J2-J1+1)/2.0
      PRINT 75,J1,J2,T4,TT
75    FORMAT(' ',10X,'J1:',I4,5X,'J2:',I4,5X,'T4:',F7.4,5X,'TT:',F7.4,
     14X,'ACTUAL AND THEORIT. # FORMERS FROM J1 TO J2')
C
C     IF THE RULE OF MORE THAN HALF OF SHEET-FORMERS IS SATISFIED THEN
C     TO KEEP ON ADDING HB OR IB TO EACH SIDE OF THE SHEET SECTION
C
      IF (T4.GE.TT.AND.S(J1-1,2).LE.0.75) GO TO 80
      IF (T4.GE.TT.AND.S(J1-1,2).GT.0.75.AND.L1.GE.(G-4)) GO TO 50
80    IF (T4.GE.TT.AND.S(J2+1,2).GT.0.75.AND.L2.LE.(MA+4)) GO TO 60
      IF((T4.GE.TT.AND.S(J2+1,2).LE.0.75).OR.(T4.GE.TT.AND.S(J2+1,2).
     1GT.0.75.AND.L2.GT.(MA+4))) GO TO 115
C
C     IF THE RULE IS NOT SATISFIED THEN TO TAKE AWAY RESIDUES WHICH ARE
C     NOT HB SO THAT EVENTUALLY THERE IS ENOUGH HB IN THE SECTION
C
85    IF (S(J2,2).LT.1.05) GO TO 90
      IF (S(J1,2).LT.1.05) GO TO 95
90    J2=J2-1
      IF (S(J2+1,2).LT.1.05) GO TO 100
95    J1=J1+1
      IF (S(J1-1,2).LT.1.05) GO TO 100
C
C     EVERY TIME A RESIDUE IS TAKEN AWAY THE RULE OF MORE THAN HALF OF
C     SHEET-FORMERS IS CHECKED AGAIN ON THE SHORTENED SECTION
C
100   T4=0
      DO 105 L=J1,J2
      S(L,4)=0
      IF (S(L,2).GE.1.05) S(L,4)=1.0
      T4=T4+S(L,4)
105   CONTINUE
      TT=(J2-J1+1)/2.0
      PRINT 75,J1,J2,T4,TT
      IF (T4.GE.TT) GO TO 115
      IF (T4.LT.TT) GO TO 85
115   PRINT 120,J1,J2
120   FORMAT('0',10X,'PSEUDO-SHEET FROM J1:',I5,5X,'TO J2:',I5/)
C
C     WHEN THE PROPAGATION IS TERMINATED ON BOTH SIDES TO CALL
C     SUBROUTINE FOUR FOR THE BOUNDARY ADJUSTMENT
C
125   CALL FOUR
      IF (J2.LT.KX) G=J2+1
      IF (J2.LT.KX) Q=2
      IF (J2.LT.KX) RETURN
      Q=1
      RETURN
C
C     PRESENCE OF B-BREAKER OR OF CHARGED RESIDUE (ARG,LYS) IS NOT
C     FAVORABLE TO PROPAGATED TETRAPEPTIDES WITH <PB> < 1.00.
C     SO EITHER TO SWITCH TO C-PROPAGATION OR TO START ADDING HB OR IB
C     TO EACH SIDE OF THE SHEET AREA
C
130   DO 135 L=MB,MC
      IF (S(L,2).LE.0.75) GO TO 45
135   CONTINUE
      DO 140 L=MB,MC
      IF (M(L).EQ.2.OR.M(L).EQ.9) GO TO 45
140   CONTINUE
C
C     IF THE TETRAPEPTIDE WITH <PB> < 1.00 ONLY HAS IB THEN IT CANNOT
C     BE ADDED TO THE PROPAGATED SHEET
C
      L3=0
      DO 145 L=MB,MC
      IF (M(L).EQ.3.OR.M(L).EQ.1) L3=L3+1
145   CONTINUE
      IF (L3.EQ.4) GO TO 45
C
C     ... TO CHECK THE NUMBER OF BREAKERS IN THE ENTIRE POLYPEPTIDE ...
C
C     TO COUNT THE NUMBER OF BB IN THE ENTIRE SECTION (V8). IT SHOULD
C     NOT BE GREATER THAN ONE THIRD OF THE LENGTH (V6). IF V8 IS LESS
C     THAN V6 THEN THE SECTION IS CONSIDERED TO BE VALID AND SUBROUTINE
C     FOUR IS CALLED TO CARRY OUT THE BOUNDARY ADJUSTMENT. IF NOT
C     EITHER HB OR IB IS ADDED TO BOTH SIDES TO SATISFY THE REQUIREMENT
C     OR C-PROPAGATION WILL REPLACE N-PROPAGATION
C
C     DESCRIPTION OF PARAMETERS
C        V3 - COUNTER
C        V4 - COUNTER
C        MB - N-TERMINAL OF THE SHEET REGION
C        MD - C-TERMINAL OF THE SHEET REGION
C
150   V6=0
      V8=0
      IF (F.EQ.1) GO TO 155
      MB=MB-5-(4*V4)
155   MD=MB+8+(4*V3)
      V3=V3+1
      V4=V4+1
      V6=(MD-MB+1)/3
      DO 160 L=MB,MD
      IF (S(L,2).LE.0.75) V8=V8+1
160   CONTINUE
      PRINT 165,MB,MD,V6,V8
165   FORMAT(' ',10X,'MB:',I4,5X,'MD:',I4,5X,'V6:',I7,5X,'V8:',I3,8X,
     1'THEORITIC. AND ACTUAL # BREAKERS FROM MB TO MD')
      IF (V8.LT.V6.AND.F.EQ.1) GO TO 170
      IF (V8.LT.V6.AND.F.EQ.2) GO TO 180
      V7=V2
      IF (F.EQ.2) MB=MB+5+(4*V7)
      IF (F.EQ.2) GO TO 50
      F=2
      J1=MB+4
      GO TO 20
C
C     TO PRINT OUT THE POSSIBLE SHEET AREAS THEN TO CALL SUBROUTINE
C     FOUR FOR THE BOUNDARY ADJUSTMENT
C
170   J2=MA
      J1=MB
      PRINT 120,J1,J2
      GO TO 125
180   J1=G
      J2=G+8
      PRINT 120,J1,J2
      GO TO 125
      END
C
      SUBROUTINE FOUR
C
C     BOUNDARY MOVE OF THE N-TERMINAL
C
C     PURPOSE
C        TO FIND OUT THE MOST FAVORABLE N-BOUNDARY RESIDUE FOR THE
C        PREDICTED SHEET BASED ON THE BOUNDARY CONFORMATIONAL
C        PARAMETERS OF THE ADJACENT RESIDUES
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,V3,V4,V5,V6,V7,V8,Q
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,10),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,V4,V5,V6,V7,V8,Q,G,F,H,U,D,NN,
     1NW,KX,MA,MB,MC,MD,L,I,L1,L2,L3,J1,J2,N,K1,K2,V1,V2,IM,M,K3,K4,V3,
     2BYE,BALL,HELLO,MOVE
C
C     DESCRIPTION OF PARAMETERS
C        V8 - ACTUAL NUMBER OF BREAKERS IN THE PREDICTED SHEET
C        K3 - C-TERMINAL RESIDUE OF THE PREVIOUS PREDICTED SHEET
C        V2 - COUNTER USED IN THE N-BOUNDARY ADJUSTMENT
C        V3 - COUNTER USED IN THE C-BOUNDARY ADJUSTMENT
C
      V2=0
      V3=0
C
C     J1 = J1
C     THE POSITION J1 APPEARS TO BE THE MOST FAVORABLE COMPARED TO ITS
C     ADJACENT RESIDUES, NO NEED TO ADJUST IT.
C
C *** 1 ***
      BALL=.FALSE.
      IF (M(J1).EQ.1.AND.M(J1+1).EQ.1.AND.S(J1-1,8).LT.1.07.AND.S(J1-2,
     18).LT.1.07.AND.M(J1+2).EQ.10) BALL=.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=1
      IF (BALL) GO TO 200
C
C *** 2 ***
      BALL=.FALSE.
      IF (M(J1).EQ.1.AND.S(J1-1,2).LT.1.05.AND.M(J1+1).EQ.11) BALL=
     1.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=2
      IF (BALL) GO TO 200
C
C *** 3 ***
      BALL=.FALSE.
      IF (M(J1).EQ.1.AND.S(J1-1,2).LT.1.05.AND.V8.EQ.1.AND.S(J1+2,8).
     1LT.1.69.AND.(J2-J1).GE.8) BALL=.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=3
      IF (BALL) GO TO 200
C
C *** 4 ***
      BALL=.FALSE.
      IF (M(J1).EQ.1.AND.S(J1-1,2).LE.0.75.AND.S(J1+1,8).GE.1.30.AND.
     1S(J1+1,8).LT.1.69) BALL=.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=4
      IF (BALL) GO TO 200
C
C     MOVE OF J1
C     THE POSITION OF J1 IS LESS FAVORABLE THAN THAT OF ITS ADJACENT
C     RESIDUES
C
C *** 5 ***
      BALL=.FALSE.
      IF (S(J1,2).GT.0.89.AND.V8.EQ.0.AND.M(J1-1).EQ.1.AND.(J2-J1).GE.8)
     1 BALL=.TRUE.
      IF (BALL) J1=J1-1
      IF (BALL) V2=5
      IF (BALL) GO TO 200
C
C *** 6 ***
      BALL=.FALSE.
      IF (M(J1).EQ.12.AND.S(J1-1,2).LT.0.74.AND.S(J1+1,8).LT.1.30.
     1AND.S(J1+2,2).LE.0.75.AND.S(J1+3,8).GE.1.50.AND.(S(J1-2,8)-S(J1+3,
     28)).LT.0.20) BALL=.TRUE.
      IF (BALL) J1=J1+3
      IF (BALL) V2=6
      IF (BALL) GO TO 200
C
C *** 7 ***
      BALL=.FALSE.
      IF (S(J1,8).LT.1.50.AND.S(J1+1,8).LE.S(J1-2,8).AND.S(J1-2,8).GE.
     11.50.AND.S(J1-1,2).LT.0.74.AND.(J2-J1+3).GT.8.AND.(J1-2).GE.K3)
     3BALL=.TRUE.
      IF (BALL) J1=J1-2
      IF (BALL) V2=7
      IF (BALL) GO TO 200
C
C *** 8 ***
      BALL=.FALSE.
      IF(S(J1,2).LT.1.05.AND.S(J1-1,2).LT.1.05.AND.S(J1-2,2).LT.1.05
     1.AND.S(J1-3,8).LT.1.65.AND.S(J1+1,8).GE.1.07) BALL=.TRUE.
      IF (BALL) J1=J1+1
      IF (BALL) V2=8
      IF (BALL) GO TO 200
C
C *** 9 ***
      BALL=.FALSE.
      IF ((S(J1+1,8)-S(J1,8)).GT.0.20.AND.(M(J1-1).EQ.15.OR.M(J1-1).EQ.
     17.OR.M(J1-1).EQ.4).AND.(J2-J1+2).LE.8.AND.S(J1-2,8).LT.1.69.AND.
     2S(J1,8).LT.1.07) BALL=.TRUE.
      IF (BALL) J1=J1+1
      IF (BALL) V2=9
      IF (BALL) GO TO 200
C
C *** 10 ***
      BALL=.FALSE.
      IF (S(J1,2).LT.1.05.AND.S(J1-1,2).LT.1.05.AND.S(J1-2,8).LT.1.65
     1.AND.S(J1+1,8).GE.1.07) BALL=.TRUE.
      IF (BALL) J1=J1+1
      IF (BALL) V2=10
      IF (BALL) GO TO 200
C
C *** 11 ***
      BALL=.FALSE.
      IF (S(J1,2).LT.1.05.AND.M(J1+1).EQ.1.AND.(M(J1+2).EQ.11.OR.S(J1+2,
     12).GE.1.30).AND.S(J1-1,2).LT.0.83) BALL=.TRUE.
      IF (BALL) J1=J1+1
      IF (BALL) V2=11
      IF (BALL) GO TO 200
C
C *** 12 ***
      BALL=.FALSE.
      IF (S(J1,2).LT.1.05.AND.S(J1-1,2).LT.1.05.AND.S(J1-2,8).GE.1.69
     1.AND.S(J1+1,8).LT.1.65.AND.(J1-2).GT.K3) BALL=.TRUE.
      IF (BALL) J1=J1-2
      IF (BALL) V2=12
      IF (BALL) GO TO 200
C
C *** 13 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.07.AND.S(J1-2,8).GE.1.65.AND.S(J1-1,2).LT.0.74
     1.AND.(J2-J1+3).GT.8.AND.(J1-2).GT.K3) BALL=.TRUE.
      IF (BALL) J1=J1-2
      IF (BALL) V2=13
      IF (BALL) GO TO 200
C
C *** 14 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.07.AND.S(J1-2,8).GT.S(J1,8).AND.S(J1-1,2).GE.
     10.74.AND.(J1-2).GT.K3.AND.(S(J1-2,8)-S(J1,8)).GE.0.50) BALL=
     2.TRUE.
      IF (BALL) J1=J1-2
      IF (BALL) V2=14
      IF (BALL) GO TO 200
C
C *** 15 ***
      BALL=.FALSE.
      IF ((S(J1+3,8)-S(J1,8)).GE.0.35.AND.S(J1+1,8).LT.1.07.AND.S(J1+2,
     18).LT.1.07.AND.S(J1-1,8).LT.1.07.AND.S(J1-2,8).LT.S(J1+3,8)) BALL=
     2.TRUE.
      IF (BALL) J1=J1+3
      IF (BALL) V2=15
      IF (BALL) GO TO 200
C
C *** 16 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.07.AND.S(J1-2,8).GT.S(J1,8).AND.S(J1-1,2).GE.
     10.74.AND.(J1-2).GT.K3) BALL=.TRUE.
      IF (BALL) J1=J1-2
      IF (BALL) V2=16
      IF (BALL) GO TO 200
C
C *** 17 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.07.AND.S(J1-2,8).GT.S(J1,8).AND.(S(J1-1,2).GE.0.
     174.OR.M(J1-1).EQ.4).AND.(J2-J1+3).GE.8.AND.(J1-2).GT.K3) BALL=
     2.TRUE.
      IF (BALL) J1=J1-2
      IF (BALL) V2=17
      IF (BALL) GO TO 200
C
C *** 18 ***
      BALL=.FALSE.
      IF (S(J1,2).LT.1.05.AND.S(J1-1,2).LT.1.05.AND.S(J1+1,8).GE.1.07
     1.AND.(J1-2).LE.K3) BALL=.TRUE.
      IF (BALL) J1=J1+1
      IF (BALL) V2=18
      IF (BALL) GO TO 200
C
C *** 19 ***
      BALL=.FALSE.
      IF (S(J1,2).LT.1.05.AND.S(J1+1,2).LT.1.05.AND.(M(J1-1).EQ.15.OR.M(
     1J1-1).EQ.7.OR.M(J1-1).EQ.4).AND.S(J1+2,8).GE.1.07) BALL=.TRUE.
      IF (BALL) J1=J1+2
      IF (BALL) V2=19
      IF (BALL) GO TO 200
C
C *** 20 ***
      BALL=.FALSE.
      IF (S(J1-1,2).LT.1.05.AND.S(J1-2,8).LT.1.07.AND.S(J1+1,2).LT.1.05
     1.AND.(S(J1+2,8)-S(J1,8)).GE.0.55.AND.S(J1+2,8).GE.1.07) BALL=
     2.TRUE.
      IF (BALL) J1=J1+2
      IF (BALL) V2=20
      IF (BALL) GO TO 200
C
C *** 21 ***
      BALL=.FALSE.
      IF (S(J1,2).LT.1.05.AND.S(J1+1,2).LT.1.05.AND.(S(J1+2,8)-S(J1-1,8)
     1).GT.0.25.AND.S(J1+2,8).GE.1.50) BALL=.TRUE.
      IF (BALL) J1=J1+2
      IF (BALL) V2=21
      IF (BALL) GO TO 200
C
C *** 22 ***
      BALL=.FALSE.
      IF (M(G).EQ.19.AND.S(G-1,8).GE.1.69.AND.M(G+1).EQ.4.AND.S(G+2,8).L
     1T.S(G-1,8)) BALL=.TRUE.
      IF (BALL) J1=J1-1
      IF (BALL) V2=22
      IF (BALL) GO TO 200
C
C *** 23 ***
      BALL=.FALSE.
      IF (S(J1+2,8).GT.S(J1,8).AND.(M(J1+1).EQ.15.OR.M(J1+1).EQ.7.OR.
     1M(J1+1).EQ.4).AND.S(J1+2,8).GE.1.07.AND.S(J1,8).LT.1.42) BALL=
     2.TRUE.
      IF (BALL) J1=J1+2
      IF (BALL) V2=23
      IF (BALL) GO TO 200
C
C *** 24 ***
      BALL=.FALSE.
      IF (S(J1,2).LT.1.05.AND.S(J1+1,2).LE.0.75.AND.S(J1-1,2).LT.1.05
     1.AND.S(J1+2,8).GE.1.07) BALL=.TRUE.
      IF (BALL) J1=J1+2
      IF (BALL) V2=24
      IF (BALL) GO TO 200
C
C *** 25 ***
      BALL=.FALSE.
      IF (S(J1,8).LT.1.07.AND.S(J1-1,8).LT.1.07.AND.S(J1-2,8).LT.1.65
     1.AND.S(J1+1,8).GE.1.07) BALL=.TRUE.
      IF (BALL) J1=J1+1
      IF (BALL) V2=25
      IF (BALL) GO TO 200
C
C *** 26 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.07.AND.S(J1-2,8).GE.1.07.AND.S(J1-1,2).GE.0.75
     1.AND.S(J1,8).LE.S(J1-2,8).AND.M(J1+1).EQ.20.AND.M(J1+3).EQ.20)
     2 BALL=.TRUE.
      IF (BALL) J1=J1-2
      IF (BALL) V2=26
      IF (BALL) GO TO 200
C
C *** 27 ***
      BALL=.FALSE.
      IF((S(J1+1,8)-S(J1,8)).GT.0.20.AND.(S(J1-1,8)-S(J1,8)).LT.0.20
     1.AND.S(J1+1,8).GE.1.07.AND.(J1-2).LE.K3) BALL=.TRUE.
      IF (BALL) J1=J1+1
      IF (BALL) V2=27
      IF (BALL) GO TO 200
C
C *** 28 ***
      BALL=.FALSE.
      IF ((S(J1+3,8)-S(J1,8)).GE.0.35.AND.S(J1+1,8).LT.1.07.AND.S(J1+2,
     18).LT.1.07.AND.S(J1-1,8).LT.1.07.AND.S(J1-2,8).LT.S(J1+3,8).AND.
     2(J1-2).LE.K3) BALL=.TRUE.
      IF (BALL) J1=J1+3
      IF (BALL) V2=28
      IF (BALL) GO TO 200
C
C *** 29 ***
      BALL=.FALSE.
      IF (S(J1,8).LT.1.07.AND.(J1-1).LE.K3.AND.S(J1+1,8).LT.1.07.AND.
     1S(J1+2,8).LT.1.07.AND.S(J1+3,8).GE.1.07) BALL=.TRUE.
      IF (BALL) J1=J1+3
      IF (BALL) V2=29
      IF (BALL) GO TO 200
C
C *** 30 ***
      BALL=.FALSE.
      IF(((S(J1+1,2).LT.0.74.AND.S(J1+2,8).LT.1.07).OR.(S(J1+1,8).LT.
     11.07.AND.S(J1+2,2).LT.0.74)).AND.S(J1+3,8).GE.1.07.AND.S(J1,8).LE.
     21.65) BALL=.TRUE.
      IF (BALL) J1=J1+3
      IF (BALL) V2=30
      IF (BALL) GO TO 200
C
C *** 31 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.69.AND.S(J1-2,8).GE.1.69.AND.S(J1+1,8).LT.S(J1-2,
     18).AND.S(J1-1,2).LT.0.74.AND.(J2-J1+3).GE.7.AND.(J1-2).GT.K3)
     2 BALL=.TRUE.
      IF (BALL) J1=J1-2
      IF (BALL) V2=31
      IF (BALL) GO TO 200
C
C *** 32 ***
      BALL=.FALSE.
      IF (S(J1,2).LT.0.83.AND.S(J1+1,8).GE.1.07.AND.(S(J1+1,8)-S(J1-1,8)
     2).GT.0.30.AND.S(J1-1,8).LT.1.30) BALL=.TRUE.
      IF (BALL) J1=J1+1
      IF (BALL) V2=32
      IF (BALL) GO TO 200
C
C *** 33 ***
      LK=J1-3
      IF (LK.LE.0) GO TO 100
      V8=0
      DO 50 L=LK,J2
      IF (S(L,1).LE.0.75) V8=V8+1
50    CONTINUE
      LY=(J2+1-LK)/3
      BALL=.FALSE.
      IF (V8.LE.LY.AND.M(J1).EQ.6.AND.M(J1-1).EQ.4.AND.S(J1-2,2).GE.
     10.75.AND.S(J1-3,2).GE.1.30.AND.S(J1-3,8).GE.1.30) BALL=.TRUE.
      IF (BALL) J1=J1-3
      IF (BALL) V2=33
      IF (BALL) GO TO 200
C
C *** 34 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.69.AND.S(J1-3,8).EQ.1.94.AND.S(J1-1,2).GE.0.74
     1.AND.S(J1-2,2).GT.0.75.AND.S(J1+1,8).LT.S(J1-3,8).AND.N.LE.LY)
     2 BALL=.TRUE.
      IF (BALL) J1=J1-3
      IF (BALL) V2=34
      IF (BALL) GO TO 200
C
C     J1 = J1
C
C *** 35 ***
100   BALL=.FALSE.
      IF (S(J1,8).GE.1.50.AND.S(J1-1,8).LT.1.50.AND.S(J1-2,8).LT.1.50
     1.AND.S(J1+1,8).GE.1.07) BALL=.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=35
      IF (BALL) GO TO 200
C
C *** 36 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.07.AND.S(J1-1,2).LE.0.75.AND.S(J1-2,8).LT.S(J1,8)
     1.AND.S(J1+1,8).LT.S(J1,8).AND.S(J1+2,8).LT.S(J1,8).AND.S(J1+1,2)
     2.GE.0.74.AND.S(J1+2,2).GE.0.74) BALL=.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=36
      IF (BALL) GO TO 200
C
C *** 37 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.07.AND.S(J1+1,8).GE.1.07.AND.(S(J1+1,8)-S(J1,8))
     1.LT.0.20.AND.S(J1-1,2).LT.1.05.AND.S(J1-2,2).LT.1.05) BALL=.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=37
      IF (BALL) GO TO 200
C
C *** 38 ***
      BALL=.FALSE.
      IF ((M(J1-1).EQ.15.OR.M(J1-1).EQ.7.OR.M(J1-1).EQ.4).AND.(J2-J1+2)
     1.LE.8.AND.(S(J1+1,8)-S(J1,8)).LT.0.20.AND.S(J1,8).GE.1.07) BALL=
     2.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=38
      IF (BALL) GO TO 200
C
C *** 39 ***
      BALL=.FALSE.
      IF ((J1-1).LE.K3.AND.S(J1,8).GE.1.07.AND.(S(J1+1,8)-S(J1,8)).LT.
     10.20) BALL=.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=39
      IF (BALL) GO TO 200
C
C *** 40 ***
      BALL=.FALSE.
      IF (S(J1,8).GE.1.07.AND.(S(J1+1,8)-S(J1,8)).LT.0.20.AND.S(J1-2,2)
     1.LT.1.05.AND.S(J1-1,2).LT.1.05.AND.S(J1+2,8).LT.S(J1,8).AND.S(J1+1
     2,2).GE.0.74.AND.S(J1+2,2).GE.0.74) BALL=.TRUE.
      IF (BALL) J1=J1
      IF (BALL) V2=40
      IF (BALL) GO TO 200
C
C     TO CALL SUBROUTINE FIVE TO CARRY OUT THE ADJUSTMENT OF THE
C     C-BOUNDARY
C
200   CALL FIVE
      RETURN
      END
C
      SUBROUTINE FIVE
C
C     BOUNDARY MOVE OF THE C-TERMINAL
C
C     PURPOSE
C        TO ADJUST THE C-TERMINAL RESIDUE BASED ON THE BOUNDARY
C        CONFORMATIONAL PARAMETERS OF THE ADJACENT RESIDUES
C
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,P
      INTEGER G,F,H,U,D,V1,V2,V3,V4,V5,V6,V7,V8,Q
      LOGICAL HELLO,BYE,BALL,MOVE
      DIMENSION S(1000,10),M(1000),H(1000),D(1000,16),P(1000,10)
      COMMON S,T1,T2,T3,T4,T5,TT,A1,A2,P,V4,V5,V6,V7,V8,Q,G,F,H,U,D,NN,
     1NW,KX,MA,MB,MC,MD,L,I,L1,L2,L3,J1,J2,N,K1,K2,V1,V2,IM,M,K3,K4,V3,
     2BYE,BALL,HELLO,MOVE
C
C     ... TO CHECK THE NUMBER OF BREAKERS IN THE SECTIONS J1-J2+4,
C     J1-J2+3 ...
C
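C     WORKED EXAMPLE (ILLUSTRATIVE VALUES ONLY, NOT TAKEN FROM THE
C     THESIS) FOR THE TWO ONE-THIRD LIMITS JJ AND JM USED IN THIS
C     SUBROUTINE. ASSUMING J1=10 AND J2=17, THE SECTION J1 TO J2+4 HAS
C     12 RESIDUES AND THE SECTION J1 TO J2+3 HAS 11 RESIDUES. BOTH
C     LIMITS ARE INTEGERS, SO THE DIVISION TRUNCATES
C        JJ = (21+1-10)/3 = 12/3 = 4
C        JM = (20+1-10)/3 = 11/3 = 3
C     A RULE TESTING V8.LE.JJ WOULD THEN ACCEPT UP TO 4 RESIDUES WITH
C     S(.,2).LE.0.75 IN THE LONGER SECTION, AND A RULE TESTING
C     MM.LE.JM UP TO 3 SUCH RESIDUES IN THE SHORTER ONE.
C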
C     DESCRIPTION OF PARAMETERS
C        V8 - ACTUAL NUMBER OF BREAKERS IN THE SEGMENT J1 TO J2+4
C        JJ - THEORETICAL NUMBER OF BREAKERS IN THE SECTION J1 TO J2+4
C        MM - ACTUAL NUMBER OF BREAKERS IN THE SEGMENT J1 TO J2+3
C        JM - THEORETICAL NUMBER OF BREAKERS IN THE SECTION J1 TO J2+3
C        K3 - C-TERMINAL RESIDUE OF THE PREVIOUS PREDICTED SHEET
C
      J3=J2+4
      V8=0
      DO 90 JC=J1,J3
      IF (S(JC,2).LE.0.75) V8=V8+1
90    CONTINUE
      JJ=(J3+1-J1)/3
C
      J5=J2+3
      MM=0
      DO 102 JC=J1,J5
      IF (S(JC,2).LE.0.75) MM=MM+1
102   CONTINUE
      JM=(J5+1-J1)/3
C
C     J2 = J2
C
C *** 1 ***
      BYE=.FALSE.
      IF (M(J2).EQ.1.AND.S(J2+1,2).LT.1.05.AND.S(J2-1,2).GE.1.47) BYE=
     1.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=1
      IF (BYE) GO TO 300
C
C *** 2 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.11.AND.M(J2).EQ.5.AND.M(J2-1).EQ.1.AND.S(J2-2,2)
     1.GE.1.30.AND.S(J2+1,2).LT.1.05) BYE=.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=2
      IF (BYE) GO TO 300
C
C *** 3 ***
      BYE=.FALSE.
      IF (M(J2).EQ.1.AND.S(J2-1,9).LT.1.79.AND.T3.EQ.3.0.AND.N.EQ.0.AND.
     1(J2+1-J1).EQ.5) BYE=.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=3
      IF (BYE) GO TO 300
C
C *** 4 ***
      BYE=.FALSE.
      IF (M(J2-1).EQ.19.AND.M(J2-2).EQ.20.AND.M(J2-3).EQ.13.AND.S(J2,2)
     1.GE.0.75) BYE=.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=4
      IF (BYE) GO TO 300
C
C     MOVE OF J2
C
C *** 6 ***
      BYE=.FALSE.
      IF (S(J2,2).GE.1.05.AND.S(J2+1,9).LT.1.11.AND.S(J2+2,9).LT.1.11
     1.AND.S(J2+3,9).GE.1.96.AND.((M(J2+1).EQ.4.AND.S(J2+2,2).GE.0.74)
     2.OR.(M(J2+2).EQ.4.AND.S(J2+1,2).GE.0.74)).AND.(J2-J1+5).GE.7.AND.
     3MM.LE.JM) BYE=.TRUE.
      IF (BYE) J2=J2+3
      IF (BYE) V3=6
      IF (BYE) GO TO 300
C
C *** 7 ***
      BYE=.FALSE.
      IF (S(J2,9).LT.1.11.AND.S(J2-1,9).GT.S(J2,9).AND.S(J2+1,9).LT.1.11
     1.AND.S(J2+2,9).LT.1.11.AND.S(J2-2,9).LE.S(J2-1,9)) BYE=.TRUE.
      IF (BYE) J2=J2-1
      IF (BYE) V3=7
      IF (BYE) GO TO 300
C
C *** 8 ***
      BYE=.FALSE.
      IF (S(J2+3,9).GE.1.96.AND.S(J2+2,2).GE.1.05.AND.S(J2,9).LT.1.11
     1.AND.S(J2-1,9).LT.1.11.AND.S(J2-2,9).LE.S(J2+3,9).AND.S(J2+1,9).
     2LT.1.11.AND.M(J2+1).NE.15.AND.M(J2+1).NE.7.AND.M(J2+1).NE.4.AND.
     3MM.LE.JM) BYE=.TRUE.
      IF (BYE) J2=J2+3
      IF (BYE) V3=8
      IF (BYE) GO TO 300
C
C *** 9 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.11.AND.S(J2+3,9).GE.S(J2,9).AND.S(J2-1,9).LE.
     1S(J2+3,9).AND.M(J2+1).NE.15.AND.M(J2+1).NE.4.AND.M(J2+1).NE.7.AND.
     2M(J2+2).NE.15.AND.M(J2+2).NE.4.AND.M(J2+2).NE.7.AND.(S(J2-1,2).GE.
     30.74.AND.S(J2-2,2).GE.0.74).AND.MM.LT.JM) BYE=.TRUE.
      IF (BYE) J2=J2+3
      IF (BYE) V3=9
      IF (BYE) GO TO 300
C
C *** 10 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.11.AND.S(J2-1,9).LT.1.11.AND.S(J2+4,2).GE.1.05.AN
     1D.S(J2+1,2).GE.0.74.AND.S(J2+2,2).GE.0.74.AND.S(J2+3,2).GE.0.74
     2.AND.S(J2-1,2).GE.0.74.AND.(S(J2-2,9)-S(J2,9)).LT.0.60.AND.V8.LE.
     3JJ.AND.S(J2+2,9).LT.1.50) BYE=.TRUE.
      IF (BYE) J2=J2+4
      IF (BYE) V3=10
      IF (BYE) GO TO 300
C
C *** 11 ***
      BYE=.FALSE.
      IF ((J2+5).GT.NN) GO TO 100
      IF (S(J2,9).GE.1.11.AND.S(J2-1,9).GE.1.50.AND.S(J2+4,9).GT.1.11
     1.AND.S(J2+5,9).GT.S(J2+4,9).AND.S(J2+1,2).GT.0.74.AND.S(J2+2,2)
     2.GT.0.74.AND.S(J2+3,2).GT.0.74.AND.(S(J2-1,9)-S(J2+5,9)).LT.0.30
     3.AND.V8.LE.((J2+6-J1)/3)) BYE=.TRUE.
      IF (BYE) J2=J2+5
      IF (BYE) V3=11
      IF (BYE) GO TO 300
C
C *** 12 ***
100   BYE=.FALSE.
      IF (S(J2,2).LE.0.75.AND.S(J2-1,2).LE.0.75.AND.M(J2-2).EQ.19.AND.
     1(S(J2-2,9)-S(J2+1,9)).GE.0.84) BYE=.TRUE.
      IF (BYE) J2=J2-2
      IF (BYE) V3=12
      IF (BYE) GO TO 300
C
C *** 13 ***
      BYE=.FALSE.
      IF (S(J2-1,9).LT.1.11.AND.S(J2+1,9).LT.1.11.AND.S(J2-2,9).GT.S(J2,
     19).AND.S(J2-2,9).GE.1.79.AND.(J2-2).GT.J1) BYE=.TRUE.
      IF (BYE) J2=J2-2
      IF (BYE) V3=13
      IF (BYE) GO TO 300
C
C *** 14 ***
      BYE=.FALSE.
      IF (S(J2,9).LT.1.11.AND.S(J2+1,9).GE.1.11.AND.(S(J2-1,9)-S(J2+1,9)
     1).LE.0.60) BYE=.TRUE.
      IF (BYE) J2=J2+1
      IF (BYE) V3=14
      IF (BYE) GO TO 300
C
C *** 15 ***
      BYE=.FALSE.
      IF (S(J2,9).LT.1.11.AND.S(J2+1,9).GE.1.11.AND.(S(J2-1,9)-S(J2+1,9)
     1).GT.0.60) BYE=.TRUE.
      IF (BYE) J2=J2-1
      IF (BYE) V3=15
      IF (BYE) GO TO 300
C
C *** 16 ***
      BYE=.FALSE.
      IF (S(J2,9).LE.1.11.AND.S(J2-1,9).LT.1.11.AND.S(J2-2,9).LT.1.11
     1.AND.S(J2+1,9).LT.1.11.AND.S(J2+2,9).LT.1.11.AND.(S(J2-3,9).GE.1.
     211.OR.M(J2-3).EQ.19).AND.(J2-3).GT.J1.AND.S(J2+3,9).LT.1.96) BYE=
     3.TRUE.
      IF (BYE) J2=J2-3
      IF (BYE) V3=16
      IF (BYE) GO TO 300
C
C *** 17 ***
      BYE=.FALSE.
      IF (S(J2-1,9).GE.1.79.AND.(S(J2-1,9)-S(J2,9)).GE.0.70.AND.S(J2+1,
     19).LT.1.11.AND.S(J2-2,9).LT.S(J2-1,9)) BYE=.TRUE.
      IF (BYE) J2=J2-1
      IF (BYE) V3=17
      IF (BYE) GO TO 300
C
C *** 18 ***
      BYE=.FALSE.
      IF (S(J2-1,9).LT.1.11.AND.S(J2+1,9).GT.S(J2,9).AND.S(J2,9).GE.1.11
     1) BYE=.TRUE.
      IF (BYE) J2=J2+1
      IF (BYE) GO TO 300
C
C *** 19 ***
      BYE=.FALSE.
      IF ((S(J2-1,9).LT.1.11.OR.S(J2-1,9).LE.S(J2+1,9)).AND.S(J2+1,9)
     1.GT.S(J2,9).AND.S(J2,9).GE.1.11) BYE=.TRUE.
      IF (BYE) J2=J2+1
      IF (BYE) V3=19
      IF (BYE) GO TO 300
C
C *** 20 ***
      BYE=.FALSE.
      IF (S(J2+2,9).GE.S(J2,9).AND.(S(J2,9).GE.1.11.OR.M(J2+2).EQ.5)
     1.AND.(M(J2+1).NE.15.AND.M(J2+1).NE.7).AND.S(J2+2,9).GT.S(J2+1,9))
     2 BYE=.TRUE.
      IF (BYE) J2=J2+2
      IF (BYE) V3=20
      IF (BYE) GO TO 300
C
C *** 21 ***
      BYE=.FALSE.
      IF (S(J2,2).LT.1.05.AND.S(J2+1,2).LT.1.05.AND.S(J2-1,9).GT.S(J2+2,
     19).AND.S(J2-1,9).GE.1.11) BYE=.TRUE.
      IF (BYE) J2=J2-1
      IF (BYE) V3=21
      IF (BYE) GO TO 300
C
C *** 22 ***
      BYE=.FALSE.
      IF (S(J2,2).LT.1.05.AND.S(J2-2,9).GT.S(J2-1,9).AND.S(J2-2,9).GE.
     11.07.AND.S(J2+1,9).LT.1.11.AND.S(J2+2,9).LT.1.11.AND.(J2-2).GT.J1)
     2BYE=.TRUE.
      IF (BYE) J2=J2-2
      IF (BYE) V3=22
      IF (BYE) GO TO 300
C
C *** 23 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.11.AND.(S(J2+1,2).GE.0.74.OR.M(J2+1).EQ.4).AND.
     1S(J2+2,9).GE.1.11.AND.S(J2-1,9).GE.1.11.AND.S(J2,9).LT.1.96) BYE
     2=.TRUE.
      IF (BYE) J2=J2+2
      IF (BYE) V3=23
      IF (BYE) GO TO 300
C
C *** 24 ***
      BYE=.FALSE.
      IF (S(J2,2).GE.1.05.AND.S(J2+3,9).GE.1.79.AND.S(J2+2,2).GE.1.05
     1.AND.S(J2-1,2).GE.1.05.AND.(J2-J1+5).GE.7.AND.(M(J2+1).EQ.7.OR.
     2M(J2+1).EQ.4).AND.MM.LE.JM) BYE=.TRUE.
      IF (BYE) J2=J2+3
      IF (BYE) V3=24
      IF (BYE) GO TO 300
C
C *** 25 ***
      BYE=.FALSE.
      IF ((J2-J1+1).LE.5.AND.T3.GE.3.0.AND.N.LE.1.AND.M(J2+1).EQ.4.AND.
     1S(J2+2,9).GE.1.11.AND.S(J2+3,9).GE.1.11.AND.S(J2+1,2).GE.1.05.AND.
     2S(J2+3,2).GE.1.05.AND.MM.LE.JM) BYE=.TRUE.
      IF (BYE) J2=J2+3
      IF (BYE) V3=25
      IF (BYE) GO TO 300
C
C *** 26 ***
      BYE=.FALSE.
      IF (S(J2,9).LT.1.11.AND.S(J2+1,9).LT.1.11.AND.(J2+2).GT.NN.AND.
     1S(J2-1,9).GE.1.11) BYE=.TRUE.
      IF (BYE) J2=J2-1
      IF (BYE) V3=26
      IF (BYE) GO TO 300
C
C *** 27 ***
      BYE=.FALSE.
      IF (S(J2,9).LT.1.11.AND.S(J2+1,9).LT.1.11.AND.S(J2+2,9).GE.1.79.AN
     1D.S(J2-2,9).LT.S(J2+2,9)) BYE=.TRUE.
      IF (BYE) J2=J2+2
      IF (BYE) V3=27
      IF (BYE) GO TO 300
C
C *** 28 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.79.AND.S(J2-1,9).GE.1.79.AND.S(J2-2,9).GE.1.79
     1.AND.S(J2-3,9).GE.1.79.AND.M(J2+1).EQ.1.AND.M(J2+2).EQ.1) BYE=
     2.TRUE.
      IF (BYE) J2=J2+2
      IF (BYE) V3=28
      IF (BYE) GO TO 300
C
C *** 29 ***
      BYE=.FALSE.
      IF (M(J2).EQ.5.AND.S(J2-1,9).GE.1.79.AND.MM.LE.JM.AND.
     1    S(J2-2,9).GE.1.27.AND.S(J2+3,9).GE.1.21.AND.
     2    S(J2+1,9).GE.0.74.AND.S(J2+2,2).GE.0.74) BYE=.TRUE.
      IF (BYE) J2=J2+3
      IF (BYE) V3=29
      IF (BYE) GO TO 300
C
C *** 30 ***
      BYE=.FALSE.
      IF (S(J2,2).GE.1.60.AND.S(J2-1,2).GE.1.60.AND.
     1    S(J2-2,2).GE.1.38.AND.MM.LE.JM.AND.S(J2+1,2).GE.0.75.AND.
     2    (M(J2+2).EQ.7.OR.M(J2+2).EQ.4).AND.
     3    S(J2+3,9).GE.1.27) BYE=.TRUE.
      IF (BYE) J2=J2+3
      IF (BYE) V3=30
      IF (BYE) GO TO 300
C
C *** 31 ***
      BYE=.FALSE.
      IF (M(J2).EQ.20.AND.M(J2+4).EQ.20.AND.M(J2-2).EQ.20.AND.
     1    S(J2+1,2).GE.0.75.AND.S(J2+2,2).GE.0.93.AND.
     2    S(J2+3,2).GE.0.75) BYE=.TRUE.
      IF (BYE) J2=J2+4
      IF (BYE) V3=31
      IF (BYE) GO TO 300
C
C     J2 = J2
C
C *** 32 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.11.AND.S(J2-1,9).LT.1.11.AND.
     1    S(J2+1,9).LT.1.11.AND.S(J2-2,9).LE.S(J2,9)) BYE=.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=32
      IF (BYE) GO TO 300
C
C *** 33 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.11.AND.S(J2+1,9).LT.1.11.AND.
     1    S(J2+2,9).LT.S(J2,9).AND.S(J2-1,9).LE.S(J2,9).AND.
     2    S(J2-2,9).LT.1.11) BYE=.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=33
      IF (BYE) GO TO 300
C
C *** 34 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.27.AND.S(J2+1,9).LT.1.11.AND.
     1    (S(J2+2,9).LT.1.11.OR.S(J2+2,9).LT.S(J2,9)).AND.
     2    S(J2-1,9).LE.S(J2,9).AND.S(J2-2,9).LE.S(J2,9)) BYE=.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=34
      IF (BYE) GO TO 300
C
C *** 35 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.11.AND.(J2-2).LE.J1.AND.S(J2-1,9).LT.1.11.AND.
     1    S(J2+1,9).LT.1.11.AND.S(J2+2,9).LT.1.11) BYE=.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=35
      IF (BYE) GO TO 300
C
C *** 36 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.27.AND.S(J2-1,9).GE.1.21.AND.
     1    S(J2-2,9).GE.1.21.AND.S(J2+1,9).LT.1.11.AND.
     2    S(J2+2,9).LT.1.11) BYE=.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=36
      IF (BYE) GO TO 300
C
C *** 37 ***
      BYE=.FALSE.
      IF (S(J2,9).GE.1.11.AND.S(J2+1,9).LT.1.11.AND.
     1    S(J2+2,9).LT.1.11.AND.(J2-1).LE.J1) BYE=.TRUE.
      IF (BYE) J2=J2
      IF (BYE) V3=37
      IF (BYE) GO TO 300
C
C     TO PRINT OUT THE FINAL VALUES J1,J2 OF THE PREDICTED SHEET. THEN TO
C     RETURN TO SUBROUTINE FIRS TO START THE SEARCH AGAIN
C
  300 K3=J2
      PRINT 301,J1,J2,V2,V3
  301 FORMAT('0',25X,'EVENTUAL SHEET FROM J1:',I5,5X,'TO J2:',I5,14X,
     1 '*** V2,V3 :',2I5,' ***'//)
      RETURN
      END

Efficiency of the β-turn prediction

Although some small differences existed between the results in this study and those reported by Chou and Fasman (1977, 1979) (e.g. carbonic anhydrase 71-74, 109-112; α-chymotrypsin 148-151; α-hemoglobin 81-84; thermolysin 19-22, 43-46), in general the results in this study agreed very well with those of Chou and Fasman (1977, 1979). Therefore, no modification was needed for β-turn prediction.

The program used for β-turn prediction consisted of the main program and one subroutine. The subroutine was the only part presented in this study because the main program was similar to the one used for α-helix and β-sheet search.

      SUBROUTINE TURN
C
C     PURPOSE
C        TO LOCATE B-TURNS BY APPLYING THE RULE: <PA> < <PT> > <PB>
C        AND THE PROBABILITY OF TURN OCCURRENCE SHOULD BE GREATER THAN
C        0.000075. AND FOR 2 ADJACENT TURNS THE ONE WITH THE HIGHEST
C        PROBABILITY OF OCCURRENCE WILL BE CHOSEN
C
      INTEGER G,F,H,U,D
      REAL S,T1,T2,A1,A2,T3,T4,T5,TT,PRB,P,PRBO,A3
      LOGICAL HELLO,BYE,BALL
      DIMENSION S(1000,8),M(1000),H(100),D(100,16),P(1000,8)
      COMMON S,T1,T2,A1,A2,T3,T4,T5,TT,PRB,P,G,F,H,U,D,M,IM,I,
     1 NN,NW,N,K,J,MC,HELLO,BYE,BALL
C
C     DESCRIPTION OF PARAMETERS
C        I      - COUNTER
C        H      - ARRAY TO STORE THE BOUNDARY VALUES OF TURNS
C        H(I)   - N-BOUNDARY VALUE
C        H(I+1) - C-BOUNDARY VALUE
C        MB     - FIRST RESIDUE OF A TETRAPEPTIDE (=K1)
C        K2     - FOURTH RESIDUE OF A TETRAPEPTIDE (=K1+3)
C        A1     - AVERAGE PA OF A TETRAPEPTIDE
C        A2     - AVERAGE PB OF A TETRAPEPTIDE
C        A3     - AVERAGE PT OF A TETRAPEPTIDE
C        PRB    - PROBABILITY OF B-TURN OCCURRENCE
C        PRBO   - PROBABILITY OF B-TURN OCCURRENCE OF THE ADJACENT
C                 PEPTIDE STARTING AT K1-1
C
   10 I=1
      H(I)=0
      NW=NN-3
      MB=1
   20 K2=MB+3
      K1=MB
      PRB=0
      T1=0
      T2=0
      T3=0
      A1=0
      A2=0
      A3=0
C
C     TO CALCULATE THE AVERAGE PA,PB,PT OF A TETRAPEPTIDE
C
      DO 25 MC=K1,K2
      T1=T1+S(MC,1)
      T2=T2+S(MC,2)
      T3=T3+S(MC,6)
   25 CONTINUE
      A1=T1/4.0
      A2=T2/4.0
      A3=T3/4.0
      PRB=P(K1,1)*P(K1+1,2)*P(K1+2,3)*P(K2,4)
      PRINT 30,A1,A2,A3,PRB,MB
   30 FORMAT(' ',10X,'A1:',F6.3,5X,'A2:',F6.3,5X,'A3:',F6.3,5X,
     1 'PRB:',F13.10,I8)
      IF ((A3.GT.A2.AND.A3.GT.A1).AND.(PRB.GT.0.000075).AND.
     1    A3.GT.1.00000) GO TO 50
   40 MB=MB+1
      IF (MB.LE.NW) GO TO 20
      IF (MB.GT.NW) GO TO 70
   50 I=I+1
      H(I)=K1
      I=I+1
      H(I)=K2
      PRINT 55,H(I-1),H(I)
   55 FORMAT('0',
10X, 'POTENTIAL BETA-TURN' ,5X,14,5X,14) TO CHECK FOR' THE POSSIBLE PRESENCE OF AN ADJACENT TURN IF (I.LE.3) GO TO 60 IF (K1.EO.(H(I-3)+1 ) ) GO TO 80  C 60 70 75 C C C C  MB=K1+1 IM=I IF (MB.LE.NW) GO TO 20 PRINT 75,IM FORMAT('0',10X.'END OF PROGRAM',5X,16) GO TO 90 TO CALCULATE THE PROBABILITY OF OCCURRENCE OF THE ADJACENT TURN  80  85  KO=H(1-3) PRBO=0 PRBO=P(KO,1)*P(K0+1,2)*P(KO+2,3)*P(K0+3,4) IF (PRBO.GT.PRB) PRINT 85,PRBO,PRB,K1,KO FORMAT('0',20X,'PRBO:',F11.8,4X,'PRB:',F11.8,6X,'B-TURN 1 ,' BUT AT',15,/)  NOT AT'.15  101 102 103 104 105 C 106 107 End of F i l e .  88  90  IF (PRBO.LT.PRB) PRINT 88,PRBO,PRB,KO,K1 FORMAT('0' ,20X, 'PRBO: ' , F 1 1 . 8 . 4X, 'PRB: ' .F 1 1 .8,6X, 'B-TURN NOT AT' ,15 1 ,' BUT AT',15,/) GO TO GO RETURN END  Efficiency  of the  r e s o l u t i o n of overlapping  In g e n e r a l , Fasman  a-  t h e p r o c e d u r e o u t l i n e d by  ( 1 9 7 8 a , 1978b ) was  effective  p r o g r a m , i f more t h a n h a l f o f t h e  tested  P^>;  a  of h e l i x  character  l e n g t h to  mations, then t h i s  length)  conformation  the B  a  A l t h o u g h the  overlapping t h a n B^.  8-sheet they  be  conformational  compensate the  ribonuclease  stability.  solved  were  e x p l a i n e d by  confor-  because  equally <  P  > a  <  <  t h a n H^,  the h i g h e r  Pg ' >  or  less  values  p a r a m e t e r s compared to h e l i x ; ,  of thus,  o f H^  ambiguous s i t u a t i o n s (e.g. p a p a i n  49-59, m y o g l o b i n 100-119, lysozyme  175-180, t h e r m o l y s i n  of a n t i p a r a l l e l  predicted  In  a d o p t e d . However, i t  easily  8-sheet)  of the  l o w e r number o f o c c u r r e n c e  more w e i g h t was  overlapping  be  one  c o n t a i n more H^  269-275, t h e r m o l y s i n  thermolysin  dilemma.  in  the  regions. 
For  subtilisin  and  conditions  c a l c u l a t i o n s showed t h a t  s e c t i o n may  T h i s may  overlapping  246),  ( a - h e l i x and  favored  w o u l d be  h a p p e n e d t h a t some c a s e s c o u l d n o t  favored.  areas  a n a l y s i s ; boundary a n a l y s i s ; r a t i o  8-sheet  both conformations  g-  Chou  to s o l v e the  the p r e s e n t (P >  and  8-sheets.  areas,  due  given  160-175,  261-274, t h e r m o l y s i n  According  a)  234-  presence  to r u l e 3 f o r s o l v i n g  8-sheets  are  to i n t e r a c t i o n s w h i c h enhance  Thus, i n case a n t i p a r a l l e l  172  107-114,  138-150, t h e r m o l y s i n  to f a c t o r s such as:  antiparallel  26-33,  preferentially conformational  8-sheets  are  absent,  preference  f o r long a - h e l i x over  major f a c t o r s conformation  t o be c o n s i d e r e d ; e s p e c i a l l y when t h e h e l i c a l i s s u p p o r t e d by o n l y h a l f o r l e s s  the c o n d i t i o n s t e s t e d . length  s h o r t e r 8 - s h e e t i s one o f t h e  (R^ _> 2.0) and  account the d i f f e r e n t to, or breaker  b) r a t i o  of h e l i x  length to  c) c h a r a c t e r a n a l y s i s  indifferent  1 3 - 1 8 , 30-39; c o n c a n a v a l i n  161-166 a r e e x a m p l e s o f a n t i p a r a l l e l  exhibit  into  o f a- and 8 - c o n f o r m a t i o n ) .  125-133; r i b o n u c l e a s e 94-110; a - c h y m o t r y p s i n  of  -sheet  (to take  types of r e s i d u e s , former,  Staphylococcal nuclease  instead  than h a l f of  longer h e l i c e s ,  good p o t e n t i a l  85-91;  papain  8-sheets being predicted  although these regions  for helical  also  conformation.  The r e f e r e n c e t o known p r o t e i n s i n t h e p r e d i c t i o n o f unknown ones i s v e r y u s e f u l ,  e s p e c i a l l y when some  homolo-  gy e x i s t s b e t w e e n t h e known and unknown p r o t e i n s ( A r g o s e t a l . , 1 9 7 6 ) . 
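The length ratio, the intersection test and the character analysis used to resolve an overlap can be illustrated with a short sketch. It is written in Python purely for compactness (the thesis programs themselves are FORTRAN). The thresholds and weights are the ones that appear in the listings of subroutines OLA1 and OLA2; the function names and the sample P values are hypothetical and serve only to exercise each branch.

```python
def helix_character(p_alpha):
    """Weighted helix total (TTH in OLA2) for a run of P-alpha values."""
    total = 0.0
    for p in p_alpha:
        if p > 1.16:
            total += 2.00   # strong helix former
        elif p > 1.01:
            total += 1.00   # helix former
        elif p > 0.98:
            total += 0.50   # weak helix former
        elif p > 0.69:
            total += 0.25   # helix indifferent
        elif p > 0.57:
            total -= 0.50   # helix breaker
        else:
            total -= 1.00   # strong helix breaker
    return total


def sheet_character(p_beta):
    """Weighted sheet total (TTS in OLA2) for a run of P-beta values."""
    total = 0.0
    for p in p_beta:
        if p > 1.38:
            total += 2.00   # strong sheet former
        elif p > 0.93:
            total += 1.00   # sheet former
        elif p > 0.75:
            total += 0.25   # sheet indifferent
        elif p > 0.55:
            total -= 0.50   # sheet breaker
        else:
            total -= 1.00   # strong sheet breaker
    return total


def overlap_bounds(h1, h2, s1, s2):
    """Partial-overlap limits as tested in OLA1, or None otherwise."""
    if s1 < h1 < s2 < h2:      # sheet starts first: area runs H1 to S2
        return h1, s2
    if h1 < s1 < h2 < s2:      # helix starts first: area runs S1 to H2
        return s1, h2
    return None                # no partial overlap (e.g. one contains the other)


def length_ratio(h1, h2, s1, s2):
    """Helix length divided by sheet length (real division)."""
    return (h2 - h1 + 1) / (s2 - s1 + 1)


if __name__ == "__main__":
    # Hypothetical parameter values, not taken from any protein.
    print(helix_character([1.51, 1.06, 1.00, 0.77, 0.61, 0.50]))  # 2.25
    print(sheet_character([1.70, 1.10, 0.80, 0.60, 0.37]))        # 1.75
    print(overlap_bounds(10, 25, 5, 15))                          # (10, 15)
    print(length_ratio(10, 25, 5, 15))
```

In the sketch the two totals are compared region by region (helix, sheet, intersection), which is one of the conditions counted when deciding which conformation to adopt. Note that the sketch uses real division for the length ratio, which is a choice made here and not necessarily how the FORTRAN expression LH/LS evaluates.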
This was observed in the present study for the prediction of proteinase inhibitors.

The following program was written to assess the different important factors contributing to the resolution of overlapping α- and β-regions. An extra part to read pairs of overlapping helices and β-sheets was added to the main program common to the search of α-helix, β-sheet and β-turn.

C
C     EXTRA PART FOR OVERLAPPING AREAS
C     TO READ IN PAIRS OF OVERLAPPING HELIX AND SHEETS
C
C     DESCRIPTION OF PARAMETERS
C        NR      - NUMBER OF LINES OF DATA (16 DATA PER LINE)
C        NT      - TOTAL NUMBER OF DATA
C        AH      - ARRAY TO STORE THE HELICAL VALUES
C        AH(I)   - N-TERMINAL VALUE
C        AH(I+1) - C-TERMINAL VALUE
C        SH      - ARRAY TO STORE THE SHEET VALUES
C        SH(I)   - N-TERMINAL VALUE
C        SH(I+1) - C-TERMINAL VALUE
C
      PRINT 35
   35 FORMAT('PAIRS OF OVERLAPPING HELICES AND SHEETS')
      PRINT 36
   36 FORMAT(' ','.....'/)
      READ(5,40) NT,NR
   40 FORMAT(6X,I4,6X,I4)
   41 FORMAT(16I5)
      WRITE (6,42) ((R(J,K),K=1,16),J=1,NR)
   42 FORMAT(' ',16I5)
      IM=NT/2
C
      I=1
      DO 52 J=1,NR
      DO 51 K=1,16,2
      AH(I)=R(J,K)
      IF (AH(I).EQ.0) GO TO 54
      I=I+1
   51 CONTINUE
   52 CONTINUE
C
   54 I=1
      DO 56 J=1,NR
      DO 55 K=2,16,2
      SH(I)=R(J,K)
      IF (SH(I).EQ.0) GO TO 60
      I=I+1
   55 CONTINUE
   56 CONTINUE
C
C     TO CALL SUBROUTINE OLA1 TO CARRY OUT THE COMPARISON OF PA,PB OF
C     EACH REGION AND THAT OF THEIR OVERLAPPING AREA
C

      SUBROUTINE OLA1
C
C     OLA1 - PROCEDURE OF OVERLAPPING # 1
C
C     PURPOSE
C        TO COMPARE THE AVERAGE PA, PB OF THE PREDICTED HELIX (H1-H2),
C        OF THE SHEET (S1-S2), AND OF THEIR OVERLAPPING AREA AND TO
C        CALCULATE THE RATIO HELIX LENGTH/SHEET LENGTH
C
      REAL A1,A2,S,T1,T2,TTH,TTS,P,HN,HC,NHN,NHC,SN,SC,NSN,NSC,
     1 HHF,HF,IIH,IH,BH,BBH,SSF,SF,IS,BS,BBS
      INTEGER H1,H2,S1,S2,AH,SH,IT1,IT2,D,R
      DIMENSION S(1000,20),AH(1000),SH(1000),M(1000),R(1000,16),
     1 D(1000,16),P(1000,10)
      COMMON A1,A2,S,T1,T2,TTH,TTS,P,HN,HC,NHN,NHC,SN,SC,NSN,NSC,
     1 HHF,HF,IIH,IH,BH,BBH,SSF,SF,IS,BS,BBS,H1,H2,S1,S2,AH,SH,
     2 IT1,IT2,D,R,NR,NT,NN,N,M,IM,I,K,J
C
C     DESCRIPTION OF PARAMETERS
C        I  - COUNTER
C        H1 - N-TERMINAL OF THE PREDICTED HELIX
C        H2 - C-TERMINAL OF THE PREDICTED HELIX
C        S1 - N-TERMINAL OF THE PREDICTED SHEET
C        S2 - C-TERMINAL OF THE PREDICTED SHEET
C        LH - HELIX LENGTH
C        LS - SHEET LENGTH
C        A1 - AVERAGE PA OF A SECTION
C        A2 - AVERAGE PB OF A SECTION
C
C     EVERY TIME I INCREASES BY 1 A NEW SET OF OVERLAPPING HELIX AND
C     SHEET IS SUBJECTED TO THE ANALYSIS
C
      I=1
    1 I=I+1
      IF (I.GT.IM) GO TO 300
      H1=AH(I-1)
      H2=AH(I)
      S1=SH(I-1)
      S2=SH(I)
      LH=H2-H1+1
      LS=S2-S1+1
      A1=LH/LS
      PRINT 5,LH,LS,A1
    5 FORMAT('-',20X,'***COMPARISON OF THEIR LENGTH ***',5X,
     1 'L-HELIX :',I4,3X,'L-SHEET :',I4,5X,'RATIO = LH/LS:',F4.1)
      PRINT 8
    8 FORMAT('0',30X,'***** COMPARISON OF P-HELIX AND P-SHEET *****')
C
      K=1
      GO TO 110
   10 IF (A1.GT.A2) PRINT 11,H1,H2,A1,A2
   11 FORMAT('0',15X,'H1 :',I4,3X,'H2 :',I4,5X,'A1:',F6.3,3X,'A2:',
     1 F6.3,10X,'A1 > A2 FROM H1 TO H2')
      IF (A1.LT.A2) PRINT 12,H1,H2,A1,A2
   12 FORMAT('0',15X,'H1 :',I4,3X,'H2 :',I4,5X,'A1:',F6.3,3X,'A2:',
     1 F6.3,10X,'A1 < A2 FROM H1 TO H2')
C
      K=2
      GO TO 120
   20 IF (A1.GT.A2) PRINT 21,S1,S2,A1,A2
   21 FORMAT('0',15X,'S1 :',I4,3X,'S2 :',I4,5X,'A1:',F6.3,3X,'A2:',
     1 F6.3,10X,'A1 > A2 FROM S1 TO S2')
      IF (A1.LT.A2) PRINT 22,S1,S2,A1,A2
   22 FORMAT('0',15X,'S1 :',I4,3X,'S2 :',I4,5X,'A1:',F6.3,3X,'A2:',
     1 F6.3,10X,'A1 < A2 FROM S1 TO S2')
C
      IF (SH(I-1).LT.AH(I-1).AND.SH(I).GT.AH(I-1).AND.SH(I).LT.AH(I))
     1 K=3
      IF (SH(I-1).LT.AH(I-1).AND.SH(I).GT.AH(I-1).AND.SH(I).LT.AH(I))
     1 GO TO 130
C
      IF (AH(I-1).LT.SH(I-1).AND.AH(I).GT.SH(I-1).AND.AH(I).LT.SH(I))
     1 K=4
      IF (AH(I-1).LT.SH(I-1).AND.AH(I).GT.SH(I-1).AND.AH(I).LT.SH(I))
     1 GO TO 140
C
C     TO CALL SUBROUTINE OLA2 TO ANALYZE THE TYPES OF RESIDUES WITHIN
C     EACH SECTION
C
   50 CALL OLA2
      GO TO 1
C
  110 L1=H1
      L2=H2
      GO TO 200
  120 L1=S1
      L2=S2
      GO TO 200
  130 L1=AH(I-1)
      L2=SH(I)
      GO TO 200
  140 L1=SH(I-1)
      L2=AH(I)
      GO TO 200
C
C     TO CALCULATE PA,PB OF THE REGION L1-L2
C
  200 A1=0
      A2=0
      T1=0
      T2=0
      DO 210 L=L1,L2
      T1=T1+S(L,1)
      T2=T2+S(L,2)
  210 CONTINUE
      A1=T1/(L2+1-L1)
      A2=T2/(L2+1-L1)
C
      IF (K.EQ.1) GO TO 10
      IF (K.EQ.2) GO TO 20
      IF (K.EQ.3) GO TO 230
      IF (K.EQ.4) GO TO 240
C
  230 PRINT 232
  232 FORMAT('0',25X,'*** P-HELIX AND P-SHEET OF INTERS. AREA : ',
     1 'H1 TO S2 ***')
      IF (A1.GT.A2) PRINT 233,L1,L2,A1,A2
  233 FORMAT('0',15X,'OL1:',I4,3X,'OL2:',I4,5X,'A1:',F6.3,3X,'A2:',
     1 F6.3,10X,'A1 > A2 FROM H1 TO S2',/)
      IF (A1.LT.A2) PRINT 234,L1,L2,A1,A2
  234 FORMAT('0',15X,'OL1:',I4,3X,'OL2:',I4,5X,'A1:',F6.3,3X,'A2:',
     1 F6.3,10X,'A1 < A2 FROM H1 TO S2',/)
      GO TO 50
C
  240 PRINT 242
  242 FORMAT('0',25X,'*** P-HELIX AND P-SHEET OF INTERS. AREA : ',
     1 'S1 TO H2 ***')
      IF (A1.GT.A2) PRINT 243,L1,L2,A1,A2
  243 FORMAT('0',15X,'OL1:',I4,3X,'OL2:',I4,5X,'A1:',F6.3,3X,'A2:',
     1 F6.3,10X,'A1 > A2 FROM S1 TO H2',/)
      IF (A1.LT.A2) PRINT 244,L1,L2,A1,A2
  244 FORMAT('0',15X,'OL1:',I4,3X,'OL2:',I4,5X,'A1:',F6.3,3X,'A2:',
     1 F6.3,10X,'A1 < A2 FROM S1 TO H2',/)
      GO TO 50
C
  300 PRINT 305
  305 FORMAT('0',10X,'END OF PROGRAM')
      RETURN
      END

      SUBROUTINE OLA2
C
C     OLA2 - PROCEDURE OF OVERLAPPING # 2
C
C     PURPOSE
C        TO COMPARE THE TYPES OF RESIDUES (BREAKER,FORMER,INDIFFERENT)
C        CONTAINED IN THE PREDICTED HELIX (H1-H2), THE SHEET (S1-S2),
C        AND IN THEIR OVERLAPPING AREA
C
      REAL A1,A2,S,T1,T2,TTH,TTS,P,HN,HC,NHN,NHC,SN,SC,NSN,NSC,
     1 HHF,HF,IIH,IH,BH,BBH,SSF,SF,IS,BS,BBS
      INTEGER H1,H2,S1,S2,AH,SH,IT1,IT2,D,R
      DIMENSION S(1000,20),AH(1000),SH(1000),M(1000),R(1000,16),
     1 D(1000,16),P(1000,10)
      COMMON A1,A2,S,T1,T2,TTH,TTS,P,HN,HC,NHN,NHC,SN,SC,NSN,NSC,
     1 HHF,HF,IIH,IH,BH,BBH,SSF,SF,IS,BS,BBS,H1,H2,S1,S2,AH,SH,
     2 IT1,IT2,D,R,NR,NT,NN,N,M,IM,I,K,J
C
C     DESCRIPTION OF PARAMETERS
C        HHF - COUNTER FOR STRONG HELIX-FORMER
C        HF  - COUNTER FOR HELIX-FORMER
C        IIH - COUNTER FOR WEAK HELIX-FORMER
C        IH  - COUNTER FOR HELIX-INDIFFERENT
C        BH  - COUNTER FOR HELIX-BREAKER
C        BBH - COUNTER FOR STRONG HELIX-BREAKER
C        SSF - COUNTER FOR STRONG SHEET-FORMER
C        SF  - COUNTER FOR SHEET-FORMER
C        IS  - COUNTER FOR SHEET-INDIFFERENT
C        BS  - COUNTER FOR SHEET-BREAKER
C        BBS - COUNTER FOR STRONG SHEET-BREAKER
C        TTH - TOTAL OF THE DIFFERENT HELIX COUNTERS
C              =HHF+HF+IIH+IH+BH+BBH
C        TTS - TOTAL OF THE DIFFERENT SHEET COUNTERS
C        K   - COUNTER
C
      PRINT 5
    5 FORMAT('0',32X,'*** COMPARISON OF ASSIGNMENTS TYPES ***'/)
C
      K=1
      GO TO 110
   10 IF (TTH.GT.TTS) PRINT 13,H1,H2,TTH,TTS
   13 FORMAT('0',15X,'H1 :',I4,3X,'H2 :',I4,5X,'TTH:',F6.3,3X,'TTS:',
     1 F6.3,10X,'TTH > TTS FROM H1 TO H2'/)
      IF (TTH.LT.TTS) PRINT 14,H1,H2,TTH,TTS
   14 FORMAT('0',15X,'H1 :',I4,3X,'H2 :',I4,5X,'TTH:',F6.3,
     1 3X,'TTS:',F6.3,10X,'TTH < TTS FROM H1 TO H2'/)
C
      K=2
      GO TO 120
   20 IF (TTH.GT.TTS) PRINT 23,S1,S2,TTH,TTS
   23 FORMAT('0',15X,'S1 :',I4,3X,'S2 :',I4,5X,'TTH:',F6.3,3X,'TTS:',
     1 F6.3,10X,'TTH > TTS FROM S1 TO S2'/)
      IF (TTH.LT.TTS) PRINT 24,S1,S2,TTH,TTS
   24 FORMAT('0',15X,'S1 :',I4,3X,'S2 :',I4,5X,'TTH:',F6.3,3X,'TTS:',
     1 F6.3,10X,'TTH < TTS FROM S1 TO S2'/)
C
C     TO CHECK THE BOUNDARIES OF THE OVERLAPPING AREA. IN CASE B-SHEET
C     IS CONTAINED WITHIN A-HELIX NO NEED TO CARRY OUT THE ANALYSIS FOR
C     THE OVERLAPPING AREA AGAIN.
C
      IF (SH(I-1).LT.AH(I-1).AND.SH(I).GT.AH(I-1).AND.SH(I).LT.AH(I))
     1 K=3
      IF (SH(I-1).LT.AH(I-1).AND.SH(I).GT.AH(I-1).AND.SH(I).LT.AH(I))
     1 GO TO 130
C
      IF (AH(I-1).LT.SH(I-1).AND.AH(I).GT.SH(I-1).AND.AH(I).LT.SH(I))
     1 K=4
      IF (AH(I-1).LT.SH(I-1).AND.AH(I).GT.SH(I-1).AND.AH(I).LT.SH(I))
     1 GO TO 140
C
C     TO CALL SUBROUTINE OLA3 TO CARRY OUT THE BOUNDARY ANALYSIS OF
C     EACH REGION
C
   50 CALL OLA3
      RETURN
C
  110 L1=H1
      L2=H2
      GO TO 200
  120 L1=S1
      L2=S2
      GO TO 200
  130 L1=AH(I-1)
      L2=SH(I)
      GO TO 200
  140 L1=SH(I-1)
      L2=AH(I)
      GO TO 200
C
C     TO CALCULATE THE DIFFERENT TYPES OF RESIDUES IN THE REGION L1-L2
C
  200 HHF=0
      HF=0
      IIH=0
      IH=0
      BH=0
      BBH=0
      SSF=0
      SF=0
      IS=0
      IS=0
      BS=0
      BBS=0
      TTH=0
      TTS=0
C
      DO 210 L=L1,L2
      IF (S(L,1).GT.1.16) HHF=HHF+2.00
      IF (S(L,1).GT.1.01.AND.S(L,1).LE.1.16) HF=HF+1.00
      IF (S(L,1).GT.0.98.AND.S(L,1).LE.1.01) IIH=IIH+0.50
      IF (S(L,1).GT.0.69.AND.S(L,1).LE.0.98) IH=IH+0.25
      IF (S(L,1).GT.0.57.AND.S(L,1).LE.0.69) BH=BH-0.50
      IF (S(L,1).LE.0.57) BBH=BBH-1.00
      IF (S(L,2).GT.1.38) SSF=SSF+2.00
      IF (S(L,2).GT.0.93.AND.S(L,2).LE.1.38) SF=SF+1.00
      IF (S(L,2).GT.0.75.AND.S(L,2).LE.0.93) IS=IS+0.25
      IF (S(L,2).GT.0.55.AND.S(L,2).LE.0.75) BS=BS-0.50
      IF (S(L,2).LE.0.55) BBS=BBS-1.00
  210 CONTINUE
      TTH=HHF+HF+IIH+IH+BH+BBH
      TTS=SSF+SF+0.0+IS+BS+BBS
      PRINT 211
  211 FORMAT(' ',11X,'HHF',6X,'HF',5X,'IIH',6X,'IH',6X,'BH',5X,'BBH',
     1 5X,'SSF',6X,'SF',6X,'IS',6X,'BS',5X,'BBS')
      PRINT 212,HHF,HF,IIH,IH,BH,BBH,SSF,SF,IS,BS,BBS
  212 FORMAT(' ',10X,11(F5.2,3X))
C
      IF (K.EQ.1) GO TO 10
      IF (K.EQ.2) GO TO 20
      IF (K.EQ.3) GO TO 230
      IF (K.EQ.4) GO TO 240
C
  230 PRINT 231
  231 FORMAT('0',28X,'*** ASSIGNM. TYPES IN OVERL. AREAS : ',
     1 'H1 TO S2 ***')
      IF (TTH.GT.TTS) PRINT 235,L1,L2,TTH,TTS
  235 FORMAT('0',15X,'OL1:',I4,3X,'OL2:',I4,5X,'TTH:',F6.3,3X,'TTS:',
     1 F6.3,10X,'TTH > TTS FROM H1 TO S2'/)
      IF (TTH.LT.TTS) PRINT 236,L1,L2,TTH,TTS
  236 FORMAT('0',15X,'OL1:',I4,3X,'OL2:',I4,5X,'TTH:',F6.3,3X,'TTS:',
     1 F6.3,10X,'TTH < TTS FROM H1 TO S2'/)
      GO TO 50
C
  240 PRINT 241
  241 FORMAT('0',28X,'*** ASSIGNM. TYPES IN OVERL. AREAS : ',
     1 'S1 TO H2 ***')
      IF (TTH.GT.TTS) PRINT 245,L1,L2,TTH,TTS
  245 FORMAT('0',15X,'OL1:',I4,3X,'OL2:',I4,5X,'TTH:',F6.3,3X,'TTS:',
     1 F6.3,10X,'TTH > TTS FROM S1 TO H2'/)
      IF (TTH.LT.TTS) PRINT 246,L1,L2,TTH,TTS
  246 FORMAT('0',15X,'OL1:',I4,3X,'OL2:',I4,5X,'TTH:',F6.3,3X,'TTS:',
     1 F6.3,10X,'TTH < TTS FROM S1 TO H2'/)
      GO TO 50
      END

      SUBROUTINE OLA3
C
C     OLA3 - PROCEDURE OF OVERLAPPING # 3
C
C     PURPOSE
C        TO COMPARE THE SUM OF THE BOUNDARY CONFORMATIONAL PARAMETERS
C        OF THE PREDICTED HELIX AND SHEET. ONLY THE 3 RESIDUES
C        BELONGING TO THE BOUNDARIES OF EACH SECTION AND THOSE 3
C        ADJACENT TO THE BOUNDARIES ARE CONSIDERED
C
      REAL A1,A2,S,T1,T2,TTH,TTS,P,HN,HC,NHN,NHC,SN,SC,NSN,NSC,
     1 HHF,HF,IIH,IH,BH,BBH,SSF,SF,IS,BS,BBS
      INTEGER H1,H2,S1,S2,AH,SH,IT1,IT2,D,R
      DIMENSION S(1000,20),AH(1000),SH(1000),M(1000),R(1000,16),
     1 D(1000,16),P(1000,10)
      COMMON A1,A2,S,T1,T2,TTH,TTS,P,HN,HC,NHN,NHC,SN,SC,NSN,NSC,
     1 HHF,HF,IIH,IH,BH,BBH,SSF,SF,IS,BS,BBS,H1,H2,S1,S2,AH,SH,
     2 IT1,IT2,D,R,NR,NT,NN,N,M,IM,I,K,J
C
C     DESCRIPTION OF PARAMETERS
C        HN  - SUM OF THE BOUNDARY CONFORMATIONAL PARAMETERS OF THE 3
C              RESIDUES BELONGING TO THE HELIX N-TERMINAL
C        HC  - SUM OF THE BOUNDARY CONFORMATIONAL PARAMETERS OF THE 3
C              RESIDUES BELONGING TO THE HELIX C-TERMINAL
C        NHN - SUM OF THE BOUNDARY CONFORMATIONAL PARAMETERS OF THE 3
C              RESIDUES ADJACENT TO THE HELIX N-TERMINAL
C        NHC - SUM OF THE BOUNDARY CONFORMATIONAL PARAMETERS OF THE 3
C              RESIDUES ADJACENT TO THE HELIX C-TERMINAL
C
C     REMARKS
C        THE DEFINITIONS OF SN,SC,NSN,NSC ARE SIMILAR TO HN,HC,NHN,NHC
C        EXCEPT THAT SHEET IS CONSIDERED INSTEAD OF HELIX
C
      HN=0
      HC=0
      NHN=0
      NHC=0
      HN=S(H1,8)+S(H1+1,8)+S(H1+2,8)
      HC=S(H2,9)+S(H2-1,9)+S(H2-2,9)
      IF ((H1-3).LE.0) NHN=0
      IF ((H1-3).LE.0) GO TO 1
      NHN=S(H1-1,6)+S(H1-2,6)+S(H1-3,6)
    1 IF ((H2+3).GT.NN) NHC=0
      IF ((H2+3).GT.NN) GO TO 2
      NHC=S(H2+1,7)+S(H2+2,7)+S(H2+3,7)
C
    2 SN=0
      SC=0
      NSN=0
      NSC=0
      SN=S(S1,10)+S(S1+1,10)+S(S1+2,10)
      SC=S(S2,11)+S(S2-1,
     1 11)+S(S2-2,11)
      IF ((S1-3).LE.0) NSN=0
      IF ((S1-3).LE.0) GO TO 3
      NSN=S(S1-1,12)+S(S1-2,12)+S(S1-3,12)
    3 IF ((S2+3).GT.NN) NSC=0
      IF ((S2+3).GT.NN) GO TO 4
      NSC=S(S2+1,13)+S(S2+2,13)+S(S2+3,13)
C
    4 PRINT 10,H1,H2,S1,S2
   10 FORMAT('0',12X,' BOUNDARY ANALYS. FOR HELIX FROM:',I5,' TO:',
     1 I5,3X,'AND FOR SHEET FROM:',I5,' TO:',I5/)
      PRINT 11
   11 FORMAT(' ',12X,'HN',7X,'SN',7X,'HC',7X,'SC',6X,'NHN',6X,'NSN',6X,
     1 'NHC',6X,'NSC')
      PRINT 12,HN,SN,HC,SC,NHN,NSN,NHC,NSC
   12 FORMAT(' ',10X,8(F5.2,4X)//)
      I=I+1
      RETURN
      END

Comparison of the predictive accuracy

In Tables 5 and 6 (p. 189 and 198) the results of Chou and Fasman (1974b), those of X-ray analysis (Chou and Fasman, 1974b) and those obtained before and after refinement of the program of this study are pooled together. The prediction of the different conformations of lysozyme (egg white) was chosen as an example of the output yielded by the present program (Appendix). The quality of prediction was assessed by the parameters Qα, Qβ and the coefficients Cα, Cβ calculated for most of the proteins used in this study (X-ray data were not available for some proteins). For the β-turn search, as the results of the present study were almost the same as those of Chou and Fasman (1979), it is reasonable to assume that the accuracy obtained in this study is comparable to that of Chou and Fasman (1979). Although Chou and Fasman reported their results and compared them to X-ray data (Chou and Fasman, 1979), this entire procedure will not be repeated again.

Tables 7 and 8 (p. 207 and 209) list
the values of Qα, Cα and Qβ, Cβ, respectively, as obtained by Chou and Fasman (1974b) and by the present program. As an extra reference, values reported by Argos et al. (1976), who used joint prediction histograms resulting from the combination of five computerized methods (including the method of Chou and Fasman), were also used.

The good agreement between X-ray data and the prediction from this study (C ≥ 0.40), except for concanavalin A and α-chymotrypsin (C = 0.39), was expected since this was the aim in refining the program. In general, the predictive accuracy reported by Chou and Fasman (1974b) was superior to that of Argos et al. (1976), except for cytochrome c and myogen (Qα, Cα). The paired-sample t-test revealed that the values of Cα (P < 0.01) and Cβ (P < 0.05) calculated for the present prediction were significantly improved from the values of Chou and Fasman (1974b).

One may argue about the validity of the parameters Q and C in this study, and the reliability of the present program when applied to unknown proteins, since the program was adjusted to fit X-ray data on the basis of a limited number of samples studied (24 proteins). Chou and Fasman (1978b) clearly showed, in dipeptides and tripeptides, the influence of neighbouring residues (n-1) and (n+1) on the conformation of amino acid n. They noted that the interactions of some residues with high α-helix or β-sheet potential may result in dipeptides or tripeptides with much lower conformational parameters (e.g., the combination of Lys and Glu). Hence efforts to improve the quality of the predictive methods are still necessary, and one may expect that the eventual program will become more and more complicated since so many different factors must be considered in order to obtain good agreement with X-ray data. Argos et al. (1976) suggested that a perfect predictive algorithm should include a consideration of energy minimization, thermalization, and long-range interactions. In their study, the use of joint prediction histograms, which were shown to be superior to any individual prediction, did not always yield good agreement with X-ray data.

Hence, in the present study, the modifications made to the present program are not completely useless because, if the models used are adjusted to fit experimental data, they can still provide some useful guidelines for unknown systems. In fact, in refining the present program, more emphasis was placed on the influence of the neighbouring residues, especially at the boundaries, and on the conformational potential of the adjacent segments. As a result, the number of overlapping areas between α-helix and β-sheet or between α-helix and β-turn was decreased (Table 5). In general, the predicted regions also had boundary residues with favorable conformational parameters. Hence, at least one may be confident that areas with strong potential for a specific conformation will not be missed when using the present program. Special situations may not permit the attainment of satisfactory results. Argos et al. (1976) and Matthews (1975a) agreed that no favorable prediction can be expected for unknown proteins unless they possess some common organization with the known ones through sequence homology. Further modifications of the present program will be made when additional data or extra rules for the predictive algorithm are reported by Chou and Fasman or other researchers.
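The accuracy measures used in this comparison can be made concrete with a small sketch. The thesis defines Q and C outside this section, so the formulas below are assumptions: Q is taken as the mean of the percentage of residues in a given conformation that are predicted correctly and the percentage of residues outside it that are predicted correctly, and C is taken as the correlation coefficient of Matthews. The function names and the eight-residue toy data are hypothetical, and Python is used only for brevity.

```python
import math


def counts(observed, predicted):
    """Return (tp, tn, fp, fn) for two equal-length 0/1 sequences."""
    tp = sum(1 for o, p in zip(observed, predicted) if o and p)
    tn = sum(1 for o, p in zip(observed, predicted) if not o and not p)
    fp = sum(1 for o, p in zip(observed, predicted) if not o and p)
    fn = sum(1 for o, p in zip(observed, predicted) if o and not p)
    return tp, tn, fp, fn


def q_percent(observed, predicted):
    """Assumed Q: mean of % correct inside and % correct outside the state."""
    tp, tn, fp, fn = counts(observed, predicted)
    pct_in = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    pct_out = 100.0 * tn / (tn + fp) if (tn + fp) else 0.0
    return (pct_in + pct_out) / 2.0


def correlation_c(observed, predicted):
    """Assumed C: Matthews correlation coefficient, 0.0 if undefined."""
    tp, tn, fp, fn = counts(observed, predicted)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy data: 1 = residue in the conformation, 0 = residue outside it.
    observed = [1, 1, 1, 1, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 1]
    print(q_percent(observed, predicted))      # 75.0
    print(correlation_c(observed, predicted))  # 0.5
```

Under these assumed definitions, a prediction that misses one of four helical residues and over-predicts one of four non-helical residues gives Q = 75 % and C = 0.5, which is the kind of value compared against the C ≥ 0.40 level mentioned above.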
As it has been emphasized by Fasman (1980), the lack of high accuracy of the present predictive methods should not stop researchers from using them to obtain a suggestive model for proteins. This will partially help to get an insight on protein behavior while X-ray data are not yet available.

In summary, in the present study, the major framework for the secondary structure search based on the method of Chou and Fasman (1978a, 1978b) has been computerized. Extra modifications which will be necessitated by the advent of improvements in the predictive methods are not perceived as being of any great obstacle to the use of the basic programs developed in this study.

Table 5. Comparison of Experimental (X-ray) and Predicted Helical Regions Obtained by Chou and Fasman^a and by the Present Program Before and After its Refinement.
P r e s e n t P r o g ;ram  Adenylate (194  Kinase  aa)  Carboxypeptidase A (307  aa)  Before  After  1-•14  1--9  23--31  Chou § Fasman  X-Ray  23--28  1-•8 c  23-•31  39-•49  41--48  40--48  41-•48  51--67  52--67  55--68  52--64  69--88  69--86  69--86  70-•86  97--109  98--108  97--109  99-•108  123- -132  123- -132  123- -132  124- -133  138- -152  143- -156  142- -151  142- -157  157- -167  157- -165  157- -164  159- -162  178- -194  180- -194  186- -194  178- -194  19--25  14--29  13--29  14--28  79--85  72--88  72--88  72--88  97--110  98--102  98--102  94--103  116- -122  116- -122  112- -122  170- -182  173- -186  173- -184  173- -187  215- -233  215- -233  215- -233  215- -231  1-•9  254- -262 286- -292  288- -305  297-302 (cont'd)  189  289- -305  288- -306  Table  5.  (cont'd) Comparison o f Experimental Helical by  Regions  (X-Ray) and  Predicted  O b t a i n e d by Chou and Fasman^ and  t h e P r e s e n t -Program B e f o r e and A f t e r i t s  Refinement Chou § P r e s e n t Program Before C o n c a n a v a l i n AJ a c k Bean (237  (245  aa)  X-Ray  After  32-40  38-42  38-43  80-85  81-86  155-160  155-160  180-188  178-190  180-189  53- 58  55- 60  55- 60  -  76- 90  78- 84  78- 84  -  111- 116  111- 116  111- 116  42-47  d  aa)  a-Chymotrypsin  Fasman  81-85  164 -173  C y t o c h r o m e bg (93 aa)  238- 244  233- 245  1-6  -  . 7-15  233- 245  234 -245  8-  9- 15  8- 15  31-39  33-  34- 39  33- 38  42-51  42-  43- 50  42- 49  53-76  54-  54- 61  55- 62  65- 74  64- 74  (cont'd)  190  Table  5.  (cont'd) Comparison o f Experimental Helical  (X-Ray) and P r e d i c t e d  Regions  O b t a i n e d b y Chou and Fasman by t h e P r e s e n t - P r o g r a m B e f o r e . and - A f t e r i t s  and  Refinement Present- Program Before Cytochrome  c  X-Ray  2-13  9-13  14-21 .  
14-18  After 2-20  2-22  Chou § Fasman  (104 aa)  49-54 59-69  55-69  59-69  62-70 71-75  77-102  a-Hemoglobin (141  aa)  1-8  89-101  88-101  91-101  4-17  4-17  3-18  21-36  20-36  20-35  8-17 25-34  36-42 45-64  53-73  53-73  52-71  79-94  79-84  80-89  68-76 79-94  86-93 98-103  94-113  120-129  120-138  (cont'd)  191  96-113  94-112  120-138 118-138  Table  5.  (cont'd) Comparison o f Experimental Helical  Regions  Obtained  a  (X-Ray) and  Predicted  by Chou and Fasman  and b y t h e P r e s e n t P r o g r a m : B e f o r e " a n d A f t e r i t s Refinement.  P r e s e n t Program  B-Hemoglobin (146  aa)  Chou § Fasman  X-Ray  Before  After  1- 23  6- 23  6- 23  4- 18  26- 35  26- 34  26- 34  19- 34  37- 45  -  -  35- 41  -  51- 56  51- 55  50- 56  59- 71  58- 78  59- 71  57- 76  73- 78  73- 82 82- 99 101- 106  85- 97  85- 98  85- 94  98- 117  101- 118  99- 117  123- 143  122- 135  123- 143  106- 118 122- 129  137- 145  129- 135 137- 144  Lysozyme (129.-i.aa)  3- 15  7- 15  7- 15  5- 15  27- 36  27- 35  27- 35  25- 35  -  79- 84  79- 84  79- 84  90- 98  89- 99  88- 99  88- 99  105- 112  107- 114  107- 114  108- 115  119- 124  119- 125  119- 124  (cont'd)  192  Table  5.  (cont'd) Comparison o f E x p e r i m e n t a l H e l i c a l Regions  (X-Ray) and  Predicted  O b t a i n e d b y Chou and Fasman  and by t h e P r e s e n t .Program^. B e f o r e - a n d A f t e r i t s :  Refinement.  P r e s e n t Program Before  Chou § Fasman  X-Ray  After  Myogen  1-6  1-9  1-6  (108  5-24  9-19  8-21  7.-15  24-55  26-33  26-33  26-33  40-50  40-52  40-51  59-79  57-77  57-77  67-71  81-92  81-88  81-88  78-89  aa)  Myoglobin (153  aa)  -  102-107  96-108  100-108  99-108  1-11  4-22  4-22  3-18  24-36  22-36  24-36  20-35  38-64  37-43  38-43  36-42  48-77  48-57  51-57  58-77  58-77  81-85  -  86-97  86-95  13-22  66-87 81-96 89-99 101-119  100-119  101-119  100-118  123-145  123-149  123-128  124-149  130-149  (cont'd)  1?3  Table  5.  
(cont'd) Comparison o f Experimental Helical  (X-Ray) and P r e d i c t e d  R e g i o n s O b t a i n e d by Chou and Fasman  and b y t h e P r e s e n t P r o g r a m . B e f o r e  and-After i t s  Refinement.  P r e s e n t Program Before Papain (212  Ribonuclease  S  aa)  'Staphylococcal Nuclease (149  X-Ray  After  5-10  aa)  (124  Chou § Fasman  aa)  24- 30  26- 35  26- 35  24 -41  47- 60  50- 58  50- 57  50 -57  69- 74  68- 77  68- 77  67 -78  -  118- 126  120- 126  117 -126  133- 143 136- 143  136- 143  138 -143  1--23  2--13  2--13  3--13  26--33  28--35  28--35  24--35  45--61  49--59  49--59  50--59  3-10  5-10  5-10  -  56-76  56-67  57-78 94-106  54-67  69-76 98-106  98-110  99-106  120-137 122-137  121-142  122-134  (cont'd)  194  Table  5.  (cont'd) Comparison of Experimental Helical  Regions  Obtained  (X-Ray) and  Predicted  by Chou and Fasman  and by t h e P r e s e n t P r o g r a m - B e f o r e ; and A f t e r i t s Refinement.  P r e s e n t Program  Subtilisin (275  aa)  Thermolysin (316  aa)  BPN'  Chou § Fasman  X-Ray  Before  After  8-13  _  -  15-20  15-19  13-19  14-20  69-75  66-7 5  64-75  64-73  110-120  111-116  111-116  103-117  130-145  132-145  132-145  132-145  195-200  195-200  195-200  -  226-238  223-238  222-238  223-238  -  -  -  242-252  267-275  269-275  267-275  269-275  53-59  55-60  53-58  -  67-74  67-77  67-74  65-88  136-144  137-150  137-150  137-152  163-172  160-180  158-180  159-180  236-241  234-246  238-246  235-246  261-267  261-273  261-271  259-274  280-295  281-295  281-295  280-296  299-313  302-313  301-313  302-313  5-10  175-180  (cont'd)  195  Table  5.  
(cont'd) Comparison o f Experimental Helical and  Regions  Obtained  Before Pancreatic Trypsin Inhibitor  After its  •  P r e s e n t • Program  Chou,§ Fasman  X-Ray  After 2-7  2-7  3-6  44-55  45-55  45-54  45-56  19-29  22-37  19-37  19-38  46-63  -  40-62  aa)  Myohemerithrin (118  aa)  33-3-9 44-51  58-65  53-66 68-85  68-84  70-84  69-87  91-104  92-110  86-96  93-110  100-108  106-115  Thioredoxin (108  by Chou and Fasman  b y the. P r e s e n t . P r o g r a m Before"• and  Refinement  (58  (X-Ray) and P r e d i c t e d  aa)  12-19  10-19  12-19  11-18  38-48  38-48  38-48  34-49 59-63  84-90  85-91  85-91  98-108  98-108  98-108  (cont'd)  196  95-107  Table  5.  (cont'd) Comparison o f Experiment Helical and  Regions  Obtained  (X-Ray) and P r e d i c t e d by Chou and Fasman  by t h e P r e s e n t P r o g r a m B e f o r e  and A f t e r i t s  Refinement.  P r e s e n t Program  Glucagon (29  Chou § Fasman  Before  After  14-27  15-27  19-27  9-14  5-10  5-10  49-59  48-56  47-55  47-55  44-53  45-51  X-Ray  aa)  Bovine Colostrum Inhibitor 6  (67  aa)  Russell's Toxin  17-23 48-59  27-36  Viper  6  (60  aa)  B l a c k Mgmba Toxin K (57 aa)  44-53  e  a  References (1974b).  t o the X-ray  d a t a a r e g i v e n b y Chou and Fasman  b p r e d i c t e d v a l u e s r e p o r t e d by Chou and Fasman c  Region  (1974b)  omitted i n prediction  dOverpredicted ®The r e s u l t s values.  region  o f Chou and Fasman  197  (1978b) s e r v e as r e f e r e n c e  Table  6..  Comparison. of E x p e r i m e n t a l 3-Sheet Regions O b t a i n e d  (X-Ray) and  Predicted  by Chou and Fasman  and by the- P r e s e n t P r o g r a m B e f o r e -and . A f t e r i t s Refinement.  
Present-Program  Adenylate (194  Kinase  aa)  Chou § Fasman  X-Ray  Before  After  9-14  10-14  10-15  10-15  27-39  29-39  26-35  34-39  80-85 89-92  90-95  88-93  89-95  113-118  113-118  110-118  114-118  151-157 169-175  169-174  169-175  182-188  Carboxypeptidase A (307  aa)  169-175  182-187  33- 42  32- 38  32- 38  32- 36  47- 52  47- 52  47- 52  49- 53  62- 66  61- 66  61- 68  60- 67  105- 107  103- 110  103- 111  104- 109  125- 133  125- 132  -  -  137- 141  137- 141  137- 141  -  189- 195  189- 195  191- 195  190- 196  200- 204  200- 204  200- 204  200- 204  206- 211  206- 211  206- 211  -  233- 234  -  234- 238 c  239- 241  243- 248  243- 249  243- 249  -  263- 269  263- 269  261- 269  265- 271  277- 281  277- 2 8 1  277- 281  -  (cont'd) 198  d  Table  6.  (cont'd) Comparison of E x p e r i m e n t a l 3-Sheet R e g i o n s O b t a i n e d  5 1  (X-Ray) and P r e d i c t e d  by Chou and Fasman'  and by t h e P r e s e n t . P r o g r a m B e f o r e  3  and A f t e r i t s  Refinement.  Concanavalin Jack  A  Bean  (237  aa)  a-Chymotrypsin (245  aa)  Present  Program  Before  After  Chou § Fasman  X- Ray  1-7  3-7  9-12  9-12  25-32  25-29  25-29  25- 29  49-57  47-55  47-55  48- 55  60-65  61-67  60-67  60- 67  79-82  73-79  73-80  73- 78  88-93  88-97  88-96  92- 97  106-109  105-115  106-113  106- 116  125-132  125-133  124-134  124- 132  137-143  140-143  140-144  140- 144  172-177  173-177  173-177  173- 177  193-199  191-199  190-200  190- 199  209-217  210-215  209-215  209- 215  226-230  228-232  229-234  -  29-33  29-34  29-34  29- 35  34-42  39-47  39-47  39- 47  51-54  51-54  50-54  50- 54  61-67  61-67  61-68  65- 68  88-91  85-91  85-89  86- 91  (cont'd) 199  3 -12  4- 9  Table  6.  (cont'd) C o m p a r i s o n o f E x p e r i m e n t a l " (X-Ray) and P r e d i c t e d 3  3-Sheet R e g i o n s O b t a i n e d  by Chou and Fasman  and by t h e P r e s e n t - P r o g r a m B e f o r e and A f t e r i t s Refinement.  
P r e s e n t .."Projgram  a-Chymotrypsin (245  aa)  (cont'd)  Chou § Fasman  X -Ray  Before  After  103 -108  103- -108  103- -108  103- -108  117 -122  117- -122  117- -123  119- -122  134- -143  134- -146  134- 146  134- -140  154- -158  155- •163  155- •163  155- -163  180- -182  179- •183  179- 184  179- -184  199- •204  197- •201  199- -203  207- -213  206- 213  206- 214  206- -214  227- -232  227- 232  226- 232  226- -230  2-•9  4- 7  4- 8  4- 6  20- •28  21- 29  21- 25  21- 25  30- 33  29- 33  29- 33  28- 32  -  -  50- 54  72- 76  73- 79  75- 79  75- 79  31- 36  32- 36  45T 49  -  46- 50  -  78- 83  78- 82  80- 85  140- -146  C y t o c h r o m e ^5 (93 aa)  Cytochrome C (104  aa)  (cont'd) 200  Table  6.  (cont'd) Comparison o f Experimental 8-Sheet R e g i o n s O b t a i n e d  (X-Ray) and P r e d i c t e d  by Chou and Fasman  and by t h e P r e s e n t P r o g r a m B e f o r e a n d ' A f t e r i t s :  Refinement  •  P r e s e n t Program  a-Hemoglobin (141  aa)  8-Hemoglobin (146  Before  After  36-39  38-43  40-43  38-43  X-Ray  _  -  37-45  37-42  35-42  1-6  2-6  2-6  38-46  38-46  38-43  38-46  53- 59  51-59  50-58  50-54  56-65  -  -  57-60  aa)  Lysozyme  Myogen (108 aa)  Myoglobin (153  Chou $ Fasman  aa)  (cont'd)  2 0 1  .1-3  Table  6.  (cont'd) Comparison of Experimental 3-Sheet Regions  Obtained  cl  (X-Ray) and P r e d i c t e d  by Chou and Fasman^  and by t h e P r e s e n t : P r o g r a m B e f o r e and A f t e r i t s Refinement.  
P r e s e n t Program  Papain (212  aa)  Chou $ Fasman  X- Ray  Before  After  -  4-9  4-9  5- 7  37-45  37-40  37-42  -  7.8-82  78-82  -  -  91-94  91-95  91-95  -  110-113  110-113  110-114  111- 112  130-136  130-134  130-135  -  161-166  161-166  161-167  162- 167  170-173  170-175  170-174  169- 175  186-188  184-189  185-189  185- 191  197-201  199-208  199-208  206- 208  43-48  43-47  43-48  41- 48  -  61-65  60-65  60- 65  69-82  69-76  69-76  69- 76  -  79-84  79-85  79-•87  95-110  95-102  96- 110  202-205  Ribonuclease (124  aa)  S  94-110  105-110 116-124  _  (cont'd) 202  115-124  116- 124  Table  6.  (cont'd) C o m p a r i s o n o f E x p e r i m e n t a l " (X-Ray) and P r e d i c t e d 3  g-Sheet Regions O b t a i n e d  by Chou and Fasman  and b y t h e P r e s e n t P r o g r a m B e f o r e , and A f t e r i t s Refinement.  Present  Staphylococcal Nuclease (149  aa)  Subtilisin (275  aa)  BPN'  Program  Chou § Fasman.  X--Ray  Before  After  12-15  13-18  12-18  12 -19  22-27  22-27  22-27  21 -27  32-41  30-39  32-41  30 -36  87-94  89-94  88-94  -  108-115  111-115  111-115  -  8-11  4-11  4-11  26-31  26-32  28-32  28 -32  -  44-51  44-51  45 -50  81-84  81-84  79-84  -  90-96  89-96  89-96  89 -94  103-111  103-108  103-108  -  116-124  119;--124 "  119-124  120 -124  147-150  147-152  147-152  148 -152  -  174-180  174-180  -  203-207  203-209  205-209  -  241-246  241-246  241-24.6  -  250-257  250-255  250-255  -  (cont'd)  203  Table  6.  (cont'd) Comparison o f Experimental 8-Sheet Regions and by  Obtained  a.  (X-Ray) and P r e d i c t e d  by Chou and Fasman  t h e P r e s e n t P r o g r a m .Before and A f t e r i t s  Refinement.  
P r e s e n t Program Before Thermolysin (316  aa)  Chou § Fasman  X-•Ray  4-•17  4-•13  After  1-•4  4--13  7--9  14- •20  17--33  21- • 3 3  20--32  15--32  39--42  39-•42  37--50  37--46  41- -50  41-•50 52--58  61--66  61-•66  61--66  71--84  78-•84  75-•84  98-•106  98--110  60- - 6 3 97--106  108- •110 110- -116.  112- -116  120- -122  120- •123  120- -124  128- -131  128- •131  127- -131  148- -157  151- -157  151- -157  119- -123  192- -197  192- -193  221- •225 249- -258  249- -258  272- •276  266- -274  Pancreatic Inhibitor  Trypsin  251- -260  18--24  16--24  16--23  16--24  29--35  27--35  27--38  27--36  (58 aa) (cont'd)  204  T a b l e 6.  (cont'd) Comparison o f Experimental  (X-Ray) and P r e d i c t e d  S h e e t R e g i o n s O b t a i n e d by Chou and Fasman and b y t h e P r e s e n t P r o g r a m B e f o r e and A f t e r i t s Refinement.  P r e s e n t Program  Myohemerithrin ;  After  13- 21  14- 21  Thioredoxin aa)  14- 18  X-Ray  -  44- 52  47- 51  (118 aa)  (108  Before  Chou § Fasman  4- 7  4- 8  4- 8  2- 8  22- 25  22- 29  22- 29  22- 29  52- 55  53- 60  53- 60  53- 58  77- 81  77- 81  77- 81  54- 60 77- 81  88-91  Glucagon  6  (29 aa)  Bovine Colostrum Inhibitor  (67 a a )  Russell's Viper Toxin  6  (60 aa)  3-7  6-10  5-10  20-29  20-26  19-27  21-29  21-26  21-26  -  36-38  36-38  -  5-10  5-9  20-27  20-27  23-27  34-37  31-37  32-37  (cont'd) 205  Table  6.  (cont'd) Comparison o f Experimental 3 - S h e e t Regions Obtained  (X-Ray) and P r e d i c t e d  by Chou a n d Fasman^  and by the- P r e s e n t P r o g r a m B e f o r e and A f t e r i t s Refinement.  P r e s e n t Program Before  After  -  4-7  4-9  18-23  21-25  21-25  23-31  22-35  29-35  B l a c k Mamba Toxin K  e  (57 aa)  a  References (1974b) .  Chou § Fasman  t o the X-ray  d a t a a r e g i v e n by Chou and Fasman  P r e d i c t e d v a l u e s r e p o r t e d by Chou and Fasman c  Region  X-Ray  (1974b).  omitted i n prediction.  ^Overpredicted region. 
^e The results of Chou and Fasman (1978b) serve as reference values.

Table 7. Agreement Factors Qα, Cα Obtained by Chou and Fasman^a, Argos et al.^b, and the Present Program^c

                                                Qα                  Cα
Protein                                    a     b     c       a     b     c
Carboxypeptidase A (bovine)               89    82    90      .81   .70   .83
Concanavalin A (Jack bean)                95    95    95      .40   .37   .39
α-Chymotrypsin (bovine)                   73    64    73      .39   .21   .39
Cytochrome b5 (bovine)                    84    82    89      .69   .67   .79
Cytochrome c (horse)                      73    89    74      .45   .78   .48
α-Hemoglobin (horse)                      81    72    79      .59   .38   .58
β-Hemoglobin (horse)                      83    64    84      .52   .25   .65
Lysozyme (hen egg white)                  94    79    94      .89   .59   .91
Myogen (carp)                             66    85    69      .35   .72   .42
Myoglobin (sperm whale)                   81    72    79      .67   .43   .71
Papain^d (papaya)                         88     -    89      .81    -    .82
Ribonuclease S^d (bovine)                 93     -    92      .87    -    .87
Staphylococcal nuclease^d                 85     -    87      .60    -    .66
Subtilisin BPN' (B. amyloliquefaciens)    80    76    80      .64   .55   .67
Thermolysin (B. thermoproteolyticus)      85    81    89      .74   .64   .80
Pancreatic trypsin inhibitor (bovine)     90    71    94      .82   .51   .87
Myohemerithrin (T. pyroides)              73    61    87      .42   .20   .70
Thioredoxin (E. coli)                     77     -    77      .54    -    .54

^a Results obtained by Chou and Fasman (1974b).
^b Results obtained by Argos et al. (1976).
^c Results obtained by our program.
^d Proteins not tested by Argos et al. (1976).

Table 8. Agreement Factors Qβ, Cβ Obtained by Chou and Fasman^a, Argos et al.^b,
and the Present Program^c

                                                Qβ                  Cβ
Protein                                    a     b     c       a     b     c
Carboxypeptidase A (bovine)               83    70    84      .54   .33   .70
Concanavalin A (Jack bean)                90    72    90      .77   .45   .78
α-Chymotrypsin (bovine)                   92    75    92      .80   .49   .82
Cytochrome b5 (bovine)                    85    82    86      .73   .67   .74
Cytochrome c^d (horse)                    89     -    90
α-Hemoglobin^d (horse)                    96     -    96
β-Hemoglobin^d (horse)                    95     -    96
Lysozyme (hen egg white)                  83    61    90      .68   .20   .78
Myogen^d (carp)                          100     -   100
Myoglobin^d (sperm whale)                100     -   100

Table 8. (cont'd)

81  82  93  87  87  88  57  64  63  89  54  17  52  75  80  44  47  54  85  81  97  89  88  89
Myohemerithrin (T. pyroides)  88  93
Ribonuclease S (bovine)  93
Staphylococcal nuclease  85
Subtilisin BPN' (B. amyloliquefaciens)  91
Thermolysin (B. thermoproteolyticus)  75
Thioredoxin  89
Pancreatic trypsin inhibitor (bovine)  95
Papain (papaya)  e
79  74  61  96

^a Results obtained by Chou and Fasman (1974b).
^b Results obtained by Argos et al. (1976).
^c Results obtained by our program.
^d Proteins with little or no sheet conformations were not tested by Argos et al. (1976).
^e Proteins not tested by Argos et al. (1976).

Conformations of some food related proteins

The second objective of this study was to obtain some information on the conformation of food related proteins such as bovine serum albumin (BSA), αs1-casein, β-casein, κ-casein, chymosin, α-lactalbumin, β-lactoglobulin, ovalbumin, pepsin, and trypsinogen. Table 9 lists the percentage of α-helix, β-sheet, and β-turn found for each protein using the modified program. Table 10 shows the possible locations corresponding to each of the different conformations. The schematic diagram of the tested proteins can be found in Figures I to X.

Some references were found to corroborate the reliability of the prediction from the present study. Loucheux-Lefebvre et al. (1978), using the method of Chou and Fasman (1974b), obtained 23% α-helix, 31% β-sheet, and 21% β-turn for κ-casein (bovine). These results are quite comparable to those of the present study (20, 33, and 29%). The locations of the different conformations were almost the same, except for helix 90-97, which was predicted as β-sheet by the present program, and helix 62-68, which was not predicted by Loucheux-Lefebvre et al. (1978). α-Lactalbumin was predicted by the method of Lim (1974b) to contain 43% helix and 12% β-sheet, compared to 38% helix and 15% β-sheet obtained in the present study. Ovalbumin was reported to be composed of 40% helix by Yang and Doty (1957) using ORD, while a value of 25-30% helix was found by Gorbunoff (1969). Extra references would be useful to evaluate the precision of the results of Yang and Doty (1957), of Gorbunoff (1969), and of the present study (44% helix).

The gastric proteases, chymosin and pepsin, are very homologous in their amino acid sequence, and their zymogens may even be activated by a similar mechanism (Foltmann et al., 1973). This is partially reflected in the prediction from the present study, which yielded high percentages of β-sheet and very low percentages of α-helix for both enzymes (40.2 versus 33.4% β-sheet, and 3.71 versus 1.8% α-helix, for chymosin and pepsin respectively). The difference in the values between the two enzymes may be explained by the difference in their source, chymosin being from a bovine source and pepsin from a porcine source. The pancreatic proteases, α-chymotrypsin and trypsin, also exhibit homology in their primary sequence and structure (Huang and Tang, 1970). Hence, it was not surprising to observe a very similar conformational pattern between the two enzymes: 33.5% β-sheet versus 9.0% α-helix for α-chymotrypsin with X-ray diffraction (Chou and Fasman, 1974b), and 31.0% β-sheet versus 13.5% α-helix for trypsin with the present program.

Although no reference was found for BSA, it may be reasonable to compare it to ovalbumin, as they both belong to the albumin group. The high percentage of α-helix predicted for BSA (52.1%) may be comparable to that of ovalbumin (44.1%). However, the percentage of β-sheet was much lower for BSA (2.2%) compared to 20.5% for ovalbumin. αs1-Casein and β-casein were predicted to contain very similar percentages of the three types of conformation (14.6, 26.1, and 30.1% for αs1-casein versus 13.9, 23.0, and 33.0% for β-casein). Unfortunately, there is no reference to check the present results. No reference was found to assess the precision of the prediction for β-lactoglobulin (35.8% α-helix and 30.9% β-sheet).

All the results concerning food related proteins should be considered as suggestive and should be confirmed by other techniques (CD, ORD, X-ray). Nevertheless, one advantage of the method of Chou and Fasman (1978a, 1978b) is that it allows the detection of areas exhibiting potential for both α-helix and β-sheet conformations. Hence, conformational changes observed with CD may be explained by the transitions that those sensitive areas have undergone. All the conformational transition phenomena may help to understand protein functionalities such as gelation, foaming, and emulsifying activity. It has been observed that denaturation of proteins must occur to some extent before those properties are actually exhibited. For instance, for glucagon (29 amino acid residues), it has been hypothesized that the transition from α- to β-conformation of the region 19-27 is necessary for receptor binding because of the more compact structure resulting from such a transition (Chou and Fasman, 1978b). It was also observed that glucagon in the gel state has a higher percentage of β-sheet (52%) than glucagon in solution (21%) (Gratzer et al., 1967; Epand, 1971). The predictive method can help to locate the sensitive area 19-27 (Chou and Fasman, 1978b). In sickle cell hemoglobin, the replacement of some α-formers or β-breakers by strong β-formers (Val) results in the transition from α- to β-conformation of the section 1-6. This leads to the aggregation of the cells due to interchain interactions replacing intrachain ones (Chou and Fasman, 1978b).

In summary, even though a complete picture of protein behavior cannot be expected without consideration of the three-dimensional organization, which has a great impact on the whole problem, knowledge of the secondary structure remains one of the useful means to explore the complex nature of proteins.

Table 9. Percentages of Helix, β-Sheet and β-Turn of Some Food Related Proteins Obtained from the Present Program

Protein                                  Helix (%)   Sheet (%)   Turn (%)
Bovine serum albumin (582 aa)              52.1         2.2        29.6
αs1-Casein (bovine) (199 aa)               14.6        26.1        30.1
β-Casein (bovine) (209 aa)                 13.9        23.0        33.0
κ-Casein (bovine) (169 aa)                 20.1        33.1        29.0
Chymosin (bovine) (323 aa)                  3.7        40.2        36.2
α-Lactalbumin (bovine) (123 aa)            38.2        14.6        37.4
β-Lactoglobulin (bovine) (162 aa)          35.8        30.9        17.3
Ovalbumin (385 aa)                         44.1        20.5        19.5
Pepsin (porcine) (326 aa)                   1.8        33.4        46.9
Trypsinogen (bovine) (229 aa)              13.5        31.0        35.4

Table 10. Helix, Sheet, and Turn Regions of Some Food Related Proteins as Predicted by
the Present Program

Bovine serum albumin (582 aa)
Helix: 6-33, 38-58, 63-70, 72-81, 100-106, 122-134, 140-145, 164-170, 179-187, 192-201, 206-221, 223-242, 289-295, 305-313, 318-329, 341-361, 373-381, 418-423, 450-463, 497-512, 517-533, 535-552, 573-581.
Sheet: 403-415.
Turns: 1-4, 34-37, 59-62, 82-85, 88-91, 95-98, 105-108, 107-110, 109-112, 116-119, 118-121, 135-138, 145-148, 155-158, 157-160, 171-174, 188-191, 202-205, 243-246, 245-248, 263-266, 265-268, 270-273, 276-279, 278-281, 284-287, 296-299, 301-304, 314-317, 332-335, 336-339, 363-366, 382-385, 424-427, 431-434, 435-438, 437-440, 443-446, 446-449, 464-467, 471-474, 474-477, 480-483, 482-485, 489-492, 513-516, 553-556, 559-562, 569-572.

αs1-Casein (bovine) (199 aa)
Helix: 13-18, 34-42, 52-65.
Sheet: 20-26, 30-32, 91-95, 97-101, 135-140, 142-146, 149-158, 163-173.
Turns: 1-4, 8-11, 27-29, 43-46, 45-48, 48-51, 66-69, 72-75, 87-89, 87-90, 112-115, 159-162, 174-176, 176-179, 182-185, 184-187, 188-191, 190-193.

β-Casein (bovine) (209 aa)
Helix: 1-6, 11-16, 29-37, 43-50.
Sheet: 23-27, 39-41, 52-60, 92-95, 123-130, 138-143, 160-165, 187-193.
Turns: 8-11, 17-20, 61-63, 62-65, 66-69, 71-74, 75-78, 85-88, 104-107, 109-112, 111-114, 136-317, 146-149, 152-155, 158-160, 166-169, 178-181, 180-183, 201-204, 203-206.

κ-Casein (bovine) (169 aa)
Helix: 1-7, 9-16, 62-68, 102-108, 137-147.
Sheet: 22-26, 28-32, 38-43, 48-56, 72-79, 93-98, 121-126, 159-169.
Turns: 69-72, 80-82, 85-88, 99-101, 109-112, 113-116, 127-129, 129-132, 133-136, 149-152, 156-158.

Chymosin (bovine) (323 aa)
Helix: 2-6, 318-323.
Sheet: 8-12, 20-22, 29-33, 40-42, 45-47, 65-69, 82-86, 91-97, 94-103, 105-108, 113-116, 122-126, 136-143, 148-156, 165-171, 180-183, 185-194, 198-204, 212-215, 229-240, 253-255, 275-277, 296-298, 301-303, 306-310.
Turns: 13-16, 24-27, 34-37, 36-39, 47-50, 50-53, 52-55, 59-62, 61-64, 76-79, 78-81, 87-90, 109-112, 127-130, 132-135, 144-147, 158-161, 161-164, 172-175, 176-179, 207-210, 208-211, 216-219, 218-221, 224-227, 226-228, 241-244, 247-250, 250-252, 272-274, 278-280, 279-282, 283-286, 291-294, 293-295, 312-315.

α-Lactalbumin (bovine) (123 aa)
Helix: 1-16, 89-99, 104-123.
Sheet: 26-31, 52-59, 72-75.
Turns: 17-20, 32-35, 33-36, 34-37, 43-46, 45-48, 47-50, 48-51, 61-64, 64-67, 66-69, 68-71, 76-79, 82-85, 85-88, 100-103.

β-Lactoglobulin (bovine) (162 aa)
Helix: 22-37, 67-78, 80-87, 129-143, 156-162.
Sheet: 1-5, 12-20, 39-43, 56-61, 92-95, 102-107, 115-123, 145-151.
Turns: 6-9, 49-52, 63-66, 88-91, 96-99, 125-128, 152-155.

Ovalbumin (385 aa)
Helix: 5-23, 31-41, 102-109, 133-143, 169-189, 198-206, 221-232, 239-245, 248-259, 259-268, 284-290, 319-334, 340-362, 373-379.
Sheet: 27-29, 51-56, 77-79, 86-91, 117-121, 145-149, 156-161, 194-196, 208-219, 276-282, 291-305, 364-371.
Turns: 24-27, 45-48, 47-50, 62-65, 65-68, 71-74, 73-76, 80-83, 92-95, 95-98, 97-100, 125-128, 152-155, 162-165, 165-168, 190-193, 235-238, 245-248, 269-272, 307-310, 311-314.

Pepsin (porcine) (326 aa)
Helix: 65-70.
Sheet: 15-21, 26-31, 38-40, 71-75, 83-91, 99-103, 111-115, 140-146, 151-155, 164-167, 179-182, 191-194, 203-205, 211-214, 228-231, 245-249, 259-267, 274-277, 298-313.
Turns: 11-14, 22-25, 32-35, 34-37, 35-38, 45-48, 50-53, 52-55, 54-57, 57-60, 59-62, 76-79, 79-82, 94-97, 96-99, 107-110, 116-119, 125-128, 129-132, 137-140, 147-150, 156-159, 158-161, 160-163, 171-174, 175-178, 187-190, 198-201, 200-203, 206-209, 207-210, 215-218, 217-220, 221-224, 223-226, 232-235, 238-241, 240-243, 250-253, 251-254, 255-258, 268-270, 270-273, 278-281, 279-282, 282-285, 288-291, 292-295, 293-296, 315-318.

Trypsinogen (bovine) (229 aa)
Helix: 92-102, 106-111, 141-146, 223-228.
Sheet: 12-18, 21-25, 28-30, 52-58, 61-64, 68-71, 82-87, 120-125, 161-172, 193-199, 211-221.
Turns: 3-6, 7-10, 26-27, 32-35, 46-49, 48-51, 65-67, 78-81, 88-91, 103-105, 112-115, 117-119, 126-129, 129-132, 132-135, 134-137, 149-152, 151-154, 154-157, 158-161, 173-175, 175-178, 177-180, 179-182, 181-184, 182-185, 200-203, 205-208, 208-210.

CONCLUSIONS

A computer program has been written in Fortran language to predict the secondary structure of proteins based on the method of Chou and Fasman (1978a, 1978b), which mainly relies on the frequency of occurrence of each amino acid residue in a certain conformation. This led to the classification of the 20 amino acids as either former of, indifferent to, or breaker of the various conformations. Four programs have been designed to locate each type of conformation involved in the secondary structure (α-helix, β-sheet and β-turn) and to solve the possible overlapping α- and β-areas. Each program consists of the main program and several subroutines which correspond to the various steps to be followed in the method (nucleation, propagation and termination), or to the various conditions to be checked (<Pα>, <Pβ>, character assignment, conformational parameters of the boundary residues, and possible presence of antiparallel β-sheets).
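The nucleation step named above can be sketched as a sliding-window scan. This is a minimal illustration and not the thesis's Fortran code. It assumes a window of six residues, a threshold of four formers (two thirds of the window), and fewer than one third breakers. The single-letter class codes (H and h for formers, I and i for indifferent residues, b and B for breakers) and the function name are hypothetical, and the sketch ignores any fractional weighting of indifferent residues.

```python
from typing import List

FORMERS = {"H", "h"}    # assumed codes for strong and ordinary helix formers
BREAKERS = {"b", "B"}   # assumed codes for ordinary and strong helix breakers


def helix_nucleation_sites(classes: str, window: int = 6,
                           min_formers: int = 4) -> List[int]:
    """Return 1-based start positions of windows meeting the assumed rule."""
    breaker_limit = window // 3  # breakers must stay strictly below one third
    sites = []
    for i in range(len(classes) - window + 1):
        segment = classes[i:i + window]
        formers = sum(c in FORMERS for c in segment)
        breakers = sum(c in BREAKERS for c in segment)
        if formers >= min_formers and breakers < breaker_limit:
            sites.append(i + 1)
    return sites


# Hypothetical class string for a ten-residue peptide.
print(helix_nucleation_sites("iHHhHbiiBB"))
```

A propagation step would then extend each site in both directions. That step is left out here because its stopping rule depends on parameter values not given in this section.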
For the β-turn search, because of the constant number of residues involved (four) and the less complicated predictive rule, the program corresponding to it is much simpler than the other ones.

On testing the present program on 24 different proteins, some missing areas and differences in the boundary residues between the results of the present study and those of Chou and Fasman were observed. After a thorough analysis of the problem, some modifications were added to the program of this study, including the following. The condition of at least two thirds of formers for helix nucleation may not be satisfied in some cases, although the eventually predicted area met the general requirement of being comprised of one half or more helix formers. Similarly, the requirement of less than one third of breakers for β-sheet nucleation may lead to the omission of a potential β-sheet area, although it contains enough β-formers. Hence, the type of residues in the nucleation area, as well as the surrounding residues, may stabilize the area conformation such that the presence of some breakers cannot provoke its disruption.

For the boundary residues of the predicted α- and β-areas, the use of the boundary conformational parameters (PαN, PαC, PβN, PβC) results in predictive values closer to those of Chou and Fasman (1974b, 1978b) and of X-ray data (Chou and Fasman, 1974b, 1978b). The
use of those parameters also helps to avoid predicting too many overlapping α- and β-areas, or overlapping α-helix and β-turn.

The method outlined by Chou and Fasman (1978a, 1978b) to solve the problem of overlapping α- and β-regions proved to be useful in most cases. However, ambiguous situations may occur where the area under consideration exhibits strong potential for both conformations. In such cases more emphasis should be given to the presence of antiparallel β-sheets and to the type of residues present in the area, although it may happen that the average <Pα> or <Pβ> does not support the same conformation as the residue assignment. The ratio of length of the predicted α-helix and β-sheet is another useful factor to evaluate the importance of each one. It is not unexpected that for the prediction of unknown proteins which exhibit some homology with known ones, this procedure gives fewer problems than for completely unknown proteins.

Comparing the predictive accuracy parameters Qα(β) and Cα(β) obtained by Chou and Fasman (1974b), Argos et al. (1976) and the present program, it appears that predictions from the present study and those of Chou and Fasman (1974b) are in general better than those of Argos et al. (1976). The paired-sample
t-test revealed that the values of Cα (P < 0.01) and Cβ (P < 0.05) calculated for the present prediction were significantly improved from the values of Chou and Fasman (1974b). For most of the proteins used in this study, except for concanavalin A and α-chymotrypsin (C = 0.39), good agreement with X-ray data (C ≥ 0.40) was observed as an expected consequence of the modifications given to the present program.

Stimulated by those positive results, the program developed in this study was applied to food related proteins so as to provide a possible means of explaining and predicting food protein behavior under various conditions. Although references could not be found for all of the proteins tested (bovine serum albumin, αs1-casein, β-casein, κ-casein, chymosin, α-lactalbumin, β-lactoglobulin, ovalbumin, pepsin, and trypsinogen), the predicted regions for κ-casein and α-lactalbumin were very similar to those reported by other researchers. They either used the method of Chou and Fasman (Loucheux-Lefebvre et al., 1978), or their own method (Lim, 1974b).

In summary, the main objective of this study, to computerize the method of Chou and Fasman (1978a, 1978b), was attained. Extra modifications of the program will be made when additional data or a new set of rules (incorporating long-range interaction and energy minimization factors) are published by Chou and Fasman or other researchers. So far most of the predictive methods do not always ensure high predictive accuracy and caution should be given to the prediction of unknown proteins.
Nevertheless, considering the cost and the lengthy and complex operations involved in the X-ray technique, the predictive algorithms still remain a valuable tool for access to the complicated organization of proteins while awaiting confirmation by X-ray analysis. Furthermore, the accuracy of the predictive methods may be improved by combining them with CD or ORD techniques, which constitute an additional means to solve ambiguous cases of overlapping α- and β-areas. The percentage of each conformation in proteins can be obtained using these techniques.

LITERATURE CITED

Anglemier, A. F. and Montgomery, M. N., 1976. In "Principles of Food Science", p. 205-284, Ed. Fennema, O. R., Part I, Marcel Dekker Inc.
Argos, P., Schwarz, J. and Schwarz, J., 1976. An assessment of protein secondary structure prediction methods based on amino acid sequence. Biochim. Biophys. Acta 439: 261-273
Anfinsen, C. B., Haber, E., Sela, M. and White, F. H., Jr., 1961. The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proc. Natl. Acad. Sci. U.S. 47: 1309-1314
Birktoft, J. J. and Blow, D. M., 1972. Structure of crystalline α-chymotrypsin. V. The atomic structure of tosyl-α-chymotrypsin at 2 Å resolution. J. Mol. Biol. 68: 187-240
Blout, E. R., de Loze, C., Bloom, S. M. and Fasman, G. D., 1960.
The dependence of the conformation of synthetic polypeptides on amino acid composition. J. Amer. Chem. Soc. 82: 3787-3789
Blout, E. R., 1962. In "Polyamino Acids, Polypeptides and Proteins", p. 275-279, Ed. Stahmann, M. A., University of Wisconsin Press, Madison.
Blundell, T., Dodson, G., Hodgkin, D. and Mercola, D., 1972. Insulin: the structure in the crystal and its reflection in chemistry and biology. Adv. Protein Chem. 26: 279-402
Bradbury, A. F., Smyth, D. G. and Snell, C. R., 1976. Lipotropin: precursor to two biologically active peptides. Biochem. Biophys. Res. Commun. 69: 950-956
Chou, P. Y., Wells, M. and Fasman, G. D., 1972. Conformational studies on copolymers of hydroxypropyl-L-glutamine and L-leucine. Circular dichroism studies. Biochemistry 11: 3028-3043
Chou, P. Y. and Fasman, G. D., 1973. Structural and functional role of leucine residues in proteins. J. Mol. Biol. 74: 263-281
Chou, P. Y. and Fasman, G. D., 1974a. Conformational parameters for amino acids in helical, β-sheet, and random coil regions calculated from proteins. Biochemistry 13: 211-221
Chou, P. Y. and Fasman, G. D., 1974b. Prediction of protein conformation. Biochemistry 13: 222-245
Chou, P. Y. and Fasman, G. D., 1977. β-turns in proteins. J. Mol. Biol. 115: 135-175
Chou, P. Y. and Fasman, G. D., 1978a. Empirical predictions of protein conformation. Ann. Rev. Biochem. 47: 251-276
Chou, P. Y. and Fasman, G. D., 1978b.
Prediction of the secondary structure of proteins from their amino acid sequence. Adv. Enzymol. 47: 45-148
Chou, P. Y. and Fasman, G. D., 1979. Prediction of β-turns. Biophys. J. 26: 367-384
Davies, D. R., 1964. A correlation between amino acid composition and protein structure. J. Mol. Biol. 9: 605-609
Dayhoff, M. O., 1972. Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Georgetown University Medical Centre, Washington, D.C.
Dayhoff, M. O., 1973. Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Georgetown University Medical Centre, Washington, D.C.
Dayhoff, M. O., 1976. Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Georgetown University Medical Centre, Washington, D.C.
Dayhoff, M. O., 1978. Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, Georgetown University Medical Centre, Washington, D.C.
Deber, C. M., Madison, V. and Blout, E. R., 1976. Why cyclic peptides? Complementary approaches to conformations. Acc. Chem. Res. 9: 106-113
Dickerson, R. E., Takano, T., Eisenberg, D., Kallai, O. B., Samson, L., Cooper, A. and Margoliash, E., 1971. Ferricytochrome c. I. General features of the horse and bonito proteins at 2.8 Å resolution. J. Biol. Chem. 246: 1511-1533
Dunn, B. M. and Chaiken, I. M., 1975.
Relation between α-helical propensity and formation of the ribonuclease S complex. J. Mol. Biol. 95: 497-511
Dzionara, M., Robinson, S. M. L. and Wittmann-Liebold, B., 1977. Secondary structure of proteins from the 30S subunit of the Escherichia coli ribosome. Hoppe-Seyler's Z. Physiol. Chem. 358: 1003-1019
Edelman, G. M., Cunningham, B. A., Reeke, G. N., Jr., Becker, J. W., Waxdal, M. J. and Wang, J. L., 1972. The covalent and three-dimensional structure of concanavalin A. Proc. Natl. Acad. Sci. U.S. 69: 2580-2584
Epand, R. M., 1971. Studies of the conformation of glucagon. Can. J. Biochem. 49: 166-169
Fasman, G. D., 1980. Prediction of protein conformation from the primary structure. Ann. N.Y. Acad. Sci. 348: 147-159
Fink, M. L. and Bodanszky, M. J., 1976. Secretin. VI. Simultaneous "in situ" syntheses of three analogues of the C-terminal tricosapeptide and a study of their conformation. J. Amer. Chem. Soc. 98: 974-977
Foltmann, B., Kauffman, D., Pari, M. and Maack Andersen, P., 1973. Comparison between the primary structure of chymosin (rennin), pepsin and of their zymogens. Neth. Milk Dairy J. 27: 288-297
Garel, A., Kovacs, A. M., Champagne, M. and Daune, M., 1975. Comparison between histones FV and F2a2 of chicken erythrocyte. I. Structure, stability and conformation of the free proteins. Biochim. Biophys. Acta 395: 5-15
Garnier, J., Pernollet, J.
C., Tertrin-Clary, C., Salerse, R., Casteing, M., Barnavon, M., de la Llosa and Jutisz, M., 1975. Conformational studies of ovine lutropin (luteinizing hormone) and its native and chemically modified subunits by circular dichroism and ultraviolet absorption spectroscopy. Eur. J. Biochem. 53: 243-254
Goldsack, D. E., 1969. Relation of amino acid composition and the Moffitt parameters to the secondary structure of proteins. Biopolymers 7: 299-313
Gorbunoff, M. J., 1969. Exposure of tyrosine residues in proteins. III. The reaction of cyanuric fluoride and N-acetylimidazole with ovalbumin, chymotrypsinogen, and trypsinogen. Biochemistry 8: 2591-2598
Gratzer, W. B., Bailey, E. and Beaven, G. H., 1967. Conformational states of glucagon. Biochem. Biophys. Res. Commun. 28: 914-919
Green, N. M., 1975. Avidin. Adv. Protein Chem. 29: 85-133
Guzzo, A. V., 1965. The influence of amino acid sequence on protein structure. Biophys. J. 5: 809-822
Havsteen, B. H., 1966. A study of the correlation between the amino acid composition and the helical content of proteins. J. Theor. Biol. 10: 1-10
Holladay, L. A. and Puett, D., 1976. Somatostatin conformation: evidence for a stable intramolecular structure from circular dichroism, diffusion, and sedimentation equilibrium. Proc. Natl. Acad. Sci. U.S. 73: 1199-1202
Huang, W. Y. and Tang, J., 1970. Carboxyl-terminal sequence of human gastricsin and pepsin. J. Biol. Chem.
245: 2189-2193
IUPAC-IUB, 1970. Commission on Biochemical Nomenclature. Abbreviations and symbols for the description of the conformation of polypeptide chains. Tentative rules (1969). Biochemistry 9: 3471-3479
Kabat, E. A. and Wu, T. T., 1973a. The influence of nearest neighbouring amino acid residues on aspects of secondary structure of proteins. Attempt to locate α-helices and β-sheets. Biopolymers 12: 751-774
Kabat, E. A. and Wu, T. T., 1973b. The influence of nearest neighbouring amino acids on the conformation of the middle amino acid in proteins: comparison of predicted and experimental determination of β-sheets in concanavalin A. Proc. Natl. Acad. Sci. U.S. 70: 1473-1477
Kato, A. and Nakai, S., 1980. Hydrophobicity determined by a fluorescence probe method and its correlation with surface properties of proteins. Biochim. Biophys. Acta 624: 13-20
Kawauchi, H. and Li, C. H., 1974. Reaction of human chorionic somatomammotropin and human pituitary growth hormone with tetranitromethane at 0° C. Arch. Biochem. Biophys. 165: 255-262
Kendrew, J. C., Dickerson, R. E., Strandberg, B. E. and Davies, D. R., 1960. Structure of myoglobin: a three-dimensional Fourier synthesis at 2 Å resolution. Nature 185: 422-427
Kinsella, J. E., 1976. Functional properties of proteins in foods: a survey. Crit. Rev. Food Sci. Nutrit. 7: 219-280
Kopple, K. D., Go, A. and Pilipauskas, D.
R., 1975. Studies of peptide conformation. Evidence for β-structures in solutions of linear tetrapeptides containing proline. J. Amer. Chem. Soc. 97: 6830-6838
Kotelchuck, D. and Scheraga, H. A., 1969. The influence of short-range interactions on protein conformation. II. A model for predicting the α-helical regions of proteins. Proc. Natl. Acad. Sci. U.S. 62: 14-21
Leberman, R. J., 1971. Secondary structure of tobacco mosaic virus protein. J. Mol. Biol. 55: 23-30
Lewis, P. N., Go, N., Go, M., Kotelchuck, D. and Scheraga, H. A., 1970. Helix probability profiles of denatured proteins and their correlation with native structures. Proc. Natl. Acad. Sci. U.S. 65: 810-815
Lewis, P. N. and Scheraga, H. A., 1971. Predictions of structural homologies in cytochrome c proteins. Arch. Biochem. Biophys. 144: 576-583
Lewis, P. N., Momany, F. A. and Scheraga, H. A., 1971. Folding of polypeptide chains in proteins: a proposed mechanism for folding. Proc. Natl. Acad. Sci. U.S. 68: 2293-2297
Lewis, P. N., Momany, F. A. and Scheraga, H. A., 1973. Chain reversals in proteins. Biochim. Biophys. Acta 303: 211-229
Liljas, A. and Rossmann, M. G., 1974. X-ray studies of protein interactions. Ann. Rev. Biochem. 43: 475-507
Lim, V. I., 1974a. Structural principles of the globular organization of protein chains. A stereochemical theory of globular protein secondary structure. J. Mol. Biol. 88: 857-872
Lim, V. I.
, 1974b. Algorithms for prediction of α-helical and β-structural regions in globular proteins. J. Mol. Biol. 88: 873-894
Loucheux-Lefebvre, M. H., Aubert, J. P. and Jolles, P., 1978. Prediction of the conformation of the cow and sheep κ-caseins. Biophys. J. 23: 323-334
Low, B. W., Lovell, F. M. and Rudko, A. D., 1968. Prediction of α-helical regions in proteins of known sequence. Proc. Natl. Acad. Sci. U.S. 60: 1519-1526
Matthews, B. W., 1975a. Comparison of the predicted and observed secondary structure of the T4 phage lysozyme. Biochim. Biophys. Acta 405: 442-451
Matthews, B. W., 1975b. In "The Proteins", 3rd ed., p. 403-590, Eds. Neurath, H. and Hill, R. L., Academic Press, New York
Mathews, F. S., Levine, M. and Argos, P., 1972. Three-dimensional Fourier synthesis of calf liver cytochrome b5 at 2.8 Å resolution. J. Mol. Biol. 64: 449-464
Muñoz, P. A., Warren, J. R. and Noelken, M. E., 1976. β-Structure of aqueous staphylococcal enterotoxin B by spectropolarimetry and sequence-based conformational predictions. Biochemistry 15: 4666-4671
Nagano, K., 1973. Logical analysis of the mechanism of protein folding. I. Prediction of helices, loops, and β-structures from primary structure. J. Mol. Biol. 75: 401-420
Nishikawa, K. and Ooi, T., 1972. Tertiary structure of proteins. II.
Freedom of dihedral angles and energy calculations. J. Phys. Soc. Jap. 32: 1338-1347
Nishikawa, K. and Ooi, T., 1973. In "Conformation of molecules and polymers", p. 173-188, Eds. Bergmann, E. D. and Pullman, B., Academic Press, New York.
Peña, C., Stewart, J. M., Paladini, A. C., Dellacha, J. M. and Santome, J. A., 1973. In "Peptides: chemistry, structure and biology", p. 523-528, Eds. Walter, R. and Meienhofer, J., Ann Arbor Science Publishers, Ann Arbor.
Periti, P. F., Quagliarotti, G. and Liquori, A. M., 1967. Recognition of α-helical segments in proteins of known primary structure. J. Mol. Biol. 24: 313-322
Prothero, J. N., 1966. Correlation between the distribution of amino acids and alpha-helices. Biophys. J. 6: 367-370
Ptitsyn, O. B. and Finkelshtein, A. V., 1970. Connexion between the secondary and primary structures of globular proteins. Biofizika 15: 757-767
Scanu, A. M., Edelstein, C. and Kein, P., 1975. In "The Plasma Proteins", 2nd ed., Vol. I, p. 317-391, Academic Press, New York.
Schiffer, M. and Edmundson, A. B., 1967. Use of helical wheels to represent the structure of proteins and to identify segments with helical potential. Biophys. J. 7: 121-135
Scheraga, H. A., 1960. Structural studies of ribonuclease. III. A model for the secondary and tertiary structure. J. Amer. Chem. Soc. 82: 3847-3852
Szent-Györgyi, A. G.
and Cohen, C., 1957. Role of proline in polypeptide chain configuration of proteins. Science 126: 697-698
Takano, T., Kallai, O. B., Swanson, R. and Dickerson, R. E., 1973. The structure of ferrocytochrome c at 2.45 Å resolution. J. Biol. Chem. 248: 5234-5255
Venkatachalam, C. M., 1968. Stereochemical criteria for polypeptides and proteins. V. Conformation of a system of three linked peptide units. Biopolymers 6: 1425-1436
Wallace, D. G., 1976. Prediction of the secondary and tertiary structure of plastocyanin. Biophys. Chem. 4: 123-130
Yang, J. T. and Doty, P., 1957. The optical rotatory dispersion of polypeptides and proteins in relation to configuration. J. Amer. Chem. Soc. 79: 761-775
Zimm, B. H. and Bragg, J. K., 1959. Theory of the phase transition between helix and random coil in polypeptide chains. J. Chem. Phys. 31: 526-535

APPENDIX

How to Use the Programs

After converting the entire protein sequence into a series of corresponding numbers, the following set of cards must be prepared as input data for the program of helix, β-sheet and β-turn prediction.

The first card of the set gives the total number of amino acid residues of the protein in question (NN) and the number of data cards (N). Each of those data cards is composed of 16 numbers or amino acid residues, except the last data card, which may or may not be filled with 16 numbers. The following format has been used for NN and N: (6X, I4, 6X, I4).
  Columns   1-6        7-10   11-16      17-20
  Content   6 blanks   NN     6 blanks   N

An example of how the first card looks for a protein of 164 amino acid residues (NN = 164 and N = 11):

  Columns   1-6        7-10   11-16      17-20
  Content   (blank)     164   (blank)      11

The protein sequence is reported on the subsequent cards (16 data per card) whose format has arbitrarily been chosen as 16I5. In other words, each of 16 numbers will occupy five columns on a current IBM card of 80-column width. To keep all the numbers right justified, the one-digit data should be located at columns 5n (n = 1, 2, 3, 4, ..., 16), and the two-digit ones should start at columns (5n-1). A typical data card may look like:

  Columns   1-5   6-10   11-15   16-20   21-25   26-30
  Datum       8     12      14       3       9      12

An echo print of the input data in the output enables the detection of any typographical error.

In summary, in order to use the programs for helix, sheet, and turn prediction, one has to enter the protein sequence in the form of an "introductory" card (which provides the total number of amino acids and the total number of data cards) followed by the actual data cards (16 data per card).

In addition to the protein sequence, extra information concerning the positions of the overlapping helices and sheets is necessary for the utilization of the overlapping program.
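The card layout described above can be illustrated with a short Python sketch that reads an introductory card in (6X, I4, 6X, I4) layout and the data cards in 16I5 layout. This is only an illustration and not part of the Fortran programs: the function names are hypothetical, and reading a blank field as zero is an assumption about how unfilled positions on the last card are handled.

```python
def parse_header(card):
    """Read NN and N from an introductory card laid out as (6X, I4, 6X, I4)."""
    nn = int(card[6:10])    # columns 7-10: total number of residues
    n = int(card[16:20])    # columns 17-20: number of data cards
    return nn, n


def parse_data_card(card, per_card=16, width=5):
    """Read 16 right-justified integers from a card in 16I5 layout."""
    card = card.ljust(per_card * width)
    values = []
    for k in range(per_card):
        field = card[k * width:(k + 1) * width].strip()
        # Assumption: a blank field on a partly filled card counts as 0.
        values.append(int(field) if field else 0)
    return values


def read_sequence(cards):
    """Return the NN residue codes from a header card followed by N data cards."""
    nn, n = parse_header(cards[0])
    codes = []
    for card in cards[1:1 + n]:
        codes.extend(parse_data_card(card))
    return codes[:nn]


if __name__ == "__main__":
    # Hypothetical 18-residue input: one full data card and one partial card.
    header = " " * 6 + f"{18:4d}" + " " * 6 + f"{2:4d}"
    data = [8, 12, 14, 3, 9, 12, 1, 1, 20, 5, 7, 11, 2, 16, 17, 4, 19, 10]
    cards = [header,
             "".join(f"{v:5d}" for v in data[:16]),
             "".join(f"{v:5d}" for v in data[16:])]
    print(read_sequence(cards))
```

Slicing by fixed column positions mirrors the right-justified fields of the card format, so a mispunched column shows up as a parsing error or a wrong value in the echo print.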
For this reason, the last data card of the protein sequence will immediately be followed by a second set of cards which consists of:

an "introductory" card of the same format as the first one (6X, I4, 6X, I4). The two numbers in question are the total number of values giving the positions of the overlapping helices and β-sheets (it will always be a multiple of four because pairs of helices and β-sheets are involved in the procedure), and the total number of data cards (16 data per card);

data cards which, for convenience, keep the format of 16I5 (5 columns for each datum) and carry the information on the positions of the different pairs of overlapping helices and β-sheets. On each card, the boundary values of helices and β-sheets are arranged in the following way:

  Datum     H1    S1    H2    S2    H3    S3
  Column     5    10    15    20    25    30

  H1: N-boundary of the helix H1-H2 (starting from H1 to H2)
  S1: N-boundary of the β-sheet S1-S2 (starting from S1 to S2)
  H2: C-boundary of the helix H1-H2
  S2: C-boundary of the β-sheet S1-S2
  H3: N-boundary of the helix starting from H3 to H4

Hence the relative positions of helices and β-sheets alternate with each other. An 80-column card can contain up to eight pairs of values.
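The alternating arrangement H1 S1 H2 S2 ... can be made concrete with a small Python sketch that regroups a flat list of boundary values into helix and β-sheet pairs. The function name and the dictionary layout are hypothetical choices for this illustration, and the boundary values in the example are invented.

```python
def split_overlap_values(values):
    """Group a flat list H1 S1 H2 S2 H3 S3 H4 S4 ... into overlapping pairs.

    Each block of four values is read as
    (helix N-boundary, sheet N-boundary, helix C-boundary, sheet C-boundary).
    """
    if len(values) % 4 != 0:
        raise ValueError("number of values must be a multiple of four")
    pairs = []
    for i in range(0, len(values), 4):
        h_n, s_n, h_c, s_c = values[i:i + 4]
        pairs.append({"helix": (h_n, h_c), "sheet": (s_n, s_c)})
    return pairs


if __name__ == "__main__":
    # Invented boundaries for two overlapping helix/sheet pairs.
    for pair in split_overlap_values([5, 10, 20, 25, 40, 38, 52, 47]):
        print(pair)
```

The multiple-of-four check corresponds to the remark that the total number of values on the introductory card is always a multiple of four.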
In summary, in order to use the overlapping program, two sets of cards must be prepared. The first set provides the computer with the information on the protein sequence and the second set, which immediately follows the first, contains data on the relative positions of the overlapping helices and β-sheets. For convenience, the same type of format is used in each set of cards.

Fig. I - Schematic diagram of the predicted secondary structure of bovine serum albumin

Fig. II - Schematic diagram of the predicted secondary structure of αs1-casein (bovine)

Fig. III - Schematic diagram of the predicted secondary structure of β-casein (bovine)

Fig. IV - Schematic diagram of the predicted secondary structure of κ-casein (bovine)

Fig. V - Schematic diagram of the predicted secondary structure of chymosin (bovine)

Fig. VI - Schematic diagram of the predicted secondary structure of α-lactalbumin (bovine)

Fig.
VII - Schematic diagram of the predicted secondary structure of β-lactoglobulin (bovine)

Fig. VIII - Schematic diagram of the predicted secondary structure of ovalbumin

Fig. IX - Schematic diagram of the predicted secondary structure of pepsin (porcine)

Fig. X - Schematic diagram of the predicted secondary structure of trypsinogen (bovine)

*********************************
ALPHA-HELIX PREDICTION
*********************************

TOTAL NUMBER OF AA:     129
NUMBER OF DATA LINES:     9

PROTEIN SEQUENCE

   12   20   14    8    2    5    7   11    1    1    1   13   12    2    9    8
   11    4    3   19    2    8   19   16   11    8    3   18   20    5    1    1
   12   14    7   16    3   14    3   17    6    1   17    3    2    3   17    4
    8   16   17    4   19    8   10   11    6   11    3   16    2   18   18    5
    3    4    8    2   17   15    8   16    2    3   11    5    3   10   15    5
   16    1   11   11   16   16    4   10   17    1   16   20    3    5    1   12
   12   10   20   16    4    8    4    8   13    3    1   18   20    1   18    2
    3    2    5   12    8   17    4   20    6    1   18   10    2    8    5    2
   11    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0

PRELIMINARY SEARCH FOR REGIONS WITH HELIX POTENTIAL     RULE 1
3 23 26 33 24 38 80 85 90 98 103 116 119 124
KM: 16

SEARCH FOR ACTUAL HELICES FROM THE POTENTIAL REGIONS
J: 3  JA: 8  T3: 4.0000  L: 1  HELIX NUCLEATION
JB JB JB JB J1 J1 9 7 3 13 3 3 PSEUDO-HELIX JC JC JD JC J2 J2 12 12 12 16 14 15 T3 T4 M5 T3 T4 T4 5 7 100 6 0000 3 3
7300 9 5000 10 0 0 0 0  FROM J 1  TO  BOUNDARY ,T1,T2,T5  (Jl  EVENTUAL  •16 17 18 19 20 21 22 23 24 25 26 27 27  JA JA JA JA JA JA JA JA JA JA JA JA J2  A N A L Y S I S OF THE N - T E R M I N A L 4.080 3.240 4.440 STEP  21 22 23 24 25 26 27 28 29 30 31 32 33  .  HELIX  FROM J 1 :  T3 T3 T3 T3 T3 T3 T3 T3 T3 T3 T3 T3 T4  OOOO OOOO OOOO 5000 5000 5000 OOOO OOOO OOOO 5000 5000 5000 5000  2 2 1 0 1 1 1 2 3 3 3 4 5  J2 J2  15  5.M0J1  CLOSE  TO  O  BOUNDARY A N A L Y S I S OF THE C - T E R M I N A L S T E P 3 . J 2 C L O S E TO 0 3.790 3 . 460 T1 , T 2 4.050 0.000043757 STEP 6 , 3.780 '3.850 T 1 , T 2 , T 5 TT 4.560 0.000034700 STEP7, 3 . 790 3 460 T 1 , T 2 , T 5 TT 0.000030110 0 B-TURN 5500 3 2200 11 5 . 0 3 0 0 B-TURN 5900 3 5100 O. 0 0 0 0 4 1 8 0 6 12 6100 B-TURN 2900 4 4700 O. 0 0 0 0 8 2 4 1 3 13 7300 B-TURN 8500 4 0500 0. 000043757 14 7800 B-TURN 4600 4 5600 0. 000034700 15 7900 B-TURN 4800 5 1700 O. 0 0 0 0 4 1 5 3 7 16 4600' B-TURN 2000 4 7500 0.000160201 17 5800 460 3 . 480 5 . 170 STEP 1 0 , J 2 = J 2 . M0J2 T 1 T2 , T 5 460 S T E P 16 T1 , T 2 3 480 J2 = J2 790 S T E P 19 4 .560 T1,T2,T5 J 2 - 1 ,M0J2 3 460 580 4 750 STEP 2 4 . J2+1 T1,T2,T5 ,M0J2 4 200 4 050 780 STEP 2 5 . J 2 - 2 T1,T2.T5 , RMJ2 3 850 790 STEP 2 8 . J 2 - 2 , T 1 , T2 RMJ2 4 390 730 S T E P 41 , J 2 - 3 T1,T2,T5 4 . 470 , RMJ2 3 290 030 STEP 5 4 , J 2 - 4 T1,T2,T5 3 . 220 , RMJ2 3 550 610 STEP 5 5 , J 2 - 4 T1,T2,T5 3.510 , RMJ2 3 590 3 .460 T1,T2,T5. 3 480 5.170 0.000041537 STEP 6 0 ,  LE,T1,T2,T5,TT,I LE , T 1 , T 2 , T 5 , T T , I LE.T.1 , T 2 , T 5 , T T , I LE,T1.T2,T5,TT,I LE,T1,T2,T5,TT,I LE,T1,T2,T5,TT,I LE,T 1 ,T2.T5,TT, I  J J J J J J J J J J J J J1  J2:  HELIX PROPAGATION H E L I X FORMERS I N 6 O V E R L A P P I N G R E S I D U E S T H E O R I T . AND A C T U A L # B R E A K E R S FROM J B TO J D 1 L : HELIX PROPAGATION A C T U A L AND T H E O R I T . H FORMERS FROM J 1 TO TT : 6 OOOO A C T U A L AND T H E O R I T . H FORMERS FROM J 1 TO . 
TT : 6 5 0 0 0  7  L L L L L L L L L L L L TT :  TO  J2 :  15  3 HELIX NUCLEATION HELIX NUCLEATION 3 4 HELIX NUCLEATION HELIX NUCLEATION 4 HELIX NUCLEATION 3 3 HELIX NUCLEATION 4 HELIX NUCLEATION 3 HELIX NUCLEATION 2 HELIX NUCLEATION 2 HELIX NUCLEATION HELIX NUCLEATION 2 1 HELIX NUCLEATION A C T U A L AND Tr 3 5000  J2 CLOSE 0 J 2 - 1 0 , M0J2 S E A R C H AT C - T E R M I N A L S E A R C H AT C T E R M I N A L SEARCH A T ' C - TERMINAL S E A R C H AT TERMINAL S E A R C H AT C - T E R M I N A L S E A R C H AT C - T E R M I N A L S E A R C H AT TERMINAL  J2+4  ***  ,RMJ2  V2.V3:  1 1  o  ***  PSEUDO-HELIX FROM J1:  LE LE LE LE LE LE LE  , T 1T2 T5.TT ,T1 T2 T5.TT ,T1 T2 T5.TT ,T1 T2 T5 , TT ,T1 T2 T5 , TT .T1 T2.T5.TT . T 1T2 T5 , TT  TO  J2:  33  BOUNDARY ANALYSIS OF THE N-TERMINAL T1,T2,T5 4.560 5 090 3.310 STEP 5,M0J1 CLOSE TO 0 B-TURN SEARCH AT N- TERMINAL 24 3.2200 5.1400 0 00005 1870 0 3.6900 B-TURN SEARCH AT N- TERMINAL 0 000165386 1 25 3.5300 4.3100 4.6700 B-TURN SEARCH AT N- TERMINAL 2 26 3.3800 4.7100 4.5800 0 0000287 17 3 B-TURN SEARCH AT N- TERMINAL 27 3.8100 5.1500 4.2 100 0 000007501 4 B-TURN SEARCH AT N- TERMINAL 3.3100 0 0OO02508 1 28 4.5600 5.0900 5 B-TURN SEARCH AT N- TERMINAL 29 4.9000 3.0100 0 0O0OO6671 4.5500 6 B-TURN SEARCH AT N- TERMINAL 30 5.0000 3.5900 3.5200 0 000037652 B-TURN PR.OBLEM T1 .T2,T5 4.560 5 090 3.3 10 LC: 1 STEP 12 ,M0J1, T1 .T2.T5 3.530 4 310 4.670 STEP 14, M0J1 B -T PROBL. STEP 22.M0J1 B -T PROBL. T1.T2.T5 3.240 4 270 4.720 STEP 26. M0J1 B -T PROBL. T1 .T2.T5 3.380 4 710 4.580 T1 ,T2,T5 3.530 4 310 4.670 STEP 29, J1 + 5 , RMJ 1 STEP 57. J1 + 2 , RMJ 1 T1,T2.T5 3.220 3 690 5.140  I I I I I  i  I  LE , T 1T2 T5.TT,I LE . T 1T2 T5 , TT I LE ,T 1 ,T2,T5.TT,I LE .T1 T2 T5.TT I LE ,T1 T2 T5.TT I LE , T 1T2 T5.TT I LE , T 1T2 T5 , TT I  27  BOUNDARY ANALYSIS OF THE C-TERMINAL T1,T2 4.570 3.240 STEP 3, J2 CLOSE TO O T1 .T2.T5 TT 5.220 3.320 3. 010 0.000028704 STEP 6. STEP7 , T 1.T2.T5 TT 4.570 3.240 3. 
780 0.000018405 B-TURN 29 4.9000 4.5500 0 000006671 0 3.0100 B-TURN 1 3.5200 0 000037652 30 5.0000 3.5900 2 B-TURN 31 5.1300 2.9300 3.7800 0 000021341 B-TURN 3 32 5.2200 3.0100 0 000028704 3.3200 4 B-TURN 33 4.5700 3.2400 3.7800 0 000018405 B-TURN 34 4.0800 5 4.3300 0 000040267 3.3900 B-TURN 35 4.0800 4.3300 6 3.3900 0 000096638 STEP 10, J2 = J2 , M0J2 T1 ,T2,T5 4.080 3 390 4.330 J2 = J2 T1 , T2 STEP 16 4.080 3 390 T 1,T2,T5 4.570 3 240 3.780 STEP 19 J2- 1 ,M0J2 STEP 24, J2+1 ,M0J2 T1 .T2.T5 4.080 3 390 4.330 T1 ,T2,T5 5.220 3 320 3.010 STEP 25. J2-2 . RMJ2 T1 .T2 STEP 28, J2-2, RMJ2 5.990 4 070  EVENTUAL HELIX FROM J1: J J J J JB JB JB J1  24 25 26 27 33 31 27 27  JA JA JA JA JC JC JD J2  29 30 31 32 36 36 36 36  PSEUDO-HELIX FROM J1:  T3 T3 T3 T3 T3 T4 M5 T4 27  3 3 3 4 4 5  0000 5000 5000 5000 5700 0000 3 7 5000 TO  J2:  27  TO J2 :  ***  V2.V3:  O  HELIX NUCLEATION HELIX NUCLEATION HELIX NUCLEATION HELIX NUCLEATION HELIX PROPAGATION HELIX FORMERS IN 6 OVERLAPPING RESIDUES THEORIT. AND ACTUAL H BREAKERS FROM JB TO JD L: 1 TT : 5 0000 ACTUAL AND THEORIT. 
# FORMERS FROM J1 TO L L L L  : : : :  36  2 2 2 1  35  J2 CLOSE 0 M0J2 J2- 10 SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL  35  LE LE LE LE LE LE LE  LE LE LE LE LE LE LE  T 1 T2.T5.TT, I T1 T2 T5.TT,I T1 T2 T5.TT,I T1 T2 T5 , TT , I T 1 T2 T5.TT,I T1 T2 T5 , TT , I T1 T2 T5.TT,I  BOUNDARY ANALYSIS OF THE N-TERMINAL T1 ,T2,T5 4.560 5 090 3.310 STEP 5.M0J1 CLOSE TO 0 B-TURN SEARCH AT N- TERMINAL 24 3.2200 3.6900 5.1400 0 000051870 0 1 B-TURN SEARCH AT N- TERMINAL 25 3.5300 4.3100 4.6700 0 000165386 B-TURN SEARCH AT N- TERMINAL 4.5800 0 000028717 2 26 3.3800 4.7100 B-TURN SEARCH AT N- TERMINAL 27 3.8100 0 000007501 3 5.1500 4.2100 4 B-TURN SEARCH AT N- TERMINAL 28 4.5600 5.0900 3.3100 0 000025081 5 B-TURN SEARCH AT N- TERMINAL 3.0100 0 000006671 29 4.9000 4.5500 6 B-TURN SEARCH AT N- TERMINAL 30 5.0000 3.5900 3.5200 0 000037652 T1 .T2.T5 4.560 5 090 3.310 LC: 1 STEP 12 .M0J1. B-TURN PROBLEM T1 ,T2,T5 3.530 4 310 4.670 STEP 14, M0J1 B -T PROBL. STEP 22. M0J1 B -T PROBL. T1 ,T2,T5 3.240 4 270 4.720 T1 ,T2,T5 3.380 4 7 10 4.580 STEP 26, M0J1 B -T PROBL. T1,T2,T5 3.530 4 310 4.670 STEP 29, J1+5 , RMJ 1 T1,T2,T5 3.220 3 690 5.140 STEP 57, J1+2 , RMJ 1 BOUNDARY ANALYSIS OF THE C-TERMINAL T1.T2 3.240 3.910 STEP 3, J2 CLOSE TO 0 T1 ,T2,T5, TT 4 .080 3. 390 4 .330 0.000096638 STEP 6, STEP7, T1.T2.T5, TT 3 . 240 3 .910 5 .150 O.000058913 B-TURN 0 32 5.2200 3 . 3200 3 .0100 0 000028704 1 B-TURN 3 . 2400 3 .7800 O OOOO18405 33 4.5700 B-TURN 2 3 . 3900 4 .3300 0 000040267 34 4.0800 B-TURN 3 . 3900 4 .3300 3 35 4.0800 0 000096638 4 B-TURN 3 .9100 5 .1500 36 3.2400 0 000058913 B-TURN 4 . 3500 4 .6800 5 0 000099602 37 3.3000 B-TURN 4 .5600 4 .1000 6 38 3.7400 0 000031194 T1 ,T2,T5 3 .300 4 35C 4 .680 STEP 10, J2 =J2 , M0J2 STEP 16 J2 = J2 T1 ,T2 3 .300 4 350  T1 T2,T5,TT,I T 1 T2,T5,TT,I T1 T2,T5,TT. I T1,T2,T5,TT,I T1 T2,T5,TT,I T1 T2,T5,TT. 
I T 1 T2,T5,TT,I  EVENTUAL HELIX FROM J1 : J  80  PSEUDO-HELIX J J J J1 J1  90 91 92 .91 90  JA  85  FROM JA JA JA J2 J2  T3 J1 :  95 96 97 98 98  PSEUDO-HELIX FROM J1:  T3 T3 T3 T4 T4 90  3.5000 79  TO J2: L: 0  TO J2 :  3.5000 3.5000 4.5000 5.5000 6.5000 TO  27  J2:  L: L: L: TT TT  84  ***  35  J2 CLOSE 0 M0J2 J2- 10 SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL  V2.V3:  0  )8  ***  HELIX NUCLEATION SPECIAL CASE  HELIX NUCLEATION 1 1 HELIX NUCLEATION HELIX NUCLEATION 1 4 OOOO ACTUAL AND THEORIT. 4 5000 ACTUAL AND THEORIT.  H H  FORMERS FROM J1 TO J2 FORMERS FROM J1 TO J2  98  BOUNDARY ANALYSIS OF THE N-TERMINAL T1.T2.T5 3.500 4.530 4.680 STEP 5.M0J1 CLOSE TO 0 LE,T1 ,T2,T5,TT, I 87 4.3400 4.1600 3.5500 0.000018842 0 B-TURN SEARCH AT N-TERMINAL LE,T1,T2,T5.TT , I 88 4.1000 4.3700 3.5200 0.000017229 1 B-TURN SEARCH AT N-TERMINAL LE.T1, T 2 , T 5 , T T , I 89 4.0800 4.4700 3.5500 0.000043301 2 B-TURN SEARCH AT N-TERMINAL  LE,T 1 , T2 ,T5,TT,I LE.T1 ,T2,T5,TT,I LE,T1,T2,T5.TT,I LE.T1 ,T2,T5,TT.I  LE.T1,T2 , T5,TT, LE.T1 ,T2 , T5.TT, LE.T1 ,T2 , T5.TT, LE,T1,T2 •T5.TT, LE.T1,T2 •T5.TT, LE.T1 ,T2 , T5,TT, LE,T1,T2 .T5.TT.  4 . 1700 4.1500 4 . 5300 4.6800 4 . 6 100 3.9 100 3 . 6500 4.4200 3 . 500 4530 4.680 370 3.520 4 . 100 080 4.320 3.690 470 3.550 4 .080 370 3.520 4 . 100 160 3.550 4 . 340  90 3.9200 91 3.5000 92 4.1500 93 4.2500 T 1 ,T2,T5 T1 ,T2,T5 . T1 ,T2,T5 . T 1,T2,T5 . T1 •T2.T5 . T 1,T2,T5  BOUNDARY ANALYSIS OF THE C-TERMINAL T 1 , T2 3.920 4. 590 STEP 3, J2 CLOSE TO O T1,T2,T5, TT 4.070 4 . 
790 3 .410 0.000005550 STEP 6, T1 ,T2,T5, TT 3 .920 4 590 3 .860 0.000020898 STEP7, 94 4.7400 5000 3 8700 0.000077456 0 B-TURN 95 8200 9100 3 1500 0.000027821 1 B-TURN 96 4600 7800 2 9900 0.000004358 2 8-TURN 97 0700 7900 3 4100 0.OOO0O5550 3 B-TURN 98 9200 5900 3 8600 0.OO002O898 4 B-TURN 99 4100 7400 4 9500 0.000234478 5 B-TURN 100 3600 5800 5 9100 0.000203148 6 B-TURN T1,T2,T5 3.410 3. 740' 4 . 950 STEP 10 J2 = J2 , M0J2 T 1 , T2 3.410 3. 740 STEP 16 J2 = J2 T1 ,T2,T5 3.920 4. 590 3.860 STEP 19 J2-1 ,M0«J2  EVENTUAL HELIX FROM J1: J J J JB J1  103 104 105 1 1 1 105  JA JA JA JC J2  108 109 1 10 114 112  T3 T3 T3 T3 T4  PSEUDO-HELIX FROM J1: 105  LE,T1,T2 T5 . TT LE,T 1 ,T2 T5.TT LE,T1,T2,T5,TT LE,T1 ,T2 T5, TT LE,T1,T2 T5, TT LE,T1,T2 T5.TT LE,T 1 ,T2 T5.TT  I I I I I I I  0.000021250 3 B-TURN SEARCH AT N-TERMINAL 0.000140820 4 B-TURN SEARCH AT N-TERMINAL 0.000034921 5 B-TURN SEARCH AT N-TERMINAL 0.000028372 6 B-TURN SEARCH AT N-TERMINAL LC: 2 STEP 12 ,M0J1, B-TURN PROBLEM STEP 14.M0J1 B-T PROBL. STEP 22.M0J1 B-T PROBL. STEP 26.M0J1 B-T PROBL. STEP 29, J1 + 5 ,RMJ1 STEP 57, J1+2 ,RMJ1  3 4 5 3 6  90  5000 OOOO OOOO 7500 5000  TO  J2 :  TO J2: L: L: L:  99  ***  HELIX NUCLEATION HELIX NUCLEATION HELIX NUCLEATION HELIX PROPAGATION ACTUAL AND THEORIT TT : 4 OOOO  J2 CLOSE 0 J2-10 , M0J2 SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL TERMINAL SEARCH AT SEARCH AT C- TERMINAL  V2.V3:  23'  2 2 1  M FORMERS FROM J1 TO J2  1 12  BOUNDARY ANALYSIS OF THE N-TERMINAL STEP 5.M0J1 CLOSE TO 0 T1 .T2.T5 4.230 4 790 3.680 B-TURN SEARCH AT N- TERMINAL 5.1800 0 102 3.6000 3.0900 0 000117 249 1 B-TURN SEARCH AT N- TERMINAL 5.1800 0 000015919 103 3.7000 3.2300 B-TURN SEARCH AT N- TERMINAL 2 104 4.1100 4.3800 0 000092656 3.5200 B-TURN SEARCH AT N- TERMINAL 105 4.6200 0 000032989 . 3 4.1400 3.7800 4 B-TURN SEARCH AT N- TERMINAL 106 4.2300 3.6800 0 000041504 4.7900 B-TURN SEARCH AT N- TERMINAL 5 107 4.9800 . 
4.7300 0 000001267 2.7800 6 B-TURN SEARCH AT N-TERMINAL 108 4.6400 5.2700 3.0800 0 000021603 B-TURN PROBLEM T 1,T2,T5 4.230 4 790 3.680 LC: 0 STEP 12 ,M0J1, T1 .T2.T5 3.700 3 230 5.180 STEP 14.M0J1 B -T PROBL. STEP 22.M0J1 B -T PROBL. T1 ,T2,T5 3.160 2 580 6.040  LE , T 1 ,T2,.T5,, TT,,I LE , T 1 .T2,. T5, , TT,,I LE . T 1 .T2,,T5,, TT, , I LE ,T1 .T2,,T5., TT,,I LE,T1,T2,,T5,. TT,I LE ,T1 .T2,T5,, TT I , LE ,T1 .T2,,T5,.TT, I  BOUNDARY ANALYSIS OF THE C-TERMINAL T1.T2 3.670 3.940 STEP 3, 02 CLOSE TO 0 T1 ,T2,T5, TT 3.750 4.120 4. 420 0.000132510 STEP 6, T1 ,T2,T5, TT 3.670 3.940 4. 650 0.000073624 STEP7, 108 4.6400 B-TURN 5.2700 3.0800 0. 00002 1603 0 1 B-TURN 109 4.5600 4.8300 3.0700 0. 000025633 B-TURN 1 10 4.1700 4.0200 4.1300 O. 000007027 2 1 1 13.7500 4.1200 3 B-TURN 4.4200 0. 000132510 1 12 3.6700 4 B-TURN 4.6500 3.9400 0. 000073624 1 13 3.8300 5 B-TURN 3.7500 4.7100 0. 000189688 1 14 3.7300 6 B-TURN 3.6100 4.7100 0. 000040602 T1 , T2.T5 3.830 3. 750 4.710 STEP 10, 02 = 02 . M002 T 1,T2 STEP 16 ,, 02 = 02 3.830 3.. 750 T1 ,T2,T5 3.670 3..940 4.650 STEP 19 ,, 02-1 ,M002 T 1, T2,T5 3.730 3. 610 4.710 STEP 24, 02+1 , M0J2 T1 ,T2,T5 3.750 4. 120 4.420 STEP 25, J2-2 , RM02 T 1,T2 STEP 28, J2-2, RM02 4.750 5. 310  EVENTUAL HELIX FROM 01:  °J  0 : 119 01: 119  OA: 124 02: 124  PSEUDO-HELIX FROM 01:  LE.T1 ,T2,T5,TT, I LE.T1 ,T2,T5,TT, I LE,T 1 ,T2,T5,TT,I LE.T1,T2,T5,TT,I LE,T1,T2,T5,TT,I LE.T1 ,T2.T5,TT,I LE,T1,T2,T5,TT,I  T3: 5.5000 T4: 5.5000 119  TO  02:  107  TO 02:  114  02 CLOSE 0 02-10 ,, M002 SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C- TERMINAL SEARCH AT C-TERMINAL SEARCH AT C- TERMINAL  ***  V2.V3:  24  37  L: 0 HELIX NUCLEATION TT: 3.0000 ACTUAL AND THEORIT. 
H FORMERS FROM 01 TO 02 124  BOUNDARY ANALYSIS OF THE N-TERMINAL STEP 5.M001 CLOSE TO 0 T1,T2,T5 4.670 5 000 3.100 0 B-TURN SEARCH AT N-TERMINAL 116 3.5700 3.2200 4.9900 O 0O0024614 1800 1 B-TURN SEARCH AT N-TERMINAL 117 3 4 700 4 .4800 O.OOO1045O9 5300 0.000025958 2 B-TURN SEARCH AT N-TERMINAL 3.9000 118 4 0100 0.000015142 1700 3 B-TURN SEARCH AT N-TERMINAL 119 4.6000 3 .6000 O.0OOO35514 OOOO 4 B-TURN SEARCH AT N-TERMINAL 3 .1000 120 4.6700 0.000020156 9000 5 B-TURN SEARCH AT N-TERMINAL 121 4 .6900 3 .0700 0.000000862 7300 6 B-TURN SEARCH AT N-TERMINAL 122 4 . 5600 3 .0400 670 5 000 3 . 100 LC: 1 STEP 12 ,M0U1, B-TURN PROBLEM T1 ,T2,T5 470 4 180 STEP 14.M001 B-T PROBL. T1,T2,T5 480 560 3 870 STEP 22.M001 B-T PROBL. T1 ,T2,T5 720 010 4 530 STEP 26.M001 B-T PROBL. T1.T2.T5 900 470 4 180 STEP 29. 01+5 .RM01 T1,T2,T5 480 570 3 220 4 .990 STEP 57. 01+2 .RM01 T1 .T2.T5  BOUNDARY ANALYSIS OF THE C-TERMINAL T1 ,T2 3.330 4.470 STEP 3, 02 CLOSE TO 0 T1,T2,T5,TT 3.710 4.650 3.940 0.000039396 STEP 6, T1,T2,T5,TT 3.330 4.470 4 170 0.000110850 STEP7, LE,T1,T2,T5.TT,I O.000035514 O 120 4.6700 5.0000 3.1000 B-TURN LE , T 1 ,T2,T5 , TT,I 121 4.6900 0.000020156 1 4.9000 3.0700 B-TURN LE ,T 1 ,T2.T5,TT,I 122 4.5600 0.000000862 2 4.7300 3.0400 B-TURN 0.000039396 3 LE.T1 ,T2.T5,TT, I 123 3.7100 4.6500 3.9400 B-TURN  02 CLOSE 0 02-10 . M002 SEARCH AT C-TERMINAL SEARCH AT C-TERMINAL SEARCH AT C-TERMINAL SEARCH AT C-TERMINAL  ***  LE,T 1 ,T2,T5, , TT , I LE.T1 ,T2,T5,, TT , I LE,T 1 .T2,T5, , TT I ,  124 3.3300 125 3.2500 126 3.4800 T1 ,T2,T5 T1 ,T2 T1 ,T2,T5 T 1,T2,T5 T1 ,T2,T5 T1 • T2 T1 ,T2,T5 T1 ,T2,T5 T 1.T2,T5  4 B-TURN SEARCH AT C -TERMINAL 4.4700 4 .1700 0 .000110850 4 .6500 i 5 B-TURN SEARCH AT C -TERMINAL 0 .000059173 3.8000 4 .2900 6 B-TURN SEARCH AT C -TERMINAL 4.1700 0 .000037464 STEP 10, J2=J2 , M0U2 3 . 250 3 .. 800 4.650 STEP 16 ,, J2=J2 3 . 250 3 . 800 STEP 19 ,, J2-1 ,M0d2 3 . 330 4 .. 470 4 . 170 STEP 24 , J2+1 , M0J2 3 . 480 4 . 170 4 . 290 . 
STEP 25 , J2-2 , RMd2 3 . 940 3 . 710 4 .650 . STEP 28, J2-2, RMJ2 4.410 5 .840 STEP 4 1 ,02-3 , RMd2 4 . 560 4 .. 730 3.040 STEP 54, J2-4 ,RMd2 4.670 5 .000 3 . 100 . STEP 55, J2-4 ,RMJ2 4 . 690 4 .900 3 .070  EVENTUAL HELIX FROM J1:  END OF PROGRAM CTN  119  TO J2 :  124  ***  V2,V3:  0  0  ***  ******************************** *  BETA-SHEET  PREDICTION  *********************************  TOTAL NUMBER OF AA : NUMBER OF DATA LINES: PROTEIN 12 1 1 12 8 3 16 12 3 1 1  129 9  SEQUENCE 20 4 14 16 4 1 10 2 0  14 3 7 17 8 1 1 20 5 0  8 19 16 4 2 1 1 16 12 0  PRELIMINARY  2 2 3 19 17 16 4 8 O  5 8 14 8 15 16 8 17 0  1 1 16 17 1 1 16 10 8 20 0  7 19 3 10 8 4 4 4 0  1 1 1 6 6 2 17 13 6 0  1 8 1 1 1 3 1 3 1 0  1 3 17 3 11 16 1. 18 0  13 18 3 16 5 20 18 10 0  12 20 2 2 3 3 20 2 0  2 5 3 18 10 5 1 8 0  SEARCH FOR REGIONS WITH SHEET POTENTIAL - RULE 2 1  6  19  21  21  25  25  34  37  41  51  53  53  69  73  76  76  80  83  85  87  89  9 1 17 18 15 1 18 5 0  8 1 4 5 5 12 2 2 O  89  95  95  99  104  109  1 11  1 13  1 18  121  121  129 IM = 36  SEARCH FOR ACTUAL SHEETS FROM THE POTENTIAL REGIO  K> cn CO  G : G : J1:  1 2 1  MA: MA: J2:  5 6 6  PSEUDO-SHEET FROM d1 :  T3 : 2 OOOO T3 : 3 OOOO T4 : 4 OOOO 1  TO  N: N: TT  2 1 3.OOOO  19 21 22 25 30 25 25  MA MA MA MA MC d2 d2  22 25 25 29 33 31 32  PSEUDO- SHEET FROM d1:  T3 T3 T3 T3 T1 T4 T4  1 2 2 3 4 4 4  25  TO  TO N: N: N: N: T2 TT TT  OOOO OOOO OOOO OOOO 7000 OOOO OOOO d2:  37 37 37 37  MA d2 d2 d2  41 42 45 45  PSEUDO-SHEET FROM d1:  T3 T4 T4 T4 37  3 3 5 5  OOOO OOOO OOOO OOOO TO d2:  1 2 2 1 3 5900 3 5000 4 OOOO  V2, V3  d2 :  25  SHEET NUCLEATION SHEET NUCLEATION SHEET NUCLEATION SHEET NUCLEATION SHEET PROPAGATION ACTUAL AND THEORIT'. tt FORMERS FROM d1 TO d2 ACTUAL AND THEORIT. 
tt FORMERS FROM d1 TO d2  32  EVENTUAL SHEET FROM d1 : G d1 d1 d1  tt .FORMERS FROM d1 TO d2  6  d2:  EVENTUAL SHEET FROM d1 G G G G MB J1 d1  SHEET NUCLEATION SHEET NUCLEATION ACTUAL AND THEORIT  25 N: TT TT TT  TO  0 3 OOOO 4 5000 4 5000 45  d2:  32  ***  V2,V3  36  SHEET NUCLEATION ACTUAL AND THEORIT. tt FORMERS FROM d1 TO d2 ACTUAL AND THEORIT. tt FORMERS FROM d1 TO d2 ACTUAL AND THEORIT. tt FORMERS FROM d1 TO d2  EVENTUAL SHEET FROM 01 : G G MB 01  51 53 58 53  MA MA MC 02  54 57 61 59  PSEUDO-SHEET FROM J1 :  G MB MB  60 56 56  MA MC MD  T3 : T3 . T1 T4  2.0000 4 .0000 3 . 6300 6 OOOO TO  59  EVENTUAL SHEET FROM 01 :  51  T3 3 OOOO T 1 4 2000 V6 3  PSEUDO-SHEET FROM <J 1 : 56  G G G G G G J1  66 73 76 83 87 89 88  MA MA MA MA MA MA J2  64  EVENTUAL SHEET FROM 01 :  56  PSEUDO- SHEET FROM J1:  T3 T3 T3 T3 T3 T3 T4  1 2 3 2 2 3 5  88  N N N N N N TT :  OOOO OOOO OOOO OOOO OOOO OOOO OOOO TO  02:  EVENTUAL SHEET FROM 01 : G G G G J1 01 01  95 96 104 105 105 105 105  PSEUDO-SHEET  MA MA MA MA 02 02 02  99 99 108 109 1 10 112 1 13  T3 T3 T3 T3 T4 T4 T4  FROM J1: 105  2 2 2 3 3 4 5  02 :  EVENTUAL SHEET FROM 01  TC  2 0 1 2 1 1 4 OOOO  02:  46  ***  V2,V3  8  SHEET SHEET SHEET ACTUAL  NUCLEATION NUCLEATION PROPAGATION AND THEORIT. H FORMERS FROM 01 TO 02  02 :  59  ***  V2,V3 :  SHEET NUCLEATION SHEET PROPAGATION THEORITIC. AND ACTUAL H BREAKERS  ***  17  FROM MB TO MD  02 :  65  SHEET SHEET SHEET SHEET SHEET SHEET ACTUAL  NUCLEATION NUCLEATION NUCLEATION NUCLEATION NUCLEATION NUCLEATION AND THEORIT. H FORMERS FROM 01 TO 02  02 :  94  SHEET SHEET SHEET SHEET ACTUAL ACTUAL ACTUAL  NUCLEATION NUCLEATION NUCLEATION NUCLEATION AND THEORIT. # FORMERS FROM 01 TO 02 AND THEORIT. H FORMERS FROM 01 TO 02 AND THEORIT. 
H FORMERS FROM 01 TO 02  *+*  -| 4  V2 V3  19  95 88 N N N N TT : TT : TT :  OOOO OOOO OOOO OOOO OOOO OOOO OOOO TO  TC  N 1 T2 : 4 7500 V8 : 1 02 :  69 76 80 86 90 93 95  TO  TC  N: 2 N 1 T2 : 3 8700 TT : 3 5000 02 :  64 59 64  53  38  TC  2 2 1 0 3 OOOO 4 OOOO 4 5000  ***  V2.V3  35  21  1 13 105  TO  02:  113  ***  V2.V3 :  36  32 ***  G : G : 01 : 01 :  1 1 1 1  1 1 18 18 18  PSEUDO-SHEET  G : 121 MB : 12G MB: 121 PSEUDO-SHEET  MA MA 02 02  : : : :  1 14 121 122 125  T3 T3 T4 T4  : : : :  FROM 01: 118  2 3 3 5  N: N: TT : TT :  OOOO OOOO OOOO OOOO 02:  125  EVENTUAL SHEET FROM 01:  120  MA : 125 MC : 129 MD : 129 FROM 01:  TO  T3 : 3 OOOO T 1 :3 4600 3 V6 : 12 1  TO  END OF PROGRAM  TO  N: 0 T2 : 4.1700 V8 : 1 02 :  EVENTUAL SHEET FROM 01: DO O Ul  0 1 2 5000 4 OOOO  SHEET NUCLEATION SHEET NUCLEATION ACTUAL AND THEORIT. If FORMERS FROM 01 TO 02 ACTUAL AND THEORIT. # FORMERS FROM 01 TO 02  02:  127  *** V2.V3 :  23  20 **  SHEET NUCLEATION SHEET PROPAGATION THEORITIC. AND ACTUAL ff BREAKERS FROM MB TO MD  129 121  TO  02:  129  ***  V2.V3  39  32  ***  OVERLAPPING RESOLUTION  TOTAL NUMBER OF AA : NUMBER OF DATA LINES:  129 9  PROTEIN SEQUENCE 12 1 1 12 8 3 16 12 3 1 1  20 4 14 16 4 1 10 2 0  14 3 7 17 8 1 1 20 5 0  8 19 16 4 2 1 1 16 12 0  2 2 3 19 17 16 4 8 0  5 8 14 8 15 . 16 8 17 0  7 19 3 10 8 4 4 4 0  11 16 17 1 1 16 10 8 20 0  1 1 1 6 6 2 17 13 6 0  1 8 1 1 1 3 1 3 1 0  1 3 17 3 1 1 16 1 18 0  13 18 3 16 5 20 18 10 0  12 20 2 2 3 3 20 2 0  2 5 3 18 10 5 1 8 0  9 1 17 18 15 1 18 5 O  8 1 4 5 5 12 2 2 0  113  119  120  124  124  (7\  PAIRS OF OVERLAPPING HELICES AND SHEETS 27  27  35  32  89  88  99  *** COMPARISON  94  107  107  114  OF THEIR LENGTH ***  ***** COMPARISON  L-HELIX:  RATI0=LH/LS: 1.0  L-SHEET:  OF P-HELIX AND P-SHEET *****  H1  27  H2  35  A1: 1.128  A2 : 1 .033  A1 •> A2  FROM H1 TO H2  S1  27  S2  32  A1: 1.058  A2 : 1 . 135  A1 < A2  FROM S1 TO S2  *** COMPARISON HHF 6 .00 H1 HHF 4 .00 S1  HF 4 .00 27 HF 2 .00 27  IIH 0.0 H2  IH 0. 
25 35  S2  32  BH -0.50  TTH: 9.750 IH 0. 25  IIH  0.0  OF ASSIGNMENTS TYPES ***  BH -O. 50  TTH: 5.750  BBH 0.0  SSF 2 .00  SF 3 .00  TTS: 4.250 BBH 0.0  SSF 2 .00  TTS: 4.750  IS O. 75 TTH  SF 2 .00  TTS IS O. 75  TTH > TTS  BS -0.50  BBS -1 .00  FROM H1 TO H2 BS 0.0  BBS 0.0  FROM S1 TO S2  BOUNDARY ANALYS. FOR HELIX FROM: HN 2.89  SN 3.84  HC 4.17  27  SC 2.61  TO:  NHN 3.81  35  AND FOR SHEET FROM:  NSN 3.33  *** COMPARISON OF THEIR LENGTH ***  NHC 3.61  L-HELIX:  27  TO:  32  NSC 2.11  11  L-SHEET:  7  RATIO=LH/LS: 1.0  ***** COMPARISON OF P-HELIX AND P-SHEET ***** H1  89  H2 :  99  A1: 1.030  A2: 1.105  A1 < A2  FROM H1 TO H2  S1  88  S2 :  94  A l : 0.933  A2: 1.164  A1 < A2  FROM S1 TO S2  *** p-HELIX AND P-SHEET OF INTERS. AREA : H1 TO S2 *** 89  OL1  0L2:  to  94  A1: 0.908  A2: 1.092  A1 < A2  FROM H1 TO S2  *** COMPARISON OF ASSIGNMENTS TYPES ***  <3  HHF 4 .OO  HF 5 .00  H1 HHF 2.00  S1  89 HF 2.00 : 88  HHF 2 .OO  HF 1 .00  IIH 0.0  IH O. 75  • H2  TTH:  99  IIH 0.0 S2 :  94  IIH 0.0  BH -O. 50  IH 0.75  BBH 0.0  9.250  BH -0.50 TTH: 4.250  IH 0. 75  BH -0.50  SSF 6.00  TTS: 7.250  89  0L2:  94  TTH: 3.250  BOUNDARY ANALYS. FOR HELIX FROM: HN 3.11  SN 3.97  HC 4 .06  SC 4.11  89  SF 2 .00  BBH O.O  SF 2 .00  SSF 2 .00  NHN 3.31  99 NSN 4.21  *** COMPARISON OF THEIR LENGTH ***  BS 1 . 50  BBS O.O  *** .  AND FOR SHEET FROM:  L-HELIX:  FROM H1 TO H2  BS -0.50  IS O. 50  TTH < TTS  NHC 3 . 63  BBS 0.0  IS BS BBS -O. 50 O. 50 0.0 TTH < TTS FROM S1 TO S2  H1 TO 52  TTS: 4.OOO TO :  IS O. 75 TTH > TTS  BBH SSF 0.0 4 .00 TTS: 6.000  *** ASSIGNM. TYPES IN OVERL. AREAS OL 1  SF 2 .00  FROM H1 TO S2 88  TO:  NSC 2 . 38  8  L-SHEET:  7  RATIO=LH/LS: 1.0  ***** COMPARISON OF P-HELIX AND P-SHEET ***** H1  : 107  H2 : 114  A1: 1.086  A2: 1.106  94  A1 < A2  FROM H1 TO H2  S1  107  S2  113  A1: 1 . 101  A2: 1. 131  A1 < A2 FROM S1 TO S2  *+* COMPARISON OF ASSIGNMENTS TYPES *** HHF 4 .00  HF 3 .00  H1 : 107 HHF 4 .00 S1  HF 3 .OO 107  IIH 0.0 H2  IH O. 
50  S2  IH 0. 25  BH -0.50  SN 4 .04  HC 3 .09  SC 3 . 30  BBH 0.0  SF 2 .00  107  TO:  SSF 2 .00  NHN 3 .66  1 14  IS 1 . 25 TTH > TTS  SF 2 .OO  TTS: 5.000  IS 1 .00  AND FOR SHEET FROM: NHC 4 . 07  L-HELIX:  6  BS 0.0  BBS 0.0  FROM H1 TO H2 BS 0.0  TTH > TTS  NSN 4.84  *** COMPARISON OF THEIR LENGTH ***  00  SSF 2.00  TTS: 5.250  TTH: 6.750  1 13  BOUNDARY ANALYS. FOR HELIX FROM: HN 3 . 37  BBH 0.0  TTH: 7.000  1 14  IIH 0.0  BH -0. 50  BBS 0.0  FROM S1 TO S2 107  TO:  1 13  NSC 2.40  L-SHEET:  5  RATIO=LH/LS: 1.0  ***** COMPARISON OF P-HELIX AND P-SHEET ***** H1 : 119  H2 : 124  A1: 1.127  A2: 1.190  A1 < A2  FROM H1 TO H2  S1 : 120  S2 : 124  A1: 1.150  A2: 1.320  A1 < A2  FROM S1 TO S2  *** COMPARISON OF ASSIGNMENTS TYPES *** HHF 2 .00 H1 HHF 2 .00 S1  HF 4 .00 1 19 HF 4 .OO 120  IIH O. 50 H2  124  IIH 0.0 S2  IH 0.0  BH 0.0  SN 4.20  END OF PROGRAM  HC 2 . 58  SC 3 . 29  119 NHN 3.51  SSF 4 .00  SF 2 .00  TTS: 5.250 BBH 0.0  TTH: 6.000  BOUNDARY ANALYS. FOR HELIX FROM: HN 3.85  BBH 0.0  TTH: 6.500 IH 0.0  124  BH 0.0  TTH > TTS  SSF 4 .00  SF 2 .00  TTS: 6.250 TO:  124 NSN 3 . 94  IS 0. 25  IS O. 25 TTH < TTS  AND FOR SHEET FROM: NHC 3 . 82  NSC 3 . 
26  BS 0.0  BBS - 1 .00  FROM H1 TO H2 BS 0.0  BBS 0.0  FROM S1 TO S2 120  TO:  124  ********************************* * *  *  BETA-TURN  PREDICTION  *  * * *********************************  TOTAL NUMBER OF AA: NUMBER OF DATA LINES:  129 9  PROTEIN SEQUENCE 12 1 1 12 8 3 16 12 3 1 1  20 4 14 16 4 1 10 2 0  14 3 7 17 8 1 1 20 5 0  8 19 16 4 2 1 1 16 12 0  2 2 3 19 17 16 4 8 0  5 8 14 8 15 16 8 17 0  7 19 3 10 8 4 4 4 0  1 1 16 17 1 1 16 10 8 20 0  1 1 1 6 6 2 17 13 6 0  1 8 1 1 1 3 1 3 1 0  1 3 17 3 1 1 16 1 18 0  DEFINITION OF PARAMETERS PRB : PROBABILITY OF OCCURRENCE OF THE B-TURN PRBO : PROBABILITY OF OCCURRENCE OF THE B-TURN A 1 :0 A 1 :0 A 1 :O A1 : 0  980 935 845 940  A2 A2 A2 A2  : : : :  1 1 1 0  142 190 062 810  A3 A3 A3 A3  POTENTIAL A 1 1 100 A1 1 210 A 1 1 390 A1 1 367 A 1 1 427 A 1 1 362 A1 1 252 A 1 1 147 A 1 0 927  BETA-TURN A2 0 947 A2 0 922 A2 0 832 A2 0 947 A2 0 885 A2 0 862 A2 0 887 A2 0 897 A2 0 822  4 A3 A3 A3 A3 A3 A3 A3 A3 A3  POTENTIAL A 1 :0 940 A 1 :0 947 A 1 :0 865  BETA-TURN A2 : 0 962 A2: 0 865 A2 : 0 870  : : : :  13 18 3 16 5 20 18 10 0  12 20 2 2 3 3 20 2 0  2 5 3 18 10 5 1 8 0  •  917 902 075 1 10  PRB : PRB: PRB : PRB :  0 0 0 0  0000260832 0000410533 0000635500 0000809601  0 0 0 0 0 0 0 0 1  7 867 795 662 642 645 732 805 877 1 17  PRB PRB PRB PRB PRB PRB PRB PRB PRB  0 0 0 0 0 0 0 0 0  0000199969 0000186667 00000284 20 00000941 1 1 0000087780 0000060648 0000301103 00004 18056 0000824127  PRB : 0 0000437570 PRB : 0 0000347003 PRB : 0 0000415369  8  4 5 5 12 2 2 0  STARTING FROM I STARTING FROM ( -1)  0 0 1 1  13 16 A3 : 1 .012 A3: 1 . 140 A3 : 1 . 292  9 1 17 18 15 1 18 5 0  1 2 3 4 5 6 7 8' 9 10 1 1 12 13 14 15 16  1  A 1: 0.895  A2 : 1.050  A3 : 1 . 187  PRB : 0 0001602011  17  POTENTIAL A 1: 0.837  BETA-TURN A2 : 0.957  17 20 A3 : 1 . 277  PRB : 0 0001182275  18  POTENTIAL  BETA-TURN  18  PRB : 0.00011823  PRBO: 0.00016020 A 1• 0.727 POTENTIAL  A2 : 1.010 BETA-TURN  A3  0. 732  POTENTIAL  A2  1 . 
155  BETA-TURN  A3  A1 A1 A1 A1 A1  B-TURN NOT AT  PRB : 0 0001574771  O.00015748  1 197  PRB  B-TURN NOT AT 0 0002064347  PRB  0.00020643  B-TURN NOT AT  270 180 180 285 167  PRB PRB PRB PRB PRB  0 0 0 0 0  00007 18997 0000580124 0000623697 0000518699 0001653858  21 22 23 24 25  POTENTIAL BETA-TURN A 1 0.845 A2 1 . 177 A 1 0 . 877 A2 1 . 287 A 1 1 .065 A2 1 . 272 A 1 1 . 150 A2 1 . 137 A 1 1 . 175 A2 0.897 A 1 1 . 282 A2 0.945 A 1 1 . 305 A2 0. 830 A 1 1.142 • A2 0.810 A 1 1 .020 A2 0. 847 A 1 1 .020 A2 0.847  25 A3 A3 A3 A3 A3 A3 A3 A3 A3 A3  1 1 0 0 0 0 0 0 1 1  28 145 052 827 752 880 732 752 945 082 082  PRB PRB PRB PRB PRB PRB PRB PRB PRB PRB  0 0 0 0 0 0 0 0 0 0  0000287166 0000075013 0000250810 0000066706 0000376522 0000213408 0000287039 0000184053 0000402674 0000966383  26 27 28 29 30 31 32 33 34 35  35 A3 A3  38 1 287 1 170  PRB PRB  0 0000589133 0 0000996024  36 37  37 A3 A3 A3 A3 A3  1 1 0 1 1  40 025 040 890 040 032  PRB PRB PRB PRB PRB  0 0 0 0 0  0000311938 0000373146 0000233034 0000332659 0001052027  38 39 40 41 42  BETA-TURN A2 0.977 A2 1 .087  POTENTIAL BETA-TURN A1 0.935 A2 1 . 140 A 1 1 .007 A2 1 .002 A 1 1 .047 A2 1 .077 A 1 1 .007 A2 1 .002 A 1 0.975 A2 0.960 POTENTIAL  BETA-TURN  42  19  BUT AT  20  23  20  1 1 1 1 1  POTENTIAL A1 0.810 A 1 0.825  BUT AT 20  A3 A3 A3 A3 A3  A2 A2 A2 A2 A2  17  19  0-. 975 1 .067 1 .067 0.922 1 . 077  0. 752 0.810 0.810 0.805 0.882  BUT AT  22 PRB  PRBO: 0.00015748  to o  1 302  19  PRBO: 0.00011823 A1  21  45  A1 A1  O. 787 0. 787  A2: 0.975 A2: 0.975  POTENTIAL A1 : 0.872 A1: 0.770 POTENTIAL  BETA-TURN A2: 0.887 A2: 0.842 BETA-TURN  A1: 0.795  A2: 0.807  POTENTIAL  BETA-TURN  A3: 1.257 A3: 1.257  PRB: 0.0000643061 PRB: 0.0002575084  43 44  44 47 A3 : 1 . 232 A3 : 1 . 
385 46 49  PRB: 0.0000305896 PRB: 0.0004730921  45 46  A3: 1.352  PRB: 0.000190524 1  47  47  PRBO: 0.00047309 A1: 0.795 POTENTIAL  A2: 0.807 BETA-TURN  PRB: 0.00019052 A3:  A2: 0.807 A2: 0.987  POTENTIAL A1: 0.775 POTENTIAL  POTENTIAL  0 0 1 1 0 0 0  887 992 152 050 940 907 875  POTENTIAL A 1 0 977 A1 0 960 A 1 0 882 A1 0 865 A 1 0 737  O.0001233840  B-TURN NOT AT 49 50  BETA-TURN A2: 0.987  50 53 A3: 1 . 280  PRB: 0.0001639226  51  BETA-TURN  51  BETA-TURN  A2 A2 A2 A2 A2 A2 A2  1 1 1 1 1 0 0  BUT AT  46  48  BUT AT  47  51  BUT AT  50  52  BUT AT  51  51  PRB: 0.000074647 1 PRB: 0.0002899796  A2: 1.090  47 48  A3: 1.352 A3: 1.247  54 PRB: 0.00016392  A3:  1.157  52  PRBO: 0.00016392 A1 A1 A1 A1 A1 A1 A1  PRB:  B-TURN NOT AT  PRB: 0.00012339  PRBO: 0.00028998 A1: 0.837  1.352  48  PRBO: 0.00019052 A 1 0.795 A 1 0.825  50  B-TURN NOT AT  PRB: 0.0001016651  52  55 PRB: 0.00010167  B-TURN NOT AT  280 187 325 147 010 967 985  A3 A3 A3 A3 A3 A3 A3  0 0 O 0 1 1 1  940 900 657 930 140 132 225  PRB PRB PRB PRB PRB PRB PRB  0 0 0 0 0 0 0  0000063427 0000122351 0000027842 0000195839 0000374550 0000537943 0003699916  53 54 55 56 57 58 59  BETA-TURN A2 1 105 A2 1 215 1 205 A2 A2 0 997 A2 0. 
842  59 A3 A3 A3 A3 A3  1 1 1 1 1  62 075 015 167 292 442  PRB PRB PRB PRB PRB  0 0 0 O 0  OO0135951 1 0000074547 0000106576 0000631370 0003364808  • 60 61 62 63 64  POTENTIAL A1: 0.807  BETA-TURN A2: 0.777  64 67 A3: 1.382  POTENTIAL  BETA-TURN  65  PRBO: 0.00033648 A1: 0.847 POTENTIAL  A2: 0.852 BETA-TURN  A3: 1.232 66  B-TURN NOT AT  PRB: 0.0000977233  65  BUT AT  64  66  BUT AT  65  70  BUT AT  69  70  BUT AT  71  71  BUT AT  72  66  69 PRB: 0.00009772  B-TURN NOT AT  A3 : 1 247 A3 : 1 247 A3 : 1 367  PRB : 0 0000477890 PRB : 0 0000390700 PRB : 0 0005213434  67 68 69  BETA-TURN POTENTIAL A 1 : 0.722 A2: 0.745  69 72 A3 : 1 365  PRB : 0 0000921187  70  POTENTIAL  70  ^  A2: 0.855 A2: 0.855 A2: 0.810  65  68 PRB: 0.00028602  PRBO: 0.00028602 A1: 0.737 A1: 0.737 A1: 0.685  PRB: 0.0002860161  BETA-TURN PRBO: 0.00052134  A1: 0.747 POTENTIAL  A2: 0.830 BETA-TURN  73 PRB: 0.00009212  A3: 1.375 71  POTENTIAL  A2: 0.967 BETA-TURN  PRB: 0.00012773 A3: 1.132 72  PRBO: 0.00012773 A1 A1 A1 A1 A1 A1  0 890 " 0 812 0 9 15 0 755 0 755 0 780  POTENTIAL A1 0 865 A 1 1 025 A 1 1 152 A 1 1 152 A1 0 990 A1 0 940  A2 A2 A2 A2 A2 A2  1 1 1 1 1 1  PRB: 0.0001277295  71  74  PRBO: 0.000092 12 A1: 0.907  B-TURN NOT AT  B-TURN NOT AT  PRB: 0.0001700662  72  75 PRB: 0.00017007  B-TURN NOT AT  077 067 245 057 057 022  A3 ' A3 A3 A3 A3 A3  1 1 0 1 1 1  072 225 952 185 185 152  PRB PRB PRB PRB PRB PRB  0 0 0 0 0 0  0000267724 00004 2854 1 0000345801 0000109324 0000238228 0001605189  73 74 75 76 77 78  BETA-TURN A2 0 830 A2 1 017 A2 1 045 A2 1 04 5 A2 1 025 A2 0 835  78 A3 A3 A3 A3 A3 A3  1 0 0 0 1 1  81 200 967 817 817 010 227  PRB PRB PRB PRB PRB PRB  0 0 0 0 0 0  0000391935 0000507419 0000229824 0000057240 0000202062 0000858498  79 80 81 82 83 84  POTENTIAL A1: 0.907  BETA-TURN A2: 0.910  84 87 A3: 1.197  POTENTIAL  BETA-TURN  85  A2 A2 A2 A2 A2 A2  85  88  PRBO: 0.0OOO8585 A 1 0.922 A 1 1 .085 A1 1 .025 A1 1 .020 A 1 0 . 980 A 1 O. 
800  PRB: 0.0001672002  PRB: 0.00016720  B-TURN NOT AT  84  1 .020 1 .040 1 .092 .1.117 1 .042 1 . 132  A3 A3 A3 A3 A3 A3  1 0 0 0 1 1  080 887 880 887 037 170  PRB PRB PRB PRB PRB PRB  0 0 0 0 0 0  0000135564 0000188424 0000172292 0000433009 0000212503 0001408203  86 87 88 89 90 91  POTENTIAL BETA-TURN A1 0.962 A2 1 . 152 A 1 0.987 A2 0.912 A 1 1 . 1 10 A2 0.875 A 1 1 . 205 A2 0.977 A1 1.115 A2 1 . 195 A1 1 .017 A2 1 . 197 A 1 0.980 A2 1 . 147 A1 0.852 A2 0.935  91 A3 A3 A3 A3 A3 A3 A3 A3  0 1 0 0 0 0 0 1  94 977 105 967 787 747 852 965 237  PRB PRB PRB PRB PRB PRB PRB PRB  0 0 0 0 0 0 0 0  0000349207 0000283722 0000774560 0000278207 0000043579 0000055502 0000208980 0002344783  92 93 94 95 96 97 98 99  POTENTIAL  99  PRB  0 00020314 77  100  A1  BETA-TURN  0.840  A2  POTENTIAL  0.645  BETA-TURN  A3 100  PRBO: 0.00023448 A1: 0.790  A2: 0.645  POTENTIAL  BETA-TURN  A2: 0.772  POTENTIAL  BETA-TURN PRBO: 0.00033996  A1: 0.925 A 1 : 1 .027  A2: 0.807 A2: 0.880  POTENTIAL BETA-TURN A 1 : 1 . 155 A2: 1.035 A1: 1.057 A2: 1.197 A1 : 1 .245 A2: 1.182  1 477 103  PRB: 0.00020315 A3: 1.510 101  PRBO: 0.00020315 A1: 0.900  102  PRB: 0.0003399635  100 101  104 PRB: 0.00033996  A3: 1.295 102  B-TURN NOT AT  B-TURN NOT AT  PRB: 0.0001172489  100 102  105 PRB: 0.00011725  B-TURN NOT AT  102  A3 : 1 295 A3: 1 095  PRB : 0 0000159186 PRB : 0 0000926563  103 104  104 107 A3 : 0 945 A3 : 0 920 A3 : 0 695  PRB : 0 0000329891 PRB : 0 0000415044 PRB : 0 0000012667  105 106 107  A1 1 A1 1 A1 1 A1 0  A2 A2 A2 A2  160 1 35 037 927  1.317 1 . 207 1 .005 1 .030  A3 A3 A3 A3  0 0 1 1  770 767 032 105  PRB PRB PRB PRB  O 0 0 O  0000216031 0000256332 0000070270 0001325099  108 109 1 10 1 1 1  POTENTIAL A1 O 832 A1 0 877  BETA-TURN A2 0.985 A2 0.937  1 1 1 A3 A3  114 1 162 1 177  PRB PRB  0 0000736242 0 0001896883  1 12 1 13  POTENTIAL A1 0 852 A 1 0 8 15  BETA-TURN A2 0.902 A2 0. 
967  1 13 A3 A3  1 16 1 177 1 180  PRB PRB  0 0000406022 0 0002571961  1 14 1 15  POTENTIAL A1 0 892 A 1 0 867  BETA-TURN A2 0.805 A2 1 .045  1 15 A3 A3  1 18 1 247 1 120  PRB PRB  0 0000246138 0 0001045087  1 16 1 17  POTENTIAL A1 1 002 A 1 1 150 A1 1 167 A1 1 172 A1 1 140 A 1 0 927 A 1 0 832 A 1 0 807 A1 0 865  BETA-TURN A2 1 . 132 A2 1 .042 A2 1 . 250 A2 1 . 225 A2 1.182 A2 1 . 162 A2 1.117 A2 0.95O A2 1 .04 2  1 17 A3 A3 A3 A3 A3 A3 A3 A3 A3  0 0 0 0 0 0 1 1 1  PRB PRB PRB PRB PRB PRB PRB PRB PRB  0 0 0 0 0 0 0 0 0  0000259582 0000151422 0000355142 0000201564 0000008619 0000393956 0001108504 0000591727 0000374636  1 18 1 19 120 121 122 123 124 125 1 26  END OF PROGRAM  77  120 975 900 775 767 760 985 042 162 072  
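The selection logic visible in the β-turn printout can be illustrated with a minimal Python sketch. It is not the thesis program, and several points are assumptions: A1, A2 and A3 are read as the average helix, sheet and turn potentials of the tetrapeptide starting at position I; the cutoff of 0.75 × 10⁻⁴ on PRB is taken from the published Chou and Fasman β-turn rule; and the function names `is_potential_turn` and `compare_adjacent` are hypothetical. The five rows are positions 16 to 20 of the printout.

```python
# Cutoff on the tetrapeptide probability. Assumption: taken from the published
# Chou-Fasman beta-turn rule, not confirmed as the value used by the program.
PRB_CUTOFF = 0.75e-4


def is_potential_turn(a1, a2, a3, prb, cutoff=PRB_CUTOFF):
    """Return True when a tetrapeptide qualifies as a potential beta-turn.

    a1, a2, a3 are assumed to be the average helix, sheet and turn
    potentials; prb is the probability of a turn starting at this position.
    """
    return prb > cutoff and a3 > 1.0 and a3 > a1 and a3 > a2


def compare_adjacent(rows):
    """Compare potential turns that start at consecutive positions.

    rows maps a start position I to (A1, A2, A3, PRB). For each pair of
    adjacent potential turns, PRB (position I) is compared with PRBO
    (position I-1) and the pair (rejected, kept) is recorded.
    """
    decisions = []
    for i in sorted(rows):
        if i - 1 not in rows:
            continue
        if not (is_potential_turn(*rows[i]) and is_potential_turn(*rows[i - 1])):
            continue
        prb, prbo = rows[i][3], rows[i - 1][3]
        decisions.append((i - 1, i) if prb > prbo else (i, i - 1))
    return decisions


# Positions 16 to 20 as listed in the printout: (A1, A2, A3, PRB).
rows = {
    16: (0.865, 0.870, 1.292, 0.0000415369),
    17: (0.895, 1.050, 1.187, 0.0001602011),
    18: (0.837, 0.957, 1.277, 0.0001182275),
    19: (0.727, 1.010, 1.302, 0.0001574771),
    20: (0.732, 1.155, 1.197, 0.0002064347),
}

for rejected, kept in compare_adjacent(rows):
    print(f"B-TURN NOT AT {rejected} BUT AT {kept}")
```

Position 16 fails the cutoff, so only positions 17 to 20 are compared. The sketch reports pairwise decisions only, in the manner of the B-TURN NOT AT ... BUT AT ... messages; how the program turns these into a final assignment is not shown in the printout and is not modelled here.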
