Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Rounding errors in digital computer arithmetic subroutines. Lastman, Gary Joseph 1963

You don't seem to have a PDF reader installed, try download the pdf

Item Metadata

Download

Media
UBC_1963_A8 L23 R6.pdf [ 1.63MB ]
Metadata
JSON: 1.0080564.json
JSON-LD: 1.0080564+ld.json
RDF/XML (Pretty): 1.0080564.xml
RDF/JSON: 1.0080564+rdf.json
Turtle: 1.0080564+rdf-turtle.txt
N-Triples: 1.0080564+rdf-ntriples.txt
Original Record: 1.0080564 +original-record.json
Full Text
1.0080564.txt
Citation
1.0080564.ris

Full Text

ROUNDING ERRORS IN DIGITAL COMPUTER ARITHMETIC SUBROUTINES  ty GARY JOSEPH LASTMAN B.A.Sc., The U n i v e r s i t y o f B r i t i s h Columbia,  A THESIS SUBMITTED I N PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS i n t h e Department of MATHEMATICS  We a c c e p t t h i s , t h e s i s as c o n f o r m i n g t o t h e required standard  THE UNIVERSITY .OF BRITISH COLUMBIA March,.1963  1961  In presenting  this thesis i n p a r t i a l fulfilment of  the r e q u i r e m e n t s f o r an advanced degree a t t h e U n i v e r s i t y  of  B r i t i s h Columbia, I agree t h a t t h e L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and for extensive  study.  I f u r t h e r agree t h a t p e r m i s s i o n  c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p u r p o s e s may  g r a n t e d by the Head o f my  Department o r by h i s  be  representatives.  I t i s understood t h a t copying or p u b l i c a t i o n of t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d w i t h o u t my w r i t t e n  Department o f  M  ATH  EN\A  The U n i v e r s i t y o f B r i t i s h Vancouver 8, Canada. Date  8  APRIL  TIC  S  Columbia, IU3  permission.  i  Abstract  I n t h i s t h e s i s we procedures..  i n v e s t i g a t e a r i t h m e t i c s u b r o u t i n e s and  round-off  An e r r o r a n a l y s i s o f s i n g l e o p e r a t i o n s i n n o r m a l i z e d  floating  p o i n t a r i t h m e t i c l e a d s us t o t h e c o n s t r u c t i o n o f an improved form o f a d d i t i o n subtraction subroutine. round-off procedures  I n . a d d i t i o n , t h e p r o p e r t i e s of s e v e r a l types of ;  a r e examined ( a d d i n g \;  adding random d i g i t s ;  dropping  digits). The  e x p e r i m e n t a l work ( u s i n g the above mentioned s u b r o u t i n e s ) w i t h t h e  system k = y,  y = -x shows t h a t the s y s t e m a t i c a c c u m u l a t i o n o f r o u n d - o f f e r r o r  observed by Huskey i s due t o the t y p e o f r o u n d i n g - o f f p r o c e d u r e Furthermore,  used.  H a r t r e e ' s e x p l a n a t i o n o f t h i s • e f f e c t . i s found t o be  because c a r r y i n g e x t r a d i g i t s throughout the s y s t e m a t i c  round-off.  inadequate  t h e c a l c u l a t i o n s does not e l i m i n a t e  iii Acknowledgement I w i s h t o thank Dr. T. E. H u l l and Dr. C h a r l o t t e F r o e s e f o r t h e i r suggestions  helpful  i n the p r e p a r a t i o n o f t h i s manuscript.  The f i n a n c i a l a s s i s t a n c e o f t h e Defence Research Board o f Canada ( g r a n t number DRB-95^0-03) i s g r a t e f u l l y  acknowledged.  11  Table of Contents  Page Chapter I  Introduction,  ].  Chapter I I  Computing Schemes and Round-off P r o c e d u r e s  2  Chapter I I I E r r o r .Analysis of Normalized F l o a t i n g P o i n t A r i t h m e t i c C h a p t e r IV  A S p e c i a l S u b r o u t i n e and Round-off P r o c e d u r e  Chapter V  6 il+ 20  Table I  Descriptions of P a r t i c u l a r Arithmetic Subroutines  21  Table I I  E x e c u t i o n Times o f t h e S u b r o u t i n e s  22  Table I I I  Experimental Results  23  Figure I  2k  Figure  2k  Chapter VI Bibliography  II Conclusions  28 29  '. - Chapter . I . . For the numerical s o l u t i o n of a mathematical  p r o b l e m on a d i g i t a l  we s h o u l d p r o v i d e some s o r t o f e r r o r a n a l y s i s t o account which are u s u a l l y present.  The p r o p a g a t e d  f o r round-off  computer errors,  r o u n d - o f f e r r o r may be g i v e n i n  terms o f a maximum upper bound o r an e x p e c t e d v a l u e and v a r i a n c e : t h e s t a t i s t i c a l e s t i m a t e i s much more s a t i s f a c t o r y . advocates  H e n r i c i \}\~\ s t r o n g l y  t h e usage o f s t a t i s t i c a l methods, b u t Huskey c a u t i o n s a g a i n s t  t h e i r a p p l i c a t i o n on t h e b a s i s o f some e x p e r i m e n t a l r e s u l t s , our i n t e n t i o n t o t r y t o e x p l a i n Huskey's r e s u l t s I b y examining a r i t h m e t i c s u b r o u t i n e s and r o u n d i n g - o f f p r o c e d u r e s ; t h a t H a r t r e e ' s e x p l a n a t i o n [j3 the o b s e r v e d  effect.  >  it is  v a r i o u s types o f  i n a d d i t i o n , . we w i l l show  °^ Huskey's r e s u l t s i s n o t s u f f i c i e n t t o e x p l a i n  F o r t h i s purpose we f i r s t d i s c u s s computing schemes and  r o u n d i n g - o f f p r o c e d u r e s 'in Chapter  II.  I n Chapter  I I I we g i v e an e r r o r  a n a l y s i s of single operations i n normalized f l o a t i n g p o i n t a r i t h m e t i c . s p e c i a l form o f s u b r o u t i n e and a s p e c i a l r o u n d i n g - o f f procedure i n Chapter  IV, w h i l e i n Chapter V we p r e s e n t some e x p e r i m e n t a l  A  are described results.  Chapter I I  I n g e n e r a l , e r r o r s are i n t r o d u c e d when any m a t h e m a t i c a l p r o b l e m i s s o l v e d by n u m e r i c a l  c o m p u t a t i o n on a d i g i t a l computer.  e f f e c t of round-off h e n c e f o r t h we  e r r o r s on the n u m e r i c a l  We w i s h to determine the  s o l u t i o n t o a problem, so  s h a l l be c o n c e r n e d w i t h the a r i t h m e t i c o p e r a t i o n s o f a d d i t i o n ,  s u b t r a c t i o n , m u l t i p l i c a t i o n and d i v i s i o n .  •Since t h e s e o p e r a t i o n s  are  dependent upon the manner i n w h i c h the c o m p u t a t i o n s are p e r f o r m e d ,  we  f i r s t d i s c u s s v a r i o u s computing schemes. Numbers i n a d i g i t a l computer can be r e p r e s e n t e d (m,e)  as an o r d e r e d  pair  where m i s the m a n t i s s a o f word l e n g t h H d i g i t s . a n d e i s the exponent;  b o t h are e x p r e s s e d i n the number base b, o f the machine. (m, e) i s a c t u a l l y m.b . e  A machine number  I f the exponent e i s a p r e d e t e r m i n e d c o n s t a n t  a l l c a l c u l a t i o n s we have the system known.as f i x e d p o i n t a r i t h m e t i c . a l l o w e t o v a r y between f i x e d l i m i t s , say  -a ^  e  i  -  a , t h e n we  for I f we  have  floating point arithmetic. W i t h f i x e d p o i n t a r i t h m e t i c we u s u a l l y d e a l w i t h & - d i g i t numbers of form (m,.o), w i t h  •?! < m  1  .  T h i s system i s v e r y i n f l e x i b l e s i n c e  must r e s c a l e numbers so t h a t t h e r e s u l t s o f n u m e r i c a l w i t h i n the open i n t e r v a l ( - l , l ) .  one  computations w i l l l i e  A l s o , the programmer must keep t r a c k o f  d e c i m a l p o i n t t h r o u g h o u t the c a l c u l a t i o n s .  the  the  But, by u s i n g a f l o a t i n g p o i n t  a r i t h m e t i c t h e programmer i s not r e q u i r e d t o l o c a t e the d e c i m a l p o i n t a t each s t e p i n a program, o r t o r e s c a l e .  Because o f t h e s e advantages o v e r f i x e d  p o i n t , f l o a t i n g p o i n t a r i t h m e t i c i s now  u s e d i n almost a l l computers.  I n c o n s t r u c t i n g a f l o a t i n g p o i n t a r i t h m e t i c we become aware of  two  f a c t s : . ( l ) , t h a t a l t h o u g h we can p e r f o r m c a l c u l a t i o n s u s i n g numbers whose magnitudes v a r y o v e r a c o n s i d e r a b l e can r e t a i n a t any  range, the maximum number o f d i g i t s  stage i s the word l e n g t h o f m,  the m a n t i s s a of our  we  floating  3p o i n t number;  (2)^ the r e p r e s e n t a t i o n f o r a number x, (m,e), i s not  For uniqueness we must t u r n t o a n o r m a l i z a t i o n c o n v e n t i o n  unique(*)-  such as b | m |  Even so, the r e p r e s e n t a t i o n f o r zero i s not u n i q u e ; h u t we may d e f i n e  <  1.  zero  as (o, -a),' where -a i s the lower bound f o r e . By r e s t r i c t i n g |m| t o t h e range  b"^ ^ |m| «£. 1  we r e t a i n a maximum number o f s i g n i f i c a n t d i g i t s a t  each c o m p u t a t i o n s t e p .  However, now we are u n c e r t a i n as t o the number o f  t r u l y s i g n i f i c a n t d i g i t s retained i n a long series o f c a l c u l a t i o n s .  This  l o s s o f i n f o r m a t i o n can be p a r t i a l l y e l i m i n a t e d b y t h e use o f an u n n o r m a l i z e d floating point or significant d i g i t arithmetic,  • But now we must  f o r f e i t the o b v i o u s p r o c e d u r a l . c o n v e n i e n c e s o f n o r m a l i z e d arithmetic.  f l o a t i n g point  Because o f t h i s the l a t t e r system i s more commonly u s e d i n  computers. I n summary:  f l o a t i n g p o i n t a r i t h m e t i c i s extremely, u s e f u l f o r comput-  a t i o n s where the magnitudes o f q u a n t i t i e s may v a r y w i d e l y . f l o a t i n g point i t i s p o s s i b l e t o represent i n memory o n l y the f i r s t  With  normalized  numbers i n a computer b y s t o r i n g  JL s i g n i f i c a n t d i g i t s and an exponent.  The f l o a t i n g  p o i n t system has two o b v i o u s d i s a d v a n t a g e s : • ( l ) , more c o m p l i c a t e d  computer  hardware i s needed t o h a n d l e b o t h the m a n t i s s a and the exponent, and ( 2 ) , ' f e w e r s i g n i f i c a n t d i g i t s can be c a r r i e d because a p o r t i o n o f a word must be used f o r the exponent.  I n s p i t e o f t h i s , f l o a t i n g p o i n t i s an e f f e c t i v e and  e f f i c i e n t method o f r e p r e s e n t i n g numbers w i t h i n a computer. Wow t h a t we have some p r a c t i c a l a r i t h m e t i c s t h a t can be h a r n e s s e d t o do  (*) T h i s i s i m m e d i a t e l y e v i d e n t m  f  , m*  arid e  / e*.  s i n c e we c o u l d have m b = m*b * e  e  with  h. our n u m e r i c a l work, we are i n a p o s i t i o n t o g i v e a q u a l i t a t i v e d e s c r i p t i o n o f rounding  errors.  The p r o c e s s o f r o u n d i n g - o f f i s t h a t p r o c e d u r e by which l o w - o r d e r are e l i m i n a t e d i n a number.  Consequently  digits  r o u n d - o f f e r r o r i s t h e d i f f e r e n c e ••  between a f i n i t e p r e c i s i o n number ( a f t e r t h e l o w - o r d e r d i g i t s are e l i m i n a t e d ) and the c o r r e s p o n d i n g i n f i n i t e p r e c i s i o n number.  B e a r i n g i n mind the  computer and i t s a r i t h m e t i c u n i t we c l a s s i f y as r o u n d - o f f e r r o r s those e r r o r s i n d u c e d by s p e c i a l r e p r e s e n t a t i o n s o f numbers and a l s o those e r r o r s r e s u l t i n g from a r i t h m e t i c o p e r a t i o n s .  B o t h are caused by r o u n d i n g - o f f .  S e v e r a l schemes have been proposed  t o implement the r o u n d - o f f p r o c e s s .  I n each a number x i s changed t o x, i t s rounded form. r o u n d i n g - o f f o p e r a t o r (R: x < o  X  >  x) :  we round t h e a b s o l u t e v a l u e o f x,  at a l l .  Let S  if x > o |x| ;  L e t R denote t h e  we round x i t s e l f ; i f x = o we do not  i f round  be t h e e r r o r i n the l e a s t s i g n i f i c a n t d i g i t r e t a i n e d i n  a f t e r r o u n d i n g - o f f ; 6"  |x| ,  g i v e s a measure o f t h e accuracy, o f a r o u n d i n g  procedure. F o r e v e r y d e f i n i t i o n o f R we o b t a i n a d i f f e r e n t r o u n d i n g scheme : (i)  ,Add b/2 t o  |x| a t the p o s i t i o n w h i c h i s t o be dropped.(*)  w i l l r e s u l t i f t h i s d i g i t i s b/2 (ii)  A carry  or greater.  Make the l o w e s t - o r d e r r e t a i n e d d i g i t o f  |x| e q u a l t o b/2 r e g a r d l e s s  o f t h e c o r r e c t v a l u e of t h a t d i g i t and the d i g i t s i n lower o r d e r s .  (iii)  Add a "0" o r  , n  l"  randomly i n t o t h e l o w e s t - o r d e r r e t a i n e d d i g i t r e g a r d -  l e s s o f t h e v a l u e s o f the d i g i t s o f D i s c a r d low-order d i g i t s .  |x|  The e r r o r i s  (*) b i s the number base o f the machine.  '5-  E v e r y one o f the p r e c e d i n g schemes has i t s own m e r i t s : s e l e c t i o n of a p a r t i c u l a r rounding procedure  the  final  i s a compromise between the  d e s i r e d a c c u r a c y f o r a r e s u l t and the computer time r e q u i r e d t o a c h i e v e accuracy.  F o r example, procedure  this  ( i v ) i s the l e a s t time-consuming b u t g i v e s  a l a r g e b i a s e d e r r o r , w h i l e ( i ) has the minimum e r r o r b u t t a k e s l o n g e r .  6.  Chapter I I I Regardless of the numerical scheme that we choose to perform our computations, the rounding-off process may introduce errors.  For.a  q u a n t i t a t i v e d e s c r i p t i o n of these errors we now give an e r r o r a n a l y s i s of s i n g l e operations i n normalized f l o a t i n g point arithmetic.  The e r r o r  a n a l y s i s w i l l t e l l us the accuracy, of the i n d i v i d u a l arithmetic operations. In the d i s c u s s i o n , r e a l arithmetic operations w i l l be denoted by t h e i r usual symbols t>  e  (i)  > ©>  +, -,.x or . , *  ;  the corresponding machine operations w i l l  @> G) > also,.x,y,z s h a l l be machine numbers.  w i l l be b and the mantissas w i l l have JL d i g i t s . (0,  The machine base  Zero w i l l be of the form  -a) where -a i s the lower l i m i t on the range of values of the exponents.  Errors  6j_  > £jL w i l l be e r r o r s i n the l e a s t s i g n i f i c a n t d i g i t of the  r e s u l t i n g Ji - d i g i t mantissa.  In the f o l l o w i n g e r r o r analyses we s h a l l assume  that neither exponential underflow nor exponential overflow occur. The area of computer memory i n which a numerical r e s u l t . i s formed i s known as an accumulator. accumulators are used:  In the e r r o r analyses two d i f f e r e n t types of the s i n g l e - l e n g t h  " e x t r a - d i g i t " accumulators.  - d i g i t accumulators, and the  In the usual JL - d i g i t accumulator the "decimal"  point i s considered to be immediately to the l e f t of the f i r s t (most s i g n i f i c a n t ) digit.  An {Ji +l) - d i g i t accumulator i s formed from t h i s by p r o v i d i n g an  e x t r a d i g i t p o s i t i o n to the l e f t of the "decimal" point.  This form of e x t r a  d i g i t accumulator i s used f o r a d d i t i o n , subtraction and d i v i s i o n . . A second form of the e x t r a - d i g i t accumulator i s the double-length one used i n m u l t i p l i c a t i o n ; . m u l t i p l i c a t i o n of two JL - d i g i t numbers gives a product of 2JL  digits.  We form t h i s double-length accumulator by using two single  length accumulators.  A.  Addition - Subtraction  X © Y  = ( m , , e,) 0  ( m  2  , e  I f x = -y.then x(+)y = ( o , - a ) ; suppose e > e-^ ^ 2  x0y  e  2  ) =  2 -  e  (m ,e ) 3  l ^> +  3  then  - (x+y) =-x.  For  je  " i| ^ e  2  1. ( X + l ) d i g i t -  (a)  X -1 r  accumulator  We must s h i f t m-j_ t o /.align t h e d e c i m a l p o i n t s :  we form a new  m a n t i s s a V, o f JL d i g i t s .  V - m,b (V has  je^ - e | significant 2  (b)  Add mg and V : m  (c)  Form t h e H-digit (i)  zeros)  + V  mantissa o f the r e s u l t ,  No c a r r y on t h e a d d i t i o n o f m  m  where  2  + 6, b  j  3  = {m2+v)b  2  and V  f = o i,2,. ;  i s t h e number o f s i g n i f i c a n t z e r o s b e f o r e n o r m a l i z a t i o n .  8. Thus  = b  X© Y  (m,  +  v)b  m  X © Y  6b  ^  -(X+Y)  D  A  N o t i c e t h a t we do n o t have an e r r o r bound i n terras o f t h e exponent o f t h e final result.  I f we w i s h t o use  we have t h a t  + -j-  -JL 6  x© But we do n o t know  -(X +  Y .  Y)|  £b  <  b  3  A  ^ - JL ~ I  A l l we can say i s t h a t  X©Y  , and t h e r e f o r e  e -  -(X+Y) 6  £ b  3  A  However, f o r almost a l l a d d i t i o n s and s u b t r a c t i o n s , t h i s bound g r o s s l y over-estimates the e r r o r .  We c o n c l u d e t h a t t h e o n l y r e a l i s t i c bound i s  -I - (X+Y)  X © Y  | *  e  £B  b  A  I f we had performed t h e e r r o r a n a l y s i s u s i n g obtained  .  Z  6, > Q^.  we would have  an e r r o r bound  X © Y  - (X + Y)|  i  ££  b'  A  ..  T h e r e f o r e , t h e most g e n e r a l e r r o r bound i n t h i s case i s  |X©Y (ii)  £ £ b  - (X +Y)  A  C a r r y on the a d d i t i o n o f m  S i n c e (ra  z  + V ) has JL +1  m  3  2  b  and V  d i g i t s we must s h i f t i t one d i g i t t o the r i g h t .  = 15' (m  2  + v)  +  £.t 2  9-  X  0 Y  =  + Y  x  b  +  b  Z  [e,  3  +  b  e  2  X © Y - ( X + Y ) | ^ CA^V ' [l + b] 3  Here t h e e r r o r i s g i v e n e x p l i c i t l y  i n terms o f the exponent o f t h e f i n a l  result. 2. X (i)  - d i g i t accumulator. W i t h o u t c a r r y , t h e r e s u l t s a r e t h e same as f o r the A+1  d i g i t accumulator.  ( i i ) Carry Now we must s h i f t b o t h m-^ and nLp (assume (a)  C  1  v  2  = nnD  +•  2  Z  e,b  S h i f t mj and -.-Align d e c i m a l p o i n t s  ,-' e -e m, b b 1  v, (c)  ]_)  S h i f t m2 and form Vg  IT (b)  e  =  l  .JL e b  1  +  2  Add V-L and V"2 ; b o t h have o n l y & d i g i t s .  m e  3  = v, + v =  3  X©Y  e  =  a  +  2  1  x + Y  X © Y - ( X + Y)  +  b^tD  i 2£  A  3  [e,  £*l>  3  I f we compare t h i s e r r o r bound t o t h e c o r r e s p o n d i n g e r r o r bound i n 1. above, we see t h a t t h i s bound i s l a r g e r by a f a c t o r o f For  b>2  , I < j> <: 2 .  b =2 • b = IO  J>  =  j ? ^ 4/3  j> = 20/) I  -1.33 -1.82  T h e r e f o r e , i f p o s s i b l e one s h o u l d use an (Z+l) d i g i t a c c u m u l a t o r f o r a d d i t i o n and  subtraction.  B. M u l t i p l i c a t i o n X ® Y  (m,,  e,)® (m  ,e )  2  2  =(m ,e ) 3  Let ISJLI  *  6  ,  M  \l \  ±  x  ;  JL -  I f e i t h e r o r b o t h x, y a r e z e r o X 0  then  1.  - ( 0 , " a)  Y  A .2A-digit a c c u m u l a t o r (a) (b)  rn,*m  Form t h e p r o d u c t -2 i f b £ r 2 | m  _  m e  3  3  * b  m  (rri,  =  b  =  e,  2  1  •»- 6 , b  .m ) a  + e,  - i e  - XY  X®Y  |X®Y ( ) c  i f  2  m e  - X-Y|  |m.-™ |  *  3  3  = -  +  ^  X® Y =  b  b  b  e  £  M  i  m.-mj, e,  E.b  +  + e  2  2  X-Y  X ® Y - X-Y |  e b  +  ±  £  £  M  2  b  b  b  b  1,2  3  11.  In b o t h c a s e s we have t h e e r r o r bound i n terms o f t h e exponent o f t h e r e s u l t . The most g e n e r a l e r r o r bound i s  4,e |X®Y  - X-Y | £  6  B  M  3  b  Ah X - d i g i t accumulator (a)  Form  v where  = m  2  c, + c  +  2  e  b °  2  = JL  2  I f JL i s even, t a k e  C  I f JL i s odd, t a k e  C, = ( - C - 0 / 2  (b)  (c)  - C  2  ,  C  2  = (-0 + 0 / 2  M u l t i p l y . V-^ and Vg t o g e t an ^ - d i g i t number  l - T i | V.-Va. | ^ b*' m  3  ^  X0Y  b(v,-v ) 2  =X  x®r  (d)  (  2  «3  Y +  - XY| ^  +  b  e  M  l  m  b  2  § , S  V m , ! ^  + b  ^b  + e  M  z  +6,6 b 2  b y  v,v |< I 2  m  3  =  v,.v  2  e, + e X-Y  +  2  3  [no 6, b ' C  2  m, e b z  2  + e.l* b  12.  r* c  i3 TL^ e  X© Y  X-Y  -  - ,-4/2 L 3 e  D  I n each o f ( c ) and ( d ) t h e e r r o r bound . i s p r o p o r t i o n a l t o f o r a double-length accumulator the e r r o r i s p r o p o r t i o n a l t o D  D -  C l e a r l y a d o u b l e - l e n g t h a c c u m u l a t o r i s t h e b e t t e r o f t h e two. C-  Division  X © Y  =  We must have y f o. Let  1.  le.ji  e  (m,,  e.)  If x = o  then  ,  *  D  |e,|  © ( m X © e  2  ee)  1  Y  "=•  ;  0  ( 0 , - Cl) 1,2  digits)  I ^ |m,/m | < b 2  m e  00  3  -L *  Jh + 1 d i g i t a c c u m u l a t o r ( a q u o t i e n t .of ^+1  (a)  = (m ,e )  = b'(rc\/rn ) =  3  e,  -  e  = ( X * Y)  | X © Y  -( XT Y ) | £  i/  m  nn 6  2-| 3  =  3  -  x ©  Y  X © Y  (  +.1  2  X© Y  m  4- e b ^  2  3  +  D  6 , b  b  £ b D  * 1  rn,/m ©1 -  ~ (x  ~0  +• 2 b £  2  ^2 r  Y )  - ( X * Y ) |  + ^  e  2  b^  e ^ b *  b' e  3  The g e n e r a l e r r o r bound i s  |X©Y  - (X 4 Y ) |  <  £  D  £  b  3  P  but ;  132.  An Jl - d i g i t  accumulator  i ±  (a)  Shift  |m,/m | 2  m-|_  and form V-j_  = b m , + £, b  v,  m =  (v,.TmJ  3  e X © Y  | x © Y (b)  <- b  3  = (i + e . ) =  (X*Y)  +  -(X-Y)| ^  b' ^ | m , / m | 2  +  £,&'  e  z  B  D  E*l?  + a  [fi  0  +  +• b g j  ^ 1  T h i s i s t h e same as t h e c o r r e s p o n d i n g case f o r  digits.  The p r e c e d i n g e r r o r bounds c h a r a c t e r i z e t h e s p e c i f i c machine a r i t h m e t i c operation.  I n t h e n e x t c h a p t e r we examine f u r t h e r c h a r a c t e r i s t i c s o f t h e s e  machine o p e r a t i o n s .  Ik.  Chapter I V  Automatic  programming languages must use permanent machine language  subroutines t o perform a r i t h m e t i c operations.  Since the accuracy o f  c a l c u l a t i o n s i s v i t a l l y dependent upon t h e s e a r i t h m e t i c s u b r o u t i n e s , we s h o u l d use r o u t i n e s whose maximum e r r o r i s as s m a l l as p o s s i b l e . Let us consider the normalized f l o a t i n g p o i n t a r i t h m e t i c procedures d i s c u s s e d in-Chapter I I I (those f o r e x t r a - d i g i t accumulators).  D e f i n e t as  follows:  (x ^ Y)  -  (x - Y )  X ~ Y where  x-^y £ o ' d e n o t e s /&*  a real arithmetic operation,  denotes t h e c o r r e s p o n d i n g machine o p e r a t i o n .  Addition-Subtraction (a)  No c a r r y  t 'A I -"  e  A I  A  -JL b rna/K e  whe  3  ,rrwxfe.,e,.) - e  = fc,b  ID  = rruwc(e,,e )  -  z  (b) , C a r r y  ^  t.,  -.  £ (i+b)b A  3  b  =  k,b  f  15.  Multiplication  t  k  =  M  b"  3  Division  The maximum r e l a t i v e e r r o r , t , i s p r o p o r t i o n a l t o b d i v i s i o n and a d d i t i o n - s u b t r a c t i o n w i t h c a r r y .  for multiplication,  "No c a r r y " on a d d i t i o n - s u b t r a c t i o n  can produce a l a r g e v a l u e o f t ( p r o p o r t i o n a l t o b ) .  From t h i s e v i d e n c e we  c o n c l u d e t h a t t h e u s u a l p r o c e d u r e s for- a d d i t i o n - s u b t r a c t i o n a r e so  inaccurate  t h a t we must d e v e l o p a p r o c e d u r e w h i c h always g i v e s a v a l u e o f t p r o p o r t i o n a l t o b"-^  .  Suppose t h a t we have a (2JL + l ) - d i g i t a c c u m u l a t o r a v a i l a b l e , w i t h one d i g i t p o s i t i o n t o the l e f t of the decimal p o i n t . those given i n Chapter I I I . if  e >e,+i +i 2  |e  F o r  • 1.  z  - e,|  Shift  Form  V,  V  m,  has  Q,  z  -{x-+y)  =-x  .  Add  |6  -  thus  Vg = ^2  b  Z+  |e,-ej  digits  £  Add V-, and V  1  =  v,  (  V  3.  <  m Form  2.  X 0 Y  ,then  z  ^  6,  Suppose  The a n a l y s i s i s s i m i l a r t o  2  2  £,  2  has  J  ' n o n - s i g n i f i c a n t z e r o s onto t h e end o f m^,  >£+|e,-e | 2  0  6 — 6j  digits  16. (i)  Carry  m  e  b'(v,  3  3  =  e  + v ) 2  e,S*  +  +.  z  -i  (ii)  e  No c a r r y  m  3  =  b (V, + V )  + £ b  Z  X © Y  2  = X+Y  +£ bb  vl  M*)?*  ;  f«o,i,...^  2  E r r o r Bounds I  (i)  Carry  ( i i ) No c a r r y  X©Y  -(X+Y)|  £  £  |x©Y  -(X+Y)|  ^  6  b  A  A  D  D V  Advantages o f t h i s method 1.  We a v o i d t h e r o u n d i n g e r r o r i n m-^ b e f o r e t h e a d d i t i o n  2. -The f i n a l e r r o r bounds.are e x p l i c i t l y 3-  F o r t h i s method we o b t a i n  This  s a t i s f i e s t h e r e q u i r e m e n t on t .  operation.  i n terms o f t h e f i n a l exponent.  17 The machine language a r i t h m e t i c s u b r o u t i n e s , f o r use i n c o n j u n c t i o n w i t h an a u t o m a t i c programming language, accumulator  s h o u l d be t h o s e w i t h a (2-& + l )  f o r a d d i t i o n - s u b t r a c t i o n , a 2£  and an (JL+l)  d i g i t accumulator  d i g i t accumulator  for division.  We  digit  for multiplication,  are l e a d t o t h e s e c o n c l u s i o n s  b e f o r e we have c o n s i d e r e d t h e c h o i c e o f a r o u n d i n g - o f f p r o c e d u r e . t u r n t o the problem of choosing a round-off  L e t us  now  procedure.  I n G h a p t e r ' I I we gave some examples o f t y p i c a l r o u n d i n g - o f f methods:  the  most commonly u s e d are (1) (2) We  .the d r o p p i n g o f d i g i t s , the a d d i t i o n of  and  i n the l a s t r e t a i n e d d i g i t p o s i t i o n .  s h a l l d i s c u s s t h i s c h o i c e i n s o f a r as i t a f f e c t s the more s i g n i f i c a n t p r o b l e m  o f g i v i n g e r r o r e s t i m a t e s f o r a sequence o f c a l c u l a t i o n s .  For example,.if  are p e r f o r m i n g a r e p e t i t i v e c a l c u l a t i o n i n w h i c h a r o u n d - o f f e r r o r a t one  we step  a f f e c t s t h e r e s u l t s a t s u c c e s s i v e s t e p s we can g i v e an e s t i m a t e o f t h e propagated  e r r o r i n terms o f the r o u n d i n g e r r o r a t each s t e p .  are e x p r e s s e d as a c o m b i n a t i o n e r r o r s (the  £'s  ).  The  step e r r o r s  ( u s u a l l y l i n e a r ) of the i n d i v i d u a l a r i t h m e t i c  By u s i n g t h e above maximum e r r o r bounds f o r i n d i v i d u a l  a r i t h m e t i c o p e r a t i o n s we a r e a b l e t o o b t a i n a n . e r r o r e s t i m a t e a t e v e r y step i n a series of c a l c u l a t i o n s .  But t h i s i s a maximum e r r o r bound and i s u n s a t i s f a c t o r y  because r a r e l y does t h e p r o p a g a t e d has been suggested  e r r o r become e q u a l t o i t s upper bound.  It  t h a t one g i v e a s t a t i s t i c a l e s t i m a t e f o r the p r o p a g a t e d  we would a a l c u l a t e an e x p e c t e d v a l u e and a v a r i a n c e .  error:  The v a l i d i t y o f the  s t a t i s t i c a l e s t i m a t e u l t i m a t e l y r e s t s on whether t h e £'s  can be  considered  as independent random v a r i a b l e s . H u s k e y ^ 7 j g i v e s an example where, f o r a p a r t i c u l a r a r i t h m e t i c s u b r o u t i n e and r o u n d i n g - o f f p r o c e d u r e , has  suggested  t h e e r r o r s s y s t e m a t i c a l l y b u i l d up.  t h a t one c o u l d a v o i d e f f e c t s s i m i l a r t o s y s t e m a t i c  Forsythe  £3_J  round-off  by a d d i n g random d i g i t s t o t h e d i g i t p o s i t i o n s w h i c h are t o be dropped.  We  18.  s h a l l examine such a p r o c e d u r e .  Our r e s u l t s a r e somewhat d i f f e r e n t from  those  g i v e n "by F o r s y t h e . L e t m be t h e a b s o l u t e v a l u e o f a f l o a t i n g p o i n t m a n t i s s a m which we w i s h to round-off at the k t h d i g i t .  D e f i n e a new number M such t h a t M  Consider where Oi  M -  =  b m  [M] = V  [ M ] i s t h e g r e a t e s t i n t e g e r l e s s t h a n o r e q u a l t o M. • O b v i o u s l y  V < 1  ;  rounding-off.  V c o n s i s t s o f t h o s e d i g i t s o f m which a r e t o be dropped a f t e r We c o n s i d e r V t o be t h e p r o b a b i l i t y o f r o u n d i n g up ( a d d i n g 1  i n t o t h e k t h d i g i t o f m) and 1 - V t o be t h e p r o b a b i l i t y o f r o u n d i n g down ( d r o p p i n g the k + 1 , k + 2 , . . . d i g i t s ) .  i n t o t h e k + 1 , k + 2 , ... d i g i t s o f m.  Let  The e r r o r £ i s now a random v a r i a b l e ,  6  [M] + I  =  £  =  ,(0 ^  Now suppose we add a random number  - M  [M] -  M  be u n i f o r m l y d i s t r i b u t e d on  and i s g i v e n b y  = I -V  ( r o u n d i n g up)  ~V  =  ( r o u n d i n g down)  The p r o b a b i l i t y d e n s i t y f o r £ i s =  \>(C) |3(£)  Expected v a l u e :  Variance:  E(£) V(£) =  (The p r o b a b i l i t y t h a t  = Je(i  =  1-6  ( r o u n d i n g .up, 6 >0 )  I + £  ( r o u n d i n g down, t £ 0 )  + e)de  £*(•+£)<*£ |£| * £ ; i s 3/4)  + je(i-£)cLs  + (£ (l-£)dU l  =  = g  6  l) [0,l).  19. A s u b r o u t i n e u s i n g random d i g i t s , w h i l e b e i n g t h e o r e t i c a l l y i d e a l , would be, i n p r a c t i c e , q u i t e c o s t l y i n computing t i m e . add  (-g-)^  The p r o c e d u r e  would be f a s t e r and may g i v e j u s t as good r e s u l t s .  b y w h i c h we  In the following  c h a p t e r we p r e s e n t some e x p e r i m e n t a l r e s u l t s t h a t we o b t a i n e d u s i n g d i f f e r e n t t y p e s o f a r i t h m e t i c s u b r o u t i n e s ; . o f t h e s e , two i n c o r p o r a t e the s p e c i a l p r o p e r t i e s d e s c r i b e d i n t h e f i r s t p a r t o f t h i s c h a p t e r , and another use random d i g i t s f o r r o u n d i n g .  two  20. Chapter V  Our  e x p e r i m e n t a l w o r k - w i l l be w i t h a r i t h m e t i c s u b r o u t i n e s w r i t t e n . i n t h e  IBM 1620 (machine number base, i s 10) machine language, and t o be used w i t h t h e automatic  programming language, FORTRAN.  p o i n t mantissa.  F o r t r a n uses a n . e i g h t d i g i t  floating  A l a t e r v e r s i o n o f F o r t r a n , FORTRAN 2, has t h e advantage o f  v a r i a b l e word l e n g t h (2 t o 28) f o r f l o a t i n g p o i n t  mantissas.  I n T a b l e I we l i s t t h e p r o p e r t i e s o f t h e s u b r o u t i n e s : the e x e c u t i o n .times  ( i n m i l l i - s e c o n d s ) , o f the subroutines.  Table I I contains  21. Table I  Type o f F o r t r a n  .Addition-Subtraction  Multiplication  Division  (1)  Drops t h e l o w o r d e r d i g i t s on t h e s m a l l e r number b e f o r e adding, I f there i s a carry, .the l o w e s t .order d i g i t i s dropped. (9 d i g i t a c c u m u l a t o r )  Rounds t o 8 significant digits by•adding \ i n t o t h e 8th p l a c e .  Forms a 9 d i g i t quotient Takes t h e most significant 8 digits.  (l6 d i g i t  (9 d i g i t .accumulator)  As i n ( l ) b u t v a r i a b l e  Takes t h e most significant X d i g i t s o f product. Drops t h e l o w e s t order d i g i t s .  Subroutine i s the s p e c i a l one d e s c r i b e d i n C h a p t e r IV. •Rounds by a d d i n g \ i n t o t h e 8th p l a c e .  Same as i n ( l ) above.  Same as (3) above except t h a t low order d i g i t s a r e dropped.  R e t a i n s only, t h e As i n ( l ) most s i g n i f i c a n t above. 8 d i g i t s ; drops o t h e r s .  Rounds t h e s m a l l e r number b y a d d i n g \ i n the l a s t r e t a i n e d digit. I f carry, rounds r e s u l t b y a d d i n g \ i n t o 8th d i g i t .  Same as ( l )  Same as (3)  Same as (5) above e x c e p t t h a t we add random d i g i t s .  Same as (3)  FORTRAN 8 digits  Same as (5) above e x c e p t t h a t we add random d i g i t s . Same as (3) above e x c e p t t h a t we add random d i g i t s .  Same as (3) above e x c e p t t h a t we add random d i g i t s .  Same as (3)  FORTRAN 8 digits  FORTRAN 8 digits  (2)-X  where  1 = 2,3,-. .28  £  FORTRAN 2  (3)  (4)  (5)  FORTRAN 8 digits  FORTRAN 8 digits FORTRAN 8 digits  (6)  (7)  accumulator)  . As i n ( l ) but v a r i a b l e ^  Rounds b y adding \ t o the 8th significant d i g i t of the quotient.  22. The random d i g i t s used w i t h s u b r o u t i n e s (6) and (7) were o b t a i n e d from a random number g e n e r a t o r o f the form (see  x  n  = 101  x _j_ T1  +  C  (mod 1 0 ^ )  L6J )• Table I I  Subroutine  'Addition-Subtraction Time  (1) (2)-8  (2)-9  (2)-10 (2)-12 (2)-l6 (3)  \h) (5) (6) (7)  Multiplication Time 15-8U2 16.572 19.868 23-530 31-952 53-188 15-8U2 15.0I+2 15.81+2 20.1+62 21.392  8.29 9.512 9.778 10.0M+ IO.576  11.640  .12.80 .II.85 ,11.44 15-322 17-121  Division Time 50.518 1+9.51I+ 59-712 70.950 96. 5I+6 160.218 55-906 50.518 55-906 55.906 55-906  D e s c r i p t i o n o f t h e Experiment C h o i c e o f e x p e r i m e n t a l problem was m o t i v a t e d b y t h e i n t e r e s t i n g r e s u l t s o b s e r v e d b y Huskey ^jj i n t h e i n t e g r a t i o n o f t h e system •  X  y  =  •  b y . t h e Heun method  y  -X  • =  + X* == X. !3 j -3 i-1 j -3 x  =  y  X  y  ! J  o  x  =  =  +  j  y  =  = =  y  y  + h(-  + j-l l . + * ( - : c. . + (-x* J -l J-l 3 2  X  y  h  (  y  over t h e range .5200 ± t ± -5290 w i t h step s i z e , . h , o f 2x10"^.  Huskey f o u n d  t h a t t h e r o u n d - o f f e r r o r s s y s t e m a t i c a l l y accumulated t o such an e x t e n t as t o c o n t r a d i c t t h e assumption t h a t t h e i n d i v i d u a l e r r o r s ( t h e £ ' s) were independent random v a r i a b l e s .  We p e r f o r m e d a s e r i e s o f e x p e r i m e n t s w h i c h c o n s i s t e d o f i n t e g r a t i n g t h e g i v e n system u s i n g t h e s u b r o u t i n e s  ( l ) t o (7)-  A t e v e r y f i f t h i n t e g r a t i o n step t h e  computed r e s u l t f o r x was compared t o a 25 d i g i t r e s u l t o b t a i n e d u s i n g t h e same i n t e g r a t i o n p r o c e d u r e .  with  Fortran  An e r r o r a n a l y s i s o f t h e method, b y  Rademacher [^J and quoted b y Huskey £7^ , g i v e s f o r t h e p r o p a g a t e d e r r o r  ,  uniform d i s t r i b u t i o n  d i s t r i b u t i o n f o r random digits 'assuming t h e i n d i v i d u a l r o u n d i n g e r r o r s a r e independent.random v a r i a b l e s uniformly  d i s t r i b u t e d between -a and a ;  n i s t h e number o f i n t e g r a t i o n s t e p s , and \(f . i s t h e s t a n d a r d d e v i a t i o n of  IV  .  Table I I I Observed R e s u l t s Subroutine  (1) (2)-8 (2)-9 (2)-10 (2)-12 (2)-l6 (5) (3)  00 (6) (7)  Error at 450th step  E r r o r Range xlO 8  <r(r  n  )  •  -223xl0"8 -223x10" 0 -25.6x10"° -2.85xl0 -223xl0 2 -225xl0 ^ -1.36x10"°  0,-223 0,-223 0,-25.6 0,-2.85 0,-.0223 0,-225xl0 2-05,-5-3  -1.36xl0" -227xl0"  2.05,-5-3 0,-227  -8  _1  -1  8  E r r o r o f Maximum Absolute Value Occurs a t Step n  Theoretical Standard Deviation  8  -8  1+50 i+50 1+50 1+50 1+50 1+50 330  12.2x10"° 12.2x10";: 1.22x10"° .122xl0 12.2x10 12.2x10" 5-25xlO"  330  5.25xlO" 12.2x10"  H-50  -8  16  8  8  at  3.17,-11-9 10.6,-8.8  8  n=4-50  8.66xlO" 8.66xiO  8  -8  Figure I I  o  100  zoo  300  Number of I n t e g r a t i o n Steps  400  25: Comments on E x p e r i m e n t a l R e s u l t s R e l e v a n t e x p e r i m e n t a l r e s u l t s appear i n T a b l e I I I and i n F i g u r e s I and I I . F i g u r e s I and I I a r e graphs o f t h e o b s e r v e d  r  n  1.  Rounding b y  =  errors,  ( X - d i g i t r e s u l t a f t e r n s t e p s ) - (25 d i g i t r e s u l t a f t e r n steps)  d r o p p i n g d i g i t s (see F i g u r e s I and I I ) .  We o b t a i n e d almost i d e n t i c a l r e s u l t s w i t h s u b r o u t i n e s ( l ) ,  (h) and  (2)-8; a l l three e x h i b i t e d t h e systematic accumulation o f round-off e r r o r . I n t h e i r c o n s t r u c t i o n s u b r o u t i n e s ( l ) and (2)-8 d i f f e r o n l y , i n t h e multiplication operation:  a computed r e s u l t i n ( l ) i s rounded b y a d d i n g  i> i n s t e a d o f d r o p p i n g t h e l o w e r - o r d e r d i g i t s .  T h i s i n d i c a t e s t h a t t h e dominant  c o n t r i b u t i o n t o r o u n d i n g e r r o r a c c u m u l a t i o n was due t o t h e a d d i t i o n o p e r a t i o n . The u s e o f s p e c i a l s u b r o u t i n e (k)  d i d n o t g i v e us any advantage o v e r t h e  usual subroutines. 2.  Rounding b y a d d i n g  (see F i g u r e i ) .  S u b r o u t i n e (3) ( o f t h e type d e s c r i b e d i n Chapter produced  I V ) and s u b r o u t i n e  i d e n t i c a l r e s u l t s i n which s y s t e m a t i c r o u n d - o f f was absent.  (5)  Of  the two r o u t i n e s , (3) i s the b e t t e r s i n c e i t has t h e l o w e r v a l u e o f t h e " t " parameter (see C h a p t e r I V ) . . Comparing t h e r e s u l t s o f (3) and (5) w i t h those o b t a i n e d w i t h t h e c o r r e s p o n d i n g r o u t i n e s (h)  and ( 2 ) - 8 , which round b y d r o p p i n g d i g i t s , we  c o n c l u d e t h a t t h e e l i m i n a t i o n o f s y s t e m a t i c r o u n d - o f f i n t h e former was  due t o t h e r o u n d i n g - o f f p r o c e s s o f a d d i n g \.  results  The c h o i c e o f r o u n d - o f f  process i s obviously c r i t i c a l . 3-  Rounding b y a d d i n g random d i g i t s . We p e r f o r m e d  s e v e r a l e x p e r i m e n t s w i t h d i f f e r e n t i n i t i a l v a l u e s X Q and  c o n s t a n t s c f o r t h e random number g e n e r a t o r X 2_ n+  2  1015^ +c (mod 10^-2).  8 The maximum a b s o l u t e v a l u e o f o b s e r v e d e r r o r was 11.9x10"  8 and 10.6x10"  26. f o r s u b r o u t i n e s (6) and (7) r e s p e c t i v e l y .  In a l l cases, n e i t h e r r o u t i n e  gave r e s u l t s which e x h i b i t e d t h e s y s t e m a t i c r o u n d - o f f . F u r t h e r e x p e r i m e n t a l work w i t h o t h e r t y p e s o f problems i s n e c e s s a r y to f u l l y evaluate these subroutines. Longer word l e n g t h s (see F i g u r e s I and I I ) . S u b r o u t i n e s ( 2 ) - 9 , (2)-10, round-off.  ('2)-12, and ( 2 ) - l 6 a l l e x h i b i t e d s y s t e m a t i c  The r e s u l t s o b t a i n e d w i t h t h e 8 - d i g i t r o u t i n e s , (3) and ( 5 ) ,  were g e n e r a l l y b e t t e r t h a n t h o s e o f (2)-9 and (2)-10.  T h i s seems t o  i n d i c a t e t h a t b y c a r r y i n g one o r two e x t r a d i g i t s throughout t h e c a l c u l a t i o n s we may n o t c o u n t e r a c t t h e e f f e c t s o f such a p o o r r o u n d - o f f p r o c e s s as t h e d r o p p i n g o f the l o w e r o r d e r d i g i t s . H a r t r e e Q7I a t t r i b u t e s s y s t e m a t i c r o u n d - o f f t o t h e f a c t t h a t the - " l e a d i n g d i g i t rounded o f f remains t h e same i n a number o f s u c c e s s i v e c o n t r i b u t i o n s t o t h e i n t e g r a l " ; on t h i s b a s i s he develops a c r i t e r i o n t o determine whether s y s t e m a t i c r o u n d - o f f i s l i k e l y t o o c c u r . f o r step s i z e  o f 2x10  y  According t o h i s analysis,  and .52 * t £ . 529> e might a v o i d s y s t e m a t i c w  r o u n d - o f f i f we t a k e t h e word l e n g t h  g r e a t e r than 12.  Our e x p e r i m e n t a l  r e s u l t s showed a s y s t e m a t i c a c c u m u l a t i o n o f r o u n d - o f f e r r o r f o r b o t h t h e 12 d i g i t and 16 d i g i t word l e n g t h s ( s u b r o u t i n e s (2)-12 and ( 2 ) - l 6 ) . conclude t h a t Hartree's analysis systematic build-up o f errors.  We  i s n o t s u f f i c i e n t t o account f o r t h e The e f f e c t i s due t o t h e t y p e o f r o u n d - o f f  process (dropping d i g i t s ) . S t a t i s t i c a l estimation of errors. T a b l e I I I shows t h a t t h e t h e o r e t i c a l s t a n d a r d d e v i a t i o n , C T ( r ) , does n  not g i v e an a c c u r a t e e r r o r e s t i m a t e f o r t h o s e s u b r o u t i n e s which rounded b y d r o p p i n g d i g i t s ( t h e maximum observed e r r o r was l a r g e r b y a f a c t o r o f approximately.18).  On t h e o t h e r hand, t h e s t a t i s t i c a l e s t i m a t e was  s u f f i c i e n t l y p r e c i s e f o r t h e r o u t i n e s which rounded b y adding \ o r b y  27. a d d i n g random d i g i t s .  T h i s shows t h a t a s t a t i s t i c a l t r e a t m e n t of* r o u n d - o f f  e r r o r can give reasonable e r r o r estimations, p r o v i d e d the round-off process i s a c c u r a t e enough.  Chapter  VI  Conclusions  Our e x p e r i m e n t a l r e s u l t s emphasize the f a c t t h a t a d e s i g n e r o f an a r i t h m e t i c s u b r o u t i n e s h o u l d approximate a r e a l a r i t h m e t i c o p e r a t i o n as c l o s e l y as p o s s i b l e .  The f o r m o f s u b r o u t i n e b e s t f i t t i n g t h e s e r e q u i r e m e n t s was  the  one w h i c h u t i l i z e d a double p r e c i s i o n ( 2 - & - d i g i t s ) p r o d u c t a r e a , an {Z + l ) d i g i t q u o t i e n t a r e a , and a {2.JL  + l ) - d i g i t a r e a f o r sums and  i n c o n j u n c t i o n w i t h . a rounding-off process of adding \  differences^),  into the'last  digit  position retained. The  s y s t e m a t i c a c c u m u l a t i o n o f r o u n d - o f f e r r o r was  t h e type o f r o u n d i n g - o f f p r o c e s s :  the e f f e c t was  found t o be caused by  observed.in r e s u l t s obtained  w i t h t h o s e s u b r o u t i n e s w h i c h rounded by d r o p p i n g d i g i t s , b u t d i d not o c c u r w i t h r o u t i n e s t h a t u s e d \ o r random d i g i t s f o r r o u n d i n g .  S t a t i s t i c a l methods gave  us a c c u r a t e e r r o r e s t i m a t e s f o r d a t a o b t a i n e d w i t h the l a t t e r s u b r o u t i n e s . a consequence, we c o n c l u d e t h a t s t a t i s t i c a l methods may  be a p p l i e d t o the  p r o p a g a t i o n .of r o u n d - o f f e r r o r p r o v i d e d t h e a r i t h m e t i c s u b r o u t i n e s are sufficiently  accurate.  (*) As d e s c r i b e d i n Chapter  IV.  As  29Bibliography 1.  A s h e n h u r s t , R.L., and M e t r o p o l i s , - N., "Unnormalized F l o a t i n g - p o i n t • A r i t h m e t i c " , J . Assoc. CM., 6 (1959), .1+15-U28.  2."  C a r r , J.W. , I I I . "Error Analysis of Floating-point Arithmetic", Comm. o f A.CM. . 2 (1959), -10-15.  3.  F o r s y t h e , G.E., " R e p r i n t o f a Note on R o u n d i n g - o f f E r r o r s " , S.I.A.M. Review 1 (1959), 66-67.  1+.  H e n r i c i , P., D i s c r e t e V a r i a b l e Methods i n O r d i n a r y D i f f e r e n t i a l . ' Equations, J . W i l e y & Sons, New York, 1962.  5-  Householder, A.S., " G e n e r a t i o n o f Errors i n D i g i t a l Computation", B u l l . Amer. Math-Soc. 6£ (195*4 )> 23*4-21+9.  6.  H u l l , T.E., and D o b e l l , . A.R., "Random Number G e n e r a t o r s " , S.I.A.M. Review *+ (1962),  23O-25I+.  7.  Huskey, H.D., "On t h e P r e c i s i o n o f a C e r t a i n Procedure o f N u m e r i c a l I n t e g r a t i o n " , W i t h an appendix b y D.R. H a r t r e e , J . R e s e a r c h o f Nat. Bur, o f Stand. *+2 , (l9*+9), 57-62.  8.  von Neumann, J . , and G o l d s t i n e , H.H., " N u m e r i c a l I n v e r t i n g o f M a t r i c e s o f H i g h Order", Bull-Amer.Math.Soc.. 53 (19*47), 1021-1099.  9.  Rademacher, H., "On t h e A c c u m u l a t i o n o f E r r o r s i n P r o c e s s e s o f I n t e g r a t i o n on High-Speed C a l c u l a t i n g Machines", A n n a l s Comput. Labor. H a r v a r d Univ., 16, (19*4-8), 176-I87.  30. Bibliography  - cont:  10.  R i c h a r d s , R.K., . A r i t h m e t i c O p e r a t i o n s i n D i g i t a l : Computers, van Nostrand-Co. Inc.,. 1955»  11.  W i l k i n s o n , J.H., •. " E r r o r A n a l y s i s o f F l o a t i n g P o i n t - C o m p u t a t i o n " Num.Math. 2 ( i 9 6 0 ) , 319-3^0.  

Cite

Citation Scheme:

    

Usage Statistics

Country Views Downloads
France 7 0
United States 4 0
China 4 27
Japan 4 0
City Views Downloads
Unknown 7 4
Beijing 4 0
Tokyo 4 0
Ashburn 2 0
Mountain View 1 0
Wilmington 1 0

{[{ mDataHeader[type] }]} {[{ month[type] }]} {[{ tData[type] }]}
Download Stats

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0080564/manifest

Comment

Related Items