UBC Theses and Dissertations


Multi-resolution stereo vision with application to the automated measurement of logs. Clark, James Joseph, 1985.

Full Text


MULTI-RESOLUTION STEREO VISION WITH APPLICATION TO THE AUTOMATED MEASUREMENT OF LOGS

by

JAMES JOSEPH CLARK
B.A.Sc., The University of British Columbia, 1980

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Electrical Engineering)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September, 1985
© James Joseph Clark, 1985

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Electrical Engineering
The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3
Date: September 6, 1985

ABSTRACT

A serial multi-resolution stereo matching algorithm is presented that is based on the Marr-Poggio matcher (Marr and Poggio, 1979). It is shown that the Marr-Poggio disambiguation and in-range/out-of-range mechanisms are unreliable for non-constant disparity functions. It is proposed that a disparity estimate reconstructed from the feature disparity samples at the lower resolution levels be used to disambiguate possible matches at the high resolutions. Also presented is a disparity scanning algorithm with a similar control structure, which is based on an algorithm recently proposed by Grimson (1985).
It is seen that the proposed algorithms will function reliably only if the disparity measurements are accurate and if the reconstruction process is accurate. The various sources of errors in the matching are analyzed in detail. Witkin's (Witkin, 1983) scale space is used as an analytic tool for describing a hitherto unreported form of disparity error, that caused by spatial filtering of the images with non-constant disparity functions.

The reconstruction process is analyzed in detail. Current methods for performing the reconstruction are reviewed. A new method for reconstructing functions from arbitrarily distributed samples, based on applying coordinate transformations to the sampled function, is presented. The error due to the reconstruction process is analyzed, and a general formula for the error as a function of the function spectra, sample distribution and reconstruction filter impulse response is derived.

Experimental studies are presented which show how the matching algorithms perform with surfaces of varying bandwidths, and with additive image noise. It is proposed that matching of scale space feature maps can eliminate many of the problems that the Marr-Poggio type of matchers have. A method for matching scale space maps which operates in the domain of linear disparity functions is presented. This algorithm is used to experimentally verify the effect of spatial filtering on the disparity measurements for non-constant disparity functions. It is shown that measurements can be made on the binocular scale space maps that give an independent estimate of the disparity gradient. This leads to the concept of binocular diffrequency. It is shown that the diffrequency measurements are not affected by the spatial filtering effect for linear disparities. Experiments are described which show that the disparity gradient can be obtained by diffrequency measurement.
An industrial application for stereo vision is described. The application is the automated measurement of logs, or log scaling. A moment based method for estimating the log volume from the segmented two dimensional disparity map of the log scene is described. Experiments are described which indicate that log volumes can be estimated to within 10%.

Table of Contents

Abstract ii
Table of Contents iv
List of Figures vii
Acknowledgements xv
I. INTRODUCTION 1
 1.1 The Stereo Vision Problem 1
 1.2 Overview of the Thesis 8
II. FEATURE MATCHING 13
 2.1 Image Representations 13
 2.2 The Feature Matching Problem 21
 2.3 The Marr-Poggio Matching Algorithm 23
 2.4 Problems with the Marr-Poggio Matching Scheme 27
 2.5 A Simplified Multi-resolution Matching Algorithm 34
 2.6 Disparity Scanning Matching Algorithms 39
 2.7 Other matching methods 42
 2.8 Summary of Chapter 2 44
III. RECONSTRUCTION OF THE DISPARITY FUNCTION FROM ITS SAMPLES 45
 3.1 Introduction 45
 3.2 Interpolator Methods 48
 3.3 The Methods of Grimson and Terzopoulos 49
 3.4 The WKS Sampling Theorem and its Extensions 57
 3.5 The Transformation or Warping Method - 1D Case 65
 3.6 The Transformation or Warping Method - 2D Case 75
 3.7 Implementation of the 2D Transformation Method 88
 3.8 Including Surface Gradient Information in the Reconstruction Process 95
 3.9 Summary of chapter 3 99
IV. ERROR ANALYSIS OF DISCRETE MULTIRESOLUTION FEATURE MATCHING 100
 4.1 Sources
of Error in Discrete Multiresolution Feature Matching 100
 4.2 Effect of Sensor Noise on Feature Position Errors 106
 4.3 Analysis of Disparity Measurement Error due to Filtering 115
 4.4 Reconstruction Error Analysis 145
 4.5 Matching Error Analysis 170
 4.6 Geometry Errors 183
 4.7 Effect of the various errors on the multi-resolution matching algorithm 189
 4.8 Summary of chapter 4 192
V. EXPERIMENTS WITH THE DISCRETE MULTI-RESOLUTION MATCHING ALGORITHMS 194
 5.1 Introduction 194
 5.2 Implementation of the Multi-Resolution Feature Extraction 198
 5.3 Frequency Response of the Matching Algorithms 205
 5.4 Surface Gradient Response of the Matching Algorithms 220
 5.5 Performance of the Multi-resolution Matching Algorithms with Additive Noise 226
 5.6 Comparison of the Simplified Matching Algorithm with the DispScan Matching Algorithm 231
 5.7 Application of Image Analysis to Log Scaling 233
 5.8 Summary of Chapter 5 257
VI. SCALE SPACE FEATURE MATCHING 259
 6.1 Introduction 259
 6.2 Matching of Scale Space Image Representations 263
 6.3 Matching of Two Dimensional Scale Space Feature Maps 267
 6.4 Problems With Scale Space Matching 271
 6.5 Implications for Biological Depth Perception 277
 6.6 Summary of Chapter 6 279
VII. BINOCULAR DIFFREQUENCY 280
 7.1 Introduction 280
 7.2 Diffrequency Measurement 282
 7.3 Psychophysical Evidence for Diffrequency Stereo 288
 7.4 Experiments 292
 7.5 Summary of Chapter 7 297
VIII. CONCLUSIONS AND A LOOK TO THE FUTURE 298
 8.1 Summary and Conclusions 298
 8.2 Directions For Future Work 304
Appendices 306
References 333

List of Figures

1.1 Two correlated arrays of numbers which encode a three dimensional scene 3
1.2 Two views of a scene taken from different vantage points 4
1.3 The setup that produced the images shown in figure 1.2 4
1.4 The structure of the thesis 9
2.1 A summary of the topics covered in chapter 2 14
2.2 A feature pyramid 15
2.3 The response of the ∇²G filter to a step input.
The zero crossings of the filtered output are seen to coincide with the edge 18
2.4 The scale map of a random one dimensional function 20
2.5 The geometry of the epipolar lines 22
2.6 The Marr-Poggio multi-resolution matching scheme 25
2.7 The three matching pools of the three pool hypothesis 26
2.8 The failure of the Marr-Poggio in-range/out-of-range mechanism for discontinuous disparity functions. After Grimson, 1985 29
2.9 The failure of the Marr-Poggio in-range/out-of-range mechanism for linear (continuous) disparity functions 30
2.10 The failure of the Marr-Poggio disambiguation procedure for non-constant disparity functions 32
2.11 The operation of the multi-resolution nearest neighbour matching algorithm 35
2.12 Nearest neighbour matching 37
2.13 The processing flow of a multi-level iterative matching algorithm 38
2.14 The operation of the multi-resolution Dispscan matching algorithm 41
3.1 The topics covered in this chapter 47
3.2 Computational molecules for the relaxation surface approximation algorithm (after Terzopoulos, 1982). The thick bars indicate the boundary of the grid 54
3.3 The three special types of sample distributions handled by the reconstruction formula of Yen (1956) 63
3.4 a) A function, f(t), sampled at non-uniformly distributed positions b) The transformed function, g(τ), sampled at uniformly distributed positions 66
3.5 A burst type of signal, with time-varying bandwidth 70
3.6 The reconstruction of a chirp signal for uniform and non-uniform sampling 74
3.7 The hexagonal sampling lattice for functions with isotropic spectra 76
3.8 The Voronoi and Dirichlet tessellations for a set of points 82
3.9 a) A set {x_n}. b) An attempt to create a GHT from {x_n}.
c) Some local GHTs of the point set of a) 85
3.10 The operation of the mapping heuristic for N = 7 89
3.11 The sample locations in λ_g for N = 7 90
3.12 The relation of the heuristic mapping efficiency in terms of sample density to the shape of the sample distribution 94
4.1 The topics covered in chapter 4 101
4.2 The perturbation of feature contours by additive noise 107
4.3 The probability density function of zero crossing position error, for σ = 1, and SNR = .5, 1., 2. and 4 112
4.4 Probability density function of zero crossing position error, for SNR = 1, and σ = .5, 1., 2., and 4 113
4.5 The probability of an n pixel error, for n = 0, 1, and 2, given that q = 1/√2, as a function of the SNR 114
4.6 The left and right scale maps of a randomly textured, tilted surface with a disparity gradient of -60/255 117
4.7 The relationship between the left and right scale maps of a tilted surface 119
4.8 A zero crossing contour of the random process F(x,σ) 121
4.9 The probability density function of the disparity measurement error for k = 1, 2, 3, and 4 126
4.10 The probability of an n pixel disparity measurement error as a function of the disparity gradient, given q = 1/√2, for n = 0, 1, 2, and 3 128
4.11 The left and right skew maps obtained from a randomly textured surface with a horizontal disparity gradient of -60/255 with σ = 2 133
4.12 The probability density of disparity measurement error for zero crossing features, σ = 1, β₁ = -0.1 to -0.4 138
4.13 The probability density of disparity measurement error for zero crossing features, β₁ = -0.1, σ = 1 to 4 139
4.14 The probability of an N pixel error for zero crossing features as a function of β₁, for σ = 1, q = 1/√2 and N = 0, 1, 2 and 3 140
4.15 The probability of an N pixel error for zero crossing features as a function of σ for β₁ = -0.1, q = 1/√2 and N = 0, 1, 2 and 3 141
4.16 The probability
density function of the disparity measurement error for extremum features for β₁ = -0.1 to -0.4 with σ = 1 142
4.17 The probability density function of the disparity measurement error for extremum features with β₁ = -0.1 for σ = 1, 2, 3 and 4 with q = 1/√2 143
4.18 The probability of an N pixel disparity measurement error for extremum features as a function of β₁ for σ = 1 and q = 1/√2 143
4.19 The probability of an N pixel disparity measurement error for extremum features as a function of σ for β₁ = -0.1 and q = 1/√2 144
4.20 The shapes of the regions of support for F(ω) and G(ω) for exact reconstruction of f(x) from its samples 147
4.21 Aliasing error in the reconstruction caused by too low a sample density 148
4.22 The effect of having an improper reconstruction filter. Note that the central repetition is partly filtered out and that parts of the other repetitions are passed by the filter 148
4.23 A plot of Slepian's approximation to the optimum filter 154
4.24 A plot of Slepian's first approximation to the optimum filter extended past its region of strict validity 155
4.25 The average RMS reconstruction error for a Gaussian process and filter, with σ = 1/√3, for c = 5, 10, 15 and 20 164
4.26 The average RMS reconstruction error for a Gaussian process and filter, with σ = 1/√6, for c = 5, 10, 15 and 20 165
4.27 The average RMS reconstruction error for a Gaussian process and filter, with σ = 1/√12, for c = 5, 10, 15 and 20 165
4.28 The matching process 171
4.29 The analysis of the matching error given that the closest match to the estimated match position is the Nth feature 174
4.30 The theoretical probability of obtaining the correct match as a function of the disparity measurement error, σ = 1/√2 178
4.31 Experimentally derived relationship between the probability of obtaining the correct match and the error in the disparity estimate for a number of different
angle quantizations 179
4.32 The probability density of obtaining a matching error as a function of the error in the disparity estimate 180
4.33 The distortion of zero crossing contours for non constant disparity functions 181
4.34 The probability of obtaining the correct match as a function of the disparity estimate error for non-constant disparity functions 182
4.35 The stereo camera geometry 184
4.36 The effects of vertical misalignment on the disparity measurements 187
4.37 The action of the various errors on the matching process 190
5.1 The topics covered in this chapter 195
5.2 The spatial filtering process 199
5.3 The frequency response of the four spatial filters 203
5.4 The zero crossing pyramid of a random image pair 206
5.5 The variation of the RMS disparity error with the matching region size 207
5.6 The variation of the RMS disparity error with the number of resolution levels 208
5.7 The RMS disparity error as a function of the number of relaxation iterations 209
5.8 Perspective plots of the disparity function obtained using matching method 1, with the warping reconstruction method 210
5.9 RMS disparity error as a function of σ_g, for matching algorithm 1, maximum disparity = 5 211
5.10 RMS disparity error as a function of σ_g, for matching algorithm 1, maximum disparity = 10 212
5.11 RMS disparity error as a function of σ_g, for matching algorithm 1, maximum disparity = 15 212
5.12 RMS disparity error as a function of σ_g, for matching algorithm 1, maximum disparity = 20 213
5.13 RMS disparity error as a function of σ_g, for matching algorithm 2, maximum disparity = 5 213
5.14 RMS disparity error as a function of σ_g, for matching algorithm 2, maximum disparity = 10 214
5.15 RMS disparity error as a function of σ_g, for matching algorithm 2, maximum disparity = 15 214
5.16 RMS disparity error as a function of σ_g, for matching algorithm 2, maximum disparity = 20 215
5.17 RMS disparity error as a function of σ_g, for matching algorithm 3, maximum disparity = 5 215
5.18 RMS disparity error as a function of σ_g, for matching algorithm 3, maximum disparity = 10 216
5.19 RMS disparity error as a function of σ_g, for matching algorithm 3, maximum disparity = 15 216
5.20 RMS disparity error as a function of σ_g, for matching algorithm 3, maximum disparity = 20 217
5.21 Perspective plots of the error maps for the three reconstruction methods, obtained for σ_g = 40 217
5.22 Perspective plots of the disparity function obtained for σ_g = 40 for the three reconstruction techniques 218
5.23 The left hand scale map for the disparity gradient experiments 222
5.24 The right hand scale map for the β₁ = -20/255 case 222
5.25 The right hand scale map for the β₁ = -40/255 case 223
5.26 The right hand scale map for the β₁ = -60/255 case 223
5.27 The right hand scale map for the β₁ = -80/255 case 224
5.28 The measured RMS disparity error due to filtering as a function of σ for disparity gradients of -20/255, -40/255, -60/255 and -80/255 225
5.29 The expected RMS disparity error due to filtering as a function of σ for disparity gradients of -20/255, -40/255, -60/255 and -80/255 225
5.30 Increase in RMS disparity error as a function of added noise variance. Surface σ = 30, Iterative matching, Zero crossings only 227
5.31 Increase in RMS disparity error as a function of added noise variance. Surface σ = 15, Iterative matching, Zero crossings only 228
5.32 Increase in RMS disparity error as a function of added noise variance. Surface σ = 30, Iterative matching, Zero crossings and extrema 228
5.33 Increase in RMS disparity error as a function of added noise variance.
Surface σ = 15, Iterative matching, Zero crossings and extrema 229
5.34 The pseudo-variance of the additive noise error as a function of the additive noise variance, for q = √2 230
5.35 The RMS disparity errors for the Dispscan and simplified matching algorithms as a function of σ_g 232
5.36 A log lying on a flat deck 236
5.37 The effects of thresholding figure 5.36 236
5.38 The result of applying an edge operator (Marr-Hildreth) to figure 5.36 238
5.39 The video log scaling system setup 239
5.40 Two stereo image pairs depicting single log scenes 244
5.41 The zero crossings of the stereo pairs shown in figure 5.40 245
5.42 The thresholded disparity maps of the log scenes 246
5.43 The approximation of the log boundary by its convex hull 248
5.44 The filled in log region 249
5.45 The detected log boundary 250
5.46 Fitting an ellipsoid to the log region 255
6.1 The topics covered in this chapter 260
6.2 Three adjacent one dimensional slices of a two dimensional scale map exhibiting non-well-behavedness 268
6.3 A linked coarsely quantized scale map (√2:1 σ ratio) 272
6.4 Splitting and merging of scale map contours for nonlinear disparities 274
6.5 A stereo pair of scale maps with sinusoidal disparity 274
6.6 The scale maps of a real stereo image pair 276
7.1 The topics to be covered in this chapter 281
7.2 The spatial organization of a foveal image representation 284
7.3 The transformed version of the foveal image representation of figure 7.1 284
7.4 The relationship between the left and right foveal scale maps 286
7.5 The relationship between the left and right scale maps for β₀ = 0 287
7.6 The relationship between the left and right foveal scale maps for β₀ = 0 287
7.7 A pair of ambiguous sinusoidal stimuli 289
7.8 The diffrequency search paths for a logarithmically scaled σ axis 293
7.9 The RMS diffrequency error as a function of σ for β₁ = -20/255, linear disparity 294
7.10 The RMS diffrequency
error as a function of σ for β₁ = -40/255, linear disparity 295
7.11 The RMS diffrequency error as a function of σ for β₁ = -60/255, linear disparity 295
7.12 The RMS diffrequency error as a function of σ for β₁ = -80/255, linear disparity 296
7.13 The standard deviation of the diffrequency quantization error 296
2.1 The diamond search path and the five edge modes 313
2.2 Examples of edges in the five edge types 314

ACKNOWLEDGEMENTS

The production of a thesis does not occur in a vacuum. Without the support and interaction of a number of people and organizations this particular thesis would never have been completed.

I owe a great deal to my supervisor, Dr. Peter Lawrence, whose never-ending enthusiasm and confidence in my work gave me the encouragement needed to successfully undertake this research.

I would like to acknowledge the advice provided, at various times, by Dr. Allan Mackworth of the Computer Science department at U.B.C.

The fellow students with whom one has the opportunity of working provide much of the intellectual and social interaction that one requires if they are not to emerge from their studies narrow-minded and lacking in basic tennis skills. I have had the pleasure of having many fine people as colleagues. I would especially like to acknowledge the contributions, both social as well as intellectual, of Brian Maranda, Richard Jankowski, Nick Jaeger, Norman Beaulieu, Kevin Huscroft and Jim Reimer. As well, I must acknowledge some of the Computer Science students that have shown me the view from their side, as well as keeping me entertained on Friday afternoons. Barry Brachman has been a friend as well as a colleague, and Jim Little and Marc Majka have shown me the computational side of Computer Vision.

I would also like to thank the support given by the Forest Engineering Institute of Canada, particularly Verne Wellburn and Alex Sinclair.
The B.C. Science Council funded the research described in this thesis with a grant and provided me with a much appreciated G.R.E.A.T. scholarship.

Finally, I would like to thank my mother and father for being behind me all the way through my long studies, and for not making me become a welder or something like my high school counselor suggested.

I - INTRODUCTION

"They came to Bethsaida and some people brought him a blind man whom they begged him to touch. He took the blind man by the hand and led him outside the village. Then putting spittle on his eyes and laying his hands on him, he asked, 'Can you see anything?' The man, who was beginning to see, replied, 'I can see people; they look like trees to me, but they are walking about.' Then he laid his hands on the man's eyes again and he saw clearly; he was cured and he could see everything plainly and distinctly." (Mark 8:22-25, Jerusalem Bible)

1.1 - The Stereo Vision Problem

Like the blind man in the quotation that begins this thesis, the machines of man's creation have recently acquired the gift of sight. This gift allows these machines the ability to perform tasks unheard of a scant decade ago, such as being able to autonomously manoeuvre in a loosely constrained environment, visually inspect industrial components for defects, locate and track objects for the purpose of manipulating them, and just plain see what's out there. However, unlike the benefactor of the above quotation, human engineers have somewhat less than divine powers. As a result our machines currently operate in a mostly black and white, blurred, and two dimensional world, and don't really understand what they are looking at unless it is explained to them by a human. Most vision systems used in industrial applications at this time cannot determine the three dimensional position of an object except under the most contrived conditions.
It will be a while before we see robots with spatial perception good enough to enable them to play baseball (not to mention the other skills involved).

There have been methods developed, such as structured light and laser ranging (see the survey article by Jarvis, 1983), which can accurately determine the three dimensional positions of objects. These techniques, however, require a very constrained environment and are active, which means that they alter their visual environment in some fashion (such as by projecting patterns of light on to the scene). Being taken out of this constrained environment and placed into a less constrained one significantly reduces the ability of these systems. The fact that a system uses active methods is not necessarily a drawback, and such methods must be judged on their size, power consumption, efficiency and effect on the environment. From an engineering point of view, an active sensing method is often the method of choice. However, the method of visual depth perception used by biological systems (the bat's echo-location system being one example of an exception) is passive. Much of the current research into passive depth perception methods is centred around the determination of how biological visual systems function and the design of similar methods tailored for use in machines.

Despite some encouraging advances in our understanding of such methods (due in large part to the efforts of the late David Marr and his co-workers at the Massachusetts Institute of Technology), passive stereo vision systems that can operate well in loosely constrained environments have not yet been developed.
It has been the author's experience that laymen, as well as educated people not acquainted with the computer vision field, greatly underestimate the difficulties involved in the perception of depth. The usual argument that they present is that the human visual system can perceive depth so effortlessly that they, quite naturally, assume that the processes involved must be fairly simple. One way in which to answer these people is to present them with two arrays of numbers, such as those in figure 1.1, and ask them to determine the depth of the object which gave rise to these arrays of numbers. This simple example points out the fact that the human visual system actually contains a very complex processing mechanism that is tailored to process spatial imagery such as that caused by patterns of light falling on our retinae. This same mechanism operates very inefficiently when presented with stimuli such as the array of numbers depicted in figure 1.1. In fact the only way a human can detect depth in this pair of arrays is to bypass the visual processing centres altogether and try and determine the correlation between the two arrays with the use of the higher cognitive centres of the brain, which are ill adapted for such computations. In doing this sort of experiment, a human begins to appreciate the complex processes that are actually involved in stereoscopic depth perception.

[Figure 1.1 consists of two 20 by 20 arrays of image intensity values, labelled LEFT and RIGHT.]

FIGURE 1.1 Two correlated arrays of numbers which encode a three dimensional scene

In order to understand the difficulties involved in stereoscopic depth perception one must first understand the process by which depth can, in principle, be measured. Consider the pair of images shown in figure 1.2. These are two views of the same scene as seen from two different vantage points. A schematic description of the situation is depicted in figure 1.3. Looking closely at these images we notice that the image of the same physical point (such as the lower right corner of the telephone for example) occurs at different points in the two images.
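The correlation search that the text alludes to, comparing one array of numbers against shifted portions of the other, can be made concrete with a short sketch. The code below is a minimal illustration, not from the thesis; the function name, window size and search range are all invented for the example. It slides a small patch from the left array along the corresponding row of the right array and reports the shift giving the highest normalized cross-correlation:

```python
import numpy as np

def best_disparity(left, right, row, col, patch=3, max_disp=5):
    """Return the horizontal shift (disparity) maximizing the normalized
    cross-correlation between a patch of the left image and candidate
    patches in the same row of the right image."""
    h = patch // 2
    ref = left[row - h:row + h + 1, col - h:col + h + 1].astype(float)
    ref = (ref - ref.mean()) / (ref.std() + 1e-9)
    best_d, best_score = 0, -np.inf
    for d in range(max_disp + 1):
        c = col - d                  # nearer points shift left in the right image
        if c - h < 0:
            break
        cand = right[row - h:row + h + 1, c - h:c + h + 1].astype(float)
        cand = (cand - cand.mean()) / (cand.std() + 1e-9)
        score = (ref * cand).mean()  # normalized cross-correlation
        if score > best_score:
            best_d, best_score = d, score
    return best_d

# Toy data: the right view is the left view shifted two pixels.
rng = np.random.default_rng(0)
left = rng.integers(0, 100, size=(10, 12))
right = np.roll(left, -2, axis=1)
print(best_disparity(left, right, row=5, col=6))
```

Carried out by eye over two pages of digits, this is exactly the tedious computation that, as the text notes, our visual processing centres normally hide from us.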
Even closer inspection reveals that this difference in position is not the same for all  FIGURE  1.2 Two views of a scene taken from different vantage points.  FOCAL PoiNT  IMAGE  FIGURE  PLflME  1.3 The setup that produced the images shown in figure 1.2.  physical points in the scene. In fact it can be seen from figure 1.3 that the farther away a  5  physical point is from the cameras, the smaller is the difference in position, or disparity. This is the basic principle which allows us to measure from  two different  positions.  As Marr  (1974)  depth given two images  has stated,  there  of a scene taken  are three  steps  that are  required for the determination of depth in this manner to proceed. These are: 1. A point in one of the images corresponding to an actual  physical event must be  found. 2. The corresponding point must be found in the other image.  3. geometry  The disparity  between  these  corresponding  points  is  measured  and, given the  of the imaging process, the depth to the point in space giving rise to the physical  event is computed. The second of these steps, commonly refered to as the correspondence problem, is the step that has proven the most difficult to implement. The problem, reduced to its most basic form, is how can we distinguish one point in an image from all of the other points in the image?  Part  perception  of the answer  process,  namely  to this  question  the requirement  lies  in the first  step  of the stereo  that the features that we try to match  depth  between  the two images arise from unique physical events. As Marr and Poggio (1979) point out, this rules out the use of image intensity as a matching feature since a physical event cannot be uniquely associated  with a given image intensity due to the fact that there are many physical  events which give  rise  features which are more intensity  be used.  associated  to that image directly related  However,  even  intensity.  
to that image intensity. Marr and Poggio (1979) suggest that features which are more directly related to physical events, such as sharp changes in image intensity, be used. However, even if one can find a set of features that can be uniquely associated with physical events, the correspondence problem is still not solved. This is because of the inherent ambiguity of these matching features. Unless very complex features, such as object descriptions (chair, telephone etc.), are used (implying that an immense amount of processing has already been done to extract these features), the features to be matched are to some extent ambiguous. This means that two independent features may look identical (even if they result from totally separate physical events) and can be confused by the system that is attempting to find correspondences. One of the main challenges that stereo vision researchers have faced is in reducing this feature ambiguity without excessively increasing the amount of processing required to find the features.

The Multi-Resolution Paradigm

For most scenes the range of shifts, or disparities, between the two images of that scene can be bounded. This means that any search for corresponding features can be limited to a certain finite region. Now, if the density of the features were such that there were few features in this matching region, then the correspondence problem would not be very difficult. Thus one possible method for determining correspondences involves matching scene features which have a low density. Alternatively, one can limit the size of the matching region, which means limiting the range of disparities that the system can handle. Marr and Poggio (1979) devised a scheme whereby one could use features with a high density and still handle a large disparity range. This type of scheme was first used in the work of Moravec (1977).
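The tradeoff just described, between feature density and the size of the matching region, can be given a rough quantitative flavour. In the sketch below (our own illustration, not part of the original argument; Python is used for all such sketches in this document) candidate feature positions are modelled as a Poisson process, so that the probability of at least one unrelated feature falling inside a search region of width w, given a feature density ρ, is 1 - e^(-ρw).

```python
import math

def p_spurious(density, width):
    # Probability that at least one unrelated feature falls inside a
    # search region of the given width, when feature positions are
    # modelled as a Poisson process with mean `density` features/pixel.
    return 1.0 - math.exp(-density * width)

# Sparse features searched over a wide disparity range:
p_sparse_wide = p_spurious(density=0.01, width=20)
# Dense features over the same wide range are almost always ambiguous:
p_dense_wide = p_spurious(density=0.20, width=20)
# Dense features become usable once the search region is narrowed:
p_dense_narrow = p_spurious(density=0.20, width=2)
```

Under this toy model, dense features are workable only when the matching region is small, and sparse features tolerate a large one; this is precisely the tension that the multi-resolution paradigm resolves.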
The technique used by Marr and Poggio, which was intended to model the way in which humans (and some other biological systems) perform stereoscopic depth perception, involved using a multi-resolution feature set. Such a set consists of collections of features of various densities. The low density, or low resolution, feature set can be matched over a large range of disparities, but only a sparse set of depth values is obtained. These values, however, can be used to guide the matching of the next denser (higher resolution) feature set, which can be matched over a smaller range of disparities, and which provides a denser set of depth values. This process can proceed to higher and higher resolutions. The net result of this algorithm is that a dense set of depth values can be obtained over a large disparity range. However, this explanation is oversimplified and, like many things, the process is not as simple as it seems. There must be a way of transferring information from one resolution level to another, which was not addressed by Marr and Poggio in their proposal of the multi-resolution method. The matching algorithm proposed by Marr and Poggio, which we describe in detail in chapter 2, is seen to have problems with non-constant disparity functions.

Goals of the thesis

The objectives of this thesis are four-fold. First, we wish to develop a multi-resolution stereo matching algorithm that is potentially rapid enough for real time applications. Secondly, we want to analyze the component parts of the resulting algorithms to see where and how errors are introduced, and how these errors affect the performance of the matching algorithms. Thirdly, we wish to test the algorithm thoroughly and apply it to an industrial task, that of log scaling. Our final goal is to understand the mechanisms that are most important to the performance of multi-resolution matching algorithms and to propose other algorithms which may offer improved performance.
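The coarse-to-fine process outlined above can be illustrated with a deliberately simplified sketch: a one dimensional random signal pair with a constant integer disparity, matched by sum-of-squared-differences on raw intensities rather than on features. All names and parameters here are ours, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def downsample(x):
    # One pyramid level: blur-and-halve by averaging adjacent pairs.
    return 0.5 * (x[0::2] + x[1::2])

def best_shift(f, g, centre, radius):
    # Integer shift of g, within centre +/- radius, minimising the
    # sum of squared differences against f.
    best, best_err = None, np.inf
    for s in range(max(0, centre - radius), centre + radius + 1):
        err = np.sum((f[: len(f) - s] - g[s:]) ** 2)
        if err < best_err:
            best, best_err = s, err
    return best

d_true = 16                  # f(x) = g(x + d_true): a constant disparity
g = rng.standard_normal(512)
f = np.concatenate([g[d_true:], rng.standard_normal(d_true)])

# Three stages of 2:1 downsampling give a four level pyramid.
levels = [(f, g)]
for _ in range(3):
    levels.append(tuple(downsample(a) for a in levels[-1]))

# One wide search at the coarsest level, then only +/-1 about the
# doubled estimate at each finer level: the "guidance" step.
est = best_shift(*levels[-1], centre=4, radius=4)
for lf, lg in reversed(levels[:-1]):
    est = best_shift(lf, lg, centre=2 * est, radius=1)
```

The wide search is performed only once, on the shortest signals; every finer level examines just three candidate shifts about the doubled coarse estimate, which is the source of the computational saving claimed for the multi-resolution scheme.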
1.2 - Overview of the Thesis

This section briefly describes the layout of the thesis and points out the contributions that each chapter makes to the attainment of the goals of the thesis. Figure 1.4 depicts the topics to be covered in the thesis, where they can be found, and the relationships between them. In addition, at the end of each chapter we present a summary of the most important points raised in that chapter.

Chapter 2 begins with a discussion of multi-resolution image representations. The Marr-Poggio multi-resolution algorithm (1979) is examined in some detail. We point out difficulties involved with the Marr-Poggio method when the disparity function is not constant. It is shown that, to handle non-constant disparity functions, the low resolution disparity information must be accurately passed to the higher resolutions. Based on this idea we propose a modification to the Marr-Poggio algorithm. A different sort of matching algorithm, based on some recent work of Grimson (1985), is described, which involves scanning through a large disparity range for possible matches. Grimson's method is essentially single-resolution, using lower resolution information only to disambiguate competing matches. We show that this method also has difficulty with non-constant disparity functions. We propose a modification of this algorithm which allows non-constant disparity functions to be handled.

For the multi-resolution methods discussed in this thesis, the projection of disparity data from lower resolutions to higher is seen to be of paramount importance. Thus, in chapter 3, we discuss, at some length, various methods by which this can be done.
Grimson (1981b, 1982) has also discussed this problem, but only in the context of 'filling in' the gaps between the disparity values at the highest resolution, rather than the process of reconstructing the disparity function at the lower resolutions, which is what we are concerned with in this thesis. We show that the assumptions implicit in Grimson's reconstruction method are not entirely valid for the lower resolutions. Because of this we look at other possible methods for the disparity function reconstruction. This search led us to examine methods based on sampling theory; that is, the theory of reconstructing analytic functions from their samples.

[FIGURE 1.4 The structure of the thesis: image representations and feature matching (ch. 2), disparity function reconstruction (ch. 3), error analysis (ch. 4), experiments and the application to log scaling (ch. 5), scale space matching (ch. 6), binocular diffrequency measurement (ch. 7), and conclusions (ch. 8).]

One of the problems that we encounter is that the disparity samples we obtain are distributed non-uniformly, whereas most reconstruction methods based on sampling theory require the samples to lie on regular lattices (such as rectangular or hexagonal). We therefore develop a method, based on the sampling theory for uniform sample distributions, which allows the reconstruction of functions from non-uniformly distributed samples. We present one and two dimensional versions of this method, and give a computational algorithm for the two dimensional case. This method has a wider domain of applicability than the stereo vision problem, of course; it has been successfully applied to the reconstruction of synthetic aperture radio telescope imagery (Clark, Palmer and Lawrence, 1985). We also discuss how this method can be altered to handle the inclusion of disparity gradient, or surface normal, information in the reconstruction process.
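The flavour of reconstruction from non-uniformly distributed samples can be conveyed by a small least-squares sketch (again our own illustration, not the method developed in chapter 3): a band-limited "disparity function" is recovered from scattered samples by fitting a truncated Fourier basis.

```python
import numpy as np

rng = np.random.default_rng(1)

# A band-limited "disparity function" on [0, 1): a few low harmonics.
def d_true(x):
    return 1.0 + 0.5 * np.sin(2 * np.pi * x) + 0.2 * np.cos(4 * np.pi * x)

# Non-uniformly scattered sample positions, as feature positions are.
xs = np.sort(rng.uniform(0.0, 1.0, 40))
ds = d_true(xs)

K = 3  # harmonics retained: the assumed bandwidth of the disparity

def basis(x):
    cols = [np.ones_like(x)]
    for k in range(1, K + 1):
        cols.append(np.sin(2 * np.pi * k * x))
        cols.append(np.cos(2 * np.pi * k * x))
    return np.stack(cols, axis=-1)

# Least-squares fit of the truncated Fourier basis to the samples.
coef, *_ = np.linalg.lstsq(basis(xs), ds, rcond=None)

# Evaluate the reconstruction on a regular grid and check the error.
grid = np.linspace(0.0, 1.0, 200, endpoint=False)
err = np.max(np.abs(basis(grid) @ coef - d_true(grid)))
```

Because the sampled function lies in the span of the retained harmonics and there are more samples than coefficients, the fit recovers it to machine precision; real disparity data would of course violate both assumptions, which is why a more careful reconstruction theory is needed.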
It has been suggested (e.g. Ikeuchi, 1983) that the addition of such independent information about the surface will result in a more robust system for obtaining the three dimensional structure of an object.

Chapter 4, in some respects, forms the heart of the thesis. In this chapter we analyze the various processes by which errors can arise in the determination of the depth function. These errors can be partitioned into four distinct types. The first type is an error in the measured position of the features. This can result from sensor noise, position quantization, and camera misalignment, as well as from a heretofore unreported effect involving the spatial filtering of images of tilted surfaces. The second class of errors is the matching errors. These are errors which arise from the incorrect matching of ambiguous features. We show, for our simplified matching algorithm, that the matching error is sensitive to errors in the disparity estimate obtained from the lower resolution levels. It is shown that using more complex features reduces the matching error. However, it is also shown that non-constant disparity functions can cause distortion in the feature parameters (such as edge orientation), which may cause features to be matched incorrectly. Experimental studies are described which show that this distortion causes an increase in the matching error.

The third class of errors consists of those incurred in transferring disparity information from one resolution level to the next. The chief component of this error is the error in
In general, these errors depend on the accuracy to  which the  various  parameters of  the  imaging process, such  as  the  baseline  between  the  sensors, relative sensor tilt, focal length, etc., are known. If the values of these parameters are in  error,  then  industrial  so  will  applications  require accurate  depth  (such  as  the  values.  This  log  scaling  is an  important  application  consideration  described  in  for  chapter  the  many  5) which  measurements.  Chapter  5  done  in  analyses  the  describes chapter  the  results of a  4. We  run  the  number  of  experiments  multi-resolution  designed  matching  algorithms  to  check  the  proposed in  chapter 2 on a number of different synthetic image pairs. These image pairs are created from arrays  of  uncorrelated  (white)  normally  distributed  random  numbers.  The  disparity  functions  that are used are gaussian cylinders, that is constant disparity parallel to one axis and varying with a gaussian  shape  parallel  to  the  other axis. The spread  (or  variance) of this gaussian  disparity function is varied, which has the result of changing the bandwidth of the disparity function.  Chapter 5 concludes with an example a typically messy yard.  We  show  to  industrial application; that of measuring or scaling lumber in a lumber sort that  stereo  vision  provides  process. The steps involved in this automated of this  of the application of stereo depth perception  process on  actual  images  are  a  feasible  solution  to  the  automation  of  this  log scaling procedure are outlined and examples  provided. Estimates of  the  log  measurement accuracy  obtainable are given.  Chapters 6 and 7 address alternative methods,  which are  problems encountered  methods for obtaining disparity information. These  based on scale space image  representations,  are  immune to some of the  by the matching techniques proposed in chapter 2.  
Chapter 6 considers the idea of matching scale maps (defined in chapter 2).

Chapter 7 discusses a recently proposed binocular measurement process, that of diffrequency. Introduced by Blakemore (1970), this process involves measuring, in some fashion, the difference in the spatial frequency content between the two images. Using the scale space transform as an analytic as well as a representational tool, we give this idea a firm mathematical basis. It is shown that measurements of the diffrequency can be made on the binocular scale space image representation, and that these diffrequency values are directly related to the disparity gradient of the surface.

The final chapter (8) contains a discussion of the results obtained in the previous chapters and makes some conclusions. We provide possible directions for future research, based on the unanswered questions that the thesis raises.

2 - FEATURE MATCHING

2.1 - Image Representations

In this chapter we discuss ways in which a pair of images can be matched to yield the correspondence between them. A summary of the topics covered in this chapter is given in block diagram form in figure 2.1. The sections marked with '*' contain the bulk of the new material.

In order to create the matchable features, it is necessary that the raw intensity image be given a symbolic representation in terms of localized features. These features should correspond to physical events in the scene, such as discontinuities in illumination (shadows), edges of objects, surface markings, and so on. In this regard, localized changes in image intensity (often referred to as edges) are adequate features for use by the matching process.
Features such as the raw gray levels, on the other hand, are not adequate for use by the matching process, since a given gray level cannot be uniquely associated with a physical event in the scene. In general, a given pair of images will exhibit a higher degree of ambiguity, in terms of features that may be mistakenly confused with each other, for gray levels than for edge features. That is, in a given image region there are more indistinguishable gray levels than there are edges.

Thus, in order to be useful in a multi-resolution matching scheme, a feature type must correspond to physical events, and it must be possible to create a multi-resolution image representation of these features. Such a representation consists of a set of single resolution feature descriptions comprised of these features. The feature density generally (but not necessarily) decreases as the resolution decreases, thus reducing the matching ambiguity (as there are fewer features to be confused with each other).

[FIGURE 2.1 A summary of the topics covered in chapter 2: image representations, feature matching, the Marr-Poggio matching algorithm, problems with the Marr-Poggio approach, disparity scan algorithms, and a simplified method. Entries marked with '*' indicate new material.]

A popular multi-resolution image representation is the feature pyramid (also known as a feature cone) (Tanimoto, 1978, and Levine, 1978). A feature pyramid is depicted in figure 2.2.

[FIGURE 2.2 A feature pyramid.]

A feature pyramid is made up of a number of single resolution feature maps, each
that  there  higher one  resolution,  can  at  resolutions resolution matching  use  the  the  processing  search  so obtained  In this  may be achieved. over  to limit  (1985) discusses  gained  resolutions,  may be reduced. process  Since  a large  single resolution matching  lower  is less  much  more  from  these  the  amount  way an overall  level  data  of the search  in computation  possible  of processing  resulting  resolutions  processes required  in computation matching,  to  than  at  over  the  at  levels.  guide  the  higher  a single  high  one can perform a  and use the disparity  region at the higher from such  levels,  is the  at the lower resolution  low resolution  at low resolutions  the coarser  resolution  at the lower  quickly  saving  is spatially quantized at  the resolution,  In the case o f stereo image  range cheaply  scheme.  is, the  there  then  the required size  the savings  level. Each  information at the lower  takes place  information  high  That  is less  the characteristic pyramid shape.  processsing  a  is some  proportional  quantization. in  A feature pyramid.  resolutions.  a stereo matching  estimates Grimson  scheme  over  16  Feature resolutions,  pyramids  the  type  are  of  distinguished  feature  by  encoded  information at the various resolutions  in  three the  be found in (Rosenfeld,  These  three  dimensional  implementations  of  feature  1984). Quadtrees structures  are  pyramids  obtain  in the higher  pyramid,  the  way in  succesive which  the  resolutions is  Applications and descriptions of quadtrees  known  resolution  between  of succeeding  can be extended  the  ratio  and the  are obtained. If the ratio  2:1 the resulting pyramid is known as a quadtree. can  factors;  as  lower  levels  to three spatial dimensions.  oct-trees  (Srihari,  resolution  in some  levels  1984).  Some  by averaging  way. Other  the  information  contained  obtain each  resolution level by independent means. 
As will be seen, the stereo algorithms to  be described later acquire the multi-resolution feature maps by operating  implementations  on the information  in the higher resolutions.  Often  multi-resolution  feature  descriptions  have  feature resolutions. The resulting structure is strictly of having less  a constant  spatial  quantization  at all  not a pyramid, and loses the advantage  data to process at the lower resolutions.  These types  of structures are used  when one wishes to have a high spatial resolution, along with a low feature density. One resolution  can also envisage a multi-resolution image representation  between  successive  levels  is vanishingly small. In such  continuum of resolution levels. Such an image representation values.  representation  wherein the difference in a case we would have a  has been called a scale space  (Witkin, 1983), where the term scale space refers to the continuum of resolution  The  term  multi-resolution  scale  zero  space  crossings  originally  was  meant  of V G filtered images 2  to  apply  to  (these terms will  the  case  of  the  be defined shortly).  However, the concept can, and should be, extended to cover any type of feature for which a continuum of resolutions can be defined.  We have stated that features based on localized changes in image intensity are suitable features  for  representation  a  stereo  based  matching  on these  (1980), who proposed  system.  How can  features? This  an edge operator  question  we  obtain  a  was addressed  multi-resolution  image  by Marr and Hildreth  that was able to detect edges at a given scale or  17 resolution. By altering a parameter  in their operator,  detected,  consists of three parts. The first part involves smoothing  or localized. This operator  edges at  different resolutions could be  the image by convolving it with a gaussian low pass filter. 
This allows different resolutions to be achieved by changing the  cutoff frequency  of the  low pass filter. The second part  of  Marr and Hildreth's multi-resolution edge detection operator involves taking the second spatial derivative (or  in two dimensions taking the  Laplacian) of the  low pass filtered signal. The  final step is to determine where the differentiated signal passes through zero. Such a point is called a zero crossing, and for linear image intensity changes  locates the  filter is popularly known as a V G filter, from its component parts. 2  edge  exactly. The  In two dimensions the  V G filter's impulse response is written: 2  V G(r) 2  where reduces  r  =  2  the  =  x + y. 2  2  -(l/iro- )[l-rV2a ]e" 4  J  The factor  resolution of the  a  (2.1.1)  (rV2c72)  in the filter response  resulting edge  representation.  is the  scale  factor.  Increasing  o  The operation of the V G edge 2  detector is shown in figure 2.3 on a step edge signal. An  edge pyramid can be built up using the Marr-Hildreth edge operator  an image with the V G filter for a number of different values of a, 2  by filtering  and then finding the  zero crossings. We will call the resulting structure a zero crossing pyramid. In order to reduce computation  and storage  requirements  in our  stereo system,  position to a level directly proportional to the a  we  quantize  the  zero  crossing  value of the V G filter at each resolution. 2  Thus the positions of low resolution edges are specified less accurately  than are edges at the  high resolutions.  A  scale space image representation  can be obtained using the V G filter by applying 2  the filter to an image at a continuum of a values. In fact this was the original definition of the scale space. We can write this operation as an integral transform, which we call the scale space transform, as follows (for the one dimensional case):  18 i I  STEP INPUT  •-t  i  ii  LARGE CT  ?  
[FIGURE 2.3 The response of the ∇²G filter to a step input, for large and small σ. The zero crossings of the filtered output are seen to coincide with the edge.]

F(x,σ) = d²/dx² ∫ f(u) [1/(σ√(2π))] e^(-(x-u)²/2σ²) du    (2.1.2)

F(x,σ) is said to be the scale space transform of f(x). The above equation is seen to be a convolution of f(x) with the function (d²/dx²)[1/(σ√(2π))] e^(-(x-u)²/2σ²), which is the impulse response of the one dimensional ∇²G filter. This one dimensional scale space transform can be straightforwardly extended to handle higher dimensions. This is discussed in more detail in chapter 4.3. The function G(x,σ), obtained as follows:

G(x,σ) = 1 if F(x,σ) = 0, and zero otherwise    (2.1.3)

is called the scale map function of f(x). It is a binary function that is zero everywhere except at the zeroes of the scale space transform. The scale map of a random function is shown in figure 2.4. The σ scale is logarithmic. Note how, as the resolution decreases (i.e. increasing σ), the density of the zero crossing contours also decreases.

[FIGURE 2.4 The scale map of a random one dimensional function.]

The two image representations described here, the zero crossing pyramid and the scale map, will be the representations that we will be using in our development of the stereo matching process in the rest of the thesis.

2.2 - The Feature Matching Problem

Once the multi-resolution image description has been built for the two images in the stereo pair, we must match them to obtain the disparity function between corresponding points in the two images. The disparity function, which is usually required to be known at all points in the image, not just at those points coincident with a feature, is defined as follows.
If  two images  we  are  given by f(x^)  assume perfectly  matched  for the  sensors and  right  image  ideal viewing  conditions then we can write:  f(x )  =  R  where d(x^)  a  two  dimensional  space,  search to one  of  each  both  components  other.  This  of  from  the  two components  means that  we  between  can  reduce  the components  would require  disparity of the the  vector  searching could be  disparity vector  dimensionality  of  camera  focal  point  through  the  scene  axis (the  point  the  can be seen in  line between  the focal  eyes or cameras) and the. direction of view of either of the cameras (i.e. the  are  of the disparity vector is  epipolar constraint. The idea behind this constraint  If one forms a plane defined by the inter-ocular  points of the  intersection  that  dimension. This dependence  inherent in the so-called  vector  so  However, it can be shown that the  independent  figure 2.5.  (2.2.1)  R  This would seem to imply that any search for a match  determined. not  R  is the disparity function. Note that d is a vector quantity. That is, it has two  components. over  g(x -rf(x ))  being  imaged),  then  the the  of this plane with the image planes of the two cameras define two lines, one in  each camera's image  plane. The importance  of these lines, called  the  epipolar lines, is that  any feature on the epipolar line in one image has its corresponding feature on the associated epipolar line in the search along the  other image.  Thus, to perform the search for a match,  one need only  epipolar line in the other image associated with given view direction. Note  that, in general, the epipolar lines change as the view direction changes. This means that one must  determine  where  the  epipolar lines are  before  the  search is begun. To determine  the  22  IMAGE  OPTIC A X E S  FIGURE  epipolar such  line  as  knowledge employed. are  for  the  The geometry of the  2.5  given  view  direction  inter-ocular  axis.  
If accurate knowledge of the camera geometry is not known, then the epipolar geometry must in principle be produced by some other means, such as a coarse matching search. A particularly simple epipolar geometry results if the optic axes of the two cameras are parallel. In this case the epipolar lines are all parallel in each image and are horizontal. This is the epipolar geometry that is assumed in most of the matching experiments in this thesis. Even if the camera geometry is not of this form, one can in principle rectify the images by two dimensional coordinate transformations, so that the epipolar lines of the modified images are horizontal and the search can take place along a single (horizontal) coordinate.

2.3 - The Marr-Poggio Matching Algorithm

Marr and Poggio (1979) proposed a multi-resolution matching algorithm which was intended to model the way in which the human visual system performed stereo matching. This method was subsequently implemented by Grimson (1981a). The operation of this algorithm was as follows.

A multi-resolution image representation is constructed, comprised of the zero crossings of ∇²G filtered images at four different resolutions. The values of σ used for these four filters correspond to filter widths of 3, 6, 12, and 24 picture elements (pixels). The orientations of the zero crossings are quantized to 30° levels. The positions of the zero crossings are measured to within one pixel, and the same spatial quantization (pixel size) is used at all four resolution levels (hence, the resulting structure is not a pyramid).
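A one dimensional sketch of this kind of zero crossing representation is given below. It is illustrative only: the actual implementation uses two dimensional ∇²G filters and also records the contrast sign and quantized orientation of each zero crossing, all of which we omit here, as well as the σ-proportional position quantization described earlier.

```python
import numpy as np

rng = np.random.default_rng(2)

def d2_gaussian(sigma):
    # Second derivative of a Gaussian: the one dimensional analogue of
    # the V2G operator (amplitude normalisation is irrelevant to the
    # positions of the zero crossings), truncated at +/- 4 sigma.
    x = np.arange(-int(4 * sigma), int(4 * sigma) + 1, dtype=float)
    return (x * x / sigma**4 - 1.0 / sigma**2) * np.exp(-x * x / (2 * sigma**2))

def zero_crossings(y):
    # Indices at which the filtered signal changes sign.
    s = np.signbit(y)
    return np.nonzero(s[1:] != s[:-1])[0]

signal = rng.standard_normal(2048)
counts = []
for sigma in (2.0, 4.0, 8.0, 16.0):            # one octave per level
    filtered = np.convolve(signal, d2_gaussian(sigma), mode="same")
    counts.append(len(zero_crossings(filtered)))
# counts falls roughly in proportion to 1/sigma.
```

The zero crossing counts fall off roughly as 1/σ, which is the thinning of the zero crossing contours with decreasing resolution noted for the scale map of figure 2.4.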
The pair of multi-resolution image representations are then matched. Matching in the Marr-Poggio scheme proceeds in parallel at each resolution level, with the search for a match for a given zero crossing being done along the epipolar lines. Zero crossings that are matched must have the same contrast sign (i.e. light to dark, or dark to light) and must have roughly the same orientation. The search region over which a match is sought is limited in size; if the search region is limited to √2σ, the maximum disparity that can be measured at a given resolution is 4√2σ. Marr and Poggio (Marr and Poggio, 1979) have shown that limiting the search region in this way keeps the probability of obtaining more than one possible match (an ambiguous match) in the search region low. Thus the disparity range that can be measured at the low resolutions, where σ is large, is relatively high; conversely, the disparity search range at the high resolutions is very small.

The Marr-Poggio method detects when the search region is out of range of the true disparity (that is, when the true match does not lie in the region in which we are searching). When this happens, the lower resolutions are examined to see if they are in range. If so, the disparity information obtained by the low resolution searches is used to shift the search region in the
The Marr-Poggio method determines with  into range of the  match  in  search  region  is  out  search  range  then  region  approaches  1.  the probability of there  Thus,  by  examining  percentage of zero crossings that have possible matches in a neighbourhood one can whether  or  Clearly,  the  provide a  not  the  search  neighborhood  region in  is in range  which  the  meaningful estimate of the  the  in the search region was shown by M a n and  If the search region is in range,  the  of  of the  statistics  match  are  true disparity tabulated  proportion.  determine  in that neighborhood.  must  If the  the  be  proportion  large  enough  to  of matches in a  neighborhood falls below a certain threshold (say .8) then the neighborhood is declared out of range and all matches are  rejected. The lower resolutions  are  then  examined  to see  if they  are in range. If they are then the disparity information provided by these low resolutions used to shift the search region at the  high resolution and the  are  matching process is repeated  until the search region is in range at the high resolution for all points in the image.  It can happen that there is more than one possible match when  the  search  region  matches can be the find  out  Poggio  which  one  perform  the  is  in  correct one, is  the  range).  Since  it  unless we can  correct one,  disambiguation  by  we  is  obvious  disambiguate  will  examining  incur  an  whether  reference to the current disparity estimate, is positive, negative compared  to the  that  in the search region (even not  of  these  possible  these possible matches, that is, error the  in the  match.  measured  Marr and  disparity,  with  or zero. This measure is then  dominant disparity sign in a neighborhood about the  If the sign of the possible match agrees with the  all  dominant match  feature being matched.  then it is chosen as the  correct match. 
If none of the possible matches agree with the dominant disparity sign, or if there is more than one match that agrees with the dominant disparity sign then no match is made. This is known as the three pool matching hypothesis as it assumes that there are only three  types  of  disparity  detectors,  divergent  (-),  convergent  (+ )  and  null  (0)  which  are  25  LOW R E S MATCHING^ RANGE  Hi'RES. MATCHING RANGE  do-  _DISPAR|TY -FUNCTION HI-RES IN R A N G E FIGURE  broadly figure  tuned 2.7.  2.6  as  The  visual system  disparity under  is  to  disparity  three  (Marr  Grimson disambiguate  The M a r r - P o g g i o multi-resolution  pool  (1985)  question  at  possible  the  matches  (as  the  neighborhood)  the for  grouped  example  that  is  the  no match  at  three  model  information  that  be  from  candidates.  disparity  resolution. would  one  into  scheme.  pools. has  The  been  scheme  proposed  is  depicted  for  the  in  human  p323).  matching  with  lower  then  1979,  suggests  agreement  are  mechanism  and Poggio,  between in  and  matching  If the  all is  This  information the  low  case made.  the  lower  involves in a  resolution  if the  low  resolutions choosing  the  neighborhood information resolution  can  be match  about  cannot  disparity  used  the  to  whose feature  disambiguate varied  within  26  MATCHING ) RANGE (  C O N V E R G E N T POOL d  N U L L POOL DIVERGENT POOL  FIGURE  2.7 The three matching pools of the three pool hypothesis.  27  2.4 - Problems with the Marr-Poggio Matching Scheme There with  regard  are to  a  number  matching  of problems with the  images  with  Marr-Poggio matching  non-constant  disparity  functions.  scheme, We  especially  describe  these  problems in this section. Passing information from lower resolutions to higher.  In a recent paper (Grimson, 1985)  Grimson makes the following comments:  "What is the effect of driving the matching process in a coarse-to-fine manner? 
At the next finer level, there will in general be twice as many feature points. If image features persist across scales, which they usually do, then in general, each of the feature points at the finer scale can be associated with a feature point at the coarser scale. This will not always be the case, of course, and if there is no corresponding feature at the coarser scale, then ... the use of multiple scales implies no savings of computational expense."

This quotation implies that in the Marr-Poggio scheme information is transferred from lower resolutions to higher resolutions only for those features that persist across scales. However, since there are generally twice as many feature points at the next higher resolution, it follows that only half of the features will persist between one resolution level and the next lower resolution level (this is easily observed in the scale space representation, e.g. figure 2.4). As Grimson points out, the fact that information from the lower resolution is not available for matching those features which do not persist to the lower resolutions means that a larger search region is needed, increasing the probability of obtaining ambiguous matches, and hence increasing the probability of matching error.

One way of solving the problem of transferring information between resolution levels is not to rely on only those features that persist between resolution levels, but rather to reconstruct the disparity function from the information at all features at the lower resolutions, so that an estimate of the disparity is available at all points in the lower resolution image. In this way, a disparity estimate is available for all features in the higher resolution, not just those features that persist from the lower resolutions. This technique is hinted at by Grimson (1981a) in his early paper describing his implementation of the Marr-Poggio matching scheme.
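This alternative — reconstructing a disparity estimate at every position from all of the lower resolution samples — can be sketched in one dimension with simple linear interpolation (a simplified stand-in for the reconstruction methods examined in chapter 3; the sample values are made up):

```python
def reconstruct_disparity(samples, x):
    """Linearly interpolate sparse (position, disparity) samples to give
    an estimate at an arbitrary position x; positions outside the
    sampled interval take the nearest endpoint value."""
    pts = sorted(samples)
    if x <= pts[0][0]:
        return pts[0][1]
    if x >= pts[-1][0]:
        return pts[-1][1]
    for (x0, d0), (x1, d1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            return d0 + (x - x0) / (x1 - x0) * (d1 - d0)

# Sparse disparity samples at the lower resolution feature positions.
low_res = [(0.0, 2.0), (8.0, 4.0), (16.0, 4.0)]

# An estimate is now available at every higher resolution feature
# position, not just at features that persist across scales.
print(reconstruct_disparity(low_res, 4.0))   # 3.0
print(reconstruct_disparity(low_res, 12.0))  # 4.0
```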
He proposes extracting the disparity information from a region in a low resolution image, to be used by the higher resolution matching, by finding the median, mode or average of the disparity values in that region. Grimson did not go into any detail on this topic (only one sentence in a 36 page paper) and it is not clear whether or not he was suggesting a reconstruction scheme such as we are proposing; in any case it seems to contradict the implications of the quote given above, taken from a later paper of his (Grimson, 1985).

In-range/Out-of-range detection

A second, and more fundamental, problem with the Marr-Poggio matching scheme rests in what they refer to as the continuity constraint. This constraint states that disparity varies smoothly almost everywhere and that only a small fraction of the scene is composed of boundaries that are discontinuous in depth (Marr and Poggio, 1979, p.303). Grimson (1985) gives a simple example which illustrates how the in-range/out-of-range detection mechanism of Marr and Poggio, which is based on matching statistics in the neighborhood about the feature to be matched, fails for discontinuous disparity functions. This failure is given as a reason for why the continuity constraint is required. The idea behind Grimson's example is depicted in figure 2.8. Suppose that the matching statistics are compiled over the square region with sides of length d, as shown in the figure. Now suppose that (x/d) of this neighborhood covers region A, which has a disparity that is out of range of the matching process, and the remaining (1-x/d) portion of the neighborhood covers region B, which is in range of the matching process.
If e is the threshold on the percentage of matching zero crossings in the neighborhood that is required to declare the match (of the feature in the centre of the neighborhood) in range, then for what values of x will the percentage of matched points in the neighborhood exceed e? Grimson shows that if x is less than or equal to (1-e)d/0.3 then the matches in the neighborhood will be declared within range. It is clear from the diagram that if x is less than 0.5d then the match in the centre of the region will be over region B, and hence will be in range. However, if e is greater than 0.85 the algorithm will actually conclude that the match is out of range.

FIGURE 2.8 The failure of the Marr-Poggio in-range/out-of-range mechanism for discontinuous disparity functions. After Grimson, 1985.

Conversely, if e is too low (i.e. less than 0.85) then one can encounter situations wherein matches that are actually out of range are taken by Marr and Poggio's algorithm to be in range. Grimson took this failure of the Marr-Poggio in-range/out-of-range detector to be a result of the violation of the continuity constraint. However, as we will show, the Marr-Poggio in-range/out-of-range detector can also fail for disparity functions that satisfy the continuity constraint. In fact, it can be shown that their in-range/out-of-range detector will only work 100% of the time for constant disparity functions. To see this, consider a modified version of Grimson's example. Instead of having a discontinuous disparity function we assume a linear disparity function, as shown in figure 2.9, which obviously satisfies the continuity constraint. We assume, for convenience, that the disparity function varies only along the x axis and is constant along the y axis. The following analysis also holds for the more general case of an arbitrarily oriented linear disparity function. Suppose that the range of the matching process is 2w.
Suppose that we have a possible match to a feature at point P, and that our disparity estimate at P is exact (although the matching process doesn't know this). Thus the match at point P is actually in range of the matching process.

FIGURE 2.9 The failure of the Marr-Poggio in-range/out-of-range mechanism for linear (continuous) disparity functions.

Suppose that we count the percentage of features in a neighborhood of size d about point P that have at least one possible match. If this percentage exceeds e then the Marr-Poggio in-range/out-of-range detector would accept the match at P as correct. Since we know that nearly 100% of the in-range features will have at least one match, and 70% of the out-of-range features will have a possible match, we can calculate the percentage of features having a match in the neighborhood. If the slope of the disparity function is m, then it is simple to show that the percentage of features in the neighborhood having a match is:

p = 100*[0.7 + 0.6w/md]      (2.4.1)

If this percentage exceeds e (say 85%) then the match is taken to be in-range. However, we know that the match is actually in-range. Thus if p is less than e then the match will be erroneously declared out of range. We cannot reduce e below 70% because otherwise matches that are truly out of range will be taken to be in range. Note that, even if we take e to be a very non-conservative value near 70% (i.e. 70%+δ) there can be cases where in-range matches will be rejected. This is because, for any δ>0 we can find a value of the disparity function slope, m, for which p is less than 70+δ. In fact this occurs when m > 60w/dδ. This means that the Marr-Poggio in-range/out-of-range detection scheme fails even for some
fails even for some  Upon closer examination, it is seen that the Marr-Poggio theory  not a continuity constraint  but a constancy constraint for strict validity. We will  requires  see in the  next discussion that this is the case for the Marr-Poggio disambiguation scheme as well.  Grimson matches. (1981),  (1985) suggests that figural continuity can be used to disambiguate possible  The use of figural continuity, involves distinguishing between  indicating out-of-range,  which was first proposed  matches which give  rise  by Mayhew and Frisby  to small scattered  segments,  and matches which give rise to extended contours, indicating in-range.  This method seems to be very effective. We have not studied the use of figural continuity in our  algorithms,  as  it is fairly  computationally  intensive,  and as  we shall  see later, our  algorithms work well enough without it  Problems with disambiguation As  you may recall,  one  of  the  methods  that  Marr  and Poggio  suggested  for  disambiguating between possible matches was to examine the dominant sign of the disparity in a neighborhood about the feature  to be matched. The candidate  match  which was consistent  with the domimant disparity sign was then chosen as the correct match. However this method works only for disparity functions that are constant, seen  in figure 2.10.  Suppose we want  or almost  to disambiguate  so. That this is so can be  the possible  matches  to a  feature  located at point P. Suppose that there are two possible matches, one with a positive disparity (relative to the current disparity estimate) and the other with a negative disparity. In order to disambiguate these we find the dominant disparity sign of the unambiguously matched  features  32  matched •features  dispari+y £ estimats  •function  FIGURE 2.10 The failure of the Marr-Poggio disambiguation procedure for non-constant disparity functions. in  a  neighborhood  about  P.  
Note that these disparities are both positive and negative. Furthermore, since there are more negative disparities in the neighborhood than positive ones, the dominant disparity sign is taken to be negative. Thus the candidate match with the negative disparity is taken to be the correct match, even though it is in fact incorrect. One can modify the disambiguation method so that if there is more than one type of disparity sign in a neighborhood then no attempt is made to assign a match to the ambiguous feature. However, if a disparity function is non-constant almost everywhere then very little disambiguation will be performed and very few matches will be made.

Grimson (1985) suggests that figural continuity can be used to disambiguate possible matches as well as determine whether or not a feature is in range of the matching process. This method involves accepting only those matches which result in extended contours with sufficiently large extent. However, the test for figural continuity can be computationally expensive and may break down for noisy images, or for images whose features are distorted as described in chapter 4. In these cases the zero crossing contours may become broken up.

2.5 - A Simplified Multi-resolution Matching Algorithm.

Based on the discussion in the previous section we can make the following observations.

1. The in-range/out-of-range detection mechanism of Marr and Poggio is unreliable for non-constant disparity functions.

2. The disambiguation process proposed by Marr and Poggio is likewise unreliable for non-constant disparity functions.

3. The way in which disparity information is passed from lower resolutions to higher resolutions needs to be examined in more detail than was done by Marr and Poggio, and Grimson.
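The in-range/out-of-range failures behind observation 1 can be checked numerically with the expressions from section 2.4 (a sketch; the 70% chance-match statistic is taken from the text and the parameter values are illustrative):

```python
def matched_pct_discontinuity(x, d):
    """Percentage of matched features in a neighbourhood of width d
    whose portion x lies over an out-of-range region (figure 2.8):
    in-range features all match, out-of-range ones match by chance
    70% of the time."""
    return 100.0 * (1.0 - 0.3 * x / d)

def matched_pct_linear(w, m, d):
    """Percentage of matched features for a linear disparity function
    of slope m, matching range +/-w, neighbourhood width d
    (equation 2.4.1), capped at 100."""
    return min(100.0, 100.0 * (0.7 + 0.6 * w / (m * d)))

# Discontinuous case: with a permissive threshold e = 80%, a
# neighbourhood whose centre lies over the out-of-range region
# (x = 0.6d) is still declared in range.
print(matched_pct_discontinuity(x=0.6, d=1.0) >= 80.0)   # True (wrong)

# Continuous case: with e = 85%, a steep linear disparity (m = 1,
# w = 2, d = 10) is declared out of range even though the central
# match is exactly in range.
print(matched_pct_linear(w=2.0, m=1.0, d=10.0) >= 85.0)  # False (wrong)
```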
Since these operations are unreliable for non-constant disparity functions (which are the rule rather than the exception in most applications), and since they are quite computationally intensive (because the in-range/out-of-range detection and the disambiguation processes examine a neighborhood of points for each feature to be matched), we now wonder what would be the effect of eliminating these operations. Clearly, eliminating these processes would simplify the matching scheme, resulting in a somewhat faster processing system. On the other hand, if we did eliminate these operations we would run the risk of increased probability of error due to incorrect matching; to ensure that this did not happen, the probability of having out-of-range or ambiguous matches would have to be very small. We now propose a simple matching algorithm that tries to satisfy these conditions by relying heavily on the information passed to the higher resolutions from the lower.

A multi-resolution nearest neighbour matching algorithm

The operation of this algorithm, which we call the nearest-neighbour matching algorithm, is depicted schematically in figure 2.11. The idea behind this algorithm is as follows.

FIGURE 2.11 The operation of the multi-resolution nearest neighbour matching algorithm.

We eliminate all in-range/out-of-range checking and disambiguation of possible matches, partly to save on computation and partly because of the difficulties these processes have with non-constant disparity functions.
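The front end of the pipeline in figure 2.11 (∇²G filtering followed by zero crossing detection) can be illustrated in one dimension (a toy sketch; the value of σ, the kernel support, and the edge handling are arbitrary choices here):

```python
import math

def log_kernel(sigma, half_width):
    """Sampled 1-D Laplacian-of-Gaussian (second derivative of a
    Gaussian), the 1-D analogue of the V2G operator."""
    ks = []
    for i in range(-half_width, half_width + 1):
        g = math.exp(-i * i / (2.0 * sigma * sigma))
        ks.append((i * i / sigma ** 4 - 1.0 / sigma ** 2) * g)
    return ks

def convolve(signal, kernel):
    h = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        s = 0.0
        for j, k in enumerate(kernel):
            idx = min(max(i + j - h, 0), len(signal) - 1)  # clamp edges
            s += k * signal[idx]
        out.append(s)
    return out

def zero_crossings(filtered):
    """Feature positions: indices where the filtered signal changes sign."""
    return [i for i in range(1, len(filtered))
            if filtered[i - 1] * filtered[i] < 0]

# A step edge produces a single zero crossing near the edge position.
signal = [0.0] * 16 + [1.0] * 16
zc = zero_crossings(convolve(signal, log_kernel(sigma=2.0, half_width=6)))
print(zc)  # one zero crossing at the step edge
```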
In so doing we require that a disparity estimate be available that can be used to determine the centre of the matching region at a given resolution, and that this estimate be as accurate as possible in order to reduce the probability of the feature being out of range of the matching process. In order to reduce the probability of obtaining ambiguous matches, we must restrict the effective matching range. This is done by using nearest neighbour matching. If the disparity estimate is sufficiently accurate, then the match whose disparity is closest to the estimated disparity is most likely the correct match, and we therefore treat it as such. Hence the term 'nearest neighbour matching'. This form of matching is depicted in figure 2.12.

This type of matching algorithm requires that the disparity estimate be fairly accurate (however, as we will see in chapter 4.5, some error can be tolerated). In our system the disparity estimate is obtained from the lower resolution matching processes. In order to have a disparity estimate for all feature points in the higher resolution it is necessary to reconstruct the disparity function from the sparser set of disparity values available at the lower resolutions. Thus, for the high resolution disparity estimate to be accurate it is required that both the disparity measurements at the lower resolutions, and the reconstruction of the disparity function from these measurements, be accurate. One way in which to provide a more accurate disparity estimate at a given resolution level is to transfer information from the higher resolutions as well as the lower (after an estimate is available at the higher resolution, of course). The processing flow of such an iterative matching method is given in figure 2.13.
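The nearest neighbour selection rule itself is simple to state (a minimal sketch; in the full system the candidates come from the zero crossing matcher and the estimate from the reconstructed lower resolution disparity):

```python
def nearest_neighbour_match(candidates, estimate):
    """Choose the candidate disparity closest to the current disparity
    estimate.  No in-range checking or disambiguation is performed;
    the burden falls on the accuracy of the estimate."""
    if not candidates:
        return None
    return min(candidates, key=lambda d: abs(d - estimate))

# Candidate disparities found along the epipolar line for one feature.
print(nearest_neighbour_match([-3.0, 1.5, 6.0], estimate=2.0))  # 1.5
print(nearest_neighbour_match([], estimate=2.0))                # None
```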
This multi-level flow of information from low resolutions to high resolutions and back to low resolutions is similar to that proposed by Terzopoulos (1982) in his surface reconstruction algorithm (discussed in chapter 3.3). The main drawback of this approach is the increased computational requirement, as the matching, and the disparity function reconstruction, must be performed in every iteration. Experiments described in chapter 5 indicate, however, that some improvement in the matching performance of the algorithm is obtained as a result of such iteration.

The feature representation that is used in our algorithm is the zero crossing pyramid, with the spatial quantization proportional to the value of σ. This means that the average zero crossing density in zero crossings per pixel is independent of the resolution level. Using a pyramid structure means that less computation is required to perform the matching search for a given disparity range at low resolutions than at high resolutions.

FIGURE 2.12 Nearest neighbour matching.

We also consider the use of extrema (peaks or valleys) of the ∇²G filtered image as features to be matched. This was suggested by Frisby and Mayhew (1981), who claimed that extremum features were required if some phenomena concerning human stereopsis were to be explained.

The bulk of the thesis involves an examination of the disparity measurement process (chapter 4) as well as of the reconstruction process (chapter 3), and their effect on the performance of the matching algorithm (chapter 5). These theoretical examinations of matching and measurement error are supported by experimental results (chapters 3 and 4). These studies indicate that, if the reconstruction process is performed sufficiently well, the simplified matching algorithm described here works well.

FIGURE 2.13 The processing flow of a multi-level iterative matching algorithm.

2.6 - Disparity Scanning Matching Algorithms.

In a recent paper (Grimson, 1985) Grimson suggested that instead of guiding the matching from low resolutions to high, the matching be done at the highest resolution only, by scanning through a large disparity range and noting at which disparities a match was possible for a given feature. Thus for each feature, a list of possible disparity values could be tabulated. Then these possible matches could be disambiguated in some fashion. Grimson recommends using the information obtained by performing the disparity scan at lower resolutions to do the disambiguation. Note that since the disparity range is the same for all resolutions, the following comment by Nishihara (Nishihara, 1984) rings true:

"Marr and Poggio's idea of trading off resolution for range seems to be largely abandoned in (Grimson's) technique."

Grimson's method of disambiguation is as follows. Given a set of possible matches at a given resolution, a neighbourhood about the feature point at the next lower resolution is checked for unambiguous matches. If the disparity values of these unambiguous matches are all the same, and if this disparity value is one of the possible disparity values for the high resolution feature, then the high resolution feature is assigned that match. Otherwise the high resolution feature is assigned no match at all. Note that this method suffers from the same problem as did Marr and Poggio's disambiguation mechanism; it fails for non-constant disparity functions.
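Grimson's lower resolution disambiguation rule, as described above, can be sketched as follows (hypothetical data structures; `low_res_disparities` stands for the unambiguous matches found in the lower resolution neighbourhood):

```python
def grimson_disambiguate(candidates, low_res_disparities):
    """Accept a high resolution match only when all unambiguous lower
    resolution matches in the neighbourhood agree on one disparity and
    that disparity is among the candidates; otherwise assign no match."""
    unique = set(low_res_disparities)
    if len(unique) != 1:
        return None            # non-constant disparity: no assignment
    d = unique.pop()
    return d if d in candidates else None

print(grimson_disambiguate([2, 5], [2, 2, 2]))  # 2
print(grimson_disambiguate([2, 5], [2, 3, 2]))  # None: disparity varies
```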
If a region of the lower resolution image has non-constant disparity then there will, in general, be more than one disparity value in that region. Thus no assignment of disparity values will be possible for the higher resolution features.

To remedy this problem we propose the following algorithm, which we call the multi-resolution dispscan algorithm. We begin the matching process at the very lowest resolution level. At this level we scan through the entire disparity range, looking for possible matches. At each feature for which we have found at least one match, we examine a neighborhood of points about this feature. For each possible disparity of the central feature we count the percentage of matches in this neighborhood that result in a disparity within 2 pixels of this disparity. The candidate disparity which has the largest percentage of such matches is taken to be the correct disparity for the feature. It is obvious that we cannot do this at the higher resolutions when the disparity function is not constant, for the same reasons that we gave for the inadequacy of the Marr-Poggio matching technique. However, after doing the above process at the lowest resolution, we have an estimate of the disparity function, with which we can modify the matching process at the higher resolutions. Then, at the higher resolutions, we do the following. We scan through the entire disparity range and mark down all the disparities at which a given feature can have a possible match. As at the lowest resolution, for each such candidate disparity we examine the possible matches of features in a neighborhood about the central feature. We tabulate the percentage of neighborhood features whose disparities, when added to the difference between the disparity estimate at the central feature and the disparity estimate at the neighborhood feature, lie within 2 pixels of the candidate disparity. The candidate disparity with the largest percentage is chosen as the correct one. Note that the disparity estimate from the lower resolution is being used to guide the disambiguation process, but in a way that allows the algorithm to work for non-constant disparity functions. The operation of the algorithm is depicted graphically in figure 2.14.

This algorithm, like the simple multi-resolution matching algorithm described in the previous section, depends crucially on the accuracy of the information provided by the lower resolutions. This means that it is important that the disparity values obtained at the low resolutions be as accurate as possible and that the reconstruction of the disparity function from these values be as accurate as possible. As we have stated earlier, the disparity measurement and reconstruction processes are studied in some detail in the remainder of the thesis. The next chapter focuses on methods for performing the reconstruction process.

FIGURE 2.14 The operation of the multi-resolution Dispscan matching algorithm.

2.7 - Other matching methods

The Marr-Poggio type of stereo matcher is not the only mechanism that has been proposed.
Historically, the Marr-Poggio algorithm arose from a consideration of previous efforts to model the human stereo vision system. The bulk of these methods were based on the proposal by Julesz (1971) that the human stereo vision process is cooperative. That is, a solution for the disparity function is obtained by the cooperation of a large number of spatially distributed, but interacting, disparity detecting neurons. Such cooperative algorithms were proposed by Nelson (1975), Dev (1975), and Sugie and Suwa (1977). Marr and Poggio (1976) presented a cooperative algorithm (later analyzed in Marr, Palm, and Poggio, 1978) which incorporated physical constraints in the formulation of the computational structure of the algorithm. However, even this algorithm, which performed better than the previously mentioned cooperative algorithms, works poorly on natural imagery (Marr and Poggio, 1979, p303). Also, as Marr (1982) remarked, iterative methods (which most cooperative algorithms are) are not likely to be used by biological systems because of the slow speed of the neurons. Fast operation is obtainable only by one-shot algorithms utilizing highly parallel networks of processing elements (neurons). The inadequacy of Marr and Poggio's cooperative method led Marr and Poggio to develop their multi-resolution matching algorithm, which has been described in the previous sections.

Research into stereo vision has not been limited to the modeling of biological systems, of course. There have been many methods proposed specifically for machine vision. The earliest of these methods was the intensity based area correlation technique (e.g. Levine, O'Handley, and Yagi, 1973). This technique involves correlating the intensity functions of regions of the stereo image pair to determine the disparity in these regions.
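A minimal one-dimensional sketch of the area correlation idea (using a sum of squared differences cost in place of true correlation; the toy signals and window size are made up):

```python
def ssd_disparity(left, right, x, half, max_disp):
    """Estimate disparity at position x by sliding a window of width
    2*half+1 from the left image across the right image and keeping
    the offset with the smallest sum of squared differences."""
    best_d, best_cost = 0, float('inf')
    for d in range(-max_disp, max_disp + 1):
        cost = 0.0
        for o in range(-half, half + 1):
            i, j = x + o, x + o + d
            if j < 0 or j >= len(right):
                cost = float('inf')
                break
            cost += (left[i] - right[j]) ** 2
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

# The right image is the left image shifted by 3 pixels.
left = [0, 0, 1, 4, 9, 4, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 0, 1, 4, 9, 4, 1, 0, 0]
print(ssd_disparity(left, right, x=4, half=2, max_disp=4))  # 3
```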
These methods typically suffer from low resolution and from the high ambiguity of the image intensity values. Baker and Binford (1981) used edge correlation in addition to intensity correlation in an effort to reduce the matching ambiguity. They also introduced a number of constraints which both reduced the ambiguity and sped up the computations, and used a coarse-to-fine approach to limit the amount of computation required for the correlation. Ohta and Kanade (1985) have proposed an algorithm which uses dynamic programming methods for searching large spaces of candidate matches in three dimensions (that is, along the epipolar line as well as across it, and along the disparity dimension). Baker and Binford (1981) also use dynamic programming to search the space of possible disparity values for matches. The drawback of these dynamic programming approaches seems to lie in the immense amount of computation required, especially for complex scenes. From the point of view of simplicity in implementation, the Marr-Poggio type of matching, or the nearest neighbour type of matching we propose, is preferable over the correlational methods. It is for this reason that we have concentrated on the Marr-Poggio type of approach.

2.8 - Summary of Chapter 2

- The problem of matching feature based image descriptions was introduced.

- The Marr-Poggio (1979) multi-resolution matching method was reviewed.

- The out-of-range detection method and disambiguation techniques used in the Marr-Poggio multi-resolution matching were shown to be unreliable for non-constant disparity functions.

- Questions were raised about the way in which the Marr-Poggio algorithm transferred information from the lower resolutions to the higher.
- A simple algorithm (the Nearest Neighbour Matching Algorithm), which does away with the out-of-range detection and disambiguation processes, and instead relies on reconstruction of the disparity function at each resolution to provide a disparity estimate to guide the matching at the next higher resolution, was proposed.

- Disparity scanning matching algorithms, as proposed by Grimson (1985), are introduced. A multi-resolution disparity scanning matching algorithm based on reconstruction of the disparity function at each resolution is proposed (the Multi-Resolution DispScan Algorithm). This algorithm is able to handle non-constant disparity functions, whereas it is not clear whether or not Grimson's method can.

- It is shown that the performance of the matching algorithms proposed in this chapter depends crucially on the accuracy of the disparity measurements at the lower resolutions and on the accuracy of the disparity function reconstruction process.

III - RECONSTRUCTION OF THE DISPARITY FUNCTION FROM ITS SAMPLES

3.1 - Introduction

In the discussion, in the previous chapter, of the discrete multi-resolution matching algorithms, it was pointed out that the sparsely sampled disparity function obtained at a given resolution level must be, at least partially, reconstructed or interpolated to provide a denser sampling. This denser sampling is needed since guidance of the matching process at the next higher resolution requires a depth estimate at each feature location in this higher resolution. Because the feature density increases as the resolution increases, it follows that the lower resolution disparity function is not sampled at all locations corresponding to the higher resolution feature positions.
Hence the lower resolution disparity function must be reconstructed or interpolated to provide disparity values at all feature locations in the higher resolution image representation.

The reconstruction of the disparity function from its samples is also required at the highest resolution level, where there is no need to guide a higher resolution matching process. In this case the reconstruction is needed to fill in the gaps between the feature points and provide a complete disparity map, that is, an array containing disparity estimates at each point in the image. One may argue that it is not necessary to know the disparity at each point in the image, but only where it is needed by some visual process. However, it is unlikely that the place where a disparity value is required is always going to coincide with a location at which the matching algorithm explicitly provides a disparity value. Hence some reconstruction or interpolation will always be required. Grimson (Grimson, 1981b) provides further motivation, based on psychophysical considerations, for performing the interpolation process. In this chapter we will discuss methods for performing the reconstruction operation. It should be pointed out that the applicability of the reconstruction methods described in this chapter is not limited to the stereo vision case; they apply to a very wide range of applications. For example, (Clark et al, 1985) discusses the use of the transformation domain method described later in an application in radio astronomy. The topics covered in this chapter are listed in block diagram form in figure 3.1.

A note on the terminology used in this thesis: there are three terms which will be used to denote the process of obtaining the value of a function at a point where it is not known explicitly.
These are interpolation, approximation and reconstruction. These are, for the purpose of this thesis, defined as follows. Interpolation is the process of fitting a known (class of) function(s) through the measured function values such that the resulting function has the same values at the measurement points as the measured values. Approximation is the same as interpolation except that the condition that the interpolated function have the same values as the measured values at the measurement points is not enforced. Reconstruction is defined as the process of obtaining the exact function that has been sampled, using information (or assumptions) about the underlying function (for example, whether or not it is bandlimited). It should be noted that many of the reconstruction methods look similar to interpolation methods; the difference is that the interpolation methods assume nothing of the underlying function, and hence nothing can be said, in general, of their accuracy. In the reconstruction methods the underlying functions are assumed to satisfy some constraints, which allow the accuracy of the reconstruction to be determined.

Proofs of Theorems stated in this chapter can be found in the Appendix.

FIGURE 3.1 The topics covered in this chapter.

3.2 - Interpolatory Methods

We will only briefly describe some interpolation methods, as these produce results which typically contain more error than the methods to be described later. A good review of interpolation methods can be found in Appendix VI of (Grimson, 1981b).
The simplest form of interpolation is known as nearest neighbour interpolation. In this method a given function value is assigned the value of the function sample nearest to it. A slightly more complex method is linear interpolation, wherein straight line segments are fitted between adjacent sample points (for the one dimensional case; in two dimensions the points are segmented, or triangularized, into triplets and a planar function, f = a + bx + cy, is fit through each of these triplets). This basic idea can be extended to higher order polynomials. However, higher order polynomial functions tend to exhibit oscillatory behaviour which is almost certainly not present in the actual function. Spline methods involve piecewise low order polynomial sections which are matched such that certain smoothness constraints are met. For example, in the case of cubic splines, two of the coefficients in a patch (1D case) may be set so that the function values at the ends of the patch match the measured sample values, and the other two coefficients may be chosen so as to ensure that the first and second derivatives of the cubic segment match those of the two adjacent cubic segments.

3.3 - The Methods of Grimson and Terzopoulos

The first detailed analysis of the surface reconstruction problem in the context of stereo vision was performed by Grimson (Grimson, 1981b, 1982) as part of his PhD research. The approach Grimson took was to construct a complete surface (or disparity) description based only on the surface information known along zero crossing contours (Grimson used zero crossings as the features to be matched in the correspondence process). To constrain the infinite set of possible surfaces that could satisfy the conditions imposed by the disparity values known along the zero crossing contours, Grimson relied on the properties of the imaging process which, when coupled with the shape and reflectance characteristics of the surface, gave rise to the zero crossings in the image. Informally, his method was to compute the surface which fitted the known surface depth values and was 'most consistent' with the implicit shading information. Crudely put, this information can be described as implying that the reconstructed surface should, when passed through a ∇²G filter, not contain any new zero crossings that were not in the original zero crossing set. This was formally referred to by Grimson (1981b, 1982) as the 'surface consistency' constraint, and informally as the 'No news is good news' constraint, and was stated as follows:

The absence of zero crossings constrains the surface shape.

Grimson shows that the adoption of this constraint results in the conclusion that the best surface to fit the known data is the one that minimizes the variation in surface orientation (also known as the quadratic variation of the depth gradient function) over the surface. Grimson shows (1982) that the functional to be minimized is the following:

Θ(f) = [∫∫ (f_xx² + 2f_xy² + f_yy²) dx dy]^(1/2)    (3.3.1)

Grimson (1982) shows that the above minimization problem can be characterized by using the calculus of variations to provide a set of differential equations (known as the Euler equations) that the minimal function must obey. Doing this, Grimson obtained the following differential equation for f:

∇⁴f = f_xxxx + 2f_xxyy + f_yyyy = 0    (3.3.2)

where ∇⁴ is the biharmonic operator. The boundary conditions for this P.D.E. are given by (for the case of a square boundary, aligned with the coordinate axes):

f_yy = 0, f_yyx = 0 for the boundaries parallel to the x axis.    (3.3.3)

f_xx = 0, f_xxy = 0 for the boundaries parallel to the y axis.    (3.3.4)

It can be shown (Terzopoulos, 1982) that the minimal function obtained as the solution of the above P.D.E. can be modeled as the surface that a thin metal plate takes when it is constrained to pass through the known depth values.

Grimson (1981b) presents a computational method whereby the minimal surface can be obtained. Essentially this method involves searching for the surface function f which minimizes the functional Θ(f) (equation 3.3.1) with a conjugate gradient search algorithm. Since the surface representation to be determined in practice, as well as the input depth constraints, are defined in a discrete rather than continuous fashion, the above minimization problem is converted into a discrete problem. This involves conversion of the differentiations into differences and the integrations into summations. The function to be minimized then becomes the discrete form of (3.3.1) plus a penalty term (Grimson, 1981b, p196):

Θd(v) + β Σ_ij (v_ij − c_ij)²    (3.3.5)

where Θd denotes the discretized (finite difference) form of the quadratic variation, {v_ij} is the set of reconstructed surface depth values and {c_ij} is the set of known surface depths (the penalty sum is taken only over the points at which the depth is known). The indices i,j refer to positions on the grid of surface points. The scalar β is a smoothness parameter. If it is zero, the resulting surface will fit the known data values exactly. If non-zero, then the resulting surface will be a smooth approximation to the actual surface. In this fashion errors in the measured surface depth values may be smoothed out.

Computationally, the conjugate gradient algorithm of Grimson can, for the discrete case (see Terzopoulos, 1982), be formulated as a relaxation algorithm. Relaxation methods are iterative procedures for determining the solutions of linear systems of equations of the form:

Au = f    (3.3.6)

and hence:

u = A⁻¹f    (3.3.7)

where A is a known N×N nonsingular matrix and f is a known N×1 column vector. Relaxation methods provide estimates of the solution vector u.
The operation of the relaxation methods can be described by the following matrix equation:

u^(k+1) = G u^(k) + f′    (3.3.8)

where G is the iteration matrix, which is a function of A, and f′ is derived from f. The current estimate of the solution vector, u^(k+1), is obtained by multiplying the previous estimate of the solution vector, u^(k), by the iteration matrix G and adding to it the vector of constraints, f′ (which is zero for the points at which no depth value is known). Three different types of relaxation schemes can be put into this form. The first type is Jacobi relaxation, for which G = D⁻¹(L+U), where A = D − L − U is the decomposition of A into diagonal D, upper triangular U, and lower triangular L components. The other two types of relaxation methods are Gauss-Seidel relaxation, for which G = (D−L)⁻¹U, and Successive Overrelaxation (SOR), for which G = (I − ωD⁻¹L)⁻¹[(1−ω)I + ωD⁻¹U], with the relaxation parameter ω ∈ (1,2). The Jacobi method is a parallel method in that each element of the new solution vector can be computed simultaneously. It requires that both the new and old solution vectors be stored completely. The Gauss-Seidel method is a sequential method, as the computation of a particular element of the new solution vector can use the elements of the new solution vector that have already been computed. Thus the Gauss-Seidel algorithm will operate faster than the Jacobi algorithm. The Successive Overrelaxation algorithm is a generalization of the Gauss-Seidel algorithm (one obtains the Gauss-Seidel algorithm from the SOR algorithm when ω = 1). This algorithm scales the residual vector (the vector that is added to the old solution to provide the new solution vector) by a number greater than one in an attempt to speed convergence. Terzopoulos (1982) provides the conditions on G which ensure that the iterative process converges.
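The behaviour of these schemes can be sketched numerically. The following is a minimal illustration, not the thesis implementation; the small tridiagonal test system is invented for the example, and the plate-bending system of (3.3.6) would simply replace A and f.

```python
import numpy as np

def jacobi_step(A, u, f):
    # u_new[i] = (f[i] - sum_{j != i} A[i,j] u[j]) / A[i,i]  (fully parallel)
    D = np.diag(A)
    return (f - (A @ u - D * u)) / D

def sor_step(A, u, f, w=1.5):
    # Sequential sweep using already-updated entries; w = 1 gives
    # Gauss-Seidel, 1 < w < 2 over-relaxes to speed convergence.
    u = u.copy()
    for i in range(len(u)):
        sigma = A[i, :i] @ u[:i] + A[i, i + 1:] @ u[i + 1:]
        u[i] = (1 - w) * u[i] + w * (f[i] - sigma) / A[i, i]
    return u

def solve(step, A, f, tol=1e-8, max_iter=10000, **kw):
    u = np.zeros_like(f)
    for k in range(max_iter):
        u_new = step(A, u, f, **kw)
        if np.max(np.abs(u_new - u)) < tol:
            return u_new, k + 1
        u = u_new
    return u, max_iter

# 1-D Poisson-like tridiagonal test system (illustrative only).
n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
f = np.ones(n)

u_j, k_j = solve(jacobi_step, A, f)
u_s, k_s = solve(sor_step, A, f, w=1.5)
print(k_j, k_s)  # SOR typically needs far fewer sweeps than Jacobi
```

On systems like this one, Jacobi needs both vectors in memory and converges slowly, while the SOR sweep reuses freshly updated entries and converges in a small fraction of the iterations, in line with the discussion above.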
In particular, he shows that the SOR algorithm cannot be guaranteed to converge if ω is greater than or equal to 2. The matrix A is given by the Hessian of the functional Θ(f):

A = [∂²Θ(u)/∂u_ij ∂u_kl],  1 ≤ i,j,k,l ≤ N    (3.3.9)

where N×N is the size of the grid of depth values. Note that A is an N²×N² sized array. For any reasonable value of N this results in a very large array. However, A is very sparse and banded. We can characterize the computations required at each point in the surface grid at each iteration as a multiplication of the neighbouring surface depth estimates by a set of fixed coefficients, as shown in the following relaxation iteration formula:

v_ij = [8(v_{i−1,j} + v_{i+1,j} + v_{i,j−1} + v_{i,j+1}) − 2(v_{i−1,j−1} + v_{i+1,j−1} + v_{i−1,j+1} + v_{i+1,j+1}) − (v_{i−2,j} + v_{i+2,j} + v_{i,j−2} + v_{i,j+2}) + βc_ij] / (20 + β)    (3.3.10)

These coefficients are termed computational molecules by Terzopoulos (1982), who derived the values of the coefficients both for the interior grid points as well as for the grid points near the grid boundaries, where the boundary conditions imposed by the formulation of the problem come into play. These same molecules were obtained by Grimson (1981b) in his specification of the conjugate gradient algorithm. These computational molecules will be used in the implementation of a relaxation algorithm later in the thesis, and are displayed in figure 3.2.

The convergence of the aforementioned relaxation methods and of the conjugate gradient methods turns out to be painfully slow. In an effort to speed up the computation of the solution vector, Terzopoulos (1982) proposed the use of a multi-grid iterative algorithm. This method involves using depth constraints at a number of different levels of resolution. The coarsely sampled depth values at the low resolutions would be interpolated first, using one of the standard single grid relaxation algorithms. Since these algorithms reduce the high frequency errors in the depth estimate very quickly, a coarse solution can be obtained with a small number of iterations. This coarse estimate is then used as an initial estimate for relaxation on the next higher resolution level. The processing continues in this manner until the highest resolution level has been reached. Because the relatively high frequency errors are quickly diminished at each resolution level, and since decreasing the resolution decreases the frequency of the errors that can be filtered out, it can be seen that such multi-grid methods can quickly eliminate error components having fairly low frequencies. The normal, single resolution relaxation methods can eliminate the high frequency error components quite quickly, but take far more iterations to eliminate the low frequency errors. Thus, if depth measurements are available at a number of different spatial resolutions, then the Terzopoulos multi-grid algorithm would be expected to be much faster than Grimson's method for surface reconstruction.

Let us now discuss whether or not Grimson's and Terzopoulos' algorithms are actually suitable for our application. Let us start by checking the validity of the assumptions made by Grimson (1981b).

FIGURE 3.2 Computational molecules for the relaxation surface approximation algorithm (after Terzopoulos, 1982). The thick bars indicate the boundary of the grid.

The starting point for Grimson's development of his surface reconstruction procedure was his surface consistency constraint. Basically, this said that if there was a region of the surface for which there was no zero crossing observed in the ∇²G image, then the surface could not be changing appreciably, otherwise a zero crossing would have been created. This statement seems reasonable enough, but it neglects one important fact. This fact is that the ∇²G filtering operation restricts the density of observed zero crossings. In fact, the larger the space constant of the ∇²G filter, the lower the density of observed zero crossings. This can easily be seen by looking at any one of the scale space maps that are depicted in chapters 4 through 6. Thus we see that Grimson's surface consistency constraint strictly holds only for the case of ∇² filtering, and not for ∇²G filtering.³ There can be cases where the surface changes appreciably without giving rise to an observed zero crossing, simply because the change in the surface caused an intensity change which was of too high a frequency and was therefore filtered out by the ∇²G operation. This may not be a major problem for the higher resolution surface representations, since the high frequency surface changes that are filtered out would not be perceived by the visual system anyway. At the lower resolutions, however, the problem is more serious. The surface changes that are filtered out at these resolutions will be fairly low frequency changes, and ones that would be perceivable by the visual system. Of course, if one is only trying to obtain a multi-resolution surface representation, as Terzopoulos (1982) does, then it is not expected that the lower resolution representation would capture all of the higher frequency surface changes. On the other hand, we have seen that the multi-resolution stereo matching process requires an accurate disparity estimate from the low resolution information so that the matching can proceed efficiently and accurately at the higher resolutions.
Thus, if the low resolution surface reconstruction process does not detect the higher frequency changes in the surface (or disparity), then the higher resolution matching algorithms will be required to be more robust. Thus we should ask ourselves whether or not there are other surface reconstruction methods, which do not assume Grimson's surface consistency constraint, that can better reconstruct the higher frequency surface changes. This question prompted the search for the reconstruction methods that are described in the next chapter, all of which are based on assumptions about the frequency content of the function to be reconstructed (i.e. bandlimitedness).

³ One of the reasons that Grimson did not account for this may lie in the fact that, in the statement of his Surface Consistency Theorem and its proof, he used ∇² filtering instead of the more general case of ∇²G filtering. In the case of ∇² filtering all of the (topographic) zero crossings are present and the Theorem holds. However, when ∇²G filtering is used the Surface Consistency Theorem is no longer valid. It should also be pointed out that Grimson intended his surface reconstruction algorithm for the task of filling in the disparity array at the highest resolution only.

Before we get into these methods, let us briefly discuss a problem with applying Terzopoulos' multi-grid algorithm to our multi-resolution matching algorithm. At first glance Terzopoulos' algorithm appears to be tailor made for the reconstruction processes to be done in the multi-resolution matching. However, Terzopoulos' algorithm requires that depth estimates be available at all resolutions before the surface reconstruction can take place. Clearly, the multi-resolution matching algorithm requires that the disparity function at a given resolution level be reconstructed before the next resolution level features can be matched. This means that the multi-grid reconstruction method is inapplicable. The best that one can do is to perform a number of relaxation iterations at each resolution level. This state of affairs is not as bad as it seems, because the disparity function estimate from the previous resolution level is available to provide a starting point for the relaxation process. Presumably, much of the low frequency error will have been suppressed in this manner. Thus, the number of relaxation iterations needed to achieve a certain error level is less than what would be expected from a single resolution level relaxation operation. However, having said this, it turns out that in practice the relaxation method is still fairly slow compared to the reconstruction methods to be discussed in the next chapter.

3.4 - The WKS Sampling Theorem and its Extensions

In this section we will be looking at methods for the reconstruction of functions that are based upon series expansions of these functions. It has been shown, by a host of researchers including E.T. Whittaker (1915), J.M. Whittaker (1929), Kotel'nikov (1933) and C.E. Shannon (1949), that one can represent a bandlimited function with an infinite series whose coefficients are the samples of the function, suitably distributed. The precise statement of this result is given here (after (Jerri, 1977)) as Theorem 3.1, which we call the WKS sampling theorem in honour of the aforementioned mathematicians.

Theorem 3.1 The uniform 1-D sampling theorem (WKS)
If f(t) has a Fourier transform F(ω) such that F(ω) = 0 for ω ≥ ω₀ = π/T, then f(t) can be reconstructed exactly from its samples f(nT), n = 0, ±1, ±2, ..., as follows:

f(t) = Σ_{n=−∞}^{∞} f(nT) sin[ω₀(t−nT)]/[ω₀(t−nT)]    (3.4.1)

This theorem requires for its validity that the samples of the function be distributed uniformly. That is, the spatial or temporal interval between sampling instants must be a constant, T. Our application, however, produces samples that are non-uniformly distributed. Hence the WKS sampling theorem, as it stands, is not valid for our application. We will now look at methods, adapted from the WKS theorem, that allow the reconstruction of functions from non-uniformly distributed samples.

One of the conditions on f(x) for the WKS theorem to hold is that it be bandlimited. This means that the following expression is true:

f(x) = ∫_I e^{jxω} F(ω) dω    (3.4.2)

where I is some bounded interval (which we will take, without loss of generality, to be [−π,π] in the subsequent text) and F(ω) is the Fourier transform of f(x). Now suppose that, instead of this condition, f(x) was bandlimited with respect to some other integral transform. For example:

f(x) = ∫_I K(x,ω) F_K(ω) dω    (3.4.3)

Suppose further that F_K ∈ L₂(I) (that is, f(x) has bounded energy) and that K(x,ω) ∈ L₂(I) is such that the set {K(x_n,ω)} for all integer n is a complete orthogonal set on the interval I. Then any 'K bandlimited' function f(x) can be reconstructed as specified in Theorem 3.2 (due to Kramer (1959); this precise statement of his theorem is that contained in (Jerri, 1977) (Theorem III-A-1)).

Theorem 3.2 The generalized WKS sampling theorem (due to Kramer, 1959)
Let I be an interval and let L₂(I) be the class of functions φ(x) for which ∫_I |φ(x)|² dx < ∞. Suppose that for each real x:

f(x) = ∫_I K(x,ω) g(ω) dω    (3.4.4)

where g(ω) ∈ L₂(I). Suppose that for each real x, K(x,ω) ∈ L₂(I), and that there exists a countable set E = {x_n} such that {K(x_n,ω)} is a complete orthogonal set on I. Then:

f(x) = lim Σ_n f(x_n) S_n(x)    (3.4.5)

where

S_n(x) = ∫_I K(x,ω) K*(x_n,ω) dω / ∫_I |K(x_n,ω)|² dω    (3.4.6)

Note that if K(x,ω) = e^{jxω} and x_n = nT then we get the standard WKS sampling theorem. As an example, a valid K(x,ω) is given by the Bessel function of mth order, ωJ_m(xω), which results in the inverse Hankel transform:

f(x) = ∫_I F(ω) ωJ_m(xω) dω    (3.4.7)

where I = [0,1]. For this case it can be shown that:

S_n(x) = 2x_mn J_m(x)/[(x_mn² − x²) J_{m+1}(x_mn)]    (3.4.8)

The sample sequence for this reconstruction formula is given implicitly by the positions of the zeroes of J_m(x). That is, x_mn satisfies J_m(x_mn) = 0. It is easily seen, by looking at a graph of the function J_m for any value of m, that the sample sequence is not uniform (however, the sequence approaches uniformity for large values of x). For this reconstruction formula to hold, f(x) must be bandlimited with respect to the Hankel transform. This means that F(ω) (which is the Hankel transform of f(x)) vanishes for ω outside the interval I = [0,1]. This is a different condition on f(x) than in the Fourier transform case. Note that a function f(x) that is not bandlimited in the usual Fourier transform sense may still be reconstructible with the above Bessel function reconstruction formula. There are many other types of functions that can be used as kernels in reconstruction formulae. Jerri (1977) lists some of these, including the associated Legendre functions and the Chebyshev functions of the second kind.

All of these reconstruction formulae require that the function be sampled in some non-uniform fashion. However, this sample sequence is fixed for each different reconstruction filter kernel. Thus, while the sample distributions are non-uniform, they are certainly not arbitrary. Thus these methods are not applicable to our application, in which the sample distribution is not known a priori and varies from case to case.

One possible way in which we might try to obtain a sampling theorem for an arbitrary sample distribution {x_n} is to determine, for each case, a kernel function K(x,ω) ∈ L₂(I) such that {K(x_n,ω)} is a complete orthogonal set. In general this would seem to be an impossible task, involving a search over the entire space of L₂(I). However, one way in which one can proceed is to choose a specific kernel and ask whether or not the set {K(x_n,ω)} is a complete orthogonal set for an arbitrary sample set {x_n}. This approach was taken by Beutler (1966), Yao and Thomas (1967), and Higgins (1976) (although they might not have explained their motivation in this fashion). They took as the reconstruction kernel function the standard Fourier transform kernel e^{jωx} and asked the question: under what conditions is the set {e^{jωx_n}} a complete orthogonal set? This question was first looked into by Paley and Wiener (1934), who showed that this set is closed⁴ if the sample set {x_n} obeys the following condition:

|x_n − n| < 1/π²    (3.4.10)

That is, if the sample set deviates by an amount less than 1/π² from the uniform sample sequence, then the set {e^{jωx_n}} is closed. Levinson (1940) provided a tighter lower bound on the maximum allowable sample deviation and stated that:

|x_n − n| < 1/4    (3.4.11)

Levinson claimed that this is the 'best possible' bound.

⁴ Closure in this context means that:

∫_I f(x) e^{jx x_n} dx = 0 for all x_n ∈ {x_n}    (3.4.9)

if and only if f(x) = 0 almost everywhere.
non-harmonic  Fourier  series  methods  Given  this  which basis  utilize these  set it can be  a  that,  sin[7r(x-x )]/[7r(x-x )] n  (1976) unique  Kronecker  another  by  reconstruction  shown that:  S (x)  the  n are not harmonically related,  shows  and <.|.>  set of complete  for g  =  n  e^° n x  the Lagrange  type  S (x)  =  n  that  biorthogonal  delta  (3.4.12)  n  for basis  every {g (x)}  set  such  n  indicates  orthogonal  basis  the  the resulting reconstruction  <g  product  kernel filter  for  n  that  inner  reconstruction  lg (x)} n  | g  m  >  a =  S  operation.  Hilbert n  m  ,  space  where  Thus  this  functions. F o r example,  6  there is  n m  results  Higgins  function corresponding to g  in  shows  is given  reconstruction function:  H(x)/[H'(x )(x-x )] n  (3.4.13)  n  where:  H(x)  Higgins  further  expression (1940)  =  (x-Xo)rT(l-x/x )(l-x/x_ ) rl  states  for  that  if the sample  H(x) can be  put into  also pointed out the existence  formula  for  S (x) n  (although  not  (3.4.14)  tl  sequence  closed  form.  x  n  is a  It  should  for  the  reasons  that  sample  sequences)  allow  a  wider  come  across  no  one could for other  range  extend  kernels  of different  studies  of such  this besides  sample an  method  (of  the Fourier sequences  approach,  to  however,  function of n, then the  be pointed  o f the biorthogonal sequence Higgins  derived by Y a o and Thomas (1967), albeit in a less elegant,  Presumably  rational  more  down the above  did). This  result  was  also  manner.  closure  conditions  transform  kernels. This  be used  for reconstruction.  and it  Levinson  and wrote  direct  determining  out that  was felt  to  would  be  on the perhaps  We  outside  have the  62  scope of this thesis to attempt our own study of this matter.  A for  different approach was proposed by Yen (1956) who provide reconstruction formula  a number of special  number  of constraints  sampling  cases of non-uniform sampling. 
His method  on the reconstructed  distribution, and setting  involved applying a  function value, obtained from the nature  up systems  of linear  equations  from  which  of the  the function  values can be solved. The special cases for which Yen provided reconstruction formulae are as follows.  1. -  Migration of a finite number of uniform sample points.  2. -  Sampling with a single gap in an otherwise uniform distribution.  3. -  Recurrent nonuniform sampling.  4.  -  Reconstruction  of time-limited, band-limited signals  from  arbitrarily distributed  samples. The  types  figure 3.3.  of sample  distributions implied  The reconstruction  down here. The interested  by the first three cases of Yen are depicted in  formulae that result are very complex and will not be written  reader  is referred to (Yen, 1956). These equations also appear in  the PhD thesis of F. Marvasti (Marvasti, 1973), who used them in an adaptive quantizer. He did not credit Yen with these equations, although he did include (Yen, 1956) in his list of cited literature.  Marvasti  did, however,  propose  some  interesting  methods  of his own in his thesis,  which we will now discuss. He presents a pair of techniques for 'uniformizing' a non-uniform sample sequence  so that the function can be reconstructed  with the W K S formula. The first  of these techniques involves estimating the function values at uniformly spaced positions based on  a finite number  of previous  (non-uniformly distributed)  function  samples.  To do this  estimation he used the method of (Yen, 1956) (again Marvasti did not cite this paper as the source  of this method)  which reconstructed  a time-limited and bandlimited function from its  samples (which can be arbitrarily distributed) in the interval where the function is non-zero.  63  FIGURE 3.3 The three special types of sample distributions handled by the reconstruction formula of Yen (1956). 
Although Marvasti does not mention it, it is clear that if the function whose samples are to uniformized are not time-limited, then this uniformization will not yield exact results. Hence the reconstructed function, after uniformization, will be in error. The second of Marvasti's uniformization schemes works by predicting the function values at the uniformly distributed  64 based  on the  (non-uniformly  information  provided  distributed) with  by  a  finite  a linear predictor. This  function values at uniformly distributed locations, states that  the error  number  in the prediction method  of  preceding  method  but with  function  produces  estimates of the  an added jitter  is greater than  samples  noise. Marvasti  that produced using  Yen's  time-limited function reconstruction formula. To conclude this section we will consider reconstruction methods which involve simple linear filtering of the sampled function (which can be modeled as an impulse train). Marvasti (1984) shows that, if the sampling function, S = E8(x-x ) is thought of as the zero-crossing f\'-oa  of the following F M signal:  FM  =  sin[w x + c  11  fp(x)dx]  (3.4.15)  and if u> is much larger than the bandwidth of p(x), then Q  passing  the sampled  obtained  function through  by Papoulis  (1966)  although  a lowpass his  the jitter  noise produced by  filter is negligible. A similar result was  derivation  proceeded  directly  from  the W K S  formulation. The essential conclusion of both Marvasti's and Papoulis' analyses is that the jitter error is negligble only if the deviation of the sample  positions  from the closest uniform  sequence was small enough. This is the same sort of condition imposed by the non-harmonic Fourier series methods. It  is clear  that  all of the methods  described  in this  previous section  constrain the allowed sample positions. Such techniques are therefore wherein  the sample  density  varies  all tightly  inapplicable to situations  significantly. 
In applying the stereo depth  measurement  algorithms described in this thesis to real life imagery, one is often faced with feature wherein the feature reconstruction  images  density varies considerably over the image. Thus, it is evident that the  methods just discussed are clearly inappropriate for our application. What we  require is a reconstruction method that allows reconstruction of functions from truly arbitrarily distributed sample sets. The next three sections describe the derivation and implementation of just such a method.  65  3.5 - The Transformation or Warping Method In two  this section  we  dimensions in the  develop a  one  next section,  ID Case  dimensional sampling theory,  that allows one, under certain  which we extend  conditions, to  to  reconstruct  bandlimited functions exactly from arbitrarily distributed samples. This theory is seen to be a generalization  of  the  analysis  of  Papoulis  (1966)  who  showed  how  the  standard  uniform  sampling theory of Whittaker, Shannon and Kotel'nikov could be extended to sample sequences that were result  slight deviations from a uniform sample sequence. We show how a more general  can  be  obtained  by  treating  a  non-uniform sample  sequence  as  resulting  from  a  coordinate transformation of a uniform sample sequence instead of being merely deviated from the uniform sequence.  Section 3.6  details how these coordinate  dimensional case. Section 3.7 section  3.6,  distributed  for  describes a heuristic algorithm, based on the theory developed in  performing  samples.  We  transformations can be determined in the two  will  two  dimensional  first  derive  a  function  reconstruction  non-uniform sampling  from non-uniformly theorem  for  the  one  dimensional case and then extend this result to the two dimensional case in the next section.  The (WKS)  starting  point  for  our  derivation  sampling theorem (Theorem 3.1).  
where t , the position of the n  n  by  T  the  classical Whittaker-Kotel'nikov-Shannon  Let us consider a non-uniform sample sequence  such that we end up with another  transformation to f(t),  function h(r)  =  7(t ), n  for  some  uniformly spaced (figure 3.4b)  arbitrary  to, then  the  samples  and we can use Theorem 3.1.  to be exact we must have that h(r)  be bandlimited to u  0  of  the  function h(r)  will of  be h(r)  7 r / T . If this is so, we can  then reverse the stretching/compression operation, and retrieve the reconstructed function f(t) by using the relationship:  is such that  For the reconstruction =  described  as shown in figure 3.4b,  with a sampling period of T units. If the transformation, 7 , between t and r U + nT  n  This function is sampled non-uniformly as shown at the  Now, suppose we apply a stretching/compressing  — 7(f),  {t 5  sample, is not necessarily equal to nT. For example, refer  to the function shown in figure 3.4a. locations t .  is  version of the  66  0)  b)  FIGURE 3.4 a) A function, f(t) sampled at non-uniformly distributed positions b) The transformed function, g(r), sampled at uniformly distributed positions.  f(t) =  h(7(t))  Substituting this relationship into (3.4.1) of Theorem 3.2, and using T =  (3.5.1)  7(t) yields  67  f(t)  Hence,  =  L  f ( t ) sin[wo(7(t)-nT)]/[a)o(7(t)-nT)]  (3.5.2)  n=-oo "  in order  to reconstruct  f(t) from its non-uniformly spaced  suffices to find the invertible and one-to-one  samples  function 7(f) such that 7(t ) = n  f (t ), it n  nT and then  to use (3.5.2).  The reconstruction  formula (3.5.2) is equivalent to the one derived by Papoulis (1966),  who treated the case of sample positions that were  deviated slightly from a uniform sample  distribution. However, his analysis indicated that this reconstruction would became are  always be subject  to an aliasing error  which became  would never be exact, but  smaller as the sample deviation  smaller. 
This conclusion is too pessimistic, however: it can be shown that there are cases for which the samples are not uniformly distributed and yet the reconstruction can be exact. The conditions under which an exact reconstruction can be obtained are discussed below.

In order for (3.5.2) to hold, the function h(τ) must be bandlimited to ω₀. Thus h(τ) is a member of the set B_ω₀, which is defined as the set of all functions whose Fourier transforms vanish for |ω| > ω₀. Let us define the set C_γ to be the set of all functions which are the image of a function in B_ω₀ under the transformation γ⁻¹. It can be seen that C_γ is the set of all functions that can be reconstructed exactly with (3.5.2), for a given γ. The set C_γ is clearly non-empty, and thus Papoulis' assertion that (3.5.2) is only approximately true for all functions is incorrect. Equation (3.5.2) is approximate only for those functions that are not members of C_γ.

An interesting point to be noted is that the functions in C_γ are generally not band-limited. That this is so can be seen by examination of the relationship between the spectra of h(τ) and f(t). It is possible to show that:

F(λ) = (1/2π) ∫ P(λ,ω)H(ω) dω    (3.5.3)

where the function P, which can be thought of as a frequency-variant blurring function, is defined by:

P(λ,ω) = ∫ e^{jωγ⁻¹(τ)} e^{-j2πλτ} dτ    (3.5.4)

That is, P(λ,ω) is the Fourier transform of the angle modulated signal:

p_ω(τ) = e^{jωγ⁻¹(τ)}    (3.5.5)

Thus, if p_ω(τ) is not bandlimited, as is usually the case for angle modulated signals, then generally f(t) will not be either.

One can show that there exists a transformation γ such that the FM signal defined by:

f(t) = e^{j∫₀ᵗ Ω(s)ds}    (3.5.6)

is a member of C_γ when Ω(s) is a positive, continuous function. Hence, such a function can always be reconstructed exactly, when sampled at the times t_n = γ⁻¹(nT), even though it is not strictly bandlimited.
Let us summarize the details of the above analysis in the form of a theorem:

Theorem 3.3 The non-uniform 1-D sampling theorem

Let a function f(t) of one variable be sampled at the points t = t_n, where {t_n} is not necessarily a sequence of uniformly spaced numbers. If a one-to-one continuous mapping γ(t) exists such that nT = γ(t_n), and if h(τ) = f(γ⁻¹(τ)) is bandlimited to ω₀ = π/T, then the following equation holds:

f(t) = Σ_{n=-∞}^{∞} f(t_n) sin[ω₀(γ(t)-nT)]/[ω₀(γ(t)-nT)]    (3.5.7)

This theorem can be generalized to include other orthogonal basis functions, in the same manner that Theorem 3.1 was generalized to Theorem 3.2. This generalization will not be explicitly stated here.

The reconstruction method described here can be thought of in a different manner. Consider a 'burst' type signal such as that shown in figure 3.5. Intuitively, we would expect that a uniform sampling of f(t) that allows an exact reconstruction would not be the most efficient sampling scheme. It seems reasonable to require a higher sample density in the central region, where there are high frequency components, than in the outer regions, where there are lower frequency components. Suppose that, at every point of the function f(t), we make a local estimate of its bandwidth, B(t). Then it would follow that we would have to sample at a rate of 2B(t) samples/unit time in order to allow an exact reconstruction of the signal. This conclusion has been reached (albeit from a different direction) by Horiuchi (1968), who derived the reconstruction formula (3.5.9) for a signal with a time varying bandwidth B(t) that is sampled at the points t_n given implicitly by:

t_n = n/(2B(t_n))    (3.5.8)

f(t) = Σ_{n=-∞}^{∞}
f(t_n) sin[π(2B(t)t-n)]/[π(2B(t)t-n)]    (3.5.9)

The derivative of the mapping function, ∂γ(t)/∂t, can be thought of as being proportional to the instantaneous sampling rate, and hence we have that:

∂γ(t)/∂t = (2π/ω₀)B(t)    or    γ(t) = k + ∫₀ᵗ (2π/ω₀)B(τ)dτ    (3.5.10)

FIGURE 3.5 A burst type of signal, with time-varying bandwidth

If the bandwidth B(t) is a constant (or approximately so over a given interval) then we can say:

γ(t) = (2π/ω₀)B(t)t    (3.5.11)

With this equation for γ(t) we can see that equations (3.5.2) and (3.5.9) are equivalent. If the bandwidth is not approximately constant then our equation (3.5.2) and Horiuchi's equation (3.5.9) are not equivalent.

Equations (3.5.8) and (3.5.10) tell us (implicitly) how to optimally sample a signal; that is, how to sample a signal with the smallest number of sample points while still allowing an exact reconstruction of the signal. Equation (3.5.9) suggests that we could interpret the reconstructed f(t) as the response of a time varying (or adaptive) lowpass filter, with bandwidth B(t), to the signal Σ_n f(t_n)δ(t-t_n). Thus one can envisage the following process. Given an arbitrary function f(t), we estimate its bandwidth as a function of time. We then integrate this bandwidth function, as in equation (3.5.10), to yield the warping function γ(t). Each time this warping function crosses an integer multiple of T, say nT, we sample f(t). We then store or transmit the sample f(t_n) along with the time at which the sample was taken, t_n. We can then, knowing all of the f(t_n) and t_n values, reconstruct f(t) using (3.5.7).

The preceding analysis is complicated by the fact that an exact 'local' bandwidth measure does not exist, as bandwidth is defined globally, being a frequency domain measure. Thus we can only obtain local bandwidth
'estimates', which may cause concern as to the validity of equations (3.5.8)-(3.5.10). In practice, however, the reconstruction formulas that are used are defined only over a finite region (the reconstruction series is truncated), and so one loses nothing by assuming the local bandwidth over this finite region to be the actual bandwidth.

In practice, the use of the above reconstruction theorem requires knowledge of the function γ(t) at all points t for which we desire a reconstruction. If an analytical expression for the sampling sequence t_n is known (e.g. t_n = s(n)) then we can simply extend this analytical expression to non-integer values (e.g. γ⁻¹(τ) = s(τ/T)). If no such analytical expression is available (as is usually the case), or if the analytical expression cannot be extended to non-integer values, then γ(t) must be found by interpolation between the known γ(t_n) points. The only constraint on this interpolation is that it must yield a γ that is one-to-one and invertible (monotonic).

We will now present an example showing the effectiveness of non-uniform sampling and reconstruction for signals with time varying bandwidth. Consider the following FM signal:

f(t) = sin[2πφ(t)]    (3.5.12)

where φ(t) is the phase function, defined in terms of the instantaneous signal frequency (bandwidth) as follows:

φ(t) = ∫₀ᵗ B(τ)dτ    (3.5.13)

Let us consider the case of a quadratic phase function, or linear bandwidth function. Specifically, let us define:

B(t) = t/20,000    (3.5.14)

and hence k₁ = 1/40,000, k₂ = 0 and k₃ = 0, i.e. φ(t) = t²/40,000. It can be seen that the Nyquist rate (for |t| ≤ 1000) is 1/10 samples per unit time. Thus the uniform sample sequence (for |t| ≤ 1000) is t_n = 10n.
In the non-uniform case, the sampling sequence is obtained from equation 3.5.8 as follows:

γ(t) = c∫₀ᵗ (2π/ω₀)B(τ)dτ = 2cφ(t)    (3.5.15)

The constant c is chosen so that there are 100 samples in the interval [0,1000] (the same as for the uniform case). We then have that:

γ(t) = ct²/20,000    (3.5.16)

We know that t_n = γ⁻¹(n); therefore we must have that γ(1000) = 100. This yields a value of 2 for c. We can now see that the sample sequence for the non-uniform case is given by:

t_n = 100√n    (3.5.17)

Note that, for n < 100, the value of |t_n - 10n| is greater than 10/4 (one quarter of the uniform sample spacing). This violates the condition (3.4.11) that the sample sequence must meet for the non-harmonic Fourier series methods to work. Thus those methods are inappropriate for this application.

The function in equation (3.5.12) was sampled at the uniform and non-uniform sample points, and then reconstructed using formulae (3.4.1) and (3.5.7). The summations in each of these reconstruction formulae were truncated to 21 terms. The error magnitudes of the two reconstructions are shown in figure 3.6, as a function of time (and hence of the signal bandwidth). Notice that the error for the uniform case rises as the signal frequency rises, because of the increased aliasing and truncation errors, while the error for the non-uniform case remains more or less constant, as expected. The total RMS error for 10<t<990 is 0.0982 for the uniform case and 0.0706 for the non-uniform case.

FIGURE 3.6 The reconstruction of a chirp signal for uniform and non-uniform sampling.

Let us now examine the extension of the above one dimensional sampling theory to the case of two dimensions.

3.6 - The Transformation or Warping Method - 2D Case

As in the one dimensional case, the development of a two dimensional non-uniform sampling theory begins with a consideration of the uniform sampling theory.
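The chirp experiment can be reproduced with a short script. This is a sketch, not the thesis code: the 21-term truncation follows the text, but the evaluation grid and the rule for choosing which 21 samples enter the truncated sum are my assumptions.

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(x) / x

def reconstruct(t, t_samples, f_samples, gamma, T, n_terms=21):
    """Truncated form of (3.5.7): keep the n_terms samples whose warped
    positions gamma(t_n) lie nearest to gamma(t)."""
    w0 = math.pi / T
    tau = gamma(t)
    order = sorted(range(len(t_samples)),
                   key=lambda n: abs(gamma(t_samples[n]) - tau))
    return sum(f_samples[n] * sinc(w0 * (tau - gamma(t_samples[n])))
               for n in order[:n_terms])

phi = lambda t: t * t / 40000.0                  # eqs (3.5.13)-(3.5.14)
f = lambda t: math.sin(2.0 * math.pi * phi(t))   # the chirp, eq. (3.5.12)

t_uni = [10.0 * n for n in range(101)]               # uniform, t_n = 10n
t_non = [100.0 * math.sqrt(n) for n in range(101)]   # eq. (3.5.17)
g_uni = lambda t: float(t)         # no warping; sample period T = 10
g_non = lambda t: t * t / 10000.0  # eq. (3.5.16) with c = 2; period T = 1

points = range(10, 991, 7)

def rms_error(t_samples, gamma, T):
    f_samples = [f(x) for x in t_samples]
    errs = [(reconstruct(t, t_samples, f_samples, gamma, T) - f(t)) ** 2
            for t in points]
    return math.sqrt(sum(errs) / len(errs))

rms_uni = rms_error(t_uni, g_uni, 10.0)
rms_non = rms_error(t_non, g_non, 1.0)
```

As in figure 3.6, the uniform-grid error grows as the chirp approaches its Nyquist rate, while the warped series keeps a roughly constant residual.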
The theory behind the reconstruction of functions of two variables from uniformly distributed samples of these functions was developed by Petersen and Middleton (1962). Mersereau (1979) and Mersereau and Speake (1983) have studied the more general problem of processing multidimensional signals that have been sampled on uniform lattices, especially hexagonal lattices, which have added importance in this thesis. We are concerned here only with signal reconstruction, but it is evident that the type of signal processing techniques described by Mersereau and Speake can be extended, using the results of this thesis, to the case of non-uniform sampling. The essentials of the work of Petersen and Middleton (1962) are summarized in Theorem 3.4. This theorem describes the conditions under which a function of two variables, f(x), can be reconstructed exactly from its samples taken at points on a uniform lattice; it basically extends the one dimensional uniform sampling theorem (Theorem 3.1) to two dimensions.

Theorem 3.4 The uniform two dimensional sampling theorem

Suppose that a function of two variables f(x) is sampled at points in the infinite sampling set {x_s} defined by:

{x_s} = {x: x = l₁v₁ + l₂v₂,  l₁,l₂ = 0,±1,±2,±3,...,  v₁ ≠ kv₂}    (3.6.1)

The vectors v₁ and v₂ form the basis for the sampling lattice defined by the points in {x_s}. Such a sampling lattice is shown in figure 3.7 for v₁ = (2/√3, 0) and v₂ = (1/√3, 1). Furthermore, let the support of the Fourier transform F(ω) of f(x) be bounded by the region R in ω space.
The spectrum of the sampled function, f_s(x) = Σ_s δ(x-x_s)f(x), is made up of an infinite number of repetitions of the spectrum F(ω), and is given by F_s(ω) = Σ_s F(ω+ω_s), where the set {ω_s} is defined by:

{ω_s} = {ω: ω = l₁u₁ + l₂u₂,  l₁,l₂ = 0,±1,±2,±3,...,  u₁ ≠ ku₂}    (3.6.2)

FIGURE 3.7 The hexagonal sampling lattice for functions with isotropic spectra

and where the frequency domain basis vectors u₁ and u₂ are related to the spatial domain basis vectors v₁ and v₂ by:

u₁ᵀv₁ = u₂ᵀv₂ = 2π,    and    u₁ᵀv₂ = u₂ᵀv₁ = 0    (3.6.3)

If F(ω) = 0 wherever F(ω+ω_s) ≠ 0 (for every ω_s ≠ 0), then the spectral repetitions do not overlap, and the following equation holds:

f(x) = Σ_s f(x_s)g(x-x_s)    (3.6.4)

where g(x) is the inverse Fourier transform of the lowpass filter function G(ω) defined by:

G(ω) = Q,            ω ∈ R
     = arbitrary,    ω ∉ R and ω-ω_s ∉ R for every ω_s
     = 0,            ω-ω_s ∈ R for some ω_s ≠ 0    (3.6.5)

Q is a constant that is equal to the area of each of the sampling lattice cells, and is the inverse of the sample density. In terms of the sampling lattice basis vectors, v₁ and v₂, Q is given by:

Q = [|v₁|²|v₂|² - (v₁ᵀv₂)²]^(1/2)    (3.6.6)

Now, following the lead of the analysis performed in section 3.5 for the one dimensional case, let us introduce a second function of two variables, h(ξ), that is the image of f(x) under the coordinate transformation:

ξ = γ(x)    (3.6.7)

i.e.

f(x) = h(γ(x))    (3.6.8)

Let the coordinate transformation, γ, be such that the set of non-uniformly spaced points {x_s} is transformed into a uniformly spaced set of samples {ξ_s} (such as the set defined by equation 3.6.1). Since {ξ_s} contains samples that lie on a regular lattice, the function h(ξ) can be reconstructed, and the reconstruction will be exact under the conditions of Theorem 3.4. Once h(ξ) has been reconstructed, f(x) can be obtained by reversing the coordinate transformation (3.6.7).
This is the basis of our non-uniform two dimensional sampling theorem, which is stated below as Theorem 3.5.

Theorem 3.5 The non-uniform two dimensional sampling theorem

Suppose that a function of two variables f(x) is sampled at points in the infinite set {x_s}. Now, if there exists a one-to-one continuous mapping γ such that:

ξ = γ(x)  and  ξ_s = γ(x_s)    (3.6.9)

and if the function defined by:

h(ξ) = f(γ⁻¹(ξ))    (3.6.10)

satisfies the conditions of Theorem 3.4, then the following is true:

h(ξ) = Σ_s h(ξ_s)g(ξ-ξ_s)    (3.6.11)

where g(ξ) is as defined by (3.6.5). Hence:
process  Unlike  case,  know  we wish the values the  is  1962) that the most efficient sampling lattice (in J a  hexagonal  is the one shown (3.6.13), the values  we are to use equation  dimensional  points once  an  and Middleton,  lattice and, by equation  If one  (Petersen  (3.6.12)  the mapping  to obtain  in  figure  one dimensional  at  are also  as our reconstruction  function y(x)  the sample  case,  with  however,  a  characteristic  3.7. T h e values  o f 7(xg)  a reconstruction.  o f 7(x)  lattice  of £  g  formula,  it  is  are fixed by  we must,  points  interpolate  not a  trivial  as in the  and at all other  A s in the one dimensional {xg},  of  fixed.  at all sample  points  spacing  to matter  case we can, find to  7(x) for obtain  a  80  mapping, r-{xs}->{^s}, between the sample sets in x and  continuous  mapping  space, that yields a one-to-one  mapping function 7 . In the one dimensional case one and only  exists  (restricting  interpolation of 7(t ) = fi  such  and \  the  sign  of the derivative  of 7  to  one such  be positive),  given by  nT, but in two dimensions there may be, in general, any number of  mappings. The difficulty lies in the fact that there is no general  scheme for ordering  arbitrarily distributed points in two dimensions analogous to the sequential ordering available in one dimension, such that adjacency  properties are preserved.  For the purpose of the following discussion let us make the following definitions.  Definition Partition A partition of a planar region R is a set of line segments,  called Links, that divide  R into a number of distinct, possibly overlapping subregions. The endpoints of these Links are called  the Vertices  of the partition.  There  can be no free  vertices  (i.e.  those  vertices  respect to the point set {J^  is the  belonging to only one Link) in a partition except at the boundary of R.  Definition Tessellation A tessellation is a partition whose regions do not overlap.  
Definition Voronoi Tessellation
The Voronoi tessellation of the ξ plane with respect to the point set {ξ_s} is the tessellation whose Links consist of the points equidistant from two points ξ_i, ξ_j ∈ {ξ_s} and no closer to any other point in {ξ_s}. The vertices of the Voronoi tessellation are those points equidistant from three or more points in {ξ_s} and no closer to any other point in {ξ_s}. The subregions created by the Voronoi tessellation each contain all points closer to a given point in {ξ_s} than to any other point in {ξ_s}.

Definition Dirichlet Tessellation
The Dirichlet tessellation (sometimes referred to as the Delaunay triangulation) is the dual of the Voronoi tessellation. The Dirichlet tessellation with respect to {ξ_s} can be thought of as the set of line segments connecting the points in {ξ_s} to their nearest neighbours. An example of the Voronoi and Dirichlet tessellations is shown in figure 3.8. Further discussion of Dirichlet and Voronoi tessellations can be found in Ahuja and Schacter (1983). The tessellation created by connecting the points of the hexagonal sampling lattice is a Dirichlet tessellation, and as such its regions are non-overlapping triangles. We will denote this particular tessellation by D_ξ.

Definition P-Adjacency
Two points are defined to be P-Adjacent, with respect to a partition P, if they share
a common Link of P.

Definition Adjacency Conserving Partition Mapping (ACPM)
A mapping, Γ:{x_s}→{ξ_s}, is termed an ACPM if it takes a partition having vertex set {x_s} into a partition having vertex set {ξ_s} such that the points in {x_s} have the same P-adjacency as their images in {ξ_s}.

As a result of the preservation of adjacency properties, a region in a partition has the same number of Links as the corresponding region in its image under an ACPM. Also, since each region of D_ξ is triangular, the regions of the image of D_ξ under any ACPM are also triangular (although possibly overlapping). Note also that the inverse of an ACPM is itself an ACPM.

FIGURE 3.8 The Voronoi and Dirichlet tessellations for a set of points

Definition Generalized Hexagonal Tessellation
A Generalized Hexagonal Tessellation (GHT) is a tessellation created by applying an ACPM to D_ξ. All interior vertices of a GHT are the junction of six Links.

As was said at the beginning of this section, once we have a mapping, Γ, between the points in {x_s} and the points in {ξ_s}, we can determine the mapping function, γ(x), for x not necessarily a member of {x_s}, by interpolation of the known values. The fact that the interior regions of P_x (the image of D_ξ under Γ⁻¹) are triangular suggests that, given a point x in the interior of one of these regions, the value ξ = γ(x) should be a linear combination of the values of γ at the three vertices of this region. Such a linear combination can be written as:

γ(x) = Σ_{i=1}^{3} γ(x_i)I(x, x_i, x_{(i+1)}, x_{(i+2)})    (3.6.15)

where V(x) = (x₁,x₂,x₃) is the vertex set of the region of P_x containing x, and where the parenthesized subscripts denote cyclic indexing, (i) = (i mod 3) + 1. The function I(x, x₁, x₂, x₃) is some interpolation function which results in an invertible γ.
The simplest such interpolation that we can do between three non-collinear points is a Trilinear Interpolation, which fits a planar (vector valued) surface to these three points. The interpolation function for this method is given in equation (3.6.16). This equation describes a plane passing through the points (x₁¹,x₁²,1), (x₂¹,x₂²,0) and (x₃¹,x₃²,0):

I(x,x₁,x₂,x₃) = [x¹(x₂²-x₃²) + x²(x₃¹-x₂¹) + (x₂¹x₃²-x₃¹x₂²)]/Δ    (3.6.16)

where

Δ = x₁¹(x₂²-x₃²) + x₁²(x₃¹-x₂¹) + (x₂¹x₃²-x₃¹x₂²)    (3.6.17)

and x = (x¹,x²); superscripts denote vector components and subscripts label the vertices.

We can now state the following theorem, which supplies the conditions under which a given one-to-one (transformation) point set mapping will yield the one-to-one and continuous mapping function that is required for equation (3.6.11) to be valid.

Theorem 3.6 Invertibility of a mapping

Given the Dirichlet tessellation D_ξ, as defined above, and the trilinear interpolation defined by equation (3.6.16), then, if there is an ACPM, Γ⁻¹, such that the image of D_ξ, P_x = Γ⁻¹(D_ξ), is a GHT, and the points in the set V(x) are not collinear, the mapping γ(x) defined by (3.6.15) is one-to-one and continuous.

The key condition in Theorem 3.6 is that the image of D_ξ under the ACPM Γ⁻¹ be a GHT. Thus, in order to perform the reconstruction of a function given a sample set {x_s}, we need first to find the mapping Γ and the tessellation P_x. It is suspected that a GHT can be created from any set of points {x_s}, but we have no proof of this conjecture, and there are cases in which the construction runs into trouble when trying to order the points. For example, consider the set of points shown in figure 3.9a, which is made up of points that lie along radial lines; this type of sample point set is found in X-ray tomography (e.g. see figure 3 of Pan and Kak, 1983). We try to create a GHT by mapping points of this set to the hexagonal lattice of figure 3.7; the result of this process is shown in figure 3.9b. As we proceed beyond a certain region, the regions of the tessellation become thinner and the ordering of the points becomes less regular, and it turns out that, using our construction, it is not possible to create a GHT from this point set without creating overlapping regions. This, of course, does not prove that the point set cannot be mapped, as there are other ways of trying to construct a GHT from the set, one of which may work.
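The trilinear interpolation (3.6.16)-(3.6.17), combined in the cyclic sum (3.6.15), can be sketched as follows (a minimal illustration; the function names are mine):

```python
def tri_weight(x, x1, x2, x3):
    """Equation (3.6.16): the planar interpolant that is 1 at x1 and 0 at
    x2 and x3. Each point is a pair (x^1, x^2); x is the evaluation point."""
    delta = (x1[0] * (x2[1] - x3[1]) + x1[1] * (x3[0] - x2[0])
             + (x2[0] * x3[1] - x3[0] * x2[1]))          # eq. (3.6.17)
    if delta == 0:
        raise ValueError("vertices are collinear")
    return (x[0] * (x2[1] - x3[1]) + x[1] * (x3[0] - x2[0])
            + (x2[0] * x3[1] - x3[0] * x2[1])) / delta

def interp_gamma(x, verts, gamma_at_verts):
    """Equation (3.6.15): interpolate the (vector valued) mapping gamma
    inside the triangle with vertices verts = (x1, x2, x3), given its
    values gamma_at_verts = (xi1, xi2, xi3) at those vertices."""
    out = [0.0, 0.0]
    for i in range(3):
        w = tri_weight(x, verts[i], verts[(i + 1) % 3], verts[(i + 2) % 3])
        out[0] += w * gamma_at_verts[i][0]
        out[1] += w * gamma_at_verts[i][1]
    return tuple(out)
```

Since tri_weight is 1 at x₁ and 0 at x₂, x₃, and the three weights sum to one, interp_gamma reproduces the prescribed vertex values γ(x_i) = ξ_i and is continuous across shared triangle edges; invertibility additionally requires the GHT condition of Theorem 3.6.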
Even if a GHT could be found for this type of sample point set, the thinness of the regions of the GHT so found would cause problems with the interpolation process.

However, if we truncate the reconstruction formula (3.6.12) to take into account only those samples in a finite region about x, then it would not be necessary for γ to be one-to-one and continuous everywhere. The mapping function need be one-to-one and continuous over only this restricted region. This would mean that we would need to find only a partition that was locally a GHT. In this way it is
The price we pay for  86  this weakening of Theorem 3.5 is that the reconstruction  is no longer exact, even if h ( £ ) is  suitably bandlimited. In practice such a truncation of the reconstruction  equation is unavoidable  as one can only process a finite number of samples in a finite time. Let the finite set of sample points used to reconstruct f(x) be }x } e{x }. Note that s  for different reconstruction  points x  we may have  different sets {x } . g  0  0  s  When only a finite  number of terms are used in equation (3.6.12) the resulting value of f(x ) will not be exact 0  but will be subject to a 'truncation'  error term. This truncation  error is defined in equation  (3.6.18) and can be bounded as shown in equation (3.6.19).  e (x) 2  t  = |f(x>f (x)|> R  = |S ftx ) (wV3)J,(|7(xh7(xs)|)/(Tr|7(x>-7(x )|)| (3.6.18) s  J  s  In finding this bound we have used the Triangle Inequality and the fact that | Jj(x)| <l/j/2 (Abramowitz and Stegun, 1965). The above bound suggests that we make the distances in  {fgJo  as large as possible for £ e { ? } . s  0  between J  In other words, {? } s  0  and all points not  should consist  of the  N  g  points closest to J. How one  are we to determine  constraint  on {x" }  s 0  which points {x 3  by requiring  continuously into the Dirichlet tessellation from the points in {x } g  In order  0  s  that  0  map into J ^ o ? Theorem 3.6 provides  the partition  PXQ  be a tessellation  and  map  D^Q, where PXQ and D^Q are the partitions formed  and (%Jo respectively.  for the interpolation of y (equation  3.6.15) to be valid we must  stipulate  that x be contained in one of the regions of PXQ. This also means that ~% lies in one of the regions of U-^ 0  87  Theorem  3.6  requires  that  H(^) = 0  when  H(X+^ )^0 S  for  every  ^ ^0. s  condition is not satisfied the value of f(x) obtained with equation (3.6.12) will  If  this  be in error.  This error is referred to as aliasing error. 
It can be shown that if the aliasing errors due to two  different sample  have  distributions are compared, the distribution with the higher density  the lower aliasing error.  aliasing  error  Now, it can be shown that,  of the reconstruction  maximum  minimum  possible  possible  density.  aliasing  This  error,  of {x }  g 0  increases.  This  that has, in addition to the above mentioned conditions,  s 0  a  for a given f(x), the localized  (3.6.12) decreases as the density  suggests that we should select {x* }  will  will or,  ensure  that,  alternatively,  for a given will  give  us  f(x), the  we will maximum  obtain  a  allowable  bandlimit that a function can possess while still yielding an exact reconstruction.  We can summarize the conditions on S x l s  The  set {x }  g 0  must  be such  that  there  0  as follows:  exists a tessellation  PXQ  that  can be continously  mapped into the Dirichlet tessellation DNQ.  The o f  P  The  point x  at which the function is to be reconstructed  must lie within one of the regions  x0-  density  of the set i x ^ o must  be the maximum  possible  subject  to the above two  constraints. In truncation  general,  finding  errors is very  the  optimal  mapping  that  difficult In the next section  jointly we will  minimizes present  the  aliasing and  a heuristic algorithm  which is near optimal for homogenous sample distributions. This algorithm guarantees finding a mapping which locally satisfies the conditions of Theorem 3.6.  88  3.7 - Implementation of the 2D Based on the  Transformation Method  foregoing discussions,  we propose  This algorithm finds, for a given point x  in that  the  truncation  and  following reconstruction  and sample set {xgL  locally satisfies the conditions of Theorem 3.6. sub-optimal,  the  a subset {x^o  algorithm.  of {xg}  that  It will be seen that this algorithm is generally  aliasing  errors may  not  be  the  minimum possible  value. 
For homogeneous sample distributions, however, this algorithm will be optimal.

The motivation behind the algorithm is as follows. In the application that initiated this study, the sample distributions were non-uniform but homogeneous; this is the case in many applications. Thus the sample points {x_s} could be thought of as arising from the perturbation of a regular sample set, such as the hexagonal lattice. Our algorithm assumes that this perturbation is small enough that a given point on the original hexagonal lattice will remain somewhere in a 60° sector about its original position. See figure 3.10 for an example of such a perturbation.

The algorithm begins by trying to find the point in {x_s} closest to the point x₀ at which we wish the reconstruction. We take this point as the perturbed value of the 'original' centre point of the hexagonal lattice; this centre point is denoted by ξ₀ in figure 3.11, and the closest point found in {x_s} is mapped to ξ₀. In the particular algorithm described here we use N_g = 7 (the centre point and its six neighbours); algorithms can be devised for higher values of N_g, but they become increasingly more difficult. Once we have the point x₀ which maps into ξ₀, we must find the other N_g - 1 points of {x_s}₀. Because we have assumed that the points in {x_s}₀ result from slight perturbations, within 60° sectors, of the points in {ξ_s}₀, and because we want {x_s}₀ to consist of the N_g points closest to x₀, we can use the following heuristic procedure. Divide the space about x₀ into six regions, each consisting of a 60° sector, as shown in figure 3.10. Find the point in each of these sectors closest to x₀. These points are the other points in {x_s}₀.
These  points  are  89  X Space  4  +-X FIGURE 3.10  The operation of the mapping heuristic for N =7 G  This algorithm is described procedurally in a pseudo high level language below, procedure  (*  RECONSTRUCT({  x 3,{f(x )} g  g  ,x ,f(x))  To reconstruct the value of a function f ( x ) at a point x  function samples {Xg}. It is assumed that N = 7 and that {xt g  begin IF x = X j C { x s }  THEN  f(x) = f ( x p ELSE Starts earch  =  x  (• Start the spiral search at x *)  given an arbitrary set of is homogenous. •*)  90  4 Space  FIGURE 3.11 The sample locations in 1  for N = 7 g  FindNearestNeighbor(x ,{x } Jc ,StartSearch) 0  g  c  (* Look for the centre point of ! x ^  0  *)  FindMapping(x ,lx g} ,{x ] ,\1 J o) 0  g 0  (* Find the mapping between [x^0  and {J^0 *)  InterpolateMapping(x „,f ,1 x ]0,{I g3 o) g  0  (* Interpolate  to find the mapping of x  into J  and (3.6.16)) •) f(x)  =  0  FOR i =  1 TO N  g  DO  BEGIN (* Compute the reconstruction sum *)  f(x) = f(x)+g(?I )f(x i ) i  (e.g. using equations (3.6.15)  91  ENDFOR ENDELSE  endproc The  nearest  neighbor  finding  procedure  'FindNearestNeighbor' can  number of ways. For example, the efficient spiral search  be  technique of (Hall,  done  in a  1982), modified  to search over monotonically increasing distances was used in the examples described later in this thesis. This modification is described in the appendix. The given  mapping procedure 'FindMapping' determines the mapping between {x }o and {"£ } g  the sample  set {xg}. This procedure  is detailed in the following  g  0  pseudo high level  program.  procedure FindMapping(x0,{x ^ ,{xg}0,{| g}0)  I. 
= (0,0)
  StartSearch = x₀
  (* Map x₀ to ξ₀, the centroid of {ξ_s}₀ *)
  Found1 = Found2 = Found3 = Found4 = Found5 = Found6 = false
  while Found1 = false or Found2 = false or Found3 = false or
        Found4 = false or Found5 = false or Found6 = false do
  (* While the N_g points in {ξ_s}₀ have not all been assigned, do the following: *)
  begin
    FindNearestNeighbor(x_n, {x_s}, x₀, StartSearch)
    (* Perform a spiral search, starting from StartSearch, to find the point
       x_n closest to x₀ but no nearer than StartSearch. *)
    if (-30° < Angle(x_n, x₀) ≤ 30°) and (Found1 = false) then
    (* Determine whether or not x_n is in the ξ₁ sector. If so, assign x_n to ξ₁. *)
      ξ₁ = (2/√3, 0)
      x₁ = x_n
      Found1 = true
It is evident that Theorem 3.6, following the analysis of Petersen and Middleton (1962) for the uniform case, can be directly extended to higher dimensions, merely by increasing the dimensionality of the functions and variables involved. The search procedure, outlined above, can be similarly extended to allow the determination of the mapping γ for any dimension. From this, γ can be determined by fitting an n-dimensional hyperplane to n+1 points, where n is the dimensionality of the sampled function. The efficiency of the mapping heuristic can be expected to fall as the dimension increases, however. Examples of two dimensional function reconstruction can be found in (Clark, Palmer and Lawrence, 1984).

FIGURE 3.12 The relation of the heuristic mapping efficiency, in terms of sample density, to the shape of the sample distribution: a) isotropic distribution; b) anisotropic distribution.

3.8 - Including Surface Gradient Information in the Reconstruction Process

In reconstructing the shapes of surfaces in practice, the height (or depth or surface amplitude) information that is available is frequently sparse and noisy. In these cases better surface reconstructions can be performed if one can obtain some other independent measures of the surface shape. For example, a surface shape descriptor that is often available is the value of the surface gradient vector. An example of such a measurement is given by the diffrequency measurement described in chapter 7 of this thesis. Other means by which surface gradient information can be obtained include photometric stereo (Woodham, 1978). The question one then asks is: how can the information about the surface
gradient be combined with the depth measurements to perform the surface reconstruction? Grimson (1984) considers this question and provides a numerical method for obtaining measurements of the surface gradient along zero crossing contours, with an eye to incorporating this information into his surface reconstruction algorithm (see the discussion in chapter 3.3). Unfortunately he did not provide any method for performing this incorporation.

Ikeuchi (1983) also talks about combining depth gradient measurements obtained using shape from shading with depth measurements obtained from a stereo algorithm into a depth map. He proposed a scheme whereby the relative depth of a surface is obtained from the depth gradient (or surface normal) information by minimizing the squared difference between the reconstructed gradient and the measured (sampled) gradient over the surface. An absolute depth function is then obtained by using the amplitude information to determine the depth offset. It can be seen, however, that only one depth measurement is sufficient, in principle, to specify this depth offset, and Ikeuchi uses only one. The rest of the depth measurements could provide a more accurate estimate of this offset. This leads to the conclusion that Ikeuchi's technique does not take advantage of all of the information available in the depth amplitude information, but relies inordinately on the gradient information.

In this chapter we show how the transformation reconstruction method described in the previous section can be extended to incorporate depth gradient information. This will be done for the two dimensional case only; the cases for other dimensions can be obtained in a similar manner. It will be seen that both the amplitude and gradient information are ascribed equal importance in performing the reconstruction.

The extension of the two dimensional transformation reconstruction method which allows the incorporation of gradient information is based on the theory developed by Petersen and Middleton (1964) for the case of uniform sampling.
They show that, for uniform sampling, using both amplitude and gradient information at each sample point, one requires only one third the sample density for exact reconstruction compared with the case in which amplitude information only is used. Conversely, this means that functions with three times the bandwidth can be reconstructed exactly with the same sampling positions.

The reconstruction theorem for the uniform sampling, amplitude plus gradient case is stated below as theorem 3.7 and is due to Petersen and Middleton (1964).

Theorem 3.7 The uniform two dimensional sampling theorem for amplitude and gradient sampling on a hexagonal grid

Suppose that a function of two variables f(x), with isotropic spectra and bandlimited to 2πB radians, and its gradient ∇f(x) are sampled at points in the infinite hexagonal sampling set {x_s} defined by:

  {x_s} = {x: x = l₁v₁ + l₂v₂, l₁,l₂ = 0, ±1, ±2, ±3..., v₁ ≠ kv₂}    (3.8.1)

where:

  v₁ = (√3/(2B), −1/(2B)) and v₂ = (0, 1/B)    (3.8.2)

The corresponding frequency domain spectral repetition lattice basis vectors are given by:

  u₁ = (4πB/√3, 0) and u₂ = (2πB/√3, 2πB)    (3.8.3)

The result of this frequency domain repetition lattice is that the spectral repetitions exhibit a threefold overlap. The addition of the gradient information insures that this overlap does not result in any aliasing errors. If the above conditions hold then the following equation is true:

  f(x) = Σ_s [ f(x_s)g(x − x_s) + ∇f(x_s)·K(x − x_s) ]    (3.8.4)

where g(x) is given by the following expression:

  g(x) = [6√3/(2πB)³] {sin(2πBx₁√3)[cos(2πBx₂) − cos(2πBx₁/√3)]} / [x₁(x₂² − 3x₁²)]    (3.8.5)

and:

  K(x) = xg(x)    (3.8.6)

The non-uniform sampling theorem for amplitude and gradient sampling can be derived from theorem 3.7 in the same manner as the amplitude only theorem was obtained in chapter 3.6.
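The sampling set of Theorem 3.7 can be generated directly from (3.8.1) and (3.8.2). A small sketch follows; the finite index range lmax is an artifact of the sketch (the theorem's set is infinite). A useful property to note is that the nearest-neighbour spacing of the resulting lattice is 1/B.

```python
import math

def hex_lattice(B, lmax):
    # Points x = l1*v1 + l2*v2 of the hexagonal sampling set (3.8.1),
    # with basis vectors v1, v2 from (3.8.2), for |l1|, |l2| <= lmax.
    v1 = (math.sqrt(3.0) / (2.0 * B), -1.0 / (2.0 * B))
    v2 = (0.0, 1.0 / B)
    return [(l1 * v1[0] + l2 * v2[0], l1 * v1[1] + l2 * v2[1])
            for l1 in range(-lmax, lmax + 1)
            for l2 in range(-lmax, lmax + 1)]
```

Both basis vectors have length 1/B, so the sample spacing shrinks as the bandwidth B grows, as expected.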
Thus the non-uniform sampling theorem is only stated here (as Theorem 3.8) and not explicitly derived.

Theorem 3.8 The non-uniform two dimensional sampling theorem for amplitude and gradient sampling on a hexagonal grid

Suppose that a function of two variables f(x) and its gradient ∇f(x) are sampled at points in the infinite set {x_s}. Now, if there exists a one-to-one continuous mapping γ such that:

  ξ = γ(x) and ξ_s = γ(x_s)    (3.8.7)

and if the function defined by

  F(ξ) = f(γ⁻¹(ξ))    (3.8.8)

satisfies the conditions of Theorem 3.6 then the following is true:

  F(ξ) = Σ_s [ F(ξ_s)g(ξ − ξ_s) + ∇F(ξ_s)·K(ξ − ξ_s) ]    (3.8.9)

where g(x) and K(x) are as defined by equations (3.8.5) and (3.8.6). Hence we have, after applying the transformation:

  f(x) = Σ_s [ f(x_s)g(γ(x) − γ(x_s)) + (Jᵀ∇f(x_s))·K(γ(x) − γ(x_s)) ]    (3.8.10)

where J = ∂x/∂γ is the Jacobian of the inverse transformation γ⁻¹(ξ).

The heuristic mapping algorithm described in chapter 3.7 can be used to implement the amplitude and gradient sampling and reconstruction process as well as the amplitude sampling only reconstruction process. The importance of theorem 3.8 is that it shows the existence of a method by which a two dimensional function can be reconstructed using non-uniformly distributed information about both its amplitude and its gradient.

3.9 - Summary of chapter 3.

- The disparity function must be reconstructed as accurately as possible from its samples at each resolution level.
- The reconstruction process must handle arbitrarily distributed samples.
- Grimson's surface reconstruction method (relaxation method) is based on his surface consistency constraint, which we show to be invalid for smoothed (low pass filtered) images.
- Current reconstruction methods based on sampling theory can not handle arbitrarily distributed samples.
- A warping or transformation method is introduced which can handle arbitrarily distributed samples.
- This method is shown to have the
ability to incorporate independent surface gradient information into the reconstruction process.

IV - ERROR ANALYSIS OF DISCRETE MULTIRESOLUTION FEATURE MATCHING

4.1 - Sources of Error in Discrete Multiresolution Feature Matching

In this chapter we analyze the various sources of error that are involved in discrete multiresolution feature matching. We also provide a general model of discrete multiresolution feature matching systems which will enable us to estimate the relative effects of the different types of errors on the accuracy, and more importantly from a procedural point of view, the stability of the matching algorithms. This has not been done, except at the most superficial level, by other researchers in this area. Without such an analysis as the one presented here, one cannot fully predict the performance of a given matching algorithm. For example, Marr and Poggio (1979) examined in detail only one of the errors discussed here, that of mismatching of 'ghost' features, and did not take into account the other sources of error which, as we will see, are as important, if not more so.

The topics discussed in this chapter are summarized in block diagram form in figure 4.1. The sources of error in a general discrete multiresolution feature matching process are listed below, with their specific characteristics and effects.

1. Sensor Noise - (section 4.2)

The image sensors that are used in practical vision systems, such as video cameras or biological retinae, are non-ideal devices, and as such produce signals which are contaminated with various types of noise. In the case of video cameras the types of noise that may be present include thermal noise of the electronic circuitry used to process the video signal, shot noise due to the random nature of the electron beam in the vidicon tube, 1/f noise due to surface effects on the vidicon target, imperfections or blemishes in the optics of the camera, and so on.
The main effect of these noises is to produce shifts in the parameters of the measured features. For example, in the case of zero crossing features, the sensor noise causes a random shift in the measured position of the zero crossings. This would then cause an error in the value of disparity that was derived from these zero crossing positions.

FIGURE 4.1 The topics covered in chapter 4.

2. Filtering Error - (section 4.3)

It can be shown that, if, in the feature construction process, one performs spatial filtering of the image (as is the case in all multiresolution systems), there will, in general, be a non-corresponding change (i.e. a change in one image that is uncorrelated with a change in the other image) in the position of corresponding features in a stereo pair. Specifically, this means that if a stereo camera pair is viewing a surface that gives rise to a non-constant disparity (such as a tilted surface), the disparity derived from the difference of the measured positions of corresponding zero crossings will not be the true disparity.
This is due to the fact that, in this case, the spatial frequency content of the two images is different, but the images are being filtered with the same filter. Thus corresponding features, which have different spatial characteristics, will be, when filtered, altered in a non-corresponding fashion. This type of error has not been previously described in the computer vision literature. We will use the scale space transform described in chapter 2 to analyze this error.

3. Reconstruction Error - (section 4.4)

In any discrete multiresolution stereo system it is necessary to have a complete disparity function at all resolutions in the system. Since the matching process does not (in most cases) produce disparity values at all the points in our working space (but only at the positions where the features of the two images were matched), one must perform a disparity function reconstruction to fill in all the required missing disparity values. As we have seen in chapter 3.6, if the true disparity function is suitably bandlimited (for a definition of what suitably bandlimited means in this context see chapter 3.6) then the disparity function can be, in principle, reconstructed exactly from a set of its samples. However, this is only true if we have available an infinite number of samples, and if the density of this sample set is sufficiently high. In practice neither of these conditions is always attained, and as a result, the reconstruction process will yield a disparity function that is only an approximation to the true disparity function. Hence the reconstructed disparity function is the true disparity function that has added to it a reconstruction error.

4. Matching Error - (section 4.5)

Features which are to be matched are generally primitive features, as explained in chapter 2. As such they are to some extent ambiguous, meaning that it is not always possible to determine correspondences between images.
The various types of matching algorithms that have been proposed try to reduce this ambiguity by making use of some of the structural properties of the features. For example, the Marr-Poggio matching of zero crossing features utilizes the fact that zero crossings can not get too close together. However, no matter how elaborate the matching algorithm is, there will always be some occasions in which incorrect correspondences are made. Thus, the disparity values produced by the matching algorithm will be subject to a matching error. It will be shown that the distribution of the matching error depends critically on the distribution of the error in the initial disparity estimate that was used to guide the matching. It turns out that this becomes quite important in examining the stability of the multiresolution matching algorithm. From their writings, it appears that Marr and Poggio were unaware of the importance of this fact. It will also be shown that the orientation parameters of corresponding features can be distorted when the disparity function is not constant. This too results in errors.

5. Quantization Error

In practice, the positions of the features can be, or will be, measured only to within a certain precision. This means that the measured feature position will differ from the true position. This results in an error which is referred to as quantization error. In pyramid type systems (see chapter 2), the quantization error decreases as the resolution increases.

In what follows, we will assume that the features are located to within a quantization of qσ_f, where σ_f is the filter space constant. In the cases in which we determine continuous probability density functions, p_ε(e), we can obtain the discrete probability mass function, p̂_ε(n), that arises from the quantization process as follows:
  p̂_ε(n) = ∫ from (n−1/2)qσ_f to (n+1/2)qσ_f of p_ε(e) de    (4.1.1)

6. Geometry Error - (section 4.6)

The imaging system, being non-ideal, may introduce non-corresponding geometric distortions in the images which will result in non-corresponding shifts in the feature positions. Vidicon camera tubes are notorious for such geometric distortions.

Another type of error that is a function of the geometry of the imaging system is produced when the geometric parameters of the imaging system are imprecisely known. These parameters include the inter-camera baseline distance, the relative angle of tilt of the cameras, the camera focal lengths, and the size of the sensors. All of these parameters are involved in the calculation of depth from disparity values, and hence, if any of them are in error, the computed depth will be in error as well. Since the above geometric errors occur after the disparity measurement, they do not affect the performance of the disparity measurement algorithm. A type of geometry error that does affect the matching process is the error caused by not knowing accurately enough the epipolar lines along which the search for matching features must be made. For example, if the cameras have a vertical offset relative to each other, and this offset is not known, disparity measurements made of non-vertical features will be in error.

We will now examine in detail the cause and the effects of each of the types of errors listed above. For the purposes of these analyses we will assume that the two input images are zero mean, white, Gaussian random processes. This is done in order that we may be able to get a handle on the mathematical analysis involved in computing the distributions of the various errors and thus obtain representative results on the performance of the matching algorithms when disturbed by these various noise sources. In most cases, a realistic
image model would mean that the mathematics involved in estimating the effects of these errors would become enormously complex. As will be seen, even with the assumption of Gaussian white noise, we only obtain approximate results in some cases.

The error analyses will be done for the cases of zero crossing and extremum features. The extremal points are the zero crossings of ∂/∂x₁ (∇²G*f(x₁,x₂)). That is, they are the zero crossings of the directional derivative of the ∇²G filtered image in the horizontal direction.

We will denote the probability density function of the zero crossing errors by p_ζ(e), and the probability density function of the extremal point errors by p_e(e). The corresponding discrete probability mass functions will be indicated by a '^' over the p's. Section 4.7 will present a discussion of how the various sources of error affect the performance of the simplified multi-resolution matching algorithm.

4.2 - Effect of Sensor Noise on Feature Position

In this section we examine the following problem. Suppose we have a white Gaussian random process that is ∇²G filtered (with a filter scale constant σ_f), yielding a signal function f(x). Let us define the position of an arbitrary feature (extremum point or zero crossing) of f(x) to be x₀. Thus f(x₀) = 0. Now suppose we add to the original signal a noise signal n(x), which has a Gaussian distribution, so that the output of the filter is given by f_n(x) = f(x) + n(x). Let the variance of the unfiltered signal be σ_f² and the variance of the unfiltered noise signal n(x) be σ_n². Let the position of the feature of f_n(x) that corresponds to x₀ be given by x_c. Correspondence in this case is defined as follows:

  lim(σ_n² → 0) |x_c − x₀| = 0    (4.2.1)

This means that x_c and x₀
That is:  lim D(C N -C 0 ) = 0  where CFI  (4.2.2)  is the feature contour through the point x*  and CQ  c  the point XQ. D  is some metric  define the point x*  n  line through XQ.  measuring  the  to be the point on CFI  distance  x* ,  c  n  and XQ  is the feature contour through  between  two feature contours.  that is the closest to XQ  The relationship between XQ, X ,  Note that, in general x  n  CN  and CQ  the  features of f(x)  is shown in figure  horizontal  4.2.  (4.2.1).  with those of f (x). n  there was no added noise, then the measured disparity would be zero everywhere that the correspondence  Now  along the horizontal  are not corresponding points, as defined by equation  Now suppose we were trying to match  shift, or  that  n  lim |x - x J = 0  Note that x  zero  If  (we assume  problem can be solved). Now, if we add some noise, the horizontal  disparity,  between  corresponding  features (which  is what  is measured  most stereo matchers) is a random variable equal to the distance between points x*  n  Thus we can write the feature position error, e  as follows:  by  and XQ.  107  FIGURE 4.2 The perturbation e  Our  goal  p  =  of feature contours by additive noise.  (x -x >(l,0) n  (4.2.3)  0  in the remainder  of this  section  is to derive  an expression  for the probability  density function of e . A one dimensional version of this problem was analyzed by Lunscher (1983). noise  However, the noise to be constant  while  model  he used  the signal  was unecessarily  was assumed  linear).  restrictive Wiejak  (as he assumed the (1983)  discusses the  probability of a zero crossing changing sign or of a new zero crossing being formed due to the  addition of noise. Nishihara (1983) also  analyzes  this  problem. In (Nishihara,  1982) is  presented experimental evidence for the effects of noise on the zero crossings and no analysis is performed. 
However, in this thesis we are concerned only with the perturbation of zero crossings due to noise, as the appearance or disappearance of zero crossings will affect only the matching statistics and not the accuracy of the disparity measurements themselves.

Zero Crossing Position Errors

We assume that, near the unperturbed zero crossing point x₀, the signal function f(x) and the noise function n(x) can both be approximated by a plane. We can then write f(x) as follows:

  f(x) = (a/c, b/c)·(x − x₀)    (4.2.4)

where (a,b,c) is the unit normal vector of the signal plane and x·y indicates the dot product operation. Similarly, the noise function can be approximated by:

  n(x) = (a_n/c_n, b_n/c_n)·(x − x_n0)    (4.2.5)

where (a_n, b_n, c_n) is the unit normal vector of the noise function plane. We will take, without loss of generality, x₀ = (0,0). The function obtained by adding the noise to f(x) can also be approximated by a plane, and can be defined by:

  n(x) + f(x) = (a/c + a_n/c_n, b/c + b_n/c_n)·x − (a_n/c_n, b_n/c_n)·x_n0    (4.2.6)

One can now see that x_n, the horizontal position of the perturbed zero crossing, is equal to the position error, e_p, and is given by:

  e_p = (a_n/c_n)x_n0 / (a/c + a_n/c_n)    (4.2.7)

Thus the error is seen to be proportional to the slope of the noise function and inversely proportional to the signal slope. These proportionalities were also noted by Nishihara (1983). We need to find an expression for the joint probability density of k₁ = x_n0, k₂ = (a/c) and k₃ = (a_n/c_n). Since f(x) and n(x) are uncorrelated we have that:

  p(k₁,k₂,k₃) = p_f(k₂) p_n(k₁,k₃)    (4.2.8)

Also, we can assume that x_n0 will be independent of the slope of the noise function. Thus we can write:

  p(k₁,k₂,k₃) = p_f(k₂) p_n(k₁) p_n(k₃)    (4.2.9)

The probability density of x_n0 is assumed to be uniform over a range 1/R, where R is the expected zero crossing rate of a horizontal slice of the random functions f(x) and n(x).
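The shift formula (4.2.7) can be exercised numerically: along the horizontal slice the perturbed signal is k₂e + k₃(e − k₁), and its root, found here by bisection, must agree with k₁k₃/(k₂ + k₃). A small sketch (the slope and offset values are arbitrary test inputs):

```python
def shift_formula(k1, k2, k3):
    # e_p from equation (4.2.7): k1 is the noise plane's zero position,
    # k2 the signal slope, and k3 the noise slope along the slice.
    return k1 * k3 / (k2 + k3)

def bisect_root(h, lo, hi, tol=1e-12):
    # Plain bisection; assumes h(lo) and h(hi) bracket the root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (h(lo) < 0.0) == (h(mid) < 0.0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

k1, k2, k3 = 0.5, 2.0, 1.0
perturbed = lambda e: k2 * e + k3 * (e - k1)  # (f + n) along the slice
e_direct = bisect_root(perturbed, -10.0, 10.0)
```

The bisection result and the closed form agree, and the inverse dependence on the signal slope k₂ is apparent: doubling k₂ roughly halves the shift.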
Rice (1945) has derived the following expression for R in terms of the autocovariance function, ψ_1d(τ), of a horizontal slice of n(x). His result is:

  R = [−ψ″_1d(0)/ψ_1d(0)]^(1/2) / π    (4.2.10)

The autocovariance function is derived in the appendix and we have:

  R = (1/(πσ_f)) (31/22)^(1/2)    (4.2.11)

where σ_f is the space constant of the ∇²G filter.

With regard to k₂ and k₃ it can be seen that k₂ = ∂f(x)/∂x₁ and k₃ = ∂n(x)/∂x₁. The probability densities of k₂ and k₃ can then be shown to be as follows (Rice, 1945):

  p_f(k₂) = [−2πψ″_f(0)]^(−1/2) exp(k₂²/(2ψ″_f(0)))    (4.2.12)

  p_n(k₃) = [−2πψ″_n(0)]^(−1/2) exp(k₃²/(2ψ″_n(0)))    (4.2.13)

where ψ″_f(τ) = ∂²/∂τ₁² [ψ_f(τ)], ψ_f(τ) being the autocovariance function of f(x). Likewise, ψ″_n(τ) = ∂²/∂τ₁² [ψ_n(τ)], ψ_n(τ) being the autocovariance function of n(x). We can now write:

  p(k₁,k₂,k₃) = [R/(4π(ψ″_f(0)ψ″_n(0))^(1/2))] exp(k₂²/(2ψ″_f(0)) + k₃²/(2ψ″_n(0))),  |k₁| < 1/R
              = 0 if |k₁| > 1/R    (4.2.14)

Now, from equation (4.2.7) we have that e_p = k₁k₃/(k₂ + k₃). The random variable k₄ = k₂ + k₃ has a Gaussian distribution, with variance −[ψ″_f(0) + ψ″_n(0)]/2. The random variable k₅ = k₃/k₄ is shown by Miller (1964, p50, equation 2.4.1) to have the following density:

  p_k5(u) = |W|^(1/2) (W₂₂u² + W₁₁) / {π[(W₂₂u² + W₁₁)² − 4W₁₂²u²]}    (4.2.15)

where W is the inverse of the covariance matrix of the random variables k₄ and k₃. It can be shown that:

  W = (2/s²) [  1     −1
               −1   1+s² ]    (4.2.16)

where we have set −ψ″_n(0) = σ_n² and defined s² = ψ″_f(0)/ψ″_n(0) to be the 'signal to noise ratio' or SNR.
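As a sanity check on this algebra, the density of k₅ obtained by substituting (4.2.16) into (4.2.15) should integrate to one. A small numerical sketch (s = 1 is chosen arbitrarily; the finite integration range leaves out roughly 1/(πU) of the mass, so the total falls slightly short of one):

```python
import math

def k5_density(u, s):
    # Density of k5 = k3/(k2 + k3) obtained from (4.2.15) with the
    # covariance structure (4.2.16); s is the signal to noise ratio.
    a = u * u * (1.0 + s * s) + 1.0
    return s * a / (math.pi * (a * a - 4.0 * u * u))

# Trapezoid-rule integral over [-U, U]; the neglected tails carry
# roughly 1/(pi*U) of the probability mass.
U, n = 100.0, 100000
h = 2.0 * U / n
total = 0.5 * (k5_density(-U, 1.0) + k5_density(U, 1.0))
total += sum(k5_density(-U + i * h, 1.0) for i in range(1, n))
total *= h
```

The denominator factors as the product of two positive quadratics for s > 0, so the density has no singularities on the real line; its slow 1/u² tails are what ultimately make the variance of the position error undefined.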
Equation (4.2.15) for p_k5 reduces to:

  p_k5(u) = s[u²(1+s²) + 1] / {π([u²(1+s²) + 1]² − 4u²)}    (4.2.17)

The position error e_p is seen to be the product of the two random variables k₁ and k₅. Since k₁ is uniformly distributed from −1/R to 1/R, we can write (following Miller, 1964, p48, in his proof of Theorem 4):

  p_ep(e) = R ∫ from R|e| to ∞ of p_k5(u)/u du    (4.2.18)

This integral can be evaluated (substitute v = u² and use integrals #109 and #120 of the CRC Standard Mathematical Tables) to yield:

  p_ep(e) = R/4 − [R/(2π)] tan⁻¹[{(1+s²)(Re)² + (s²−1)}/2s]
            − [Rs/(4π)] log|(1+s²)²(Re)⁴ + 2(s²−1)(Re)² + 1|
            + [Rs/(2π)] log|(1+s²)(Re)²|    (4.2.19)

In figure 4.3 we plot p_ep(e) for σ_f = 1, for a number of different values of s². Notice that, as the SNR increases, the distribution shifts to smaller error values. Figure 4.4 depicts the case of s = 1, for a number of different filter σ's. Notice that, as the filter σ increases, the distribution expands towards higher error values. It can be seen that, for large e, p(e) → 1/(2R(1+s²)e²). Thus e²p(e) vanishes slower than 1/e as e goes to ∞. This means that the variance of p(e) is undefined (but see section 5.5 for the definition of a 'pseudo-variance').

Extremal Point Position Errors

The analysis for the extremum feature case is the same as for the zero crossing feature case. However, in this case the actual value of R is different. It can be shown that (see Rice, 1945 for the derivation):

  R = (1/π)[−ψ⁗_1d(0)/ψ″_1d(0)]^(1/2) = (1/(πσ_f)) (137/31)^(1/2)    (4.2.20)

Note that R_extremum/R_zc = 1.771. This means that the expected interval between extrema is less than the expected interval between zero crossings of a function.
From figure  SNR • = .5 o =1.0 A = 2.0 + =4.0  CO  o ou  co  G o> Q  Zero Crossing Shift FIGURE 4.3 The probability density function of zero crossing position error, for a =l, f  and SNR =  0  .5, 1., 2., and 4.  1  2  3  4  5  Zero Crossing Shift FIGURE 4.4 The probability density function of zero crossing position error, for SNR=1, and o = f  .5, 1., 2. and 4.  113  4.4  we can  becomes  see  that, as  compressed  towards  error of the extremum  Quantized  Position  decreases, the probability density function of the position error smaller  error  values.  From  this  we  conclude  that  the  features will always be less than that for the zero crossing  Error  position  features.  Compulation.  To find the discrete probability mass function, p .(n) we can use equation (4.1.1). The ep pr  resulting integral  is not amenable  to simple analytical integration  techniques so we must rely  on numerical integration. To compute the following graphs we used Lyness' (Lyness) S Q U A N K integration algorithm. The special case of n = 0 gives us the probability of a zero pixel error. This probability is plotted  in figure 4.5,  along  with the  probability of 'a  error as a function of the signal to noise ratio, given that q = l V 2 , and It should be again pointed out that the analysis presented relatively small noise levels (i.e. small SNR). In addition to the in the position of the true (physically significant) features,  1,  2  or  3 pixel  or=l.  in this chapter  is valid for  noise causing a perturbation  an excessively  high noise level will  cause the creation of features that are entirely noise related and have no physical significance. Furthermore, the true features will begin to be broken up, and may vanish entirely.  114  O  Signal to Noise Ratio FIGURE 4.5 Probability of an n pixel error, for n=0, 1, and 2, given that q = l//2 and o =l, as a function of the SNR. 
4.3 - Analysis of Disparity Measurement Errors due to Filtering

In this section we examine the errors in the measured disparity that result from the fact that we are spatially filtering the images with the ∇²G filter. This process has not been described elsewhere in the literature. We will initially examine the one dimensional case and then discuss the extension to two dimensions.

We will use in the analysis of the filtering error the concept of the scale space transform (SST) introduced in chapter 2.1. The use of the SST will allow us to find the relationship between the corresponding zero crossings of the two stereo ∇²G filtered images.

One Dimensional Filtering

Let us consider a situation wherein we are viewing a surface that gives rise to a disparity function of the form:

  d(x_L) = x_R − x_L = β₀ + β₁x_L    (4.3.1)

It can be seen that if the left eye sees a light intensity pattern g(x) and the right eye a light intensity pattern f(x), then g(x) and f(x) are related as follows:

  g(x_L) = f(x_R)    (4.3.2)

Note that this equation is only approximately true, as the observed intensity of a scene point generally depends on the angle between the surface normal vector and the view direction. However, for parallel cameras with large focal lengths the difference between view directions for the two cameras is usually fairly small, so that the difference in observed image intensity will be small. Highly specular surfaces are quite troublesome in this regard, since the observed image intensity is very sensitive to the angle of observation. In order to analyze such cases we must use an equation such as g(x_L)h(x_L,n(x,y)) = f(x_R), where h is some function of position and of the surface normal vector n. If we assume that (4.3.2) is valid
If we assume that (4.3.2) is valid  116  we can write, using (4.3.1):  g(x ) L  =  f(/3„ + (l + p\)x )  (4.3.3)  L  The SST of f(x) is given by equation (2.1.2) and is repeated below:  F(x,a) =  dVdx f*Jf(u)aV(/2T)] J  e  ~  ( x - u ) V 2 a 2  du  (4.3.4)  The SST of g(x) is obtained by substituting (4.3.3) into (4.3.4) to yield:  G(x,a) =  letting v =  2  oo  0  (x  1  u)2/2a2  du  (4.3.5)  j3 + (l + j3i)u we obtain: 0  G(x,a) = e  dVdx /" [f(/3 + (l + / 3 ) u ) a V ( / 2 ^ ) ] e " ~  (l + /3 ) dVd((l + /3 )x) ;r [f(v)aV((l + /3 V 2^r)] 1  2  - ( x ( l + p\) + pVv)V2((l + p\)a)  1  2  2  a3  1  r  d v  (  4  1  6  )  Thus the SSTs of the left and right images are related as follows:  G(x,a) =  (l + /3 i r F(^o + (l + |3 )x,(l + /3 )a)  (4.3.7)  F(x,a) =  (l + p\)G((x-p\)/(l + p\),a/(l + p\))  (4-3.8)  1  1  1  or  Thus, for /3 = 0, the right hand scale map is obtained from the left hand scale map by a 0  uniform expansion in both the x and a this can be seen  directions by a factor of 1/(1 + /3i). A n example of  in figure 4.6 which shows the left and right scale maps of a randomly  117  FIGURE 4.6 The left and right scale maps of a randomly textured, tilted surface with a disparity gradient of -60/255. textured  surface  with j3j =-60/255.  The fact that the  left and right scale maps are  scaled  versions of each other are readily apparent Now, crossings  consider  between  the  the  idealized case  left and  right  in  images  which  we  with no  can error  match  the  whatsoever.  corresponding Let  zero  us define the  measured disparity as follows:  V L' x  a )  =  The variables x^(o)  x  R  (°)- L< x  and x (a) R  a )  ( 4  -  1 9 )  are the positions of the corresponding zero crossings in the  left and right images, measured with filter resolution a. 
The error in the disparity measurement at a resolution $\sigma$ is given by:

\[ e_d(x_L,\sigma) = \hat d(x_L,\sigma) - d(x_L) = x_R(\sigma) - x_L(\sigma) - \beta_0 - \beta_1 x_L(\sigma) \tag{4.3.10} \]

Let us denote the track, in scale space, of the single zero crossing feature (defined by $G(x,\sigma)=0$) that passes through the point $(x_L,\sigma_0)$ by the functional representation:

\[ \sigma = C(x) \quad\text{with inverse}\quad x = C^{-1}(\sigma) \tag{4.3.11} \]

Note that C(x) is a monotonic function, as the point where dC/dx is zero is the point where two distinct features merge, and such a point does not strictly belong to the contour (see Yuille and Poggio, 1983a for more on this matter).

Let us assume that the zero crossing locations are measured at a resolution $\sigma = \sigma_0$. Then we have that $C(x_L) = \sigma_0$. From equation (4.3.8) it can be shown that, since $F(x_R,\sigma_0)=0$, and since the points $(x_R,\sigma_0)$ and $(x_L,\sigma_0)$ lie on corresponding zero crossing contours,

\[ C\!\left(\frac{x_R-\beta_0}{1+\beta_1}\right) = \frac{\sigma_0}{1+\beta_1} \tag{4.3.12} \]

This is shown diagrammatically in figure 4.7. We can rewrite the disparity error in terms of $C^{-1}(\sigma)$ as follows:

\[ e_d(x_L,\sigma_0) = (1+\beta_1)\,C^{-1}\!\left(\frac{\sigma_0}{1+\beta_1}\right) - x_L - \beta_1 x_L \tag{4.3.13} \]

Note that the disparity error is independent of $\beta_0$. This is what we would expect, since translation of a function does not affect the magnitude of its frequency spectrum. Let us set $e_d$ to zero in equation (4.3.13) and solve for $\sigma = C_{ze}(x)$. This will define the family of zero disparity error contour loci. The significance of this family of curves lies in the fact that, if the scale map contours do not belong to this family, there will be a non-zero disparity measurement error. Setting $e_d$ to zero and rearranging (4.3.13) we obtain,

FIGURE 4.7 The relationship between the left and right scale maps of a tilted surface.

\[ x = C^{-1}_{ze}\!\left(\frac{\sigma_0}{1+\beta_1}\right) = \frac{x_L+\beta_1 x_L}{1+\beta_1} = x_L \tag{4.3.14} \]

This equation defines a family of vertical lines.
If we substitute (4.3.14) into (4.3.13), we obtain the following relationship,

\[ e_d(\sigma) = \frac{\sigma_0}{\sigma}\left[C^{-1}(\sigma) - C^{-1}_{ze}(\sigma)\right] \tag{4.3.15} \]

Therefore we conclude that the disparity error is $\sigma_0/\sigma$ times the horizontal difference between the scale space zero crossing contour through the point $(x_L,\sigma_0)$ and the zero error line, $x = x_L$, measured at $\sigma = \sigma_0/(1+\beta_1)$. An example of how one can determine the disparity error with a graphical construction, provided one has the scale map of the left hand image, is shown in figure 4.7.

The important point to be noted in the above discussion is that, for there to be zero disparity measurement error, all of the scale map contours of the image function must belong to the family of zero error lines. In general this is not the case, and we must conclude that one can, in general, expect to obtain zero disparity measurement error only when one is viewing a surface with constant disparity ($\beta_1 = 0$) or when the zero crossing measurements are made at a resolution of $\sigma_0 = 0$.

In (Clark and Lawrence, 1984c) is a derivation of the disparity measurement error for $f(x) = e^{-(x-x_0)^2/2}$, for the case of zero crossing features. They show that the error in this case increases as $\sigma$ increases and as $|\beta_1|$ increases.

We will now provide an approximation for the probability density function of the disparity measurement error for f(x) a Gaussian white random process.

Zero Crossing Features

It turns out that it is not possible to determine an exact expression for the probability density function $p_{ed}(e_d)$, due to the complex, non-stationary nature of the SST, $F(x,\sigma)$, which is a two dimensional random process. We can, however, derive an approximation to $p_{ed}(e_d)$ that is valid for small values of $\beta_1$, using the following procedure.
Consider the situation shown in figure 4.8, which depicts a zero crossing contour of a particular realization of the random process $F(x,\sigma)$. Note that for small $\Delta\sigma$ the zero crossing contour is approximately straight and forms an angle $\phi$ with the $\sigma$ axis.

FIGURE 4.8 A zero crossing contour of the random process F(x,$\sigma$).

The error, $e_d$, in this case is given simply by:

\[ e_d = (x_1-x_0)\,\sigma_0/\sigma_1 = (x_1-x_0)(1+\beta_1) \tag{4.3.16} \]

Now $(x_1-x_0)$ can be expressed in terms of $\phi$, $\sigma_0$ and $\beta_1$ to give the following expression for $e_d$:

\[ e_d = -\sigma_0\beta_1\mu \tag{4.3.17} \]

where $\mu = \tan(\phi)$ is the slope of the line perpendicular to the zero crossing contour at $(x_0,\sigma_0)$. The gradient vector, $\bar\eta = (\eta_1,\eta_2) = \nabla F(x,\sigma)$, measured at $(x_0,\sigma_0)$, is also perpendicular to the zero crossing line. Thus we have that:

\[ \mu = \eta_2/\eta_1 \tag{4.3.18} \]

and

\[ e_d = -\sigma_0\beta_1(\eta_2/\eta_1) \tag{4.3.19} \]

Now our problem reduces to finding the probability density function of the above function of the two random variables $\eta_1$ and $\eta_2$, given that $F(x_0,\sigma_0) = 0$. However, because of the complicated, non-stationary nature of $F(x,\sigma)$, its autocovariance function is not well behaved. Fortunately, we can convert our problem from the two dimensional case involving $F(x,\sigma)$ to a one dimensional problem involving the function $F(x,\sigma_0)$, whose autocovariance function is well behaved and easy to work with. That we are able to do this results from an interesting property of the scale space transform. It was pointed out by Yuille and Poggio (1983a) that the scale space transform of any function g(x) satisfies a differential equation of the type commonly known as the Heat or Diffusion equation. In particular we have that:

\[ \partial^2 F(x,\sigma)/\partial x^2 = (1/\sigma)\,\partial F(x,\sigma)/\partial\sigma \tag{4.3.20} \]

Now $\partial F(x,\sigma)/\partial\sigma$ is merely $\eta_2$ and $\partial F(x,\sigma)/\partial x$ is $\eta_1$.
Thus we can write:

\[ \mu = \left(\sigma_0\,\partial^2F(x,\sigma_0)/\partial x^2\right)\big/\left(\partial F(x,\sigma_0)/\partial x\right) \tag{4.3.21} \]

Following the notation of Rice (1945), let us set $\xi = g(x) = F(x,\sigma_0)$, $\eta = dg(x)/dx = \partial F(x,\sigma_0)/\partial x$, and $\zeta = d^2g(x)/dx^2 = \partial^2F(x,\sigma_0)/\partial x^2$. Thus we can write:

\[ \mu = \sigma_0\,\zeta/\eta \tag{4.3.22} \]

We now have an expression for $\mu$ in terms of the derivatives of a one dimensional function g(x). We can write the conditional probability of $\zeta$ and $\eta$ given that $\xi = 0$ as follows (VanMarcke, 1983, p. 52):

\[ p_c(\zeta,\eta\,|\,\xi=0) = \frac{1}{2\pi\sqrt{|B|}}\,\exp\!\left(-(\zeta,\eta)\,B^{-1}(\zeta,\eta)^T/2\right) \tag{4.3.23} \]

where B is a function of the partitioned covariance matrix, $B_0$, of the vector of random variables $(\zeta,\eta,\xi)$:

\[ B = B_{11} - B_{12}B_{22}^{-1}B_{12}^T \tag{4.3.24} \]

\[ B_0 = \begin{pmatrix} B_{11} & B_{12} \\ B_{12}^T & B_{22} \end{pmatrix} \tag{4.3.25} \]

where $B_{11}$ is 2x2, $B_{12}$ is 2x1 and $B_{22}$ is 1x1. It can be shown (Rice, 1945) that $B_{11}$, $B_{12}$ and $B_{22}$ can be expressed in terms of the autocovariance function, $\psi(\tau)$, of g(x) as follows:

\[ B_{11} = \begin{pmatrix} \psi^{(4)}(0) & 0 \\ 0 & -\psi''(0) \end{pmatrix} \tag{4.3.26} \]

\[ B_{12}^T = (\psi''(0),\ 0) \tag{4.3.27} \]

\[ B_{22} = \psi(0) \tag{4.3.28} \]

Hence we have that:

\[ B = \begin{pmatrix} 1/a^2 & 0 \\ 0 & 1/b^2 \end{pmatrix} \tag{4.3.29} \]

where we have defined:

\[ a^2 = \psi(0)\big/\left[\psi(0)\psi^{(4)}(0) - (\psi''(0))^2\right] \tag{4.3.30} \]

and

\[ b^2 = -1/\psi''(0) \tag{4.3.31} \]

Thus we have that:

\[ p_c(\zeta,\eta\,|\,\xi=0) = \frac{ab}{2\pi}\,e^{-(a^2\zeta^2+b^2\eta^2)/2} \tag{4.3.32} \]

Now let us make the transformation $\mu = \zeta\sigma_0/\eta$. From the laws of transformations of random variables (e.g., see VanMarcke, 1983, p. 32) we have that the probability density function of $\mu$ and $\eta$ given $\xi = 0$ is:

\[ p_m(\mu,\eta\,|\,\xi=0) = p_c(\eta\mu/\sigma_0,\ \eta\,|\,\xi=0)\,(|\eta|/\sigma_0) \tag{4.3.33} \]

\[ = \frac{ab}{2\pi\sigma_0}\,|\eta|\,e^{-\eta^2(a^2\mu^2/\sigma_0^2+b^2)/2} \tag{4.3.34} \]

Now to obtain $p_\mu(\mu)$ we need only integrate out the dependence of $p_m(\mu,\eta\,|\,\xi=0)$ on $\eta$.
Doing so we get:

\[ p_\mu(\mu) = \frac{b\sigma_0/a}{\pi\left[\mu^2 + (b\sigma_0/a)^2\right]} \tag{4.3.35} \]

Now making the transformation $e_d = -\sigma_0\beta_1\mu$ we obtain:

\[ p_{ed}(e_d) = \frac{k}{\pi\left[e_d^2 + k^2\right]} \tag{4.3.36} \]

where we have defined:

\[ k = |\beta_1|\,\sigma_0^2\,b/a \tag{4.3.37} \]

Footnote 5: Notice that, in the limit as $\sigma_0$ approaches zero, $p_\mu(\mu)$ approaches $\delta(\mu)$. Thus the angle of the zero crossing with respect to the vertical approaches zero with probability one. This proves the conjecture that the scale map contours are vertical at $\sigma = 0$.

This zero crossing position error is seen to have a Cauchy distribution.

The form of the autocovariance function is derived in the appendix, and it is seen that:

\[ \psi^{(n)}(0) = (-1)^{n/2}\sqrt{\pi}\,(n+4)!\Big/\left[(n/2+2)!\;2^{n+5}\;\sigma_0^{n+1}\right] \quad\text{for n even;} \qquad 0 \text{ for n odd.} \tag{4.3.38} \]

From these relations we can compute the value of k. The result is:

\[ k = \sigma_0|\beta_1| \tag{4.3.39} \]

Thus, finally, we get:

\[ p_{ed}(e_d) = \frac{\sigma_0|\beta_1|}{\pi\left[e_d^2 + \sigma_0^2\beta_1^2\right]} \tag{4.3.40} \]

In figure 4.9 we plot $p_{ed}(e_d)$ for various values of k. Note that as k increases the density function spreads out towards higher error values.

FIGURE 4.9 The probability density function of the disparity measurement error for k = 1, 2, 3, and 4.

The variance of $e_d$ is undefined, but we can obtain a measure of the dispersion of $p_{ed}$ by fitting a Gaussian density to $p_{ed}$, setting $p_{ed}(0)$ equal to $1/(\sigma_d\sqrt{2\pi})$. Doing this we obtain:

\[ \sigma_d^2 = \pi\sigma_0^2\beta_1^2/2 \tag{4.3.41} \]

In terms of $\sigma_1 = \sigma_0/(1+\beta_1)$ we get:

\[ \sigma_d^2 = \pi\sigma_0^2(\sigma_0/\sigma_1-1)^2/2 \tag{4.3.42} \]

Thus the "variance" of the disparity measurement error due to filtering grows rapidly with the filter resolution $\sigma_0$, for a given disparity gradient. Note that this variance is independent of the power in the original signal. This is to be expected, as scaling a function does not alter the locations of its zero crossings.
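The Cauchy form of the error density can be checked by simulation, since conditioned on $\xi = 0$ the variables $\zeta$ and $\eta$ are independent zero-mean Gaussians and a ratio of such Gaussians is Cauchy distributed. A sketch under stated assumptions: the standard deviations are chosen so that the ratio $b/a$ takes the value $1/\sigma_0$ implied by the $\psi$-derivative values above:

```python
import math, random

random.seed(0)
sigma0, beta1 = 2.0, -0.1
n = 200_000

# conditional std devs of zeta (= g'') and eta (= g'), chosen so that the
# scale ratio matches b/a = 1/sigma0 for this process
s_zeta, s_eta = 1.0 / sigma0, 1.0

# e_d = -sigma0^2 * beta1 * zeta / eta  (cf. eq. 4.3.19 with eta2 = sigma0*zeta)
samples = sorted(
    -sigma0**2 * beta1 * random.gauss(0, s_zeta) / random.gauss(0, s_eta)
    for _ in range(n)
)
k = sigma0 * abs(beta1)            # predicted Cauchy scale
q1, med, q3 = samples[n // 4], samples[n // 2], samples[3 * n // 4]
# quartiles of a zero-median Cauchy with scale k sit at -k and +k
assert abs(med) < 0.01
assert abs(q3 - k) < 0.02 and abs(q1 + k) < 0.02
```

Quartiles rather than the sample variance are used because, as noted above, a Cauchy variate has no finite variance.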
Quantized Disparity Measurement Error for Zero Crossing Features

Let us now include the effect of quantization by converting the continuous probability density function, $p_{ed}$, to the discrete probability mass function, $P_{ed}$, using equation (4.1.1). Doing so, we get:

\[ P_{ed}(n) = -\frac{1}{\pi}\left[\tan^{-1}\!\left(\frac{(n+1/2)q}{\beta_1}\right) - \tan^{-1}\!\left(\frac{(n-1/2)q}{\beta_1}\right)\right] \tag{4.3.43} \]

The probability of having a zero pixel error, $P_{ed}(0)$, is given by:

\[ P_{ed}(0) = -\frac{2}{\pi}\tan^{-1}\!\left(\frac{q}{2\beta_1}\right) \tag{4.3.44} \]

Note that $P_{ed}(n)$ is independent of the resolution of the filter. This is due to the fact that as $\sigma$ increases, the size of the pixels ($q\sigma$) increases as well. In figure 4.10 we plot $P_{ed}(n)$ as a function of the disparity gradient for n = 0, 1, 2, and 3, with $q = 1/\sqrt{2}$.

FIGURE 4.10 The probability of an n pixel disparity measurement error as a function of the disparity gradient, given $q = 1/\sqrt{2}$, for n = 0, 1, 2, and 3.

Extremum Features

We can obtain the probability density function of the extremum position error by taking the covariance matrix of (4.3.24) to be:

\[ B_{11} = \begin{pmatrix} -\psi^{(6)}(0) & 0 \\ 0 & \psi^{(4)}(0) \end{pmatrix} \tag{4.3.45} \]

\[ B_{12}^T = (-\psi^{(4)}(0),\ 0) \tag{4.3.46} \]

\[ B_{22} = -\psi''(0) \tag{4.3.47} \]

Thus we get, for B defined as in (4.3.24) and (4.3.29):

\[ a^2 = -\psi''(0)\Big/\left[\psi''(0)\psi^{(6)}(0) - (\psi^{(4)}(0))^2\right] \tag{4.3.48} \]

and

\[ b^2 = 1/\psi^{(4)}(0) \tag{4.3.49} \]

Apart from these changes the rest of the derivation is the same as for the zero crossing feature case. Thus we have that:

\[ p_{ed}(e_d) = \frac{k}{\pi\left[e_d^2 + k^2\right]} \tag{4.3.50} \]

where k is as defined by (4.3.37). Using equation (4.3.38) to obtain the required derivatives we get:

\[ k = \sigma_0|\beta_1| \tag{4.3.51} \]

This is the same result as for the zero crossing case. Thus we conclude that, for the one dimensional case, the effect of filtering on the disparity measurement is the same for extremum features as for zero crossing features.
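The quantized mass function (4.3.43), which applies to both feature types, can be tabulated directly. A short sketch, assuming the values used in the figures ($q = 1/\sqrt{2}$, $\beta_1 = -0.1$):

```python
import math

q = 1 / math.sqrt(2)              # pixel quantization step used in the text
beta1 = -0.1                      # disparity gradient (negative, as in the plots)

def P_ed(n):
    """Probability of an n-pixel disparity error, eq. (4.3.43)."""
    return -(1 / math.pi) * (math.atan((n + 0.5) * q / beta1)
                             - math.atan((n - 0.5) * q / beta1))

# (4.3.44): zero-pixel-error probability agrees with the n = 0 term
p0 = -(2 / math.pi) * math.atan(q / (2 * beta1))
assert abs(P_ed(0) - p0) < 1e-12

# the mass function is symmetric, unimodal, and sums to one
total = sum(P_ed(n) for n in range(-10_000, 10_001))
assert abs(total - 1) < 1e-3
assert abs(P_ed(1) - P_ed(-1)) < 1e-15
assert P_ed(0) > P_ed(1) > P_ed(2)
```

For these parameter values $P_{ed}(0) \approx 0.82$, i.e. most matches land on the correct pixel despite the filtering-induced shift.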
Quantized Disparity Measurement Error for Extremum Features

It is clear that the discrete probability mass function for the extremum feature case is the same as that of the zero crossing case, and is therefore given by equation (4.3.43).

Nonlinear Disparity Functions

In general, the disparity functions that one obtains from real world surfaces will not be of the linear form assumed above. If the disparity function is non-linear, a relationship between the left and right eye SSTs such as that in equation (4.3.7) cannot be found. This being so, we cannot expect to be able to find an expression, analogous to (4.3.14), for the zero disparity error curves. The best we can do in such a case is to linearize the disparity function about a point at which we wish to obtain a value for the disparity measurement error.

If we expand d(x) in a Taylor's series about a point $x_0$ we have,

\[ d(x) = \sum_{n=0}^{\infty} \frac{d^{(n)}(x_0)}{n!}\,(x-x_0)^n \tag{4.3.52} \]

We can linearize by taking the first two terms of this expansion. This linearization will be valid (i.e. the approximation error less than a certain amount) only in a small neighbourhood about $x_0$. We will assume that this linearization is valid. We then say that,

\[ d(x) \approx d(x_0) + \frac{dd(x_0)}{dx}\,(x-x_0) = \beta_0(x_0) + x\,\beta_1(x_0) \tag{4.3.53} \]

where $\beta_1(x_0) = dd(x_0)/dx$ and $\beta_0(x_0) = d(x_0) - x_0\beta_1(x_0)$. This linearization then allows the computation of the disparity measurement error.

Extension to the Two Dimensional Case

Now let us consider a two dimensional surface that, when viewed binocularly, gives rise to a linear disparity function of the form (4.3.1). Then, if the left eye sees a light intensity pattern $g(x_L,y)$ and the right eye an intensity pattern given by $f(x_R,y)$, then g(x,y) and f(x,y) are related as follows:
Then,  pattern g(x ,y) and the right L  and f(x,y) are related as follows:  if the left  gives  eye sees a light  eye an intensity pattern given by f(x ,y), then g(x,y) R  130  g(xL,y) = f(xR,y)  (4.3.54)  Given the disparity function (4.3.1) we can show that:  g(x,y) = fU30 + (H-p\)x,y)  (4.3.55)  Now let us define the two dimensional Scale Space Transform (2DSST) as follows. 2 2  2  2  2  F(x,y,o,e) = [e 3 /3x + 3 /3y ] [  x  u  ;/: 00 f(u 1 ,u 2 )aV(27re)e- ( - ')  Ve2+  U2  /2a2  (>'- ^ du1duJ  (4.3.56)  Note that this is a generalization of the two dimensional transform of Yuille and Poggio who defined: 2  2  t(x_Ui:)2+ (y  F(x,y,a,e) = V j;"00f(u1,u2)a /(27Te)e"  U2)2j/2a2  "  du1du2  (4.3.57)  The reason for this generalization is to facilitate the description of the relation between the transforms of the left and right images. Since the disparity gradient is in the x-direction only, the effect of having a non-constant disparity is to  skew  the 2DSST in the (x,y,o\e)  space as we go from the left eye to the right eye. This is why we need an extra scaling parameter, the skew factor e, in our definition of the 2DSST. Let us now derive the relationship between G(x,y,a,e) and F(x,y,o,c). 
Using (4.3.56) we can write:

\[ G(x,y,\sigma,\epsilon) = \left[\epsilon^2\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right]\int\!\!\int_{-\infty}^{\infty} f(\beta_0+(1+\beta_1)u_1,\,u_2)\,\frac{\sigma^4}{2\pi\epsilon}\,e^{-\left[(x-u_1)^2/\epsilon^2 + (y-u_2)^2\right]/2\sigma^2}\,du_1\,du_2 \tag{4.3.58} \]

Letting $v_1 = u_1(1+\beta_1)+\beta_0$ and $v_2 = u_2$ we obtain:

\[ G(x,y,\sigma,\epsilon) = \left[\epsilon^2\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right]\int\!\!\int_{-\infty}^{\infty} f(v_1,v_2)\,\frac{\sigma^4}{2\pi\epsilon}\,e^{-\left[((1+\beta_1)x+\beta_0-v_1)^2/((1+\beta_1)\epsilon)^2 + (y-v_2)^2\right]/2\sigma^2}\,\frac{dv_1\,dv_2}{1+\beta_1} \tag{4.3.59} \]

Now, if we define $\tilde x = x(1+\beta_1)+\beta_0$ and $\tilde\epsilon = \epsilon(1+\beta_1)$, then we can rewrite the above equation as:

\[ G(x,y,\sigma,\epsilon) = \left[\tilde\epsilon^2\frac{\partial^2}{\partial\tilde x^2} + \frac{\partial^2}{\partial y^2}\right]\int\!\!\int_{-\infty}^{\infty} f(v_1,v_2)\,\frac{\sigma^4}{2\pi\tilde\epsilon}\,e^{-\left[(\tilde x-v_1)^2/\tilde\epsilon^2 + (y-v_2)^2\right]/2\sigma^2}\,dv_1\,dv_2 \tag{4.3.60} \]

Thus:

\[ G(x,y,\sigma,\epsilon) = F(x(1+\beta_1)+\beta_0,\ y,\ \sigma,\ \epsilon(1+\beta_1)) \tag{4.3.61} \]

and conversely:

\[ F(x,y,\sigma,\epsilon) = G\!\left(\frac{x-\beta_0}{1+\beta_1},\ y,\ \sigma,\ \frac{\epsilon}{1+\beta_1}\right) \tag{4.3.62} \]

Now let us define a two dimensional function, by holding $\sigma$ and y constant, as follows:

\[ \bar F(x,\epsilon) = F(x,y_0,\sigma_0,\epsilon) \tag{4.3.63} \]

We will call this transform, for lack of a better term, the Skew Space Transform (SKST) of the two dimensional function f(x,y). This transform is parametrized by the $y_0$ and $\sigma_0$ values, and is a planar slice through the four dimensional SST defined by (4.3.56). We can now describe the relation between the left and right skew maps (analogous to the scale maps described earlier). We have:

\[ \bar F(x,\epsilon) = \bar G\!\left(\frac{x}{1+\beta_1},\ \frac{\epsilon}{1+\beta_1}\right) \tag{4.3.64} \]

where we have set $\beta_0$ to zero for reasons of clarity. Now, let $\epsilon = 1$. This then gives us the case of $\nabla^2 G$ filtering. The points $x_R$ and $x_L$ for which $\bar F(x_R,1) = 0$ and $\bar G(x_L,1) = 0$ define the locations of zero crossings of the $\nabla^2 G$ filtered image along $y = y_0$, for a filter space constant of $\sigma_0$.

It is evident that the analysis of the two dimensional disparity measurement error will be similar to the one dimensional analysis, with $\bar F$ replacing F and $\epsilon$ replacing $\sigma$.
In particular we can write (following (4.3.15)):

\[ e_d = (1+\beta_1)\left[C^{-1}\!\left(\frac{1}{1+\beta_1}\right) - C^{-1}_{ze}\!\left(\frac{1}{1+\beta_1}\right)\right] \tag{4.3.65} \]

where $x = C^{-1}(\epsilon)$ is the track in $(x,\epsilon)$ space (skew space) of the zero crossing of $\bar G(x,\epsilon)$ that passes through $(x_L,1)$, and $x = C^{-1}_{ze}(\epsilon)$ is the zero error line, given simply by $x = x_L$.

In figure 4.11 we plot the skew maps of $\bar F(x,\epsilon) = 0$ and $\bar G(x,\epsilon) = 0$, for a surface with $\beta_1 = -60/255$ and $\sigma = 2$. Compare these maps with the scale space maps of the one dimensionally filtered image shown in figure 4.6.

FIGURE 4.11 The left and right skew maps obtained from a randomly textured surface with a horizontal disparity gradient of -60/255, with $\sigma = 2$.

As in the one dimensional case, we can obtain an approximate expression for the probability density function of the disparity measurement error, given that g(x,y) is a zero mean white Gaussian process. We make the assumption that $\Delta\epsilon = -\beta_1/(1+\beta_1)$ is small, so that the zero crossing contour through the point $(x_L,1)$ is approximately straight, forming an angle $\phi$. The error, $e_d$, in this case is given simply by:

\[ e_d = (x_1-x_0)/\epsilon_1 = (x_1-x_0)(1+\beta_1) \tag{4.3.66} \]

Now $(x_1-x_0)$ can be expressed in terms of $\phi$ and $\beta_1$ to give the following expression for $e_d$:

\[ e_d = \beta_1\mu/(1+\beta_1) \tag{4.3.67} \]

where $\mu = \tan(\phi)$ is the slope of the line perpendicular to the zero crossing contour at $(x_L,1)$. The gradient vector, $\bar\eta = (\eta_1,\eta_2) = \nabla\bar F(x,\epsilon)$, measured at $(x_L,1)$, is also perpendicular to the zero crossing line. Thus we have that:

\[ \mu = \eta_2/\eta_1 \tag{4.3.68} \]

and

\[ e_d = \beta_1(\eta_2/\eta_1)/(1+\beta_1) \tag{4.3.69} \]

Now our problem reduces to finding the probability density function of the above function of
The reason  for this is that the differential equation relating the e derivatives to the x derivatives  is not as simple as equation (4.3.20). It can be shown (by performing the differentiation with respect to y in (4.3.56)) that F(x,e) can be defined as:  F(x,e) =  € 3V9x H,(x,e) J  2  +  H (x,e)  (4.3.70)  2  where  H,(x,e) =  sZMv)oAe/2*)  e-( - )  H,(x,e) =  SZMu)Oo/(e/3x)  e~  x  u  V 2 e 2 a  ° du  (4.3.71)  du  (4.3.72)  2  ( x _ u ) 2 / 2 e 2 a ( ) 2  and where hi and h are defined as follows: 2  hi(u)  h  >(")  =  SZjMoJ{\TH)  e"  =  /r f(u,v)/(a„/2i)[(y -v)Va„ -l] oo  ( y o  "  v ) 2 / 2 C T  2  0  ° dv  (4.3.73)  2  e ( - ) _  y  v  2 / 2 f f 2  dv  (4.3.74)  We can now see that: 9F79x =  e 9 H,/9x 2  3  3  +  9H /3x 2  =  i?, when  e=l  (4.3.75)  135  9F/9e = 2e9 H,/9x2 +  e 9 3H,/9x 9e 2  2  2  It can be shown that (Yuille and Poggio, function  for the diffusion equation,  = r) when e = l 2  and H , and H  2  are the result  of convolving  some  are solutions of the diffusion equation. With the  2  as described by Yuille and Poggio (appendix,  2  (4.3.76)  1983a), since the Gaussian function is the Green's  function with the Gaussian, then H , and H boundary conditions on FL and H  9H2/9e  +  Yuille and  Poggio, 1983a) it can be seen that:  9 H,/9x  2  =  (l/c)9H /9e  (4.3.77)  9 H /9x  2  =  (l/e)9H /9e  (4.3.78)  2  2  2  1  2  Therefore we can write rj , in terms of x derivatives only, as follows: 2  r?  2  =  2e9 H /9x 2  1  2  +  Let a = ( T j j , r j 2 , £ ) . The covariance  e 9 Hj/9x 3  4  +  4  matrix of a  e9 H /9x 2  2  2  when e = l  (4.3.79)  is given by E ( a a ) . Where E(-) indicates the T  matrix expectation operator. 
Using the procedure of Rice (1945) it can be shown that:

\[ E(\eta_1^2) = -\psi_1^{(6)}(0) - 2\psi_{12}^{(4)}(0) - \psi_2^{(2)}(0) \tag{4.3.80} \]

\[ E(\eta_2^2) = 4\psi_1^{(4)}(0) + 4\sigma_0^2\psi_1^{(6)}(0) + \sigma_0^4\psi_1^{(8)}(0) + 4\sigma_0^2\psi_{12}^{(4)}(0) + 2\sigma_0^4\psi_{12}^{(6)}(0) + \sigma_0^4\psi_2^{(4)}(0) \tag{4.3.81} \]

\[ E(\xi^2) = \psi_1^{(4)}(0) + 2\psi_{12}^{(2)}(0) + \psi_2(0) \tag{4.3.82} \]

\[ E(\eta_1\eta_2) = 0 \tag{4.3.83} \]

\[ E(\eta_1\xi) = 0 \tag{4.3.84} \]

\[ E(\eta_2\xi) = 2\psi_1^{(4)}(0) + \sigma_0^2\psi_1^{(6)}(0) + 2\sigma_0^2\psi_{12}^{(4)}(0) + 2\psi_{12}^{(2)}(0) + \sigma_0^2\psi_2^{(2)}(0) \tag{4.3.85} \]

In the above, $\psi_1(\tau)$ is the autocovariance function of $H_1$, $\psi_2(\tau)$ is that of $H_2$, and $\psi_{12}(\tau)$ is the cross-covariance of $H_1$ and $H_2$.

If we partition the covariance matrix of $\alpha$ as shown in equation (4.3.25), we can see that the joint probability density function of $\eta_1$ and $\eta_2$ given that $\xi = 0$ is specified by:

\[ B = \begin{pmatrix} 1/a^2 & 0 \\ 0 & 1/b^2 \end{pmatrix} \tag{4.3.86} \]

where

\[ 1/a^2 = E(\eta_1^2) \tag{4.3.87} \]

and

\[ 1/b^2 = E(\eta_2^2) - E(\eta_2\xi)^2/E(\xi^2) \tag{4.3.88} \]

The covariance functions are derived in the Appendix, as well as formulae for their derivatives at zero. Inserting these values into the above equations we get:

\[ 1/a^2 = 18.25\sqrt{\pi}/\sigma^5 \tag{4.3.89} \]

\[ 1/b^2 = \left[1.579\sigma^4 + 4.345\sigma^2 + 16.11\right]\sqrt{\pi}/\sigma^5 \tag{4.3.90} \]

We have that:

\[ p_c(\eta_2,\eta_1\,|\,\xi=0) = \frac{ab}{2\pi}\,e^{-(b^2\eta_2^2 + a^2\eta_1^2)/2} \tag{4.3.91} \]

If we let $\mu = \eta_2/\eta_1$ we get:

\[ p_m(\mu,\eta_1) = \frac{ab}{2\pi}\,|\eta_1|\,e^{-\eta_1^2(a^2 + b^2\mu^2)/2} \tag{4.3.92} \]

Integrating out the dependence on $\eta_1$ gives us:

\[ p_\mu(\mu) = \frac{a/b}{\pi\left[\mu^2 + a^2/b^2\right]} \tag{4.3.93} \]

Making the transformation $e_d = -\beta_1\mu$ we obtain:

\[ p_{ed}(e_d) = \frac{k}{\pi\left[e_d^2 + k^2\right]} \tag{4.3.94} \]

where:

\[ k = -\beta_1\,a/b = -\beta_1\sqrt{0.881 + 0.238\sigma^2 + 0.0864\sigma^4} \tag{4.3.95} \]

As in the one dimensional case the error has a Cauchy distribution.

In figure 4.12 we plot $p_{ed}(e_d)$ for a range of $\beta_1$ values, holding $\sigma = 1$. In figure 4.13 we plot $p_{ed}$ for a range of $\sigma$ values, holding $\beta_1 = -0.1$. Notice that $p_{ed}$ is relatively insensitive to the value of $\sigma$ over quite a large range of $\sigma$ values. Recall that in the one-dimensional case, $p_{ed}$ was not a function of $\sigma$ at all.
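The insensitivity to $\sigma$ noted above can be read directly off the polynomial inside the square root of (4.3.95). A small numeric sketch (the $\beta_1$ value is an assumption matching the figures):

```python
import math

beta1 = -0.1

def k_zc(sigma):
    """Cauchy scale parameter for 2-D zero-crossing features, eq. (4.3.95)."""
    return abs(beta1) * math.sqrt(0.881 + 0.238 * sigma**2 + 0.0864 * sigma**4)

vals = [k_zc(s) for s in (0.5, 1.0, 2.0, 4.0)]
assert vals == sorted(vals)        # k grows monotonically with sigma
assert vals[1] / vals[0] < 1.2     # nearly flat at small sigma ...
assert vals[3] / vals[1] > 2.0     # ... but growing roughly as sigma^2 later
```

For $\sigma$ below about 1 the constant term dominates, which is why the plotted densities barely move; beyond that the $\sigma^4$ term takes over and the error dispersion widens.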
was not a function of a at all.  values.  1. In figure e d  Recall  is relatively that  in the  FIGURE 4.12 The probability density of disparity measurement error for zero crossing features, o = l , p\ = -0.1 to -0.4.  Zero C Feature Shift FIGURE 4.13 The probability density of disparity measurement error for zero crossing features, p\ = -0.1, a = 1, to 4.  139  Quantized  Zero  Crossing  Disparity  Measurement  It can be seen that the form of the quantized disparity measurement mass  function  is the same  as for the one-dimensional  case.  The form  error probability of k is different  however, as seen above. In figure 4.14 we plot P j(n) as a function of the disparity gradient ec  for n =  0, 1, 2 and 3 with q =  function of o  for n =  l/\/2, and o  =  1. In figure 4.15 we plot p j as a £(  0, 1, 2 and 3, holding p\=-0.1  probabilities are relatively insensitive to changes in o  and q =  1//2.  Note that these  over a large range of o values. This is  due to the fact that the size of the pixels increase linearly with a.  Extremum  Features  The  probability density  function of the disparity measurement  feature case has the same form matrix  B  functions  that  is used  (equation  is different,  in the two cases.  4.3.94) as in the zero  however,  due to  error  for the extremum  crossing  feature case. The  the difference  As in the one dimensional  case  in the  autocovariance  the autocovariances  in the  extremum and zero crossing cases can be seen to be related by:  extremum  We can then calculate the new values of 1/a  l/a  :  (4.3.96)  zero crossing  2  and 1/b . Doing so we obtain  =  7035,/Tr/(64a )  =  / rr [ a (2.695) + o (20.87) + (99.49)] lo  2  (4.3.97)  7  and  1/b  Hence  2  4  2  9  (4.3.98)  o  *w  N value D O A  +  -0.1  -0.2  = = = =  0 1 2 3  -0.3  -0.5  -0.4  Disparity Gradient for ZC Features  FIGURE 4.14 The probability of an N pixel error for zero crossing features as function of p\ for a = l, q = I V 2 and N = 0,1.2 and 3.  
\[ k = -\beta_1\sqrt{0.9051 + 0.1898\sigma^2 + 0.0245\sigma^4} \tag{4.3.99} \]

FIGURE 4.15 The probability of an N pixel error for zero crossing features as a function of $\sigma$, for $\beta_1 = -0.1$, $q = 1/\sqrt{2}$ and N = 0, 1, 2 and 3.

We plot the resulting error function $p_{ed}(e_d)$ for $\sigma = 1$ and $\beta_1 = -0.1$ to $-0.5$ in figure 4.16, and in figure 4.17 we plot $p_{ed}(e_d)$ for $\sigma = 1, 2, 3$ and 4 with $\beta_1 = -0.1$ and $q = 1/\sqrt{2}$. Compare these graphs to the zero crossing feature case; they differ only slightly.

Quantized Extremum Disparity Measurement Error

The basic expression for the probability mass function $P_{ed}(n)$ is the same as for the zero crossing feature case, except that the expression for $k(\sigma)$ is different, as noted above.

In figure 4.18 we plot the probability of getting an N pixel disparity measurement error, when using extremum features, as a function of $\beta_1$ for $\sigma = 1$, given $q = 1/\sqrt{2}$. In figure 4.19 we plot the probability of getting an N pixel disparity measurement error, when using extremum features, as a function of $\sigma$ for $\beta_1 = -0.1$, given $q = 1/\sqrt{2}$.

FIGURE 4.16 The probability density function of the disparity measurement error for extremum features for $\beta_1 = -0.1$ to $-0.4$, with $\sigma = 1$.

FIGURE 4.17 The probability density function of the disparity measurement error for extremum features with $\beta_1 = -0.1$, for $\sigma = 1, 2, 3$ and 4, with $q = 1/\sqrt{2}$.

FIGURE 4.18 The probability of an N pixel disparity measurement error for extremum features as a function of $\beta_1$, for $\sigma = 1$ and $q = 1/\sqrt{2}$.
FIGURE 4.19 The probability of an N pixel disparity measurement error for extremum features as a function of $\sigma$, for $\beta_1 = -0.1$ and $q = 1/\sqrt{2}$.

Numbering error. Text for leaf 144 not produced.

4.4 - Reconstruction Error Analysis

In this section we will examine the errors that arise in the reconstruction of the disparity function from the sparsely distributed samples provided by the matching algorithm.

We will analyze the errors in detail for the non-uniform reconstruction method of chapter 3, described earlier. The interpolation scheme of Grimson (1981b) and Terzopoulos (1982) will not be analyzed here, partly due to the difficulty of doing so.

Uniform Reconstruction Error Analysis

Let us begin by examining the errors produced by the standard uniform two dimensional reconstruction process, as derived by Petersen and Middleton (1962). It will be seen that the error analysis for this case will generalize to produce an error analysis for the non-uniform cases. The reconstruction equation of Petersen and Middleton was written down in chapter 3, and is repeated below:

\[ \hat f(\bar x) = \sum_{\bar x_s} f(\bar x_s)\,g(\bar x-\bar x_s) \tag{4.4.1} \]

where

\[ \{\bar x_s\} = \{\bar x : \bar x = l_1\bar v_1 + l_2\bar v_2\ ;\ l_1,l_2 = 0,\pm1,\pm2,\ldots\} \tag{4.4.2} \]

and $\bar v_1$, $\bar v_2$ are the basis vectors of the sampling lattice. It can be shown that the spectrum of the sampled function

\[ f_s(\bar x) = \sum_{\bar x_s} f(\bar x)\,\delta(\bar x-\bar x_s) \tag{4.4.3} \]

is given by

\[ F_s(\bar\omega) = \sum_{\bar\omega_s} F(\bar\omega+\bar\omega_s) \tag{4.4.4} \]
The above  conditions on G(w) and F(CJ) for exact reconstruction are illustrated graphically in figure 4.20. There are three situations in which the above reconstruction formula will not be exact The  first  situation  occurs  when  the  spectral  repetitions  sampled function f (x) overlap with the central figure 4.21.  Some of the  energy  in the  spectral  (i.e.  of  (\ \ ) u  2  F(c5)  =  in the  spectrum  of  the  (0,0)) repetition, as shown in  repetitions is added to the  central repetition passed by the filter. This causes an error in the reconstructed  energy of the value known  as aliasing error.  6  The  second  situation in which  reconstruction  filter, g(x), does not pass all of the central  errors  arise  is  when the  spectral repetition. In other  reconstruction  words part of the  energy in the function that we are trying to reconstruct is filtered out Sometimes this is not such a bad thing because some of the aliased. energy from the other spectral repetitions may also be filtered out  The reconstruction filter can also have too wide a bandwidth, so that  even if the function has a high enough sample density to eliminate all aliasing (or overlap of the spectral repetitions), the filter may pass some of the energy in the spectral repetitions anyway. Both of these situations are depicted in figure 4.22.  In practice this error, which we  call the Filtering Error, can be avoided if the maximum region of support of F(c3)  and the  sampling lattice is known, as one can then design an appropriate filter.  The aliasing error derives its name from the fact that a given frequency component of the spectrum of f(x), when replicated, actually becomes a different frequency component Thus these frequency components are actually under an assumed name (frequency) or alias. 6  147  F I G U R E 4.20 The shapes of the regions of support for F(CJ) and G(ZJ) for exact reconstruction of f(x) from its samples.  SUPPORT OF  •-ALIASING F I G U R E 4.21  density.  
FIGURE 4.22 The effect of having an improper reconstruction filter. Note that the central repetition is partly filtered out and that parts of the other repetitions are passed by the filter.

The third type of error that can arise, which will be seen to be a form of filtering error, occurs when a finite subset of the points in $\{\bar x_s\}$ is used to perform the reconstruction of a value of $f(\bar x)$. This error is known as truncation error (because it arises when the reconstruction summation is truncated). The effect of limiting the number of sample points used in the reconstruction is equivalent to reconstruction with a filter that is a spatially truncated version of the optimum reconstruction filter. Since the support of this truncated filter in the space domain is bounded, its support in the frequency domain cannot be bounded. Hence this truncation error is seen to actually be a filtering error, since all frequencies will be passed to some extent by the filter, including those present in the non-central spectral repetitions of $f(\bar x)$.

We can write down an expression that includes the effects of all the above errors on the reconstructed function as follows:
The total energy in the filter function can be seen to be given simply by:  A  The  second  energy  R  =  // |g(x)| dS 2  R  relation  comes  =  from  /;!jG(^)| c£3  (4.4.7)  2  Parseval's  in G(c3) for |c3|>b is equivalent  theorem.  The condition that  to the condition that  we minimize the  the energy  in G(a>) be  maximized for |o>| <b. Let us define this energy by A^. This has the following value:  Afl  where  *  =  J/ G*(w)G(w)ck3  (4.4.8)  n  _  indicates the complex conjugate. Our task is therefore to determine the function g(x)  that maximizes AQ/A . We can write A ^ r  =  ;; dH/(16rr ) Q  This can be simplified to:  4  /  in terms of g(x) as follows:  f^e+J^V(x)c£ SSZ^'^VGWY  (4.4.9)  150  =  (l/47r )/J//^K s (y-x)g(y)g*(x)cSdy  (4.4.10)  2  where  K (y-x) g  =  (l/47r );; e"J '^- )dH J  R  H  (4.4.11)  J  Slepian (1964) shows that the maximum of A ^ / A j ^  is equal to the largest eigenvalue of the  following integral equation, and that the corresponding eigenfunction of this integral equation is the optimal filter function.  W(y)  =  J7R  Ks(y-x>//(x)dx  for |y|<a  (4.4.12)  Slepian also shows that this equation can be rewritten in the form:  x  ai/'Cy) - SS  R  where  c=b/a  c^ ^V(x)dx  and a = 2n\/\/c. Note  for |y|<a  (4.4.13)  that, for the case a = b = ° °  (i.e. g(x) is non-truncated  and non-bandlimited) we get the equation  ag(y)  or ag(y) = the  =  UZ  a  eJ ' g(x)cE (x  (4.4.14)  y)  G(y). The only function whose Fourier transform has the same form as itself is  Gaussian. That the Gaussian is the optimum function in terms of jointly localizing the  energy in both the space (or time) and frequency domains is a well known result. However, once we apply the condition that the function be space limited, the Gaussian is no longer the  optimum  function.  In order  to  find  this  function  we must  solve  equation. 
Slepian (1964) showed that this equation can be rewritten as:

\gamma_{N,n} \phi_{N,n}(r) = \int_0^1 J_N(c r r') \sqrt{c r r'} \phi_{N,n}(r') dr',  0 < r < 1    (4.4.15)

where

\phi_{N,n}(r) = \sqrt{r} R_{N,n}(r)    (4.4.16)

and

\gamma_{N,n} = \sqrt{c} \alpha_{N,n} / (2\pi j^N)    (4.4.17)

for n, N = 0, 1, 2, ..., and where we define

\psi_{N,n}(r, \theta) = R_{N,n}(r) \cos(N\theta)    (4.4.18)

to be the polar representation of \psi_{N,n}(x). Slepian also shows that the solution of the above integral equation is also the solution of the following Sturm-Liouville equation:

(1 - r^2)\phi'' - 2r\phi' + [(1/4 - N^2)/r^2 - c^2 r^2 + \chi]\phi = 0    (4.4.19)

The bounded solutions of this equation, for arbitrary N, are known as the generalized prolate spheroidal functions (Slepian, 1964). Bounded solutions occur only for discrete values of \chi, given by \chi_{N,n}. The corresponding eigenvalues can be ordered, with |\gamma_{N,n+1}| < |\gamma_{N,n}| and |\gamma_{N+1,n}| < |\gamma_{N,n}|, so that the largest eigenvalue of the integral equation (4.4.13) is obtained when N = n = 0 (Slepian, 1964, p. 3039). Thus we have that:

g(x) = \psi_{0,0}(x) = \phi_{0,0}(|x|)/\sqrt{|x|}    (4.4.20)

Now it remains to find an expression for \phi_{0,0}(r). No closed form for this function exists in the literature, and we can only write down approximations. Slepian (1964) provides a number of approximations to this function that are valid over different ranges of r. For large c and small r (|r| < c^{-1/4}) he shows that:

\phi_{0,0}(r) = k \sqrt{r} e^{-r^2 c/2} L_0^{(0)}(r^2 c),  for |r| < c^{-1/4}    (4.4.21)

where L_0^{(0)}(x) = 1 is the Laguerre polynomial of degree zero and k is a constant. We can now write:

g(x) = k e^{-c|x|^2/2},  for |x| < c^{-1/4}    (4.4.22)

Note that this is the Gaussian function which, as we have seen earlier, is the exact solution when there are no spatial or frequency domain constraints.
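The quality of the Gaussian approximation (4.4.22) can be checked numerically: a good truncated reconstruction filter should concentrate most of its spectral energy inside the disk \Omega. The following sketch (an illustration only, not part of the derivation; the grid size and the values of c are arbitrary choices of ours) computes the concentration ratio A_\Omega / A_R for the spatially truncated Gaussian, taking a = 1 so that b = c:

```python
import numpy as np

# Energy concentration of the truncated Gaussian g(x) = exp(-c|x|^2/2),
# cut off at |x| = a = 1.  With a = 1 the frequency-domain disk has
# radius b = c (since c = b/a), so we measure the fraction of spectral
# energy inside |w| < c.
a = 1.0
n, half = 512, 8.0                      # grid points and spatial half-width
x = np.linspace(-half, half, n)
X, Y = np.meshgrid(x, x)
r2 = X**2 + Y**2

ratios = {}
for c in (5.0, 10.0, 20.0):
    g = np.where(r2 <= a**2, np.exp(-c * r2 / 2.0), 0.0)   # spatial truncation
    G = np.fft.fftshift(np.fft.fft2(g))
    w = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(n, d=x[1] - x[0]))
    WX, WY = np.meshgrid(w, w)
    energy = np.abs(G)**2
    ratios[c] = energy[WX**2 + WY**2 <= c**2].sum() / energy.sum()
    print(f"c = {c:5.1f}   fraction of spectral energy inside Omega: {ratios[c]:.4f}")
```

As expected, the concentration is high and improves as c grows, since both the spatial tail removed by the truncation and the out-of-band part of the Gaussian spectrum scale roughly as e^{-c}.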
When we apply such constraints, requiring that the function be both bandlimited and spacelimited, the Gaussian is only an approximate solution. Let us now rescale the spatial axes (recall that we earlier scaled so that a = 1), substituting x -> x/a and c -> a^2 c, to give:

g(x) = k e^{-c|x|^2/2},  for |x| < (a^2 c)^{-1/4}    (4.4.23)

This result is similar to the one obtained by Shanmugan et al (1979), later corrected by Lunscher (1983), who found the one dimensional filter whose step response had maximum energy in the vicinity of the step. A question arises as to the validity of this approximation beyond the rather restrictive range over which it is strictly defined (i.e. |x| < (a^2 c)^{-1/4}). Lunscher (1983) does not say anything about this, but apparently assumes that the approximation can be extended to |x| < a (recall that by definition g(x) = 0 for |x| > a). Shanmugan et al (1979) cite a paper by Streifer (1969), claiming that he shows that little error is incurred by extending the range of validity of the above approximation. However, Shanmugan et al misread Streifer's remarks. Streifer says that the approximation can be extended because, in his application, the prolate spheroidal function is multiplied by a function which decreases rapidly for x > c^{-1/4}; thus, in his application, extending the approximation does not incur much additional error. In our application (as well as Lunscher's and that of Shanmugan et al) we must be more careful in extending our approximations. Actually, we need not extend the above approximation past its region of validity at all, since Slepian (1964, p. 3027) provides, in addition to the above approximation, approximations which, taken together, are valid over the range from |x| = c^{-1/4} to |x| = a.
These can be paraphrased as follows:

g(x) = k e^{ac(\sqrt{a^2 - |x|^2} - a)} / [(a + \sqrt{a^2 - |x|^2})(a^2 - |x|^2)^{1/2}]^{1/2}    (4.4.24)

valid for (a^2 c)^{-1/4} < |x| < a - (a^2 c)^{-1/4}, and

g(x) = k e^{-a^2 c} I_0(ac\sqrt{a^2 - |x|^2})    (4.4.25)

valid for a - (a^2 c)^{-1/4} < |x| < a. For large values of c the first of these approximations covers most of the range in which we are interested (i.e. 0 to a). A plot of g(r) using these approximations, for r in the range 0 to 1, for a = 1 and various values of c, is given in figure 4.23. In figure 4.24 we plot, under the same conditions, the filter g(r) that results from extending the approximation of (4.4.23) past its range of validity. It is seen that the filter that results from using equation (4.4.23) for all x is not appreciably different from the filter that arises from the strictly valid approximations. Thus, in the rest of the thesis, we will follow the intuition of Lunscher and Shanmugan et al and use equation (4.4.23) to define the approximately optimal filter (under truncation). It should be pointed out at this time that the specification of this filter is only to within a multiplicative constant. The problem of scaling the filter properly will be discussed later in this section, when we derive expressions for the mean square reconstruction error.

FIGURE 4.24 A plot of Slepian's first approximation to the optimum filter extended past its region of strict validity.

Derivation of the general reconstruction error expression

We will now derive a formula for the reconstruction error for the general case (of linear filter reconstruction). Since the disparity functions that we will be trying to reconstruct are, in practice, of a non-deterministic nature, we will treat the disparity function, f(x), as a stochastic process, and determine an expression for the mean square reconstruction error.
This has been done, in part, by Petersen and Middleton (1962), who show that, for the case of ideal filtering, the minimum mean square reconstruction error is given by:

E{[f(x) - f_R(x)]^2} = (1/4\pi^2) \int\int_{-\infty}^{\infty} [\Phi(\omega) - (G(\omega)/Q) \sum_s e^{jx\cdot\omega_s} \Phi(\omega + \omega_s)] d\omega    (4.4.26)

where \Phi(\omega) is the Fourier transform of the autocovariance function of f(x) (i.e. the power spectral density), and Q is as defined in equation (3.6.6) of chapter 3. G(\omega) is the Fourier transform of the ideal reconstruction filter and is defined by:

G(\omega) = Q\Phi(\omega) / [\sum_s \Phi(\omega + \omega_s)]    (4.4.27)

The set {\omega_s} is the set of the frequency domain repetitions caused by the sampling (this set is the dual of the spatial sample set defined by equation (3.6.1), as shown by equation (3.6.3)). If the power spectral density of f(x) is sufficiently bandlimited then the minimum mean square error vanishes. (Note: minimum in this context means that the mean square error produced using the above filter G(\omega) is less than or equal to the mean square error produced using any other filter.) Petersen and Middleton also derive the minimum mean square reconstruction error uniformly averaged over a sampling cell. This quantity may be useful in the analysis of the reconstruction errors in the non-uniform case, where the size of the sampling cell varies. This averaged mean square reconstruction error is given by:

E_A{[f(x) - f_R(x)]^2} = (1/4\pi^2) \int\int_{-\infty}^{\infty} \Phi(\omega)[1 - G(\omega)/Q] d\omega    (4.4.28)

However, this equation includes the effects of aliasing only, as it assumes that the ideal reconstruction filter is being used. In practice, as we have seen, we may not be reconstructing with the ideal reconstruction filter. In this case the above formulae for the mean square errors supply only a lower bound. We need a more general formula for the mean square reconstruction error.
Again, such an equation is supplied by Petersen and Middleton (1962) and is given below:

E{[f(x) - f_R(x)]^2} = K(0) - 2\sum_s K(x - x_s) g(x - x_s) + \sum_{s_1}\sum_{s_2} K(x_{s_1} - x_{s_2}) g(x - x_{s_1}) g(x - x_{s_2})    (4.4.29)

where K(x) is the autocovariance function of f(x). Petersen and Middleton do not, however, provide a general frequency domain representation of the mean square error such as was given for the ideal filter case (4.4.28). They do show that the first two terms of the above expression can be put in terms of frequency domain quantities as follows:

K(0) - 2\sum_s K(x - x_s) g(x - x_s) = (1/4\pi^2) \int\int_{-\infty}^{\infty} [\Phi(\omega) - 2(G(\omega)/Q) \sum_s e^{-jx\cdot\omega_s} \Phi(\omega - \omega_s)] d\omega    (4.4.30)

It is the third term in equation (4.4.29) that they do not provide a frequency domain expression for. Let us call this third term T, for simplicity. We will now derive a frequency domain expression for T. We can expand T in terms of delta functions as follows, using the integral properties of the delta function:

T = \int\int\int\int_{-\infty}^{\infty} K(r - s) g(x - r) g(x - s) \sum\sum \delta(r - x_{s_1}) \delta(s - x_{s_2}) dr ds    (4.4.31)

It can be shown (using the result of Appendix A in Petersen and Middleton, 1962) that:

\sum\sum \delta(r - x_{s_1}) \delta(s - x_{s_2}) = (1/Q^2) \sum\sum e^{-jr\cdot\omega_{s_1}} e^{-js\cdot\omega_{s_2}}    (4.4.32)

where {\omega_s} is the dual set to {x_s}, as given by equation (3.6.3). Thus we can write:

T = (1/Q^2) \sum\sum \int\int\int\int_{-\infty}^{\infty} K(r - s) g(x - r) g(x - s) e^{-jr\cdot\omega_{s_1}} e^{-js\cdot\omega_{s_2}} dr ds    (4.4.33)

If we make the change of variables y = r - x and z = s - x, and use the fact that g(x) and K(x) are even functions, we obtain:

T = a \int\int\int\int_{-\infty}^{\infty} K(y - z) g(y) g(z) e^{-jy\cdot\omega_{s_1}} e^{-jz\cdot\omega_{s_2}} dy dz    (4.4.34)

where we have defined, for simplicity, the operator a to be:

a = (1/Q^2) \sum\sum e^{-jx\cdot(\omega_{s_1} + \omega_{s_2})}    (4.4.35)

Separating out the functions that depend only on y gives us:

T = a \int\int_{-\infty}^{\infty} g(y) e^{-jy\cdot\omega_{s_1}} [\int\int_{-\infty}^{\infty} K(y - z) g(z) e^{-jz\cdot\omega_{s_2}} dz] dy    (4.4.36)

The integral in the brackets can be recognized as a convolution.
Hence we can write:

T = a \int\int_{-\infty}^{\infty} g(y) e^{-jy\cdot\omega_{s_1}} [K(y) * g(y) e^{-jy\cdot\omega_{s_2}}] dy    (4.4.37)

Replacing the bracketed term by its Fourier transform representation gives:

T = a/(4\pi^2) \int\int_{-\infty}^{\infty} g(y) e^{-jy\cdot\omega_{s_1}} [\int\int_{-\infty}^{\infty} \Phi(\omega) G(\omega - \omega_{s_2}) e^{jy\cdot\omega} d\omega] dy    (4.4.38)

Rearranging this equation gives us:

T = a/(4\pi^2) \int\int_{-\infty}^{\infty} \Phi(\omega) G(\omega - \omega_{s_2}) [\int\int_{-\infty}^{\infty} g(y) e^{-jy\cdot\omega_{s_1}} e^{jy\cdot\omega} dy] d\omega    (4.4.39)

Evaluation of the bracketed integral as a Fourier transform results in:

T = a/(4\pi^2) \int\int_{-\infty}^{\infty} \Phi(\omega) G(\omega - \omega_{s_1}) G(\omega - \omega_{s_2}) d\omega    (4.4.40)

Substituting the expression for a yields:

T = (1/(4\pi^2 Q^2)) \int\int_{-\infty}^{\infty} \Phi(\omega) \sum\sum e^{-jx\cdot(\omega_{s_1}+\omega_{s_2})} G(\omega - \omega_{s_1}) G(\omega - \omega_{s_2}) d\omega    (4.4.41)

The total mean square error, E, for the general case of aliasing and truncation can now be written, and is done so below:

E = (1/4\pi^2) \int\int_{-\infty}^{\infty} [\Phi(\omega) - 2(G(\omega)/Q) \sum_s e^{-jx\cdot\omega_s} \Phi(\omega - \omega_s) + (1/Q^2) \Phi(\omega) \sum\sum e^{-jx\cdot(\omega_{s_1}+\omega_{s_2})} G(\omega - \omega_{s_1}) G(\omega - \omega_{s_2})] d\omega    (4.4.42)

This formula is valid for any filter function g(x) and for any random process f(x). We can obtain an expression for the average mean square error over a single sample cell, \Gamma, as was done by Petersen and Middleton (1962) in the case of ideal filtering. Let us denote this average mean square error by \bar{E}. It can be seen that \bar{E} is given by:

\bar{E} = K(0) - (1/Q) \int_\Gamma [2\sum_s K(x - x_s) g(x - x_s) - \sum_{s_1}\sum_{s_2} K(x_{s_1} - x_{s_2}) g(x - x_{s_1}) g(x - x_{s_2})] dx    (4.4.43)

Making a change of variables, and noting that the summation of integrals over the elementary cells \Gamma is the same as integrating over the entire space, allows us to write:

\bar{E} = K(0) - (2/Q) \int\int_{-\infty}^{\infty} K(x) g(x) dx + (1/Q) \sum_s K(x_s) \int\int_{-\infty}^{\infty} g(x) g(x - x_s) dx    (4.4.44)

Let us define the function r(x) as follows:

r(x) = 1 for x in \Gamma, and r(x) = 0 elsewhere.    (4.4.45)

Then, using equation (4.4.42), we can write the third term in equation (4.4.29) as:

(1/(4\pi^2 Q^3)) \sum\sum \int\int_{-\infty}^{\infty} r(x) e^{-jx\cdot(\omega_{s_1}+\omega_{s_2})} dx \int\int_{-\infty}^{\infty} \Phi(\omega) G(\omega - \omega_{s_1}) G(\omega - \omega_{s_2}) d\omega    (4.4.46)

Recognizing the first integral in the above expression as a Fourier transform allows us to rewrite this as:

(1/(4\pi^2 Q^3)) \sum\sum R(\omega_{s_1} + \omega_{s_2}) \int\int_{-\infty}^{\infty} \Phi(\omega) G(\omega - \omega_{s_1}) G(\omega - \omega_{s_2}) d\omega    (4.4.47)

Petersen and Middleton (1962, Appendix D) show that R(\omega_s) = Q for \omega_s = 0 and is zero for all other \omega_s in {\omega_s}. This means that, since \omega_{s_1} + \omega_{s_2} is a member of {\omega_s}, R(\omega_{s_1} + \omega_{s_2}) is equal to Q for \omega_{s_1} = -\omega_{s_2} and is zero elsewhere. This allows us to rewrite (4.4.47) as:

(1/(4\pi^2 Q^2)) \sum_s \int\int_{-\infty}^{\infty} \Phi(\omega) G(\omega - \omega_s) G(\omega + \omega_s) d\omega    (4.4.48)

Combining this result with equation (4.4.30) gives us the following expression for the general averaged mean square error:

\bar{E} = (1/4\pi^2) \int\int_{-\infty}^{\infty} \Phi(\omega) [1 - (2/Q) G(\omega) + (1/Q^2) \sum_s G(\omega - \omega_s) G(\omega + \omega_s)] d\omega    (4.4.49)

It can be seen that this expression reduces to (4.4.28) when G(\omega) is the ideal reconstruction filter. Notice the difference between this expression and the one given previously (equation (4.4.28)) for the ideal filter case: with a non-ideal filter (e.g. a truncated one), the average mean square error depends on the sample set, whereas in the ideal case it did not.

Optimal scaling of the reconstruction filter

Let us now consider the problem of scaling the filter. As we have seen earlier, the optimum (under truncation) filter is only specified to within a multiplicative constant. What should this constant be? To answer this question, let us suppose that G(\omega) can be written as kS(\omega), where S(0) = 1. It makes sense to determine k such that \bar{E} is minimized. We can write:

\bar{E} = (1/4\pi^2) \int\int_{-\infty}^{\infty} \Phi(\omega) [1 - (2k/Q) S(\omega) + (k^2/Q^2) \sum_s S(\omega - \omega_s) S(\omega + \omega_s)] d\omega    (4.4.50)

Differentiating with respect to k and setting the result equal to zero to find the extremal point yields:

k = Q \int\int_{-\infty}^{\infty} \Phi(\omega) S(\omega) d\omega / \int\int_{-\infty}^{\infty} \Phi(\omega) \sum_s S(\omega + \omega_s) S(\omega - \omega_s) d\omega    (4.4.51)

Often we will not have complete information about the process to be reconstructed. In such a case we will find k such that \bar{E} is minimized for some assumed process.
For example, we could assume that f(x) is a constant. Then k is given by:

k = Q / \sum_s S^2(\omega_s)    (4.4.52)

From the above discussion we can see that, in general, the reconstruction error depends on three factors:
1. The power spectral density \Phi (or equivalently the autocovariance function) of f(x).
2. The reconstruction filter G(\omega).
3. The sample set {x_s} (or its frequency domain dual set {\omega_s}).
Variation of any of these parameters will cause a change in the average mean squared error.

Example of the reconstruction error computation

We will now illuminate the details of the preceding discussion with an example. The conditions of this example are similar to the conditions of some of the experiments described in chapter 5. Let the power spectral density of the process be given by a cylindrical Gaussian function:

\Phi(\omega) = A^2 \sigma_s \sqrt{\pi} e^{-\omega_1^2 \sigma_s^2/4} \delta(\omega_2)    (4.4.53)

where \omega = (\omega_1, \omega_2). Let the filter be given by the approximate filter defined by (4.4.23), except that it extends to \infty. That is:

g(x) = k e^{-c|x|^2/2}    (4.4.54)

Thus:

G(\omega) = k (2\pi/c) e^{-|\omega|^2/(2c)}    (4.4.55)

We will determine k so that the mean square error is minimized when \sigma_s is \infty (i.e. when f(x) is constant). Doing this yields:

k (2\pi/c) = Q / \sum_s e^{-|\omega_s|^2/c}    (4.4.56)

Let us define S as follows:

S = \sum_s e^{-|\omega_s|^2/c}    (4.4.57)

Then \bar{E} can be written:

\bar{E} = (A^2 \sigma_s \sqrt{\pi}/4\pi^2) \int_{-\infty}^{\infty} e^{-\omega_1^2 \sigma_s^2/4} [1 - (2/S) e^{-\omega_1^2/(2c)} + (1/S) e^{-\omega_1^2/c}] d\omega_1    (4.4.58)

This integral can be evaluated to give:

\bar{E} = A^2/(2\pi) [1 + (1/S){1/\sqrt{1 + 4/(c\sigma_s^2)} - 2/\sqrt{1 + 2/(c\sigma_s^2)}}]    (4.4.59)

The RMS error (\sqrt{\bar{E}}) is seen to be proportional to A, the amplitude of the function being reconstructed. This dependence can be seen in the experiments described in chapter 5. Notice that \bar{E} does not go to zero as \sigma_s goes to infinity (except for c = 0), but approaches (A^2/2\pi)(1 - 1/S).
This is due to the fact that the exponential filter lets in some energy from the non-central spectral repetitions that are created by the sampling. This is an example of the filtering error described earlier in this section (not to be confused with the filtering error described in section 4.3, which refers to the effects of \nabla^2 G filtering). However, as the distance between samples decreases to zero, S goes to one and \bar{E} goes to zero. As \sigma_s approaches zero, \bar{E} approaches A^2/2\pi for all values of c (except zero).

Let us assume that we have regular hexagonal sampling. Then, from chapter 3, we have that:

\omega_s = l_1 u_1 + l_2 u_2    (4.4.60)

for l_1, l_2 taking on all integer values. The vectors u_1 and u_2 are obtained from the spatial sampling basis using (3.6.3). If we let the distance between nearest neighbour samples be 2a, then the resulting frequency domain sample basis vectors can be computed to be:

u_1 = (\pi/a)(1, -1/\sqrt{3}) and u_2 = (\pi/a)(0, 2/\sqrt{3})    (4.4.61)

We can now write S as follows:

S = \sum\sum e^{-4\pi^2 (l_1^2 - l_1 l_2 + l_2^2)/(3a^2 c)}    (4.4.62)

In figure 4.25 we plot the average RMS reconstruction error \sqrt{\bar{E}} as a function of \sigma_s for four different values of the filter constant c, given that A = \sqrt{2\pi} and a = 1/\sqrt{3}. Figures 4.26 and 4.27 are similar to figure 4.25, except that a = 1/\sqrt{6} and 1/\sqrt{12} respectively. Some conclusions can be immediately drawn. The first is that reducing the sample spacing, a, reduces the error. Secondly, increasing the value of the filter constant c reduces the error for small values of \sigma_s, while decreasing c reduces the error for large values of \sigma_s. It is also evident that increasing the separation between samples results in increased error for large \sigma_s, because the filter passes the energy in the spectral repetitions, which move closer to the frequency plane origin as the samples get farther apart.
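The curves of figure 4.25 can be reproduced directly from equations (4.4.59) and (4.4.62). The following sketch (assuming the figure-4.25 settings A = \sqrt{2\pi} and a = 1/\sqrt{3}; the function names are our own) evaluates the average mean square error for a few values of \sigma_s and c:

```python
import numpy as np

A = np.sqrt(2 * np.pi)      # process amplitude used for figure 4.25
a = 1 / np.sqrt(3)          # half the nearest-neighbour sample spacing

def S_hex(a, c, nmax=20):
    # Equation (4.4.62): sum over the hexagonal dual lattice, truncated at |l| <= nmax.
    l1, l2 = np.meshgrid(np.arange(-nmax, nmax + 1), np.arange(-nmax, nmax + 1))
    return np.sum(np.exp(-4 * np.pi**2 * (l1**2 - l1 * l2 + l2**2) / (3 * a**2 * c)))

def mean_sq_err(sigma, c, a):
    # Equation (4.4.59): average mean square reconstruction error.
    S = S_hex(a, c)
    return (A**2 / (2 * np.pi)) * (1 + (1 / S) * (1 / np.sqrt(1 + 4 / (c * sigma**2))
                                                  - 2 / np.sqrt(1 + 2 / (c * sigma**2))))

for c in (5, 10, 15, 20):
    rms = [np.sqrt(mean_sq_err(s, c, a)) for s in (1.0, 5.0, 25.0)]
    print(f"c = {c:2d}   RMS error at sigma = 1, 5, 25:  "
          + ", ".join(f"{r:.4f}" for r in rms))
```

The limiting values check out: the error tends to A^2/2\pi as \sigma_s -> 0, and to (A^2/2\pi)(1 - 1/S) as \sigma_s -> \infty.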
FIGURE 4.25 The average RMS reconstruction error for a Gaussian process and filter, with a = 1/\sqrt{3}, for c = 5, 10, 15 and 20.

FIGURE 4.26 The average RMS reconstruction error for a Gaussian process and filter, with a = 1/\sqrt{6}, for c = 5, 10, 15 and 20.

FIGURE 4.27 The average RMS reconstruction error for a Gaussian process and filter, with a = 1/\sqrt{12}, for c = 5, 10, 15 and 20.

Nonuniform Sampling

Let us now consider the case of the reconstruction error produced by the warping, or transformation, method described in chapter 3 for nonuniformly distributed samples. Recall that this method involved making a coordinate transformation based on the sample distribution, in such a way that the samples in the new coordinate system were uniformly distributed, so that the standard uniform reconstruction method of Petersen and Middleton (1962) could be used. The transformed version, h(x), of the function f(x) to be reconstructed is related to f(x) as follows:

h(x) = f(\gamma^{-1}(x))    (4.4.63)

where \gamma(x) is the transformation from uniform to nonuniform coordinates. To determine the average mean square reconstruction error for a given reconstruction filter g(x), we need only determine the power spectral density, \Phi_h(\omega), of h(x) and then use equation (4.4.49). Note that the set {\omega_s} is fixed for all cases, as described in chapter 3.6. Let us assume that we know the power spectral density, \Phi_f(\omega), of the function, f(x), that is to be reconstructed. The question to be answered now is: how can \Phi_h(\omega) be obtained from \Phi_f(\omega)?
For arbitrary transformations this is an intractable problem. However, by making some assumptions about the transformation we can obtain some representative results. For example, in the development of the heuristic transformation algorithm described in chapter 3.7, we assumed that the nonuniform sampling lattice arose from perturbing a uniform lattice slightly. In this case we can model the transformation as follows:

\gamma^{-1}(x) = Ax + B(x)    (4.4.64)

where B(x) is some random vector process and A is a constant 2x2 matrix. We can, without loss of generality, assume A to be the identity matrix, I, by suitably rotating, scaling and translating the target (e.g. I space in chapter 3) coordinate system. Thus we can write:

h(x) = f(x + B(x))    (4.4.65)

Let us assume that both f(x) and B(x) are stationary and zero mean processes. Then the autocovariance (or autocorrelation) functions of f(x) and h(x) can be written as follows:

\psi_f(\tau) = E{f(x)f(x + \tau)} = E{f(0)f(\tau)}    (4.4.66)

\psi_h(\tau) = E{h(x)h(x + \tau)} = E{h(0)h(\tau)}    (4.4.67)

Using the transformation between f and h we get:

\psi_h(\tau) = E{f(x + B(x)) f(x + \tau + B(x + \tau))} = E{f(0) f(B(x + \tau) - B(x) + \tau)}    (4.4.68)

Let us define a new random vector process, c(x, \tau), as follows:

c(x, \tau) = B(x + \tau) - B(x)    (4.4.69)

Since B(x) is stationary we can write:

c(x, \tau) = c(\tau) = B(\tau) - B(0)    (4.4.70)

Now let us obtain a single random value, or event, of c(\tau), and call this event c_1(\tau). For this event we can write \psi_{h_1}(\tau) in terms of \psi_f(\tau) as follows:

\psi_{h_1}(\tau) = E{f(0) f(c_1(\tau) + \tau)} = \psi_f(\tau + c_1(\tau))    (4.4.71)

However, in determining \psi_h(\tau) we must consider all possible values that c(\tau) can take on. Thus we must perform an expectation operation with respect to c(\tau). Doing so yields:

\psi_h(\tau) = E{\psi_f(\tau + c(\tau))} = \int\int_{-\infty}^{\infty} p_c(c') \psi_f(\tau + c') dc'    (4.4.72)

where p_c(c') is the probability density function of the random process c. If p_c(c') and \psi_f(\tau) are even functions, the above equation is seen to be a convolution.
Hence we can write:

\psi_h(\tau) = p_c(\tau) * \psi_f(\tau)    (4.4.73)

Taking the Fourier transform of this relationship allows us to write:

\Phi_h(\omega) = \phi_c(\omega) \Phi_f(\omega)    (4.4.74)

where \phi_c(\omega) is the characteristic function of the distribution p_c(\tau). Thus, if we know p_c(c') and \Phi_f(\omega) we can, in principle, determine \Phi_h(\omega), which can be used in equation (4.4.49) to compute the average mean square reconstruction error.

When modelling the random perturbation function, B(x), one must keep in mind the invertibility condition on \gamma^{-1}(x) (which states that the Jacobian determinant of the transformation must never equal zero). This condition means that B(x) must obey the following:

|I + \partial B(x)/\partial x| > 0    (4.4.75)

(We have arbitrarily selected the sign of the Jacobian to be positive; it could just as easily be negative, in which case the above quantity must always be less than zero.) This can be written as:

1 + \eta_{11} + \eta_{22} + \eta_{11}\eta_{22} - \eta_{12}\eta_{21} > 0    (4.4.76)

where \eta_{11} = \partial b_1/\partial x_1, \eta_{12} = \partial b_1/\partial x_2, \eta_{21} = \partial b_2/\partial x_1 and \eta_{22} = \partial b_2/\partial x_2.

One possible model for B(x) can be obtained by assuming the probability density of the \eta_{ij} to vanish outside the range (-\delta, \delta), where \delta lies in the range [-(1+\sqrt{3})/2, (\sqrt{3}-1)/2]. With this condition, the Jacobian of \gamma^{-1} is guaranteed to be always positive. Let us further assume that the power spectral densities of the b_i(x) vanish outside a disk of radius B in the frequency domain (i.e. the b_i(x) are bandlimited to B). Papoulis (1967) shows that the derivatives of a bandlimited deterministic function are themselves limited in magnitude. He extends this result to the case of random functions in the one dimensional case; however, the limit is now a limit on the RMS value of the derivatives.
His derivation can be extended to the two dimensional case to give the following limit on the mean square value of the derivatives of b_i(x):

E{|\partial^{k+r} b_i(x,y)/\partial x^k \partial y^r|^2} <= P B^{2k+2r}    (4.4.77)

where P is the power in b_i(x), defined as

P = (1/4\pi^2) \int\int_\Omega \Phi_b(\omega) d\omega    (4.4.78)

and \Omega is the region of support of the Fourier transform of b_i(x) (i.e. where it is non-zero). Note that this does not limit the maximum of the derivatives of the functions. However, if the b_i are Gaussian distributed, then so are the derivatives of the b_i. If we fix the B\sqrt{P} product so that it is less than \delta/2, then the probability that the magnitude of the derivatives will not exceed \delta will be 0.955 (as this is the 2\sigma value of the Gaussian distribution). Thus we can create a model for the perturbation noise B(x) by assuming the functions b_1(x) and b_2(x) to be Gaussian distributed and bandlimited, such that the product of the bandwidth and the square root of the power is less than \delta/2, where \delta is in the range [-(1+\sqrt{3})/2, (\sqrt{3}-1)/2].

Since B(x) has a Gaussian distribution, so does c(x). Thus the characteristic function \phi_c(\omega) is also Gaussian. This means that \Phi_h(\omega) for this perturbation model is simply a Gaussian weighted version of \Phi_f(\omega).

In the case of the sample sequence created by the zero crossing contours of \nabla^2 G filtered images, the above perturbation model is not valid. It appears to be very difficult to obtain a model for this case, primarily because of the high degree of correlation exhibited by the perturbation function in the zero crossing case (because zero crossings lie along continuous contours), and also because of the difficulty in finding a model which both satisfies the invertibility constraint on \gamma^{-1}(x) and results in a closed form expression for the characteristic function of c(x).
For this reason no further work was done on trying to obtain a model for the perturbation function for the zero crossing sample sequence. However, this section has developed the theory necessary to compute the average mean square reconstruction error for the case of reconstructing from non-uniformly distributed samples using the transformation method of chapter 3, even if it turns out that the application of this theory to some sample sequences may be an intractable problem.

4.5 - Matching Error Analysis

We will now consider the contribution to disparity error produced by incorrect matching of features. Marr and Poggio (1979) discuss this topic in terms of the probability of there being more than one feature in the matching region. The assumption implicit in their work is that if there is only one feature in the matching range then that feature is the correct match. However, as we will see in this section, this assumption is valid only if the disparity estimate error is very small. They did not consider the effect on the matching error of relatively large disparity estimate error which, as we have seen in the previous sections, can result from the action of a number of error processes. In this section we provide a detailed analysis of the matching error, for the case of zero crossing features with nearest neighbour matching (see footnote 7). This analysis brings out the dependence of the matching error on the error in the disparity estimate as well as on the size of the matching region.

The matching process is depicted in figure 4.28. The true match is a distance e_d away from the estimated match position, and is taken to be the origin of the epipolar line.
The matching region extends a distance r_m away from the estimated match position on either side. Ghost matches (defined as features, other than the true match, which lie within the matching region) lie at distances \tau_j from the true match. An incorrect match will be made if one of the ghost matches (say ghost match j) lies closer to the estimated match position than does the true match. In that case the matching error will be equal to \tau_j (and not zero). Note that reducing the size of the matching region will not necessarily reduce the probability of error. It will do so only if the error in the disparity estimate is smaller than the size of this reduced matching region. In general this will not be the case, and the matching region must be fairly large so that, if there is a relatively large error in the disparity estimate, there is still a chance that the true match (or a ghost match close to it) will be chosen, resulting in a reduced error. In general, what we require from our matching algorithms is that, as we proceed to higher levels of resolution, the variance of the disparity error gets smaller in absolute terms (or stays more or less constant in terms of our pixels, which get smaller as the resolution increases). To this end, we provide an analysis of the matching error to see if, in fact, the disparity error does converge (i.e. whether the matching error is smaller than the error in the initial disparity estimate).

Footnote 7: Recall from chapter 2 that in this form of matching, the match is taken to be the matching feature nearest to the estimated match location.

FIGURE 4.28 The matching process.

For the purposes of the following analysis let us assume that the matching region is of infinite extent. This is not as bold an assumption as it may seem, because of the way that the matching is done. Recall that the match is taken to be the closest matching feature to the estimated match position.
The only time that having a very large matching region will have an effect is when the true match is missing: instead of just having a 'no-match' situation (which would not affect the matching error), we would have a match which would always be incorrect. Note that this can also happen with smaller matching regions, but with the larger matching regions there is a slight possibility that the induced matching error will be quite large (if there are no ghost matches near to where the true match should be).

Probability density of the matching error

We can write the probability density of the matching error e_m as p(e_m) = Prob{there is a matching feature at \tau = e_m, given that there is a matching feature at 0 (the true match) and that there are no matching features in the region (e_m, 2e_d - e_m) (otherwise these features would have been selected as the match)}.

Let us define P_N(\tau) to be the probability density of the interval \tau between the feature at the origin and the Nth matching feature (where we order consecutive features along the epipolar line as ... -2, -1, 0, 1, 2 ..., with 0 corresponding to the feature at the origin; see footnote 8). Let us further define P_{e_m}(N) to be the probability of the matching feature being the Nth feature, given that it is e_m units from the zeroth feature (the one at the origin).

Footnote 8: This notation differs slightly from the notation of Longuet-Higgins (1963), who defined P_N(\tau) to be the probability density of the interval \tau between the feature at the origin and the (N+1)th matching feature. The reason we use the modified notation is to allow the definition of P_0(\tau), which is the probability density of the same feature being \tau units apart; this is obviously equal to one at \tau = 0 and zero everywhere else.

With these definitions it can be seen that the probability density of the matching error can now be written:

p_m(e_m) = \sum_{N=0}^{\infty} P_N(e_m) [1 - \int_{2e_d - e_m}^{e_m} P_{N-1}(\tau_1) d\tau_1],  for e_d < e_m < 2e_d    (4.5.1)

p_m(e_m) = \sum_{N=0}^{\infty} P_N(e_m) [1 - \int_{e_m}^{2e_d - e_m} P_{N+1}(\tau_1) d\tau_1],  for 0 < e_m < e_d    (4.5.2)

and is zero for all other values of e_m. For each value of N in the summation, the right hand side of these equations is the probability that the matching feature is the Nth feature, given that it is a distance e_m from the zeroth feature, times the probability that there is no other feature closer to the estimated match position. Note that in evaluating the probability of a feature closer to the estimated match position than the Nth feature, we need only consider the (N+1)th and the (N-1)th features. If e_m is greater than e_d we need only look at the (N-1)th feature, since the (N-i)th (i > 1) features can be closer to the estimated match position
Note that in evaluating the probability of a feature closer to the estimated match position than the N the N-l^  N+1 1  and the N - 1  features. If e  m  is greater than  feature, we need only consider e  d  we need  only look at the  feature since the N - i ^ ( i > l ) features can be closer to the estimated match position  This notation differs slightly from the notation by Longuet-Higgins (1963) who defined P(T) to be the probability density of the interval r between the feature at the origin and the N+1 matching feature. The reason we use the modified notation is to allow the definition of P 0 (T) which is the probability of the same feature being r units apart, which is obviously equal to one at T = 0 and zero everywhere else.  173  than the N case of e  m  m  feature only if the N - l  feature is as well. A similar argument holds for the  m  less than e^, in which case we need only consider the N + l ^  feature. This is  shown in figure 4.29. We have assumed that the occurence of a gap with no features in the interval ( e  ,2e ) is independent of the occurence  m  not strictly  valid  (except for large  order to obtain any mathematical  of the features at zero and e . This is  values of e )  but is an assumption  m  we must  make in  headway.  An interesting special case is that of e = 0. P (0) m  m  i s  m  probability that the correct  e  match will be found. From the above equations we can see that:  2e 1 - JT o P^rOdT, d  P (0) m  Note  that P (0) m  =  is a function  (4.5.3)  of the disparity  provides a number of approximations to  estimate  error,  e^.  Longuet-Higgins (1963)  for the case of zero crossing features of a one  dimensional random Gaussian distributed process,  f(x). 
For the case of matching zero crossings without regard to their sign, Longuet-Higgins gives the following approximation:

P₁(τ) = X(+,−;τ) − X(+,−,−;τ)   (4.5.4)

where:

X(+,−;τ) = (1/2π)·√(−ψ₀''/ψ(τ))·√(M₁₁M₂₂)·[√(1−ν₁₂²) − ν₁₂·arccos(ν₁₂)]/[ψ²(0)−ψ²(τ)]^(3/2)   (4.5.5)

X(+,−,−;τ) = (1/4π²)·∫₀^τ √(−ψ₀''/ψ(τ))·√(M₁₁M₂₂M₃₃)·[√|ν| + s₁a₁ + (s₂−π)a₂ + (s₃−π)a₃] dτ₁   (4.5.6)

FIGURE 4.29 The analysis of the matching error, given that the closest match to the estimated match position is the Nth feature.

where M_rs is the cofactor of the element ψ_rs in the n×n covariance determinant

D = |ψ_ij|,  i,j = 1,...,n   (4.5.7)

and ν is the determinant of the matrix of the normalized cofactors [ν_ij]   (4.5.8)

with n = 2 for X(+,−) and n = 3 for X(+,−,−), and where:

ν_ij = M_ij/√(M_ii·M_jj)   (4.5.9)

s₁ = arccos[(ν₃₁ν₁₂−ν₂₃)/√((1−ν₃₁²)(1−ν₁₂²))]   (4.5.10)

s₂ = arccos[(ν₁₂ν₂₃−ν₃₁)/√((1−ν₁₂²)(1−ν₂₃²))]   (4.5.11)

s₃ = arccos[(ν₂₃ν₃₁−ν₁₂)/√((1−ν₂₃²)(1−ν₃₁²))]   (4.5.12)

(These angles are to be taken in the range (0,π).)

a₁ = ν₁₂ν₂₃ + ν₃₁   (4.5.13)

a₂ = ν₂₃ν₃₁ + ν₁₂   (4.5.14)

a₃ = ν₃₁ν₁₂ + ν₂₃   (4.5.15)

ψ(τ) is the autocorrelation function of the random process f(x). The subscripts on the autocorrelation function in the above matrices have the following meaning:

ψ_ij = ψ(τ_i − τ_j)   (4.5.16)

For the case of matching zero crossings with the same contrast sign, Longuet-Higgins gives the following approximation:

P₁(τ) = X(+,+;τ) − X(+,+,+;τ)   (4.5.17)

where:

X(+,+;τ) = (1/2π)·√(−ψ₀''/ψ(τ))·√(M₁₁M₂₂)·[√(1−ν₁₂²) + ν₁₂·arccos(−ν₁₂)]/[ψ²(0)−ψ²(τ)]^(3/2)   (4.5.18)

X(+,+,+;τ) = (1/4π²)·∫₀^τ √(−ψ₀''/ψ(τ))·√(M₁₁M₂₂M₃₃)·[√|ν| + s₁a₁ + s₂a₂ + s₃a₃] dτ₁   (4.5.19)

Marr and Poggio (1979) used these formulae in their analysis of the ghost match probabilities, in approximating P₁(τ) for the two cases. Unfortunately, the formulae for these approximations that they wrote down in their paper (Marr and Poggio, 1979) contained some minor errors, which were propagated through to Grimson's writings (Grimson, 1981a, 1981b). However, from the appearance of the graphs of the functions provided by Grimson (1981b) it is clear that the proper equations were used, and that the equations in Marr and Poggio's paper were misprinted (and merely copied by Grimson). The errors were the following: the exponent in the [ψ²(0)−ψ²(τ)] term in the definition of P₁(τ) was misprinted, when it should have been −3/2; and the exponents of M₂₂(τ) and M₂₃(τ) in the definition of H(τ) were 1 when they should have been 2. Note that Marr and Poggio used the notation of (Rice, 1945) whereas we use the notation of (Longuet-Higgins, 1963).⁹

The autocorrelation function of a one-dimensional slice of a random two-dimensional Gaussian process that has been ∇²G filtered is derived in the Appendix. Using this autocorrelation function, we can compute P₁(τ) and hence P_m(0) as a function of e_d for the cases of matching zero crossings with and without regard to the sign of the zero crossings (alternatively, these cases can be viewed as matching zero crossings whose orientations are quantized to within 180° and 360° respectively). The resulting functions are plotted in figure 4.30 (for a filter σ of √2). The curves do not approach zero asymptotically as they should, but pass right through zero and go negative. This is due to the fact that the approximations used for P₁(τ) are accurate only for small values of τ.
For large values of τ they overestimate P₁(τ). Note that as the disparity estimate error is increased, the probability of choosing the correct match quickly goes to zero.

Effect of quantization of the orientation of the zero crossings

We would expect that if, instead of matching raw zero crossings, we matched zero crossings according to their orientations (within some angle quantization), the probability of obtaining the correct match for a given disparity estimate error would increase. However, the analysis of this probability becomes increasingly more difficult (because one is required to do a two-dimensional analysis instead of a one-dimensional analysis) and one can only find very crude approximations. We can, however, perform experimental determinations of this probability for different orientation quantizations. This was done for angle quantizations of 22.5°, 60°, 180° and 360°. We generated a 256x256 array of Gaussian distributed random variables with mean 128 and variance 64². This array was then filtered with a ∇²G filter with σ=√2. The zero crossings of this filtered array were found and the orientations of these zero crossings were quantized to the desired granularity. This array of zero crossings was replicated in another array which was shifted by an amount e_d with respect to the initial zero crossing array.

⁹ It should also be pointed out that the plots of P₁(τ) given by Grimson (Grimson, 1981b, p76-77) are for the case of the filter σ=1. It might be assumed from his graphs that only the horizontal scale changes when σ changes, but this is not so. The vertical scale also changes. In fact, as σ becomes larger (coarser resolution) the peak height of the probability distribution becomes smaller. This is not indicated by Grimson's graphs, as only the horizontal axes on his graphs are scaled in terms of σ, while the scaling on the vertical axes is constant.
The matching process was then performed between these two arrays. The total number of correct matches was found, and this number was divided by the total number of all matches to give the probability of obtaining a correct match. This procedure was repeated for a number of different values of e_d (integer steps from 0 to 25). The results are displayed in the graph shown in figure 4.31. It is evident that the smaller the angle quantization, the higher the probability of obtaining the correct match for a given error in the disparity estimate (however, see the discussion in the next section on the sensitivity of the matching process to perturbations in the zero crossing orientations, which increases as the angle quantization decreases).

FIGURE 4.30 The theoretical probability of obtaining the correct match as a function of the disparity measurement error (180° and 360° angle quantizations), σ = √2.

FIGURE 4.31 Experimentally derived relationship between the probability of obtaining the correct match and the error in the disparity estimate for a number of different angle quantizations.

Another quantity of interest is the probability density of obtaining a given non-zero matching error as a function of the error in the disparity estimate. However, for non-zero matching errors, finding even an approximate expression for the right hand side of equation (4.5.2) is very difficult, as a closed form expression for P_N(τ) for N > 3 (for the case of matching zero crossings without regard to sign) can not be found in general (see the discussion on this point in Longuet-Higgins, 1963). However, we can, as we did for the zero matching error case, obtain some representative results numerically.
The results are depicted in  0 (same as previously), 1, 2 and 3. As expected, the  probability density for a non-zero matching error is a maximum at some non-zero value of the  disparity estimate  error  error,  and the larger  the matching  error,  the larger  is the disparity  for which the probability reaches a peak. This indicates that, as the disparity estimate  error increases, the expected value of the magnitude of the matching error will also increase.  In error  conclusion we can make some general comments about the matching error.  in the disparity estimate  correct match  will  be found.  algorithm will  converge.  is relatively small, Thus  there will  If the  be a high probability that the  we can be confident that our multi-resolution matching  If, on the other  hand, the error  in the disparity estimate  is large  then the matching error may be large as well. However, there is still at least a 50% chance that the matching error will be less than the disparity error (since the match may be either closer to or farther away from the true match, and for large disparity errors the chances of one or the other happening are about equal). Thus the matching algorithm may still converge if an iterative procedure (one in which the matching is done repeatedly at a single resolution level) is performed. It is difficult to evaluate how small the disparity error must be in order for  the matching procedure  to converge.  The problem is further  complicated by the highly  correlated nature of most disparity functions encountered in practice, one section of the image the disparity estimate other  region. This  often  will  happen near  process produces a large amount of error. 
enough  then  the matching  process  matching process is inherently unstable disparity estimate  will  may be highly accurate but way off in some  disparity  discontinuities where  If the disparity error  hardly  which will mean that in  ever  produce  the reconstruction  in a given region is high  the correct  match.  Thus the  as a large enough disturbance (error) can dislodge the  from the somewhat stable point at the correct value, and will never make  its way back to the correct value.  180  Matching Error • 1.0  5  10  15  20  25  Error in the Disparity Estimate FIGURE 4.32 The probability density of obtaining a matching error as a function of the error in the disparity estimate.  Errors  in the  orientation  When the disparity surface in  going from one  image  is non-constant  to the  other,  as  the zero crossing contours will be distorted  shown in figure 4.33.  Let  us assume  that the  orientation of a given zero crossing in the left image is 0 . Then, if the disparity funcdon is O  lineaT along the y axis with a gradient of m, and constant  along the x axis, the orientation  of the corresponding zero crossing in the right image is not 0  0i  The  =  but is given by:  (4.5.20)  tan (tan(0 )+m) _1  result of this change  process  O  o  in the orientation of the zero crossings  to make incorrect matches,  or  is to cause the matching  to cause matches to be missed. To test the effect  disparity gradients on the orientation, and how it affects the matching process,  of  we performed  the following experiment: We  generated  a  256x256 array  of  gaussian  distributed random  values,  as  above.  A  second  181  FIGURE functions. array  4.33 The distortion of zero crossing contours for non constant disparity  was generated  which  contain the values of the first array  horizontally shifted by an  amount equal to 20*exp(-(I-128)**2/3200) where I is the row number of the array (1-256). 
Thus the disparity is constant along a row and varies along a column. Then we tried to perform the matching of the zero crossings for angle quantizations of 22.5, 60, and 180 degrees, for various values of the disparity estimate error. The percentage of correct matches was tabulated in each case. The results are shown in figure 4.34. It can be seen, in comparison with figure 4.32, that one of the effects of the non-constant disparity is to reduce the proportion of correct matches. This is due to the change in the zero crossing orientation. It can also be seen that the effect is more pronounced for the smaller levels of angle quantization. This is to be expected, as the larger the angle quantization, the larger the perturbation required to produce a change in the angle measurement. The bottom line is that if the disparity function is non-constant (as is usually the case) the matching algorithm will produce some matching error over and above all the other sources of error we have discussed.

FIGURE 4.34 The probability of obtaining the correct match as a function of the disparity estimate error for non-constant disparity functions.

4.6 - Geometry Errors

Errors in camera parameters

In the Appendix are derived the relationships between the image displacements (x₁,x₂), the camera geometry (shown in figure 4.35), and the physical (viewer centred) coordinates (X,Y,Z). These relationships are summarized as follows:

Z = (1+a²)·f·d_x/[f(x₁−x₂) + a(f²+x₁x₂)]   (4.6.1)

X = x₂Z/f   (4.6.2)

Y = y₂Z/f   (4.6.3)

where a = tan(2β). We can now determine the sensitivity of the computed position coordinates (X,Y,Z) to errors in the measured (or assumed) camera parameters β, d_x, and f. The sensitivities are obtained by partial differentiation with respect to the parameters in question.
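The depth relation (4.6.1) can be checked numerically. The sketch below uses arbitrary focal length, baseline and coordinate values; with a = 0 the tilt term drops out and the familiar parallel-geometry form Z = f·d_x/(x₁−x₂) results. Since Z is linear in d_x, a finite-difference derivative with respect to the baseline recovers Z/d_x exactly.

```python
def depth(x1, x2, f=50.0, dx=0.3, a=0.0):
    """Depth Z from image displacements x1, x2 per eq. (4.6.1);
    with a = 0 (no tilt) this reduces to Z = f*dx/(x1 - x2)."""
    return (1.0 + a * a) * f * dx / (f * (x1 - x2) + a * (f * f + x1 * x2))

def dZ_ddx(x1, x2, f=50.0, dx=0.3, h=1e-6):
    """Central-difference estimate of the baseline sensitivity dZ/d(dx)."""
    return (depth(x1, x2, f, dx + h) - depth(x1, x2, f, dx - h)) / (2.0 * h)
```

Because Z is proportional to d_x for any tilt a, the numeric derivative agrees with Z/d_x regardless of the geometry details.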
Thus, for Z we get:

∂Z/∂f = 2Z/f − Z[D+2af]/A   (4.6.4)

where D = x₁−x₂ is the disparity and

A = fD + a(f²+x₁x₂)   (4.6.5)

FIGURE 4.35 The stereo camera geometry.

We also have:

∂Z/∂d_x = Z/d_x   (4.6.6)

∂Z/∂x₁ = −Z(f+ax₂)/A   (4.6.7)

∂Z/∂x₂ = Z(f−ax₁)/A   (4.6.8)

∂Z/∂β = sec²(β)·∂Z/∂a = [2af²d_x/A − Z(f²+x₁x₂)]sec²(β)/A   (4.6.9)

For β = 0, f² ≫ x₁x₂, d_z = 0 and d_x = d, the angle sensitivity can be written:

∂Z/∂β = −Z²/d   (4.6.10)

This sensitivity can become quite large for large depth values, meaning that slight errors in the measured camera tilt angle can result in large errors in the computed depth. The sensitivities of X can be written as:

∂X/∂f = (x₂/f)∂Z/∂f − (x₂/f²)Z   (4.6.11)

∂X/∂d_x = (x₂/f)∂Z/∂d_x   (4.6.12)

∂X/∂x₁ = (x₂/f)∂Z/∂x₁   (4.6.13)

∂X/∂x₂ = (x₂/f)∂Z/∂x₂ + Z/f   (4.6.14)

∂X/∂β = (x₂/f)∂Z/∂β   (4.6.15)

The Y sensitivities are obtained in a similar fashion. For truly accurate depth and position computation, precise values for the camera parameters must be obtained. This can be done by careful setup of the cameras, minimizing the effects of external disturbances such as vibration, or by accurate estimation of the camera parameters from image plane measurements of ground control points. This is frequently done in photogrammetric applications (see Ghosh, 1979).

Errors due to vertical misalignments

If the relative camera tilt angle is nonzero then there will be vertical as well as horizontal disparities (another way of saying that the epipolar lines are not horizontal). If the matching algorithm assumes a horizontal epipolar line along which to search for matches, then the presence of vertical disparities will cause errors in the measured disparities and may also cause matches to disappear altogether. These two events are depicted in figure 4.36.
Let us now derive the probability density of the disparity error produced by this vertical misalignment for the case of Gaussian random white noise processes. It can be seen from figure 4.36 that the disparity error e_y is a function of the zero crossing orientation θ and the amount of vertical misalignment δ (which is assumed to be constant), and is given by:

e_y = δ/tan(θ)   (4.6.16)

For any isotropic random process the angle θ of the zero crossings has a uniform distribution in the interval (−π,π). However, in our matching algorithm we ignore all zero crossings that lie close (within an angle Δ) to the horizontal. Thus we take the distribution of the angles to be uniform only over the ranges (Δ−π,−Δ) and (Δ,π−Δ). Thus we have:

P_θ(θ) = 1/[2π−4Δ]   for θ ∈ (Δ−π,−Δ) and (Δ,π−Δ)   (4.6.17)

Outside this range P_θ(θ) is zero. Let μ = 1/tan(θ), so that θ = tan⁻¹(1/μ). The probability density of μ is then given by:

P_μ(μ) = P_θ(θ(μ))·|dθ(μ)/dμ|   (4.6.18)

= [(2π−4Δ)(μ²+1)]⁻¹   for tan⁻¹(1/μ) ∈ (Δ−π,−Δ) and (Δ,π−Δ)   (4.6.19)

FIGURE 4.36 The effects of vertical misalignment on the disparity measurements. (When the matcher searches along an assumed horizontal epipolar line while the true epipolar line is tilted by δ, zero crossings yield incorrect disparities or missing matches.)

Outside this range P_μ(μ) is zero. Since e_y = δμ we can write:

P_ey(e_y) = P_μ(μ(e_y))·|∂μ(e_y)/∂e_y| = δ/[(2π−4Δ)(δ²+e_y²)]   for e_y ∈ (0, δ/tan(Δ))   (4.6.20)

Outside this range P_ey(e_y) is zero.
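Equation (4.6.16) can be sanity-checked by simulation. The sketch below draws orientations uniformly from (Δ, π−Δ) (the mirrored range (Δ−π, −Δ) yields the same cotangent values) and measures the spread of e_y; the Δ, sample-count and seed values are arbitrary. Because e_y scales linearly with δ, the standard deviation comes out proportional to the misalignment, in line with the conclusion of this section.

```python
import math
import random

def misalignment_error_std(delta, Delta=0.2, n=50000, seed=2):
    """Sample the disparity error e_y = delta/tan(theta) of eq. (4.6.16)
    for zero crossing orientations theta uniform on (Delta, pi - Delta),
    i.e. with near-horizontal crossings excluded, and return the sample
    standard deviation of e_y."""
    rng = random.Random(seed)
    errs = [delta / math.tan(rng.uniform(Delta, math.pi - Delta))
            for _ in range(n)]
    mean = sum(errs) / n
    return math.sqrt(sum((e - mean) ** 2 for e in errs) / n)
```

Shrinking the exclusion angle Δ admits more nearly-horizontal zero crossings and so inflates the error spread, which is the same trend the small-Δ approximation predicts.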
The variance of P_ey is given by

σ_ey² = (2δ/(2π−4Δ))∫₀^{δ/tan(Δ)} e_y²/[δ²+e_y²] de_y   (4.6.21)

= δ²[1/(tan(Δ)(π−2Δ)) − 1/2]   (4.6.22)

For small Δ we have:

σ_ey² ≈ δ²/(πΔ)   (4.6.23)

Thus the standard deviation of the error due to the vertical misalignment is seen to be proportional to the magnitude of the vertical misalignment.

4.7 - Effect of the various errors on the multi-resolution matching algorithm

In this section we discuss how the errors described in the previous sections affect the performance of the simplified multi-resolution matching algorithm. The various error sources can be seen to act on the matching process at the points shown in figure 4.37. The disparity measurement errors due to filtering, sensor noise, quantization, and vertical misalignment can be thought of as adding to the positions of the zero crossings that are input to the matcher. The matching error is added to the output of the matcher, and the reconstruction error is added to the output of the reconstruction process. Note that the reconstruction process actually filters, or smooths, the other error functions. However, the reconstruction process also smooths out the disparity function itself, resulting in a reconstruction error if the disparity function has appreciable high frequency components. It is also evident from figure 4.37 that the matching error depends on the disparity estimate obtained from the next lower (and, in the iterative algorithm, the next higher) resolution level, and hence on the various errors at that level. We can write down a recursion which defines the error in the disparity estimate at a given resolution level, k, as follows:

e_d^(k) = d^(k) − d   (4.7.1)

= e_m^(k) + r{e_f^(k−1) + e_n^(k−1) + e_q^(k−1) + e_v^(k−1)} + e_r^(k−1)   (4.7.2)

where d^(0) = d₀ is the initial (lowest resolution) disparity estimate, d is the true disparity function, and e_m, e_f, e_n, e_q, e_v and e_r denote the matching, filtering, sensor noise, quantization, vertical misalignment and reconstruction errors at the indicated levels. The reconstruction operation is indicated by r{...}. It can be seen that the reconstruction process acts on the filtering, sensor noise, quantization and vertical misalignment errors. Since the reconstruction operation is essentially a smoothing, these errors are smoothed out somewhat as well. Recall that, in chapter 3, we showed that the reconstruction error was reduced when the sample density increased.

FIGURE 4.37 The action of the various errors on the matching process.

This fact, coupled with the smoothing of the error function by the reconstruction process, suggests that including matches that are slightly incorrect, rather than getting rid of them, may actually reduce the overall disparity error by increasing the sample density, which in turn reduces the reconstruction error. This would probably be the case only in the regions of the images for which the disparity function was rapidly changing.

It would be desirable to obtain a closed form expression for the probability distribution of the disparity error at each resolution level, so that we could examine the convergence of the matching algorithm. However, even if we assume white Gaussian noise for the input images, the distributions of the various error sources are all markedly non-Gaussian, as we have seen in the discussion in the previous sections. Thus it is not possible to obtain analytical expressions for the disparity error probability density function. We can, however, with reference to figure 4.37, make some qualitative statements. We know, from the earlier sections in this chapter, that all of the disparity measurement errors (except for the error due to vertical misalignment of the cameras) and the reconstruction errors decrease as the resolution increases. This means that as we proceed to higher and higher resolutions, the accuracy of the disparity measurements increases. As well, the accuracy of the disparity function increases. The only question lies with the matching algorithm. If the matching algorithm can match accurately then convergence of the matching algorithm is assured. However, as we have seen, the performance of our matching algorithm depends on the accuracy of the disparity estimate that guides it. In section 4.5 we saw that the probability density of the matching error exhibits an impulse at zero matching error. The magnitude of this impulse is a monotonically decreasing function of the error in the disparity estimate. The importance of this analysis is that it shows that the disparity function can have a certain level of error and the matching algorithm will still yield exact matches most of the time. Crudely put, we can conclude that if the disparity measurements are sufficiently accurate, and if the disparity function reconstruction is sufficiently accurate, at all resolution levels, then the matching algorithm will converge. If the errors in the disparity measurements are excessively high, as may happen with very noisy sensors, or if the disparity function reconstruction is poor, as may happen with surfaces that have high spatial frequency components or from using poor reconstruction methods, then the matching algorithm may not converge. Often it may happen that the various sources of error are not uniformly distributed throughout the image but rather tend to accumulate in distinct regions. In this case the matching algorithm may converge over most of the image but diverge over scattered patches of it. This is seen in some of the experiments described in the next chapter, especially when the reconstruction process is not done sufficiently well.
of it This is seen  in some of the  especially when the reconstruction process  is done  192  4.8 -  Summary of chapter 4  -  The major sources of error in the simplified multi-resolution matching algorithm are  sensor noise, spatial filtering effects, reconstruction errors, matching errors and geometry errors.  -  The disparity error due to the sensor noise increases as the resolution decreases and  as the signal to noise ratio -  decreases.  The filtering error, due to spatial filtering of the images for non-constant disparity  functions, increases as the disparity gradient increases, and as the resolution decreases. -  The left and right scale maps of a one-dimensional stereo pair, for linear disparity  functions, are related by a simple expansion factor.  Map  Two dimensional functions can be represented by the Two Dimensional Scale Space  which has  through  this  two spatial  dimensions and two  function, obtained  by  holding  one  scale  dimensions. A  scale  and  one  two  spatial  dimensional slice  dimension  constant,  results in the Skew Map of the function. It is seen that the Skew Maps of two functions, for linear disparity, are related by a simple expansion factor. -  The reconstruction error is composed of three, somewhat interacting, components, the  truncation, aliasing and filtering errors. -  The optimal truncated reconstruction filter is derived, consisting of generalized prolate  spheroidal wavefunctions. -  A general  expression  for the  reconstruction  error  is derived, involving  the sample  distribution, function spectrum, and the reconstruction filter impulse response. -  The reconstruction error is seen to, in general, rise as the resolution decreases (due  to decreased sample density) and as the disparity function bandwidth increases.  193 -  The  distribution of  magnitude of this  the  matching  impulse decreases as  the  error  exhibits  error  in the  an  impulse  at  zero  disparity estimate  error.  
supplied  to  The the  matching process is increased. The fact that this impulse exists indicates that the matcher can tolerate some error in the disparity estimate and still yield exact matching.  -  The level of quantization of the zero crossing orientation affects the matching error,  ln general the finer the quantization, the smaller the error. -  If  the  disparity  function  is  not  constant,  then,  in  general,  the  orientation  of  corresponding zero crossings will not be the same. This can cause an increase in the matching error.  The  increase  in  the  matching  error  is  seen  to  be  greater  for  finer  orientation  quantizations, and for zero crossings whose orientation approaches vertical. -  Errors in the measured or assumed camera geometry parameters  will cause errors in  the computation of depth and position from the disparity measurements. -  Vertical misalignment of the  error is greatest for zero crossings  cameras cause errors  in the  measured  disparity. This  near horizontal. For Gaussian white random noise images  the error standard deviation is proportional to the amount of vertical misalignment -  The total disparity error at any resolution can be written as a recursive function of  the errors at the previous resolutions. -  The reconstruction process tends to smooth out the disparity measurement errors (not  including the errors produced by the reconstruction process itself). -  If the  disparity errors  multiple of the V G filter a) 2  at  each  resolution are  and if the reconstruction  relatively small (compared process  to some  is sufficiently accurate then  the matching algorithm will converge.  -  The matching algorithm may  small patches of the image.  
converge  over  most  of an image  and diverge over  194  V  -  EXPERIMENTS  WITH  T H E DISCRETE  MULTI-RESOLUTION  MATCHING  ALGORITHMS  5.1 -  Introduction  This chapter presents the description and results of computational experiments performed to  illustrate  some  of the analyses  of the discrete  multi- resolution  matching  algorithms that  were done in the previous chapter. The topics covered in this chapter and the relationship between them are summarized in figure 5.1. A summary of the findings of this chapter is given at the end of the chapter. The experiments noise,  reconstruction  multi-resolution for with  the last  errors, filtering errors  and matching  errors  on the performance  of the  matching algorithms. All of the experiments described in this chapter (except  section)  Gaussian  detailed in this chapter are designed to examine the effects of sensor  were performed on images  distributions  with  mean  128  whose  intensities were  and standard  deviation  randomly distributed  of 64.  The intensity  distributions were truncated so that all the intensities had values between zero and 255. The departure  from  the true  Gaussian distribution caused  by this  truncation  is assumed  to be  negligible. The measure of error that is used in these experiments is the RMS disparity error. That is, the square root of the average of the square of the difference between the measured disparity  and the actual  disparity at each  point in the image.  The surfaces  used  in the  experiments were ones which gave rise to disparity functions of the form  d(x,y) =  exp(-yV2o ) d, max 2  s  (5.1.1)  and  rf(x,y) =  rf  max  exp(-x /2a ) 2  2  s  (5.1.2)  195  Introduction CH5.I  Application of Experiments  stereo vision *  Summary  to log scaling CHS."? Implementation of the f i l t e r CH 5.2  Frequency response  Review of basic  CM 5.3  techniques CM  5.1 Error due to disparity gradients  Analysis of the  CH 6". 4  measurement accuracy CH 5.?  
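The RMS error measure and the cylindrical Gaussian test surfaces just defined can be written down directly (a minimal sketch; the d_max and σ_s defaults are arbitrary illustrative values):

```python
import math

def true_disparity(x, y, d_max=10.0, sigma_s=32.0, axis='x'):
    """Cylindrical-Gaussian disparity of eqs. (5.1.1)/(5.1.2): constant
    along the cylinder axis, Gaussian across it."""
    t = y if axis == 'x' else x
    return d_max * math.exp(-t * t / (2.0 * sigma_s * sigma_s))

def rms_disparity_error(measured, actual):
    """RMS error between a measured and an actual disparity map, given as
    parallel sequences of per-pixel disparity values."""
    n = len(measured)
    return math.sqrt(sum((m - a) ** 2 for m, a in zip(measured, actual)) / n)
```

Changing sigma_s changes the effective bandwidth of the disparity function, which is exactly the knob the experiments below turn to exercise the reconstruction algorithms.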
FIGURE 5.1 The topics covered in this chapter and the relationships between them ((*) indicates new material).

These are cylindrical Gaussian functions. These disparity functions were chosen for a number of reasons. First, by changing σ_s we can change the effective bandwidth of the disparity function, thereby exercising the reconstruction algorithms. Secondly, when the axis of these cylinders is aligned with the x-axis, the disparity gradient along the x-axis is zero. Thus there is no filtering effect in this case. Furthermore there are no self-occlusions of the surface which would cause missing or incorrect matches. If the cylinder axis is aligned with the y-axis then there is a disparity gradient along the x-axis. If we constrain the disparity gradient along the x-axis to be less than one, there will be no occlusions. Thus, we can obtain a measure of the effect of the filtering error on the matching algorithm by performing the experiment first on a cylinder with its axis along the x-axis and then repeating it with the cylinder aligned with the y-axis. The change in the observed disparity error can be attributed to the filtering effect (but see the remarks below on decoupling the error components). The third reason for using these Gaussian cylinders was that they somewhat modeled the disparity functions expected in the log scaling application described in section 5.6.

We examine, in these experiments, the use of three different matching algorithms. These are:

1. Matching using zero crossing features only.

2. Iterative (two pass) matching using zero crossing features only.

3. Iterative (two pass) matching using zero crossing and extremum features.
Furthermore, three different types of reconstruction algorithms are tried. These are:

1. Warping or transformation reconstruction method (see chapter 3.7) using the optimal reconstruction filter derived in chapter 4.4.

2. Relaxation method (see chapter 3.3). This method is used only for the iterative matching algorithms, in order to speed their convergence rate.

3. 9x9 Averaging. This method was suggested by Grimson (1981a) as a means of obtaining a disparity value from a region. It consists of merely averaging the values of all sample points located in a 9x9 pixel region of the reconstruction grid centred on the point to be reconstructed. The motivation for using such a method is that it provides a check on whether or not a simple reconstruction is all that is required. This method is computationally much cheaper than the other two reconstruction techniques. However, such a large averaging region causes problems when the disparity function is not constant.

One of the difficulties that we encounter in doing these experiments is in decoupling the various sources of error from the resultant disparity error. The effect of the sensor noise is decoupled from the other error contributions by repeating a given experiment holding all conditions the same, except that a Gaussian random function with a given variance is added to one of the input images. The increase in the disparity error can then be attributed to this added noise signal. Similarly, the effect of changing reconstruction methods and the effect of changing the surface bandwidth is obtained with the same process: varying a given parameter while holding all other conditions the same. However, as we have seen in the previous chapter, the matching error is dependent on the errors in the disparity estimate, which in turn are a function of the various measurement and reconstruction errors.
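A minimal sketch of the 9x9 averaging method described above (a naive scan over the sample set; an efficient implementation would index samples by position):

```python
def average_9x9(samples, width, height):
    """9x9 averaging 'reconstruction': each grid point takes the mean of
    all disparity samples within the surrounding 9x9 pixel window
    (None where the window contains no samples).

    samples: dict mapping (x, y) pixel coordinates to disparity values."""
    out = {}
    for y in range(height):
        for x in range(width):
            acc, cnt = 0.0, 0
            for (sx, sy), d in samples.items():
                if abs(sx - x) <= 4 and abs(sy - y) <= 4:
                    acc += d
                    cnt += 1
            out[(x, y)] = acc / cnt if cnt else None
    return out
```

The sketch makes the method's weakness visible: when the disparity varies within the 9x9 window, samples from different depths are blended into a single value.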
This means that the matching error can not be held constant. Thus the observed differences in disparity error between two experiments will always consist of changes in the matching error as well as changes in the disparity measurement error due to the effect being tested for (such as sensor noise). However, all is not lost. If the disparity estimate error (due to sensor noise, reconstruction etc.) is small enough so that the correct match is always within the matching region, then the matching error will be essentially independent of the disparity estimate error (as was shown in the previous chapter). In this case the changes in the observed disparity error will be due to changes in the parameter that is being varied.

Before going on to the presentation of the actual experiments we will briefly describe the implementation of the multi-resolution feature detection algorithm.

5.2 - Implementation of the Multi-Resolution Feature Extraction

In this section we discuss the implementation of the subsystem responsible for the production of the multi-resolution feature image representation. This subsystem can be broken up into two sections: the spatial filtering to produce the set of spatial frequency channels, and the feature detection. The spatial filtering is performed as shown in figure 5.2. The lowpass filter and sub-sampler sections form a two-dimensional decimator, or sampling rate reducer. The lowpass filter restricts the maximum frequency of the filtered image to one half its previous maximum, to limit the aliasing error when the image is subsampled. Each decimation stage reduces the number of image samples by a factor of four (by two in each of the horizontal and vertical directions). Each lowpass filter section has exactly the same set of filter coefficients. Each successive stage of the decimator is followed by a ∇²G bandpass filter.
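The decimation chain just described can be sketched in one dimension. This is only an illustrative stand-in: the thesis uses a two-dimensional McClellan-designed lowpass and an NxN ∇²G bandpass at each level, whereas here a toy 3-tap smoother plays the lowpass role.

```python
def lowpass_halfband(x):
    """Toy half-band lowpass: a 3-tap binomial smoother (stand-in for the
    designed lowpass filter), with edge samples repeated at the borders."""
    n = len(x)
    return [0.25 * x[max(i - 1, 0)] + 0.5 * x[i] + 0.25 * x[min(i + 1, n - 1)]
            for i in range(n)]

def decimate(x):
    """One decimation stage: lowpass, then drop every other sample."""
    return lowpass_halfband(x)[::2]

def pyramid(x, levels=4):
    """Chain of successively half-rate signals; in the full scheme each
    level's output would additionally feed the same bandpass kernel."""
    out = [x]
    for _ in range(levels - 1):
        out.append(decimate(out[-1]))
    return out
```

Each stage halves the sample count (a factor of four in two dimensions), while one fixed set of lowpass coefficients serves every level, which is the economy argued for in this section.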
Even though the coefficients for each of these ∇²G filters are the same, the apparent frequency responses of these filters with respect to the input have different centre frequencies because of the sampling rate reduction. This scheme of spatial frequency channel production offers distinct advantages over the direct method, in which the input signal is filtered by four separate bandpass filters, each having a different frequency response. The first, and probably least important, advantage is that only one set of filter coefficients is required for all the lowpass filters and for all the ∇²G filters. A more important advantage lies in the fact that the centre frequency of the prototypical bandpass filter, with respect to the input, is fairly high, on the order of π/2 radians. In designing finite wordlength digital bandpass filters the number of coefficients required to approximate an ideal filter response to a given accuracy is inversely proportional to the centre frequency. For example, in the direct method we would require a filter size on the order of 8Nx8N for the lowest (fourth) spatial frequency channel filter (given that the highest frequency filter was of size NxN), compared to the NxN size filter required in the hierarchical scheme for all the channels. Of course we must take into account the low pass filters in the hierarchical case, but these too will be of constant, not exponential, size. In addition, the structure of our hierarchical filtering system facilitates the pipelining of computation, as can be seen in (Clark and Lawrence, 1985c).

FIGURE 5.2 The spatial filtering process. (Block diagram: the input image passes through a cascade of lowpass filter and subsampler stages, with a ∇²G bandpass filter tapped off at each stage to produce the spatial frequency channels.)
The filtering method described in this thesis can be compared with the technique developed by Crowley and Stern (1984). They compute the Difference of Low-Pass (or DOLP) transform of an image, which, for Gaussian low-pass filters, closely approximates the ∇²G filter. Their method uses the separability of the Gaussian low-pass filter to reduce the amount of computation, and also uses subsampling to reduce computation. Their method produces bandpass filtering at resolution levels that are a factor of √2 apart, in comparison to our method, which produces bandpass filters with resolutions a factor of 2 apart. In some cases this may be useful, but for our application the Crowley and Stern method needs to do twice as much computation as is actually required. Clark and Lawrence (1984, 1985c) describe a proposed hardware implementation of the filtering method described in this thesis that takes advantage of the fact that it can be implemented in a pipelined fashion to increase the speed of computation. It is not clear whether the method of Crowley and Stern can be similarly configured.

If we let the lowpass filter prototype have a frequency response L(ω₁,ω₂) and the bandpass filter have a frequency response B(ω₁,ω₂), then the frequency responses of the spatial frequency channels, referred to the input, are as follows:

H₁(ω₁,ω₂) = B(ω₁,ω₂)
H₂(ω₁,ω₂) = B(2ω₁,2ω₂)L(ω₁,ω₂)
H₃(ω₁,ω₂) = B(4ω₁,4ω₂)L(ω₁,ω₂)L(2ω₁,2ω₂)
H₄(ω₁,ω₂) = B(8ω₁,8ω₂)L(ω₁,ω₂)L(2ω₁,2ω₂)L(4ω₁,4ω₂)

and L(ω₁,ω₂) = L(ω₁ + 2πk, ω₂ + 2πl) for k,l = ±1,2,3,...

The prototype lowpass filter was designed by transforming a one-dimensional lowpass filter with transfer function F₁(ω) using the McClellan transformation (McClellan, 1973).
This transformation takes a one-dimensional filter and produces a two-dimensional filter with transfer function:

L(ω₁,ω₂) = F₁(f(ω₁,ω₂))    (5.2.1)

where:

f(ω₁,ω₂) = arccos[0.5(cos(ω₁) + cos(ω₂) + cos(ω₁)cos(ω₂) − 1)]    (5.2.2)

This transformation preserves the optimality (if present) of the one-dimensional filter in the two-dimensional design. The one-dimensional filter was designed using the Remez exchange algorithm (McClellan et al, 1973) to produce an optimal half band lowpass filter (optimal in the sense that the peak approximation error to an ideal lowpass filter is minimized using a minimax criterion).

The peak sidelobe level of the low pass filter is set by the number of coefficients used in the filter. For N=25 this level is about -33dB. One result of the McClellan transformation is that the resulting filter displays octant symmetry. Thus

L(ω₁,ω₂) = L(ω₁,−ω₂) = L(−ω₁,ω₂) = L(−ω₁,−ω₂) = L(ω₂,ω₁) = L(ω₂,−ω₁) = L(−ω₂,ω₁) = L(−ω₂,−ω₁).

This means that, for N odd, there are only (N+1)²/8 + (N+1)/4 unique filter coefficients instead of N². This symmetry can result in a large savings in computation if taken advantage of. However, since we were able to use an FPS-100 array processor attached to a VAX-11/750 minicomputer, instead of being forced to build a special purpose device for increased throughput, such as the systolic processor described in (Clark and Lawrence, 1985c), we implemented the filtering operations using the Fast Fourier Transform (FFT) on the array processor. The multi-resolution filtering process used was still that depicted in figure 5.2, however. Typical CPU times for performing four level filtering on a 256x256 image were on the order of 70 seconds, for a lightly loaded system.
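The unique-coefficient count implied by octant symmetry can be checked directly. The following small sketch (the enumeration scheme is ours) counts one representative index pair per symmetry class for an NxN filter with N odd:

```python
# For an NxN octant-symmetric filter (N odd), indices run from -M..M with
# M = (N-1)/2, and the octant 0 <= j <= i <= M holds exactly one member of
# each symmetry class. The count should equal the closed form quoted in
# the text, (N+1)^2/8 + (N+1)/4, i.e. ((N+1)^2 + 2(N+1))/8.
def unique_octant_coeffs(N):
    M = (N - 1) // 2
    return sum(1 for i in range(M + 1) for j in range(i + 1))

for N in (5, 9, 25):
    assert unique_octant_coeffs(N) == ((N + 1) ** 2 + 2 * (N + 1)) // 8
```

For N=25 this gives 91 unique coefficients instead of 625, which is the saving the text refers to.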
The prototype bandpass filter is, as mentioned earlier, a ∇²G filter with transfer function

B(ω₁,ω₂) = k(ω₁² + ω₂²)exp(−σ²(ω₁² + ω₂²))    (5.2.3)

The value of σ is chosen to trade off between high bandwidth (lower number of filter coefficients) and low aliasing error (due to the sampling of the ideal continuous filter). We set the σ value for the highest resolution level to be √2. The frequency responses of the four spatial filters are shown in figure 5.3. One of the spatial frequency axes has been suppressed (ω₂ = 0) for clarity. The peak sidelobe levels are less than 33 dB in all cases.

Zero crossings are detected by scanning along horizontal lines (rasters) for either a zero value or for a change in sign. When one of these is found, a zero crossing is assigned to the position of the left pixel in the case of a sign change, and to the zero pixel in the zero value case. Once the zero crossings have been detected, we perform a thinning procedure which gets rid of small (one or two pixels across) isolated clumps of zero crossings. After this is done, we compute a measure of the angle of the zero crossing contour through the zero crossing pixel. This value is then used in the matching algorithm to disambiguate between possible matches. The angle measurements are quantized to sectors of 60°, that is, 6 sectors in a 360° circle. To provide some measure of noise immunity, we ignore all zero crossings whose contrast falls below a given threshold. The threshold used will depend on the expected signal to noise ratio of the ∇²G filtered images (which in turn will depend on the processor word length and camera characteristics). In our experiments we used a threshold of 20/255, as the majority of noise-like zero crossings fell below this threshold.
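The raster-scan detection just described, together with the semi-local extrema used later as additional matching features, can be sketched as follows. This is a sketch under stated assumptions: the contrast measure (the jump across the crossing) and the reading of "largest extremum" as largest absolute value are ours, not spelled out in the text.

```python
import numpy as np

def zero_crossings_row(row, threshold=20/255):
    """Scan one raster for zero crossings: an exact zero is assigned to
    the zero pixel, a sign change to the left pixel. Crossings whose
    contrast (assumed here to be the jump across the crossing) falls
    below the threshold are ignored."""
    positions = []
    for x in range(len(row) - 1):
        a, b = row[x], row[x + 1]
        if abs(b - a) < threshold:
            continue                    # too weak: likely noise
        if a == 0.0 or a * b < 0:
            positions.append(x)
    return positions

def semi_local_extrema(row, zc_positions):
    """One 'semi-local' extremum per interval between successive zero
    crossings: the sample of largest magnitude in that interval."""
    extrema = []
    for left, right in zip(zc_positions[:-1], zc_positions[1:]):
        segment = row[left + 1:right]
        if len(segment):
            extrema.append(left + 1 + int(np.argmax(np.abs(segment))))
    return extrema
```

Keeping only the largest extremum per interval trades feature density for confidence that the feature arises from the scene rather than noise, as discussed below.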
The zero crossing detection algorithm did not use the array processor at all, and typical CPU times for the zero crossing detection process on a four level image set (256x256, 128x128, ...) were on the order of 90 seconds for a lightly loaded system. Combining these times with the typical filtering times reported above results in times on the order of 320 seconds (about five and a half minutes) for performing the multi-resolution filtering and zero crossing extraction on a stereo pair of 256x256 images.

FIGURE 5.3 The frequency response of the four spatial filters.

Extremal points (points of local maxima and minima) were also detected. This was performed by searching along horizontal lines between successive zero crossings for the maximum, or minimum, value. Note that this procedure finds only one extremal point between successive zero crossings. Thus, since the expected number of extremal points is greater than the expected number of zero crossings (see equations 4.3.11 and 4.3.20), we will not find all the extrema with this procedure. However, in practice, the number of extrema due to noise is substantial. The largest extremum in a given interval (which we call a semi-local extremum) is most likely not a noise extremum (although the position of this extremum may be perturbed by noise). In using the semi-local extrema as features we trade off feature density for the assurance that the features are created by events in the scene and not by noise.

5.3 - Frequency Response of the Matching Algorithms

In this chapter we examine the performance of three different matching schemes, using three different reconstruction methods, as we vary the frequency domain characteristics of the surface disparity function.
The change in the surface disparity function frequency content is attained by varying the value of σ_g in the disparity function equations (5.1.1) and (5.1.2). The values of σ_g used in the experiments in this chapter are 5, 10, 15, 20, 25, 30, 35, 40, 45 and 80 pixels. All stereo pairs used in the experiments were 256x256 arrays of white gaussian random numbers with mean 128 and standard deviation of 64. The left and right images are related by the following equation:

I_right(x,y) = I_left(x + d(x,y), y)    (5.3.1)

where d(x,y) is given in (5.1.1). Figure 5.4 shows the zero crossing pyramid of such a random image pair. The three matching algorithms that are used are:

1. - Single pass, zero crossing features only.
2. - Two pass iterative, zero crossing features only.
3. - Two pass iterative, zero crossing and extremum features.

The size of the matching region was seven pixels wide (r_m = 3 pixels). Figure 5.5 shows how the RMS disparity error varies as the size of the matching region changes (single pass, zero crossings only, with reconstruction by the transformation method). Beyond r_m = 3 there is not much difference in the measured RMS disparity error. This is because the matching is done from the centre of the matching region outwards. Thus a feature in the outlying parts of the matching region will be taken to be the match only if there are no features closer towards the centre of the matching region (see chapter 4.5). For large r_m this will have a very small probability of happening, and hence increasing r_m past a certain point will have little effect.

FIGURE 5.4 The zero crossing pyramid of a random image pair.

For the purposes of the experiments three resolution levels were used.
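Equation (5.3.1) can be used to generate such synthetic stereo pairs directly. The sketch below is illustrative only: the gaussian-ridge disparity function (constant along x, gaussian in y, i.e. a gaussian cylinder aligned with the x-axis) is a stand-in for the thesis's equation (5.1.1), which is not reproduced in this section, and linear interpolation handles non-integer shifts.

```python
import numpy as np

def make_stereo_pair(size=256, d_max=10, sigma_g=30, seed=0):
    """Random-dot stereo pair per equation (5.3.1):
    I_right(x, y) = I_left(x + d(x, y), y).
    d(x, y) here is an assumed gaussian-cylinder disparity function."""
    rng = np.random.default_rng(seed)
    left = rng.normal(128, 64, (size, size))        # mean 128, std dev 64
    y = np.arange(size)
    d = d_max * np.exp(-((y - size / 2) ** 2) / (2 * sigma_g ** 2))
    right = np.empty_like(left)
    for row in range(size):
        x = np.arange(size) + d[row]                # sample left at x + d
        x0 = np.clip(np.floor(x).astype(int), 0, size - 1)
        x1 = np.clip(x0 + 1, 0, size - 1)
        frac = x - np.floor(x)
        right[row] = (1 - frac) * left[row, x0] + frac * left[row, x1]
    return left, right, d
```

Near the image edge where x + d(x,y) falls outside the left image, the shifted samples are clamped; these are the unmatchable regions discussed below.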
FIGURE 5.5 The variation of the RMS disparity error with the matching region size (r_m = 2, 3, 4).

The variation of the RMS disparity error with changes in the number of resolution levels is shown in figure 5.6. It can be seen from this graph that at least three levels of resolution are required for these experiments. The three reconstruction methods used in these experiments are those listed in section 5.1. The relaxation method is used only in the iterative procedures, as the convergence is too slow in the single pass case. Figure 5.7 charts the RMS disparity error as a function of the number of relaxation iterations performed, and provides a visual indication of the convergence. It is evident that even more iterations are necessary for complete convergence. However, even at fifty iterations the amount of computation is very high, and the reconstruction takes longer than for the averaging or transformation methods.

FIGURE 5.6 The variation of the RMS disparity error with the number of resolution levels.

FIGURE 5.7 The RMS disparity error as a function of the number of relaxation iterations.

The results of the frequency response experiments are summarized by the graphs shown in figures 5.9 to 5.20. The thick solid line in each of these graphs represents an estimate of the RMS error due to the regions of the image that can not be matched (because the two images do not overlap in these regions). We assume that the disparity value obtained in these regions will be, on the average, one half of the actual disparity at these points. This assumption is supported by examination of the actual disparity values obtained in these regions in the experimental tests.
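The RMS disparity error reported throughout these graphs is a straightforward statistic; a minimal sketch (the optional mask for excluding the unmatchable border regions is our addition):

```python
import numpy as np

def rms_disparity_error(d_measured, d_true, mask=None):
    """Root-mean-square difference between the measured and actual
    disparity functions. `mask` optionally restricts the statistic to
    the matchable region of the image."""
    err = np.asarray(d_measured, float) - np.asarray(d_true, float)
    if mask is not None:
        err = err[mask]
    return float(np.sqrt(np.mean(err ** 2)))
```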
Perspective shaded plots¹⁰ of the disparity functions obtained with algorithm 1 and the warping reconstruction method for disparity function σ_g of 10, 20, 40, and 80 are given in figure 5.8. It can be seen that the majority of the matching errors are to be found at the edges of the image. Also noticeable is the degradation in performance for the higher bandwidth surface, evident in the poorly defined ridge in the peak of the measured disparity function in the σ_g = 10 case. In figure 5.21 are shown perspective plots of the error maps obtained using matching method 1 for the three different reconstruction methods, for the case of σ_g = 40. The number of relaxation iterations used for the relaxation reconstruction was 200. Note that the errors are typically localized to small patches of relatively high error. The warping method is seen to yield the lowest error. Figure 5.22 shows perspective plots of the disparity functions for these cases. The warping method results in the least amount of spike errors, but produces a surface that is not as smooth as the relaxation or averaging methods. The following points can be made from these results:

1. - The averaging method performs comparably to the transformation method, while the relaxation method is somewhat worse. This is to be expected, since the transformation method, due to truncation, is far from optimal, and the relaxation method is not convergent. The warping or transformation method is seen to be the best overall in terms of reducing the amount of the highly localized spike errors.

2. - The performance with the addition of extrema features to the zero crossing features improves with relaxation reconstruction, but does not appreciably affect the performance with the other two reconstruction methods except at low σ_g values.
This can be explained by the fact that the relaxation reconstruction method requires fewer iterations to converge when the sample density is increased, while for the other reconstruction methods, increasing the sample density may not improve the reconstruction if the disparity function is already oversampled.

¹⁰ The program for plotting these shaded plots was written by Richard Jankowski of the Electrical Engineering Department, U.B.C.

FIGURE 5.8 Perspective plots of the disparity function obtained using matching method 1, with the warping reconstruction method.

FIGURE 5.9 RMS disparity error as a function of σ_g, for matching algorithm 1, maximum disparity = 5.

FIGURE 5.10 RMS disparity error as a function of σ_g, for matching algorithm 1, maximum disparity = 10.

FIGURE 5.11 RMS disparity error as a function of σ_g, for matching algorithm 1, maximum disparity = 15.

FIGURE 5.12 RMS disparity error as a function of σ_g, for matching algorithm 1, maximum disparity = 20.

FIGURE 5.13 RMS disparity error as a function of σ_g, for matching algorithm 2, maximum disparity = 5.

FIGURE 5.14 RMS disparity error as a function of σ_g, for matching algorithm 2, maximum disparity = 10.

FIGURE 5.15 RMS disparity error as a function of σ_g, for matching algorithm 2, maximum disparity = 15.
FIGURE 5.16 RMS disparity error as a function of σ_g, for matching algorithm 2, maximum disparity = 20.

FIGURE 5.17 RMS disparity error as a function of σ_g, for matching algorithm 3, maximum disparity = 5.

FIGURE 5.18 RMS disparity error as a function of σ_g, for matching algorithm 3, maximum disparity = 10.

FIGURE 5.19 RMS disparity error as a function of σ_g, for matching algorithm 3, maximum disparity = 15.

FIGURE 5.20 RMS disparity error as a function of σ_g, for matching algorithm 3, maximum disparity = 20.

This has as a corollary the statement that the reconstruction error is not a major component of the disparity error, at least not for the larger σ_g values.

FIGURE 5.22 Perspective plots of the disparity function obtained for σ_g = 40 for the three reconstruction techniques (warping, relaxation, averaging).

3. - The RMS disparity error is proportional to the maximum disparity value. This has been seen in chapter 4.4 to be the case for the reconstruction error. It is also evident that the other sources of error, that is, additive noise, quantization error, and filtering error, would, in general, not be related to the maximum disparity value. This implies that the reconstruction process, coupled with the matching process as described in chapter 4.5, at low resolutions, is responsible for the bulk of the disparity error.

4.
- The RMS disparity error, after subtracting the estimated RMS component due to edge effects, is seen to increase as σ_g decreases. This is due to the reconstruction error, and roughly follows the example given in chapter 4.4 (where the same type of disparity function was used, but a slightly different type of filter). This effect can be seen in figure 5.18 for the σ_g = 10 case, where the disparity function is not completely reconstructed, due to the rapid change in the disparity function.

5. - The RMS disparity error in the case of the warping reconstruction method is seen to paradoxically rise for large σ_g values. This is due to edge effects. The warping method assumes a fixed boundary of zero value, while the other two methods assume a free boundary. Thus, in the warping case, the reconstructed function is pulled down to zero at the edges. The error incurred by this pulling down is largest for large σ_g surfaces, since the disparity values near the edges of such surfaces are relatively high compared to the values for low σ_g surfaces, which quickly drop down to zero.

5.4 - Surface Gradient Response of the Matching Algorithms

The effect of the surface gradient (in the x-direction) on the performance of the matching algorithms is obtained as follows. The RMS disparity measurement error is found for a number of different gaussian cylinder surfaces as in the previous section. However, in these cases the axes of the cylinders were aligned with the y-axis. This means that there is a disparity gradient along the x-axis. Because of this disparity gradient we expect to observe an increase in the measured RMS disparity error from the case in which the cylinder axis was aligned with the x-axis. To obtain a value for the error due to the filtering effect, we run the same experiment twice, once with the gaussian cylinder aligned with the x-axis and then with the cylinder axis aligned with the y-axis.
The difference in the measured RMS disparity error can be attributed to the effect of the ∇²G filtering process on the feature positions. For most values of σ the average disparity gradient is approximately given by:

(∂d/∂x)_avg = d_max/256    (5.4.1)

We performed the experiments for four values of d_max: 5, 10, 15, and 20. Ten different values of surface sigma were used; these were the same values as used in the experiments of the previous section. Since the average disparity gradient is essentially independent of the surface sigma, the RMS errors obtained using the different sigma values (for the same d_max) can be averaged to obtain a better estimate of the filtering error component. However, the results of this experiment were inconclusive, as the variance of the error differences so obtained was very high.

In order to fully examine the effect of disparity gradients on the filtering error component we must somehow decouple the disparity measurement process from the correspondence process. This can be done, for the case of one-dimensional intensity functions, with the use of scale-space matching as described in chapter 6. With this method we can determine the correspondence between two one-dimensional intensity functions. Once this correspondence has been established, we can measure the disparity values between
corresponding features. We can then compare these measured disparity values to the actual disparity values to obtain filtering error values. We can then compare these experimentally determined filtering errors to the theoretical predictions given in chapter 4. In chapter 4 it was pointed out that the theoretical analysis of the filtering error could be done only in the case of linear disparity functions. Thus, in the following experiments, we used only linear disparity functions. The case of nonlinear disparity functions is treated in chapter 6.

The experiments proceeded as follows. We created a random one dimensional function f(x). A second random function, g(x), was then derived from f(x) using equation (4.3.3). This means that f(x) and g(x) can be thought of as the right and left intensity functions arising from a surface having a disparity function of the form d(x) = β₁x. The scale maps of these two functions (for a given value of β₁) were then computed. These two scale maps were then matched using the technique described in section 6.2. After the matching had been performed, the disparity between corresponding scale map contours was measured, and the RMS error between these measured disparity values and the actual disparities for each value of σ in the scale map was then obtained. This procedure was performed for four values of the disparity gradient β₁: -20/255, -40/255, -60/255 and -80/255. The right hand scale maps of the intensity functions for each of these four cases are shown in figures 5.24 to 5.27. The left hand scale map is the same for each case, since the same random function is used as the left hand function. The scale map of this function is shown in figure 5.23.

The results are plotted in figure 5.28. Figure 5.29 shows a theoretical prediction of the expected RMS disparity measurement error due to filtering as a function of σ and β₁. This prediction was obtained by assuming that the RMS filtering error is given by the square root of the sum of the variance of the theoretical filtering error distribution (equation 4.3.40) and the variance of the quantization error distribution. However, since the variance of the filtering error distribution, as strictly defined, is infinite, we can only obtain a 'pseudo-variance' value. This pseudo-variance is obtained by fitting a gaussian distribution to the actual filtering error distribution.
This is performed by setting the peak values of the distributions to be the same. Doing this we obtain for the pseudo-variance the expression given in equation (4.3.41). The variance of the quantization error is simply given as 1/6 of the zero crossing position quantization level, and is thus equal to 1/6, since the zero crossing position measurements in this experiment were quantized to one pixel, for all σ and β values. Comparing figures 5.28 and 5.29 shows that the experimentally obtained filtering errors are indeed similar to the theoretical predictions. This experiment thus shows that the filtering error effect is obtained in practice and needs to be considered in analyzing the disparity errors produced by a matching algorithm.

FIGURE 5.24 The right hand scale map for the β₁ = -20/255 case.

FIGURE 5.26 The right hand scale map for the β₁ = -60/255 case.

FIGURE 5.28 The measured RMS disparity error due to filtering as a function of σ for disparity gradients of -20/255, -40/255, -60/255 and -80/255.

FIGURE 5.29 The expected RMS disparity error due to filtering as a function of σ for disparity gradients of -20/255, -40/255, -60/255 and -80/255.

5.5 - Performance of the Matching Algorithms with Additive Noise

In this section we describe experiments that were performed to test the effect of adding gaussian white noise to one of the images in a stereo pair on the performance of the discrete multi-resolution matching algorithms. These experiments proceeded as follows. First the experiments described in section 5.3 (for the case of d_max = 10) were performed to give a noise-free baseline for the RMS error measurements. Then the experiments were repeated, this time with gaussian white noise of variance σ_n added to one of the images.
This extra noise caused a shift in the position of the zero crossings of that image, thereby causing an error in the measured disparity. This was analyzed in chapter 4.2. The difference in the RMS error between the two sets of experiments was tabulated. The results of these experiments are depicted in figures 5.30 to 5.33. The maximum noise variance tested was 400. This corresponds to a minimum signal to noise ratio of 10, as the signal variance was 4096. Note that this is a fairly high signal to noise ratio, yet the effect of the added noise on the RMS disparity error was still significant.

We can obtain a theoretical prediction of the expected RMS disparity error due to the additive noise by using the results of the analysis of this error performed in chapter 4.2. As was the case for the filtering error, the variance of the theoretical additive noise error distribution is undefined. Thus the best that we can do is to qualitatively compare the measured RMS error with a 'pseudo-variance' of the additive noise error distribution. The pseudo-variance measure that we use is defined implicitly as follows:

∫₀^σp p(e_p) de_p = ∫₀¹ (1/√(2π)) e^(−x²/2) dx = 0.3413    (5.5.1)

Thus σ_p is the point at which the area under the probability density curve is equal to the area under the standard gaussian (normal) probability density curve in the interval (0,1) (recall that the standard deviation of the standard gaussian distribution is 1 and the mean zero, thus the above interval is one standard deviation from the mean of the distribution).

FIGURE 5.30 Increase in RMS disparity error as a function of added noise variance. Surface σ_g = 30, Iterative matching, Zero crossings only.

The pseudo-variance is plotted in figure 5.34 as a function of the noise variance for a filter σ
of √2 (the highest resolution filter σ). It is seen that the pseudo-variance is a linear function of the additive noise variance. This concurs with the experimental evidence that the RMS disparity error is a roughly linear function of the additive noise variance (note that the statistical nature of the measurements means that there will be some deviation of the measurements from any trend such as the assumed linear one). However, the magnitude of the pseudo-variance is about 1/5 that of the measured increase in the RMS disparity error. This is presumably a result of the fact that our definition of the pseudo-variance was somewhat ad-hoc, and another definition may have produced a closer fit to the measured magnitudes. However, the form of the definition is such that one would expect it to capture more truly the nature of the variation in the disparity error. Also, the discrepancy in the error magnitudes is almost certainly due in part to the fact that the errors due to the noise induced shifting of the zero crossing and extrema features (which is what the pseudo-variance is a measure of) can not be decoupled completely from the effects of the matching and reconstruction processes.

FIGURE 5.31 Increase in RMS disparity error as a function of added noise variance. Surface σ_g = 15, Iterative matching, Zero crossings only.

FIGURE 5.32 Increase in RMS disparity error as a function of added noise variance. Surface σ_g = 30, Iterative matching, Zero crossings and extrema.

FIGURE 5.33 Increase in RMS disparity error as a function of added noise variance. Surface σ_g = 15, Iterative matching, Zero crossings and extrema.
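The implicit definition in equation (5.5.1) can be evaluated numerically from error samples. The sampling approach below is our sketch: for an error density symmetric about zero, σ_p is the 68.26th percentile of |e| (since 2 × 0.3413 = 0.6826), and for truly gaussian errors it recovers the standard deviation.

```python
import numpy as np

def pseudo_variance(errors):
    """Estimate sigma_p of equation (5.5.1): the point where the
    probability mass in (0, sigma_p) equals 0.3413. Assuming the error
    density is symmetric about zero, this is the 68.26th percentile
    of the absolute errors."""
    return float(np.percentile(np.abs(errors), 68.26))

# Sanity check: gaussian errors with standard deviation 2 should yield a
# pseudo-variance close to 2, since 0.3413 is the normal probability
# mass between the mean and one standard deviation.
rng = np.random.default_rng(1)
samples = rng.normal(0.0, 2.0, 200_000)
```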
It is seen from the additive noise tests that the warping reconstruction method performs the best in terms of minimizing the increase in the RMS disparity error. This suggests that the warping method is better than the other methods at smoothing out the high frequency error components. The RMS disparity error increases only slightly as σ_g is decreased from 30 to 15. This suggests that the mechanism causing the increase in the disparity error in the presence of additive noise is only loosely coupled to the reconstruction process. Similarly, the addition of the extremum features does not affect the increase in RMS disparity error appreciably (except for the relaxation case at σ_g = 15). This again suggests that the reconstruction process is loosely coupled to the noise error mechanism.

FIGURE 5.34 The pseudo-variance of the additive noise error as a function of the additive noise variance, for σ_f = √2.

5.6 - Comparison of the Simplified Matching Algorithm with the DispScan Multi-resolution Matching Algorithm

In this section we compare the performance of the multi-resolution Dispscan matching algorithm with the simplified matching algorithm. We tested the multi-resolution Dispscan algorithm on the random image pairs with varying σ_g and a maximum disparity of 10 pixels. The 9x9 average and warping methods were used to do the disparity function reconstruction. The resulting RMS disparity errors, as a function of σ_g, are plotted in figure 5.35.
It is also observed that the DispScan algorithm performs somewhat better with the warping reconstruction method than with the 9x9 averaging reconstruction method. This indicates that, even for the DispScan algorithm, the reconstruction process is very important to the performance of the overall matching process.

FIGURE 5.35 The RMS disparity errors for the DispScan and simplified matching algorithms as a function of σ_g. (Legend: □ = dispscan, avg; ○ = dispscan, warp; △ = simple, warp.)

5.7 - Application of Image Analysis to Log Scaling

In this section of the thesis we discuss the application of image analysis methods to a particular industrial application, that of log scaling. Log scaling is the process of measuring, or estimating, the volume of wood in a log or group of logs. There are four basic reasons for wanting to scale logs (Sinclair, 1980):

1. To determine payments and royalties to the Crown;
2. To determine payments to logging contractors;
3. To find the volume of logs for sale, trade or transfer;
4. To calculate divisional, departmental, or area production.

The two most common techniques for scaling logs are weigh scaling and stick scaling. Weigh scaling involves weighing a truckload of logs and subtracting the weight of the empty truck. The resulting load weight can be used to estimate the total volume of the load of logs provided the mean density of the logs is known. Stick scaling involves a man walking over a log as it lies on the ground and measuring the length of the log and the widths at the two ends with a measuring stick. These measurements are written into a log book from which, at the end of a shift or workday, the log volumes are calculated.
Stick scaling is more accurate on a piece by piece basis than is weight scaling, as the log density used in the estimation of the weight scale volume can be in error due to variations in the log's moisture content as well as to invalid assumptions as to the species of log. In contrast weight scaling provides information about a set of logs only. However, the main drawback of stick scaling is that it is very slow compared to weight scaling, especially if the logs are small. This is an important consideration now that forest firms are harvesting smaller and smaller logs. Sample scaling, which involves stick scaling only a small sample set of logs from a larger group of logs, can be used to speed up the scaling operation, but is useful only for large, homogeneous groups of logs for which volume statistics can be adequately characterized (similar to the case of weight scaling).

Sinclair (1980) has noted that, in a large number of British Columbia coastal logging operations, the log scaling function was the controlling factor in determining productivity. Hence, if one could speed up the log scaling process, then presumably one could improve the productivity of a sort yard operation¹¹. Sinclair also states that as the scalers are rushed or pressured, the quality of the scaling is diminished (the measurement error increases). Thus automating the scaling process, which would presumably be unaffected by the workplace pressures that a human scaler is exposed to (such as the menacing presence of very large logging machinery rumbling about nearby), would be expected to increase the reliability and repeatability of the log scaling process. This section of the thesis looks at a process whereby the log scaling operation can be done automatically.

One could conceivably automate the stick scaling operation by designing a robot to directly replace the human stick scaler.
This machine would walk along the logs and measure them with a shiny chrome measuring stick. This approach, however, would have most of the problems of human stick scaling as well as some new ones. At any rate the problems of engineering such a mechanism are enormous and the present state of the art is not sufficiently advanced to permit such a design.

Demaerschalk et al (1980) have demonstrated improved accuracy over weight scaling by the use of optical methods in truck load volume estimation. Their method involved taking photographs of the truck load from the back and the side, enlarging these photographs and manually estimating (by counting the dots on an overlay grid) various geometric parameters such as the area of bark, the number of logs and so on. These parameters were input to a statistical regression procedure which provided an estimate for the volume of the truck load. Although this technique was designed to replace weight scaling, the general principle, that of optically sensing and measuring log parameters, could be applied to the domain of stick scaling; that of logs lying flat on the ground.

¹¹ A sort yard is a place where logs from the harvesting grounds are sorted as to species and grade, bundled, and sent to the mills.

This type of procedure could, in principle, be fully automated with television cameras and (special purpose) image analysis hardware, instead of manually taking photographs, enlarging them and painfully counting dots.

Simple techniques

The simplest technique that one can use for processing images, yet one of the most widely used in log handling applications (as well as other applications), is that of binary thresholding. In this technique, all pixels whose intensity is greater than a given threshold are assigned a high or '1' level. All pixels whose intensities fall below this threshold are assigned the complementary low or '0' value.
Thus the image is partitioned into two sets, one comprised of pixels with level '1' and the other containing the pixels with level '0'. If the intensity of the object(s) that one is interested in is sufficiently different from the other parts of the image, then this method provides a simple mechanism for separating the object from its surround. However, if the objects do not have a uniform intensity distribution, then the thresholding operation will cause the object to become broken up. Similarly, if the surround does not have a uniform intensity distribution, then parts of it will be confused with the object. Figure 5.36 shows a photograph of a typical log lying on a flat deck. Figure 5.37 shows the result of applying a threshold equal to the mean intensity of the original image. Note that the black and white regions do not correspond to any meaningful structures. In general, making the thresholding technique viable requires that one be able to control the lighting conditions, as well as the characteristics of the objects being imaged, very closely. This, for example, can be done by backlighting the log, which reduces the apparent texture of the log's visible surface (since it is in darkness) and provides a bright, uniform background. This technique, commonly referred to as broken beam scanning, has been successfully employed in sawmills, where such a setup can be readily arranged. Examples of such systems are given in (Vit, 1962) and (Hand, 1975). Another implementation of the thresholding technique, which has been proposed, is to paint over the surface of the log with highly reflective white paint (thus reducing the effect of the surface texture) and to darken the background (Miller and Tardif, 1970). In (Whittington, 1979) is described a method whereby an array of photodiodes senses, and thresholds, the light reflected off of a peeled log.
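The thresholding operation just described is easy to state precisely. The following sketch (illustrative only; the array values and the mean-intensity default mirror the figure 5.37 example, and are not the thesis implementation) partitions an image at a threshold:

```python
import numpy as np

def binary_threshold(image, threshold=None):
    """Partition an image into high ('1') and low ('0') pixels.

    If no threshold is given, the mean intensity is used, as was done
    to produce figure 5.37.
    """
    if threshold is None:
        threshold = image.mean()
    return (image > threshold).astype(np.uint8)

# Synthetic 4x4 "image" with alternating dark and bright columns:
img = np.array([[10, 200, 30, 220],
                [15, 210, 25, 230],
                [12, 205, 28, 225],
                [11, 215, 27, 235]], dtype=float)
mask = binary_threshold(img)   # bright columns map to 1, dark columns to 0
```

On a real log image, of course, the '1' and '0' regions produced this way need not correspond to log and background, which is exactly the failure illustrated in figure 5.37.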
FIGURE 5.37 The effects of thresholding figure 5.36.

Vadnais (1976) describes a similar system that detects the boundaries of a sawn board in order to determine proper edging strategies. However, in the situations encountered in most sort yards, one cannot practically control either the lighting or the characteristics of the surfaces (logs and background). Thus the image processing technique must be able to handle textured surfaces and varying contrasts, such as is the case in figure 5.36. Some of the more advanced log handling image processing systems utilize edge detection algorithms. These techniques are more powerful than the thresholding techniques because of their relative insensitivity to changes in illumination across the scene. Instead of trying to distinguish between the body of the log and the background, edge based techniques search for the boundary between the log and the background. Since logs are usually simply connected (no holes) the body of the log can be determined from its boundary. Edge detection is performed by measuring, with some differential operator, the local variation in the intensity. If this variation exceeds a given level, we infer an edge in that locality. There are many such edge operators that have been used (see Rosenfeld and Kak, 1976 for a review; also see chapter 2 of this thesis for a discussion of the Marr-Hildreth theory of edge detection). However, even edge detection based schemes are not in themselves adequate for our application. Figure 5.38 shows the results of applying an edge detection operator to the image of figure 5.36. Notice that, while the boundary of the log and the background is detected, many other edges are found as well.
It is a very difficult task to determine which of these edges belong to the log-background boundary and which are due to the texture in the log interior and background.

FIGURE 5.38 The result of applying an edge operator (Marr-Hildreth) to figure 5.36.

Thus, to solve our particular problem, which is to distinguish logs from the background when both the logs and the background are highly textured and of varying contrast, we require a more complex image analysis technique. In the previous chapters of the thesis we have seen that stereo depth perception works best under just these conditions. That is, the more edge segments there are in a stereo image pair, the greater the reliability of the depth measurement process. Also, since the logs in our application are lying on a flat deck, their occluding boundary (which is the boundary that we see in the image) is closer to the camera than is the background. Thus thresholding the depth value will give us, in principle, an error free segmentation of the logs from the background. There are other methods, other than the one described in this thesis, for determining the distance to the logs; these include laser ranging and structured light techniques (Jarvis, 1983). These techniques are finding widespread application in industrial vision systems. However, they generally require a constrained environment (in terms of controlling the ambient light). Such control may not be attainable in a logging sort yard, and a system that can operate under more arbitrary conditions would be desirable.

Analysis of the measurement accuracy

The setup of the video log scaling system is given in figure 5.39. We assume that the camera axes are parallel, that the logs are lying on a flat deck parallel to the image planes of the cameras, and that the focal points of the cameras are the same distance from the deck.
In the Appendix is derived the formula for the depth in terms of the disparity. For the simple geometry involved in this application we have:

FIGURE 5.39 The video log scaling system setup.

z = fd/D    or    D = fd/z    (5.7.1)

where D is the disparity, z is the vertical depth from the camera focal point to the point on the log or deck being imaged, f is the camera focal length and d is the distance between the focal points of the two cameras. Let the size of the sensor be given by s and the number of imaging elements on the sensor (pixels) be denoted N. Then we have that:

p = N*D/s    (5.7.2)

is the disparity measured in pixels. Let the distance from the camera focal points to the deck be denoted h. Then the minimum disparity will be:

p_min = N*f*d/(s*h)    (5.7.3)

If the maximum log radius is denoted r_max then the maximum disparity range that the matching algorithm will be faced with is given by:

Δp = 2*N*f*d*r/[(s*h)(h-2r)]    (5.7.4)

The mean depth accuracy is given by 2r/Δp (for quantization of disparity to the nearest pixel) or:

depth resolution = s*h*(h-2r)/(N*f*d)    (5.7.5)

In this application the depth resolution is not important. More important is the horizontal position accuracy. However the scaling factor for the horizontal position is almost constant and can be calibrated beforehand, or during the log scaling process if ground control points can be detected in the images. If the horizontal scale factor is known, the error in the measured ground position, ΔX, is a function of the position error in the image, Δx. It can be shown that:

ΔX = (h + f)s/(fN) * Δn    (5.7.6)

where Δn is the image position error measured in pixels. In the best case analysis the only error in the image position will be due to the pixel quantization. In this case Δn = 1/√12.
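Relations (5.7.3)-(5.7.5) can be collected into a single calculation. The sketch below (illustrative only; the function and variable names are mine, and the millimetre quantities are converted to inches) computes the deck disparity, the disparity range and the mean depth resolution for the camera setup of figure 5.39:

```python
def disparity_geometry(f, d, h, r, s, N):
    """Disparity geometry for the parallel-axis setup of figure 5.39.

    f: focal length, d: camera separation, h: height of the focal points
    above the deck, r: maximum log radius, s: sensor size, N: number of
    pixels across the sensor.  All lengths must be in the same units.
    """
    p_min = N * f * d / (s * h)                        # deck disparity, eq. (5.7.3)
    dp = 2.0 * N * f * d * r / (s * h * (h - 2 * r))   # disparity range, eq. (5.7.4)
    depth_res = 2.0 * r / dp                           # mean depth resolution, eq. (5.7.5)
    return p_min, dp, depth_res

# Camera parameters used in the text, converted to inches (f, s are in mm):
f_in, s_in = 46 / 25.4, 25 / 25.4
p_min, dp, depth_res = disparity_geometry(f_in, d=10.0, h=100.0, r=8.0, s=s_in, N=256)
# Gives roughly 47 pixels, 9 pixels and 1.8 inches/pixel; the small
# differences from the quoted figures come from rounding.
```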
If we are viewing a cylindrical log with dimensions of LxW, then its volume is given by:

V = π(W/2)²L    (5.7.7)

If the errors in the measurements of L and W are equal to ΔX, then we can write the normalized error in the computed volume as:

ΔV/V = (1/L + 2/W)ΔX    (5.7.8)

The field of view common to both cameras has dimensions (sh/f) (vertical) by max(sh/f - d, 0) (horizontal). For the camera setup in the examples that follow, typical values for the camera parameters were:

d = 10 inches
N = 256 pixels
h = 100 inches
f = 46 mm
r = 8 inches
s = 25 mm

Thus we have that:

pixel size = 0.1 mm/pixel
p_min = 45 pixels
Δp = 9 pixels

The mean depth resolution is 1.89 inches/pixel. The field of view is 55.5 inches by 45.5 inches. If the only errors in localizing positions in the image were due to quantization of the pixels, then we have that Δx = 0.062 inches. If there is a log in the scene with dimensions L = 40 inches and W = 10 inches, then the normalized error in its volume computation would be ΔV/V = 0.013 (or 1.3%). In practice Δn may be higher; on the order of 1 pixel. In this case ΔV/V would rise to 4.6%.

Doubling the resolution to N = 512 would extend the required disparity range to 18 pixels but would halve the error in the volume. Moving the cameras farther apart would also increase the range in disparity values, but would do so at the cost of less overlap in the stereo pair, which means that disparity values would be available over a smaller region. Because we are only using the disparity values to distinguish the logs from the background, we do not require high disparity resolution. Thus we can move the cameras closer together and get an increased image overlap and a decrease in the disparity range.
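The error budget of equations (5.7.7) and (5.7.8) is quick to check numerically. The sketch below (illustrative; the names and the 0.216 in/pixel ground scale, which follows from eq. (5.7.6) with the parameters above, are my working values) reproduces the quantization-only and one-pixel error figures:

```python
import math

def cylinder_volume(L, W):
    """Volume of a cylindrical log, eq. (5.7.7)."""
    return math.pi * (W / 2.0) ** 2 * L

def relative_volume_error(L, W, dX):
    """Normalized volume error, eq. (5.7.8)."""
    return (1.0 / L + 2.0 / W) * dX

# Log of L = 40 in, W = 10 in; ground position scale of about 0.216 in/pixel.
pixel_scale = 0.216
err_quantization = relative_volume_error(40.0, 10.0, pixel_scale / math.sqrt(12))
err_one_pixel = relative_volume_error(40.0, 10.0, pixel_scale)
# err_quantization comes out near 0.014 and err_one_pixel near 0.049, in
# line with the 1.3% and 4.6% figures quoted in the text (modulo rounding).
```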
As we have seen in the experiments in the previous section, and in the theoretical analyses of chapter 4, the matching algorithm performs better for smaller disparity ranges. Thus, the reduced matching range created by moving the cameras closer together should improve the extraction of the disparity maps. We cannot move the cameras too close together, however, as then the logs and the background will not be sufficiently separated in disparity value to allow a noise free segmentation. Experiments have shown that a disparity range of between 4 and 15 pixels gives good results. In the cases described later, the disparity ranges were on the order of 5 pixels. In a practical system we would require a vertical (from the camera's point of view) field of view of about 10 metres, and a horizontal field of view of at least 2 metres. The height of the camera would be on the order of 10 metres above the ground on which the logs are lying. Given a sensor size of 25 mm, the field of view requirements result in a needed focal length of 25 mm. Assuming a resolution of 512x512 (N = 512), taking r = 0.25 m, and requiring that the disparity range Δp be 10 pixels, gives that the distance between the cameras must be 4 metres. The horizontal field of view is then 6 metres. The mean depth resolution is 5 cm/pixel. Thus a log with a diameter of 25 cm would have a disparity range of 5 pixels.

Finding the disparity map

The matching algorithm described in chapter 2 can be used to obtain the disparity map from the stereo image pair of the scene containing the logs. Because of the limited disparity range produced by the log scene for a 256x256 sensor resolution, it was not necessary to use more than one level of resolution in the experiments that we performed. In
a practical situation the sensor resolution may be larger, in which case more than one resolution level may be required. However, even when the sensor resolution is increased to 512x512, the disparity range is not excessively large. Thus we can expect that the multi-resolution matching algorithm will perform well. For the matching process to work with only a few levels of resolution, the initial disparity estimate must be as accurate as possible. Fortunately, the nature of our application ensures that this can be the case. The minimum disparity will always, for parallel camera axes, occur at the background. Since the background is fixed in relation to the cameras, its disparity can be computed or measured beforehand. The size of the logs can be bounded by some expected value (say 75 cm), and so we can bound the maximum disparity value. Thus a reasonably good estimate of the average scene disparity can be computed.

The matching process was performed on the stereo image pairs shown in figure 5.40 using only one level of resolution. The initial disparity estimate was, for the purposes of this experiment, measured directly by hand on the images. There was also a vertical misalignment which was measured by hand on the images and accounted for in the matching process. These steps can be done in a practical system as an initial calibration step. After this is done for a given camera setup, a number of images can be processed with no human intervention. The zero crossings of the left and right images in the stereo pairs of figure 5.40 are shown in figure 5.41. We do not do the reconstruction process at the highest resolution in this case because we wish to both save computation and not distort the disparity discontinuity at the boundary of the log. The segmentation process to be explained next does the disparity function reconstruction in such a way as to accurately retain the position of the log boundary.
It is still necessary to perform the reconstruction step at the lower resolutions.

FIGURE 5.40 Two stereo image pairs depicting single log scenes.

Segmenting the disparity map to find the log

Once the disparity map has been computed it can be thresholded to extract all the objects which have a disparity greater than a given value. If the deck that the logs are lying on is parallel to the image planes of the cameras then the disparity of the deck will be constant, and the (magnitude of the) disparity of the logs will always be greater than the disparity of the deck. Thus the objects that are left after the thresholding process should always correspond to the logs. In practice, there is always a small amount of noise in the computed disparity function, which results in some isolated points which are not part of the log. These can be filtered out by requiring that there be a given number of other points in a given sized neighbourhood about the point in question. If this condition is not satisfied then the point is removed. At this point in the processing we have a sparse set of points which loosely define the logs, as seen in figure 5.42, which depicts the result of thresholding and filtering the disparity maps obtained from the stereo pairs of figure 5.40. Note that the disparity map of figure 5.42b) contains many incorrect disparity values along the right hand side of the image. This is due to the fact that there was very little image detail in this region, and the zero crossings that were found were mainly due to camera noise. However, when the segmentation process, described below, is performed, these points form a distinctly non-log shaped region and are thus recognized as a non-log object and discarded.

FIGURE 5.41 The zero crossings of the stereo pairs shown in figure 5.40.
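The two-step segmentation just described (threshold the disparity map, then discard isolated points) can be sketched as follows. This is an illustrative implementation, not the thesis code; the neighbourhood size and neighbour count are assumed values:

```python
import numpy as np

def segment_disparity(disp, disp_threshold, min_neighbours=3, radius=2):
    """Threshold a disparity map, then discard isolated above-threshold points.

    A point survives only if at least `min_neighbours` other above-threshold
    points lie in the (2*radius+1) x (2*radius+1) neighbourhood around it.
    """
    hi = disp > disp_threshold
    rows, cols = hi.shape
    out = np.zeros_like(hi)
    for i in range(rows):
        for j in range(cols):
            if not hi[i, j]:
                continue
            i0, i1 = max(0, i - radius), min(rows, i + radius + 1)
            j0, j1 = max(0, j - radius), min(cols, j + radius + 1)
            # count above-threshold neighbours, excluding the point itself
            if hi[i0:i1, j0:j1].sum() - 1 >= min_neighbours:
                out[i, j] = True
    return out

# A 3x3 cluster of log points survives; a single noise point does not.
disp = np.zeros((7, 7))
disp[2:5, 2:5] = 5.0   # log region, well above the deck disparity
disp[0, 6] = 5.0       # isolated noise point
kept = segment_disparity(disp, disp_threshold=3.0)
```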
FIGURE 5.42 The thresholded disparity maps of the log scenes.

The remaining processing steps require that we know the boundary of our log region. Thus we must somehow obtain this boundary from the set of points produced by the thresholding process. We accomplish this in the following manner. For each pixel in the thresholded image we search fifteen pixels to either side of the pixel in question along a line with an orientation of 90°. If there is at least one 'hi' pixel on both sides of the centre pixel then we set the centre pixel to 'hi'. We then repeat this process with search line orientations of 77°, 45°, 22° and 0°. Finally the entire process is repeated. It can be seen that, if instead of just the five angles above we used a continuum of angles, and if we extended the search range to infinity instead of 15 pixels, and if the process was iterated indefinitely, the boundary of the resulting filled in region would be the convex hull¹² of the original set of thresholded points. However, the convex hull is often not a desirable representation for the log region, since sharp protrusions on the log's surface, such as may result from knots or branches, will cause the convex hull to diverge widely from the true boundary of the log, as shown in figure 5.43. For this reason, as well as to save on computation, we limit the region of search in the filling in process to 15 pixels. Thus, a non-convex region can be obtained, which is still filled in and roughly conforms to the log shape. The result of applying this filling in process to our real log image is shown in figure 5.44. We can now determine the log boundary simply by noting which pixels are 'hi' and are adjacent to only one 'hi' pixel to the left and right. The log boundaries obtained for the real log images are shown in figure 5.45.
The actual log boundaries are also drawn in (dotted lines) for comparison purposes.

In practice the image may contain more than one log, or there may be other objects (real or hallucinated by the computer) which pass the disparity threshold. In this case there will be more than one connected region in the segmented output. We can then examine the shapes of these regions to see whether or not they have the characteristics of logs (long and thin). If they do then we compute their volumes. If not, we discard them. The moment calculation described in the next section requires that a representation of each boundary, known as the chain code representation (Freeman, 1974), be constructed. The algorithm for deriving this representation from the boundary image will not be given here.

¹² The convex hull of a set of points is the convex polygon of smallest area that encloses all of the points in the set.

FIGURE 5.43 The log boundary compared with its convex hull.

We note that the closed boundaries in the boundary map are distinguished from each other by assigning them unique labels, 1, 2, 3 and so on. It is then possible to perform computations on each region separately. Such computations include the moment measurements which, as we will see in the next section, allow us to estimate the volume of a log, as well as shape computations which could, for example, distinguish between the boundary of a log and other objects that do not have the long slender shape of logs.
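The per-region processing just described (label each connected region, then keep only the long, thin ones) can be sketched as follows. This is illustrative only: the thesis works from a chain-code boundary representation, while this sketch operates directly on the pixel mask, and the elongation threshold is an assumed value:

```python
import numpy as np

def label_regions(mask):
    """4-connected component labelling of a binary mask (simple flood fill)."""
    rows, cols = mask.shape
    labels = np.zeros((rows, cols), dtype=int)
    n = 0
    for i in range(rows):
        for j in range(cols):
            if mask[i, j] and labels[i, j] == 0:
                n += 1
                stack = [(i, j)]
                while stack:
                    a, b = stack.pop()
                    if 0 <= a < rows and 0 <= b < cols and mask[a, b] and labels[a, b] == 0:
                        labels[a, b] = n
                        stack.extend([(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)])
    return labels, n

def is_log_shaped(region_mask, min_elongation=3.0):
    """Crude 'long and thin' test based on the region's bounding box."""
    ys, xs = np.nonzero(region_mask)
    h = ys.max() - ys.min() + 1
    w = xs.max() - xs.min() + 1
    return max(h, w) / min(h, w) >= min_elongation

# A 2x10 bar (log-like) and a 3x3 blob (not log-like):
mask = np.zeros((8, 14), dtype=bool)
mask[1:3, 1:11] = True
mask[5:8, 1:4] = True
labels, n = label_regions(mask)
```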
Volume Computation

Once the log has been successfully separated out from its background, its volume can be estimated. Over the course of time a number of simple formulae have been developed to approximate the volume of a log from a small number of measurements. Since the speed of the log scaling process is important, it is desirable that only a small number of measurements be required. There are two volume scales commonly used; the cunit (or cubic foot) and the board foot (Watts, 1983, Forestry Handbook for B.C.). A board foot is the volume of wood in a piece of dimensions 12"x12"x1".

FIGURE 5.44 The filled in log region.

There are board foot scaling rules and cubic foot scaling rules. These rules include adjustments for such things as kerf loss (due to non-zero width sawmill blades) and butt flare. Due to these adjustments the relation between board feet and cubic feet measures is not 12 board feet to the cubic foot but rather 5.63 (Dilworth, 1975). We list some of the scaling rules below (from Dilworth, 1975).

FIGURE 5.45 The detected log boundary.

Board Foot Scaling Rules

Knouf's rule of thumb - V = (D² - 3D)L/20, where V = log volume, L = log length in feet, and D = mean diameter in inches.

Girard and Bruce rule of thumb - V = 1.58D² - 4D - 8 (for 32 foot logs).

Scribner's Decimal C log rule (Scribner's Dec. C.) - drop off the least significant digit of the volume (gives volume in tens of board feet). This is the official rule for the U.S. forest service.

British Columbia rule - V = (D - 3/2)²(0.7854)(8L/132). This is the old B.C. standard (superseded by the Smalian rule given below).

Sammi's rule of thumb - V = (D - 1)²L/20.

Brereton - V = 0.0654D²L. Used for measuring logs to be exported on ships.

Doyle - V = (D - 4)²L/16.
Erratic measure, not widely used.

Cubic Foot Scaling Rules

Rapraeger's rule - V = 0.005454154(D + L/16)²L, where D = diameter at the small end. Assumes a taper of 1 inch every 8 feet of length.

Sorenson's rule - V = 0.005454154(D + L/20)²L. Assumes a taper of 1 inch every 10 feet of length.

Huber's rule - V = CL, where C = area at the centre of the log. This is unsuitable for decked, rafted or loaded logs due to the difficulty of measuring the radius at the centre.

Smalian - V = (b + t)L/2, where b = area of the base of the log and t = area of the top of the log. This is the official rule in British Columbia.

Dilworth (1975) supplies a table of the relative accuracies of the cubic foot rules listed above. These were obtained from measurements on a series of Douglas fir logs, and the volume obtained by Newton's rule, which is V = (b + 4c + t)L/6 (where c is the area of a slice through the centre of the log), was used as a baseline for comparison purposes. The results were as follows:

Smalian : -5.8%
Huber : -2.9%
Sorenson : -3.78%
Rapraeger : -4.7%

In our automated process, we could, in principle, determine the volume more accurately than any of these rules by summing up the incremental areas along the axis of the detected log. However, a given user of the system may want the volume measures to be comparable to the value given by the log scaling rule that they are used to (such as the official government rule). In this case we would use the appropriate log rule. What is required of our image analysis process in order to compute the volumes, given the segmentation of the log? If we were to use the accurate method, we would need to determine the axis of the log; a deceptively difficult task.
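Several of the rules listed above reduce to one-line formulas. The sketch below implements a few of them (illustrative; D is the diameter in inches, L the length in feet, and the Smalian and Newton rules take cross-sectional areas rather than diameters):

```python
import math

def brereton(D, L):
    """Brereton rule: V = 0.0654 * D**2 * L."""
    return 0.0654 * D * D * L

def doyle(D, L):
    """Doyle rule: V = (D - 4)**2 * L / 16."""
    return (D - 4.0) ** 2 * L / 16.0

def smalian(b, t, L):
    """Smalian rule (official in B.C.): V = (b + t) * L / 2."""
    return (b + t) * L / 2.0

def newton(b, c, t, L):
    """Newton's rule, the accuracy baseline: V = (b + 4c + t) * L / 6."""
    return (b + 4.0 * c + t) * L / 6.0

# For a perfect cylinder every cross-section is equal, so the Smalian and
# Newton rules agree exactly: here a 10 ft log of 1 ft diameter.
area = math.pi / 4.0   # square feet
v_smalian = smalian(area, area, 10.0)
v_newton = newton(area, area, area, 10.0)
```

For tapered logs the two rules diverge, which is the origin of the relative accuracies tabulated by Dilworth.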
The use of one of the various log rules requires that one or more of the following be known: the length L of the log along its major axis, the diameter of the log at the top, centre and bottom, and the mean log diameter. In the system described herein the following method is used to find the axis of the log. We first measure the moments m00, m10, m01, m11, m20 and m02, where:

m00 = ∫∫_R dxdy    (5.7.9)

m10 = ∫∫_R x dxdy    (5.7.10)

m01 = ∫∫_R y dxdy    (5.7.11)

m11 = ∫∫_R xy dxdy    (5.7.12)

m20 = ∫∫_R x² dxdy    (5.7.13)

m02 = ∫∫_R y² dxdy    (5.7.14)

Using a discrete version of Green's theorem (Tang, 1982) the above moments can be computed with line integrals along the boundary of R (R is the segmented log region), thereby cutting the required computation by an order of magnitude. The reader is directed to (Tang, 1982) for the exact formulation of the moment calculations. Once these moments have been computed, the centroid and axes of the ellipsoid that gives rise to these moment values can be determined. The centroid of this ellipsoid is given by:

(x_c, y_c) = (m10/m00, m01/m00)    (5.7.15)

The axes of the log will pass through this point. The axis vectors are the eigenvectors of the following matrix:

C = | c20  c11 |
    | c11  c02 |    (5.7.16)

where:

c11 = m11/m00    (5.7.17)

c20 = m20/m00    (5.7.18)

c02 = m02/m00    (5.7.19)

The eigenvalues of this matrix give the lengths of the axes corresponding to the perpendicular eigenvectors. Once the long axis has been determined we can search along this axis for the boundary points which cross this axis. The distance between these points gives the estimated length of the log. Correspondingly, we can find the width of the log by finding the intersection of the short axis with the boundary. These width and length estimates can be used in a number of the above scaling rules.
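The moment-based axis estimation can be sketched directly from equations (5.7.9)-(5.7.19). The version below is an illustration, not the thesis implementation: it sums over region pixels rather than using the boundary line-integral form of Green's theorem, and it takes the second moments about the centroid so that the eigenvectors of C give the axis directions directly:

```python
import numpy as np

def region_axes(mask):
    """Centroid and principal axes of a binary region.

    Returns (centroid, eigenvalues, eigenvectors); the eigenvector with
    the largest eigenvalue gives the long-axis direction.
    """
    ys, xs = np.nonzero(mask)
    xc, yc = xs.mean(), ys.mean()            # centroid, eq. (5.7.15)
    dx, dy = xs - xc, ys - yc
    # second moments taken about the centroid (an assumption of this sketch)
    c20 = (dx * dx).mean()
    c02 = (dy * dy).mean()
    c11 = (dx * dy).mean()
    C = np.array([[c20, c11], [c11, c02]])   # cf. eq. (5.7.16)
    evals, evecs = np.linalg.eigh(C)
    return (xc, yc), evals, evecs

# A horizontal 3x11 bar: the long axis should point along x.
mask = np.zeros((9, 15), dtype=bool)
mask[3:6, 2:13] = True
centroid, evals, evecs = region_axes(mask)
long_axis = evecs[:, np.argmax(evals)]
```

Searching the boundary along `long_axis` from the centroid then yields the length and width estimates used in the scaling rules.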
We should be able to obtain a more accurate volume calculation by integrating the incremental areas measured along the long axis. We can also obtain the radius of the middle of the log, which is needed in some rules, by measuring the width of the log along the short axis (through the centroid). These measurements are depicted in figure 5.46. Note that the above procedure will not work very well for objects that are warped or curved, as the central axis cannot then be approximated as a straight line as we have assumed. Also the centroid may, in such a case, lie outside the log region. The algorithm described above requires that the centroid be inside the boundary. More complex shape analysis techniques must be employed for such cases. We have not examined this problem any closer than this. However, the automated system would in all likelihood be used to measure high value logs, which are unlikely to exhibit warpage and other defects.

We have applied the above procedure to the real log images pictured earlier. The actual volumes of the visible portions of the logs were measured by hand (stick scaling) using Newton's formula, and are compared below with the volumes computed using the video method
Average of the estimates = 1.590 ; 0.630

It is seen that the correspondence of the various estimates to the hand measured values varies somewhat; the differences range from about 2% to 20%. The experiments described in this thesis were done chiefly to show that stereo disparity can be used to segment out the log from its background, and were not designed to obtain statistics as to the accuracy of the system over a large sample of logs. More experimentation is required in this regard. Also, better techniques for obtaining the log region from the thresholded disparity map could be devised, which would increase the accuracy. However, it is evident that stereo disparity can indeed be used to get an estimate (around 10% accuracy with the methods given here).

The algorithm as it stands could now be implemented with off the shelf image processing hardware or with special purpose hardware. This has not been done, but it is the logical next step. Such a hardware implementation would enable a test of the algorithm in the industrial setting. This test would presumably bring to light some unthought of practical problems which need solving, and could also be used to provide a statistical estimate of the obtainable volume accuracy (after performing a large number of tests on different logs). Following this prototype phase, a production model could be developed, if this was indicated by the technical and economic analysis of the prototype performance.

5.8 - Summary of Chapter 5

- An efficient method for implementing the zero crossing pyramid construction was presented.

- Experiments were performed to test the performance of the matching algorithms on random noise stereograms with non-constant disparity functions.

- The RMS disparity errors were seen to decrease as the disparity functions became smoother.
- The disparity errors were seen to be concentrated in small patches.

- The warping or transformation reconstruction method worked better than either the 9x9 averaging or relaxation reconstruction methods, with regard to decreased RMS disparity error.

- The RMS disparity errors were seen to increase with the disparity range of the disparity function. This can be explained as being due to the reconstruction error.

- The filtering effect predicted in chapter 4.3 was observed experimentally in the case of one dimensional random image pairs. Its effect on the matching of two dimensional image pairs could not be ascertained.

- It was shown experimentally that the increase in the RMS disparity error when gaussian white noise was added to the input images was linearly related to the standard deviation of the added noise. This was in agreement with the predictions of chapter 4.2.

- The multi-resolution DispScan algorithm was seen to be slightly more accurate than the simplified matching algorithm. This improvement, however, was at the expense of increased computation.

- The simplified matching algorithm was successfully applied to the log scaling problem.

- Moment based methods were developed to estimate the long and short axes of the logs, from which measurements can be made to be used in volume estimate calculations.

- The difference between the hand measured volume and the volume estimate obtained from these calculations was on the order of 2 to 20%. The errors are due in part to the boundary finding methods used and in part to the pixel quantization. Errors also arise from incorrect disparity values near the true boundary. Better boundary finding methods may result in more accurate estimates.

VI - SCALE SPACE FEATURE MATCHING

6.1 - Introduction

This chapter discusses the possibility of using scale space representations in the stereo matching process. The topics covered in this chapter are illustrated in figure 6.1. We have shown
that one of the main problems that the discrete multi-resolution feature matching algorithms suffer from is the error induced by the reconstruction process. This error is largest at low resolutions and can cause the multi-resolution matching algorithms to become unstable. It would be preferable, then, to have an algorithm in which the reconstruction process would not be required, at least at low resolutions. We have also seen that the accuracy of the disparity measurements decreases as the resolution decreases, at least for 'tilted' surfaces. Thus it would be preferable to have a matching algorithm that made disparity measurements at high resolutions only. Let us summarize these two conditions on the matching algorithm as follows:

C1 - The disparity measurements are to be interpolated only at the highest resolution level.

C2 - Disparity measurements are to be made (or used) only at the highest resolution level.

From the above, we can say that a desirable feature matching algorithm would be one that in some way refrained from making any disparity measurements, and interpolating these measurements, until the highest possible resolution level. However, as seen earlier, the ambiguity between features increases as the resolution increases, making the matching problem very complex. In order to reduce the amount of feature ambiguity we must reduce the resolution. However, reducing the resolution also reduces the accuracy of the disparity measurement. Clearly, some form of multi-resolution matching is necessary. The question that remains is this: is there any multi-resolution
feature matching algorithm that can satisfy conditions C1 and C2 defined above?

FIGURE 6.1 The topics covered in this chapter.

The answer to this question lies in the scale space representation of the stereo images. Before we describe how the use of the scale space representation allows this question to be answered in the affirmative, we should discuss again the reasons why discrete multi-resolution matching algorithms cannot satisfy conditions C1 and C2. In order for the matching process to take place at a (discrete) resolution level, with a minimum of feature ambiguity, we must have a good enough estimate of the disparity so that the size of the search region for a match is sufficiently small to ensure that the percentage of false matches is suitably small. This means that we must have some a priori knowledge of the disparity function, which can only come from the lower resolution matching process; that is, the algorithm must have measured the disparity at some lower resolution. This means that condition C2 cannot be met. Furthermore, since the low resolution disparity measurements
must be sparser than the features to be matched at the higher resolutions, an interpolation or reconstruction operation is required to provide the higher density of disparity information required by the higher resolution matching processes. Therefore, it is clear that condition C1 cannot be met by these types of algorithms. It can be seen that the fundamental problem that these discrete matching methods have is that they perform a given matching operation at one resolution at a given time. That is, although they may use information from other resolutions to guide the matching, the matching itself is between features at a single resolution level only. This is the key point. If one allows the matching algorithm to match features across resolution levels, then conditions C1 and C2 can indeed be satisfied.

This brings us to the idea of scale space matching. The scale map of an image provides an integrated multi-resolution representation of the image. There is no built-in resolution chauvinism to cause one to believe that matching should be done at one resolution at a time. Look again at the scale maps of the simulated and real stereo pairs that were shown in chapters 4 and 5. Now try this simple experiment. Get two identical pieces of cardboard or paper with long thin slits cut in them. Place these over the scale maps of a stereo pair such that the slits lie over a line of constant resolution (the same σ in both). Now try to do the matching (assuming that you have an error prone estimate of the actual matches). It is pretty difficult. Now remove the pieces of cardboard and try to match the scale map contours. If you are like most humans this will be a much easier task than the previous one. Presumably this will be the case for computers as well. If one examines the way in which humans perform the task of matching the contours of the scale maps, they find that it is the global shape of the contours and the relationships between the contours that are used to perform the matching.
This is due to the fact that, though segments of the scale map contours over a narrow range of σ values may be quite ambiguous, contour segments which cover a large range of σ values are much less ambiguous. Since, in the scale map, the contours are continuous in x and σ, knowing the correspondence between contours at low resolutions means that one knows the correspondence at high resolutions as well, and vice-versa. Thus, one needs only measure the disparity at one resolution (i.e. the highest resolution, for the smallest error) to know the disparity values at all resolutions. Thus condition C2 is satisfied. Furthermore, interpolation or reconstruction of the disparity function is required only at the highest resolution (so that a suitably dense set of disparity values is available for the next processing step). Thus condition C1 is satisfied. From this we conclude that, given that the process of scale map matching is feasible, such a process would be preferred to the discrete multi-resolution algorithms described in the earlier part of this thesis. In the following section we examine some methods that have been developed for matching scale space image representations.

6.2 - Matching of Scale Space Image Representations

As the concept of scale space image representations is fairly new, it stands to reason that the application of these representations to computational vision processes is only getting started. There has, to the author's knowledge, been only one system described in the literature that explicitly matches scale space function representations. This system, described in the thesis of F. Mokhtarian (Mokhtarian, 1984), is used to match the scale space representations of the curvature functions of two planar curves (see Mackworth and Mokhtarian, 1984).
This system was designed for a limited domain, that is, one in which the scale maps to be matched are scaled and translated versions of each other. The reasoning behind this matching algorithm is as follows.

The scale map can be considered as a tree structure; Witkin (1983) was the first to point this out. The root of this tree is an imaginary contour which encloses all of the real scale map contours. Each of the real contours in the scale map encloses zero or more other contours, which correspond to the children of that contour. Each of these children encloses its own children, and so on. Thus the scale map can be thought of as a multi-level tree. Each of the nodes in the tree corresponds to a connected scale space contour, and has associated with it a left and right branch and a peak (unless the contour is incomplete, in which case it passes through the left, right or top boundaries of the scale map). Since the scale maps are assumed to be related by only a scaling and a translation, the transformation between corresponding contours in a pair of scale maps has only two parameters which need be determined. Mackworth and Mokhtarian's matching algorithm determines the minimum cost tree match, where the cost of a contour match is defined as the average distance between contours once the smaller of the two has been transformed (i.e. scaled and translated). Incomplete contours are not matched. The algorithm used to determine the lowest cost node matches is an adaption of the Uniform Cost Algorithm (Nilsson, 1971). Full details of the matching algorithm can be found in Mokhtarian's thesis (Mokhtarian, 1984).

We have developed a somewhat simpler scale map matching algorithm which operates in the same domain as Mackworth and Mokhtarian's algorithm.
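The containment tree just described can be sketched as follows. The representation is hypothetical: each closed contour is summarized by the interval [left, right] it spans at its base and by its peak σ, and contour B is made a child of contour A when A's interval strictly encloses B's. The root is the imaginary all-enclosing contour.

```python
class Contour:
    def __init__(self, left, right, peak):
        self.left, self.right, self.peak = left, right, peak
        self.children = []                  # contours enclosed by this one

def build_tree(contours):
    """Build the multi-level containment tree of scale map contours.
    A stack-based sweep over contours sorted by left edge (widest
    first on ties) nests each contour under its tightest encloser."""
    root = Contour(float("-inf"), float("inf"), float("inf"))
    stack = [root]
    for c in sorted(contours, key=lambda c: (c.left, -c.right)):
        # pop until the top of the stack strictly encloses c
        while not (stack[-1].left < c.left and c.right < stack[-1].right):
            stack.pop()
        stack[-1].children.append(c)
        stack.append(c)
    return root

a = Contour(0, 10, 5.0)                     # encloses b and c
b = Contour(1, 4, 2.0)                      # encloses d
c = Contour(5, 9, 3.0)
d = Contour(2, 3, 1.0)
root = build_tree([a, b, c, d])
```

The sweep runs in O(n log n) for n contours, since each contour is pushed and popped at most once after the sort.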
As with Mackworth and Mokhtarian's algorithm, we create a tree from the scale map whose nodes correspond to scale map contours, and the children of these nodes are those scale map contours enclosed by the node contour. However, instead of performing a search to find the lowest cost (as defined above) set of node matches, our algorithm merely tries to maximize the correlation of the polarity of the scale map contours along the line of minimum σ. The polarity of a scale map contour is defined, in this instance, to be the sign of the ∇²G filtered signal outside (that is, above and to the sides of) the contour. The ordered list of contour polarities along the minimum σ line provides a signature of the scale map, which can be used to match scale maps. The autocorrelation of this signature almost always, in practice, exhibits a distinct peak at some shift. Thus, to match two scale maps we can measure the correlation of the ordered polarity lists along the minimum σ line. The peak of this correlation function will give the horizontal shift between corresponding scale map contours at minimum σ. Note that, for non-zero disparity gradients, there will be scale map contours in the edge regions of one of the scale maps which have no corresponding contour in the other, due to the difference in the effective field of view in the two images. Thus, in this case, the horizontal shift predicted by the correlation will not be zero, but will be equal to the number of these 'new' boundary contours on the left side of the scale map. Also, if there is a non-zero disparity gradient, new contours may appear at the bottom of one of the scale maps and will not correspond with any of the contours in the other scale map. In this case the correlation peak will be diminished from its maximum possible value and correct matches cannot be made for all contours. The larger the disparity is, the greater the number of new contours.
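The polarity signature correlation can be sketched as follows (names are hypothetical; the signatures are the ordered ±1 polarity lists read along the minimum σ line). The peak of the cross-correlation gives the shift that best aligns the two signatures — for example, the number of 'new' boundary contours at one edge.

```python
import numpy as np

def signature_shift(sig_left, sig_right):
    """Return the lag (in contours) at which the two polarity
    signatures correlate best.  Lag 0 sits at index len(right) - 1
    of numpy's 'full' correlation output."""
    a = np.asarray(sig_left, dtype=float)
    b = np.asarray(sig_right, dtype=float)
    corr = np.correlate(a, b, mode="full")
    return int(np.argmax(corr)) - (len(b) - 1)

rng = np.random.default_rng(0)
sig = rng.choice([-1, 1], size=50)
# Right-hand signature: three 'new' contours prepended at the left edge.
right = np.concatenate([rng.choice([-1, 1], size=3), sig[:-3]])
lag = signature_shift(sig, right)           # aligns sig[n] with right[n + 3]
```

Because the polarities are ±1, the correlation value at the correct lag equals the number of correctly aligned contours, so a diminished peak directly reflects the unmatched 'new' contours discussed above.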
To handle this, our algorithm does the following. First we make a list consisting of all of the possible matches for each contour in the left hand scale map. We order the contours in the left and right scale maps according to height. If we assume that the left image is the expanded one, then we know that a given contour in the left scale map cannot match any contour in the right scale map that is larger. Also, we know that matching contours must have the same polarity. Furthermore, since the disparity function is linear, we know that the positional order of contours in the right hand scale map must be the same as the order of the contours in the left map that they match to. That is, there are no position reversals. [13] We can determine the proper correspondences by applying these three constraints to weed out the list of possible matches. This algorithm has been tested on a number of one dimensional stereo scale map pairs of random Gaussian noise, with linear disparity functions. The matching algorithm matched perfectly in all of these cases. These cases were described in chapter 5.4, where they were used to test for the presence of the filtering effect described in chapter 4.3.

There are in the literature descriptions of systems which perform matching of multi-resolution image representations that are similar to the scale map representations. Most notable of these is the representation of Crowley (Crowley, 1984, Crowley and Parker, 1984, and Crowley and Stern, 1984). This representation is seen to be a coarsely quantized (in the σ value) scale map, where the features are the peaks (what we have called extrema) and ridges in the two dimensional ∇²G filtered image. [14] This representation is also in the form of a tree, where the peaks and ridges are the nodes, and they are linked both at a given resolution level as well as across resolution levels.
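Returning to the constraint-based weeding described at the start of this section, a minimal sketch is given below. The contour representation is hypothetical, and a simple greedy left-to-right sweep stands in for the full resolution of any remaining ambiguity.

```python
def prune_matches(left, right):
    """Weed out candidate matches with the three constraints of the
    text: matching contours have the same polarity; a left contour
    (the left image being the expanded one) cannot match a larger
    right contour; and there are no position reversals.  Contours
    are (position, height, polarity) tuples, ordered by position."""
    matches, last = [], -1
    for i, (_, h_l, p_l) in enumerate(left):
        candidates = [j for j, (_, h_r, p_r) in enumerate(right)
                      if p_r == p_l          # same polarity
                      and h_r <= h_l         # right contour not larger
                      and j > last]          # no position reversals
        if candidates:                       # greedily take the first
            matches.append((i, candidates[0]))
            last = candidates[0]
    return matches

left  = [(10, 5.0, +1), (20, 3.0, -1), (30, 4.0, +1)]
right = [( 8, 4.5, +1), (17, 2.5, -1), (26, 3.8, +1)]
```

For the linear-disparity domain considered here the three constraints usually leave a unique candidate per contour, which is why this simple sweep suffices in the experiments of chapter 5.4.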
In addition, the local maxima of the three dimensional filter output (analogous to a quantized 2D scale space transform) are also used as features, and hence become nodes in the representation. It is clear that this representation is a type of scale space representation, although the tree structure is different than the one we have described earlier. The application that Crowley had in mind when he developed this representation was the description and matching of shapes. It is evident that his representation can also be used to perform the stereo correspondence matching. Indeed, a colleague of his (Lowrie, 1984) did, in fact, use this representation to perform matching of stereo pairs.

Just as a note in passing to complete this section, Yuille and Poggio, in one of their recent technical reports (Yuille and Poggio, 1983a), commented that they were looking into using scale maps in matching stereo pairs. It is clear that the scale map concept is turning out to be a useful practical tool as well as a theoretical tool.

[13] Such reversals can occur with nonlinear disparities, such as are obtained for long thin objects such as wires or poles.

[14] Actually, for computational reasons, Crowley uses the DOG or Difference of Gaussians operator, which can be shown to be a good approximation to the ∇²G operator (see e.g. Marr and Hildreth, 1980).

6.3 - Matching of Two Dimensional Scale Maps

The above mentioned methods are all one dimensional processes in that matching is done only along a line in the image plane, and the scale maps that are matched are only one dimensional. There are two drawbacks to this one-dimensional matching process. The first drawback is that such methods are unable to match scale maps derived from scenes having an appreciable vertical disparity component.
Granted, such a condition may be rare, depending on the camera geometries, and if vertical disparities are present the images may be rectifiable, to produce horizontal epipolar lines. The second drawback occurs when the scale maps are computed via two dimensional filtering of the image, as is the case in a number of stereo systems (e.g. Grimson, 1981a). It can be shown that a one dimensional slice of a two dimensional scale map (as defined by Yuille and Poggio, 1984a (equation 4.3.57 of this thesis)) is, in general, not well-behaved, where well-behavedness here refers to the property of scale maps of having contours that never contain points of local minima (or, as Yuille and Poggio call them, upside down mountains and volcanoes). The proof of this statement is given in the appendix, along with the conditions under which a slice of a two dimensional scale map is well-behaved. A scale map that is not well-behaved may possibly contain contours that never reach the minimum σ line. Clearly, the matching algorithms described above will encounter difficulty in such a case. Just to drive home the non-well-behavedness of these one dimensional slices of scale maps, consider the maps shown in figure 6.2. These represent three adjacent (i.e. slightly different y coordinates) slices of a two dimensional scale map for a Gaussian white noise image function. [15] The presence of scale map contours with local minima is evident. These local minima are marked by small open circles on the maps.

There are three ways in which we can remedy this problem. The first way is suggested by the theory developed in chapter 4.3 concerning the skew map representation of a two dimensional function. It was shown then that the skew map (which has only one spatial dimension) has the same properties as a one dimensional scale map.
Since Yuille and Poggio (1983a) have shown that all one dimensional scale maps are well-behaved, then so is the skew map. Thus we conclude that we can use the skew map representation as our input to the matching algorithms. There is a problem with this approach, however. The way in which the skew map is computed is to take a two-dimensional slice of a four-dimensional function (i.e. that defined by equation 4.3.56). Thus, the production of a skew map involves an order of magnitude increase in the amount of computation over the production of a scale map (which involves computation of a three dimensional function). This would rule out such an approach for use in all practical implementations.

[15] These examples of one dimensional scale map slices were computed and plotted by Hans Wasmeier of the Electrical Engineering Department, UBC, as part of a course project.

FIGURE 6.2 Three adjacent one dimensional slices of a two dimensional scale map exhibiting non-well-behavedness.

A second remedy to the problem of matching two dimensional image representations involves actually performing the matching in two dimensions. Yuille and Poggio have shown that the full three dimensional form of the two dimensional scale map is always well-behaved (even though slices of it are not). Although the two dimensional scale map contours may contain saddle points, which look like local minima when sliced, there is always a path from such a point to the minimum σ line. Thus contours can be matched and tracked to the maximum possible resolution. This solution, however, suffers from the same problem as the skew map method. Matching a two dimensional scale map involves an order of magnitude
The increase in computation probably  their  tree  will  matching  than does the spatial filtering process. Thus this method is  preferable to the skew map method.  The  third solution, and probably the best, is to perform one dimensional filtering only.  That is, compute  separate one dimensional scale maps along each epipolar line in the images  and match these. The filtering and matching of magnitude drawback image  to  less than this  the  solution  computational  two dimensional scale map  is the  effect of  one  complexity  matching  will both be an order  method.  dimensional filtering on  The only possible a  two  dimensional  function. As Grimson (1981a) points out, one dimensional, or directional, filtering tends  to smear out edges in a direction perpendicular to the line of filtering. We can summarize by stating that, if we want to match scale based representations of two dimensional image functions, we must do one of the following: 1  -  If  the  epipolar  lines  of  the  imaging  system  are  not  known  perform a full two dimensional matching of the two dimensional scale maps. If the epipolar lines of the imaging system are known then:  then  we  must  270  2 - We  can perform a full two dimensional matching of the two dimensional scale  maps anyway. 3 - We  can compute the skew maps at each epipolar line of the two images and  match these. 4 - We  can perform one dimensional filtering along the epipolar lines in order to  compute the one dimensional scale maps along these lines, and match these maps. From the point of view of minimizing computational complexity method 4 is the best. However, other  considerations, such  as minimization  of feature  distortion,  or lack of  information about the imaging geometry may indicate that one of the other methods be used.  
271  6.4 -  Problems With Scale Space Feature Matching  As  with  theoretical,  all  computational  vision  processes,  there  with implementing stereo algorithms based  and foremost  of these problems is the  scale  map,  a  complete  convolution of  required. In order to maintain coherence the o  space  feature  and  matching. First  the  value of a  image  of the scale  with  used in the construction of  the  appropriate  V G filter 2  is  map contours the quantization between  one can see  that this quantization should be at most on the order of  (i.e. a^/aj _-^<l.l for a logarithmic scale), and smaller for regions where the scale map c  contours  approach  aj /a _-^ = 2, c  on scale  problems, practical  values must be fairly small. Looking at the examples shown in chapters 4 and 5 and  later in this chapter, 1.1:1  some  immense computational load that is imposed in the  computation of the scale maps themselves. For each the  are  there  k  different a  horizontal. is  very  Obviously, little  in  coherence  the  discrete  between  values. From this it is evident that the  required to compute a scale  map, over the same  least  times  LOG(2.)/LOG(l.l) = 7.27  Crowley (Crowley, 1984  as  much  o  as  for  multi-resolution algorithms,  segments  of  a  scale  map  where  contour  at  number of complete image convolutions range the  as the discrete  discrete  algorithms, is at  algorithms.  Note  that  the  and Crowley and Parker,  1984)  and Lowrie (Lowrie, 1983)  methods, which were not designed with the scale  space  in mind, have quantization ratios  i/ 2:1.  These  contour problem.  
methods  coherence This  would  and  would  thus  require  is shown in figure  shown in figure 4.5  be  6.3  to a ratio of /2:1  expected the  to  have  addition  where  we  of  have  problems with heuristics  loss  designed  quantized the  left  of to  matching of  scale  map  handle  this  hand scale  map  and linked the contour segments using the Crowley  method (basically linking the contour segments at one a  value to the nearest contour segment  of the same sign for the next lowest a value). Comparing this (coarsely) quantized version of the scale map to the finely quantized (ratio =  1.018:1) scale map of figure 4.5,  we see that  the coarsely quantized scale map differs in some places from the finely quantized scale map. It must be noted, however, that, by and large,  the  two scale  maps agree  possible to obtain matches between two coarsely quantized scale maps.  and it may be  272  FIGURE  6.3 A linked coarsely quantized scale map (v/2:l a ratio)  The computational load is a great problem in that, with current hardware, scale space matching  algorithms,  in their  full  two  dimensional glory, take  far  too  long to  be  of any  practical use. Indeed, all the experiments on scale space matching described in this thesis are for  the  one  problem  dimensional case  is but a paper  faster and more  tiger,  complex  only.  From  another  point  in that one can always (at  hardware  load. For the time being, however,  architectures  of  view,  the  computational  least to a certain  load  point) create  that will be able to bear the computational  it is a problem, and scale space matching systems  will  not be ready for the real-time world for a while yet A  more  functions are means  that  fundamental problem results  not, the  in general, shape  of  merely  from the  expanded  corresponding  scale  or map  fact  that  compressed contours  the  left  and  versions of each may  not  be  right other.  mere  image This  expanded  versions of each other. 
This may make some matching algorithms fail, if they were based on  273  matching  the  shapes.  Furthermore,  nonlinear  disparity  wherein the scale map contours split and merge  functions  can  give  rise  in different fashions in the  to  instances  left and right  scale maps. This essentially means that the left and right sides of a closed contour scale map may correspond to the right and left side of two adjacent scale map. This is illustrated in Figure 6.4.  Note that one cannot  contours  achieve  in one  in the  other  correspondence  in  this case by matching complete contours.  A possible solution to this problem is to manipulate the scale maps so that they are transformed into an isomorphic pair. One can do this by identifying places in the where splitting or merging of scale map contours to  occur.  For  example  these points are scale  maps  narrow  necks  between  closely  indicated (with a small circle)  derived for  a  sinusoidal disparity  points have been detected  (relative  previously  discussed  to the other scale map)  spaced  scale  in figure 6.5  map  contours.  Some  =  40sin(7rx/256)).  Once  one can edit the scale map by splitting or merging the  methods.  Alternatively,  a  scale maps are tree  is likely of  which shows a stereo pair of  function (d(x)  that have been singled out in such a way that the the  scale map  matching  matchable method  these  contours  with one  which  of  performs  splitting and merging of nodes in order to make the trees isomorphic can be used (see  for  example the method of Lu, 1984). The tree in this case would be the tree whose nodes are scale map contours whose children are contours that are contained by the father contour is  the  tree  discussed  earlier  in  the  discussion  of  the  scale  map  matching  (this  algorithm  of  Mackworth and Mokhtarian).  A  second  problem that  is  encountered  in practice  is  that  intensity functions have added to them noise (i.e. 
from the camera is uncorrelated  between  of  noise.  electronics),  If  the  image  and this noise  the left and right image functions, then the scale maps of the two  images will contain some non corresponding contours, or at the very least the positions of the corresponding  contours  will  be  perturbed  merging of the  contours  will  algorithms  be able  to handle the  must  take place  enough (even  so  that  for linear  extra noise  non  corresponding  disparity functions).  generated  contours  as  splitting and The matching  well  as  the  non  corresponding splitting and merging of the scale map contours. The effect of additive Gaussian  274  LEFT  RIGHT  FIGURE 6.4  Splitting and merging of scale map contours  for nonlinear disparities  FIGURE 6.5  A stereo pair of scale maps with sinusoidal disparity  white noise on the scale maps of a stereo pair for the linear disparity case can be seen in  275  the  experiments  of  chapter  5.  In  practice  functions. This can be seen in figure 6.6 Note that the  real  images  exhibit  very  non-linear  disparity  which shows the scale maps of a real image pair.  splitting and merging of contours can be seen. However the human eye can  easily determine the  correspondences  between  these two maps. This indicates that it may be  possible to fashion a scale space matcher that will work on real stereo imagery.  FIGURE 6.6 The scale maps of a real stereo image pair.  277  6.5 -  Implications for Biological Depth Perception  The about  above  whether  Hildreth  scale  space  matching  or not such processes  techniques  raise  a  number  of interesting  may be used in biological vision  systems.  questions Marr and  (1980) and Marr and Poggio (1981) have pointed out current physiological evidence  that the human visual system does contain V G type filtering mechanisms that cover a wide 2  range of resolutions or scales.  
Marr and Ullman (1981) go on to claim that certain neurons in the visual cortex may be performing zero crossing detection. Thus, it would seem that all the neural machinery is present to, at the very least, compute some form of scale map. What is not clear is whether there are mechanisms, higher up in the visual cortex processing hierarchy, that perform the matching of these scale map like representations. If such mechanisms exist, are they like the ones discussed earlier (i.e. tree matching), or do they use some other algorithm? We cannot say at present, and must wait for further neurophysiological research.

One can obtain evidence for scale map based processing in the human visual system via psychological experiments as well. One such piece of evidence is the observation that people can fuse stereograms of sparse line drawings as well as dense random dot stereograms. The discrete multi-resolution matching algorithms analyzed earlier can be seen to have a lot of trouble with very sparse feature sets, since the reconstruction process is very error prone. Since the scale map method does not require any reconstruction to take place, the density of the feature set does not affect the quality of the correspondences that are achieved. Julesz (1970) pointed out the apparent differences in the performance of his matching algorithms on sparse and dense feature sets, and proposed that there be two mechanisms in the human visual system for performing stereopsis. One would work on the densely featured images, and the other would perform the relatively simple task of matching sparse images. This conclusion is perfectly valid, except that a single method that could perform both of these tasks would be preferable, if only on the grounds of simplicity and elegance. According to current thought in biological circles, development of the human visual system is due to evolutionary pressures.
That is, any feature of the visual system, such as depth perception, was developed because in doing so the survivability of the organism was in some way enhanced with the addition of that feature. However, while it is fairly easy to come up with ways in which the addition of depth perception would benefit an organism, it is difficult to see what advantage there would be in the ability to handle very sparse stereograms, which rarely, if ever, occur in nature. Thus it is unreasonable to expect the human visual system to evolve a separate depth perception mechanism capable of handling only sparse imagery. The conclusion we can make is that the human depth perception mechanism, while developed to handle dense imagery, can also, in the course of its normal functioning, handle very sparse imagery as well. The point to be made is that scale map matching methods can handle dense and sparse images with equal facility and thus, in the light of the psychological observation noted above, would be preferred over a method such as the discrete multi-resolution method, whose performance degrades as the feature density decreases.

6.6 - Summary of Chapter 6

- A multi-resolution matching algorithm that used disparity measurements obtained at high resolutions only, and performed disparity function reconstruction at high resolution only, would perform better than one that used low resolution (and higher error) disparity measurements and disparity function reconstruction.

- Scale space matching algorithms need only measure and reconstruct the disparity function at the highest available resolution, thereby minimizing resolution dependent errors.

- A constraint based method for matching scale maps for linear disparities is proposed.

- This algorithm is successfully applied to random image pairs (chapter 5.4).
- Scale space matching of two dimensional image pairs is problematic due to the increased computational requirements.

- Nonlinear disparity functions distort the scale maps, making the determination of correspondences quite difficult.

- Scale maps can handle sparse stereo pairs just as easily as dense stereo pairs, which is not the case for the other multi-resolution matching methods described in this thesis (but is the case for the human vision system).

VII - BINOCULAR DIFFREQUENCY

7.1 - Introduction

In chapter 4.3, we proved, using the Scale Space Transform, that the measurements of position disparity made from the zero crossings of ∇²G filtered images will not be exact if the true disparity function is not constant. This error was shown, for random noise images, to increase as the filter resolution decreased. This observation leads one to ask whether or not there is some measurement that can be made, other than position disparity, that provides a better shape descriptor.

In this chapter we show that one can obtain, from the Scale Map, measurements of binocular diffrequency, loosely defined as the difference in spatial frequency content of two image functions. It will be shown that, in the case of linear disparity functions, these diffrequency values are directly related to the disparity gradient and contain none of the errors, induced by spatial filtering, that were observed in the position disparity measurements. We can conclude that for low spatial resolutions, where the position disparity errors are large, the diffrequency measurements may be a better descriptor of the surface shape. We present the findings of some psychophysical research done by others which tends to support this conclusion.
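The zero crossings of ∇²G filtered images referred to above, tracked over a range of filter widths, are what make up a scale map. The following is a minimal one-dimensional sketch of that construction; the function names, parameters, and the white-noise test signal are our own illustrative choices, not the thesis's implementation:

```python
import math, random

def log_kernel(sigma):
    """Sampled second derivative of a Gaussian: the 1-D analogue of the
    nabla-squared-G filter discussed in the text."""
    half = int(4 * sigma) + 1
    return [(x * x / sigma**4 - 1.0 / sigma**2) * math.exp(-x * x / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]

def convolve(signal, kernel):
    """Direct convolution, truncated at the signal boundaries."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += signal[idx] * k
        out.append(acc)
    return out

def zero_crossings(values):
    """Indices where the filtered signal changes sign."""
    return [i for i in range(len(values) - 1) if values[i] * values[i + 1] < 0]

def scale_map(signal, sigmas):
    """One set of zero-crossing positions per filter scale: a discrete scale map."""
    return {s: zero_crossings(convolve(signal, log_kernel(s))) for s in sigmas}

# a white Gaussian noise "image", as in the experiments of chapter 5
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(256)]
smap = scale_map(noise, [2.0, 4.0, 8.0])
```

Coarser filters yield fewer zero crossings, so the contour density of the scale map drops as σ increases, which is also why fewer diffrequency measurements are available at coarse scales.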
The results of experiments on one dimensional functions are presented which show that the measurements of diffrequency from the Scale Maps do provide a good estimate of the disparity gradient, and that the errors in these measurements are more or less independent of the spatial resolution of the filters and are due primarily to quantization. Figure 7.1 shows the structure of this chapter.

FIGURE 7.1  The topics to be covered in this chapter: measuring diffrequency from a pair of scale maps (7.2), psychophysical evidence for diffrequency (7.3), experiments on linear disparities (7.4), and summary (7.5).

7.2 - Diffrequency Measurement

Let us consider a situation wherein we are viewing a surface that gives rise to a disparity function with a constant, non-zero, gradient when viewed binocularly. For example, a tilted planar surface viewed at a distance large compared to the inter-ocular baseline gives rise to an approximately constant disparity gradient. The true disparity function has the following form, in this case:

d(x_L) = x_R - x_L = β₀ + β₁x_L    (7.2.1)

It can be seen that if the left eye sees a light intensity pattern g(x) and the right eye a light intensity pattern f(x), then g(x) and f(x) are related as follows:

g(x_L) = f(x_R)    (7.2.2)

and hence we can write, using equation (7.2.1):

g(x_L) = f(β₀ + (1 + β₁)x_L)    (7.2.3)

We can now, using equation (4.3.7), show that the SSTs of the left and right images are related as follows:

G(x,σ) = (1 + β₁)^(-1) F(β₀ + (1 + β₁)x, (1 + β₁)σ)    (7.2.4)

or

F(x,σ) = (1 + β₁) G((x - β₀)/(1 + β₁), σ/(1 + β₁))    (7.2.5)

Note that the scale map of f(x) is obtained from the scale map of g(x) by a uniform expansion by a factor 1/(1 + β₁) in both the x and σ directions.
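Equations (7.2.1)-(7.2.3) can be checked with a small numerical example. In the sketch below the intensity pattern f and the parameter values are arbitrary choices of ours, used only to verify the algebra:

```python
import math

beta0, beta1 = 0.5, 0.1                   # illustrative disparity parameters

def f(x):
    """Right-eye intensity pattern (any smooth function will do)."""
    return math.sin(3.0 * x) + 0.5 * math.cos(7.0 * x)

def g(x_l):
    """Left-eye pattern implied by equation (7.2.3)."""
    return f(beta0 + (1.0 + beta1) * x_l)

def disparity(x_l):
    """Equation (7.2.1): d(x_L) = x_R - x_L = beta0 + beta1 * x_L."""
    x_r = beta0 + (1.0 + beta1) * x_l
    return x_r - x_l

# equation (7.2.2): both eyes see the same intensity at corresponding points
for x_l in (-1.0, 0.0, 0.7):
    x_r = x_l + disparity(x_l)
    assert abs(g(x_l) - f(x_r)) < 1e-12
    assert abs(disparity(x_l) - (beta0 + beta1 * x_l)) < 1e-12
```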
Look again at figure 4.5, which shows a scale map pair for a tilted surface. It is clear that the left and right scale maps are scaled versions of each other.

Foveal Diffrequency

If we make the coordinate transformations:

r = log₂(σ) and y = x/σ    (7.2.6)

it can be shown that:

G′(y,r) = G(x,σ) = 2^(-k) F′(y + β₀2^(-r), r + k)    (7.2.7)

where k = log₂(1 + β₁) and F′(y,r) = F(x,σ). It can be seen that, for r fixed to some value r₀, G′(y,r₀) is the same as a scaled and translated version of F′(y,r). This transformation is the type one would expect to find in a foveal system, where the highest resolution (smallest σ) elements are clustered in a central region and the lower resolution elements are more loosely distributed outwards toward the periphery. Such a foveal representation, similar to the human system, is shown schematically in figure 7.2. This diagram shows the field of view for each resolution level, where the number of elements in any one resolution level is the same. We have assumed that there are elements sensitive to the central region at all resolutions. There is evidence, both neurological and psychological, that this is indeed the case in humans (see, for example, Wilson and Giese (1977), especially equation (10)).

Applying the transformations of equation (7.2.6) to the foveal representation of figure 7.2 results in the representation shown in figure 7.3. The most striking feature of this image representation is its homogeneity, which is reminiscent of the spatial frequency columns found in the human visual cortex (Maffei and Fiorentini (1977)). Because of this effect of the transformation (7.2.6) on the foveal representation, we will denote the image of the SST under the transformation (7.2.6) as the foveal scale space transform or FSST.

FIGURE 7.2  The spatial organization of a foveal image representation.

FIGURE 7.3  The transformed version of the foveal image representation of figure 7.2 under the transformation (7.2.6).
The FSST of a function f(x) is defined as follows:

F′(y,r) = (2^r/√(2π)) d²/dy² ∫ f(v2^r) e^(-(y-v)²/2) dv    (7.2.8)

The set of zeroes of the FSST will be referred to as the foveal scale map of f(x). If one knows β₀ and β₁, one can graphically construct the foveal scale map of g(x) from the foveal scale map of f(x), where f(x) and g(x) are the right and left eye image functions. In order to do this the correspondence problem must be solved. That is, one must be able to tell which contours in the foveal scale map of g(x) correspond with those in the foveal scale map of f(x). We will assume that this correspondence problem can be solved without any error. Using the relation expressed in equation (7.2.7) we can see that a point (y,r) on the foveal scale map of g(x) is the same as the point (y + β₀2^(-r), r + k) in the foveal scale map of f(x), where k = log₂(1 + β₁). Hence one can obtain the foveal scale map of f(x) from that of g(x) by adding to each point on each curve the vector (β₀2^(-r), k). We will call the r element of this vector the foveal diffrequency and the y element the foveal disparity. This construction process is illustrated in the example shown in figure 7.4. The important point to be made about the above process is that it is, in principle, invertible. That is, given that we know the foveal scale maps of both f(x) and g(x), we should be able to determine the parameters of the true disparity function, β₀ and β₁. This inversion process is the basis of diffrequency stereo: of determining disparity gradients from frequency differences.

Let us assume that β₀ is zero (this may be a good assumption for human vision systems, since the disparity at the centre of the visual field or fovea is always being driven to zero by the vergence mechanisms).
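The correspondence just described can be verified numerically. The sketch below (our own function names; the values of β₀, β₁ and the sample points are arbitrary) maps corresponding scale map points into foveal coordinates via (7.2.6) and checks the shift vector:

```python
import math

beta0, beta1 = 0.25, 0.2
k = math.log2(1.0 + beta1)       # the foveal diffrequency component

def to_foveal(x, sigma):
    """The coordinate transformation (7.2.6): y = x/sigma, r = log2(sigma)."""
    return x / sigma, math.log2(sigma)

def right_point(x_l, sigma_l):
    """Corresponding right-map point implied by the SST relation (7.2.4)."""
    return beta0 + (1.0 + beta1) * x_l, (1.0 + beta1) * sigma_l

for x_l, s_l in [(1.0, 2.0), (-3.0, 8.0), (5.0, 4.0)]:
    y_l, r_l = to_foveal(x_l, s_l)
    y_r, r_r = to_foveal(*right_point(x_l, s_l))
    # vertical (r) component of the shift is the constant k = log2(1 + beta1)
    assert abs((r_r - r_l) - k) < 1e-12
    # horizontal (y) component is beta0 * 2**(-r), with r taken at the
    # shifted (right-map) point
    assert abs((y_r - y_l) - beta0 * 2.0 ** (-r_r)) < 1e-12
```

With β₀ = 0 the horizontal component vanishes, and corresponding foveal contour points differ only by the constant vertical shift k.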
In particular,  In this case the left and right scale maps are the point in the  right scale map corresponding  given point in the left scale map is obtained by finding the intersection  related to a  of the line passing  through the origin and the corresponding point in the left scale map with the corresponding right  eye  scale  map  contour.  It  can  be  observed  that  the  ratio  of  the  a  values  of  corresponding right and left eye points is a constant, and that this ratio is equal to (l + p\).  286  F I G U R E 7.4  The relationship between the left and right foveal scale maps.  Thus, if we measure this ratio, the disparity gradient can be easily obtained. If we transform to the  foveal scale space the problem is even simpler, as shown in figure 7.6. In this case,  corresponding points in the two foveal scale maps are  located  directly above each other, and  the vertical shift is equal to k=log (l + 01). From this measurement the disparity gradient can 2  again be obtained.  287  FIGURE 7.6  The relationship between the left and right foveal scale maps for /3 = 0. 0  288  7.3 -  Psychophysical Evidence For Diffrequency Stereo Blakemore  presented whose  (1970)  human  period  and Fiorentini  subjects with  was different  and Maffei  binocularly ambiguous  in each  eye. Such  (1971),  in  independent  stimuli consisting  a stimulus pair  experiments,  of sinusoidal  is shown  gratings  in figure  7.7. A  unique disparity function can not be found for such stimuli, yet the subjects did not perceive any rivalrous surfaces, but instead perceived a stable tilted surface. 
Blakemore claimed that this was evidence for a binocular mechanism that was based on the perception of frequency differences between the two eyes, as the usual binocular mechanism based on retinal position disparity would not give a unique result.

Tyler and Sutter (1978) improved on Blakemore's experiment in an effort to eliminate all possibility of the use of any position disparity information. They used dynamic (i.e. time varying) random noise images such that the left and right eye images were uncorrelated. Thus there were no corresponding features in the left and right images that could be used to obtain position disparity measurements. The spatial frequency ranges of the two images were, however, different. Thus, if there is any binocular mechanism in the human visual system based on diffrequency measurements, then a perception of depth should be elicited. This is in fact what the subjects tested by Tyler and Sutter reported. The effect was weaker than for the pure sinusoidal grating stimuli, but was nonetheless present. They also found that the diffrequency mechanism operated only at medium or high diffrequency values, presumably because the error involved in using only position disparity information at low diffrequencies is small. This is in accord with the findings of chapter 4.4, wherein we showed that the disparity error does in general increase as the disparity gradient increases.

Tyler and Sutter argue that diffrequency measurements could be made at low spatial frequencies, and as such do not require a finely tuned disparity processing system. This means that diffrequency measurements could be useful for animals, such as crabs and some reptiles, that have overlapping visual fields but no conjunctive eye movements. The equations derived above relating diffrequency to disparity gradient are exact for linear disparity gradients. That is, there is no inherent error in the binocular measurement of diffrequency, in contrast to the case of position disparity measurement.

FIGURE 7.7  A pair of ambiguous sinusoidal stimuli.

This would indicate that diffrequency measurements made at low resolutions would be just as precise as at high resolutions. Thus Tyler and Sutter's argument seems to be supported by our analysis. It should be pointed out here that we can not obtain an expression for the relationship between the scale maps for a nonlinear disparity function. In such a case the diffrequency measurements may exhibit a form of filtering error such as that derived for the disparity measurement case in chapter 4.3. However, as was pointed out in chapter 4.3, if the region of interest is sufficiently small we can linearize the disparity function about the centre of that region. This linearization process will become less valid as the resolution decreases.

It has been found (Blakemore, 1970) that a fused perception of a tilted surface can be obtained even if the cumulative horizontal position disparity exceeds Panum's fusional range.¹⁶ This behaviour can not be explained as the result of matching algorithms such as Marr and Poggio's (1979). It has been found that the limiting factor in obtaining binocular fusion is the disparity gradient and not the position disparity (Burt and Julesz, 1980, and Tyler, 1973). It can be seen that a disparity gradient limited fusional range can be readily explained by positing a diffrequency mechanism. If the diffrequency value exceeds the spatial frequency scatter of the diffrequency sensitive neurons, then a diffrequency value can not be obtained. Hence fusion will not be produced.

¹⁶ Panum's fusional range is the range of disparities over which a binocular stimulus consisting of a vertical line can be brought into correspondence by the human visual system, for a given position of the eyes.

In light of these psychophysical findings, it seems evident that the human visual system processes diffrequency information in some fashion. There could be diffrequency detection neurons that receive signals from the foveal feature detection units. These neurons would be sensitive to different diffrequency ranges. There would need to be a scatter in the retinal receptive fields of these afferent feature detector neurons in order to detect the foveal disparity (β₀2^(-r)). This scatter would be larger for the higher resolution neurons. The diffrequency sensitive neurons would combine with the normal disparity sensitive neurons (see Barlow et al, 1967) to feed signals to a set of higher level cortical mechanisms performing the depth processing, which would compute the β₀ and β₁ values (or at least values of parameters which in some sense describe the depth and tilt of the surface). The fact that corresponding foveal scale map contours lie right above each other (as shown by figure 7.6) means that the connections required for diffrequency measurement can be local and regular (see figure 7.2).

The analysis presented in this section indicates that a possible reason for the existence of diffrequency measurement units in the human visual system, at low spatial frequencies at least, is to provide better surface shape descriptors than those provided by measurements of position disparity. In fact, since the measurements of position disparity at low resolutions are susceptible to the filtering errors discussed in chapter 4.3, the diffrequency measurements, which are immune to these errors (for linear disparities anyway), can be used to provide reliable depth information at low resolutions for the initial phases of the multi-resolution matching algorithms. Perhaps more importantly, the diffrequency measurement process provides an independent measure of the surface orientation, which can then be used in conjunction with the position disparity measurements to provide an improved reconstruction of the surface shape.

7.4 - Experiments

In this section we present the results of experiments designed to test the adequacy of diffrequency measurements in estimating the disparity gradient of a surface, when that surface gives rise to a random, Gaussian, intensity distribution.

The experiments proceeded as follows. We generated the left and right scale maps of one dimensional random, white, Gaussian distributed functions for constant disparity gradients of -20/255, -40/255, -60/255 and -80/255. These scale maps are the same ones shown in figures 5.23 to 5.27. We then, for each disparity gradient case, match the left and right scale maps (using the simple method described in chapter 6.2). Then, for each value of σ, we measure the diffrequency value of each scale map contour crossing that value of σ in the left hand scale map. Since the scale maps have a logarithmically scaled σ axis, the path along which we measure the diffrequency is not a straight line but, rather, is curved. The diffrequency path (i.e. the path of a corresponding point as β₁ is varied) for linear (x,σ) axes scaling through the point (x₀,σ₀) (in the left hand scale map) is given by:

σ = σ₀x/x₀    (7.4.1)

Making the logarithmic scaling transformation:

I = (255/6.65)log₂(σ - 2)    (7.4.2)

that is used in the computation of the scale maps, we get

I(x) = (255/6.65)log₂((2 + 2^(6.65I₀/255))x/x₀ - 2)    (7.4.3)

for the equation of the diffrequency path. Some sample diffrequency paths are depicted in figure 7.8.
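The path equations (7.4.1)-(7.4.3) can be checked numerically. In the sketch below the constant 255/6.65 is taken from the text, while the sample coordinates are our own illustrative choices:

```python
import math

SCALE = 255.0 / 6.65                     # constant of the logarithmic sigma axis

def sigma_to_index(sigma):
    """Equation (7.4.2): I = (255/6.65) log2(sigma - 2)."""
    return SCALE * math.log2(sigma - 2.0)

def index_to_sigma(i):
    """Inverse of (7.4.2)."""
    return 2.0 + 2.0 ** (i / SCALE)

def diffrequency_path(i0, x0, x):
    """Equation (7.4.3): the scaled-axis image of the straight path
    sigma = sigma0 * x / x0 through the point (x0, sigma0)."""
    sigma0 = index_to_sigma(i0)
    return SCALE * math.log2(sigma0 * x / x0 - 2.0)

i0, x0 = 100.0, 50.0
# the path passes through its own starting point (x0, I0)
assert abs(diffrequency_path(i0, x0, x0) - i0) < 1e-9
# and, mapped back to sigma, reproduces the linear path (7.4.1)
x1 = 60.0
sigma_on_path = index_to_sigma(diffrequency_path(i0, x0, x1))
assert abs(sigma_on_path - index_to_sigma(i0) * x1 / x0) < 1e-9
```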
Notice how these paths are almost vertical near the x origin and flatten out towards large x values.

FIGURE 7.8  The diffrequency search paths for a logarithmically scaled σ axis.

Once all of the diffrequency values have been measured, the RMS error in these diffrequency measurements is computed. The variation of the RMS diffrequency error with changes in σ for each value of disparity gradient tested is shown in figures 7.9 to 7.12.

In all the cases, the diffrequency error was on the order of 10% of the actual value for all σ values in the range tested. It can be seen, however, that the diffrequency errors for the linear disparity functions are not zero (as predicted by theory). Also, the error is seen to rise with an increase in σ. These effects can be accounted for by the quantization in the scale map contour position measurements. An expression for the diffrequency quantization error variance as a function of the disparity gradient and σ is derived in the Appendix. The standard deviation of the quantization error (the square root of the variance) is a measure of the RMS quantization error. This is plotted in figure 7.13 as a function of σ for disparity gradients of -20/255, -40/255, -60/255 and -80/255. It is seen that the quantization error does indeed form a large portion of the observed RMS diffrequency measurement error.
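The quantization effect described above is easy to reproduce: if the σ values of corresponding contour points are rounded to a discrete grid before the ratio is taken, the measured diffrequency acquires a nonzero RMS error even for a perfectly linear disparity. A small sketch (β₀ = 0; the σ sample values and the integer quantization grid are our own choices):

```python
import math

def rms(values):
    """Root-mean-square of a list of errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))

beta1 = -40.0 / 255.0                 # one of the disparity gradients tested
k_true = math.log2(1.0 + beta1)       # true diffrequency (foveal vertical shift)

errors = []
for s_left in range(8, 64, 4):
    s_right = round((1.0 + beta1) * s_left)   # sigma quantized to integer rows
    k_measured = math.log2(s_right / s_left)
    errors.append(k_measured - k_true)

err = rms(errors)
assert 0.0 < err < abs(k_true)        # a real, but fractional, error
```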
FIGURE 7.9  The RMS diffrequency error as a function of σ for β₁ = -20/255, linear disparity

FIGURE 7.10  The RMS diffrequency error as a function of σ for β₁ = -40/255, linear disparity

FIGURE 7.11  The RMS diffrequency error as a function of σ for β₁ = -60/255, linear disparity

FIGURE 7.12  The RMS diffrequency error as a function of σ for β₁ = -80/255, linear disparity

FIGURE 7.13  The standard deviation of the diffrequency quantization error

The variation in the measured value of the diffrequency (which should be independent of σ) can be explained by the fact that there are fewer contours for larger σ values, and hence fewer diffrequency measurements to be made. Since there are fewer measurements, the statistical fluctuations in the estimate of the mean diffrequency increase.

7.5 - Summary of Chapter 7

- The disparity gradient of a surface can be obtained from measurements made directly on the scale maps of a stereo image pair of that surface.

- The binocular diffrequency, defined as the ratio of the spatial frequencies of corresponding features in the scale maps of a one dimensional image pair, is equal to 1/(1 + β₁), where β₁ is the disparity gradient of the surface.

- If one makes the coordinate transformation r = log₂(σ), y = x/σ to the scale map, one obtains the foveal scale map, so called because its structure resembles the human fovea, in which the density of elements is greater for high resolutions than low.
In the foveal scale map, corresponding features lie directly above each other, resulting in a simple implementation structure for the diffrequency measurement.

- The diffrequency process has been proposed by others as an alternate (to the standard position disparity stereo process) mechanism for acquisition of depth information in animals.

- The diffrequency measurements, for linear disparities, are not affected by the spatial filtering process, as was the case for position disparity measurements. This suggests, for low resolutions at least, that diffrequency measurements may be more reliable measures of the surface shape than position disparity measurements.

- Experiments were done which show that diffrequency measurements can be obtained from random one dimensional stereo pairs for linear disparities, and that the errors in these measurements were on the order of the expected quantization error.

- The disparity gradient information obtained from diffrequency measurements is independent of the position disparity measurements. Therefore diffrequency measurements can be used with position disparity information in the disparity function reconstruction process to obtain a better reconstruction than is possible with position disparity information alone (see chapter 3.8).

VIII - CONCLUSIONS AND A LOOK TO THE FUTURE

8.1 - Summary and Conclusions

In this thesis we have presented a multi-resolution stereo feature matching algorithm that was very simple. This algorithm was based on the Marr-Poggio (1979) algorithm. We have dispensed with their computationally intensive disambiguation and in-range/out-of-range detection mechanisms on the grounds that these mechanisms do not work reliably for non-constant disparity functions.
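The disambiguation idea that replaces those mechanisms, using a low resolution disparity estimate to pick among candidate matches, can be sketched as follows. The function names and numbers are ours and purely illustrative, not taken from the thesis's experiments:

```python
def nearest_neighbour_match(left_features, right_features, estimate):
    """For each left feature, keep the right feature closest to the position
    predicted by the low-resolution disparity estimate."""
    matches = {}
    for x_l in left_features:
        predicted = x_l + estimate(x_l)
        matches[x_l] = min(right_features, key=lambda x_r: abs(x_r - predicted))
    return matches

# toy data: true disparity d(x) = 2 + 0.1x, estimate deliberately imperfect
left = [10.0, 20.0, 30.0]
right = [x + 2.0 + 0.1 * x for x in left]
estimate = lambda x: 2.5 + 0.08 * x
m = nearest_neighbour_match(left, right, estimate)

# the estimate is inexact, yet every match is still the correct one
assert all(abs(m[x] - (x + 2.0 + 0.1 * x)) < 1e-9 for x in left)
```

If the estimate's error grew beyond roughly half the spacing between candidate features, the nearest candidate would no longer be the correct one, which is why the accuracy of the low resolution estimate matters.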
To take the place of these mechanisms we proposed that we simply use a disparity function estimate, obtained from the lower resolution levels by a reconstruction process, to disambiguate competing matches at a given resolution. Two matching techniques for a given resolution level were proposed: the nearest neighbour matching scheme, which took the match closest to the estimated match position to be the correct one, and the more computationally intensive dispscan algorithm, which takes as the correct match that which resulted in the largest correlation with the estimated disparity function in a neighborhood about the match.

We predicted that, in order for these methods to work, using such simple matching methods, the accuracy of the low resolution disparity estimate would need to be quite high. In this light we analyzed the mechanisms which would give rise to errors in the disparity estimate and tried to predict what effect they would have on the matching algorithms. These analyses have shown that the error in the disparity estimate, for white gaussian noise images, increases as the resolution decreases, as the disparity gradient increases, as the feature density (usually a function of resolution) decreases, as the camera signal to noise ratio decreases, and as the disparity range increases. We were not able to fully detail how these errors affected the matching process, due to the mathematical complexity. We were able to show that the nearest neighbor matching scheme can tolerate a certain level of error in the disparity estimate, which depends on the feature density and type, and still yield exact matching. Thus we can conclude that if the disparity estimate is sufficiently accurate at each
This general  statement was  borne out by the experimental results given in chapter 5.  The process  experimental evidence made clear the importance of the reconstruction process; the  of obtaining a complete  samples of the either because  disparity function estimate  at  each  resolution level from the  disparity function at that resolution. If the reconstruction was not done well, the reconstruction method was poor or because the true disparity function was  varying too rapidly, there would be significant amounts of error in the final disparity function estimate.  It was seen  that, since  the  sources  of error  generally show spatial variations,  the  error in the final disparity function were also not uniformly distributed. For example, regions where the disparity function changes rapidly will be prone to reconstruction and filtering error, while relatively smooth regions will be accurately measured.  We neccessarily method  examined a number of techniques for performing the reconstruction process, which involves non-uniformly distributed samples.  proposed  constraint)  by  Grimson  (1981b)  was  based  which were not always valid. This fact,  on  We  pointed  out  assumptions  (his  that  the  surface  relaxation consistency  combined with the shortcomings of other  known methods for reconstructing functions from non-uniformly distributed samples led us to the  development of our own method, which we call the  warping or  transformation method.  This method was seen to perform better than the relaxation method or an averaging method in regards to minimizing the large isolated errors that the matching algorithm produces when it diverges.  We  have also pointed out that the process of reconstructing the disparity function from  its samples also smooths crude  9x9  averaging  the errors  reconstruction  in these samples somewhat technique  worked  well  This  (apart  may  from  explain why the its  propensity  for  localized error). 
The averaging of the disparity values also averaged the disparity errors, which being zero mean and uncorrelated should average out to zero.  300  The key factor in obtaining high accuracy keeping the disparity measurements  from our simplified matching algorithm is in  and the reconstructed  disparity function estimate sufficiently  accurate at all resolutions. However, at low resolutions, which are the most important in terms of  the  convergence  reconstructions is to  make  of  the  algorithm,  obtaining  accurate  disparity  spatial  more  accurate measurements  filtering  functions as  of the  disparity  function at  error  (for  linear  disparities  at  well). The diffrequency measurement  least,  and  mechanism  very  coarse resolutions  (implying very  matching scale space representations any disparity measurements disparity  can  be  easily  may  simple implementation).  We  which are immune to for  other  disparity  be important. as  a vision  of the disparity gradient  The second answer lies in  as a whole. Scale maps can be matched without making  or reconstructions.  measured  lower resolutions.  possibly  module of its own, due to its ability to make accurate measurements at  and  are difficult We have proposed two possible answers to this problem. The first  have shown that this can be done by using diffrequency measurements, the  measurements  at  the  Once the  highest  scale maps  resolution  in  have  the  been matched,  scale  map,  and  the then  reconstructed to give a complete disparity function.  We  have  shown  that  the  simplified  multi-resolution  matching  successfully applied to the industrial task of automated log scaling.  algorithm  can  be  301 VIII -  8.1 -  CONCLUSIONS AND A LOOK TO THE FUTURE  Summary and Conclusions In this thesis  that was have  very  simple. 
This  dispensed  detection  simply  with  their  mechanisms  non-constant use  we have presented  on  a multi-resolution stereo feature matching algorithm  algorithm was computationally  the  grounds  based  on the  intensive  that  Marr-Poggio (1979) algorithm. We  disambiguation and  these  mechanisms  do  in-range/out-of-range  not  work  reliably  disparity functions. To take the place of these mechanisms we proposed that we a  reconstruction  disparity process,  function  estimate,  obtained  from  the  lower  resolution  levels  which took the match closest to the estimated  the  a  largest  correlation  with  scheme,  match position to be the correct one, and the  more computationally intensive dispscan algorithm which takes as the correct match in  by  to disambiguate competing matches at a given resolution. Two matching  techniques for a given resolution level were proposed, the nearest neighbour matching  resulted  for  the  estimated  disparity  function  in  that which  a neighborhood  about the match.  We predicted methods,  the  that, in order  for these methods  accuracy of the low resolution  In this light we analyzed the  mechanisms  to work, using such simple matching  disparity estimate would need to be quite high. which would give rise  to errors in the disparity  estimate and tried to predict what effect they would have on the matching algorithms. These analyses have shown that the error in the disparity estimate, for white gaussian noise images,  increases as the resolution  decreases, as the  disparity gradient  increases,  as the  feature density (usually a function of resolution) decreases, as the camera signal to noise ratio decreases, and as the errors affected  the  disparity range increases.  matching  process,  due to  We were not able to fully detail how these the  show that the nearest neighbor matching scheme disparity  estimate,  which  depends  on  the  matching. Thus we can conclude that if the  mathematical  complexity.  
We  were able to  can tolerate a certain level of error  feature  density  and  type,  and  still  yield  in the exact  disparity estimate is sufficiently accurate at each  302  resolution level (and the estimate the matching algorithm will  can be less exact at  converge  to near  the  lower resolutions than at high) then  correct value. This general statement was  borne out by the experimental results given in chapter 5.  The  experimental evidence made clear the importance of the reconstruction process;  process of obtaining a complete  disparity function estimate  at each  the  resolution level from the  samples of the disparity function at that resolution. If the reconstruction was not done well, either because the reconstruction method was poor or because the true disparity function was varying too rapidly, there would be significant amounts of error in the final disparity function estimate.  It was seen that, since the sources  of error  generally  show spatial variations,  the  error in the final disparity function were also not uniformly distributed. For example, regions where the disparity function changes rapidly will be prone to reconstruction and filtering error, while relatively smooth regions will be accurately measured.  We neccessarily method  examined a number of techniques for performing the reconstruction process, which involves non-uniformly distributed samples.  proposed  constraint)  by  Grimson  (1981b)  was  based  which were not always valid. This fact,  known methods for reconstructing  on  We  pointed  out  assumptions  (his  that  the  surface  relaxation consistency  combined with the shortcomings of other  functions from non-uniformly distributed samples led us to  the development of our own method, which we call the  warping or transformation method.  
This method was seen to perform better than the relaxation method or an averaging method with regard to minimizing the large isolated errors that the matching algorithm produces when it diverges.

We have also pointed out that the process of reconstructing the disparity function from its samples also smooths the errors in these samples somewhat. This may explain why the crude 9x9 averaging reconstruction technique worked well (apart from its propensity for localized error). The averaging of the disparity values also averaged the disparity errors, which, being zero mean and uncorrelated, should average out to zero.

The key factor in obtaining high accuracy from our simplified matching algorithm is in keeping the disparity measurements and the reconstructed disparity function estimate sufficiently accurate at all resolutions. However, at low resolutions, which are the most important in terms of the convergence of the algorithm, accurate disparity measurements and reconstructions are difficult to obtain. We have proposed two possible answers to this problem.

The first answer is to make more accurate measurements of the disparity function at the very coarse resolutions. We have shown that this can be done by using diffrequency measurements, which are immune to spatial filtering error (for linear disparities at least, and possibly for other disparities as well). The diffrequency mechanism may be important as a vision module of its own, due to its ability to make accurate measurements of the disparity gradient at very coarse resolutions (implying very simple implementation). The second answer lies in matching scale space representations as a whole. Scale maps can be matched without making any disparity measurements or reconstructions. Once the scale maps have been matched, the disparity can be easily measured for the highest resolution scale map, and then reconstructed to give a complete disparity function.

We have shown that the simplified multi-resolution matching algorithm can be successfully applied to the industrial task of automated log scaling.

8.2 - Directions for Future Work

In this thesis we have tried to keep our algorithms simple, for the sake of rapid computation. However, increased matching accuracy may be obtained with the use of more complex matching algorithms. For example, the figural continuity constraint suggested by Grimson (1985) and by Ohta and Kanade (1985) may provide improved performance. However, these algorithms must be made more computationally efficient.

Incorporating information from other vision modules, for example shape from shading, will certainly help the convergence of the matching algorithm (see Ikeuchi, 1983). Information obtained as a result of a recognition process can also be used to aid in the matching process. For example, the vision module may decide that it is looking at a box (or a log) and use a 3D model of the object to guide the stereo matching process. Also, more complex reconstruction algorithms may be examined. The current reconstruction techniques have problems with non-bandlimited functions, such as those with discontinuities (which disparity functions from real scenes usually have). Grimson (1981b) and Terzopoulos (1982) have discussed methods for handling the reconstruction of functions with discontinuities. They have apparently not pursued this any further. It may be that what is required is a change in the representation of the disparity data for a scene from a functional form, wherein each point has only one disparity value, to an object based representation, wherein each point of an object has a unique disparity value. Reconstruction would then be performed over a given object, independently of all the other objects. Note that this method requires input from the higher cognitive levels of the vision system.

The scale space methods so briefly described in this thesis need much improvement. In particular, matching of scale maps of non-linear disparity functions should be looked into. Two dimensional scale space matching needs examination, as does the problem of the excessive computation required to calculate the scale maps for an entire two dimensional image. Perhaps approximate methods based on quantizing the scale space or coding the scale space contours could be devised.

From an engineering standpoint the obvious next step is to implement the algorithms described herein in hardware capable of real time operation. This is especially so for the case of the log scaling application described in chapter 5, where processing times on the order of seconds are required, as opposed to the 20 minute span taken by the implementation on a general purpose minicomputer. We have addressed this problem in some detail in (Clark and Lawrence, 1984, and Clark and Lawrence, 1985b).

APPENDIX I - PROOFS OF THEOREMS STATED IN CHAPTER 3

Proof of THEOREM 3.1 :

Proofs of this theorem can be found in many places. For a survey of these see (Jerri, 1977).

Proof of THEOREM 3.2 :

The proof of theorem 3.2 can be found in (Jerri, 1977).

Proof of THEOREM 3.3 :

Since h(τ) is bandlimited to w₀ = π, we can write (from Theorem 3.1):

h(τ) = Σₙ h(n) g(τ − n)    (1)

Now, since a one-to-one continuous mapping γ(t) exists such that n = γ(tₙ) and τ = γ(t), we have that

h(γ(t)) = Σₙ h(γ(tₙ)) g(γ(t) − n)    (2)

Because h(γ(t)) = f(t) we have that

f(t) = Σₙ f(tₙ) g(γ(t) − n)    (3)

Q.E.D.
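The warped-sampling expansion just proved can be exercised numerically. The following sketch is our own illustration, not part of the thesis: we take w₀ = π, so that g is the sinc kernel, choose an arbitrary one-to-one mapping γ(t) = t + 0.3 sin t, sample f(t) = h(γ(t)) at the non-uniform points tₙ = γ⁻¹(n), and recover f through equation (3):

```python
import numpy as np

# one-to-one mapping: gamma'(t) = 1 + 0.3 cos(t) > 0 for all t
gamma = lambda t: t + 0.3 * np.sin(t)
dgamma = lambda t: 1.0 + 0.3 * np.cos(t)

def gamma_inv(n, iters=50):
    """Invert gamma by Newton's method, giving the sample point t_n with gamma(t_n) = n."""
    t = float(n)
    for _ in range(iters):
        t -= (gamma(t) - n) / dgamma(t)
    return t

# h is bandlimited to w0 = pi (np.sinc(x) = sin(pi x)/(pi x)); f is its warped version
h = lambda tau: np.sinc(tau - 0.3)
f = lambda t: h(gamma(t))

ns = np.arange(-50, 51)                      # truncation of the infinite sum
t_n = np.array([gamma_inv(n) for n in ns])   # non-uniformly spaced sample points

def reconstruct(t):
    # equation (3): f(t) = sum_n f(t_n) g(gamma(t) - n), with g the sinc kernel
    return float(np.sum(f(t_n) * np.sinc(gamma(t) - ns)))

err = abs(reconstruct(1.234) - f(1.234))     # small; only sinc-tail truncation error remains
```

The residual error comes entirely from truncating the sum to 101 terms; at the sample points themselves the expansion is exact, since the sinc kernel vanishes at the nonzero integers.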
Proof of THEOREM 3.4 :

For a proof of this theorem see Petersen and Middleton, 1962.

Proof of THEOREM 3.5 :

From Theorem 3.4 we have that:

h(r) = Σₛ h(rₛ) g(r − rₛ)    (4)

Now, since γ(xₛ) = rₛ and γ(x) = r, we have that:

h(γ(x)) = Σₛ h(γ(xₛ)) g(γ(x) − γ(xₛ))    (5)

The condition that h(γ(x)) = f(x), along with the condition that the Jacobian of the transformation γ be non-zero everywhere, gives then that:

f(x) = Σₛ f(xₛ) g(γ(x) − γ(xₛ))    (6)

Q.E.D.

Proof of THEOREM 3.6 :

A mapping γ(x) is one-to-one and continuous if the determinant of its Jacobian matrix, |∂γ(x)/∂x|, is non-zero everywhere. At the interior points of Pₓ (those points that are not vertex or Link points of the partition) we have γ as defined in equations (27), (28) and (29). We can rewrite (27) as follows:

γ(x) = A⁻¹[Γᵀx + C]    (7)

where

Γ = | (x₂² − x₃²)   (x₃² − x₁²)   (x₁² − x₂²) |
    | (x₃¹ − x₂¹)   (x₁¹ − x₃¹)   (x₂¹ − x₁¹) |    (8)

and the superscripts denote the coordinates of the points. The vector C is of no consequence in this proof, as it does not appear in the expression for the Jacobian. A is as given in equation (29). After some algebraic manipulation we get that the Jacobian determinant is:

J = αβ/Δ²    (9)

where

α = x₁¹(x₃² − x₂²) + x₂¹(x₁² − x₃²) + x₃¹(x₂² − x₁²)    (10)

β = r₁¹(r₃² − r₂²) + r₂¹(r₁² − r₃²) + r₃¹(r₂² − r₁²)    (11)

It can be shown that α = 0 iff the points x₁, x₂, x₃ are collinear, and that β = 0 iff the points r₁, r₂, r₃ are collinear. Since the points x₁, x₂, x₃ are not collinear, α ≠ 0, and since the points r₁, r₂, r₃ are not collinear, β ≠ 0. Thus the Jacobian at the interior points of Pₓ will be nonzero when the points in the set V(x) are not collinear.

In order that γ be one-to-one and continuous at the Links of Pₓ we must ensure that the values of the Jacobians on either side of a Link of Pₓ have the same sign.

Consider figures 1.1 and 1.2, which illustrate the mapping of points in x space to the hexagonal lattice in r space. We have mapped the points x₁, x₂, x₃ in x space to the points r₁, r₂, r₃ of the hexagonal lattice in r space, creating the partition regions P₁ and D₁. Let us assume that α > 0 and that β > 0 for this mapping. We now wish to map a fourth sample point in x space, x₄, to the point r₄ of the hexagonal lattice in r space, to create the partition regions P₂ and D₂ that share a common Link with P₁ and D₁. Imagine that r₄ was not constrained to lie on a vertex of the hexagonal sampling lattice, but could lie anywhere in r space. It can be seen that if r₄ was to lie anywhere on the line through r₁ and r₂, β would be zero (as the points r₁, r₂ and r₄ would then be collinear). Furthermore, only along this line can β be zero. Hence if r₄ lies on the same side of the line through r₁ and r₂ as does r₃, then β is positive, and if it lies on the opposite side then β is negative. Now, since we have the constraint that r₄ must lie on the hexagonal sampling lattice, it can be seen that β for the region formed by r₁, r₂ and r₄ is negative.

FIGURE 1.1  A portion of the partition P in x space.

FIGURE 1.2  A portion of the partition D in r space.

Now, for the Jacobian of γ to have the same sign in P₂ as in P₁, α₂ must also be negative. Hence x₄ must lie on the side of the line through x₁ and x₂ opposite to x₃. In other words, the region formed by points x₁, x₂ and x₄ must not overlap the region formed by points x₁, x₂ and x₃. That is, Pₓ must be a tessellation for the mapping function γ to be one-to-one and continuous.

Q.E.D.

Proof of THEOREM 3.7 :

The proof of this theorem can be found in (Petersen and Middleton, 1964).

Proof of THEOREM 3.8 :

The proof of this theorem follows from the proof of theorem 3.7.
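The collinearity and orientation tests used in the proof of Theorem 3.6 are straightforward to state computationally. The following sketch is our own illustration (the names and the sign convention are ours): it evaluates the twice-signed-area quantities and checks that the affine pieces of γ on a pair of corresponding triangles agree in Jacobian sign:

```python
def signed_area2(p1, p2, p3):
    """Twice the signed area of the triangle p1 p2 p3 (the collinearity
    quantity alpha or beta of the proof, up to sign convention); it is
    zero iff the three points are collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)

def same_orientation(tri_x, tri_r):
    """True when the affine piece of gamma taking triangle tri_x onto
    triangle tri_r has positive Jacobian sign (sign of alpha*beta)."""
    a = signed_area2(*tri_x)
    b = signed_area2(*tri_r)
    if a == 0 or b == 0:
        raise ValueError("degenerate (collinear) triangle")
    return a * b > 0
```

Checking every pair of corresponding triangles in this way amounts to verifying the tessellation condition of the theorem: no fold-over of the mapping is permitted.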
APPENDIX II - A METRICALLY ORDERED SEARCH ALGORITHM

This appendix describes the nearest neighbour search algorithm that is used to perform searches in the transformation reconstruction algorithm described in chapter 3. This algorithm is a modification of the one in (Hall, 1982), and performs a diamond search over a rectangular region (whereas Hall's algorithm performs a rectangular search). This algorithm can be extended to search over arbitrary convex regions by modifying the test for reaching the boundaries. The improvement over the algorithm given in the above paper is that the search is metrically ordered. That is, no point is searched before a point that is closer (using a Euclidean distance metric) to the start point.

The search is performed along the path shown in figure 2.1. Note how the search path changes in response to reaching a boundary of the search region. Most of the complexity in the algorithm is due to the handling of these boundary conditions. The edges of the search path are divided into five different types. These types, numbered 1, 2, 3, 4 and 5, are defined as follows:

Edge Type 1: Edges along the lower right side of the diamond.
Edge Type 2: Edges along the upper right side of the diamond.
Edge Type 3: Edges along the upper left side of the diamond.
Edge Type 4: Edges along the lower left side of the diamond.
Edge Type 5: The short edge causing an offset in the search path between edge types 4 and 1.

Examples of these edge types are shown in figure 2.1.

The edges, along which the search takes place, can be in one of five modes. These modes describe the relation of the edge with the boundaries of the search region. The five modes are summarized as follows:

Mode 1 = I : Mode I represents the case wherein an edge lies completely within the search region. That is, no part of the edge lies in the boundary.
FIGURE 2.1  The diamond search path and the five edge types.

Mode 2 = IO : Mode IO represents the case wherein the initial portion of the edge lies within the search region and the rest of the edge lies beyond the boundary.
Mode 3 = OI : Mode OI represents the case wherein the initial portion of the edge lies outside the boundary and the rest of the edge lies within the search region.
Mode 4 = OIO : Mode OIO represents the case wherein the initial part of the edge lies outside the boundary, the middle part lies within the search region, and the final part of the edge lies outside the search region.
Mode 5 = B : Mode B represents the case wherein the entire edge lies outside of the search region.

Examples of edges of each of these modes are given in figure 2.2.

FIGURE 2.2  Examples of edges in the five edge modes.

The operation of the algorithm is fairly simple. The search merely cycles from edge 1 to edge 2 to edge 3 to edge 4 to edge 5 to edge 1, etc. After each cycle the length of the edges is increased by one pixel. When the search along any one of the edges encounters a boundary, the mode of that edge, and of the next edge, is altered to indicate this fact. The search point (X,Y) and the endpoint of the previous edge (E(i-1)) are also altered when the search along edge i reaches a boundary. The exception is edge type 5, whose parameters (length, mode) are never changed. In this way the search pattern shown in figure 2.1 is obtained.
A pseudo-high-level language description of this algorithm is given below.

begin
  ** X,Y is the starting point **
  ** d is the search step size **
  E1 <- X + d;
  E2 <- Y - d;  ** Initialize the edge directions **
  E3 <- X - d;
  E4 <- Y + d;
  for i <- 1 until 4 do m(i) <- I;  ** Initialize the edge modes to I **
  TEST(X,Y)
L:
  SEARCH EDGE 2
  SEARCH EDGE 3  ** Search along the edge directions **
  SEARCH EDGE 4
  SEARCH EDGE 5
  SEARCH EDGE 1
  go to L
end

procedure DIAGONAL SEGMENT SEARCH(dx,dy,X1,X2,Y1)
begin
  N <- floor(|(X2-X1)/dx|);
  if N = 0 then return
  for i <- 1 until N do
  begin
    X1 <- X1 + dx;  ** Increment the search position **
    Y1 <- Y1 + dy;
    TEST(X1,Y1)  ** Test for the quantity being searched for **
  end
end

procedure SEARCH EDGE 1
begin
  dx <- 1;  ** Initialize the direction of search **
  dy <- -1;

Step 1: [Mode I edge generation]
  if m(1) = I then
  begin
    if E1 > XRIGHT then go to HITBOUND1;
    DIAGONAL SEGMENT SEARCH(dx,dy,X,E1,Y);
    E4 <- E4 + d;
    return
  end
HITBOUND1:
  begin
    if E1 > 2*XRIGHT - XSTART then go to ENTERB;
    m(1) <- IO;
    Let m(2) be adjusted to reflect that the initial segment of edge 2 is now outside of the search region.;
    DIAGONAL SEGMENT SEARCH(dx,dy,X,XRIGHT,Y);
    E4 <- E4 + d;
    return
  end

Step 2: [Mode IO edge generation]
  if m(1) = IO then
  begin
    if E1 > 2*XRIGHT - XSTART then go to ENTERB;
    DIAGONAL SEGMENT SEARCH(dx,dy,X,XRIGHT,Y);
    E4 <- E4 + d;
    X <- E1;
    Y <- YSTART;
    return
  end

Step 3: [Mode OI edge generation]
  if m(1) = OI then
  begin
    if E1 > 2*XRIGHT - XSTART then go to ENTERB;
    Y <- YTOP;
    X <- E1 - Y + YSTART;
    if E1 > XRIGHT then go to HITBOUND2;
    DIAGONAL SEGMENT SEARCH(dx,dy,X,E1,Y);
    E4 <- E4 + d;
    return
  end
HITBOUND2:
  begin
    m(1) <- OIO;
    Let m(2) be adjusted to reflect that the initial segment of edge 2 is now outside of the search region.;
    DIAGONAL SEGMENT SEARCH(dx,dy,X,XRIGHT,Y);
    X <- E1;
    Y <- YSTART;
    E4 <- E4 + d;
    return
  end

Step 4: [Mode OIO edge generation]
  if m(1) = OIO then
  begin
    if E1 > 2*XRIGHT - XSTART then go to ENTERB;
    Y <- YTOP;
    X <- E1 - Y + YSTART;
    DIAGONAL SEGMENT SEARCH(dx,dy,X,XRIGHT,Y);
    X <- E1;
    Y <- YSTART;
    E4 <- E4 + d;
    return
  end

Step 5: [Mode B generation]
  if m(1) = B then
  begin
    E4 <- E4 + d;
    X <- E1;
    Y <- YSTART;
    return
  end

Step 6: [Outer right boundary first reached]
ENTERB:
  begin
    m(1) <- B;
    Let m(2) be adjusted to reflect that the initial segment of edge 2 is now outside of the search region.;
    if all m(i) = B then return;
    else
    E4 <- E4 + d;
    X <- E1;
    Y <- YSTART;
    return
  end
end

procedure SEARCH EDGE 5
begin
  X <- X + 1;
  if X > XRIGHT then m(1) <- B;
  else TEST(X,Y)
  return
end

The procedures SEARCH EDGE 2, 3, and 4 are not written down here, as a space saving measure. The form of these procedures is the same as for procedure SEARCH EDGE 1. Details such as the sign of dx and dy, and the detection and handling of the boundary condition, are different.

APPENDIX III - DERIVATION OF COVARIANCE FUNCTIONS

In this appendix we derive the expressions for the covariance functions and their derivatives that are required in the body of the report.

One dimensional case.

We will first determine the autocovariance of the slice along σ = σ₀ of the one dimensional scale space transform.
This slice is obtained by filtering a one dimensional white Gaussian signal, having power spectral density σ_f², with a filter having the following frequency response:

H(ω) = −σ₀³ω² e^(−ω²σ₀²/2)    (1)

The power spectrum of the filtered signal is given by S(ω) = σ_f²|H(ω)|², giving:

S(ω) = σ_f²σ₀⁶ω⁴ e^(−ω²σ₀²)    (2)

The autocovariance of the filtered function is simply the Fourier transform of S(ω), so that we get:

ψ(τ) = .25σ_f²σ₀√π [.25(τ/σ₀)⁴ − 3(τ/σ₀)² + 3] e^(−.25(τ/σ₀)²)    (3)

If one performs a McLaurin series expansion of ψ(τ) it can be seen that:

ψ⁽ⁿ⁾(0) = σ_f²(−1)^(n/2) √π (n+4)! / [(n/2+2)! 2^(n+4) σ₀^(n−1)]    (4)

for n even, and is identically zero for n odd.

Two dimensional case.

We will now derive the expressions for the autocovariances ψ₁ and ψ₂ of H₁ and H₂, along with the cross-covariance ψ₁₂ of H₁ and H₂. Let S₁(ω) and S₂(ω) be the power spectral densities of h₁ and h₂ respectively. Let us define P₁(ω) and P₂(ω) to be the power spectral densities of H₁ and H₂ respectively. These are related to S₁ and S₂ as follows:

P₁(ω) = |σ³ω² e^(−ω²e²σ²/2)|² S₁(ω)    (5)

P₂(ω) = |σ³ω² e^(−ω²e²σ²/2)|² S₂(ω)    (6)

where σ³ω² e^(−ω²e²σ²/2) is, up to sign, the Fourier transform of the filter function (σ²/(e√(2π))) d²/dx² e^(−x²/2e²σ²).

The autocovariance functions ψ₁(τ) and ψ₂(τ) are the Fourier transforms of P₁(ω) and P₂(ω) respectively. In order to determine these functions we must find expressions for S₁ and S₂.

Let us define the following two dimensional functions:

g₁(x,y) = f(x,y) * σδ(x)√(2π) e^(−y²/2σ²)    (7)

g₂(x,y) = ∂²/∂y² [f(x,y) * (σδ(x)/√(2π)) e^(−y²/2σ²)]    (8)

where (*) indicates the convolution operator. It can be seen that:

g₁(x,y₀) = h₁(x)    (9)

g₂(x,y₀) = h₂(x)    (10)

That is, h₁ and h₂ are slices of the two dimensional functions g₁ and g₂.
We can use the Mersereau and Oppenheim (1974) slice projection theorem to find the Fourier transforms of h₁ and h₂, from which we can then get S₁ and S₂. Using the slice projection theorem we can write:

𝓕{h₁(x)} = (σ√(2π)/2π) ∫ 𝓕{f}(ω,ω₂) e^(−ω₂²σ²/2) dω₂    (11)

𝓕{h₂(x)} = −(σ/(2π√(2π))) ∫ 𝓕{f}(ω,ω₂) ω₂² e^(−ω₂²σ²/2) dω₂    (12)

Hence we have that:

S₁(ω) = σ_f²σ²2π    (13)

and

S₂(ω) = σ_f²2π/σ²    (14)

Therefore we obtain:

P₁(ω) = σ_f²2πσ⁶ω⁴ e^(−ω²e²σ²)    (15)

P₂(ω) = σ_f²2πσ²ω⁴ e^(−ω²e²σ²)    (16)

Evaluating the inverse Fourier transforms and setting e = 1 yields:

ψ₁(τ) = .25σ_f²σ√π [.25(τ/σ)⁴ − 3(τ/σ)² + 3] e^(−.25(τ/σ)²)    (17)

and ψ₂(τ) = σ⁻⁴ψ₁(τ). As before, it can be shown that the values of the derivatives of these functions at zero are given by:

ψ₁⁽ⁿ⁾(0) = (−1)^(n/2) σ_f²√π (n+4)! / [(n/2+2)! 2^(n+4) σ^(n−1)]    (18)

for n even, and are zero for n odd.

The cross-covariance function ψ₁₂ of H₁ and H₂ can be shown to be the geometric mean of ψ₁ and ψ₂. Thus, we have:

ψ₁₂(τ) = .25σ_f²(√π/σ) [.25(τ/σ)⁴ − 3(τ/σ)² + 3] e^(−.25(τ/σ)²)    (19)

and

ψ₁₂⁽ⁿ⁾(0) = (−1)^(n/2) σ_f²√π (n+4)! / [(n/2+2)! 2^(n+4) σ^(n+1)]    (20)

for n even, and are equal to zero for n odd.

One dimensional slice of a two dimensional function

Consider the 2D filter with the following frequency response:

H(ω₁,ω₂) = σ⁴(ω₁² + ω₂²) e^(−(ω₁²+ω₂²)σ²/2)    (21)

By the slice-projection theorem (Mersereau and Oppenheim, 1974) a slice p(x) of h(x,y) has the following Fourier transform:

P(ω) = σ⁴ ∫ (ω² + ω₂²) e^(−(ω²+ω₂²)σ²/2) dω₂    (22)

     = σ√(2π) [ω²σ² + 1] e^(−ω²σ²/2)    (23)

This result, apart from a scale factor of 2πσ², was derived by Grimson (1981b).
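The closed form (23) can be checked against the projection integral (22) by direct quadrature. The sketch below is our own check, not part of the thesis; the value σ = 1.3 and all names are arbitrary choices:

```python
import numpy as np

sigma = 1.3                                  # arbitrary scale for the check

# the 2D filter of equation (21), as a function of the two frequency variables
H = lambda w1, w2: sigma**4 * (w1**2 + w2**2) * np.exp(-(w1**2 + w2**2) * sigma**2 / 2)

def P_projected(w, lim=30.0, n=60001):
    """Quadrature of the slice-projection integral over w2 (equation (22))."""
    w2, dw = np.linspace(-lim, lim, n, retstep=True)
    return float(np.sum(H(w, w2)) * dw)

# the closed form of equation (23)
P_closed = lambda w: sigma * np.sqrt(2 * np.pi) * (w**2 * sigma**2 + 1) * np.exp(-w**2 * sigma**2 / 2)

err = abs(P_projected(0.7) - P_closed(0.7))  # agreement to quadrature accuracy
```

The quadrature limits are generous: the integrand is negligible beyond a few multiples of 1/σ, so the simple Riemann sum is accurate to well below the tolerance of interest.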
The power spectrum of p(x) can now be determined and is given by:

P₂(ω) = 2πσ²[1 + 2σ²ω² + σ⁴ω⁴] e^(−ω²σ²)    (24)

Taking the inverse Fourier transform of the power spectrum yields the covariance function of the signal obtained by passing white noise, with unit variance, through the filter:

ψ(τ) = (σ√π/4)[11 − 5(τ/σ)² + .25(τ/σ)⁴] e^(−.25(τ/σ)²)    (25)

APPENDIX IV - DERIVATION OF THE DEPTH FROM DISPARITY EQUATION

In this appendix we derive the relationship between the three dimensional position of a physical scene point being imaged, and the two dimensional positions of the images of the scene point in the image planes of the two cameras. Figure 4.35 depicts the geometry of the imaging situation. From this diagram we can see that:

tan(2β + θ₁) = (a + x₁/f)/(1 − a x₁/f)    (26)

where

tan(2β) = a    (27)

x₁ is the horizontal position of the imaged point in the image plane of camera 1, and f is the focal length of camera 1. Similarly we have that:

tan(−θ₂) = −x₂/f    (28)

It can be seen that:

X = Z x₂/f  and  Y = Z y₂/f    (29)

where (X,Y,Z) is the three dimensional position of the scene point being imaged. Thus the X and Y coordinates of the scene point are functions of the depth Z, the camera focal length, and the image plane coordinates of the scene point in camera 2. From figure 4.35 we can see that:

d_x = Z tan(−θ₂) + (Z − d_z) tan(2β + θ₁)    (30)

or, since d_z = a d_x:

Z = d_x [1 + a tan(2β + θ₁)] / [tan(−θ₂) + tan(2β + θ₁)]    (31)

Hence, substituting in the expressions for the tangents we have, after some algebra:

Z = f²(1 + a²) d_x / [fD + a(f² + x₁x₂)]    (32)

where D = x₁ − x₂ is the disparity.

APPENDIX V - THE CONDITIONS FOR WELL-BEHAVED SLICES OF 2D SCALE MAPS

In this appendix we derive the conditions on a two dimensional function which ensure that a given one dimensional slice of its scale map is itself a well behaved (in the sense of Yuille and Poggio, 1983a) scale map.

A well behaved scale map comes from a scale space transform of the form:

F(x,σ) = {L ∫ f(u) e^(−(x−u)²/2σ²) du}    (33)

where L is some linear differential operator in x. Now, let F*(x,σ) be a slice of the 2-D scale space transform defined by equation (4.3.57) (with a skew factor of 1) as follows:

F*(x,σ) = {∇² ∫∫ f*(u,v) (σ√(2π)) e^(−[(x−u)² + (y−v)²]/2σ²) du dv}|_(y=y₀)    (34)

We can rewrite this as:

F*(x,σ) = ∫ [g(u,x) + h(u)] e^(−(x−u)²/2σ²) du    (35)

where

g(u,x) = ∫ f*(u,v)/(2π) [(x−u)²/σ² − 1] e^(−(y₀−v)²/2σ²) dv    (36)

and

h(u) = ∫ f*(u,v)/(2π) [(y₀−v)²/σ² − 1] e^(−(y₀−v)²/2σ²) dv    (37)

We can rewrite the scale space transform of equation (33) as follows:
From equations (4) and (5) we can see that p can only be  the following:  p*(x-u) = k[(x-u) /o -l] 2  (42)  2  where k is some constant This condition means that:  /ro/(u,v)e-(y°- ) v  2 / 2 a 2  dv  This equation can only be satisfied  f(u,v) =  =  k;" f*(u,v)[(y -v) /a -l]e(y°- ) c  if f(u,v) is separable,  2  2  v  2 / 2 a 2  dv  (43)  that is if:  f,(u)f (v)  (44)  2  Thus we conclude that a one dimensional well behaved  o o  slice of a two dimensional  scale map is itself a  scale map if and only if the two dimensional function which produced the two  dimensional scale map is separable with respect to the axis along which the slice is made.  330  APPENDIX VI - DERIVATION OF THE DIFFREQUENCY QUANTIZATION ERROR The  disparity  gradient  0 i is a  function  of the scales  of corresponding  scale map  contours, as detailed in chapter 7. If the scale in the left image is 0 , and that in the right image is o  then the disparity gradient is given by:  2  = (a./o,-].)  0,  Let /3 = exacdy  , l + /3j  =  o /o . 2  l  (45)  The values of the scales in our experiments  are not determined  but are quantized. This means that the computed value of (3 is also quantized, and is  not exact disparity  In this appendix we derive the probability density gradient  produced  by this  quantization,  and also  function of the error compute  the  variance  in the of this  quantization error. The quantization error is given by:  (a + e )/(ai + e,) -  e =  =  2  (46)  o /Oi  2  2  ( e i O ^ d O i V t o i C a i + ei)]  where ei and e  2  (47)  are the quantization errors of the measured values of Oi and o . 2  We assume  that e  s  and e  are uniformly  2  distributed and independent of each  other.  Thus their joint probability density can be written as:  P  P  and Ox  (ei,e ) =  P  ClC  2  l/[(a -bi)(a -b )] 1  2  2  for ai<e!>bi  is zero otherwise. The values of ai, a , bi, and b 2  and o . 
experiments:

a₁ = 2^(k(N₁+1/2)) − 2^(kN₁)    (49)

b₁ = 2^(k(N₁−1/2)) − 2^(kN₁)    (50)

a₂ = 2^(k(N₂+1/2)) − 2^(kN₂)    (51)

b₂ = 2^(k(N₂−1/2)) − 2^(kN₂)    (52)

where k = 6.65/255, 2^(kN₁) is the quantized value of σ₁, and 2^(kN₂) is the quantized value of σ₂, for N₁ and N₂ integers.

Using the laws of transformation of variables for probability density functions we obtain:

p_e(e) = ∫ p(e₁, e₂(e,e₁)) |∂(e₁,e₂)/∂(e₁,e)| de₁    (53)

Using equation (48) we can see that this expression can be rewritten as:

p_e(e) = ∫ from l₁ to l₂ of (σ₁ + e₁)/[(a₁ − b₁)(a₂ − b₂)] de₁    (54)

where

l₁ = max{ b₁, (b₂σ₁ − eσ₁²)/(σ₂ + eσ₁) }    (55)

and

l₂ = min{ a₁, (a₂σ₁ − eσ₁²)/(σ₂ + eσ₁) }    (56)

and l₁ ≤ l₂. If l₁ > l₂ then p_e(e) = 0. Thus we get after integration:

p_e(e) = (l₂ − l₁)(σ₁ + (l₁ + l₂)/2)/[(a₁ − b₁)(a₂ − b₂)]    (57)

for l₁ < l₂, and p_e(e) is zero for l₁ > l₂. Let us define L₁ and L₂ to be the values of e for which l₁ and l₂ switch between their two possible forms (equations (55) and (56)). These values are seen to be:

L₁ = (a₂σ₁ − a₁σ₂)/(a₁σ₁ + σ₁²)    (58)

L₂ = (b₂σ₁ − b₁σ₂)/(b₁σ₁ + σ₁²)    (59)

It can be seen that, since b₁ is always less than a₁, the only cases for which l₁ > l₂ (and p_e is zero) are when:

a₁ < (b₂σ₁ − eσ₁²)/(σ₂ + eσ₁)    (60)

or when:

b₁ > (a₂σ₁ − eσ₁²)/(σ₂ + eσ₁)    (61)

Let us define L₃ and L₄ to be the values of e for which l₁ = l₂. It can be shown that these values are:

L₃ = (b₂σ₁ − a₁σ₂)/(a₁σ₁ + σ₁²)    (62)

L₄ = (a₂σ₁ − b₁σ₂)/(b₁σ₁ + σ₁²)    (63)

We can now obtain an expression for p_e by substituting the proper values of l₁ and l₂ into equation (57) according to the value of e. We will not write down the resulting expression here, as it is very tedious and does not contribute much more than equation (57). The variance of the diffrequency quantization error can be computed as follows:

σ_q² = ∫ e² p_e(e) de    (64)

A closed form expression for this integral can be obtained, but is not given here due to its length. The standard deviation (square root of the variance) is plotted in figure 7.13 in chapter 7 as a function of σ₂ and β₁ (where we have set σ₁ = σ₂/(1 + β₁)).

References

1) Abramowitz, M. and Stegun, I.A. 1965, "Handbook of Mathematical Functions", Dover, New York

2) Ahuja, N. and Schacter, B. 1983, "Pattern Models", John Wiley and Sons

3) Baker, H.H., and Binford, T.O. 1981, "Depth from edge and intensity based stereo.", Proc. 7th Int. Joint Conf. Art. Intell., Vancouver, B.C.

4) Barlow, H.B., Blakemore, C. and Pettigrew, J.D. 1967, "The neural mechanism of binocular depth discrimination.", Journal of Physiology, London, Vol. 193, pp 327-342

5) Beutler, F.J. 1966, "Error free recovery of signals from irregularly spaced samples.", SIAM Review, Vol. 8, No. 3, pp 328-335

6) Blakemore, C. 1970, "A new kind of stereoscopic vision.", Vision Research, Vol. 10, pp 1181-1199

7) Burt, P., and Julesz, B. 1980, "A disparity gradient limit for binocular fusion.", Science, Vol. 181, pp 276-278

8) Clark, J.J., and Lawrence, P.D. 1984, "A hierarchical image analysis system based upon oriented zero crossings of bandpassed images.", in Multiresolution Image Processing and Analysis, Rosenfeld, A. (ed.), pp 148-168, Springer-Verlag, Berlin

9) Clark, J.J., Palmer, M.R. and Lawrence, P.D.
1985a, "A transformation method for the reconstruction of functions from non-uniformly spaced samples.", Accepted for publication, IEEE Transactions on Acoustics, Speech and Signal Processing

10) Clark, J.J., and Lawrence, P.D. 1985b, "A systolic parallel processor for the rapid computation of multi-resolution edge images using the ∇²G operator.", Accepted for publication, Journal of Parallel and Distributed Computing

11) Clark, J.J., and Lawrence, P.D. 1985c, "A theoretical basis for diffrequency stereo.", Submitted for publication

12) Crowley, J.L. 1984, "A multiresolution representation for shape.", in Multiresolution Image Processing and Analysis, Rosenfeld, A. (ed.), pp 169-189, Springer-Verlag, Berlin

13) Crowley, J.L. and Parker, A.C. 1984, "A representation for shape based on peaks and ridges in the difference of low-pass transform.", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 6, No. 2, pp 156-169

14) Crowley, J.L. and Stern, R.M. 1984, "Fast computation of the difference of low-pass transform.", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 6, No. 2, pp 212-222

15) Demaerschalk, J.P., Cottell, P.L., and Zobeiry, M. 1980, "Photographs improve statistical efficiency of truckload scaling.", Vol. 10, No. 3, pp 269-277

16) Dev, P. 1975, "Perception of depth surfaces in random-dot stereograms: A neural model.", International Journal of Man-Machine Studies, Vol. 7, pp 511-528

17) Dilworth, J.R. 1975, "Log Scaling and Timber Cruising", OSU Book Stores, Corvallis, Oregon

18) Fiorentini, A. and Maffei, L. 1971, "Binocular depth perception without geometrical cues.", Vision Research, Vol. 11, pp 1299-1311

19) Freeman, H. 1974, "Computer processing of line-drawing images.", Computer Surveys, Vol. 6, pp 57-97

20) Frisby, J.P. and Mayhew, J.E.W.
1980, "Spatial frequency tuned channels: implications for structure and function from psychophysical and computational studies of stereopsis.", Philosophical Transactions of the Royal Society of London B, Vol. 290, pp 95-116

21) Frisby, J.P. and Mayhew, J.E.W. 1981, "Psychophysical and computational studies towards a theory of human stereopsis.", Artificial Intelligence, Vol. 17, pp 349-385

22) Ghosh, S.K. 1979, "Analytical Photogrammetry", Pergamon Press, New York

23) Grimson, W.E.L. 1981a, "A computer implementation of a theory of human stereo vision.", Philosophical Transactions of the Royal Society of London B, Vol. 292, pp 217-253

24) Grimson, W.E.L. 1981b, "From Images to Surfaces: A Computational Study of the Human Early Visual System", MIT Press, Cambridge, Mass.

25) Grimson, W.E.L. 1982, "A computational theory of visual surface interpolation.", Philosophical Transactions of the Royal Society of London B, Vol. 298, pp 395-427

26) Grimson, W.E.L. 1984, "Binocular shading and visual surface reconstruction.", Computer Vision, Graphics and Image Processing, Vol. 28, No. 1, pp 19-43

27) Grimson, W.E.L. 1985, "Computational experiments with a feature based stereo algorithm.", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 7, No. 1, pp 17-34

28) Hall, R.W. 1982, "Efficient spiral search in bounded spaces.", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 4, No. 2, pp 208-214

29) Hand, D.E. 1975, "Scanners can be simple.", in Modern Sawmill Techniques, White, V. (ed.), Vol. 6, pp 187-196, Miller-Freeman, San Francisco

30) Higgins, J.R. 1976, "A sampling theorem for irregularly spaced sample points.", IEEE Transactions on Information Theory, September 1976

31) Horiuchi, K. 1968, "Sampling principle for continuous signals with time-varying bands.", Information and Control, Vol. 13, pp 53-61

32) Ikeuchi, K.
1983,
"Constructing a depth map from images.",
MIT AI Memo 744, Mass. Inst. Tech., Cambridge, Mass.

33) Jarvis, R.A. 1983,
"A perspective on range finding techniques for computer vision.",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 5, No. 2, pp 122-139

34) Jerri, A.J. 1977,
"The Shannon sampling theorem - Its various extensions and applications: A tutorial review.",
Proceedings of the IEEE, Vol. 65, No. 11, pp 1565-1596

35) Julesz, B. 1971,
"The Foundations of Cyclopean Perception",
University of Chicago Press, Chicago

36) Kotel'nikov, V.A. 1933,
"On the transmission capacity of 'ether' and wire in electrocommunications.",
Izd. Red. Upr. Svyazi RKKA (Moscow)

37) Kramer, H.P. 1959,
"A generalized sampling theorem.",
Journal of Mathematical Physics, Vol. 38, pp 68-72

38) Levine, M.D. 1978,
"A knowledge based computer vision system.",
in Computer Vision Systems, Hanson, A. and Riseman, E. (eds.), pp 335-351, Academic Press

39) Levine, M.D., O'Handley, D.A., and Yagi, G.M. 1973,
"Computer determination of depth maps.",
Computer Graphics and Image Processing, Vol. 2, pp 131-150

40) Levinson, N. 1940,
"Gap and Density Theorems",
American Mathematical Society Colloquium Publications, Vol. 26, American Mathematical Society, New York

41) Longuet-Higgins, M.S. 1962,
"The distribution of intervals between zeroes of a stationary random function.",
Philosophical Transactions of the Royal Society of London A, Vol. 254, pp 557-599

42) Lowry, A. 1984,
M.S. Thesis, Carnegie-Mellon University, Pittsburgh, PA

43) Lu, S.Y. 1984,
"A tree matching algorithm based on node splitting and merging.",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 6, No. 2, pp 249-256

44) Lunscher, W.H.H.J. 1983,
"A digital image preprocessor for optical character recognition.",
MASc thesis, Dept. of Electrical Engineering, University of British Columbia, Vancouver
45) Lyness, J.N.,
"SQUANK (Simpson Quadrature Used Adaptively - Noise Killed)",
CACM Algorithm No. 379

46) Mackworth, A.K. and Mokhtarian, F. 1984,
"Scale based description of planar curves.",
Dept. of Computer Science Technical Report 84-1, University of British Columbia; also in Proceedings of the Fifth National Conference of the Canadian Society for Computational Studies of Intelligence, London, Ont., 1984

47) Maffei, L. and Fiorentini, A. 1977,
"Spatial frequency rows in the striate visual cortex.",
Vision Research, Vol. 17, pp 257-264

48) Marr, D. 1974,
"A note on the computation of binocular disparity in a symbolic, low level processor.",
MIT AI memo no. 327, Mass. Institute of Tech., Cambridge, MA.

49) Marr, D. 1982,
"Vision: A Computational Investigation into the Human Representation and Processing of Visual Information",
W.H. Freeman, San Francisco

50) Marr, D. and Hildreth, E. 1980,
"Theory of edge detection.",
Proceedings of the Royal Society of London B, Vol. 207, pp 187-217

51) Marr, D., Palm, G., and Poggio, T. 1978,
"Analysis of a cooperative stereo algorithm.",
Biological Cybernetics, Vol. 28, pp 223-239

52) Marr, D. and Poggio, T. 1976,
"Cooperative computation of stereo disparity.",
Science, Vol. 194, pp 283-287

53) Marr, D. and Poggio, T. 1979,
"A computational theory of human stereo vision.",
Proceedings of the Royal Society of London B, Vol. 204, pp 301-328

54) Marr, D. and Ullman, S. 1981,
"Directional selectivity and its use in early visual processing.",
Proceedings of the Royal Society of London B, Vol. 211, pp 151-180

55) Marvasti, F. 1973,
"Transmission and Reconstruction of Signals using Functionally Related Zero-Crossings.",
PhD Thesis, Rensselaer Polytechnic Institute, Troy, New York

56) Marvasti, F. 1984,
"Spectrum of non-uniform samples.",
Electronics Letters, Vol. 20, No. 21, p 896

57) McClellan, J.H.
1973,
"The design of two-dimensional digital filters by transformations.",
Proceedings of the 7th Annual Princeton Conference on Information Sciences and Systems

58) McClellan, J.H., Parks, T.W. and Rabiner, L.R. 1973,
"A computer program for designing optimum FIR linear phase digital filters.",
IEEE Transactions on Audio and Electroacoustics, Vol. 21, pp 506-526

59) Mersereau, R.M. 1979,
"The processing of hexagonally sampled two-dimensional signals.",
Proceedings of the IEEE, Vol. 67, pp 930-949

60) Mersereau, R.M. and Oppenheim, A.V. 1974,
"Digital reconstruction of multidimensional signals from their projections.",
Proceedings of the IEEE, Vol. 62, pp 1319-1338

61) Mersereau, R.M. and Speake, T.C. 1983,
"The processing of periodically sampled multidimensional signals.",
IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 31, No. 1, pp 188-194

62) Miller, K.S. 1964,
"Multidimensional Gaussian Distributions",
John Wiley and Sons, New York

63) Miller, D.G. and Tardif, Y. 1970,
"A video technique for measuring the solid volume of stacked pulpwood.",
Pulp and Paper Magazine of Canada, Vol. 71, No. 8, pp 40-41

64) Mokhtarian, F. 1984,
"Scale space description and recognition of planar curves.",
MSc. Thesis, Dept. of Computer Science, University of British Columbia, Vancouver, B.C.

65) Moravec, H.P. 1977,
"Towards automatic visual obstacle avoidance.",
Proc. 5th Int. Joint Conf. Artificial Intell., p 584

66) Nelson, J.I. 1975,
"Globality and stereoscopic fusion in binocular vision.",
Journal of Theoretical Biology, Vol. 49, pp 1-88

67) Nishihara, H.K. 1983,
"Hidden information in early visual processing.",
Proc. SPIE, Vol. 360, pp 76-87

68) Nishihara, H.K. 1984,
"Practical real-time imaging stereo matcher.",
Optical Engineering, Vol. 23, No. 5, pp 536-545

69) Ohta, Y. and Kanade, T. 1985,
"Stereo by intra- and inter-scanline search using dynamic programming.",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.
7, No. 2, pp 139-154

70) Paley, R.E.A.C. and Wiener, N. 1934,
"Fourier Transforms in the Complex Domain",
American Mathematical Society Colloquium Publications, Vol. 19, American Mathematical Society, New York

71) Pan, S.X. and Kak, A.C. 1983,
"A computational study of reconstruction algorithms for diffraction tomography: Interpolation versus filtered backpropagation.",
IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 31, No. 5, pp 1262-1275

72) Papoulis, A. 1966,
"Error analysis in sampling theory.",
Proceedings of the IEEE, Vol. 54, No. 7, pp 947-955

73) Papoulis, A. 1967,
"Limits on bandlimited signals.",
Proceedings of the IEEE, Vol. 55, No. 10, pp 1677-1686

74) Petersen, D.P. and Middleton, D. 1962,
"Sampling and reconstruction of wave-number limited functions in N-dimensional Euclidean spaces.",
Information and Control, Vol. 5, pp 279-323

75) Petersen, D.P. and Middleton, D. 1964,
"Reconstruction of multidimensional stochastic fields from discrete measurements of amplitude and gradient.",
Information and Control, Vol. 7, pp 445-476

76) Rice, S.O. 1945,
"Mathematical analysis of random noise.",
Bell System Technical Journal, Vol. 24, pp 46-156

77) Rosenfeld, A. (ed.) 1984,
"Multiresolution Image Processing and Analysis",
Springer-Verlag, Berlin

78) Rosenfeld, A. and Kak, A.C. 1976,
"Digital Picture Processing",
Academic Press, New York

79) Shanmugan, K.S., Dickey, F.M., and Green, J.A. 1979,
"An optimal frequency domain filter for edge detection in digital pictures.",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 1, pp 37-49

80) Sinclair, A.W.J. 1980,
"Evaluation and economic analysis of twenty-six log sorting operations on the coast of British Columbia.",
Technical Note TN-39, FERIC, Forest Engineering Research Institute of Canada

81) Slepian, D.
1964,
"Prolate spheroidal wave functions, Fourier analysis and uncertainty - IV: Extensions to many dimensions; Generalized prolate spheroidal functions.",
Bell System Technical Journal, November 1964, pp 3009-3057

82) Srihari, S.N. 1984,
"Multiresolution 3-d image processing and graphics.",
in Multiresolution Image Processing and Analysis, Rosenfeld, A. (ed.), Springer-Verlag, Berlin

83) Streifer, W. 1965,
"Optical resonator modes - rectangular reflectors of spherical curvature.",
Journal of the Optical Society of America, Vol. 55, No. 7, pp 868-877

84) Sugie, N. and Suwa, M. 1977,
"A scheme for binocular depth perception suggested by neurophysiological evidence.",
Biological Cybernetics, Vol. 26, pp 1-15

85) Tanimoto, S.L. 1978,
"Regular hierarchical image and processing structures in machine vision.",
in Computer Vision Systems, Hanson, A. and Riseman, E. (eds.), Academic Press

86) Tang, G.Y. 1982,
"A discrete version of Green's theorem.",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 4, No. 3, pp 242-249

87) Terzopoulos, D. 1982,
"Multi-level reconstruction of visual surfaces: Variational principles and finite element methods.",
MIT AI memo 671, Mass. Inst. Tech., Cambridge, Mass.

88) Tyler, C.W. 1973,
"Stereoscopic vision: Cortical limitations and a disparity scaling effect.",
Science, Vol. 181, pp 276-278

89) Tyler, C.W. and Sutter, E.E. 1979,
"Depth from spatial frequency difference: An old kind of stereopsis?",
Vision Research, Vol. 19, pp 359-365

90) Vadnais, C. 1976,
"Raise sideboard recovery with computerized edger.",
in Modern Sawmill Techniques, White, V. (ed.), Vol. 6, pp 154-161, Miller-Freeman, San Francisco

91) VanMarcke, E. 1983,
"Random Fields: Analysis and Synthesis",
MIT Press, Cambridge, Massachusetts

92) Vit, R.
1962,
"Electronic log scaler and its application in the logging industry.",
Canadian Pulp and Paper Assoc., Woodlands Section, Index No. 2125 (B6), p 526

93) Watts, S.B. (ed.) 1983,
"Forestry Handbook for B.C.",
published by the Forestry Undergraduate Society, University of B.C., Vancouver

94) Whittaker, E.T. 1915,
"On the functions which are represented by the expansions of the interpolatory theory.",
Proceedings of the Royal Society of Edinburgh, Vol. 35, pp 181-194

95) Whittaker, J.M. 1929,
"The Fourier theory of the cardinal functions.",
Proceedings of the Mathematical Society of Edinburgh, Vol. 1, pp 169-176

96) Whittington, J.A. 1979,
"Computer control in a chip-n-saw operation.",
pp 111-118

97) Wiejak, J.S. 1983,
"Edge location accuracy.",
Proc. SPIE, Vol. 467, pp 164-169

98) Wiley, R.G. 1978,
"Recovery of bandlimited signals from unequally spaced samples.",
IEEE Transactions on Communications, Vol. 26, No. 1, pp 135-137

99) Wilson, H.R. and Bergen, J.R. 1979,
"A four mechanism model for spatial vision.",
Vision Research, Vol. 19, pp 19-32

100) Wilson, H.R. and Giese, S.C. 1977,
"Threshold visibility of frequency gradient patterns.",
Vision Research, Vol. 17, pp 1177-1190

101) Witkin, A. 1983,
"Scale-space filtering.",
Proc. 8th Int. Joint Conf. Artificial Intell., Karlsruhe, West Germany, pp 1019-1022

102) Woodham, R.J. 1978,
"Reflectance map techniques for analysing surface defects in metal castings.",
T.R. 457, AI Lab, Mass. Inst. Tech., Cambridge, Mass.

103) Yao, K. and Thomas, J.B. 1967,
"On some stability and interpolating properties of nonuniform sampling expansions.",
IEEE Transactions on Circuit Theory, Vol. 14, pp 404-408

104) Yen, J.L. 1956,
"On nonuniform sampling of bandwidth-limited signals.",
IRE Transactions on Circuit Theory, December 1956, pp 251-257

105) Yuille, A.L., and Poggio, T. 1983a,
"Scaling theorems for zero-crossings.",
MIT AI memo 722, Mass. Inst.
Tech., Cambridge, Mass.

106) Yuille, A.L., and Poggio, T. 1983b,
"Fingerprint theorems for zero-crossings.",
MIT AI memo 730, Mass. Inst. Tech., Cambridge, Mass.

