VIEWPOINT INVARIANT SURFACE RECONSTRUCTION FROM GRADIENT DATA by Yossarian. Yggy King B.Sc. (Computer Science) Dalhousie University, 1990 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN THE FACULTY OF GRADUATE STUDIES DEPARTMENT OF COMPUTER SCIENCE We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA August 1993 © Yossarian Yggy King, 1993 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of ComNk-ec _Science, The University of British Columbia Vancouver, Canada Date ^ DE-6 (2/88) 30 181(i 3 Abstract Surface reconstruction recovers complete and coherent surfaces from scattered, noisy data. Viewpoint invariant reconstruction allows seamless combination of multiple views, recognition of objects in arbitrary poses, and consistent quantitative measurements. Viewpoint invariant reconstruction from gradient data is studied in a variational framework. An invariant smoothing functional based on the bending energy of a thin plate is derived. The functional is nonconvex, and three convex approximations are presented. An invariant but nonconvex metric for measuring fit to gradient data is given, based on the angle between unit surface normals. A convex constraint metric is obtained from the second order Taylor series approximation. The chief goal of this thesis is to evaluate and compare the different functionals, with viewpoint invariance the prime evaluation criterion. The approximately invariant regularizing functionals are discretized on a regular grid using a finite difference method. Reconstruction from gradient data proceeds in three stages: interpolation of the initial data; smoothing of the gradient map; and recovery of height from gradient. Data were generated for multiple views of a synthetic surface. Each reconstruction method is applied to recover a surface for each view. The viewpoint invariance of the reconstruction is evaluated by measuring, qualitatively and quantitatively, how well the surfaces from different views match up. Experiments are performed with real data from photometric stereo in the context of building 3D object models by fitting together surfaces reconstructed from multiple views. Viewpoint invariance is important in order for the different surfaces to fit together. Multiple views of a simple ellipsoid of known shape allows accurate evaluation of the results, while data gathered from a doll's face provides more of a challenge to the reconstruction techniques. 11 Table of Contents Abstract --FAtAt,^Con*QA3 List of Tables List of Figures vi 1 Introduction 1 2 Surfaces, Shapes, and Objects 2.1 Surface Reconstruction from Range Data ^ 2.2 Shape from Shading ^ 2.3 Acquisition of 3D Object Models ^ 5 5 9 3 Variational Approach to Surface Reconstruction 3.1 Problem Statement ^ 3.1.1^The III-Posed Problem ^ 3.1.2^Well-Posed Variational Formulation ^ 3.2 Viewpoint Invariance ^ 3.2.1^Requirements for Invariance ^ 3.2.2^An Invariant Constraint Metric ^ 3.2.3^An Invariant Smoothing Functional ^ 3.3 Convex Approximation ^ 3.3.1^Approximating the Constraint Metric ^ 3.3.2^Small Deflection Approximation of the Smoothing Functional 3.3.3^First Derivative Parameterization ^ 3.3.4^Principal Directions Parameterization ^ 3.4 Recovering Height from Gradient ^ 12 12 13 13 15 15 16 19 21 21 22 23 23 27 4 Surface Reconstruction on a Regular Grid 4.1 Discretization ^ 4.1.1^Regular Grid ^ 4.1.2^Approximating Derivatives with Finite Differences ^ 4.2 Initial Interpolation of the Gradient Map ^ 4.3 Minimization of the Discretized Functional ^ 4.4 Boundary Conditions ^ 29 31 32 33 35 37 39 41 5.1 Synthetic Data Sets ^ 42 5.2 Failure of Invariant Constraint Metric ^ 46 5.3 Use of Different Functionals for Height Recovery ^ 48 5.4 Comparison of Smoothing Functionals ^ 49 5.5 A Normalization ^ 57 5.6 The Data-Point Correspondence Problem ^ 58 5.7 Effects of Noise ^ 60 5.8 Effects of Varying Smoothing Parameters ^ 64 5 Investigations with Synthetic Data^ 6 Building Object Models from Real Data^ 69 69 70 71 75 75 78 96 7 Conclusions and Future Work ^ 105 106 107 109 6.1 Data Collection ^ 6.1.1 Photometric Stereo ^ 6.1.2 Imaging Setup and Calibration ^ 6.2 A Simple Ellipsoid ^ 6.2.1 Surface Recovery ^ 6.2.2 Building Models ^ 6.3 Facial Reconstruction ^ 7.1 Summary of Results ^ 7.2 Issues Raised ^ 7.3 Future Work ^ 112 Bibliography^ 116 A Invariant Quadratic Constraint Metric Impossible ^ 116 B Interpolation by Iterated Local Averaging ^ 120 C Conjugate Gradient Descent ^ 123 Appendices^ iv List of Tables 5.1 Distance between recovered surfaces ^ 56 5.2 Distance between recovered surfaces for noisy data ^ 64 6.3 Pixel dimensions at different distances ^ v 74 List of Figures 1.1 Model reconstruction from multiple views ^4 3.1 An invariant measure of the distance between two vectors ^ 17 4.1 Surface reconstruction from gradient data ^ 4.2 Estimating derivatives over a rectangular patch ^ 30 34 5.1 A synthetically generated surface ^ 5.2 Synthetic zz ^ 5.3 Synthetic zy ^ 5.4 Cross section through three synthetic surfaces ^ 5.5 Oversmoothing caused by angle-based constraint metrics ^ 5.6 Differential surfaces for the two constraint metrics ^ 5.7 Height recovered with different integrators ^ 5.8 Surface reconstructed using ili ^ 5.9 Surface reconstructed using 112 ^ 5.10 Surface reconstructed using 123 ^ 5.11 Surface slices reconstructed using Di ^ 5.12 Surface slices reconstructed using 112 ^ 5.13 Surface slices reconstructed using 113 ^ 5.14 The data-point correspondence problem ^ 5.15 Reconstruction with corresponding point data fit term ^ 5.16 Surface reconstructed from noisy data without smoothing ^ 5.17 Noisy, sparse p component of gradient map ^ 5.18 Reconstruction from noisy data with 121 ^ 5.19 Reconstruction from noisy data with 122 ^ 5.20 Reconstruction with different A values ^ 5.21 Reconstruction with different v values ^ 42 43 43 45 47 47 48 50 51 52 53 54 55 59 61 62 63 65 66 67 68 6.1 The imaging setup ^ 6.2 Three images of the ellipsoid ^ 6.3 Optical bench calibration ^ 6.4 Mathematical specification of the ellipsoid ^ 6.5 p and q components of gradient map from photometric stereo ^ 6.6 Surface height recovered without smoothing ^ 72 73 74 76 77 78 vi 6.7 Surface height recovered with 121 smoothing ^ 79 6.8 Surface height recovered with 02 smoothing ^ 79 6.9 Slice through reconstructed surfaces ^ 80 6.10 Slices through smoothed gradient maps ^ 81 6.11 Model built with no smoothing ^ 86 6.12 Model built using C21 smoothing ^ 87 6.13 Model built using 02 smoothing ^ 88 6.14 Cross-section of registered surfaces with no smoothing ^ 89 6.15 Cross-section of registered surfaces with 121 smoothing ^ 90 6.16 Cross-section of registered surfaces with C22 smoothing ^ 90 6.17 Cross-section of ground truth definition of ellipsoid ^ 91 6.18 Cross-section of registered surfaces for trimmed data with no smoothing ^ 92 6.19 Cross-section of registered surfaces for trimmed data with S21 smoothing ^ 93 6.20 Cross-section of registered surfaces for trimmed data with S-22 smoothing ^ 93 6.21 Cross-section of registered trimmed surfaces with no smoothing ^ 94 6.22 Cross-section of registered trimmed surfaces with Ili smoothing ^ 95 6.23 Cross-section of registered trimmed surfaces with S22 smoothing ^ 95 6.24 Images of doll's face ^ 97 6.25 Gradient data for the first view of the doll's head ^ 98 6.26 Face reconstructed in first view without smoothing ^ 100 6.27 Face reconstructed in first view with iii smoothing ^ 101 6.28 Face reconstructed in first view with 1/2 smoothing ^ 102 6.29 Face surface alignment without smoothing ^ 103 6.30 Face surface alignment with fti smoothing ^ 103 6.31 Face surface alignment with S22 smoothing ^ 104 B.1 Sparse data on a grid ^ B.2 Surface interpolated from sparse data ^ 121 122 C.1 Convergence criterion for conjugate gradient descent ^ 125 vii 1 Introduction V isual perception of the three dimensional world begins in two dimensions. A flat image, together with knowledge of how that image is formed, provides information about the shape and structure of the objects and scenes in view. Stereo, optical flow analysis, shape from shading, and shape from texture are some of the techniques of computational vision used to derive 3D information from 2D images, typically providing estimates of distance or surface orientation at points in the visual field. Surface reconstruction is the task of determining complete and coherent surfaces from the sparse, noisy, and possibly inconsistent local constraints provided by the various modules of early vision. The reconstruction of visual surfaces is an intermediate step leading towards full 3D scene analysis, object recognition, object model acquisition, and similar high-level tasks. The form of objects in the world is independent of how they are viewed, and their perception should mirror this invariance—the shape of a reconstructed surface should not depend on the viewpoint from which it is seen. Viewpoint invariant surface reconstruction from range data has been attempted in previous work (Stevenson & Delp, 1992; Vemuri & Skofteland, 1992), and this thesis seeks to provide a similar investigation for reconstruction from gradient data. A variational framework is used, and the task of surface reconstruction is posed as one of functional minimization, wherein viewpoint invariance can be achieved by using invariant functionals. Use of gradient data has typically received attention in the context 1 1. Introduction^ 2 of shape from shading, which recovers surface orientation, and often surface height, from the shading of one or more 2D images. Shape from shading has been shown to be an area where a variational formulation is particularly appropriate (Horn & Brooks, 1986), but little emphasis has been placed on viewpoint invariance. Automatic model acquisition seeks to integrate information from multiple views to form a complete 3D model of an object, such as might be required for object recognition (Goad, 1983), pose estimation (Li, 1993), or for inclusion in a computer graphics or CAD system. Viewpoint invariance is important for successfully merging information from multiple views, since the recovered shape of the object in overlapping areas must be the same in order for them to fit together. This thesis uses 3D model acquisition as a scenario for evaluating the invariance of the proposed reconstruction methods. Chapter 2 provides an overview of some of the relevant literature in the area of surface reconstruction, shape from shading, and 3D model acquisition. Aspects of viewpoint invariance are emphasised, particularly as they relate to variational methods. Chapter 3 presents a variational formulation that regularizes the ill-posed surface reconstruction problem. The requirements for viewpoint invariance are elucidated. An invariant data fit metric based on the angle between surface normals is derived. The energy density of a thin-plate under deformation is used as an invariant smoothing term. This derivation yields a non-convex functional which is difficult to minimize with standard techniques. Three quadratic approximations to the invariant smoothing functional are derived, following the development of (Stevenson & Delp, 1992). The first of these is the small deflection approximation, equivalent to the quadratic variation stabilizer that has previously been used in the surface reconstruction literature (Grimson, 1981; Terzopoulos, 1988). The second approximation is based on having estimates of the first derivatives of surface height at each point in the image, a technique that is particularly appropriate for application to gradient data. The final approximation is based on estimates of the 1. Introduction^ 3 principal directions at each point on the visible surface. Continuous-domain formulations are discretized in Chapter 4, using a finite difference method. The conjugate gradient descent method is described for solving the discrete minimization problem. Surface reconstruction over a regular grid is achieved through a three stage process that interpolates the initial gradient data, smooths the resulting gradient map by application of the invariant functional, and enforces integrability constraints to recover surface height. Chapter 5 describes a number of experiments on synthetic data that were used to evaluate the reconstruction methods in a carefully controlled manner. Invariance of the three approximations is investigated, and the effects of noise and different smoothing parameters are explored. Various issues related to the regularizing process and the comparison of different regularizing functionals are also shown to be important. Chapter 6 uses real gradient data derived from photometric stereo (Woodham, 1980) to reconstruct surfaces from multiple views of an object mounted on a rotating turntable. Data are gathered for four views of an ellipsoid of known shape. The different surfaces are matched together to produce a complete 3D object model. For direct matching of surfaces from adjacent views, viewpoint invariance is essential. Since gradient data provides no information about absolute depth, neighbouring surfaces need to be registered. Additional surface matching experiments were performed with data from two views of the more complicated surface of a pottery doll's face. Figure 1.1 shows the construction of surfaces from multiple views and illustrates the correspondence problem and the need for invariance. Some success is achieved, though in practice effects of view variation in the data and enforcement of boundary conditions may be more important than the choice of smoothing functional. These and other results are summarized in Chapter 7, together with directions for future work. 1. Introduction^ 4 Figure 1.1: Model reconstruction from multiple views. Invariance is required so that adjacent surfaces fit together. Recovered surfaces need to be translated along their respective lines of sight in order to bring them into registration. Surfaces, Shapes, and Objects T 2 he literature abounds with work on surface reconstruction from range data and the determination of shape from shading. Both of these areas are very relevant to the problem of surface reconstruction from gradient data. The proposed invariant reconstruction techniques are to be tested in the context of fitting surfaces together to form 3D object models, another area that has seen much work. This chapter provides a review of some of the more relevant literature in these areas. Past work on surface reconstruction is reviewed in Section 2.1. Related issues from shape from shading are examined in Section 2.2. Various approaches to 3D model acquisition are described in Section 2.3. 2.1 Surface Reconstruction from Range Data Stereo vision, motion analysis, and passive and active rangefinders all provide estimates of the distance to the nearest surface at points in the visual field. Interpolation and smoothing of these data are required to combat noise and sparsity. Reconstruction of visual surfaces from range data has received attention in previous research. Grimson (Grimson, 1981) provides an early variational formulation of the surface reconstruction problem. Reconstruction is performed on a grid, with the quadratic variation stabilizer used as a smoothing term. Under certain assumptions, this stabilizer is equivalent to the small deflection approximation of the bending energy of a thin plate, a 5 2. Surfaces, Shapes, and Objects^ 6 stabilizer which is examined in this thesis. The quadratic variation stabilizer is thus approximately invariant when the surface slope is small, although the property of invariance is not explicitly considered in Grimson's work. The desire for invariance is expressed by Terzopoulos (Terzopoulos, 1988), but the formulation presented is, again, only invariant under the small deflection assumption. "Splines under tension" are proposed, providing a combination of thin-plate and membrane functionals that allows controlled continuity. An efficient multigrid method is implemented. A recent implementation of the thin-plate/membrane spline approximation provides approximately invariant surface and motion estimation from stereo and rangefinder data (Vemuri & Skofteland, 1992). Kalman filtering techniques are used to integrate data from a stream of images and to maintain uncertainty estimates of the values recovered. The small-deflection approximation of the thin-plate energy density is again used, which seems adequate in practice. Viewpoint invariance is one of the major evaluation criteria of the survey of 3D reconstruction techniques provided by (Bolle & Vemuri, 1991). Various methods are surveyed, using a broad categorization into methods for explicit and implicit surfaces. A specific scheme is proposed for the reconstruction of parametric surfaces. Provided the parameterization is viewpoint invariant, then so is the reconstruction. The notion of an invariant parameterization is an important one, as is discussed in Section 5.6. The reconstruction methods described above all refer to invariance as a desirable property, but are generally more concerned with accurate recovery of the surface. Viewpoint invariance is the chief focus of (Stevenson & Delp, 1992), which derives an invariant variational formulation of the surface reconstruction problem based on the thin-plate energy density. The formulation is non-convex, and hence difficult to solve. Several convex approximations are derived and tested on synthetic and real range data. 2. Surfaces, Shapes, and Objects^ 7 This thesis is largely based on the work of Stevenson and Delp, and provides an invariant formulation for surface reconstruction from gradient data that is analogous to their formulation for range data. Stevenson and Delp suggest a constraint metric for use with gradient data based on the sine of the angle between surface normals, but do not test it. A metric very similar to this is shown to be inadequate in Chapter 5 of this thesis. Attempts have previously been made to incorporate slope information into the surface reconstruction problem. Harris presents a method for surface reconstruction that integrates range, slope, and higher-order derivatives (Harris, 1987). The solution is formulated as the minimization of the heat energy in a multi-level analog resistor mesh. Discontinuities of any order are handled by "breaking" the mesh at the appropriate level. A parallel iterative scheme is presented for the solution. The formulation is not view invariant. This coupled depth/slope model has been implemented and combined with a method to incorporate discontinuities based on the natural boundary conditions of the problem (Li, 1991). A number of experiments are performed, indicating that natural boundary conditions work well, except at self-occluding boundaries, where additional information in needed to achieve satisfactory results. This thesis does not directly address the issues of discontinuities and boundary conditions, but they are shown to be important to the general task of surface reconstruction. 2.2 Shape from Shading Gradient information has been used for surface description and reconstruction primarily in the context of shape from shading. The observed shading of surfaces is obviously affected by the local orientation of the surface, and this fact has been exploited in vision systems to determine local shape. Given orientation, full recovery of the 3D surface is a 2. Surfaces, Shapes, and Objects^ 8 logical extension that is often incorporated. The shape from shading problem is easily cast in a variational framework, and much of the early work on shape from shading can be understood in these terms (Horn & Brooks, 1986). Horn and Brooks develop a "systematic approach to the discovery of parallel iterative schemes for solving the shape from shading problem on a grid," based on satisfying a discretized version of the Euler equations for the variational problem. The importance of enforcing integrability is emphasized, and a method for doing so is derived which recovers height and orientation simultaneously, using the height function as a check on the integrability of the gradient field. Integrability is not strictly enforced, but rather encouraged by the use of a penalty term. Horn and Brooks are not alone in emphasizing integrability. Frankot and Chellappa present a method for projection of a non-integrable gradient field onto its "closest" integrable counterpart (Frankot & Chellappa, 1988). The solution is direct, but requires that the surface be represented as a linear combination of a finite set of basis functions. The technique can be retrofit to existing shape from shading methods to maintain integrability. Horn provides a historical perspective on the shape from shading problem and its solution, and presents a method for the simultaneous recovery of height and gradient (Horn, 1990). It is emphasized that shape from shading is not ill-posed if suitable boundary conditions are known, though these boundary conditions can be difficult to incorporate with the gradient representations of surface orientation. The method presented seeks to encourage integrability and satisfy the image irradiance equation. An iterative scheme is derived based on the Euler equations of the problem. A caution against overly sophisticated techniques is given by (Tsai & Shah, 1992), who present a shape from shading algorithm that is implemented in 20 lines of C code that requires only "one to two iterations" to converge. The first derivatives of the surface are 2. Surfaces, Shapes, and Objects ^ 9 approximated with forward differences of the surface height function, and the reflectance function is linearized directly in terms of surface height rather than first derivatives. Convergence is assured by the linearity of the solution. The algorithm produces results that appear qualitatively correct. Motivated by the observation that surface shape is perceived largely independently of illumination conditions, (Koenderink & van Doom, 1980) show that a large class of local brightness extrema occur along parabolic lines of a surface—points at which one of the principal curvatures is zero (i.e., the surface is locally flat in one direction). These lines divide the surface into patches of positive and negative Gaussian curvature. Surface curvature is a viewpoint invariant surface property, and so a link is provided between the shape from shading problem and the quest for invariance. Reliable local estimates of curvature can be derived from shading information using the photometric stereo technique (Woodham, 1990). Despite these results, little attention has been given to the issue of viewpoint invariance in solving the shape from shading problem. This thesis addresses one aspect of this problem by testing various approximately invariant gradient-based smoothing functionals. 2.3 Acquisition of 3D Object Models A logical extension of the idea of surface reconstruction is to merge surfaces from multiple views to derive full 3D models of the object in view. Many proposed techniques for object recognition rely on having an internal model of the object to be recognized (Roberts, 1965; Goad, 1983; Lowe, 1987; Wang & Iyengar, 1992). Automatic generation of object models would facilitate these techniques. 2. Surfaces, Shapes, and Objects^ 10 Surfaces generated from different views can only be merged directly if their overlapping regions fit together (i.e., if they have been calculated in an invariant manner). This thesis proposes to test the invariance of surface reconstruction techniques in the context of piecing together 3D object models such as might be used in object recognition tasks. The octree representation is a method of recursively subdividing 3D space to create an occupancy structure describing an object. Szeliski uses an octree representation for automatic recovery of object models (Szeliski, 1990a). The object is placed on a rotating turntable, with multiple silhouette views used to "carve out" the octree representation. Holes can be handled, but the use of the silhouette limits the forms of concavity that can be determined. Since most of the information in each image is being ignored, many images are required for an accurate model. Octrees are efficient to represent and manipulate (Chen & Huang, 1988), but suffer from an inability to represent local shape. Szeliski presents an alternative method for automatic model building in which optical flow is used to calculate the range of points on an object as it rotates on a turntable (Szeliski, 1990b). Disparity measurements are projected to 3D points with associated uncertainties. Points from multiple views are brought into a common coordinate frame and nearby points merged. A method of smoothing and interpolating the resulting point set is suggested but not implemented, though later work explains how this might be achieved (Szeliski & Tonnesen, 1991). Results are presented for several simple polyhedra. Viewpoint invariance is not emphasized in Szeliski's work, but is indirectly important to Ferrie et al, who use estimates of the local Darboux frame (representing invariant surface properties) to segment a surface and fit volumetric primitives to the resulting patches (Ferrie et al, 1990). The process generates a "blob" model of the object in which fine surface detail is lost. Earlier work attempts to fit a polyhedral object model to a set of 3D points (Faugeras et al, 1984). Range data is acquired from multiple views as an object rotates on a 2. Surfaces, Shapes, and Objects^ 11 turntable. Points from each view are brought into a common coordinate system, and an iterative process is used to determine a triangulation that lies "sufficiently close" to the observed data. It has recently been proven that "there are no view invariant features for the general case of n 3D points under perspective (or orthographic] projection" (Burns et al, 1993), which presents difficulties for feature-based object recognition. If recognition is not to depend on viewpoint, then it is desirable that it be based on viewpoint invariant measures, be they discrete features or dense measures. The authors suggest an analysis of the view variation of different types of features to derive distribution functions to aid recognition. A more direct approach seems to be to augment object models with local shape information, thus providing more constraints on the matching process. Chapter 6 describes an attempt to build 3D object models by registration of surfaces calculated from overlapping views of an object. Local surface orientation can be maintained at each stage of the process and used to augment the final model. If surface reconstruction is invariant, then the multiple surfaces should fit together well. The actual fit observed provides a test of invariance. Variational Approach to Surface Reconstruction T 3 he surface reconstruction task is to produce a dense map of depth estimates that is consistent with data sensed from the scene, possibly using prior knowledge and expectations about the scene domain. This chapter formalizes this problem. Section 3.1 states the problem in its ill-posed form, and presents a well-posed variational formulation. The conditions required for viewpoint invariance are elaborated and an invariant regularizing functional is derived in Section 3.2. The functional derived is non-quadratic and difficult to minimize. More tractable approximations to the invariant functional are discussed in Section 3.3. The presentation and content of this chapter somewhat follows the development of (Stevenson Sz Delp, 1992), which was the work that inspired much of this thesis. 3.1 Problem Statement This section provides a mathematical statement of the surface reconstruction problem and a well-posed variational formulation. 12 3. Variational Approach to Surface Reconstruction^ 13 3.1.1 The III-Posed Problem A visual surface can be represented (barring transparency) as an explicit surface z(x, y), giving the distance to the scene at each (x, y) point in the visual field'. Low level visual processes provide estimates of properties of the scene related to shape, in particular estimates of the distance or surface normal at points in the scene. These estimates may be noisy, sparse, and inconsistent. Given a set S of estimates of the surface distance or surface normal at discrete points in the visual field V, a naive statement of the surface reconstruction problem is to find a real-valued function i(x, y), defined for all (x, y) E V, which matches the point-wise estimates of distance and orientation given by S. A problem is well-posed, in the sense of Hadamard, if a unique solution exists that depends continuously on the initial data (Tikhonov & Arsenin, 1977). The surface reconstruction problem, as posed above, may violate any of these conditions—inconsistencies in the constraints may imply that there is no exact solution; sparsity of constraints allows many solutions; detection and exploitation of depth or orientation discontinuities may lead to a solution that does not depend continuously on the initial data. 3.1.2 Well-Posed Variational Formulation Regularization theory can be applied to yield a well-posed problem by restricting the space of allowable solutions, and has been successfully applied in many areas of computer vision (Poggio et al, 1985). Incorporation of a priori expectations and knowledge about the scene domain places restrictions on the form of the estimated surface. The problem now becomes one of finding the surface estimate that best fits the sensed data while simultaneously meeting the desired smoothness and coherence criteria. The problem 1Though there may be many different object and background surfaces in view, we can speak of a single "surface," perhaps with discontinuities, covering the entire field of view. 3. Variational Approach to Surface Reconstruction^ 14 as initially posed sought to find the surface height function from among all real-valued functions. Regularization generally limits the solution space to functions with continuous derivatives to some specified order. The regularized problem can be mathematically stated as finding (for fixed 8) z. which minimizes a measurement functional m[z., sj . E pc(z* , c) + Afl[z1^ cEs (3.1) where ft is a stabilizing functional that measures the "smoothness" of the surface in some appropriate way; pe measures the fit of the surface to the constraint c; and A is a regularizing parameter that controls the tradeoff between smoothness and fit to the data. Note that ft serves a dual purpose—it incorporates expectations about the general form of the surface to interpolate between data points, and it helps compensate for the effects of noise in the data. Note also that the solution of the variational problem is not the same as the solution of the original problem. If the smoothness and data fit terms of the regularizing functional are chosen appropriately then the solution should be satisfactory, but it is a different problem that is being solved. The solution to the regularized surface reconstruction problem is the surface 2 satisfying M[2, S] = min M[z*, S]^ (3.2) A unique solution is guaranteed if and only if the measurement functional M has a unique global minimum2. 2If there are no local minima other than the global minimum then the solution will be easier to find (using gradient descent methods). 3. Variational Approach to Surface Reconstruction ^ 15 3.2 Viewpoint Invariance For seamless combination of multiple views or recognition of objects in arbitrary poses, it is desirable to reconstruct surfaces in a manner that is independent of viewpoint. This section describes the conditions that must be met for this to be achieved and derives a regularizing functional for invariant reconstruction. 3.2.1 Requirements for Invariance Suppose that a surface estimate 2 is derived from a set of sensed constraints S. Applying a transformation T to the constraint set yields 5' = T[S]^ (3.3) Let i' be the surface reconstructed from S'. If the reconstruction process is invariant to the transformation T, then the surface reconstructed from the transformed constraints should be the same as that obtained by transforming the surface reconstructed from the original constraints, i.e. V = T[2]^ (3.4) This can be achieved if the regularizing functional is invariant to the transformation, that is M[e,S] = M[T[e],TESU = M[z*',S]^(3.5) That this is sufficient for invariance can be seen from the following MP, S] = min M[z* , S] = min M[z*', S'] = MEY , S1^(3.6) If M has a unique global minimum then i and "i' are unique, and by Equation 3.5 i' is the transformed version of 2. 3. Variational Approach to Surface Reconstruction^ 16 For viewpoint invariance, the relevant transformations are rigid translations and rotations3. An invariant regularizing functional can be constructed from a constraint metric p and a stabilizing functional 12 based on surface characteristics that are unaffected by rigid motion. 3.2.2 An Invariant Constraint Metric The fit between a postulated surface estimate and a set of data constraints should be an invariant calculation if the final surface estimate is to be independent of viewpoint. Viewpoint invariance requires that the constraint metric not depend on the coordinate system within which the surface is embedded. For depth data, such as those from a range-finder or stereo module, a suitable invariant metric is the squared distance between each range point and the postulated surface. This thesis addresses the use of slope data, for which an invariant measure can be based on the angle between the normal of the estimated surface and the normal obtained from a slope data point. Given a gradient estimate (p, q), where p and q represent the surface slope in the x and y directions, the (un-normalized) surface normal corresponding to that data point is rid = (—p,—q,1). Similarly, the surface normal at a point on the estimated surface is n, = ( — z„ — zy, 1), where z1, and zy are the first partial derivatives of surface height with respect to x and y. The angle 0 between the two vectors is given by cos 0 nd • ns ndIIns p.zz^qzy -I- 1 Al(p2 + q2 +1)(4 + 4 +1) (3.7) 3Note that preservation of visibility is assumed. Wherever the term "invariant" is used, "visibleinvariant" should be understood. 3. Variational Approach to Surface Reconstruction^ Figure 3.1: An invariant measure of the "difference" between two surface normal vectors. 17 3. Variational Approach to Surface Reconstruction^ 18 An invariant constraint metric that increases with increasing angle (up to the maximum possible angular separation 7r) is the squared distance d2 between the endpoints of the normalized vectors (see Figure 3.1), where cl, = sin ! 2^2 (3.8) d2 = 2(1 — cos 0) (3.9) Trigonometric manipulation leads to yielding the angle-based constraint metric pz, + qzy ±1 Me , (p, q)) = 1^ 1/(132 + q2 + 1)(4 + 4 ± 1) (3.10) This metric is not quadratic with respect to p and q, which makes minimization somewhat harder (as explained in Section 3.3). Stevenson and Delp suggest a similar constraint metric, using sin2 0 (Stevenson & Delp, 1990), but apparently has not implemented it. A quadratic constraint metric can be obtained from the squared distance between the endpoints of the non-normalized surface normal vectors: psD(z* , (p, q)) =11 ns — nd 112= (z. — pr + (zy — q)2^(3.11) This constraint is coordinate system dependent and thus not invariant4. This is the metric used by (Horn, 1990) as an integrability constraint for recovering surface height from gradient. The ideal constraint metric would be both viewpoint invariant, and quadratic in z, and zy. Unfortunately, such a metric does not exist (see Appendix A for proof). An alternative is to use a quadratic approximation, such as that derived in Section 3.3.1. 'For example, if (p, q) = (1,0) and (zz,zy) = (0,0), then a penalty value of 1 is obtained. Rotating these vectors by 450 about the y axis causes the penalty to blow up to co, with all other penalty values in [1,00) assumed for smaller rotations. 19 3. Variational Approach to Surface Reconstruction ^ 3.2.3 An Invariant Smoothing Functional Ideally, the stabilizing functional 1 of Equation 3.1 would be viewpoint invariant and would assume greater values as the surface departs from consistency. Since most surfaces commonly encountered in the visual world are (piecewise) smooth and tend not to vary rapidly, a stabilizing functional can be based on surface curvature, an invariant property. Reconstructing the surface by fitting a thin plate model to the available data achieves this end. The energy density of a thin plate deformed from a planar rest state is given as (Courant & Hilbert, 1989) h..,2 4,2 —2) -I- BK1K2 A (-1^ 2 (3.12) where Ki and K2 are the principal curvatures of the plate. The parameters A and B are properties of the material, where A = E is Young's modulus, a measure of the stiffness of the material5 and = v is Poisson's ratio, which measures the lateral contraction of the material per unit elongation (Volterra & Gaines, 1971). For most metals and glass, Poisson's ratio is approximately v 0.3, while for some rubbers and plastics it can approach the volume-conserving theoretical limit of 0.5 (Jastrzebski, 1976). Equation 3.12 can be rewritten in terms of these more familiar constants as E {('c? + K3) + Inclic2} (3.13) An alternative measure of curvature is provided by the Gaussian and mean curvatures K (3.14) H (3.15) Which yields an equivalent expression for the energy density E(2H2 — (1 — OK)^ 5Analogous to the spring constant of an ideal spring that obeys Hooke's law. (3.16) 3. Variational Approach to Surface Reconstruction ^ 20 The stabilizing functional to be used is the total deformation energy of the plate, obtained by integrating the energy density over the surface area of the plate Si[z1 = ju E(2H2 — (1 — v)K)dA^ (3.17) For an explicit surface z(x, y), the Gaussian and mean curvatures are (Pogorelov, undated) K= H= ZxxZyy - zz2y (3.18) (zx2 + zy2 +1)2 (4 +1)zyy — 2zzzyzxy + (4 +1)zxx 2(zx2 + 2.121+1)312 (3.19) while the area of an infinitesimal planar patch is dA = A / 4 + q ± 1^ (3.20) Combining these gives an expression of the stabilizing functional for explicit surfaces if [(4 + 1)z O[z(x, y)] =^ (4 + 1)zz]2 ^2zzzyzxy + 2(4 + +1)5/2 4 - (1 0 Zxx Zyy - ....,2zy (4 + 4 +1)3/2 dx dy^(3.21) where the constant factor E due to the material has been dropped. In short, the surface reconstruction problem is stated as a functional minimization problem. The invariant functional to be minimized acts to fit a thin plate approximation to the given slope data. Combining 3.10 and 3.21, the complete invariant functional is M[z,S] =^ E (1 , pzx + qzy + 1 (p,q)ES^11 (P2 + q2 + 1)(4 + .q + 1)) [(4 +1)zyy — 2zzzyzzy + (4 +1)zxx? (4 + 4 + 1)5/2 2 Z2xx Zyy Z1)3 — 2(1 v) ^ dx dy^(3.22) (z + 4 + /2 3. Variational Approach to Surface Reconstruction^ 21 where constant factors have been combined with the regularizing parameter A. This formulation is in terms of the surface z(x, y), but it is to be understood that the application will be in the gradient domain, with second derivatives of surface height becoming first derivatives of gradient. This is a non-convex functional which will be difficult to minimize without making some approximations. The following sections present several possible approximations, which will be tested for accuracy and viewpoint invariance. 3.3 Convex Approximation The reconstruction process has been stated as the minimization of the functional given by Equation 3.22. In order for a unique solution to exist, the functional must have a unique global minimum. A unique minimum guaranteed to exist and is much easier to find if the functional is convex. A functional F[x] is convex over a solution space V if and only if (Maddox, 1988) Vx, y E V, /3 E [0,1] : F[/3x -I- (1 - 0)y] < 0.Flx] + (1 — 01)F[y] (3.23) Any quadratic function is necessarily convex. Neither the smoothing functional nor data fit term of Equation 3.22 are convex. Several quadratic approximations are possible, and are detailed in the following sections. 3.3.1 Approximating the Constraint Metric While the data fit term pe is not convex, it does have continuous derivatives and a single global minimum. This makes minimization by gradient descent methods feasible, but a quadratic functional like the integrability constraint ps,, of Equation 3.11 allows 3. Variational Approach to Surface Reconstruction ^ 22 greater efficiency (since the conjugate gradient descent method is intended for quadratic or approximately quadratic functionals—see Section 4.3.) As stated previously, no invariant quadratic constraint metric exists (see Appendix A). We can instead use the second order Taylor series approximation of pe, expanded in terms of zx and zy about the point (p, q): Pe = PT, + 0(z. — Pr + 0(zy — q)3 ^ (3.24) where (z, — p)2 + (zy — q)2 ± (z4 — zyp)2 PT,(z*,(P,q)) =^ (1 ± p2 ± q2)2^ (3.25) The Taylor series is not convergent (unless (zs — p) < 1 and (zy — q) < 1), and the expression pray is not invariant, but it is quadratic and can be hoped to be an improvement over psD• (Note that where p = q = 0, PTay and ps,, are equivalent.) For the experiments, all three constraint metrics —po, ps,,, and pT,—were tried and compared. 3.3.2 Small Deflection Approximation of the Smoothing Functional A common approximation to Equation 3.21 comes from the assumption that zs and zy are small, yielding (Courant & Hilbert, 1989; Volterra & Gaines, 1971) Cli [z(s, y)] = if (zxx + zyy)2 — 2(1 — v)(zxxzyy — zz2y)dxdy (3.26) Setting I/ = 0 yields the quadratic variation stabilizer, which has often been used for surface reconstruction (Grimson, 1981; Terzopoulos, 1988). This approximation goes back to Lagrange (Volterra & Gaines, 1971) and is frequently used in the mechanical engineering field, where the bending of plates is analyzed in a 3. Variational Approach to Surface Reconstruction^ 23 coordinate system based on the rest state of the plate and the amount of bending is usually small. For surface reconstruction, where the coordinate system is arbitrarily imposed by the viewing direction and surfaces may have many folds and wrinkles, the approximation seems less appropriate. 3.3.3 First Derivative Parameterization A more accurate approximation can be derived if estimates of the first derivatives zs and zy are available. Given values of p and q as estimates of zs and zy, Equation 3.21 can be written as 122[z(x,Y)] = f f (Azzs + Bzsy + Czyy)2 — 2(1 — v)D(zxxzyy — zz2y)dx dy (3.27) where A - D are constant at each point on the surface, and are defined as A = B= C = D = 1+ (1^p2^q2)1 Pq (1^p2^q2)1 1 + q2 (1^p2^q2)t 1 (1 +p2 + q2)1 (3.28) (3.29) (3.30) (3.31) (Note that A- D are all less than 1 in absolute value.) This is the first attempt at an invariant parameterization made by (Stevenson & Delp, 1992)6. 3.3.4 Principal Directions Parameterization An alternative parameterization can be based on the first and second fundamental forms of the surface (Pogorelov, undated), which serve as a unique local characterization of the 'In their paper, A and B are erroneously squared—this is a typographic error (Stevenson, personal correspondence). 3. Variational Approach to Surface Reconstruction ^ 24 surface. The fundamental forms can be used to derive the principal curvatures of the surface, from which the smoothing functional naturally follows. First fundamental form. An explicit surface can be represented as the set of points in Ile of the form z = (x, y, z(x , y)). The first fundamental form at such a point is I = dz2 = (dx , dy, zz dx + zy dy)2 ± dy2 ± zx2 dx2 2zxzydx dy + zy2 dy2 = dX2^ + (3.32) The first fundamental form is often expressed in terms of the parameters E, F, and G as I = [dx dy] [E Fl[dx1 F G dy = Edx2 + 2Fdx dy + Gdy2 (3.33) where E = z! = (1 + 4) F = zz • zy = zxzy G = z2 = fi __,_ ,2 \ Y^k m ''y) (3.34) Second fundamental form. The second fundamental form is defined similarly as II = —dz • dn (3.35) where n is the unit surface normal at the point, which can be expressed as n (3.36) 3. Variational Approach to Surface Reconstruction^ 25 The second fundamental form is generally expressed in terms of the parameters L, M, and N as II = [dx dy] [L Mi[dxl M N dy (3.37) = Ldx2 + 2Mdx dy + Ndy2 where L = —zx nx = zxx • Zxx n= zz2 zy2 +1 M = —(Zx • Ily Zy • 11x) = Zxy • N = —zy • ny = zyy • n = v n= xy ++1 ZYY z2 z2 (3.38) x^y The forms I and II can be thought of as expressing properties of curves on the surface z(x, y). Suppose that -y is the curve formed by the intersection of the surface with a plane perpendicular to the tangent plane at a point P. If the direction of the curve at P is given as (dx, dy), then the curvature of -y at P is (Pogorelov, undated) kdx dy The quantity kdx dy L dx2 + 2M dx dy + N dy2 II E dx2 + 2F dx dy + G dy2 — (3.39) is known as the normal curvature of the surface in the direction (dx dy)7. This expression can be used to calculate the smoothing functional of Equation 3.12, since the principal curvatures ki and IC2 are just the normal curvatures in the principal directions. Given the principal directions {(dxi : dyi), (dx2 : dy2)}, the principal curvatures can be written as kJ. = 1 ^ (c4,14x +w2zxy +w3zyy) 4 +4+1 If the direction is specified as an angle 9 measured from the tangent in the x-direction, then the normal curvature can be calculated from dx = cos 8 , dy = sin O. 3. Variational Approach to Surface Reconstruction^ 1 K2 = ^ (W4Zxx ± W5Zxy CV6Zyy z2 z2 1 x^y 26 (3.40) where dx ? E dx? + 2F dxi dyi + G dy? 2 dxi dYi C4/2 E dx? + 2F dxi dyi + G dyl = E dx? + 2F dxi dyi + G 42. dx3 4.04^= E dx + 2F dx2 dy2 + G dY3 2 dx2 dY2 w5 E dx3 + 2F dx2 dy2^G dy3 dy3 cos^= E dxi ± 2F dx2dy2 ^G dA (3.41) The principal directions can be calculated from the Weingarten mapping -1 EFL FG MI MN (3.42) The principal directions are simply the eigenvectors of 13: ei = (dxi, dyi). An approximation to the principal curvatures of Equation 3.40 can be derived if estimates p and q of zx and zy are available, from which the Weingarten mapping can be calculated, and thus the principal directions can be derived. This yields estimates of the parameters w1—w6 at each point in the image8. The smoothing functional (see Equation 3.13) now has the convex approximation (again, neglecting a constant factor) 123 [Z (X y)] = f (k? +^+ 2vk1k2 dA 'To estimate the second fundamental form it is necessary to numerically differentiate p and q in order to estimate second derivatives of surface height. This is the only real difference between 523 and f/2• This method of arriving at 123 is not terribly good (as explained in the conclusions section). 27 3. Variational Approach to Surface Reconstruction ^ = [GDiz..,fVf zz2+zy2+1 (.212zxy C.D3zyy)2^(C2/42'xx (215zxy 6)6zyy )2 +21)(C2'1Zxx CD2ZXy C2)3Zyy )(6)4ZZX C2)52.Xy CV6Zyy )1 dx dy^(3.43) where c24-6)6 are the estimated parameters derived from the Weingarten mapping. The parameters are derived from estimates of the principal directions—at singularities in the principal direction field (umbilic points), any arbitrary orthogonal directions can be chosen, since the curvatures are the same in all directions. The factor A/4 + zy2 + 1 in the denominator can be approximated by -Vp2 + q2 + 1. This is the approximation used by (Stevenson & Delp, 1992), with v = 09. Any of the smoothing functionals 121-423 can be combined with any of the data fit terms pe, psD or pr., (Equations 3.10, 3.11, and 3.25) to yield a regularizing functional. The minimization of the functional yields a complete, smoothed gradient map. The invariance of the reconstruction will depend on how accurately the approximation chosen reflects the original invariant functional. This thesis seeks to evaluate and compare the different approximations on this invariance criterion. 3.4 Recovering Height from Gradient Application of the smoothing functional is designed to produce a complete and coherent gradient map, which leaves the task of recovering the actual surface height. This is an integration process, which seeks to find the surface estimate which comes closest to being the integral of the gradient map. (Since integrability has not been enforced, direct integration is not possible.) Separation of processing for gradient smoothing and height recovery was chosen as a means of simplifying the computation, but is rather an artificial split (see Chapter 7). 'The factor (4 + zy2 + 1)4 is eliminated in Stevenson and Delp's formulation by virtue of the minimization scheme they have chosen (Stevenson, personal correspondence). 3. Variational Approach to Surface Reconstruction^ 28 This can easily be done by minimizing a functional analagous to the integrability constraint of Equation 3.11, or any of the other data fit functionals used while smoothing the gradient map. The difference now is that the penalty term is applied for every point in the smoothed gradient map, rather than just at the points corresponding to the original data. In short, we find the surface which minimizes R[z] = i f p((z„ zy), (p, q)) dx dy (3.44) The squared differences constraint, being quadratic, is easier to minimize, while po meets our desired invariance criteria, but is non-convex and hence harder to minimize. Alternatively, pm), is quadratic and approximately invariant. Note that during the stages of gradient map smoothing and surface height reconstruction, only first order derivatives ever need to be numerically estimated, a benefit of working with slope data rather than range data. 4 Surface Reconstruction on a Regular Grid S urface reconstruction has been formalized as a problem of minimizing a functional that imposes smoothness and fit to the data constraints on the surface (see Equation 3.22). The previous chapter derived an invariant regularizing functional and several quadratic approximations, all in the continuous domain. This chapter presents a discretization of the problem and an algorithm for its solution. The task considered is to derive robust estimates of surface orientation and relative distance at each point on a regular grid, given noisy estimates of surface orientation at a possibly sparse set of points on the grid. The grid corresponds to a digital image, and it is assumed that the imaging process can be modeled by orthographic projection. The problem of dealing with depth and orientation discontinuities, while important in the general case, is only addressed superficially here (see Section 4.4). The problem is solved in three stages (see Figure 4.1): 1. initial interpolation of the gradient data; 2. smoothing of the gradient map; and 3. integrating gradient to recover surface height. The continuous formulations presented in Chapter 3 are recast in the discrete domain in Section 4.1, using a finite difference method. A method for interpolating the initial gradient data is explained in Section 4.2. Section 4.3 describes gradient descent methods 29 30 4. Surface Reconstruction on a Regular Grid^ Low level visual process gradient data smoothed gradients recovered surface Figure 4.1: Surface reconstruction from gradient data. 4. Surface Reconstruction on a Regular Grid ^ 31 for minimization of the discretized problem. Boundary conditions apply at the edges of the reconstruction, and are discussed in Section 4.4. The algorithm was implemented in Lucid Common Lisp on a Sparc workstation. 4.1 Discretization In Chapter 3, the surface reconstruction problem was stated as the variational problem of minimizing a functional defined over a continuous surface height function z(x, y). For a discrete surface height function defined at points in a digital image, the problem becomes one of minimizing a function M(z), where z is an array of surface height values corresponding to points in the image. The function M has the form of Equation 3.22, but in the discretized problem, the double integral is replaced by a double summation over the rows and columns of the image, and the values of derivatives are replaced with their numerical estimates. For an M x N image, the discrete versions of the various smoothing functionals are thus (see Equations 3.26, 3.27, and 3.43) 111[Z(X1 y)} M N =^E E(4z + iyy) 2 -- 2(1 — v)(ixi^—2) YY^xy C22[z(x, y)] = i=1 j=1 MN^,, E E(A.S.= + bizy + Oiyy)2 i=1 j=1^ f23 [Z(X, y)1 = 2(1 — v)D(ixxiyy — 22 xy ) (4.1) (4.2) MN 1 E E^ [Pi ixx + (2)22xy + 6)3402 + ((4)42xx + (4)52xy +6)6 2yy)2 i=1 j=1 . \ I 4 + Z11 + 1 —1-21462,142 + C222 xy 4- (c)3iyy)(C4J4i xx ± L;')5iry ± (4)62y01 (4.3) where zx and zy are the variables to be solved for during the smoothing process, and the second derivative estimates 2, ixy, and iyy are formed as described in Section 4.1.2 (Equation 4.11). For each term in the summation, the values are taken from the corresponding point (i, j) in the image. 4. Surface Reconstruction on a Regular Grid^ 32 Similarly, the functional for recovery of surface height from the smoothed gradient map (Equation 3.44) is discretized as: R(z) = MN E E p(pz,^(p,q))^ i=1 j.1 (4.4) with first derivatives estimated by Equation 4.12. Using the different constraint metrics derived previously yields three possible functionals for surface height recovery R0(z) Re(Z) = = M N E E psD((z„ 4), (p, q)) i=1 j=1 MN E E pe((ix,^(p,q)) i=1 j=1 MN Rmy(Z) E E pr,((2s, 4), (p, q)) (4.5) (4.6) (4.7) i.1 j=1 Rather than directly discretizing the continuous functionals, an alternative approach is to derive the Euler-Lagrange equations for the continuous problem statement, which are then discretized to form the basis of an iterative solution. (See (Courant & Hilbert, 1989) for an explanation of the Euler-Lagrange equations of a variational problem, and (Horn & Brooks, 1986) for examples of their application to shape from shading.) Discretization of the original functional was chosen as the more direct way to tackle the problem. 4.1.1 Regular Grid Surface reconstruction will be performed over an M x N grid. The grid points are h units apart in the x direction and v units apart in the y direction. Each intersection corresponds to a pixel in a video image, with h and v indicating distances in the world corresponding to the width and height of a single pixel. This assumes that the imaging process can be modelled as orthographic projection, with X x=— n y= — (4.8) 4. Surface Reconstruction on a Regular Grid^ 33 where (X,Y,Z) is a point in the world corresponding to position (x, y) in the image grid'. The parameters h and v, determined during camera calibration, encode information about the average distance to the scene, aspect ratio, focal length, analog to digital conversion, etc. Data values are available at a subset of the pixels in the image. 4.1.2 Approximating Derivatives with Finite Differences Given a small h x v rectangular patch in the discretized gradient map, with estimates of the gradient (p, q) at each of the four corners, a quadratic surface height function can be defined over the patch z(x , y) = ax2 + by2 + cxy + dx + ey + f (4.9) The first derivatives of this height function are z x(x ,y) = 2ax + cy + d zy(x , y) = 2by + cx + e (4.10) which will be equal to the known values of p and q at the corners of the patch. Note that these patches are nonconformal, since adjacent patches will not in general coincide along their edges (except, by construction, at the corners). Plugging in the known values of p, q, x, and y at the corners of the rectangular patch yields a system of eight linear equations in the five unknowns a—e (note that gradient information alone places no constraint on the constant term 1). A least squares solution to this system provides estimates of the second derivatives of z(x, y), corresponding to first derivatives of the gradient map 2XX = 2a = Pio — Poo +Pii — Poi 2h = Pz 1This is sometimes referred to as a weak perspective model of projection, as opposed to true orthographic projection where x = X,y = Y. 34 4. Surface Reconstruction on a Regular Grid ^ 1 1 1y 100^101v +P 01^11 h Figure 4.2: A rectangular patch. The derivative calculation yields first derivative estimates valid at point P in the center of the patch. = 2b = go' — goo + Di — qi0 . qy 2YY^ izy = c = 401 2v Poo + PH —Pio) + h(gio — goo + Di —NO 2(h2 + v2) — v2py + h2qx = ^ h2 + v2 (4.11) where the numerical subscripts of p and q indicate different corners of the rectangular patch (see Figure 4.2). One might expect the estimate of zxy to be -(py -I- q) —observe that this is only the case when h = v. For reconstruction of height from gradient, the first derivatives of surface height need to be computed. This can be done in an exactly analogous fashion ix = zio — zoo + zn — zoi iy = 2h zoi — zoo ± zn — zio 2v (4.12) These calculations can be performed by convolution with the simple masks 1 —1 2h — 1 +1 +1 and 1 —1 —1 2v +1 + 1 (4.13) Note that this method of calculating derivatives has the desirable property that the estimates of iy 4 = 2.01 -- ZOO • and zy are valid at the same point, unlike a scheme such as ix = zlo — zoo, 4. Surface Reconstruction on a Regular Grid ^ 35 Calculating derivatives in this way yields estimates that are valid at points offset by half a pixel in x and y from the original grid. This has implications for the iterative reconstruction process, since data values and estimated parameters are valid at slightly different points on the surface than the derivative estimates. In practice, it has been found that the only substantial effect of the staggered grids is that the reconstructed surface is offset by half a pixel from the data. If necessary, this can easily be compensated for by resampling. 4.2 Initial Interpolation of the Gradient Map The initial input to the surface reconstruction process is a set of possibly noisy gradient constraints measured at discrete points on the surface. These constraints need to be interpolated to fill in missing values and smoothed to reduce the effects of noise. Rather than immediately applying the full smoothing functional, it is useful to first interpolate the data, in order to get closer to the minimum before incurring the computational overhead of more complex later stages. The initial interpolation is done by component-wise interpolation of the surface normals given by the gradient data. This estimate passes directly through the data points and does nothing to minimize noise, which will be compensated for by later smoothing stages. The interpolation proceeds as follows. Consider the abstract case of a sparse data set fui(xi, yi)ii = 1, , N} defined at discrete points in the (x, y) plane. We can fit a piecewise planar surface fi to the data by minimizing the small deflection approximation of the membrane energy functional (Courant & Hilbert, 1989) fi (ft! + 74)dx dy^ (4.14) subject to the constraints ii(xi,yi) = u(xi,yi). Terzopoulos (Terzopoulos, 1988) uses 4. Surface Reconstruction on a Regular Grid^ 36 this penalty term as part of a controlled continuity stabilizer, as it allows orientation discontinuities. This interpolation scheme could be applied separately to the p and q components of a set of gradient constraints2, equivalent to minimizing Ii( p! + 14 + q! + gy2)dx dy (4.15) which is just the quadratic variation of the underlying surface. This is essentially equivalent to the smoothing functional used by (Grimson, 1981) for surface reconstruction from depth constraints. However, this functional is not viewpoint invariant. Another approach is to calculate the unit surface normals corresponding to the known values of p and q from n(p, q) = (nx , ny, nz) = (—p, —q,1)/ Vp2 ± q2 ± 1 (4.16) Each component nx, ny, nz of the surface normal "map," can be interpolated as described above. This process is viewpoint invariant, and (provided the angle between nearby normals is less than about ir-) is roughly equivalent to linear interpolation of the angular separation between adjacent normals. Following interpolation, the normals are converted back to gradient values by p(n) = (4.17) q(n) = (4.18) The interpolation is carried out using the simple algorithm of Appendix B. It is this interpolated gradient map that forms the input to the second stage of the reconstruction process. The interpolated gradient map is also used to calculate the parameters required to form E22 and O3. 2Note that C° continuous interpolation of gradients yields C' continuity of the corresponding surface height. 4. Surface Reconstruction on a Regular Grid^ 37 4.3 Minimization of the Discretized Functional Discretization of the variational problem yields a problem of function minimization. Consider a functional M[f], where f is a function. We can think of M as a function of infinitely many variables, corresponding to the values z = 1(x) at each point x in the domain of f. Discretizing the domain of f approximates the functional MO with a function MO of a finite (though large) number of variables—the values of f at each node in the discretization. The discrete problem is to find z such that M(z) = min M(z*) z. (4.19) The typically large number of variables foils attempts at an analytic solution, but various iterative numerical methods are available. Many of these methods are based on the idea of gradient descent—starting at some point in the parameter space of the function, keep moving "downhill" along the surface of the function until the minimum is reached (Faddeev Sz Faddeeva, 1963). Steepest descent is the simplest of these methods. Given an initial point zip, the gradient of the function is computed at that point, corresponding to the locally steepest direction on the function surface. The function is minimized along this direction3, yielding a new point z1. More precisely gk = vM(zk) ak = arg m?1 M(zk — 7gk) 7E1K Zk+i = Zk - akgk (4.20) 3This is an optimistic form of steepest descent. A more cautious approach is to take small steps of a predetermined size in the direction of steepest descent, perhaps updating the step size via some heuristic. 4. Surface Reconstruction on a Regular Grid ^ 38 where this process is repeated until a suitable stopping criterion is met'. Note that the line minimization to find ak may itself be an iterative approximation process, depending on the form of MO. The trouble with steepest descent is that while each new direction is perpendicular to the previous direction (minimizing the function in a particular direction means that the component of the gradient in that direction is zero), it may be close to parallel to some prior direction. This results in the algorithm taking many more iterations than should be necessary. For example, if the minimum is in a "valley" of the function, we would like the algorithm to move to the floor of the valley in one step and to the minimum in the next. Steepest descent will not do this, but instead will jump back and forth across the valley, taking many small steps before it finally reaches the minimum. The method of conjugate gradient descent provides faster convergence by seeking to optimize the direction in which to move at each step. Furthermore, for a quadratic function, the conjugate gradient method guarantees convergence after at most n steps, where n is the number of unknowns in z (Faddeev & Faddeeva, 1963). The basic idea is to keep moving in a direction that minimizes the function over the subspace spanned by the previously chosen direction vectors. Appendix C presents an overview of the algorithm; see (Grimson, 1981) or (Press et al, 1988) for further details. All such gradient descent methods are vulnerable to getting stuck in local minima. Various stochastic techniques exist which attempt to avoid this problem (Blake, 1989), but they tend to be very computationally intensive. The simplest alternative is to stick to convex functions, in particular quadratics, which is why so much effort was taken to provide quadratic approximations to the invariant functional. Once the gradient map has been found to solve the minimization problem, we can recover height from gradient by solving another discrete minimization problem. The 4e.g. the norm of the gradient, II gk II, is below some tolerance. 4. Surface Reconstruction on a Regular Grid^ 39 continuous solution is described in Section 3.4. We simply apply the gradient descent minimization techniques to the discretized versions of the continuous problem (see Equations 4.5, 4.6, and 4.7). 4.4 Boundary Conditions The three stages of the surface reconstruction process—gradient interpolation, gradient smoothing, and recovery of height from gradient—all act to recover functions by imposing conditions on the function over some region of interest. Additional issues are raised in considering the boundaries of the region. The Dirichlet boundary conditions are used when values can be prescribed at the boundary. If appropriate boundary conditions are not known, then rather than prescribing arbitrary values (say 0), the Neumann, or natural, boundary conditions can be used. For the gradient smoothing stage, the natural boundary conditions are: Op Oq — = — = 0 and On an Oz -Ft = ( 13' q) n (4.21) where n is the outward normal to the boundary (Ascher & Carter, 1993). That is, p and q are extended linearly beyond the boundary, while the changes in z across the boundary are obtained from the values of p and q. The natural boundary conditions can be derived from the Euler equations for the problem (Courant & Hilbert, 1989). Alternatively, if the functionals are discretized directly then the natural boundary conditions simply fall out of the discretization. (Li, 1991) showed that natural boundary conditions work well for surface reconstruction from slope data, except at self-occluding object boundaries. At occluding boundaries, it is desirable to have prescribed values or to use Dirichlet conditions. In this work, direct discretization of the functionals over the image ensures that the 4. Surface Reconstruction on a Regular Grid^ 40 natural boundary conditions are implemented at the boundaries of the image, but not at object boundaries within the image. This is a serious flaw if the goal is accurate surface reconstruction, but handling of such discontinuities is an issue that is not addressed. Provision has been made for inclusion of boundary locations—if the algorithm is provided with the location of discontinuities then derivatives will not be calculated across these boundaries and values for gradient or height can be explicitly set. In particular, the photometric stereo process yields an object silhouette image which can be used to find occluding boundaries 5 . 5 1n practice, use of this method of setting boundary conditions did not prove useful and was not used. Investigations with Synthetic Data A 5 method has been described for reconstructing visual surfaces from gradient data. This chapter presents the results of a number of experiments with synthetic data devised to evaluate the method. Although reconstruction accuracy is important in general, this study focuses primarily on the view variation of the smoothing functionals. Smoothing and accuracy can conflict, but finding the optimal degree of smoothing is not considered. Section 5.1 describes the synthetically generated data sets. Section 5.2 discusses problems with the invariant constraint metrics. Use of different functionals for determining height from gradient is discussed in Section 5.3. The results obtained using different smoothing functionals are presented in Section 5.4. A "A normalization" technique required for the comparison of results from different reconstruction functionals is described in Section 5.5. Functionals for surface reconstruction were developed with pointwise invariance in mind—the problem of invariance to reparamaterization is considered in Section 5.6. The effects of noise in the data are investigated in Section 5.7. Section 5.8 demonstrates the results of varying the regularizing and smoothing parameters. 41 5. Investigations with Synthetic Data^ 42 Figure 5.1: Synthetically generated surface height function. 5.1 Synthetic Data Sets A smoothly varying synthetic surface was generated that contains a wide range of surface normal directions. The mathematical form of the surface is z(x, y) = 2(1 -I- x2(y2 — 1))(CND04 (y) — 1/2)^(5.1) where CNDI,,, is the one-dimensional cumulative normal distribution function with mean it and standard deviation u. The surface was sampled at points on a 50 x 50 grid laid over the range x, y E [—LI] (see Figure 5.1). This sampling yields values for the grid spacing parameters of h = v = 2/50 = 0.04. The input to the reconstruction process is the gradient at each point (Figures 5.2 and 5.3). 5. Investigations with Synthetic Data ^ Figure 5.2: First component of the gradient map—z x values for synthetic surface. q Figure 5.3: Second component of the gradient map—z y values for synthetic surface. 43 44 5. Investigations with Synthetic Data ^ Two more data sets were generated by rotating the analytical surface by 300 and 60° about the x axis and resampling over the range x E [ 1, y E [ 1+2V3 , J4A. This -- - corresponds to the same "piece" of the surface as that for the unrotated data set. For the rotated data sets, the grid spacing parameters are h = 0.04, v = (1+ Va)/50 = 0.0546. The unrotated surface and its two rotations will be denoted So, 530 and S60. Figure 5.4 shows a cross-section of the three surfaces'. The basic method of evaluating the invariance of a surface reconstruction method will be to reconstruct the surface for each of the data sets, then rotate the resulting surfaces into a canonical position and see how well they match. Matches will be evaluated qualitatively by comparing surface plots and quantitatively by measuring the distance between different reconstructions. The functional to be minimized (Equation 3.22 and its discrete approximations) contains the smoothing parameter A, controlling the tradeoff between smoothness and fit to the data, and the parameter v relating to the thin-plate material (see Section 3.2.3). For accurate surface reconstruction, many techniques exist for determining a value for A that is appropriate for the available data (Wahba, 1990). This thesis seeks to evaluate different smoothing functionals, and so a rather larger A was used than might have been necessary, in order to emphasise the effects of smoothing. In addition, the values used in conjunction with different smoothing functionals must be equalised to allow meaningful comparison of the results. The value Ao = 1.0 was used throughout the experiments, with the "normalization" process described in Section 5.5. The choice of v had little effect on the results, and v = 0.0 was used throughout. The effects of varying these parameters are demonstrated in Section 5.8. 'Points in 2D plots and grid intersections in 3D plots correspond to actual data points. For the 3D plots, the data has been sub-sampled, so not all points in the data set are plotted. 5. Investigations with Synthetic Data ^ 45 No rotation min min MM. Gm First rotation om no no e &toed rotation e".'■ 0.5 / /^ ■■.■•■■ • 0.5 • • •• • Figure 5.4: Cross section of row 25 of the three synthetic surfaces. 5. Investigations with Synthetic Data^ 46 5.2 Failure of Invariant Constraint Metric On attempting to use the invariant constraint metric po its inadequacy was immediately apparent (see Figure 5.5). This failure is due to the small range of values taken on by the constraint metric (since 0 < (1 — cos 8) < 2) compared to the large range for the smoothing functional (infinite at discontinuities). The data fit term is completely overwhelmed by the smoothing term at points of high curvature on the surface, meaning that these points are oversmoothed. The linear tradeoff provided by the regularizing parameter A is not sufficient to combat this effect. pi,,, suffers from similar defects. Tikhonov and Arsenin, in their treatment of the solution of ill-posed problems, assume that L2 norms are to be used (Tikhonov & Arsenin, 1977). Of the constraint metrics studied here, only PSD is an L2 norm. The other constraint metrics may thus be inappropriate, but the mathematics has not been pursued. Figure 5.6 plots a slice through the "derivative surface" for po and Ps,,. The plot shows the partial derivative of the constraint metric with respect to a particular element of the p component of the gradient map. Although both plots have the same sign, the magnitude of po becomes very small across the jump in height, thereby failing to enforce data fit at those points and yielding to oversmoothing. pTi, suffers from similar problems. (The calculation is performed using an intermediate gradient map estimate during smoothing of So with 12 and ',sp.) The constraint metric psD based on squared differences (see Equation 3.11) provides a better match with the smoothing functionals used, and will be employed for the remaining examples. On a purely practical note, the quadratic metric p,„ yields convergence in fewer and faster iterations than the doubly iterative solution using pg. 5. Investigations with Synthetic Data ^ 47 Figure 5.5: The inadequacy of the angle-based constraint metrics is demonstrated by their application to the unrotated surface. The smoother is 12 in each case. Sneered difference medic Angle-based metric Figure 5.6: Differential surfaces for the two constraint metrics. 5. Investigations with Synthetic Data ^ 48 Figure 5.7: Height recovered with different integration functionals (after smoothing) The solid line is the corresponding slice through the original surface. 5.3 Use of Different Functionals for Height Recovery Height can be recovered from a gradient map by minimizing any of the functions RsD ) Re, or RT,,,,, (see Equations 4.5, 4.6, and 4.7). Not all integrators are created equal though, as was evident in practice. Rs0 converged most rapidly and produced the best results; Rmy converged more slowly but gave roughly the same results; Ro was the slowest and most erratic to converge, and gave poor results. Figure 5.7 shows a slice through the surfaces produced by the different integrators when applied to the gradient map for So smoothed with a combination of C-22 and psD. The reasons for the differences in performance are not entirely clear, but it is believed 5. Investigations with Synthetic Data ^ 49 that Ro is inheriting some of the shortcomings of pg. RT., is thought to produce slow convergence because of the interaction of p and q in the penalty term. Note that for an integrable surface, any reasonable functional that pushes the surface in the right direction should work (indeed, a direct solution should be possible). It is only when integrability has not been enforced that the choice of the height recovery functional becomes important. Unless otherwise stated, all following experiments use RSD • 5.4 Comparison of Smoothing Function.als The primary goal of this thesis is to evaluate the performance of the different smoothing functionals 1/1, C22, and 123 that were derived in the preceding chapters. Each of the smoothers was paired with the constraint metric ps,, and used to smooth the gradient map corresponding to the surfaces So, smoothed gradient map using RsD S30, and S60. Height was recovered from the as an integrator (see Equation 4.5). Plots of the surfaces recovered for the original, unrotated data set So are shown in Figures 5.8, 5.9, and 5.10. Cross-sectional plots of the surfaces reconstructed for each of the three data sets are shown in Figures 5.11, 5.12, and 5.13 for 01, 112, and 123 respectively. Each figure shows a slice of the surface for the original data and for each rotated data set, where the reconstructions for the rotated data sets have been rotated back into the position of the original surface. The mathematical surface definition is also shown to indicate the reference surface. The effect of staggered grids (see Section 4.1.2) is apparent near the origin, where it is clear that the reconstructed surfaces are displaced slightly to the right from the original surface. Note that the reconstructions have not been rescaled—the reconstruction process 5. Investigations with Synthetic Data ^ . Figure 5.8: Surface reconstructed using Ch. 50 5. Investigations with Synthetic Data^ z Figure 5.9: Surface reconstructed using 122• 51 5. Investigations with Synthetic Data^ Figure 5.10: Surface reconstructed using 113. 52 5. Investigations with Synthetic Data ^ 53 Figure 5.11: Surface slices reconstructed using 111. Each slice is through row 25 of the reconstructed surface. accurately recovers the correct range of surface heights2. 122, the smoother based on first derivative parameterization, appears (at least qual- itatively) to offer no significant benefits over the quadratic variation stabilizer Ii. In both cases, the reconstructions for the two rotated data sets S30 and Sop line up very well, while the reconstruction for the original data set So differs significantly. The trouble with the original data set is due to the high gradients encountered when y is close to 0. The surface is so steep at this point as to be almost discontinuous, which presents a problem for the smoother. This does not happen for the rotated surfaces, since the gradients remain much smaller. For a quantitative comparison of the reconstructed surfaces, we can measure the 2The absolute position of the surface is also recovered correctly, which one would not generally expect from gradient data. This happens because of the symmetry of the surface. 5. Investigations with Synthetic Data ^ 54 Figure 5.12: Surface slices reconstructed using 02. Each slice is through row 25 of the reconstructed surface. 5. Investigations with Synthetic Data^ 55 Figure 5.13: Surface slices reconstructed using S-23. Each slice is through row 25 of the reconstructed surface. 5. Investigations with Synthetic Data ^ 56 S30 --) SO I S30 -> S60 hi 112 123 0.0542 0.0355 0.0302 0.0049 0.0095 0.0145 Table 5.1: Distance between recovered surfaces for different smoothing functionals. "distance" between the reconstructions from different views. The distance metric used is the average vertical distance between the reconstructed surfaces, after rotation into the same position. S30 was rotated into the position of each of the other surfaces and this metric calculated. The results are shown in Table 5.1, which indicates that 521 does best when the gradients are relatively small, while 522 does perhaps marginally better overall. (This table provides a measure of reconstruction invariance, rather than closeness to original data.) Q3 appears to be worse than both 111 and 122. In retrospect, the method selected for implementation of 123 was not a good choice. The parameterization is based on estimating the principal directions at each point on the surface. (Stevenson & Delp, 1992) report good results for this smoother in the context of range data, but their method is based on a more sophisticated method of estimating principal directions. In the current implementation, the only "new information" introduced to derive 123 that was not used for C22 was to numerically differentiate the gradients p and q to form estimates of the Weingarten mapping (see Section 3.3.4). This implies that when the estimates of p and q are poor, the parameterization of f23 will be even worse. However, it was exactly when p and q were inaccurate that f23 was meant to be an improvement over 122. 123 was not considered in further tests, including those with real data. 57 5. Investigations with Synthetic Data^ 5.5 A Normalization Selection of a regularizing parameter that gives the "best" results for the level of noise in the data is a difficult problem that has received much attention (Wahba, 1990). Setting A = 0 will interpolate the data, while A = oo will find the best-fit plane. Aside from reconstruction accuracy, an additional issue arises when the task is evaluation of different smoothing functionals. In comparing the different smoothing functionals, it proved necessary to "normalize" the values used for the regularizing parameter A. For a fixed value of A, the tradeoff between smoothness and data fit will be different when different functionals are used. It is thus inaccurate to compare different smoothing functionals using the same regularizing parameter, as the results for one functional may be smoothed more than for another. In order to achieve comparable degrees of smoothing with the different functionals, the notion of an effective A value is introduced, denoted Ao. Given a desired value Ao, and smoothing and data fit terms ft and p, the actual value of A to be used is determined by scaling Ao by the "base response" of the smoothing and data fit terms to some representative gradient map data set PQ0 = {(p, q)jj}. The base smoothing response is defined to be just the response of the smoothing term to PQ0 BSR = 11(PQ0) (5.2) To define a base data fit, the average gradient (p-, 4) is first calculated by averaging the surface normals present in PQ0. The base data fit is then defined as the sum of the response of the data fit term for each element of PQ0 compared to the average gradient BDF = E (p,q)EP Q0 p((p , q), 05, 0) (5.3) 5. Investigations with Synthetic Data^ 58 The actual value of A to use is then calculated as A = Ao BDF BSR (5.4) Unless stated otherwise, the effective regularizing parameter used in these experiments is Ao = 1.0. The S30 data set was used for PQ0. This yielded actual A values of 0.0403, 0.0810, and 0.0757 for f2i, 122, and 123 when paired with psD• This is by no means the only way of solving this problem, but it seems to give reasonable results, as the surfaces in Figures 5.11-5.13 appear to be smoothed by comparable amounts. The problem of "normalizing" the regularizing parameter arises whenever different smoothing functionals are to be compared, but does not appear to have been addressed in the literature. (Stevenson Sz Delp, 1992), for example, state that A = 0.1 was used throughout their experiments with different smoothing functionals. 5.6 The Data-Point Correspondence Problem One problem with reconstruction on a regular grid, as proposed, is that the correspondences between data points and reconstructed surfaces are different in different views (see Figure 5.14). This means that while a smoothing functional such as 522 may be viewpoint invariant on a point-by-point basis, it is not invariant to the reparameterization of the surface induced by sampling on a grid from a different direction. This is an important issue that has not been adequately addressed in this work. In an attempt to demonstrate that this is indeed a problem, a new data fit term was derived that attempts to match corresponding points. The actual data fit calculation is identical to that of psD, but data points are matched against the closest point on the surface, rather than matching against the same grid point. The correspondences were determined by using the original data surface and the surface reconstructed using 122 5. Investigations with Synthetic Data ^ 59 r N. } Original surface , Reconstruction 0- - - - -0 View 1 constraints 0-0 View 2 constraints Figure 5.14: The data-point correspondence problem. Data points in different views are matched against different points on the postulated surface. 5. Investigations with Synthetic Data ^ 60 and psD to see which point on the surface was closest to each of the data points. These correspondences were then used throughout the reconstruction process3. The results of the reconstruction (using S22) for So and S30 are shown in Figure 5.15. The results are, unfortunately, even worse than before, but the correspondence problem is still thought to be an issue that needs addressing. For reconstruction from range data, it is possible to derive correspondences by finding closest points, but the situation is more difficult for gradient data, since the absolute position of each data point is unknown. Under certain assumptions, such as that of convex surfaces, it might be possible to match directly at the level of p and q to determine the correspondences. This was not pursued. 5.7 Effects of Noise To test the performance of these reconstruction methods in the presence of noise (before attempting to use real data), the gradient maps for the three surfaces were corrupted and the reconstruction re-applied. The addition of noise was achieved by treating each component of the gradient map as the tangent of an angle—the underlying angle was distorted by a random Gaussian with mean 0 and standard deviation 3°. In addition, each element of the gradient map was dropped from the data set with a probability 0.25. Figure 5.17 shows a slice through the p component of the gradient map for the original surface and the distorted p. Noise was added in this manner to each data set SO, S30, and 560 independently to form Sg, SI, and S. Figure 5.16 shows the results of interpolating and integrating S6z without smoothing. Flaws in the surface are obvious. 3This is an artificial way of determining correspondences that could obviously not be used in practice (since the required surfaces are not available). It was done just to make a point. 5. Investigations with Synthetic Data ^ Figure 5.15: Reconstruction with corresponding point data fit term. Invariance is still not achieved. 61 5. Investigations with Synthetic Data ^ . Figure 5.16: Surface reconstructed from noisy data without smoothing. 62 5. Investigations with Synthetic Data ^ 63 Figure 5.17: Noisy, sparse p component of gradient map compared to the uncorrupted gradient values. 64 5. Investigations with Synthetic Data^ I S30 -4 SO S30 -4 S60 ^ ^ 111 0.0579 0.0583 ^ ^ 0.0200 0.0554 92 Table 5.2: Distance between recovered surfaces for different smoothing functionals applied to noisy data. The results of the reconstruction are shown in Figure 5.18 and 5.19 for 5-21 and 122, respectively. Comparison with Figures 5.11 and 5.12 indicates that this level of noise and sparsity is not a serious problem. It does appear that 112 is better able to handle the noise, as the reconstructions for the rotated noisy data sets S;), and SI line up better than they do using SZi (though the results for sg still differs substantially). The average pointwise distance between the reconstructed surfaces is again measured, and appears in Table 5.2. The quantitative results indicate that 112 outperforms 521, though both struggle with the steep surface So. 5.8 Effects of Varying Smoothing Parameters While not central to the considerations of viewpoint invariance, it is instructive to examine the effects of varying the parameters that control the smoothing functional. These parameters are the regularizing parameter A controlling the tradeoff between smoothness and fit to the data, and v, a property of the material modelled by the variational formulation (see Section 3.2.3). Figure 5.20 shows the effect of different degrees of smoothing on the surface S30 (values shown are for the effective lambda, as defined in Section 5.5). As expected, higher values of A weight the smoothing term more heavily, causing the surface to be smoothed more. Higher values are appropriate when the input data is noisier. If the data is perfect, then 5. Investigations with Synthetic Data^ 0 00 00 • No rotation o • • a First rotation 00 . • O 0 • 00 0 o Second rotation 0 • O 0 03 . o• -0.5 -1 0.5 • • • • 0 00 00 • •_0.00 ••^ 00 0 . 0 w op. od, . Figure 5.18: Reconstruction from noisy data with C2i. 65 5. Investigations with Synthetic Data ^ Figure 5.19: Reconstruction from noisy data with 112. 66 5. Investigations with Synthetic Data ^ 67 Original 03 X.1.0 2.0 Figure 5.20: Reconstruction with different A values. The values shown have been normalized, as described in the text. A can be set to 0, eliminating smoothing altogether. Figure 5.21 shows the effects of different values of v. Setting v = 0.5 is the theoretical volume-conserving limit if the formulation is to correspond to a physically realizable material. v = 1 was tried for the sake of comparison. As can be seen, variations in this parameter seem to have little effect on the reconstruction, as has been observed previously (Terzopoulos, 1988). v = 0 was used for all the preceding experiments. 5. Investigations with Synthetic Data^ 68 Original ar......■ ■■■ v = 0.0 03 I■1 IMII MO v-03 •^• ... V .. 1.0 Figure 5.21: Reconstruction with different v values. v is a property of the material related to surface contraction per unit elongation. 6 Building Object Models from Real Data T he previous chapter has investigated various properties of the smoothing functionals using synthetically generated data. This chapter applies the reconstruction techniques to real data obtained with photometric stereo. Invariance of the results is explored in the context of fitting together surfaces from different views, a technique which can be used to build 3D object models. Data collection is discussed in Section 6.1, including descriptions of the photometric stereo technique, the imaging setup used, and the calibration performed. Section 6.2 discusses surface reconstruction and model building for a simple ellipsoid object. Some samples of the surfaces that were recovered are shown in Section 6.2.1. Section 6.2.2 describes the surface alignment process used to build 3D models and shows the models recovered. Additional reconstruction and alignment experiments were performed with data collected for the more complex surface of a doll's face, as described in Section 6.3. 6.1 Data Collection Image data was collected under controlled conditions and photometric stereo was used to recover the gradient at each point in the image. Section 6.1.1 describes the photometric stereo process. The imaging setup and its calibration are described in Section 6.1.2. 69 6. Building Object Models from Real Data ^ 70 6.1.1 Photometric Stereo This thesis investigates the use of gradient data for surface reconstruction. A robust and reliable source of gradient data is photometric stereo (Woodham, 1980). Various shape from shading techniques (Horn & Brooks, 1986) exploit the fact that the shading of a surface is related to its orientation by the image irradiance equation E(x, y) = R(p, q) (6.1) which relates irradiance E at a point (x, y) in the image to surface orientation (p, q) at a corresponding point in the scene. The function R is the reflectance map, which is determined by the reflective properties of the surface, the illumination conditions, and the imaging geometry. R generally depends nonlinearly on the gradient. Given measurements of the irradiance E at each point in the image, and knowledge of the reflectance map R, the image irradiance equation provides a constraint on the gradient. However, the equation is not invertible, since we have just one constraint on two unknowns. The basic idea of photometric stereo is to collect multiple images from the same viewing direction, but under different conditions of illumination, thus providing multiple constraints on the gradient and allowing direct local recovery of surface orientation. Consider the case of three images gathered under three different illumination conditions. The constraints imposed on surface orientation can be written as Ei(x , = Ri(p, q) i = 1,2,3^(6.2) The implementation of photometric stereo can be thought of as a lookup table which matches brightness triples (E1, E2, E3) to gradients (p, q). The reflectance map functions Ri are not generally known a priori, but can be determined from a calibration object of known shape (e.g. a sphere). 6. Building Object Models from Real Data^ 71 For more details on the photometric stereo technique, see (Woodham, 1980) or Chapter 10 of (Horn, 1986). 6.1.2 Imaging Setup and Calibration Image collection was done on a Newport calibrated optical bench. For the ellipsoid object, an ellipsoidal PVC' model was mounted on a rotational platform, allowing easy collection of multiple views of the object with known inter-view rotations. A PVC sphere was used for calibration. The imaging setup is shown in Figure 6.1. Data collection for the doll's head example was done similarly, though no ground truth was available. A 3 CCD colour RGB camera was used in conjunction with three coloured light sources, allowing the RGB channels to be used for simultaneous gathering of the multiple images required for photometric stereo. Figure 6.2 shows the images obtained on the three colour channels for the first view of the ellipsoid used in the model building experiments. The use of photometric stereo assumes that the projection can be modelled as orthographic (or, more precisely, weak perspective). This assumption is valid if the variation in range within the scene is small with respect to the distance to the camera. The projection equations for this simple model are Y= fylf Zo (6.3) where Z0 is the average scene distance and fx and fy combine factors due to focal length, aspect ratio, etc. These equations map world points (X, Y, Z) to image points (x, y). For the experiments performed here, the average scene distance was taken to be the distance from the image plane of the camera to the center of the rotational platform, Zo = 86". Calibration of the system was done to determine the parameters h = Al and v = a fr^fy ' which can be thought of as the height and width of a pixel in the units of the world 1Poly-vinyl chloride. 6. Building Object Models from Real Data^ 72 Figure 6.1: The imaging setup used for data collection. Objects are mounted on a rotating platform. The sphere was used for photometric stereo calibration. The four white dots were used to determine projection parameters. 6. Building Object Models from Real Data^ 73 Figure 6.2: Three images of the ellipsoid gathered under different lighting conditions. Left is the red channel, center is the green, right is the blue. 6. Building Object Models from Rea/ Data ^ 74 Figure 6.3: Optical bench calibration. Z0 — 1" Zo Z0 + 1" 0.02533 0.02563 0.02587 0.02012 0.02036 0.02060 Table 6.3: Pixel dimensions at different distances. coordinate system (see Section 4.1.1). Two vertical braces were mounted on either side of the rotational platform, with two circular white dots affixed to each (see Figure 6.3). The four white dots form a rectangle of known dimensions around the area of interest in the image. The centroids of the dots were calculated in a calibration image, giving the "pixel unit" dimensions of the rectangle, thus allowing easy determination of h and v. To ensure the robustness of this method, the calibration procedure was repeated with the side braces moved forward one inch, and again with them moved backward one inch. The results appear in Table 6.3. Differences between the three calibration positions agree well with that expected by a 1" variation at 86" (a; x 100% = 1.16%). These results were averaged to determine a camera aspect ratio of DA = 0.795. 6. Building Object Models from Real Data ^ 75 6.2 A Simple Ellipsoid Experiments were performed with the data produced by photometric stereo for multiple views of the ellipsoidal model. The following subsections describe the reconstruction of surfaces for each view and the alignment of these surfaces to create a 3D model. 6.2.1 Surface Recovery Once the photometric stereo method has been used to recover gradients from the raw image data, the surface reconstruction techniques described in previous chapters can be applied to recover the shape of the surfaces in view. This was done for multiple views of a PVC ellipsoid which were then meshed to form a full 3D model (Section 6.2.2). The ellipsoid was machined according to the mathematical specification \ 2 + 1E\ 2 ± t:z._ \ 2 = 1 a)^b)^c) (6.4) where a, b, and c are the axes of the ellipsoid (see Figure 6.4). Images of the ellipsoid were gathered for four different views, with a 900 rotation of the object between each view. The two components of the gradient map recovered for the first view of the ellipsoid are shown in Figures 6.5. The gradient map is recovered over a 280 x 130 image (which has been subsampled to plot the figures). These data were integrated using RsE, to recover a surface with no smoothing. The result appears in Figure 6.6. The coherence of the surface indicates the regularity of the data obtained with photometric stereo. Smoothing was then applied, using 121 and C-22 with ND, and height recovered from the smoothed gradient maps using R,,,. The effective A value (see Section 5.5) in each case was Ao = 1.0. Results are shown in Figure 6.7 for 1li and Figure 6.8 for S-12• The fact that boundary conditions have not been applied is 6. Building Object Models from Real Data^ 76 Figure 6.4: Mathematical specification of the ellipsoid. The three axes of the ellipsoid are 1.79", 2.99", and 5.39". The units shown in the figure are "pixel widths" (1 pixel = 0.02563"). 6. Building Object Models from Real Data^ (a) p component z (b) q component z Figure 6.5: p and q components of the gradient map calculated from photometric stereo. 77 6. Building Object Models from Real Data ^ 78 Figure 6.6: Surface height recovered without smoothing. seen in the distortion of the surface at the edges of the ellipsoid2. For the sake of easier comparison, slices through the 3D surfaces have been plotted in Figure 6.9. The surfaces obtained in each case are clearly very similar (an indication that perhaps smoothing is not really needed for this data). Figure 6.10 shows slices through the p component of the original gradient map and those obtained by smoothing with E/i and 112. 6.2.2 Building Models In order to test the invariance of the reconstruction methods, the surfaces reconstructed from multiple views of an object were meshed together to form a complete 3D model of the object. The more invariant the reconstruction technique, the better the different surfaces will fit together. Although the inter-view transformation is known, the use of gradient data means that the absolute distances of the recovered surfaces are unknown, 2The edge portions of these surfaces were trimmed with the aid of the silhouette image generated by the photometric stereo process before models were built. 6. Building Object Models from Real Data Figure 6.7: Surface height recovered with 1-21 smoothing. Figure 6.8: Surface height recovered with S22 smoothing. 79 6. Building Object Models from Real Data^ 80 No smoothing „,, „.,, „,,, .. Osempl moodhing ... ., . . Onmp2 smoothing Figure 6.9: Slice through row 140 of surfaces reconstructed with no smoothing, Ili smoothing, and 122 smoothing. (Absolute surface height is arbitrarily determined by the reconstruction process.) 6. Building Object Models from Real Data^ 81 (a) p component (row 140) No smoothing — — — Omegal smoothing — ,... — Omega2 smoothing (b) q component (column 80) No smoothing — — — — Omegal smoothing — — Omega2 smoothing Figure 6.10: Slices through row 140 of p and column 80 of q component of smoothed gradient maps. 6. Building Object Models from Real Data^ 82 and so the alignment process must determine one degree of translational freedom per view. Following sections describe the alignment problem and an algorithm to solve it, and present results from the model building process. Registering Surfaces from Multiple Views Multiple gradient views of an object have been obtained from known viewing directions and the object surface reconstructed in each view. A Cartesian coordinate system was used, with the positive z axis towards the viewer; the y axis vertical, increasing downwards; and the x axis horizontal, increasing to the right. If the intersection of the optical axis with the axis of rotation is taken as the origin, then the inter-view transformation is a known rotation about the y axis. In order to bring the multiple surfaces into alignment, the unknown absolute distance of each surface must be determined. If the axis of rotation is known, then there is one degree of translational freedom along the viewing axis associated with each surface (see Figure 1.1). In practice, the origin was the upper left corner of the image, with the z position of the origin determined arbitrarily by the reconstruction algorithm. The y coordinates are the same in each view, leaving two degrees of freedom to be solved for to determine the x and z alignment of neighbouring surfaces. Given an M x N surface z(x, y) that has been recovered for a particular view, it is simple to transform the surface into a 3D point set in world coordinates by using the pixel dimensions determined during camera calibration S = {(X, Y, Z) = (hx,vy,z(x, y))11 < x < M, 1 < y < N}^(6.5) where h and v are the height and width of a pixel in the units of the world coordinate system (determined during calibration). Given the point sets S and S' for two different 83 6. Building Object Models from Real Data ^ surfaces, it still remains to apply the rotation and determine the absolute depth of each surface. If points (x, y, z) E S and (x', y', z') E S' correspond to the same physical point on the object, then the transformation between the two surfaces is X / ^ x^ 0 ^ ^ ^ + 0 y V =R ^ z/ z (6.6) + Az].^AZ2 where Azi and Az2 represent the unknown translation along the line of sight for the two surfaces and R is the rotation matrix for the known rotation about the y axis by angle 0 cos 0 0 sin 9 ^ R=^0^1^0 (6.7) _ — sin 0 0 cos 9_ Alternatively, and equivalently, Equation 6.6 can be expressed as X ^V = / R ZI _ ^ x ^ Ax y + 0^ (6.8) _ Z^Az indicating that the rotation can first be applied and then two translational parameters determined. The equivalence of the two formulations is seen from Ax = Azi sin 0^Az = Azi cos 0 + Az2 ^ (6.9) Formulating the problem in this way also obviates the need to know the position of the (vertical) axis of rotation and justifies the use of the upper left corner of the image as the origin. Given two corresponding points, it is a trivial matter to determine the translation that aligns them. Unfortunately, the correspondences are unknown. This problem can 6. Building Object Models from Real Data^ 84 be overcome by using an iterative closest point algorithm similar to that of (Besl Sz McKay, 1992)3. Given two point sets A and B related by an unknown translation, the point correspondences are approximated by finding the closest point in B to each point in A. A translation vector is then calculated by averaging the difference vectors of each of these point matches. This translation is then applied to A and the process is iteratively repeated until A is "sufficiently close" to B. The success of this process hinges on the ability to use closest point matches in place of the (unknown) point-to-point correspondences. In the context of surfaces recovered from gradient data, each point in the point set obtained by Equation 6.5 can be tagged with the gradient at that point. Given point sets from two adjacent, overlapping surfaces, the gradients at each point can be used to determine the mutually visible points—i.e. the area of overlap. Having determined the points in the overlapping regions, the iterative closest point algorithm can be applied to register the point sets. Registering surfaces in this way results in a dense cloud of points that have all been brought into a common coordinate system. In order to provide a more succinct representation of the object model, the point cloud was aligned with the 3D origin and converted to a spherical coordinate representation (r, 0, 0). The model is generated as an array R(9, 0) which stores an r value for each (0, 0). When multiple points map to the same (0, 0), their r values are averaged. This representation is suitable as long as the objects to be modelled are star-shaped (i.e. every ray emanating from the origin intersects the surface of the object exactly once). The model building process may now be summarized as follows: 1. Reconstruct surfaces from multiple views. 313es1 and McKay solve for rotational as well as translational alignment. 6. Building Object Models from Real Data^ 85 2. Convert surfaces into point set representation and apply known inter-view rotation. 3. Determine overlapping region of adjacent surfaces by applying mutual visibility calculation to gradients. 4. Use the overlapping regions to register the point sets with the iterative closest point algorithm. 5. Merge the registered point sets into one point cloud representing the entire 3D surface of the object. 6. Align the object point cloud with the origin and convert to to the more compact spherical coordinate representation. Results for Full Surfaces The model building process described in the previous section was applied to the data gathered for four views of the ellipsoid separated by 9004. Surfaces were reconstructed for each view using (a) no smoothing, (b) smoothing with 121, and (c) smoothing with 112. The constraint metric used was ps,, in each of the smoothed cases. Height was recovered from gradient using RSD • The model recovered with no smoothing is shown in Figure 6.11. The (0, 0) space of the spherical coordinate system has been sampled on a 36 x 36 grid. The models recovered after surface reconstruction with 121 smoothing and C22 smoothing are shown in Figures 6.12 and 6.13. As would be expected from the similarity of the surfaces recovered in each view, the models are also similar. While the recovered models appear qualitatively similar to the ground truth shape of the object (Figure 6.4), there are a number of problems. The surfaces reconstructed from 'Nothing in the method described requires 900 rotations. The only requirement is that there be sufficient overlap between views to solve the registration problem. 6. Building Object Models from Real Data ^ Figure 6.11: Model built with no smoothing. 86 6. Building Object Models from Real Data^ Figure 6.12: Model built using C21 smoothing. 87 6. Building Object Models from Real Data ^ Figure 6.13: Model built using C12 smoothing. 88 6. Building Object Models from Real Data (a) Top (row 40) ^ ^ (b) Middle (row 140) ^ 89 (c) Bottom (row 240) 0 -40 -20 -20 -40 -60 Figure 6.14: Cross-section of registered surfaces with no smoothing. each view do not fit together particularly well, and the apparent coherence of the models is the result of extensive averaging—each r value in the spherical coordinate representation was calculated from as many as several hundred 3D points from the reconstructed surfaces. A clearer picture of the results of the the alignment is obtained by taking slices through the registered surfaces, as is done in Figures 6.14, 6.15, and 6.16. As can be seen, the alignment between adjacent surfaces is poor, and correcting the alignment within one cross-section would make it worse in another. Figure 6.17 shows the corresponding slices through the mathematical definition of the ellipsoid, providing a ground truth reference. Results for Trimmed Surfaces The lack of boundary conditions at the occluding boundaries of the object was thought to be a reason for the poor alignment of the surfaces. Since smoothing is imposed across the boundaries, the sharp edges will be smoothed away and bends introduced into the surface. In order to test this hypothesis, an experiment was conducted in which the original 6. Building Object Models from Real Data (a) Top (row 40) ^ ^ (b) Middle (row 140) 90 ^ (c) Bottom (row 240) Figure 6.15: Cross section of registered surfaces with 01 smoothing. - (a) Top (row 40) ^ (b) Middle (row 140) ^ (c) Bottom (row 240) -60-$-20^40 60 -60 Figure 6.16: Cross-section of registered surfaces with 12 smoothing. 6. Building Object Models from Real Data^ (a) Top (z = 79.5) (b) Middle (z = 0.0) (c) Bottom (z = -79.5) Figure 6.17: Cross-section of ground truth definition of ellipsoid. 91 6. Building Object Models from Real Data (a) Top (row 5) ^ ^ (b) Middle (row 35) ^ 92 (c) Bottom (row 65) Figure 6.18: Cross-section of registered surfaces for trimmed data with no smoothing. data for each view was trimmed to produce a sub-image with the gradient known at every point, so that no object boundaries were present in the image. The smoothing and reconstruction algorithms impose natural boundary conditions at the edge of the image, which could be expected to produce better results than the case in which object boundaries were smoothed over. A more sophisticated alternative would be to incorporate natural boundary conditions along irregular boundaries within the image. Such boundaries could be obtained either from a silhouette image or by thresholding the gradient map (so as to avoid the extreme gradient values encountered at the occluding boundaries of an object). Handling of such boundaries and discontinuities was an issue consciously avoided from the outset. Trimming the data so that no object boundaries fall within the image was a pragmatic choice that allowed a hypothesis to be tested within the framework of the algorithms that had already been developed. The results are indeed much improved, as is seen in Figures 6.18, 6.19, and 6.20, which show slices through the aligned surfaces for reconstruction with the trimmed data. The surfaces align well, even near the edges of the data. 6. Building Object Models from Real Data (a) Top (row 5) ^ ^ (b) Middle (row 35) 93 ^ (c) Bottom (row 65) Figure 6.19: Cross-section of registered surfaces for trimmed data with Ch smoothing. (a) Top (row 5) ^ (b) Middle (row 35) ^ (c) Bottom (row 65) Figure 6.20: Cross-section of registered surfaces for trimmed data with 122 smoothing. 6. Building Object Models from Real Data (a) Top (row 5) ^ ^ (b) Middle (row 35) ^ 94 (c) Bottom (row 65) Figure 6.21: Cross-section of registered trimmed surfaces with no smoothing. However, inadequate boundary conditions do not seem to be the root of the problem. Another experiment was performed in which the reconstructed surfaces, rather than the original data, were trimmed. Trimming was performed on the surfaces recovered from the full original data, with the regions extracted exactly corresponding to the trimmed regions in the previous experiment. These trimmed portions of the surfaces were then aligned. Results are shown in Figures 6.21, 6.22, and 6.23, and are seen to be as good as those obtained by trimming the data and reconstruction over a region interior to the object. This result would not be expected if errors induced at the object boundaries were propagating over the surface and making alignment impossible. The difficulties in aligning the surfaces thus seem due to the accumulation of errors across the reconstructed surface. In each of these cases, the differences between results obtained with no smoothing and those obtained with 121 or 122 smoothing are minimal. Viewpoint invariance seems to be possible when the extent of the regions to be aligned is not too great, as otherwise errors accumulate across the surface. Unlike range data, slope data do not provide any fixed 3D points to which the surface can be "anchored," 6. Building Object Models from Real Data (a) Top (row 5) ^ ^ (b) Middle (row 35) 95 ^ (c) Bottom (row 65) Figure 6.22: Cross-section of registered trimmed surfaces with C21 smoothing. (a) Top (row 5) ^ (b) Middle (row 35) ^ (c) Bottom row 65) Figure 6.23: Cross-section of registered trimmed surfaces with C2 2 smoothing. 6. Building Object Models from Real Data^ 96 meaning that errors can propagate unchecked. This seems to have more effect on the ability to align surfaces recovered from multiple views than does the choice of smoothing functiona15. An additional difficulty for the alignment algorithm is the nature of the example that was chosen. For a smooth object like the ellipsoid, cross-sectional segments that are being aligned are smooth, relatively featureless curves. This makes it difficult to perform accurate alignment, and it is not surprising that the cross-sections are "spread," since there are no features to hold them together. This may be the reason the reconstructed models invariably have a wider cross section than the ground truth model (compare the various alignment results with Figure 6.17). The following section presents results for a more complex surface which provides features that aid the alignment process. 6.3 Facial Reconstruction Experiments were performed using data gathered from a pottery doll's head. Photometric stereo was used to gather gradient data for two views of the head separated by a 300 rotation about the vertical axis. Figure 6.24 shows the red channel for the two views of the doll's head. The region actually used for reconstruction is delimited by a rectangle overlaid on each view. The raw p and q data for the first view are shown in Figure 6.25. (Note that this data set mostly avoids the extreme values of p and q found at occluding boundaries of objects, which were present in the ellipsoid data.) Surfaces were reconstructed for the two views and aligned using the iterative closest point algorithm described above in Section 6.2.26. The reconstruction and alignment process were performed using no smoothing, f2i 'Though this might not be true of data obtained from a noisier source than photometric stereo. 'Because of the way the data has been trimmed, the mutual visibility condition cannot be used to determine the region of overlap, and so it was delimited manually. 6. Building Object Models from Real Data^ Figure 6.24: Images of doll's face showing regions from which data was selected. 97 6. Building Object Models from Real Data^ (a) p component 150 (b) q component 150 Figure 6.25: Gradient data for the first view of the doll's head. 98 6. Building Object Models from Real Data^ 99 smoothing, and 12 smoothing. The surfaces recovered for the first view are shown in Figures 6.26, 6.27, and 6.28. The reconstruction is similar in the three cases, though the effects of smoothing are obvious, particularly in the 112 case (this may be due in part to inadequacies in the A normalization scheme). The viewpoint invariance of the reconstructions can be seen in Figures 6.29, 6.30, and 6.31, which show quite successful alignment. Models were not constructed because the spherical coordinate representation does not lend itself to the non-convex shape of the doll's face. 6. Building Object Models from Real Data^ 100 Figure 6.26: Two angles on the face reconstructed from the first view without smoothing. 6. Building Object Models from Real Data^ 101 2.7ARR,.1, 9.A.T4,221■ Figure 6.27: Two angles on the face reconstructed from the first view with C21 smoothing. 6. Building Object Models from Real Data^ 102 Figure 6.28: Two angles on the face reconstructed from the first view with 122 smoothing. 6. Building Object Models from Real Data ^ 103 (b) Eyes (row 40) 60 (c) Nose (row 75) (d) Mouth (row 135) 60 Figure 6.29: Face surface alignment without smoothing. Slices are plotted through representative features of the face. (a) Forehead (row 10) so (b) Eyes (row 40) so (c) Nose (row 75) (d) Mouth (row 135) so Figure 6.30: Face surface alignment with S21 smoothing. 6. Building Object Models from Real Data^ (b) Eyes (row 40) so -1 -20 -40 (c) Nose (row 75) (d) Mouth (row 135) 60 Figure 6.31: Face surface alignment with 122 smoothing 104 Conclusions and Future Work 7 V iewpoint invariant surface reconstruction from gradient data has been addressed in a variational framework. The property of invariance is important for the merging of information from multiple views, such as for automatic acquisition of 3D object models, for recognition of objects in arbitrary poses, and for accurate surface reconstruction and measurement generally. An invariant regularizing functional based on the energy of a deforming thin plate was presented. Three quadratic approximations to the smoothing component of the functional were derived-421 is a small deflection approximation; 122 is an approximation based on first derivative estimates; and 113 is based on estimates of the principal directions at each point of the surface. An invariant data fit metric based on the angle between surface normals was also derived. The evaluation method employed was to obtain gradient data for a surface viewed from different directions, apply each reconstruction functional to each view, and match the surfaces from different views to see if the same shape was recovered. Results were presented for synthetic and real data gathered using photometric stereo. Real data experiments were performed with both a simple ellipsoid of known shape and a more complex doll's face (without ground truth). A brief synopsis of the results is presented in Section 7.1. Various issues were raised during the course of experimentation, and are discussed in Section 7.2. Ideas for extensions and refinements to the work are outlined in Section 7.3. 105 7. Conclusions and Future Work ^ 106 7.1 Summary of Results Some degree of invariance was achieved, though the choice of smoothing functional proved to be less important than anticipated. The invariant data fit constraint metric proved unsuitable as part of the regularizing functional, and was abandoned. It was proved that there is no invariant, quadratic data fit term, and a non-invariant metric was adopted, thereby somewhat compromising the intent of the thesis to perform invariant reconstruction. C21 is equivalent to the quadratic variation stabilizer used in previous work by Grimson (Grimson, 1981) and Terzopoulos (Terzopoulos, 1988), and produced fairly invariant results when gradient values were relatively small. 12 performs similarly, but seemed more robust in the presence of (synthetic) noise. It was realized that the method of implementing 523 was inadequate, so only a few experiments were run with this functional. Real data from photometric stereo were introduced in the context of building full 3D models of objects by splicing together surfaces recovered from multiple views. Invariance is needed for the different views to be meshed together. Results were somewhat equivocal, as other factors such as boundary conditions and data collection seem to play a greater role than the choice of smoothing functional. Both ni and 12 smoothing appeared to give similar results. Experiments were performed for four views of a simple ellipsoid of known shape and for two views of the more complex surface of a doll's face. Models were generated for the ellipsoid with some success. The biggest problem is suggested to be the propagation of errors across the surface, as gradient data provides only relative constraints on local surface height. Knowledge of suitable boundary conditions can be used to mitigate this problem. Experiments were also conducted with data from a more complicated doll's head object. The presence of distinctive features on the surface led to more successful alignment. 7. Conclusions and Future Work^ 107 Again, little difference was observed between the two smoothing functionals. The coherence of the data obtained from photometric stereo meant that global smoothing was perhaps unnecessary for either the ellipsoid or the doll's face. 3D model acquisition by splicing surfaces obtained from multiple views appears feasible, but perhaps not optimal. An alternative approach is to align and integrate the gradient data from multiple views before attempting to reconstruct surface height. For some applications, orientation-based representations of shape that do not explicitly require surface height can be used (Li, 1993), and the need to construct surface height from gradient can perhaps be avoided altogether. 7.2 Issues Raised A variety of issues have been raised by this work, many of which were not foreseen at the outset. These problems are primarily concerned with the form of the variational formulation used to solve the surface reconstruction problem. Regularizing a problem generally involves choosing a smoothing term and a fit to the data term. Appropriate choices for these norms are imperative to a successful solution. This was made painfully obvious by the failure of the invariant data fit term that was proposed. This data fit term does not match well with the smoothing functionals and leads to excessive smoothing. The constraint metrics suggested are not L2 norms, which may be the source of the difficulty (Tikhonov Si Arsenin, 1977). A major goal of this thesis was to compare results obtained from different regularizing functionals combining different smoothing and data fit terms. It was rapidly realized that in order to do sensible comparisons, it is necessary to "normalize" the regularizing parameter A controlling the tradeoff between smoothness and data fit. If this is not done, then different functionals will produce different degrees of smoothing and it becomes hard 7. Conclusions and Future Work^ 108 to make meaningful comparisons. This is an issue that has apparently not been dealt with in the literature. For example, Stevenson and Delp use a constant value for A across all the smoothing functionals that they compared (Stevenson & Delp, 1992). The smoothing functional chosen was the energy of an idealized thin plate under deformation. A parameter v of the formulation is an elastic property of the material being modelled. Variation in this parameter was observed to have little effect on the reconstruction, as has been noted previously (Terzopoulos, 1988). Smoothing was applied to the gradient data, and a separate stage of processing integrated the gradients to yield surface height. The choice of functional used for the integration is not considered to be important, particularly when the gradient map is close to integrable. Integrability was not enforced in the present work, but should have been. Viewpoint invariance was the guiding light for the choice of smoothing functionals, but the emphasis was placed on deriving functionals that were invariant on a pointwise basis. An issue that was overlooked was the change in surface parameterization that results from resampling the surface on a grid imposed by a different viewing direction. When working with range data, distances in 3D space can be used to aid in solving this correspondence problem, but for gradient data this becomes more difficult. The surface could be reparameterized using coordinates intrinsic to the surface, rather than imposed by the viewpoint. Such a framework readily lends itself to invariant reconstruction. Such a reparameterization method has been developed for positional data (BoIle & Vemuri, 1991). The formulation relies on relative positions between adjacent points, and so could be readily extended to gradient data. Comparison of different smoothing functionals was the primary focus of this work, but the choice of smoothing functional is not the only factor to consider in trying to 7. Conclusions and Future Work^ 109 achieve viewpoint invariance. Other factors such as boundary conditions, tradeoffs between smoothness and data fit, and collection of data may overwhelm the effects of the smoothing functional. In particular, if noise affects the data in a viewpoint dependent manner, then it may be impossible to achieve invariant reconstruction. For the photometric stereo data that was used, the inherent smoothness of the data meant that the application of global smoothing may have been a flawed idea from the start. 7.3 Future Work The techniques developed for surface reconstruction suffer from a number of shortcomings that are pointed out in the previous section. Many of these problems are soluble and various extensions to the work are also possible. Integrability was not enforced during smoothing of the gradient map, an oversight that should be corrected. An easy way to do this is to smooth the gradient map and recover surface height simultaneously, so that numerical estimates of the derivative of the height estimate can be compared to the gradient map to derive a penalty term that encourages integrability. This method has been successfully employed in previous work on shape from shading (Horn, 1990; Ascher & Carter, 1993). Another alternative is to project the gradient map onto the "closest" integrable map (Frankot & Chellappa, 1988), either after each smoothing iteration or after the smoothing is complete. Once an integrable gradient field is thus derived, surface height can be recovered by direct integration. Testing different smoothing functionals was the original intent of the thesis, and the issue of discontinuities and boundary conditions was consciously ignored. Tests with real data showed this to be a shortcoming. The importance of boundary conditions for height reconstruction from gradient data has been demonstrated in (Li, 1991), and their 7. Conclusions and Future Work ^ 110 inclusion should be relatively straightforward. A possibility that has not been explored is the use of alternative representations of slope to aid in the derivation of invariant functionals. Unit normals or stereographic coordinates (Horn, 1990) may prove easier to work with in an invariant manner. Integration of slope and range data in the context of invariant surface reconstruction may prove fruitful. Slope and range data have previously been combined for reconstruction purposes (Harris, 1987), and invariant reconstruction from range data has received attention (Stevenson & Delp, 1992). Combining range and gradient data allows for integration of different visual modules and thus greater robustness in surface recovery. In addition, tagging gradient points with a range value may aid in deriving functionals that are invariant to the reparameterization of the surface induced by a new viewing direction. (Alternatively, the surface could be reparameterized in object-centered coordinates (Bolle & Vemuri, 1991).) Another source of data to be integrated is curvature data, such as can be determined from photometric stereo (Woodham, 1990). Curvature information could be used to independently determine principal directions on the surface and thus provide a better way of parameterizing C23. The expression for the energy density of a thin plate is based on changes in curvature (Pogorelov, 1988). Curvature data could be used to define a non-planar rest state for the plate that is fitted to the data, a novel idea that seems appropriate for the recovery of the complex surfaces one expects to find in typical scenes. For 3D model acquisition, reconstructing explicit surfaces in several views and then splicing the surfaces together leads to inconsistencies and errors at the points of overlap. An alternative is to match gradient data (or, rather, surface normals) directly from multiple views and then reconstruct the full 3D surface from the merged data. For example, it is possible to reconstruct a convex polyhedron from its extended Gaussian image (Little, 1985). Note that the in the work presented here, 3D model acquisition 7. Conclusions and Future Work^ 111 was intended as a means to test the invariance of the reconstruction algorithms, rather than an end in itself. Bibliography Anton, H. (1987). Elementary Linear Algebra, 5th edition. John Wiley & Sons, Toronto. Ascher, Ti. M. & Carter, P. M. (1993). A multigrid method for shape from shading. SIAM Journal of Numerical Analysis, 30(1), 102-115. Besl, P. J. & McKay, N. D. (1992). A method for registration of 3D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2), 239-256. Blake, A. (1989). Comparison of the efficiency of deterministic and stochastic algorithms for visual reconstruction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 2-12. BoIle, R. M. & Vemuri, B. C. (1991). On 3D surface reconstruction methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(1), 1-13. Burns, J. B., Weiss, R. S. & Riseman, E. M. (1993). View variation of point-set and linesegment features. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(1 ). Chen, H. H. & Huang, T. S. (1988). A survey of the construction and manipulation of octrees. Computer Vision, Graphics, and Image Processing, 43, 409-431. Courant, R. & Hilbert, D. (1989). Methods of Mathematical Physics, volume 1. John Wiley & Sons, Toronto. Faddeev, D. K. & Faddeeva, V. N. (1963). Computational Methods of Linear Algebra. W. H. Freeman and Company, San Francisco, Translated from Russian by R. C. Williams. Faugeras, 0. D., Hebert, M., Mussi, P. & Boissonnat, J. D. (1984). Polyhedral approximation of 3D objects without holes. CVGIP, 25, 169-183. Ferrie, F. P., Lagarde, J. & Whaite, P. (1990). Recovery of volumetric object descriptions from laser rangefinder images. In European Conference on Computer Vision, pages 387-396. 112 113 Bibliography^ Frankot, R. T. & Chellappa, R. (1988). A method for enforcing integrability in shape from shading algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10, 439 451. - Goad, C. (1983). Special purpose automatic programming for 3D model based vision. In Proceedings of the ARPA Image Understanding Workshop. - Grimson, W. E. L. (1981). From Images to Surfaces: A Computational Study of the Human Early Visual System. The MIT Press Series in Artificial Intelligence. The MIT Press, Cambridge, Massachusetts. Harris, J. G. (1987). A new approach to surface reconstruction: the coupled depth/slope model. In International Conference on Computer Vision, pages 277 283. IEEE. - Horn, B. K. P. & Brooks, M. J. (1986). The variational approach to shape from shading. Computer Vision Graphics and Image Processing, 33, 174 208. - Horn, B. K. P. (1986). Robot Vision. MIT Press/McGraw-Hill, Cambridge, MA. Horn, B. K. P. (1990). Height and gradient from shading. International Journal of Computer Vision, 5, 37 75. - Jastrzebski, Z. D. (1976). The Nature and Properties of Engineering Materials, 2nd edition. John Wiley & Sons, Toronto. Koenderink, J. J. & van Doom, A. J. (1980). Photometric invariants related to solid shape. Acta Optica, 27, 918-996. Li, Y. (1991). Surface reconstruction by coupled depth/slope model with natural boundary conditions. Technical Report 91-24, University of British Columbia Department of Computer Science, Vancouver. Li, Y. (1993). Orientation - based Representations of Shape and Attitude Determination. PhD thesis, University of British Columbia, Vancouver, British Columbia. Little, J. J. (1985). Recovering shape and determining attitude from extended Gaussian images. PhD thesis, The University of British Columbia, Vancouver, BC, Ph.D., supervisor R.J. Woodham. Lowe, D. G. (1987). Three-dimensional object recognition from single two-dimensional images. Artificial Intelligence, 31, 355 395. - Maddox, I. J. (1988). Elements of Functional Analysis, 2nd edition. Cambridge University Press, Cambridge. Bibliography^ 114 Poggio, T., Torre, V. & Koch, C. (1985). Computational vision and regularization theory. Nature, 317,314-319. Pogorelov, A. V. (1988). Bending of Surfaces and Stability of Shells. American Mathematical Society, Providence, Rhode Island, Translated from Russian by J. L. Schulenberger. Pogorelov, A. V. (undated). Differential Geometry. P. Noordhoff N. V., Groningen, The Netherlands, Translated from Russian by L. F. Boron. Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. (1988). Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, Cambridge. Roberts, L. G. (1965). Machine perception of three dimensional solids. In Tippett, J. T., Berkowitz, D., Clapp, L., Koester, C. & Vanderburgh, A., editors, Optical and Electro-optical Information Processing, pages 159-197. MIT Press, Cambridge, MA. Stevenson, R. L. & Delp, E. J. (1990). Invariant reconstruction of curves and surfaces with discontinuities with applications in computer vision. Technical Report TR-EE 90-35, Purdue University. Stevenson, R. L. & Delp, E. J. (1992). Viewpoint invariant recovery of visual surfaces from sparse data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(9), 897-909. Szeliski, R. & Tonnesen, D. (1991). Surface modeling with oriented particle systems. Technical Report 91/14, Digital Equipment Corporation, Cambridge Research Lab. Szeliski, R. (1990). Real-time octree generation from rotating objects. Technical Report 90/12, Digital Equipment Corporation, Cambridge Research Lab. Szeliski, R. (1990). Shape from rotation. Technical Report 90/13, Digital Equipment Corporation, Cambridge Research Lab. Terzopoulos, D. (1988). The computation of visible surface representations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(4), 417-438. Tikhonov, A. N. & Arsenin, V. Y. (1977). Solutions of ill-posed problems. V.H. Winston & Sons, Washington, DC. Tsai, P-S. & Shah, M. (1992). A simple shape from shading algorithm. Technical Report CS-TR-92-24, University of Central Florida, Orlando, Florida. Bibliography^ 115 Vemuri, B. C. & Skofteland, G. (1992). Invariant surface and motion estimation from sparse range data. Journal of Mathematical Imaging and Vision, 1(1), 43-64. Volterra, E. & Gaines, J. H. (1971). Advanced Strength of Materials. Prentice-Hall, Inc., Englewood Cliffs, NJ. Wahba, G. (1990). Spline Models for Observational Data, volume 59 of Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics, Philadelphia. Wang, W. & Iyengar, S. S. (1992). Efficient data structures for model-based 3D object recognition and localization from range images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(10), 1035-1045. Woodham, R. J. (1980). Photometric method for determining surface orientation from multiple images. Optical Engineering, 19,139-144. Woodham, R. J. (1990). Surface curvature from photometric stereo. Technical Report 90-29, University of British Columbia. APPENDIX A Invariant Quadratic Constraint Metric Impossible A constraint metric for fit to gradient data is a function p(4, zy, p, q) that measures the "closeness" of the gradient (zz, zy) at a point on an estimated surface to the corresponding data point (p, q). To achieve invariant surface reconstruction requires an invariant constraint metric. For ease of minimization, a constraint metric quadratic in zs and zy is desirable. Unfortunately, these two qualities are incompatible—no quadratic function of zs and zy is invariant to rotations. Consider the surface normals (—zz, —zy, 1) and (—p, — q, 1) corresponding to surface and data gradients. A metric that depended only on the angle between these surface normals would be invariant to rotations of the data and surface. Intuitively, it seems reasonable that such a metric would not be quadratic—if the metric is independent of the length of the vectors, then some form of normalization must be involved, which is a non-quadratic procedure. An arbitrary constraint metric that is quadratic in zs and zy can be written as p(4, zy, p, q) = ki(p, q)4+1c2(p, q)zy2 d - k3(p, q)4 zy+ k4(P, q)zz +k5(P, q)zy+k6(p, q) (A.1) where the ki(p, q) are arbitrary functions of p and q. If p is invariant to rotations, then p(zx, zy, p, q) = p(z'x, zy' , p' , q')^ where z'z, zy' , p' , q' are the rotated gradients. 116 (A.2) ^ ^ A. Invariant Quadratic Constraint Metric Impossible^ 117 The rotation of a surface normal vector can be calculated by multiplying by a rotation matrix n / = Rn (nix, rey, n'z) = r11 r12 r13 nx r21 r22 r23 fly r31 r32 r33_ n, (A.3) The rotated gradients can thus be calculated as I ,nz^ riizx + ri2zy — ri3 zz = -- / = nz^r31zx + r32zy — r33 ny^r2izx + r22zy — r23 i ZY = nx^r31zx + r32zy — r33 (A.4) (A.5) Substituting these expressions for the rotated gradients into Equation A.1 and plugging the results into Equation A.2, cross-multiplying, and then equating coefficients of like powers of z, and zy yields 15 equations which must be satisfied by the coefficient functions ki if p truly is invariant (1) —7-73 ki (pi , V) — r33 k2(p' , V) — r13 r23 k3(11 , V) + ri3 r33 Ic4(P' , V) + r23 T33 k5 (PI, q') ± 713 Ics(P, q) — 7-33 k6(p,, v) = 0 (2) 2 ri2 ri3 ki(P', V) + 2 r22 r23 k2(p' , q') ± r13 r22 k3(P', 47') + ri2 r23 k3(P', qf) — r13 T32 k4 (Pt, §0 - ri2 r33 k4(p1) V) r22 T33 k5(Pt, V) - 2 r32 r33 k6 (p, + 1.13 k5(p, q) — r23 r32 k5(P', q") — q) + 2 T32 r33 k6(PI, (1) = 0 (3) —ri2 k1 (p', qf) + 713 k2(P, q) — 12 k2(j)1,44") - ri2 r22 kthi, qi) + ri2 r32 k4(Pf, V) — 2 r32 r33 k5(p , q) + r22 r32 k5(pf , V) + 7-32 k6(p , q) 712 Ics(P' ,V) = 0 (4) —2 r32 r33 k2(P, q) + 712 k5(p , q) = 0 — A. Invariant Quadratic Constraint Metric Impossible ^ 118 (5) 7-32 k2(P, q) = 0 (6) 2 r11 ri3 Ici(P', V) + 2 r21 r23 k2(P', V) + r13 r21 k3(P', V) + rn r23 k3(p1) qi) + 1j3 k 4(P , q) - r13 r3i k4(P', ql) - r21 r33 k5(P', rn r33 k4(Pf)qi) - r23 r31 k5(P', V) - V) - 2 r31 r33 ks(p, q) + 2 r31 r33 ks(P' , 42') -,--- 0 (7) -2 rn r12 ki(Pf, (11) - 2 r21 r22 k2(p11 (1) + 713 k3(p , q) - r12 r2i k3(P', q') r11 r22 k3(P', V) - 2 r32 r33 k4(p, q) + r12 7.31 Ic4(P', V) + rn r32 k4(p1, q') - 2 r31 r33 k5(p, q) + r22 r31 k5(P', V) + r2i r32 k5(PI, qi) + 2 r31 r32 k6(p, q) 2 r3i r32 k6(Pf, qi) = 0 (8) -2 r31 r33 k2(P, q) - 2 r32 r33 k3(p, q) + 712 k4(P, q) + 2 r31 r32 k5(P) q) = 0 (9) 2 r3i r32 k2(P, q) + 42 k3(P, q) = 0 (10) 713 ki(P, q) - rh ki(P', V) - gi k2o", v) - rn r21 k3(P', V) - 2 r3i r33 k4(p, q) + rn r3i k4(11 , V) + r2i r3i ics(P', 40 + r3i k6(P, q) ril (11) -2 ks(p' , q1) = 0 r32 1'33 ki(p, (12) 712 ki(p, q) - 2 r3i r33 k3(p , q) + 2 r31 r32 k4(P)q) ± rli. k5(p , q) = 0 q) + r3i k2(p , q) + 2 r31 r32 k3(p , q) = 0 (13) -2 r31 r33 kJ. (p, q) + 41 k4(p, q) = 0 (14) 2 r3i r32 ki(p, q) + 71k3(p, q) = 0 (15) 41 ki(p, q) = 0 These equations place very harsh constraints on the coefficient functions: (i) Equation (15), together with the fact that, in general, the components of the rotation matrix R are non-zero, implies Ici(p, q) = 0. (ii) Similarly, (5) indicates k2(p, q) = 0. (iii) Equation (9) and (ii) imply k3(p, q) = 0. A. Invariant Quadratic Constraint Metric Impossible^ 119 (iv) Equation (13) and (i) imply k4(p,q) = 0. (v) Equation (4) and (ii) imply k5(p, q) = 0. So it must be the case that p(zz, zy, p, q) = k6(p, q), which is independent of the values of zx and zy and thus clearly useless as a constraint metric. APPENDIX B Interpolation by Iterated Local Averaging Given a set of data points 7) = {z1} at a sparse set of points (i, j) on a regular M x N grid, we want to fill in the missing values in the grid by interpolating the data points. A simple way of doing this is iterated local averaging (ILA), which repeatedly replaces the value at a grid point (i, j) with the average of the values over the local neighbourhood. The data values act as "anchors" and are left unchanged. The algorithm is not particularly efficient, taking more iterations than necessary, but each iteration is rapid. The algorithm can be stated as follows: while not done for i = 1 to M for j = 1 to N if Z[i,j] not in data set sum = 0 n=0 if i > 1 then sum += Z[i-1,j]; n += 1 = if i < M then sum += Z[i+1,j]; n +1 if j > 1 then sum += Zri,j-1]; n += 1 if i < N then sum += Z[i,j+1]; n += 1 Z[i,j] = sum/n endif endfor endfor endwhile A suitable stopping criterion is when no value changes by more than some small value E. As presented, the algorithm sweeps sequentially across the image—conceptually it should act in parallel, though (in the limit) parallel and serial implementations are equivalent. The membrane energy functional measures the deformation energy of an idealized 120 B. Interpolation by Iterated Local Averaging ^ 121 Figure B.1: Sparse data on a grid. membranous surface. As an interpolation method, it fits a piecewise planar surface to a sparse set of data. The membrane functional is ff z: + zy2 dx dy The ILA algorithm can be shown to be equivalent to steepest descent minimization of the membrane equation, using the forward difference discretization ix = Ziti+1^Ziti iy = Zi+i,j - Zi j Figure B.1 shows a sparse data set consistent with a cylinder aligned along the axis. Z values are defined at 101 of 625 points on a 25 x 25 grid. Figure B.2 shows the interpolated surface. The reconstruction took 107 iterations to reach a point where no surface height value changes by more that 0.000001 in a single iteration. Implemented in C, this procedure took about 2 seconds on a Sparc workstation. B. Interpolation by Iterated Local Averaging ^ Figure B.2: Surface interpolated from sparse data. 122 APPENDIX Conjugate Gradient Descent C The problem is to find the solution vector z which minimizes a given measurement function M(z*). Suppose M can be approximated by the quadratic form 1 M(z) = — 2T Qz — bTz^ 2 (C.1) where Q is an n x n symmetric positive definite matrix, and any constant term has been dropped as irrelevant to the minimization problem. The elements of Q correspond to the coefficients of quadratic combinations of the elements of z the coefficient of zizi if i 0 i qii =^ twice the coefficient of z? if i = j (C.2) while linear terms are of the form biz. Starting at a point zo E V, the task is to generate a sequence {zk } that converges to the function minimum. At each step in the process, let gk be the gradient of the function gk = vM(zk) = Qzk — b (C.3) The chosen direction in which to move at each step will be denoted dk, with do = —go. Finding the line minimum in the direction dk is equivalent to finding ak to minimize the one-dimensional function hk(a) = M(zk + adk) ^ (C.4) For the quadratic case, we can solve the one-dimensional minimization analytically by setting the derivative to zero and solving for a dhk (ak) = akdTk'Qdk + z,Ndk — bT dk = 0^(C.5) da 123 C. Conjugate Gradient Descent^ yielding ak = „T A (Z71: — bT )d k^Sk4-ik dTC2dk^Cil:C2dk 124 (C.6) We now move to the new point Zk akdk (C.7) All that remains is to specify how to choose the new optimized direction dk+i Adk - gk+i (C.8) where a^gki-1 • gk+1 Pk =^ gk • gk (C.9) This is the method of Fletcher-Reeves. (Press et al, 1988) suggest that the Polak-Ribiere variation works better when the function being minimized is not quadratic. The only difference is in the calculation of lak Ok = (gk+i - gk) • gk+1 gk • gk (C.10) The net effect of the conjugate gradient method is that all the dk are mutually conjugate and the gie are mutually orthogonal (i.e. drQdj = 0 and gTgi, for all i j). The process is analogous to the Gram-Schmidt method for finding an orthogonal set of basis vectors for an inner product space (Anton, 1987). As with steepest descent, the iterative process is halted when suitable termination criteria are met. The gradient is zero at the solution, so the norm of the gradient can be used as a stopping criterion. Since the minimization problem is equivalent to the solution of the linear system Qz = b, we see that the gradient is just the residual Qi — b. Alternatively, iteration can be halted when the solution remains relatively unchanged from one iteration to the next. This was the approach chosen, with a threshold being 125 C. Conjugate Gradient Descent^ iteration Figure C.1: Maximum change in any variable as a function of iteration number This value was used as a convergence criterion during conjugate gradient descent. applied to lizk — zk_ir —i.e. the process was halted when no element of z changed by more than some small constant s in a given iteration. This seemed to give reasonable results. Figure C.1 shows the value of this norm for a typical run. The example is for gradient map smoothing of the first synthetic surface (shown in Figures 5.2 and 5.3) using (22 and ps,,, for which 43 iterations were sufficient for convergence to the 0.001 level. Note that the notion of conjugate directions is really only meaningful for quadratic forms, as it depends on the matrix Q. Under the assumption that the function M is approximately quadratic, the calculation can, however, proceed without explicit knowledge of Q, provided that the gradient can be determined and the one-dimensional minimization required to determine cek can be performed. In the system as implemented, the analytic solution for ak is used when the function is quadratic, while Brent's minimization method (Press et al, 1988) is used in the absence of a quadratic form. In addition, the Fletcher-Reeves technique is used for quadratics, while the Polak-Ribiere variant is used for non-quadratics.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Viewpoint invariant surface reconstruction from gradient...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Viewpoint invariant surface reconstruction from gradient data King, Yossarian Y. 1993
pdf
Page Metadata
Item Metadata
Title | Viewpoint invariant surface reconstruction from gradient data |
Creator |
King, Yossarian Y. |
Date Issued | 1993 |
Description | Surface reconstruction recovers complete and coherent surfaces from scattered, noisy data. Viewpoint invariant reconstruction allows seamless combination of multiple views, recognition of objects in arbitrary poses, and consistent quantitative measurements. Viewpoint invariant reconstruction from gradient data is studied in a variational framework. An invariant smoothing functional based on the bending energy of a thin plate is derived. The functional is nonconvex, and three convex approximations are presented. An invariant but nonconvex metric for measuring fit to gradient data is given, based on the angle between unit surface normals. A convex constraint metric is obtained from the second order Taylor series approximation. The chief goal of this thesis is to evaluate and compare the different functionals, with viewpoint invariance the prime evaluation criterion. The approximately invariant regularizing functionals are discretized on a regular grid using a finite difference method. Reconstruction from gradient data proceeds in three stages: interpolation of the initial data; smoothing of the gradient map; and recovery of height from gradient. Data were generated for multiple views of a synthetic surface. Each reconstruction method is applied to recover a surface for each view. The viewpoint invariance of the reconstruction is evaluated by measuring, qualitatively and quantitatively, how well the surfaces from different views match up. Experiments are performed with real data from photometric stereo in the context of building 3D object models by fitting together surfaces reconstructed from multiple views. Viewpoint invariance is important in order for the different surfaces to fit together. Multiple views of a simple ellipsoid of known shape allows accurate evaluation of the results, while data gathered from a doll's face provides more of a challenge to the reconstruction techniques. |
Extent | 5147383 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Date Available | 2008-09-16 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0051364 |
URI | http://hdl.handle.net/2429/2032 |
Degree |
Master of Science - MSc |
Program |
Computer Science |
Affiliation |
Science, Faculty of Computer Science, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 1993-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_1993_fall_king_yossarian.pdf [ 4.91MB ]
- Metadata
- JSON: 831-1.0051364.json
- JSON-LD: 831-1.0051364-ld.json
- RDF/XML (Pretty): 831-1.0051364-rdf.xml
- RDF/JSON: 831-1.0051364-rdf.json
- Turtle: 831-1.0051364-turtle.txt
- N-Triples: 831-1.0051364-rdf-ntriples.txt
- Original Record: 831-1.0051364-source.json
- Full Text
- 831-1.0051364-fulltext.txt
- Citation
- 831-1.0051364.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051364/manifest