UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Efficient reconstruction of 2D images and 3D surfaces Huang, Hui 2008

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.

Item Metadata


24-ubc_2009_spring_huang_hui.pdf [ 10.03MB ]
JSON: 24-1.0066816.json
JSON-LD: 24-1.0066816-ld.json
RDF/XML (Pretty): 24-1.0066816-rdf.xml
RDF/JSON: 24-1.0066816-rdf.json
Turtle: 24-1.0066816-turtle.txt
N-Triples: 24-1.0066816-rdf-ntriples.txt
Original Record: 24-1.0066816-source.json
Full Text

Full Text

Efficient Reconstruction of 2D Images and 3D Surfaces by Hui Huang B.Sc., Wuhan University, China, 2000 M.Sc., Wuhan University, China, 2003 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy in The Faculty of Graduate Studies (Mathematics) The University Of British Columbia (Vancouver) November 2008 c© Hui Huang 2008 Abstract The goal of this thesis is to gain a deep understanding of inverse problems arising from 2D image and 3D surface reconstruction, and to design effective techniques for solving them. Both computational and theoretical issues are studied and efficient numerical algorithms are proposed. The first part of this thesis is concerned with the recovery of 2D im- ages, e.g., denoising and deblurring. We first consider implicit methods that involve solving linear systems at each iteration. An adaptive Huber regularization functional is used to select the most reasonable model and a global convergence result for lagged diffusivity is proved. Two mechanisms— multilevel continuation and multigrid preconditioning—are proposed to im- prove efficiency for large-scale problems. Next, explicit methods involving the construction of an artificial time-dependent differential equation model followed by forward Euler discretization are analyzed. A rapid, adaptive scheme is then proposed, and additional hybrid algorithms are designed to improve the quality of such processes. We also devise methods for more chal- lenging cases, such as recapturing texture from a noisy input and deblurring an image in the presence of significant noise. It is well-known that extending image processing methods to 3D triangu- lar surface meshes is far from trivial or automatic. In the second part of this thesis we discuss techniques for faithfully reconstructing such surface models with different features. Some models contain a lot of small yet visually mean- ingful details, and typically require very fine meshes to represent them well; others consist of large flat regions, long sharp edges (creases) and distinct corners, and the meshes required for their representation can often be much coarser. All of these models may be sampled very irregularly. For mod- els of the first class, we methodically develop a fast multiscale anisotropic Laplacian (MSAL) smoothing algorithm. To reconstruct a piecewise smooth CAD-like model in the second class, we design an efficient hybrid algorithm based on specific vertex classification, which combines K-means clustering and geometric a priori information. Hence, we have a set of algorithms that efficiently handle smoothing and regularization of meshes large and small in a variety of situations. ii Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . ix 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 The Problems We Study . . . . . . . . . . . . . . . . . . . . 1 1.1.1 2D image denoising and deblurring . . . . . . . . . . 1 1.1.2 3D surface mesh smoothing with feature preservation 3 1.2 Thesis Overview and Contributions . . . . . . . . . . . . . . 4 2 Implicit 2D Image Reconstruction . . . . . . . . . . . . . . . 7 2.1 Mathematical Model Setting . . . . . . . . . . . . . . . . . . 7 2.2 Regularization Functionals . . . . . . . . . . . . . . . . . . . 8 2.2.1 Modified TV function . . . . . . . . . . . . . . . . . . 9 2.2.2 Huber function with automatic parameter choice . . . 11 2.3 Implicit Integration and Convergence . . . . . . . . . . . . . 13 2.3.1 Lagged diffusivity . . . . . . . . . . . . . . . . . . . . 13 2.3.2 Global convergence proof . . . . . . . . . . . . . . . . 15 2.4 Numerical Techniques for Large Problems . . . . . . . . . . . 18 2.4.1 Regularization parameter selection . . . . . . . . . . . 18 2.4.2 Multilevel continuation . . . . . . . . . . . . . . . . . 18 2.4.3 A multigrid preconditioner . . . . . . . . . . . . . . . 19 2.5 Numerical Experiments and Analysis . . . . . . . . . . . . . 21 2.5.1 Denoising Experiments . . . . . . . . . . . . . . . . . 21 2.5.2 Deblurring Experiments . . . . . . . . . . . . . . . . . 24 iii Table of Contents 3 Explicit 2D Image Reconstruction . . . . . . . . . . . . . . . 29 3.1 Artificial Time Embedding . . . . . . . . . . . . . . . . . . . 29 3.1.1 Continuous time system . . . . . . . . . . . . . . . . . 30 3.1.2 Discrete time system . . . . . . . . . . . . . . . . . . 32 3.2 Denoising Algorithms . . . . . . . . . . . . . . . . . . . . . . 34 3.2.1 Forward Euler . . . . . . . . . . . . . . . . . . . . . . 35 3.2.2 Gradient descent . . . . . . . . . . . . . . . . . . . . . 37 3.2.3 Explicit-Implicit scheme and sharpening . . . . . . . 44 3.3 Iterative Refinement Techniques . . . . . . . . . . . . . . . . 47 3.3.1 Iterative regularization . . . . . . . . . . . . . . . . . 49 3.3.2 Hierarchical decomposition . . . . . . . . . . . . . . . 51 3.4 Deblurring Algorithms . . . . . . . . . . . . . . . . . . . . . 54 4 3D Surface Smoothing with Intrinsic Texture Preservation 64 4.1 Surface Meshes with Intrinsic Texture . . . . . . . . . . . . . 64 4.2 Geometric Laplacian Operators . . . . . . . . . . . . . . . . 67 4.2.1 Continuous and discrete diffusion models . . . . . . . 67 4.2.2 Time discretization and discrete isotropic Laplacian . 68 4.2.3 Discrete anisotropic Laplacian . . . . . . . . . . . . . 69 4.3 Handling Intrinsic Texture . . . . . . . . . . . . . . . . . . . 73 4.4 Implementation, Numerical Results and Discussion . . . . . 75 4.4.1 Implementation details . . . . . . . . . . . . . . . . . 75 4.4.2 Comparative results . . . . . . . . . . . . . . . . . . . 77 4.4.3 Handling mesh irregularity and volume shrinkage . . 80 5 3D Surface Smoothing with Sharpness Preservation . . . . 86 5.1 Surface Meshes with Sharp Features . . . . . . . . . . . . . . 86 5.2 Vertex Classification . . . . . . . . . . . . . . . . . . . . . . . 89 5.2.1 Data set computation for clustering . . . . . . . . . . 89 5.2.2 K-means clustering . . . . . . . . . . . . . . . . . . . 92 5.2.3 Classification refinement . . . . . . . . . . . . . . . . 95 5.3 Crease Detection and Mesh Segmentation . . . . . . . . . . . 98 5.4 Numerical Results and Discussion . . . . . . . . . . . . . . . 101 6 Conclusions and Future Work . . . . . . . . . . . . . . . . . . 108 6.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 6.1.1 Reconstructing images . . . . . . . . . . . . . . . . . 108 6.1.2 Smoothing surface meshes . . . . . . . . . . . . . . . 110 6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 6.2.1 Spectral analysis on mesh Laplacian operators . . . . 112 iv Table of Contents 6.2.2 Anisotropic representation of unorganized points . . . 114 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 v List of Tables 2.1 Work units and runtime vs. the modification parameter ε . . 10 2.2 Huber parameter γ vs. mesh width h . . . . . . . . . . . . . . 22 4.1 Volume comparison on the original, noisy and smoothed models 83 5.1 Sizes of different example models . . . . . . . . . . . . . . . . 106 5.2 Iteration counts and runtime for different models . . . . . . . 106 vi List of Figures 2.1 Denoising the cameraman image . . . . . . . . . . . . . . . . 10 2.2 Denoising a surface with discontinuity . . . . . . . . . . . . . 22 2.3 Denoising a piecewise constant image . . . . . . . . . . . . . . 23 2.4 Reconstructing the blurry Lena image with 1% noise . . . . . 26 2.5 Reconstructing the blurry cameraman image without noise . 26 2.6 Reconstructing the blurry satellite image with 1% noise . . . 26 2.7 Reconstructing the blurry cameraman image with 5% noise . 27 3.1 Exponential vs. Tikhonov vs. Forward Euler . . . . . . . . . 31 3.2 Discrete dynamics for the Lena example . . . . . . . . . . . . 39 3.3 Denoising the Lena image . . . . . . . . . . . . . . . . . . . . 41 3.4 SD vs. LSD for the Lena example . . . . . . . . . . . . . . . 42 3.5 Denoising the cameraman image . . . . . . . . . . . . . . . . 43 3.6 Comparing three different denoising schemes . . . . . . . . . 45 3.7 Comparing Huber switching function with Tukey function . . 46 3.8 Smoothing or sharpening the cameraman image . . . . . . . . 47 3.9 Failure to preserve texture in the fingerprint image . . . . . . 48 3.10 Failure to preserve texture in the Barbara image . . . . . . . 48 3.11 Gradient descent denoising with iterative regularization . . . 51 3.12 Reconstructing the noisy image by hierarchical decomposition 52 3.13 type = ‘motion’, len = 15, theta = 30, η = 1 . . . . . . 57 3.14 SD step size vs. LSD step size for the deblurring example . . 57 3.15 type = ‘log’, sigma = 0.5, η = 1, β = 10−4 . . . . . . . . 58 3.16 type = ‘disk’, radius = 5, η = 1, β = 10−4 . . . . . . . . . 58 3.17 type = ‘unsharp’, alpha = 0.2, η = 1, β = 10−4 . . . . . 58 3.18 type = ‘gaussian’, sigma = 1.5, η = 5, β = 10−4 . . . . . 59 3.19 type = ‘laplacian’, alpha = 0.2, η = 10, β = 10−4 . . . . 60 3.20 Reconstructing blurred and quantized images . . . . . . . . . 62 4.1 Comparing with non-iterative bilateral filtering . . . . . . . . 65 4.2 Rapid unknown noise removal while preserving intrinsic texture 66 4.3 Fast denoising without losing significant features . . . . . . . 71 vii List of Figures 4.4 Removing heavy Gaussian noise from a corrupted model . . . 72 4.5 Illustrating the function λ at each vertex of the monk model . 75 4.6 Comparing AL with MSAL . . . . . . . . . . . . . . . . . . . 78 4.7 Comparing with bilateral filtering . . . . . . . . . . . . . . . . 79 4.8 Comparing with nonlinear anisotropic surface diffusion . . . . 80 4.9 Controlling retained amount of intrinsic texture using MSAL 81 4.10 Scaling for mesh sampling irregularity . . . . . . . . . . . . . 81 4.11 Plot of E(M,Mj) as a function of the iteration counter k . . 83 4.12 Restoring edge sharpness by our multiscale scheme . . . . . . 84 5.1 Mesh smoothing . . . . . . . . . . . . . . . . . . . . . . . . . 87 5.2 Mesh regularization . . . . . . . . . . . . . . . . . . . . . . . 88 5.3 Smoothing the ring model . . . . . . . . . . . . . . . . . . . . 92 5.4 Smoothing the star model . . . . . . . . . . . . . . . . . . . . 94 5.5 Smoothing the fandisk model . . . . . . . . . . . . . . . . . . 96 5.6 Reconstructing the curved ridges . . . . . . . . . . . . . . . . 97 5.7 Smoothing and segmenting the donut model . . . . . . . . . . 99 5.8 Segmenting the star model according to input cutting paths . 100 5.9 Vertex classification on a more challenging CAD-like model . 102 5.10 Smoothing the bearing model . . . . . . . . . . . . . . . . . . 103 5.11 Reconstruction with edge sharpness preservation . . . . . . . 103 5.12 Smoothing and segmenting the torus model . . . . . . . . . . 104 5.13 Smoothing the scanned idol model . . . . . . . . . . . . . . . 105 5.14 Smoothing the scanned Maneki–Neko model . . . . . . . . . . 105 viii Acknowledgments Without the strong support and guidance of my supervisor, Uri Ascher, this thesis would not have been even possible. He opened the door to scientific computing, image processing and geometry modeling for me and encouraged all of my research endeavors. Working with him, I have learned not only how to think through a problem but also how to express my ideas in a precise and concise way. I would like also to sincerely thank my committee members, Chen Greif, Michael Friedlander, Brian Wetton, Michael Ward, Alla Sheffer and Wolf- gang Heidrich. Their comments greatly helped to improve the presentation of this thesis. My life at UBC would not be so enjoyable without my friends. Grateful thanks go to, in particular, Dan Li, Wan Chen, Na Yu and Ewout van den Berg, for their generous helps at all times. I always fail to properly explain to my parents what I am working on and why they are worth studying. Nonetheless, they support my pursuits simply because they are my family. I just cannot thank them enough. Lastly, I must thank my dearest husband, Zhiyan Qu. He has made the last five years the best of my life. Without him life would be nothing but an empty void. Really, there is nothing I can say but THANKS... ix Chapter 1 Introduction Generally speaking, image reconstruction is an inverse problem, consisting of the identification of the input of a given instrument from the knowledge of its output. Our goal in the first part of this thesis is to restore the original image from a given degraded or corrupted version. No matter how good an imaging device is, it must use a finite exposure or integration time, which cannot avoid introducing stochastic noise grown from the random arrival of photons. Blur is similarly omnipresent in any imaging system that uses electromagnetic radiation, e.g., visible light and X-rays, due to optical or atmospheric turbulence. So, reconstructing images snowed by noise or blurred is a fundamental task in image processing, and it arises in many applications, such as geophysical, medical and astronomical imaging, remote sensing and microscopy. In the second part of this thesis, we extend our image reconstruction study to problems in three space dimensions and consider polygonal sur- face meshes. The problem of denoising or smoothing surface meshes has received a lot of attention in recent years due to the increasing use of 3D scanning and acquisition technology. For example, even with high-fidelity laser scanners, acquired meshes are invariably noisy, and therefore require robust smoothing. Similarly, surfaces extracted from volume data, such as those obtained by MRI or CT devices, often contain a significant amount of noise and must be cleaned before the next process. Further, simulations in time involving such surfaces usually require subsequent fast denoising and mesh regularization. Developing an efficient and faithful reconstruction that rapidly removes noise while preserving original features, however, is not a trivial matter. 1.1 The Problems We Study 1.1.1 2D image denoising and deblurring A digital image is generally encoded as a matrix of gray level or color values. In the case of gray level images, x = (x, y) is a point on a 2D grid and 1 m(x) is a real value. In the case of classical color images, m(x) is a triplet of values for the red, green and blue components. This thesis focuses on square or rectangular 2D gray-level images, but all of what we discuss could be extended to color images. One can write the general model to roughly approximate image restoration as b = F (m) + , (1.1) where F is the forward modeling operator, m = m(x) = m(x, y) is the model to be recovered, b is available data, e.g., given noisy or blurry images, and  is omnipresent noise. This inverse problem is generally ill-posed: it might not have a solution in the strict sense, or solutions might not be unique or might not depend continuously on the data. This difficulty can be addressed by regularization, incorporating some a priori information. Such information may be in the form of expected physical properties of the object in terms, for example, of constraints on the size or on the smoothness of the solution. The so-called Tikhonov regularization yields the optimization problem min m T (m) ≡ 1 2 ‖F (m)− b‖2 + βR(m), (1.2) where R(m) is the regularization operator on which the present study fo- cuses, and β > 0 is the regularization parameter. The least-squares norm is applied in the data-fitting term for simplicity, although other more com- plicated norms depending on the noise statistics may occasionally be more suitable. In the more general context of inverse problems involving surface re- construction, the operators F under consideration may vary considerably from an identity in classic denoising problems [118, 119, 135], through a linear, constant coefficient form in deblurring and tomography problems [48, 67, 68, 135], to the solution, say, of a diffusive partial differential equa- tion (PDE) system involving Poisson or Maxwell’s differential equations [20, 32, 65, 94]. Here, our goal is to recover m by solving the optimiza- tion problem (1.2) when F is a linear operator, so it can be represented by a matrix-vector multiplication. This matrix may be ill-conditioned, or even rank-deficient in typical inverse problems, including deblurring, although it is well-conditioned in the case of image denoising. In Chapter 2, the recon- struction algorithms we consider therefore involve solving linear systems at each iteration. We refer to such approaches as implicit methods. Artificial time embedding for inverse problems and for image process- ing problems has been found useful and beneficial by many researchers. To 2 arrive from a Tikhonov-type regularization to a family of time-embedding methods, we can view the necessary conditions of (1.2) as the steady-state equations of a time-dependent differential equation. Thus, the actual com- putational model involves a discretization of this time-dependent differential system: forward Euler, which is an explicit method, is usually employed. The resulting dynamics of such an algorithm is then a discrete dynamics, and it is expected to be close enough to the dynamics of the continuous system, which is typically easier to analyze, provided that small—hence many—time steps, or iterations, are taken. Indeed, recent papers in the image processing literature routinely report results requiring thousands of iterations and more. This makes one wonder if and how the computational modeling process can be improved to better reflect the actual properties sought. In [7], we elaborate on several problem instances that illustrate the above observations. In Chapter 3 of this thesis, we specifically develop an explicit algorithmic scheme for denoising and deblurring images to show how a broader computational modeling approach may possibly lead to algorithms with improved efficiency. 1.1.2 3D surface mesh smoothing with feature preservation A related problem is that of smoothing 3D surface meshes, which arises in computer graphics, visualization and computer vision. Even though most techniques for this problem have predecessors in the literature on the con- ceptually simpler image denoising problem [11, 42, 103, 109, 113, 118, 126], lifting image processing methods up to 3D surface meshes is not trivial or automatic; see, e.g., [55]. Specifically, while in image processing the data as well as the sought surface consist of scalar intensity height maps correspond- ing to an underlying regular grid of pixels, here the data as well as the sought surface are defined by sets of points irregularly placed in 3D. A preprocessing step constructs a triangular mesh with the data points as vertices, and this mesh connectivity is retained throughout the smoothing process. We denote such a surface mesh by S, with its set of vertices V (S) = {xi}Ni=1 ⊂ IR3 and set of edges E(S). The noise in the data is therefore not only in intensity heights at known pixel values but also in what corresponds to the location of the data. Indeed, the following inter-related issues arise: • Some sort of separation between the surface location of the data points and their value in the normal direction is required. Thus, it is often desired to denoise the surface by adjusting data values at each vertex in the normal direction but not in directions tangential to the sur- 3 face. Motion of vertices in tangential directions is referred to as vertex drifting. • The mesh is irregular, and this gives rise to several issues addressed further below that do not arise in the analogous image processing problem. • The mesh describes a volume in 3D, and it is important to ensure that this volume does not shrink noticeably during the denoising process. While rapidly removing undesirable noise, there is a real challenge in preserving different features of the polygonal surface models themselves. In this thesis, we roughly divide models into two classes according to their fea- tures. Objects in the first class usually contain many visually meaningful fine-scale components or details, referred to as intrinsic texture [71]. Such objects typically require very fine meshes to represent them well. Examples can be found in Figs. 4.7 and 4.8; see also the models in Figs. 5.14 and 5.13. Piecewise smooth CAD-like models, which usually have large flat regions, long sharp edges and distinct corners, are divided into the second class. The meshes required for adequate representation of such models can often be much coarser. Some examples are depicted in Figs. 5.1 and 5.6. All of these models may be sampled very irregularly, as e.g., in Figs. 4.10 and 5.2. Chap- ter 4 is consequently dedicated to developing a fast adaptive multiscale mesh denoising algorithm that is capable of preserving intrinsic texture as well as retaining mesh sampling irregularity. This algorithm performs particularly well for models of the first class. In the following Chapter 5, we turn our focus on denoising models of the second class and regularizing meshes as in Fig. 5.2. The goal then becomes to generate enough smoothing on large featureless regions while preserving edge sharpness and corner distinction. 1.2 Thesis Overview and Contributions This thesis is organized into six chapters. Following the present introduc- tion, we describe effective methods for implicit image restoration in Chapter 2. This chapter forms part of Ascher, Haber and Huang [6]. An adaptive Huber regularization functional is first developed to select the most reason- able model that fits the discontinuous data, and then a fixed-point iteration algorithm called lagged diffusivity is employed. Its global convergence for the Huber variant is proved and two mechanisms to improve efficiency for large-scale problems are provided as well. 4 In Chapter 3 we consider explicit methods for similar problems, which may be viewed through the introduction of an artificial time-dependent dif- ferential system. The relationship between continuous dynamics and discrete algorithms is first addressed. A simple, quick and adaptive algorithm based on discrete dynamics is then derived and some very encouraging results are presented. Additional thoughts on improving the quality of such reconstruc- tion and solving more challenging problems follow. The first section of this chapter is part of Ascher, Huang and van den Doel [7] and the second section has been adapted for one application in Ascher, van den Doel, Huang and Svaiter [4]. The rest has not appeared anywhere else yet. Motivated by the research on two-dimensional images, we then turn to three-dimensional surface meshes that in several aspects is a more compli- cated case. In Chapter 4, a fast, dynamic, multiscale iterative method is designed to smooth, but not over-smooth, noisy triangle meshes with intrin- sic texture. An anisotropic Laplacian (AL) operator is developed at first. It is then embedded in an iteration that gradually and adaptively increases the importance of data fidelity, yielding a highly efficient multiscale algo- rithm (MSAL) that is capable of handling both intrinsic texture and mesh sampling irregularity without any significant cost increase. This chapter has been published as Huang and Ascher [71]. In the next Chapter 5 we continue to deal with another type of noisy models as described above, which are dominated visually by sharp features, e.g., edges and corners. While denoising, our method simultaneously regu- larizes triangle meshes on flat regions for further mesh processing and pre- serves crease sharpness for faithful reconstruction. A clustering technique, which combines K-means and geometric a priori information, is first de- veloped and refined. It is then used to implement vertex classification so that we can not only apply different smoothing operators on different vertex groups for different purposes, but also succeed in crease detection, where the tangent plane of the surface is discontinuous, without paying a high computational price. Consequently, we are capable of efficiently obtaining different mesh segmentations, depending on user input and thus suitable for various applications. This chapter has been published as Huang and Ascher [72]. Conclusions and an outline of related future work are offered in Chap- ter 6. We implemented the algorithms in Matlab; no attempt was made to optimize performance of the resulting programs at the expense of gen- erality and clarity. All presented numerical examples were run on an Intel Pentium 4 CPU 3.2 GHz machine with 512 RAM and CPU times are re- ported in seconds. To appreciate reconstruction results in image processing 5 and computer graphics, it is often beneficial to view enlarged digital fig- ures on screen rather than those poor-quality printed ones on paper. For this purpose, the electronic version of the present thesis can be found in http://www.math.ubc.ca/~hhzhiyan/HHPHD.pdf 6 Chapter 2 Implicit 2D Image Reconstruction This chapter considers a basic but important task in image processing, that is the reconstruction of an original image m from acquired data b. The transformation connecting m to b is usually the result of two phenomena. The first phenomenon is random, inherent noise degradation; the second phenomenon is related to the mode of image acquisition, e.g., blur created by turbulence of photonic media, some wrong lens adjustment or rapid me- chanical motion. 2.1 Mathematical Model Setting Mathematically, the simplest models accounting for both noise and blur are linear. Thus we write as in Chapter 1 b = F (m) + , where F (m) is a linear function of the sought model m. Suppose that the noise  here is white, Gaussian and additive. Without loss of generality, we also assume the model m(x) = m(x, y) is defined on a unit square, denoted by Ω. This domain Ω is then discretized by a uniform grid with n2 square cells of length h = 1/n each. Thus, with a slight abuse of notation the function m(x, y) becomes a two-dimensional grid function {mi,j}ni,j=0, which is occasionally further reshaped into a column vector, (m1, . . . ,mN )T ∈ IRN , N = (n + 1)2. Likewise, the data b, given on a subset of the grid’s nodes or cells, is occasionally rearranged as a column vector, b ∈ IRM . Since F : IRN → IRM is a linear transformation and assumed to be continuously differentiable, it can be represented as a matrix multiplication F (m) = Jm ⇒ J = ∂F ∂m , (2.1) where the sensitivity matrix J is constant. 7 As we mentioned in Chapter 1, the above inverse problem is ill-posed because J may have no inverse or be highly ill-conditioned and noise un- avoidably exists in the measured data. A Tikhonov-type regularization is therefore applied and it leads to the following optimization problem min m T (m) ≡ 1 2 ‖Jm− b‖2 + βR(m), (2.2) where R(m) is the regularization operator that is the focus of attention in the present chapter, β > 0 is the regularization parameter (see [49] or the classic [130]), and the least-squares norm ‖ · ‖ is used in the data-fitting term for simplicity. The necessary condition for optimality in (2.2) is the algebraic equation JT (Jm− b) + βRm = 0. (2.3) For the regularization term, one considers a same-grid discretization R(m) = ∫ Ω ρ(|∇(m−mref )|) + α̂(m−mref )2, (2.4) where α̂ ≥ 0 is a small parameter, ∇m denotes the discretized gradient of m, |∇m| is the gradient magnitude and mref is a given reference function. The selection of mref can be crucial, for instance in geophysical exploration, but here we set ∇mref = 0 and concentrate on the choice of the function ρ. The latter relates directly to the a priori information we have about the smoothness of the model. 2.2 Regularization Functionals From (2.4), we have Rm(m) = −∇ · (g(|∇m|)∇m), where g(σ) = ρ ′(σ) σ . To recover smooth models it is common to choose least-squares, ρ(σ) = 12σ 2, yielding g = 1. This yields particularly simple computations. However, although it is very good for locally reducing noise, this choice is inadequate in the presence of discontinuities: across jumps in image m is a δ-function in |∇m|, which is not square-integrable, hence leading to smeared edges or boundaries. The famous total variation (TV) choice (e.g., [34, 50, 116, 135]) uses the l1-norm instead, leading to ρ(σ) = σ, which implies g(σ) = 1/σ. 8 Unfortunately, g(σ) = 1/σ is unbounded when σ shrinks, i.e., in flat re- gions where |∇m| → 0. Three remedies have been proposed in the literature to alleviate this. The first is to view |∇m| as dual variables in a primal-dual approach; see, e.g., [26]. This leads to some beautiful, more complex math- ematical analysis. However, the other two possibilities to be explored below are so simple, general, popular and effective in practice that the case for their replacement is necessarily not crucial. Moreover, the adaptive variants proposed in the following often yield better results than the pure TV would; see, e.g., [66]. So we continue with the latter two. 2.2.1 Modified TV function The modified TV approach [27, 28, 135] replaces |∇m| in ρ and therefore in g by |∇m|ε = √ m2x +m2y + ε2. (2.5) The parameter ε > 0 must be selected to be small enough so as to preserve sharp discontinuities and large enough so as not to cause undue numerical hardship, especially when iterative methods such as preconditioned conju- gate gradients are used. We have found surprisingly little in the literature regarding the practical selection of this parameter. To see that there is an issue, consider the popular example of denoising given in Fig. 2.1 using a suitably modified TV regularization [28, 116]. For our grid function m = {mi,j}ni,j=0 (n = 256, h = 1/n), denoting D+,xi,j m = mi+1,j −mi,j h and D+,yi,j m = mi,j+1 −mi,j h , i, j = 0 . . . n−1, where the linear gradient operators D+,xi,j and D +,y i,j are from IR N to IR, we define at the (i, j) grid point |∇m|ε ≡ |∇m|h,ε = √ (D+,xi,j m)2 + (D +,y i,j m)2 + ε2. This yields a symmetric term in the sum replacing the integral for R(m) in (2.4) that appears in (2.2). We next vary ε and record work units and CPU times: the actual method used is described later, in Sections 2.3, 2.4 and 2.5; here we merely note that an increase in computing time relates directly to an increase in the total number of preconditioner calls in our procedure for solving the resulting linear systems. Table 2.1 clearly indicates that more computational effort is required for smaller values of ε. Next, the results displayed in Fig. 2.1 indicate that the 9 Table 2.1: Work units and runtime vs. the modification parameter ε ε 10−6 10−2 4 3000 Work units 37 34 21 4 CPU sec 127 90 55 14 (a) True image (b) Corrupted by 20% noise (c) ε = 4 (d) ε = 3000 Figure 2.1: Denoising the cameraman image: (a) true image; (b) image cor- rupted by adding 20% white noise and fed as data b to a denoising program using the usual Tikhonov regularization with modified TV and α̂ = 0; a smaller value of switching parameter ε produces a more pleasing result in (c) than that displayed in (d) where a much larger value of ε is used. Taking ε = 0.01 or ε = 10−6 yields results similar to that in (c) but at a higher cost. 10 choice ε = 3000 is not very good, especially for segmentation purposes; the other three values yield very similar and satisfactory results. The combination of quality and cost therefore indicates not only that (i) as is well-known, values of ε that are too large or too small should be avoided, but also that (ii) the best value among those displayed in this case, ε = 4, does not automatically leap to mind! It should be noted that the performance indicators in Table 2.1 do not change significantly in the range 4 ≤ ε ≤ 100. Thus, the quest is not to find an optimal value of ε, but rather, to select it from the right range. This parameter range, however, need not necessarily include values ε  1 that are commonly used by practitioners. Unless one is content with a highly expensive trial and error approach, a procedure is required that would yield an effective, automatic selection of ε. 2.2.2 Huber function with automatic parameter choice An alternative to modified TV uses the Huber switching function [50, 74, 118] ρ(σ) = { σ, |σ| ≥ γ, σ2/(2γ) + γ/2, |σ| < γ. (2.6) This yields Rm(m)← ∇ · ( min{1 γ , 1 |∇m|} ∇m ) . Unlike the smooth modified TV function, here ρ is not C2, although its second derivative is bounded. Nonetheless we show in Section 2.3 that the lagged diffusivity algorithm converges globally under the same assumptions as for the modified TV, extending the proof of [28]. The advantage of the choice (2.6) is that the selection of the parameter γ is much more transpar- ent, and we capitalize on this as follows. Indeed, γ in (2.6) is viewed as answering the question: what can be interpreted as a jump? Thus, it is the minimal size of a local change in |∇m| that is still interpreted by a TV functional and hence jumps will not be necessarily smeared. The TV interpretation is desirable, particularly if edges are to be respected, but it must be avoided in an O(h) band, where the resolution limitation does not allow a true distinction between what may be zero and what is not. With this in mind, we propose the following automatic, adaptive choice γ = h |Ω| ∫ Ω |∇m|. (2.7) 11 The integral is of course replaced by a sum upon discretization. If the area of domain—|Ω|—differs from unity then we divide by it to obtain the average of |∇m|. In (2.7), γ = γ(m) depends on the solution, and we adjust its value through the iteration in an obvious fashion. What this choice basically says is that we avoid the possibility of a jump only when |∇m| is below h in the relative sense; otherwise, a jump may happen. Note that the average value of |∇m| may well depend on the application, its scaling, and the amount of noise still present in a given iteration. Moreover, the rate of change in γ from one iteration to the next indicates to what extent the discontinuities have already been distinguished from the smooth parts of the surface m to be recovered. It is not difficult to see that, despite the appearance of the term 1/γ in (2.6), the regularization term R(m) remains bounded as h→ 0. Indeed, on the continuous level fix γ and let Φ = {x : |∇m(x)| < γ}, (2.8a) we can rewrite (2.2) as min m 1 2 ‖Jm− b‖2 + β [ 1 2γ ∫ Φ |∇m|2 + γ 2 ∫ Φ 1 + ∫ Ω\Φ |∇m| ] . (2.8b) But by (2.8a), we have 1 2γ ∫ Φ |∇m|2 + γ 2 ∫ Φ 1 < γ ∫ Φ 1, and also ∫ Φ |∇m| < γ ∫ Φ 1. Hence, letting γ → 0 we get R(m)→ ∫ Ω |∇m|. Now, the regularization term on a grid with the width h approximates the continuous R(m) up to O(h) by straightforward consistency. We summarize this as: Lemma 2.1 With γ given by (2.7) the regularization term satisfies R(m) = ∫ Ω |∇m| + O(h). Our choice of γ also leads to a selection of the parameter ε in (2.5). When ε |∇m| we obviously get approximately the least squares norm on 12 |∇m|. Thus, the considerations for selecting ε should follow similar lines as those for γ. We choose ε = 0.5γ, which yields agreement with (2.6) at σ = 0, where these formulae are most useful if considered as modifications of TV. It is important to note that while both forms of stabilizing the TV semi- norm are roughly equivalent, the choice of the stabilization parameter is natural in the Huber case but is less obvious for the modified TV. If edges are not to be emphasized then the choice of a larger γ again has a more obvious meaning in the Huber case. Note that as h → 0, |∇m|h,ε → |∇m| in a well-defined manner. Another, earlier choice uses so-called robust statistics theory where γ is interpreted as answering the question of how large the model’s gradient can be before it is considered to be an outlier. One arrives at the formula (see, e.g., page 231 of [118] and [19]) γ = 1.5 · MAD(∇m) = 1.5 · median(abs(∇m− median(∇m))), (2.9) where MAD denotes the median absolute deviation and the constant is derived from the fact that MAD of a zero-mean normal distribution with unit variance is 0.6745 ≈ 1/1.5. However, according to our numerical experiments (see, e.g., Fig. 2.3), this statistical formula (2.9) should be dependent on the mesh width h. In other words, this choice only works well when we connect it with a discretization, and so it is helpful when we try to determine vertex outliers in 3D discrete triangular surface meshes. See more in Chapters 4 and 5. 2.3 Implicit Integration and Convergence 2.3.1 Lagged diffusivity In [136], Vogel and Oman introduced a fixed-point iteration algorithm called lagged diffusivity for minimizing the TV-penalized least-squares functional T (m) = 1 2 ‖m− b‖2 + βR(m), where β > 0 and R(m) = ∫ Ω |∇m|ε. Here we consider the similar minimiza- tion problem (2.2) with respect to Huber regularization functional, so the corresponding Euler–Lagrange equation can be rewritten as − β∇ · ( min{1 γ , 1 |∇m|} ∇m ) + JT (Jm− b) = 0, (2.10) 13 with natural boundary condition ∂m∂n |∂Ω = 0. By defining L(m)m = −∇ · ( min{1 γ , 1 |∇m|} ∇m ) , and differentiating (2.10) on the square domain Ω with respect to m, we obtain the following system of equations β(L(m)m)i,j + (JT (Jm− b))i,j = 0, where (2.11) (L(m)m)i,j = 1 h2 [gi+1/2,j(mi,j −mi+1,j) + gi,j+1/2(mi,j −mi,j+1) + gi−1/2,j(mi,j −mi−1,j) + gi,j−1/2(mi,j −mi,j−1)], gi+1/2,j = min { 1 γ , h√ (mi,j −mi+1,j)2 + (mi,j −mi,j+1)2 } . Note that here h is the width of a uniform grid and i, j = 0, . . . , n− 1. The five-point discretization of the regularization term above can be thought of as a consistent finite-volume discretization of the differential operator −β∇ · (min{ 1γ , 1|∇m|} ∇m) in (2.10) with natural boundary conditions. Then lagged diffusivity is employed to solve the nonlinear system (2.11) numerically. It is actually an iteratively reweighted least-squares method (IRLS). Given an initial iteratem0, one repeatedly and approximately solves βL(mk−1)mk + JT (Jmk − b) = 0, k = 1, 2, . . . . (2.12) By simply setting mk = mk−1 + δm, we have( JTJ + βL(mk−1) ) δm = JT (b− Jmk−1)− βL(mk−1)mk−1, (2.13) which means that for each k we solve a linear, anisotropic diffusion equation. This method works very well for low accuracy requirements, i.e., it converges rapidly at first but then slows down as its asymptotic convergence rate is only linear, especially if JTJ is nonsingular. Another possibility proposed by Chan, Golub and Mulet [27] and advo- cated in [135] is a primal-dual Newton method: dual variables corresponding to the flux, i.e., discretized v = min{ 1γ , 1|∇m|} ∇m in (2.11), are considered as independent variables along with m and a Newton method is defined, observing maximum size constraints for v. However, the typical number of iterations required using lagged diffusivity to reach the rough error toler- ances that make sense in the present context is so small that the improved convergence rate of the primal-dual method may kick in too late to be of practical interest. 14 2.3.2 Global convergence proof Global convergence, i.e., from any starting point m0, for the modified TV norm was shown in several references in the case that JTJ is constant and bounded away from singularity. See [43] and references in Chapter 8 of [135], where it is also claimed that some proofs can be extended to Huber function with γ fixed. Next, for the sake of completeness, we extend the proof by Chan and Mulet [28] to our case where F (m) = Jm, J is constant with a full column rank, and use Huber switching function (2.6) with γ adaptively defined by (2.7). Theorem 2.1 Assume that J is a constant matrix with a full column rank in (2.1) – (2.3). Let {mk} be the sequence of iterates obtained by solving the discrete version of (2.12). Then the following holds: 1. T (mk) ≤ T (mk−1) for all integers k ≥ 1. 2. lim k→∞ ‖mk −mk−1‖ = 0. 3. The sequence {mk} converges to the unique global minimum. In order to prove Theorem 2.1, we first introduce gradient representation matrices ATij = ( D+,xi,j D+,yi,j ) ⇒ ATij ∈ IR2×N , i, j = 0 . . . n− 1, so that the necessary conditions (2.3) can be rewritten again as Tm(m) = β n∑ i,j=0 ( AijA T ijm |ATijm|γ ) + JT (Jm− b) = 0, where |ATijm|γ = max{γ, |ATijm|} > 0. Then we have the following for T : Lemma 2.2 The functional T is coercive and strictly convex. Proof: 1. Since T (m) ≥ 12‖Jm − b‖2 and J has full column rank, T is coercive and bounded below by 0. 15 2. For |ATijm|γ > 0 and ρ defined by (2.6) is twice differentiable, we consequently have that T is twice Fréchet differentiable. So we have for its Hessian: Tmm = β ∑n i,j=0Aij ( 1 |ATijm|γ (I2 − A T ijm N ATijm |ATijm|2γ ) + ρ′′(|ATijm|γ) ATijm N ATijm |ATijm|2γ ) ATij + J TJ, where ⊗ denotes the Kronecker product. By Cauchy-Schwarz, we have ‖ATijv‖2 − (ATijm,A T ijv) 2 |ATijm|2γ ≥ 0. So, with ρ′′(σ) ≥ 0, the Hessian of T satisfies, ∀ v 6= 0, Tmm(v, v) = β ∑n i,j=0 ( 1 |ATijm|γ (‖ATijv‖2 − (ATijm,A T ijv) 2 |ATijm|2γ ) + ρ′′(|ATijm|γ) (ATijm,A T ijv) 2 |ATijm|2γ ) + vTJTJv > 0. Hence T is a strictly convex functional by Theorem 2.42 in Chapter 2 of [135]. 2 Thus problem (2.2) has a unique solution by Theorem 2.30 in Chapter 2 of [135]. Then we define G(v,m) = T (m) + (v −m,Tm(m)) + 12((v −m), C(m)(v −m)), and follow [28] almost verbatim in recognizing that the lagged diffusivity algorithm is an instance of the generalized Weiszfeld method [140], so that the following lemma holds: Lemma 2.3 1. C(m) = β n∑ i,j=0 ( AijA T ij |ATijm|γ ) + JTJ is continuous. 2. λmin(C(m)) ≥ µ = λmin(JTJ) > 0,∀ m. 3. T (v) ≤ G(v,m),∀ v. Using Lemmas 2.2 and 2.3 we can now establish the global convergence for the Huber switching function with γ adaptively defined by (2.7), as stated in Theorem 2.1. We address each item in turn: 16 1. From the above properties, we have T (mk) ≤ G(mk,mk−1). More- over, the fixed-point iteration equation is equivalent to the necessary condition for the generalized Weiszfeld method, which yields mk = argmin v G(v,mk−1) =⇒ G(mk,mk−1) ≤ G(mk−1,mk−1) = T (mk−1). 2. Recall that T (mk) ≤ G(mk,mk−1) = T (mk−1) + (mk −mk−1, Tm(mk−1)) + 12((m k −mk−1), C(mk−1)(mk −mk−1)) = T (mk−1)− 12((mk −mk−1), C(mk−1)(mk −mk−1)). Then taking into account that λmin(C(mk−1)) ≥ µ > 0, we have 1 2 µ‖mk −mk−1‖2 ≤ 1 2 ((mk −mk−1), C(mk−1)(mk −mk−1)) ≤ T (mk−1)− T (mk). It transpires that 0 ≤ ‖mk −mk−1‖ ≤ √ 2µ−1(T (mk−1)− T (mk)). Note that T is bounded below and {T (mk)} is monotonically de- creasing. Thus, {T (mk)} converges, implying that limk→∞ ‖T (mk) − T (mk−1)‖ = 0, hence also lim k→∞ ‖mk −mk−1‖ = 0. 3. As {T (mk)} is monotonically decreasing, it is of course bounded above. The fact that T is coercive then guarantees that {mk} is bounded. Since a closed and bounded subset of Euclidean space is compact, it suffices to show that the limit of every convergent subsequence of {mk} is the global minimum of (2.2). By the strict convexity of T , it will then suffice to show that those limits are stationary points of T . Let {mkj} be a subsequence of {mk} that converges to m∗. Since Gv(v,m) = Tm(m) + C(m)(v − m), T is continuously differentiable and C is continuous, we have that Gv is continuous. From the above we have limkj→∞ |mkj −mkj−1| = 0, and that yields lim kj→∞ mkj = lim kj→∞ mkj−1 = m∗. 17 Then, taking limits for G mkj (mkj ,mkj−1), we deduce the following equation 0 = lim kj→∞ G mkj (mkj ,mkj−1) = G mkj ( lim kj→∞ mkj , lim kj→∞ mkj−1) = G mkj (m∗,m∗) = Tm(m∗) + C(m∗)(m∗ −m∗) = Tm(m∗). Hence, the sequence {mk} converges to the unique global minimum m∗ of (2.2). 2 2.4 Numerical Techniques for Large Problems 2.4.1 Regularization parameter selection While the choice of γ and ε is settled, that of the regularization parameter β, which is notoriously challenging in practical problems even for least- squares regularization [49, 135], does not become any easier. In general, the optimization problem (2.2) must be solved many times by the process described above for different regularization parameter values β. For large problems that correspond to high resolution images, the entire resulting reconstruction algorithms can become tediously slow. A solution is accepted when some stopping criterion is satisfied. This is not our focus in this thesis; however, leading to techniques described next we mention the simplest such criterion, namely, the discrepancy principle [49]. Thus, one requires ψ(m(β)) = ‖F (m)− b‖2 ≤ tol, (2.14) for some specified tolerance level tol that depends on the noise level and is assumed given. A procedure of continuation in β is then developed, where previous values of ψ are used to estimate the next value of β, e.g., by a secant iteration. In practice, it is very hard to get more than a rough estimate of what tol should be, even for the simpler case of smooth model recovery. Common methods are often based on statistical arguments such as the χ2 distribution. Fortunately, by running our fast, adaptive explicit algorithm designed in Chapter 3, we find that an approximation very closely satisfying (2.14) can be easily obtained. More details are presented in the next chapter. 2.4.2 Multilevel continuation In order to improve efficiency for large-scale problems, we briefly introduce two mechanisms here. The first is a multilevel continuation proposed in [5]. 18 Lemma 2.1 assures us that R(m) does not vary much as the resolution is increased on the same domain. The same holds true, of course, for the data- fitting term in (2.2). We therefore conclude that with γ selected by (2.7) the ideal parameter β varies little on all sufficiently fine grids! Consequently we would like to apply multilevel continuation methodology (see [5, 64]) to grow the model solution gradually, aimed chiefly at exporting the value of β constructed on a coarse grid to finer grids. Starting on the coarse grid and sub-sampling the data as needed, the algorithm calculates an approximate solution and an approximate regular- ization parameter. Next, we interpolate the solution to a finer grid using a piecewise bilinear interpolation and solve the optimization problem using the interpolated solution as a starting guess, with the regularization pa- rameter obtained on the coarse grid possibly slightly adjusted. The process continues until we reach the finest grid. The reason that we may need to adjust β as the grid is refined is that the discretization error may play a major role on coarser grids, particularly when the forward problem also involves the inversion of a discretized differ- ential equation. That the coarsest grid must be fine enough is a well-known requirement in the multigrid literature; see also [5]. Thus, a search for β commences as above, so that a criterion such as (2.14) with a soft tolerance is respected before work starts on the next finer grid. 2.4.3 A multigrid preconditioner Although multilevel continuation decreases the number of iterations needed to compute a solution on the finer grids, each iteration is still expensive because (2.13) is a large system of linear equations that corresponds to a differential or integro-differential equation with non-smooth coefficients. Therefore, an essential component in any fast algorithm for the solution of the problem is a fast algorithm for the solution of the linear system. To simplify notation we rewrite the system as H δm ≡ (JTJ + βL) δm = rhs. If JTJ is sparse with a sparsity pattern similar to that of L, such as denois- ing or super-resolution, then a preconditioned conjugate gradient (PCG) method with a multigrid cycle as a preconditioner can be directly applied to the system. If on the other hand JTJ is dense, there are two more cases to consider. Firstly, if J is sparse or has a specific structure, e.g., a block Toeplitz matrix, one may want to introduce a dual variable or a special 19 approximation, e.g., a circulant matrix Ĵ [134, 135]; secondly and more gen- erally, if no useful information for J is available, one can approximate JTJ either by a sparse matrix, e.g., diag{JTJ}, or by a low rank matrix (see [64]) and obtain an approximate ĴT Ĵ +βL that is used as a preconditioner. Note that either solving a dual problem or obtaining an efficient approxi- mation to J is a nontrivial matter (see, e.g., [17, 18, 26, 63, 73]), which is beyond the scope of this thesis. We therefore apply a multigrid preconditioner based on the approximate inversion of L plus a positive definite sparse matrix. Note that since L is a discretization of a differential operator with non-smooth coefficients, stan- dard multigrid methods are ineffective. One can resort to algebraic multigrid (AMG) (see, e.g., [132]); however, this may not be as efficient as geomet- ric multigrid that utilizes the underlining grid that exists for our problem. We therefore turn to geometric multigrid methods that were developed for problems with non-smooth coefficients [78, 83]. Let mF be a grid function on the finest grid of (n+1)2 points, or nodes. We set the next coarser grid function, mC , to be the grid function on (n/2+ 1)2 points chosen in the obvious manner. The difference from standard multigrid is in the use of operator-induced interpolation. This approach combines algebraic multigrid with geometric multigrid methods to obtain the interpolation operator. As common to algebraic multigrid methods we divide the fine grid points to coarse points (C) and fine points (F ). The approximate Gauss–Newton step is reordered as( HFF HFC HTFC HCC )( δmF δmC ) = rhs. We next define the exact interpolation P = ( H−1FFHFC −I ) , and then obtain the exact coarse grid equations P T ( HFF HFC HTFC HCC ) P δmC = rhsC , by eliminating the fine grid points. The obvious problem with the exact interpolation and coarse grid op- erators is that the matrix HFF does not have a sparse inverse. Different approximations for its inverse define different methods. See [83] for a survey and comparison of methods. Here we have chosen a similar approach to the 20 one suggested by Dendy [78] to obtain sparse interpolation and coarse grid operators. The restriction operator is chosen to be the transpose of the in- terpolation and symmetric Gauss–Seidel is used as a smoother. Combining the above components into a V(3,1) cycle we have found that in most cases a single cycle is sufficient to reduce the residual norm below 10−2. 2.5 Numerical Experiments and Analysis 2.5.1 Denoising Experiments For simple denoising, F (m) = m and hence J = I, the identity. We ap- ply PCG iterations using our multigrid V-cycle preconditioner as described above. The coarsest grid in all cases has hcoarsest = 2−4. For the cameraman example considered in Section 2.2, h = 2−8. Let us complete its details. The quality of the results strongly depends on β and some search was required before settling on β = 0.07 for 20% noise, or say the noise level η = 20. Using the Huber function with γ determined adap- tively according to the formulas (2.7) and (2.9), we obtain after 10 lagged diffusivity iterations results that are very similar, indeed indistinguishable to the naked eye, to those displayed in Fig 2.1(c) and Fig 2.1(d), respectively. The choice of 10 lagged diffusivity iterations has been made because that is when γ in (2.7) changes by less than 10% from one iteration to the next, but further iterations have not revealed significantly different results. The respective final values of γ are 8.34 and 6.56× 103, and this gives rise to the values of ε used in the experiment reported in Section 2.2. There has been no notable difference in the performance of the Huber vs. the modified TV variants in this case, and the results reported earlier therefore demonstrate both the general utility of the formula (2.7) and its superiority over (2.9). The non-optimized CPU times reported in Table 2.1 correspond directly to the total number of PCG iterations required, and the latter count, for 10 nonlinear iterations in each case, is reported as “work units” in Table 2.1. The V-cycle preconditioner is therefore seen to perform rather effectively as well in our algorithm. The additional cost of carrying out the evaluation of the adaptive γ by either formula (2.7) or (2.9), is negligible. Here is another denoising experiment. The clean data is synthesized from the Matlab function peaks, which is modified as follows: if abs(mi,j) > 0.01, then mi,j = mi,j + d sign(mi,j). We set d = 10. Adding 10% (η = 10) Gaussian noise and setting β = 0.06, we run for different values of h and 21 Table 2.2: γ by (2.7) vs. mesh width h. h 2−4 2−5 2−6 2−7 2−8 γ 4.05 2.03 1.04 0.518 0.255 (a) Noisy data (b) Denoised data Figure 2.2: Denoising a surface with discontinuity. record in Table 2.2 the values of γ for the final m in each run. The value of γ is determined by (2.7). The linear dependence of γ on h clearly emerges. For the relative tolerance value of 10−2 on both the lagged diffusivity iteration and the PCG iteration, three lagged diffusivity iterations are required in all cases for both Huber and modified TV. A total of three PCG iterations, i.e., one for each lagged diffusivity iteration, are needed for modified TV while an additional PCG iteration is required for the Huber function. A typical result is depicted in Fig. 2.2. Note also that the same value of β yields quality results on all five grids, satisfying ‖m − b‖ < η. On the other hand, when the coarsest grid is set coarser than h = 2−4, it proves useless for the multilevel process. A multilevel continuation process as described in section 2.4 has been set up as well. Bilinear interpolation from each grid to the next finer one proves more efficient than a similar piecewise constant interpolation. It cuts the iteration count on the finest grid to just one iteration and the overall elapsed time by a factor of more than two. Repeating the above experiments with the lower noise level η = 1 yields a faster convergence on the finest grid starting from the data (and requiring one or at most two iterations for convergence), so the multilevel continuation 22 20 40 60 80 100 120 20 40 60 80 100 120 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 (a) True model 20 40 60 80 100 120 20 40 60 80 100 120 0 0.5 1 1.5 2 (b) Noisy model 20 40 60 80 100 120 20 40 60 80 100 120 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 (c) Huber robust γ 20 40 60 80 100 120 20 40 60 80 100 120 0.4 0.6 0.8 1 1.2 1.4 1.6 (d) Huber automatic γ 20 40 60 80 100 120 20 40 60 80 100 120 0.4 0.6 0.8 1 1.2 1.4 1.6 (e) Huber rescaled robust γ Figure 2.3: Denoising a piecewise constant image with γ determined by different formulae. process is not useful. However, exporting the value of β = 0.006 determined on a coarse grid with the same hcoarsest as before to finer grids does prove to work well here, too. We have also made comparative runs choosing γ by the formula (2.9) from robust statistics. The resulting value is significantly larger than γ of (2.7), especially on fine grids. Thus, the least-squares norm is used more often in the regularization term R, typically resulting in a possibly better approximation in smooth regions but in less sharp edges as well. However, if we multiply (2.9) by the mesh width h, the resulting behavior is very similar to that for (2.7), comparing Fig. 2.3(d) with Fig. 2.3(e). Since there is a different discretization on different domains, e.g., Ω = (0, 1) × (0, 1) or Ω = (0, n) × (0, n), the robust choice (2.9) should depend on the discrete mesh width. To see this point more clearly, let us consider the reconstructing results of a piecewise constant image. In Fig. 2.3, the true model takes the value 2 inside a circular pie subdomain (brown) and 0 in the background (blue). The run-parameters are η = 10, h = 2−7, and β = 0.06. The robust γ = 2.83 determined by (2.9) is much larger than the automatic γ = 0.0276 23 determined by (2.7). Consequently, the discontinuity is more smeared in Fig. 2.3(c) than in Fig. 2.3(d). On the other hand, multiplying the formula (2.9) by h, the rescaled robust γ = 0.007 yields Fig. 2.3(e) that is very comparable with Fig. 2.3(d). So we can claim that the unscaled robust choice (2.9) works well only when the corresponding discretization is suitable, and this is supported by our numerical results presented in Chapters 4 and 5. 2.5.2 Deblurring Experiments Besides denoising, in [6] we also report on experiments involving a limited angle Single Photon Emission Computer Tomography (SPECT) model. The problem has the form F (m) = Jm, where J is a large, sparse, underdeter- mined constant matrix. Adaptive Huber regularization works very well in these experiments, too. Moreover, in one experiment (see Fig. 6 in [6]) the history of a continuation process is recorded with the unchanged β. Even though the total cost is only a fraction less than the one-grid process, the relative efficiency of multilevel continuation does improve for larger values of β. In particular, we observed that γ of (2.7) gets smaller faster than mesh width h in the multilevel process. It therefore confirms that the automatic γ depends not only on h but also on the smoothness of the surface. Here, instead, we consider a more common problem in the literature, that of atmosphere blur [69, 135]. Due to the complexity of atmospheric turbulence and aerosol scattering, mathematical modeling of atmospheric blur has been studied carefully in many important areas including adaptive optics, satellite imaging, and remote sensoring. In most cases, it can be linearly modeled to be shift-invariant with a point spread function (PSF), denoted by f(x, y). It is also well known in signal processing and system theory [106, 107] that a shift-invariant linear operator must be in the form of convolution F (m) = f(x, y) ∗m(x, y) = ∫ IR2 f(x− x′, y − y′)m(x′, y′)dx′dy′. (2.15) So, an observed blurred image b is related to the ideal sharp image m(x, y) by b = f(x, y) ∗m(x, y) + , where the point spread function f(x, y) may vary. In this section, we only take the most popular Gaussian PSF as an example; see Chapter 3 for more experiments on other PSFs. Based on the canonical theory of linear parabolic PDEs, blurring an image with a Gaussian kernel amounts to inte- grating a diffusion equation for some time with a given image as the initial 24 data [29]. So, deblurring is naturally equivalent to inverting the diffusion process, which is well known to be unstable. Therefore, proper regular- ization techniques (e.g., Tikhonov in (2.2)) have to be introduced into the deblurring process. In the implementation we apply the same discretization as in Chapter 5 of [135], and then the convolution (2.15) is discretized into a matrix-vector multiplication b = F (m) +  = Jm+ , (2.16) where  is additive noise introduced during image acquisition, b and m are reshaped into N × 1 column vectors with N = (n+ 1)2, and J is an N ×N symmetric, doubly block Toeplitz matrix; see also [135]. Such a blurring matrix J can be constructed by a Kronecker product and the Matlab pseudocode is as follows: f = [exp(−[0 : band− 1].∧2/(2σ2))]; f = sparse(ones(1, band), [1 : band], f , 1, N); J = toeplitz(f); J = kron(J, J)/(2piσ2); where J is stored in sparse matrix format. In each Toeplitz block, only matrix elements within a distance band− 1 from the diagonal are nonzero, i.e., band is the half-bandwidth. The parameter σ controls the width of the Gaussian point spread function and thus the amount of blurring, i.e., the larger σ, the wider the function and the more ill-posed the problem. For all deblurring examples presented here, we set σ = 1.5, band = 4, and use the approximation diag{JTJ}+βL for a multigrid preconditioner as described in Section 2.4. From Figs. 2.4 and 2.5, we can clearly see that with a rough tolerance value 0.1 for both IRLS and PCG iterations, one lagged diffusivity iteration is already enough for deblurring without or only with small noise. The grid function m on images (both Lena and cameraman) is 256 × 256 and then J is a large sparse matrix 66, 049× 66, 049. Both iteration counts and run- time show the efficiency of our proposed implicit algorithm, comparing with computational restoration results recorded in [73]. Of course such partial comparison is rough and incomplete, since the variety in implementation is often very large, involving different test examples, parameter selections, programming languages and data structures. The purpose here is to pro- 25 (a) True image (b) Gaussian blur (c) Deconvoluted image Figure 2.4: Reconstructing the blurry image in Fig. 2.4(b) with small ad- ditive 1% noise: γ = 5.69, β = 0.003, 1 IRLS iter, 11 PCG iters, 19.5 sec. (a) True image (b) Gaussian blur (c) Deconvoluted image Figure 2.5: Reconstructing the blurry image in Fig. 2.5(b) without noise: γ = 6.06, β = 0.001, 1 IRLS iter, 2 PCG iters, 4.5 sec. (a) True image (b) Gaussian blur (c) PCG-tol = 0.1 (d) PCG-tol = 0.01 Figure 2.6: Reconstructing the blurry image in Fig. 2.6(b) with small addi- tive 1% noise: γ = 4.88, 1 IRLS iter; (c) 1 PCG iter, 1.35 sec; (d) 6 PCG iters, 4.82 sec. 26 (a) Noisy and blurry image (b) β = 0.001 (c) β = 0.003 (d) β = 0.006 Figure 2.7: Deblurring the image in Fig. 2.7(a) corrupted by Gaussian blur with 5% random white noise: γ = 12.1, 1 IRLS iter; (b) 3 PCG iters, 6.1 sec; (c) 12 PCG iters, 21.9 sec; (d) 15 PCG iters, 26.8 sec. vide readers a scale to see how efficient an image restoration can generally be with this approach. To reconstruct the blurry satellite image (180× 180) in Fig. 2.6(b) with the same setting IRLS-tol = 0.1 and PCG-tol = 0.1, only one PCG iteration is needed, and the deconvoluted image in Fig. 2.6(c) is rather satisfactory. Decreasing PCG-tol to 0.01, six PCG iterations are required and the contrast between white satellite and black background is enhanced dramatically; it can be seen by comparing the recovered image in Fig. 2.6(d) with the true image in Fig. 2.6(a). However, in many applications, a relatively large amount of noise, say more than 5%, is inevitable, and this results in some much harder reconstruc- 27 tion problems. Because the effects of noise and blur are opposite, dealing with very noisy and blurry images is a very difficult task. For denoising, ba- sically we want to remove high frequency image components. This requires smoothing or suitably blurring data in some sense. On the other hand, for deblurring we focus our effort on correcting distortion and enhancing the contrast. Compare the reconstructed models in Figs. 2.5(c) and 2.7(b): we have used the same regularization parameter β, where the given data is the same cameraman image corrupted by exactly the same Gaussian SPF but with different noise levels; obviously the result in Fig. 2.5(c) is very satis- factory while the one in Fig. 2.7(b) looks worse because noise is amplified. Increasing the regularization parameter β helps a little but not very impres- sively. For example, the images in Figs. 2.7(c) and 2.7(d) still look blurry, although noise has been smoothed out to different extents. How can we efficiently achieve both denoising and deblurring goals at the same time? Probably, a splitting approach is worthy of more study. We consider this potential solution for deblurring an image in the presence of heavy noise more deeply in Chapter 3. 28 Chapter 3 Explicit 2D Image Reconstruction This chapter investigates the same problem as in the previous one but from a different direction. We introduce an artificial time variable to construct a differential equation model for computational purposes, also referred to as artificial time embedding, and discretize the time-dependent model explic- itly. The discretization of a continuous artificial time system, which is what gets implemented, is the actual computational model. The dynamics of the discrete process is therefore crucial. If the dynamics of the continuous com- putational model is significantly different from that of an efficient discrete system, or if a close discretization of the continuous model leads to many costly iterations, then the continuous model may not be the right one to look at in the first place. Reconstructing 2D images with forward Euler steps is a good case in point. A simple, rapid, explicit scheme is therefore proposed and various algorithms are developed as outgrowths from this basic, general approach. 3.1 Artificial Time Embedding Many successful approaches that use artificial time embedding have been reported in recent years, including highly innovative methods involving level sets [22, 110, 120] and PDEs in image processing and scale spaces [119, 138]. Often an artificial physical process is invented in such context as part of the mathematical modeling effort. However, several of the resulting algorithms also tend to be unreasonably slow, as noted e.g., in Vogel [135] and elsewhere. Since the life of an apparatus depends on its utility, inventing computational models that are then used primarily for analysis rather than for computation may prove to be an impediment for the goal of constructing useful, efficient computational algorithms for the actual given problem to be solved. For example, let us consider the optimization problem (1.2). In order to arrive from a Tikhonov-type regularization to a family of time-embedding 29 methods, we can view the necessary condition of (1.2) as the steady-state equation of a time-dependent differential equation: M(m) ∂m ∂t = −[JT (F (m)− b) + βRm], (3.1) m(0) = m0, where t ≥ 0 is the artificial time variable and the preconditioner M = M(m(t)) is symmetric positive definite. In this thesis we only consider the special case of (3.1) where M = I and F (m) = Jm, with J a constant matrix. See [44] for more choices of the preconditioner M . 3.1.1 Continuous time system Steady-state solutions are often the objective when simulating time-dependent PDE systems. A steady-state solution, however, represents a reduced sys- tem without a time variable, hence typical time-following algorithms for such problems involve an unnecessary, or even artificial, time variable. In fact, when recovering a distributed parameter model, typically one does not intend to carry out the integration to steady state, although there are some interesting exceptions (see Section 11.3 of [110] and the related paper [92]). Indeed, there are really two regularization processes: one caused by the Tikhonov term βR(m), and the other caused by integrating the ODE (3.1) to a finite time. To see the latter, note that the solution of (3.1) withm(0) = 0 and β = 0 is m(t) = (I − exp(−tJTJ))J†b. (3.2) For notational simplicity, assume further that J has full rank and consider its SVD [59], J = U(diag{si})V T . Thus, the exact solution without noise is m = J−1b = V diag{s−1i }UT b. The effect of regularization is to modify s−1i for those singular values that are very small, replacing s−1i by ω(s 2 i )s −1 i . In (3.2) we get m(t) = V diag{(1− e−ts2i )s−1i }UT b, 30 10−4 10−3 10−2 10−1 100 101 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 s ω  (τ,  β,  t) Exponential Tikhonov Forward Euler Figure 3.1: Exponential filter (3.3), Tikhonov filter (3.6) and Forward Euler discretization (3.9) for tβ = 1/2, with τ = 0.01 and β = 0.001. hence ω(s) = 1− e−ts. (3.3) Thus, the effect of small singular values gets dampened while large ones remain almost intact for an appropriate finite value of t. For the more general case where β ≥ 0, assume further that R(m) = 1 2‖m‖2. Then (3.1) reads ∂m ∂t = −[JTJ + βI]m+ JT b. (3.4) It is easy to see that ω(s) = 1− e−t(s+β) s+ β s, (3.5) and β = 0 yields the exponential filter (3.3). For the special case t→∞ we have the Tikhonov filter ω(s) = s s+ β , (3.6) 31 where the regularization parameter β plays a role in dampening or filtering out the effect associated with singular values less than β. In Fig 3.1, we have compared these two filter functions appearing in (3.3) and (3.6). Of course they both tend to 0 as s→ 0 and to 1 as s→∞. But even for mid-range values of s there is a similarity in the general shape. It appears that the choice t ≈ 1 2β , makes these filters particularly close under certain conditions and that the exponential filter switches from 0 to 1 more sharply. Calvetti and Reichel [24] give further reasons for preferring (3.3) in the linear case. They approximate the exponential of a matrix in (3.2) by a Lanczos process; see also [23]. In fact, they never refer to an ODE giving rise to this filter function. 3.1.2 Discrete time system Next, consider the usual discretization of the ODE (3.4). Applying forward Euler with a time step τ we have at the kth time step mk+1 = mk − τ [JTJ + β]mk + τJT b (3.7) = Gmk + τJT b, where G = I − τ(JTJ + βI). Note that ODE stability holds if 0 < τ ≤ 1/‖JTJ + βI‖2. Moreover, the step size τ need not be constant, and this is further discussed at the end of this section and in the next one. Due to the simplicity of F and R we can unravel the recursion in (3.7) in the usual manner. Moreover, the SVD of G is inherited from that of J . Assuming m0 = 0 we obtain mk+1 = k∑ j=0 GjτJT b = τV diag{ k∑ j=0 (1− τ(s2i + β))jsi} UT b. Since k∑ j=0 (1− τ(s2 + β))j = 1− (1− τ(s 2 + β))k τ(s2 + β) , 32 we have ω(s) = 1− (1− τ(s+ β))k s+ β s. (3.8) Setting β = 0 in (3.8) yields the Landweber iteration [49]. For a finite time there is a regularization effect, since ω(s) = 1− (1− τs)k. (3.9) See Fig 3.1 again. As k →∞ we get ω → 1, so there is no regularization. For k = 1 we get heavy regularization (damping) with ω(s) = τs. The smaller s is, the stronger the damping, which is good. The question becomes that of choosing k for the right amount of regularization. It is called iterative regularization in [135]. The principle of this method is to stop the integration when the iteration starts to diverge (see [49]). Letting t = kτ be fixed as τ decreases and k increases in (3.8) we obtain the continuous system formulae discussed earlier in this section. However, we obviously do not wish to take τ very small in practice. So, even though the limit process yields a viable regularization through (3.3) or (3.5), and the discrete process yields a viable regularization through (3.9) or (3.8), respectively, it is not clear a priori that the latter should be best considered a discretization of the former! Typically, the continuous model is easier to analyze, and such an analysis can be used to motivate a coarse discretization. Such a motivation is satisfying in case that the actual method works well in practice, as in Fig. 3.1. However, the continuous analysis is neither strictly necessary to back up the algorithm nor sufficient for a complete theory. The above considerations highlight not only the potential advantage in using artificial time-stepping but also the limitations inherent in taking this too inflexibly. On the other hand, if we consider (3.7) as a steepest descent step for minimizing the relevant special case of (1.2) where F (m) = Jm and R(m) = 12‖m‖2, then the step size τ can be determined exactly based on an exact line search because the objective function is quadratic and convex [16, 108]. This exact line search aims at finding mk+1 of (3.7) that is closest to the steady-state of (3.4) τ = [JT (Jm− b) + βm]T [JT (Jm− b) + βm] [JT (Jm− b) + βm]T (JTJ + βI)[JT (Jm− b) + βm] , (3.10) where all quantities are evaluated at the current step k. This formula and its extensions for the general case have been used routinely by sophisticated practitioners utilizing the Landweber method. They set β = 0 and stop 33 the iteration when it no longer improves much. See also [100] for more evidence on the steepest descent revival. In the next section we offer further improvement over the steepest descent time step. 3.2 Denoising Algorithms Based on the analysis above, we would like to show that some advantages could be gained by directly examining such a discrete dynamics, both as an implementation of artificial time and as an optimization method for a specific objective function. For simplicity, we consider the image denoising problem first [11, 109, 113, 116, 118, 119, 126, 135, 138]. Thus, F (m) = m, J = I, and the given data b is assumed to contain noise of some form so should not be approximated too well by m. The regularization operator R(m) here is still a typical spatial discretization of R(m) = ∫ Ω ρ(|∇m|), where ρ stands for a least-squares, modified TV, or Huber switching function on |∇m| (see Chapter 2). Since our regularization terms contain an integral of a function of |∇m|, the necessary condition involves a differential term in space. This differential term is further assumed differenced in exactly the same way as described in Chapter 2. Rewriting the gradient Rm as Rm(m) = L(m) ·m, and ignoring spatial discretization for a moment, we get a constant L = −∇ · ∇ for the case of least-squares and adaptive L(m) = −∇ · ( 1 |∇m|ε∇ ) for modified TV, plus appropriate boundary condition. Likewise, applying Huber switching function (2.6) with the adaptive choice (2.7) for γ , we have L(m) = −∇ · ( 1 |∇m|γ∇ ) , |∇m|γ = max{γ, |∇m|}. (3.11) Now the steady-state equation of the artificial time embedding (3.1) with M = I becomes m− b+ βRm = m− b+ βL(m)m = 0. (3.12) To solve this nonlinear system of algebraic equations for m, we propose an implicit scheme in Chapter 2. Here we contemplate sending β to infinity 34 and consider instead an explicit algorithm: m0 = b, (3.13a) mk+1 = mk − τkRm(mk), (3.13b) where the step size τk, evaluated atmk for k = 0, 1, 2 . . ., is to be determined as discussed below. 3.2.1 Forward Euler In the scale space approach, (3.13) is a forward Euler method applied to the space-discretized PDE ∂m ∂t = −Rm(m), (3.14) m(0) = b. See Perona and Malik [113], and Rudin, Osher and Fatemi [116]. A problem with this interpretation is that typically other, more efficient time stepping techniques for the latter time-dependent PDE are not employed, though there are still some exceptions [92, 110]. Obviously, no one really cares about reconstructing the solution for (3.14) as such. Moreover, it is not clear how to choose the step size from this point of view. As in a Fourier analysis for multigrid smoothers, consider a Fourier com- ponent of the mesh function m, which at mesh point (i, j) looks like ui,j = Aθeı(θ1i+θ2j), −pi ≤ θ1, θ2 ≤ pi. Further, assume that after the iteration (3.13b) we get a similar Fourier component except for the amplitude, ūi,j = Āθeı(θ1i+θ2j). For the simple case of least-squares we get ūi,j = ui,j + τ h2 (ui+1,j − 2ui,j + ui−1,j + ui,j+1 − 2ui,j + ui,j−1), using 5-point Laplacian discretization with Dirichlet boundary condition. So, we have Āθ = Aθ(1 + 2τ h2 [cos θ1 − 1 + cos θ2 − 1]). For the lowest frequency, 1 − cos θ = 1 − cosh = h2/2. So, the smallest decline in amplitude is |Āθ| ≤ |Aθ|(1− 2τ). 35 For very low frequencies the decrease in amplitude is very small if τ  1/2. The larger θ1 and/or θ2 the faster amplitude decreases. At θ1 ≈ θ2 ≈ pi we get Āθ ≈ Aθ(1− 8τ/h2). For stability, we must ensure 1− 8τ/h2 > −1, and hence we require that τ < h2/4. (3.15) The restriction (3.15) is identical to the step size restriction arising from the forward Euler method. Indeed, the symmetric positive definite matrix L corresponding to the least-squares regularization with Dirichlet boundary condition has the eigenvalues λi,j = 1 h2 [ 4− 2 ( cos (i+ 1)pi n+ 1 + cos (j + 1)pi n+ 1 )] , 0 ≤ i, j ≤ n, where hn = 1 for a unit square domain. The usual forward Euler absolute stability restriction then yields τλn,n < 2, which is the same as (3.15). This local mode analysis for least-squares regularization is not justified when Huber function is employed, because the coefficient 1/|∇m|γ cannot be considered locally constant in space. The latter is a necessary condition for the applicability of this analysis, which essentially assumes that the Fourier mode is an eigenvector of L. In the numerical PDE literature, the condition (3.15) is typically extended by a frozen coefficient analysis. This yields here τ h2minx{|∇m(x)|γ} < 1 4 , which for Huber reads τ h2minxmax{γ, |∇m(x)|} < 1 4 , so we have τ < h2 4 min x max{γ, |∇m(x)|} ≤ h 2 4 γ. (3.16) For either least-squares or Huber with the choice (2.7), it is easy to see that the number of steps required to reduce the gradient norm ‖Rm(m)‖ by a constant amount using such a step size upper bound is at best proportional to O(1/h2) = O(n2). To obtain faster convergence to either steady-state or some earlier time, some significantly larger step sizes must occasionally be taken. 36 3.2.2 Gradient descent For the latter purpose, let us reinterpret the scheme (3.13) as a gradient descent method for the problem min m R(m) (3.17) starting from the initial iterate m0 = b. The question now boils down to selecting the step size for the gradient descent method. One usual simple strategy is steepest descent (SD) τk = (Rm(mk))T (Rm(mk)) (Rm(mk))TL(mk)(Rm(mk)) . (3.18a) In the case of least-squares, the selection (3.18a) of τk can be justified as the result of an exact line search. For modified TV and Huber variants, (3.18a) may be viewed as a form of frozen-coefficient line search. That is to say, it is an exact line search within the local context of a lagged diffusivity iteration [135]. Barzilai and Borwein [12] propose a different two-step method for select- ing the step size τk. The basic idea is to use past, i.e., mk−1, rather than present information, i.e., mk, for the determination of the step size τk. In the present context, replacing the SD step size (3.18a) by the corresponding lagged steepest descent (LSD) step, we have τk = (Rm(mk−1))T (Rm(mk−1)) (Rm(mk−1))TL(mk−1)(Rm(mk−1)) . (3.18b) For what corresponds to the least-squares case, such a lagged method has been shown to converge R-linearly [40], although not monotonically, i.e., there are steps k where either the objective function or the gradient norm ‖Rm‖ may actually grow [12, 57, 115]. This is not really good news in terms of iteration control, but it is theoretically interesting [4]. It has also been successfully implemented in general purpose codes for minimizing quadratic objective functions subject to box constraints [39] and applied to compressed sensing and other inverse problems [14, 53], where the non-monotonicity is used for a positive effect. Here, we observe that to reach steady-state, the integration with LSD step size selection is much faster than that with SD step size selection. In the case of least-squares, the sequence of SD steps (3.18a) tends to a limit cycle of period two (see Fig. 3.2(a)), that results in the dramatically slower convergence. The dynamical sequence of LSD steps (3.18b) are generally chaotic, not obeying a closed form pattern, and 37 occasionally a much larger step size than forward Euler stability restriction (3.15) would allow is taken, resulting in an overall faster convergence. For more on this, see [4] and references therein. However, note that we are not really solving (3.17), nor are we integrating (3.14) to steady-state. Hence, a better explanation for (3.13) is a more direct one. This is a discrete process, the purpose of which is anisotropic smooth- ing. In analogy with the case where a Gauss-Seidel relaxation is judged as a smoother within a multigrid cycle by criteria that differ from those that would be applied to it as a stand alone method [132], here the important properties of the iteration (3.13) are those of an anisotropic smoother. The difference from a multigrid smoother is that there is no coarse grid correc- tion and no exact solution in the end of the recursion. Indeed, the solution of (3.17) is undesirable, so those more sophisticated iterative methods that approach it too fast are undesirable, too. In the following experiments we focus on investigating regularization properties of the simple gradient de- scent iteration (3.13), evaluating performance using both SD and LSD for such denoising purpose. Fig. 3.2 reports the discrete dynamics for the denoising example in Fig. 3.3, where we synthesize data by adding 10% randomly distributed Guassian white noise into an original Lena image (256 × 256). With least- squares regularization, SD step sizes τk (red dots in Fig. 3.2(a)) regularly oscillate about the stability restriction τmax = h2/4 = 2−18 = 3.81 × 10−6 (blue dash) after 9 iterations: τ2k > τmax, τ2k+1 < τmax and τ2k ≈ τ2k+2 for each k ≥ 5. Why does the step size sequence cycle? How does the in- tegration remain stable despite occasional violations of the absolute stability requirement (3.15)? To answer the first question, the essential theory is in Akaike [1]; see also [4, 56, 102]. Denote L’s eigenvalue-eigenvector pairs by (λi,j ,xi,j), where the eigenvectors are pairwise orthogonal and normalized. Assuming away certain special initial guesses, Akaike showed for SD that asymptotically both the error and the gradient, or say residual Rm(mk), tend to be in the plane spanned by the first and last eigenvectors x0,0 and xn,n corresponding to smallest and largest eigenvalues λ0,0 and λn,n. Since for SD each two consecutive gradients are orthogonal, RTm(m k)Rm(mk+1) = RTm(m k)Lmk+1 = RTm(m k)[Rm(mk)−τkLRm(mk)] = 0, clearly in the asymptotic regime Rm(mk) and Rm(mk+2), being in the same 2D plane and both orthogonal to Rm(mk+1), must be parallel, i.e., up to a 38 5 10 15 20 25 30 2.8 3 3.2 3.4 3.6 3.8 4 x 10−6 (a)  Step−Size for Least Squares 5 10 15 20 25 30 5 10 15 20 25 30 (d)  Misfit 5 10 15 20 25 30 0 5 10 15 20 x 10−3 (c)  Relative Error 5 10 15 20 25 30 0 2 4 6 8 10 12 x 10−3 (b)  Step−Size for Huber LS τ max Huber τ max LS Huber LS Huber 10% Noise Figure 3.2: Discrete dynamics for the Lena example in Fig. 3.3. Two regu- larizations, least-squares (LS) and Huber switching function with γ of (2.7), are compared. constant Rm(mk+2) ≈ Rm(mk). This constant cancels out in (3.18a), hence τk+2 ≈ τk. Such steps cycle ties directly to the slowness of the resulting gradient descent method; see the behavior of relative error norm ek = ‖mk+1 −mk‖/‖mk+1‖, (3.19) red dots in Fig. 3.2(c). For the second question, since Lxi,j = λi,jxi,j , there are coefficients wi,j such that Rm(m) = ∑ i,j wi,jxi,j . It yields Rm(mk+1) = ∑ i,j wki,j(1− τkλi,j)xi,j , =⇒ wk+1i,j = (1− τkλi,j)wki,j , 0 ≤ i, j ≤ n. (3.20) 39 Hence, if we rewrite the steepest descent step (3.18a) as τk = ∑ i,j(w k i,j) 2∑ i,j λi,j(w k i,j)2 , (3.21) we can see how the forward Euler absolute stability restriction may be vi- olated without causing problems. Obviously, if wki,j are all approximately equal then τk in (3.21) is roughly proportional to the reciprocal of the larger eigenvalues. But then from (3.20) we see that the ensuing iteration is more effective in diminishing the magnitude of the components wi,j that corre- spond to the larger eigenvalues than those corresponding to the smaller ones. Hence in the next iteration the step size τk+1 may be larger, cutting down the magnitudes corresponding to the smaller eigenvalues while actually in- creasing those of the larger ones, but not so much as to have an overall divergent effect. Red dots in Fig. 3.2(d) demonstrate that in the case of least-squares, the misfit between the given data b and the recovered data mk, rk = h‖mk − b‖, increases very fast, especially at the beginning of integration. Just after one iteration, we already have r1 > 10. This is undesirable, because if the variance of noise  is given as η, the ideal recovered image m should satisfy∫ Ω |m− b|2 = η2, i.e., in the Lena example, if mk ≈ mtrue, then rk ≈ 10. Fig. 3.2(b) depicts the behavior of step sizes when we employ the Huber regularization for the same Lena example. Blue dash records τmax derived by a frozen coefficient argument (3.16). It actually decreases from 8.76× 10−5 to 1.8×10−5 instead of a constant that looks like in Fig. 3.2(b). Even though step sizes by (3.18a) violate the stability upper bound (3.16) to a very large extent at the beginning of integration, e.g., τ0/τmax ≈ n/2, τk decreases monotonically and quickly converges to τmax. The overall process is perfectly stable and it behaves very similarly in other experiments, e.g., smoothing the image in Fig. 3.5. Such large violation of monotonic decrease results in a fast convergence of both relative errors and misfits; see sky blue dots in Figs. 3.2(c) and 3.2(d). Most important to note in these results is how well the misfit approxi- mates the variance of the noise for the Huber regularization, and how poorly 40 (a) True image (b) 10% noise (c) Misfit = 9.74 (d) Misfit = 10.61 (e) Misfit = 11.17 (f) Misfit = 10.96 Figure 3.3: (a) True image; (b) image corrupted by 10% Guassian white noise; (c) image denoised using SD step size (3.18a) for the relative error tolerance of 10−4, r17 = 9.74, 17 iters; (d) image denoised using LSD step size (3.18b) for the same relative error tolerance, r13 = 10.61, 13 iters; (e) image denoised using SD for the relative error tolerance of 10−5, r309 = 11.17, 309 iters, 51.9 sec; (f) image denoised using LSD for the same relative error tolerance, r140 = 10.96, 140 iters, 23.1 sec. it does so for least-squares. Using the former the noisy image is rapidly de- composed into a fairly good approximation of the clean image plus the noise that can now be estimated by b − mk; see, e.g., [109, 126]. The simpler least-squares regularization does not give this because of the well-known smearing of data that it introduces [110]. This noise estimate is very useful in the following hybrid scheme and iterative refinement techniques. Fig. 3.3 shows the quality of denoising (3.13) on the Lena image. For the relative error tolerance of 10−4, 17 SD steps are required. The resulting image in Fig. 3.3(c) may not be clean enough but it does provide a nice guess for the noise level. For the stricter relative error tolerance of 10−5, 41 5 10 15 0 5 10 15 x 10−3 Step−Size τ 5 10 15 5 10 15 20 25 Huber γ 5 10 15 0 5 10 x 10−3 Relative Error 5 10 15 6 8 10 12 Misfit 0 100 200 300 0 5 10 15 x 10−3 SD − τ 0 100 200 300 5 10 15 20 25 SD − γ 0 100 200 300 0 5 10 x 10−3 SD − Error 0 100 200 300 6 8 10 12 SD − Misfit 0 50 100 150 0 5 10 15 x 10−3 LSD − τ 0 50 100 150 5 10 15 20 25 LSD − γ 0 50 100 150 0 5 10 x 10−3 LSD − Error 0 50 100 150 6 8 10 12 LSD − Misfit SD LSD Figure 3.4: SD step size (3.18a) vs. LSD step size (3.18b) for the denoising example in Fig. 3.3: the first row compares SD (17 iters) with LSD (13 iters) for the relative error tolerance of 10−4; the second and third rows compare them for the relative error tolerance of 10−5, 309 SD iters vs. 140 LSD iters. Note that for each subfigure, X-axis denotes the iteration number k and Y -axis denotes the value of the item labeled. 309 SD steps are necessary and the resulting image in Fig. 3.3(e) is visually quite satisfactory. Note that misfits with respect to different tolerances do not vary too much; see also Fig. 3.4, which largely relaxes the criteria for choosing a good finite time to stop integration [7]. On the other hand, if we set the relative error tolerance too small, e.g., 10−6, the iteration (3.13b) may not stop even after 10000 steps. This is consistent with a well-known theory that the steepest descent method may converge very slowly. Even although it does not bother us too much since the main role of the gradient descent method here is to provide regularization, speeding up the integration to some earlier time under a relaxed tolerance while preserving the regularization properties still has a strong appeal. So, 42 (a) True image (b) 20% noise (c) Misfit = 20.96 (d) Misfit = 21.04 Figure 3.5: (a) True image; (b) image corrupted by 20% Guassian white noise; (c) image denoised using SD step size (3.18a) for the relative error tolerance of 10−5, r382 = 20.96, 382 iters, 64.4 sec; (d) image denoised using LSD step size (3.18b) for the same relative error tolerance, r117 = 21.04, 117 iters, 19.4 sec. replacing SD (3.18a) with the LSD step size selection (3.18b), we would like to see whether there is a significant improvement as happens in the case of using least-squares to reach steady-state. Using the Huber regularization to smooth the Lena image in Fig. 3.3(b), we obtain that for the relative error tolerance of 10−4 and 10−5, 13 (< 17) and 140 (< 309) LSD steps are required, respectively. The quality of the resulting images and the behavior of the corresponding adaptive parameter γ defined in (2.7), relative error norm and misfit are all quite comparable with those obtained using the SD step size selection; see Figs. 3.3 and 3.4. Such an observation also applies to the denoising experiment in Fig. 3.5. Clearly, the importance of LSD, i.e., the improvement over SD, becomes much less pronounced in our anisotropic denoising circumstance. In Fig. 3.4, the LSD strategy does not produce a disorderly dynamical system as it does 43 in the case of least-squares that allows a much faster convergence, but the resulting errors and misfits are preserved, yielding a good smoother. So we can conclude that the regularizing effect of gradient descent methods does not seem to depend too strongly on the choice of step size. 3.2.3 Explicit-Implicit scheme and sharpening Since the explicit algorithm (3.13) with SD may significantly slow down after a while, we intuitively would like to switch to some faster schemes by then, e.g., the implicit method (Tikhonov + Huber with (2.7) + IRLS + PCG) introduced in the previous chapter. Unlike in (3.14), the regularization parameter β here is finite and the related evolution equation reads ∂m ∂t = b−m− βRm(m). We already mentioned in Chapter 2 that selecting a suitable regularization parameter is not a easy job in general. However, in the present setting, we are able to achieve this without major effort. We have shown above that a few explicit iterations under a relaxed tolerance, e.g., 10−4, are capable of providing a good guess m̄ such that∫ Ω |m̄−b|2 ≈ η2. According to the discrepancy principle, we wish to maintain this constraint under the following integration, that is d dt ∫ Ω |m(t)− b|2 = 0, which yields∫ Ω (m− b)mt = −β ∫ Ω (m− b)Rm(m)− ∫ Ω |m− b|2 = 0. This therefore leads us to determine β as β = − ∫ Ω |m̄− b|2∫ Ω(m̄− b)Rm(m̄) = (r(m̄))2∫ Ω(b− m̄)Rm(m̄) , (3.22) where r(m̄) is the misfit between m̄ and b. In Figs. 3.6(a) and 3.6(b), we compare the result of this hybrid explicit- implicit scheme with that of the pure explicit scheme. The pre-denoised image after 17 SD steps, shown in Fig. 3.3(c), acts as a warm start and produces a good regularization setting β = 0.083 by (3.22) for the following implicit process. Then, after only 3 IRLS iterations, the denoised image 44 (a) Misfit = 11.17 (b) Misfit = 11.93 (c) Misfit = 10.39 Figure 3.6: Comparing three different schemes by denoising the image in Fig. 3.3(b): (a) image denoised using SD step size (3.18a) for the relative error tolerance of 10−5, 309 iters, 51.9 sec; (b) image denoised by 3 IRLS iterations with β = 0.083 after 17 SD steps (see the pre-denoised image in Fig. 3.3(c)), the total CPU time = 2.7 + 5.2 = 7.9 sec; (c) image sharpened using 10 SD steps with Tukey function after 17 pre-denoising SD steps with Huber function, the total CPU time = 2.7 + 2.2 = 4.9 sec. in Fig. 3.6(b) looks already quite comparable with—even slightly smoother than—the one in Fig. 3.6(a), which is continually denoised using SD for a stricter tolerance. The processing time of the hybrid scheme is only a small fraction of that required by the explicit scheme, even with the faster LSD step size. Figs. 3.5(c) and 3.8(b) tell us essentially the same story. Besides employing the implicit method to improve the quality of pre- denoised images, another possibility is to sharpen them by making R(m) gradually more and more non-convex [19, 46, 113, 118] and so reducing penalty on large jumps, e.g., replacing the Huber switching function in Fig. 3.7(a) with the Tukey function in Fig. 3.7(b). The Tukey function, scaled similarly to the Huber function, is defined by ρ(σ) = { 1 3 |σ| ≥ γ̂, (σ 2 γ̂2 − σ4 γ̂4 + σ 6 3γ̂6 ) |σ| < γ̂. (3.23) Since ρ′(σ) = φ(σ) = { 0 |σ| ≥ γ̂, 2σ γ̂2 (1− (σγ̂ )2)2 |σ| < γ̂, 45 −4 −2 0 2 4 0.2 0.4 0.6 0.8 1 −4 −2 0 2 4 −1 −0.5 0 0.5 1 −4 −2 0 2 4 1 2 3 4 (a) Huber switching function −4 −2 0 2 4 0 0.1 0.2 0.3 0.4 0.5 −4 −2 0 2 4 −0.3 −0.2 −0.1 0 0.1 0.2 0.3 −4 −2 0 2 4 0 0.1 0.2 0.3 (b) Tukey function Figure 3.7: Comparing Huber switching function with Tukey function: graphs of edge-stopping function g(σ) are in the left column; graphs of function φ(σ) = σg(σ) are in the middle column; graphs of function ρ(σ) = ∫ σg(σ)dσ are in the right column. the resulting edge-stopping function is g(σ) = φ(σ) σ = { 0 |σ| ≥ γ̂, 2 γ̂2 (1− (σγ̂ )2)2 |σ| < γ̂. To ensure that the Tukey and Huber functions start rejecting outliers at the same value we set γ̂ = √ 5γ, where γ for the Huber function is still defined by (2.7). Thus, the influence of the Tukey function decreases all the way to zero. Comparing the two functions φ(σ) = σg(σ) in the horizontal center of Fig. 3.7, the Huber function gives all outliers a constant weight of one whereas the Tukey function gives zero weight to outliers whose magnitude is above a certain value. From such shapes of φ we can correctly predict that smoothing with Tukey produces sharper boundaries than smoothing with Huber. We can also see how the choice of edge-stopping function acts to hold off excessive smoothing: given a piecewise constant image where all 46 (a) Misfit = 18.93 (b) Misfit = 22.67 (c) Misfit = 19.66 Figure 3.8: Smoothing or sharpening the image in Fig. 3.5(b): (a) pre- denoised using 21 SD steps for a rougher relative error tolerance of 10−4; (b) then denoised by 3 IRLS iterations with β = 0.15 determined by (3.22), the total CPU time = 3.4 + 5.2 = 8.6 sec; (c) or sharpened using 10 SD steps with Tukey function after (a), the total CPU time = 3.4 + 2.2 = 5.6 sec. discontinuities are above a certain threshold, Tukey will leave the image un- changed whereas Huber will not. Results in Figs. 3.6(c) and 3.8(c) confirm our predictions. After a quick pre-denoising with Huber, only 10 SD steps with Tukey result in obviously sharper discontinuities, i.e., image edges, than those without switching the regularization operator; see Figs. 3.6 and 3.8. It is important to note that a rough pre-denoising is necessary here. It helps us avoid strengthening undesirable effects of heavy noise by using the Tukey function. From an experimental point of view, we recommend to apply the explicit- implicit LSD scheme if a smooth image is desired; otherwise, employ the explicit Tukey regularization, starting from the result of the rough explicit Huber regularization, to get a sharper version. 3.3 Iterative Refinement Techniques In computer graphics, texture is almost a synonym for image, referring to any visually meaningful spatial pattern either regular or irregular. In im- age analysis and processing, though still lacking a uniform definition, texture generally refers to those image patterns that appear to be stochastic in terms of spatial distribution, oscillatory in terms of functional behavior, or atomic 47 (a) 121× 117 (b) η = 10 (c) LSD (d) Sharpen Figure 3.9: Failure to preserve texture in the fingerprint image using de- noising algorithm (3.13) with LSD step size (3.18b). Sharpening with Tukey function (3.23) makes it worse. (a) 256× 256 (b) η = 10 (c) SD (d) Hybrid Figure 3.10: After denoising with SD step size (3.18a), the cartoon compo- nent of the Barbara image becomes clean but most texture, e.g., lines on the cloth, is lost as well. These cannot be preserved by the explicit-implicit hybrid scheme either. and multiscale in terms of pattern composition [29, 95]. Texture is univer- sal in images and essential for making them real and authentic. Therefore proper texture modeling has remained for a long while a challenging but fundamental task in image processing and analysis. A comprehensive dis- cussion on texture goes beyond the focus of this thesis, and in what follows we only briefly introduce two iterative refinement techniques to recapture texture as much as possible. Let b be a given image that can be expressed by the linear model b = u+ v, (3.24) where u, close to b, is generally formed by homogeneous regions with sharp boundaries, say cartoon component, and v presents noise or small scale repeated details, also referred to as texture. 48 For many images the cartoon component u is close to the image itself, i.e., the texture component in v could be ignored, as in Lena or cameraman examples. Our previously introduced denoising methods work well for such images; see, e.g., Figs. 3.6 and 3.8. However, for images containing a lot of fine-scale visually meaningful details, the algorithm (3.13) does not work satisfactorily; for instance, see results in Figs. 3.9 and 3.10. Since we start from the given data b and then send the regularization parameter β to infinity, the data fidelity term 12‖m− b‖2 is actually ignored in the following denoising process. This may easily lead to over-smoothing in the presence of significant fine-scale details. In order to better preserve image texture, we adapt two refinement techniques into our algorithmic scheme: one is iterative regularization introduced by Osher, Burger, Goldfarb, Xu and Yin [109]; the other is hierarchical decomposition proposed by Tadmor, Nezzar and Vese (TNV) [126], offering an adaptive, multiscale representation for the different image features. To be consistent with the notation used in these two references (also note that u in (3.24) is not m, and v is not just noise ), we turn to consider the following minimization problem min u Ψ(u) ≡ λ 2 ‖u− b‖2 +R(u), λ > 0. (3.25) The necessary Euler-Lagrange equation is λ(b− u) +∇ · ( ∇u |∇u|γ ) = 0, (3.26) where the Huber regularization with automatic γ choice (2.7) is employed and the scale multiplier is λ = 1/β, agreeing with (3.12). Gradient descent methods with SD or LSD step size selection can be applied in exactly the same way to solve this system. 3.3.1 Iterative regularization From (3.24) and (3.26), we have v = b− u = − 1 λ ∇ · ( ∇u |∇u|γ ) , (3.27) and the iterative regularization procedure for image denoising reads: • Initializing: u0 = 0 and v0 = 0; 49 • For k = 0, 1, 2, . . . : compute uk+1 as a minimum of the modified model (3.25) uk+1 = argmin u { λ 2 ‖b+ vk − u‖2 +R(u) } , and update vk+1 = b+ vk − uk+1. Here the basic idea is to add vk, decomposed at kth step and containing both noise and texture, back into the original noisy image b, and the sum is then processed by the next minimization procedure. For such an algorithm, we certainly need a stopping criterion, i.e., for which k do we get uk closest to the true noise-free image mtrue? To answer this, we need the following two statements that have been proved in [109]. • The recovered image uk converges monotonically to the noisy image b, as k →∞. • uk approaches mtrue monotonically in the Bregman distance [21], at least until ‖uk∗ − b‖2 ≤ η2 = ∫ Ω |b−mtrue|2. These two facts yield a natural stopping rule—discrepancy principle— that we briefly introduced in Chapter 2 (or see [49]). In practice, for denois- ing b, a good strategy is to proceed iteratively until the result gets noisier, say until uk+1 is more noisy than uk. This procedure depends on the scale parameter λ as well. If λ is very large, one may approximate the noisy image too well, and the stopping index k∗ may be satisfied already after the first step. It is better to set λ small first, i.e., over-smooth initially to make sure the stopping index k∗ cannot be reached too soon. So we can play the same trick here as we did in the explicit-implicit hybrid scheme. After a quick pre-denoising, namely using 10 - 20 SD or LSD steps, we get an over-smoothed cartoon component ū, as shown in Fig. 3.9(c). Then we may determine λ under the discrepancy principle as λ = 1 β = − ∫ Ω(ū− b)Ru(ū)∫ Ω |ū− b|2 , (3.28) a direct reciprocal of (3.22). However, note that in this refinement procedure we still start from the smoothest u0 = 0 instead of the pre-denoised ū. Results in Fig. 3.11 show that the refined images {uk, k = 1, 2, . . .} grad- ually capture more and more texture until the stopping index is reached at 50 (a) u1 (b) u2 (c) u3 (d) u4 (e) u5 ... (f) u9 (g) 128 + v1 (h) 128 + v2 (i) 128 + v3 (j) 128 + v4 (k) 128+v5 ... (l) 128 + v9 Figure 3.11: Gradient descent denoising with iterative regularization: the best result is obtained at the 4th step with λ = 1.92 automatically com- puted by (3.28). Too much noise returns in succeeding u5, u6, ..., u9 that are getting closer and closer to the given noisy image in Fig. 3.9(b). The corresponding misfits between the given data and restored solutions in (a - f) are 38.21, 31.54, 20.42, 11.30, 9.97 and 1.19, respectively. k∗ = 4. Then, besides texture, more and more noise is recaptured into uk, the recovered models eventually converging to the given data b. To make the small values of v more visible, we added a constant 128 to their values in Figs. 3.11 and 3.12. 3.3.2 Hierarchical decomposition Another iterative refinement procedure—TNV [126]—is also based on the model (3.25) and also generates a sequence {uk, k = 0, 1, 2, . . .} that con- verges to the given image b. The starting point is still a variational decomposition of the given image b = u0 + v0, where u0 is the minimum of Ψ(u, λ) and λ is a fixed threshold that dictates separation of scales. So the initial u0 should capture cartoonish components of b, which are to be separate from smaller scale components, say texture, absorbed by v0. Clearly, the distinction between these two kinds of components is scale-dependent. Whatever gets interpreted as texture at a given scale λ, instead, consists of significant cartoon components when it is viewed under a refined scale, e.g., 2λ, u2 = argmin Ψ(v1, 2λ). (3.29) 51 (a) u0 (b) P1 k=0 u k (c) P2 k=0 u k (d) P3 k=0 u k ... (e) P6 k=0 u k (f) 128 + v0 (g) 128 + v1 (h) 128 + v2 (i) 128 + v3 ... (j) 128 + v6 Figure 3.12: Reconstructing the noisy image shown in Fig. 3.10(b) by hier- archical decomposition with λ = 0.001: the best result is obtained at the 4th step. The corresponding misfits between the given data and restored solutions in (a - e) are 18.71, 16.25, 13.62, 9.97 and 0.91, respectively. This may give us a better two-scale restored solution u = u1 + u2 with unresolved texture in v2 = v1 − u2 below the scale 1 2λ . If we continue as in (3.29) with a dyadic refinement step, uk+1 = argmin Ψ(vk, 2kλ), vk+1 = vk − uk+1, k = 1, 2, 3 . . . , then after k∗ such steps, the following hierarchical decomposition of b = ( k∗∑ k=0 uk ) + vk ∗ , can be produced. In each hierarchical step, an additional amount of texture is resolved in terms of the refined scaling for data discontinuities. In Fig. 3.12 we can clearly see how this hierarchical decomposition technique recaptures image details at each stage of the reconstruction. After 4 steps, most texture is recovered while noise with smaller scale is removed from the image. If we continue the decomposition to get into smaller scales, then noise will 52 reappear in the u components, as the refined scales reach the same scales of the noise itself. There are three differences from the iterative regularization as described in the previous subsection. • The TNV algorithm generates a hierarchical decomposition, where computing the difference in modified total variation between u and the previous iterate is repeated. • A dyadic sequence of scales, 2kλ, is used to obtain convergence. Al- though there is no automatic way to determine a good initial λ, we can always start from a small, safe one then dyadically increase it to a satisfactory level. • In [109] the authors indicate that their sequence of iterates uk has the property that the Bregman distance [21] between uk and uk ∗ , the final restored solution, decreases until the discrepancy principle is sat- isfied. There is no such result for the TNV procedure. This means that, e.g., for denoising, we do not have a clear mathematical stopping criterion to obtain the best recovered model, where most texture is preserved and most noise is removed. However, in practice, we can still manually stop decomposition when noise becomes visible in the resulting images. Note that here the inner iteration of both iterative regularization and hierarchical decomposition are based on our basic, fast and adaptive scheme (3.13) with step size selections in (3.18). Since in both cases the correspond- ing outer iterations are straightforward, we may easily upgrade the one-scale algorithm (3.13) in the presence of significant image fine-scale texture to im- prove the quality of denoising while maintaining its high efficiency: compare Figs. 3.9 vs. 3.11 and Figs. 3.10 vs. 3.12. In addition, such refinement techniques inspire us to develop a fast, dynamic, multiscale iterative method that can efficiently smooth, but not over-smooth, noisy three-dimensional triangle meshes (see Chapter 4). Due to different data structures, all details of our multiscale algorithm for 3D meshes are not even close to the ones given here. But the basic principle is similar, that is to recapture visually meaningful fine-scale components with- out, or rather, with less noise by gradually, locally adding roughness scales into the admitted data. 53 3.4 Deblurring Algorithms Although the majority of this chapter has dealt with denoising algorithms, our methods apply also to some more challenging and complex deblurring problems. Indeed, our techniques excel in providing good-quality recon- structions at low cost for blurry images. The blurring of an image can be caused by many factors: (i) movement during the image capture process, by the camera or, when long exposure times are used, by the subject; (ii) out-of-focus optics, use of a wide-angle lens, atmospheric turbulence, or a short exposure time, which reduces the number of photons captured; (iii) scattered light distortion in confocal mi- croscopy. Mathematically, such degradation can be described by a linear system of algebraic equations (2.16), where the matrix-vector multiplication Jm represents a discrete model of the distortion operator (2.15) convolved by the point spread function (PSF). In the spatial domain, the PSF describes the degree to which an optical system blurs or spreads a point of light. The PSF is the inverse Fourier transform of the optical transfer function (OTF). In the frequency domain, the OTF describes the response of a linear, position-invariant system to an impulse. The OTF is the Fourier transform of the PSF. The distortion operator creates the distortion by convolving the PSF with the original true image (see Chapter 2 and [29, 135]). Note that distortion caused by a PSF is just one type of distortion and the clear image mtrue generally does not exist in reality. This image represents what you would have if you had perfect image acquisition conditions. Thus, in the following numerical experiments, we suppose we have it and would like to present it in order to compare with our reconstructed models. Based on the basic denoising scheme (3.13), we can easily derive the following explicit deblurring algorithm: m0 = b, (3.30a) mk+1 = mk − τk[JT (Jmk − b) + βRm(mk)], (3.30b) = mk − τkG(mk), where G(mk) = JT (Jmk − b) + βRm(mk). For step size let us consider the two choices that correspond to (3.18) SD: τk = (G(mk))TG(mk) (G(mk))T (JTJ + βL(mk))G(mk) , (3.31a) LSD: τk = (G(mk−1))TG(mk−1) (G(mk−1))T (JTJ + βL(mk−1))G(mk−1) . (3.31b) 54 Strictly speaking, these would be steepest descent and lagged steepest de- scent only if L defined in (3.11) were constant, i.e., using least-squares reg- ularization. But we proceed to freeze L for this purpose anyway, which amounts to a lagged diffusivity approach. We assume that the PSF that determines a distortion is known, so we may generate the blurring matrix J by the same method described in Section 2.5. However, note that in the above explicit algorithm (3.30), we actually only need the multiplication JTJm instead of the constant matrix J . More- over, we know that using a 2D Fast Fourier Transform (FFT) algorithm [37], the computation cost of matrix-vector multiplication can be reduced from O(N2) to O(N logN). We therefore introduce an FFT technique into our deblurring iterations (3.30b); see Chapter 5 of the book [135] for details of this FFT algorithm. After discretizing the integral operator (2.15) in the form of convolution, we have a fully discrete model bi,j = n∑ p=0 n∑ q=0 fi−p,j−qmp,q + i,j , 0 ≤ i, j ≤ n, where i,j denotes random noise at the grid location (i, j). In general, the discrete PSF {fi,j}ni,j=−n is 2D-periodic defined as: fi,j = f(i′, j), whenever i = i′ mod n+ 1, fi,j = f(i, j′), whenever j = j′ mod n+ 1. This suggests we only need consider f ∈ IR(n+1)×(n+1), because by the peri- odic extension we can easily get (n+1, n+1)-periodic array fext, for which fexti,j = fi,j , whenever 0 ≤ i, j ≤ n. Then the FFT algorithm yields fext ∗m = (n+ 1)F−1{F(f) . ∗ F(m)}, where F is the discrete Fourier transform, ∗ denotes convolution product and .∗ denotes component-wise multiplication. This technique can save us a lot of computer memory and CPU time when computing large matrix-vector multiplications. To illustrate and analyze our deblurring algorithm, we generate some degraded data at first. We use the Matlab function fspecial to cre- ate a variety of correlation kernels, i.e., PSFs, and then deliberately blur 55 clear images by convolving them with these different PSFs. The function fspecial(type,parameters) accepts a filter type plus additional mod- ifying parameters particular to the type of filter chosen. For examples, fspecial(‘motion’,len,theta) returns a filter to approximate, once con- volved with an image, the linear motion of a camera by len pixels with an angle of theta degrees in a counterclockwise direction, which there- fore becomes a vector for horizontal and vertical motions (see Fig. 3.13); fspecial(‘log’,hsize,sigma) returns a rotationally symmetric Laplacian of Gaussian filter of size hsize with standard deviation sigma (see Fig. 3.15); fspecial(‘disk’,radius) returns a circular mean filter within the square matrix of side 2radius+1 (see Fig. 3.16); fspecial(‘unsharp’,alpha) re- turns a 3×3 unsharp contrast enhancement filter, which enhances edges and other high frequency components by subtracting a smoothed unsharp version of an image from the original image, and the shape of which is controlled by the parameter alpha (see Fig. 3.17); fspecial(‘gaussian’,hsize,sigma) returns a rotationally symmetric Gaussian low-pass filter of size hsize with standard deviation sigma (see Fig. 3.18); fspecial(‘laplacian’,alpha) returns a 3×3 filter approximating the shape of the two-dimensional Lapla- cian operator and the parameter alpha controls the shape of the Laplacian (see Fig. 3.19). All six images presented below are 256 × 256. For the first four experi- ments, we only add small random noise into blurred images, say 1% (η = 1), and stop deblurring when the relative error norm (3.19) is below 10−4. Af- ter the PSF is determined, the only parameter required is the regularization parameter β. From Fig. 3.13 we can clearly see that the smaller β is, the sharper the restored solution is, including both image and noise. So the boat image reconstructed with β = 10−3 in Fig. 3.13(c) looks still blurry and that cannot be improved by running more iterations. The boat image deblurred with β = 10−5 in Fig. 3.13(e) becomes much clearer; however, unfortunately, such a small β also brings the undesirable effect of noise am- plification. The setting β = 10−4 seems to generate the best approximation of the original scene, and so it does in all following explicit deblurring ex- periments. Note that the value of regularization parameter β has exactly the same influence in the implicit scheme; for instance, compare those three deblurred cameraman images in Figs. 2.7(b) - 2.7(d). Another important issue here is about gradient descent step size selec- tion. As in the case of denoising, the lagged steepest descent (LSD) (3.31b) usually leads to faster convergence. In particular, when β is small, e.g., less 56 (a) True image (b) motion blur (c) β = 10−3, 23 LSD iters (d) β = 10−4, 33 LSD iters (e) β = 10−5, 38 LSD iters Figure 3.13: type = ‘motion’, len = 15, theta = 30, η = 1. 0 5 10 15 20 25 0 0.5 1 1.5 2 (a)  SD − τ    β = 10−3 0 5 10 15 20 25 0 1 2 3 4 (c)  LSD − τ    β = 10−3 0 5 10 15 20 25 30 35 0 10 20 30 (d)  LSD − τ    β = 10−4 0 20 40 60 80 100 1 1.5 2 2.5 3 3.5 (b)  SD − τ    β = 10−4 Figure 3.14: SD step size (3.31a) vs. LSD step size (3.31b) for the deblurring example in Fig. 3.13 under the relative error tolerance of 10−4: (a) and (c) compare SD - τ with LSD - τ for β = 10−3, 25 SD iters vs. 23 LSD iters; (b) and (d) compare them for β = 10−4, 113 SD iters vs. 33 LSD iters. 57 (a) True image (b) log blur (c) 114 SD or 31 LSD iters Figure 3.15: type = ‘log’, hsize = [512, 512], sigma = 0.5, η = 1, β = 10−4. (a) True image (b) disk blur (c) 99 SD or 26 LSD iters Figure 3.16: type = ‘disk’, radius = 5, η = 1, β = 10−4. (a) True image (b) unsharp blur (c) 17 SD or 15 LSD iters Figure 3.17: type = ‘unsharp’, alpha = 0.2, η = 1, β = 10−4. 58 (a) True image (b) gaussian blur (c) Noise amplification (d) Pre-denoised (e) Deblurred (f) Sharpened Figure 3.18: type = ‘gaussian’, hsize = [512, 512], sigma = 1.5, η = 5, β = 10−4. For the relative error tolerance of 10−4, 9 steps (3.13) with LSD (3.18b) are needed as a quick pre-denoising, and 36 steps (3.30) with LSD (3.31b) are required for the next deblurring. 10 sharpening steps with Tukey function (3.23) follow. The total CPU time = 21.7 sec. than 10−4, the step size (3.31a) is very close to the strict steepest descent and so the two-periodic cycle appears in the step sequence that results in a slow convergence; see Fig. 3.14(b). The lagged step size (3.31b) carefully yet severely breaks such cycling pattern (see Fig. 3.14(d)), so as to provide a much faster convergence under the same error tolerance. Moreover, with the same parameter β, the reconstructed images using both SD (3.31a) and LSD (3.31b) are quite comparable. One almost cannot tell any differences between them by eye. For β = 10−5, 335 SD steps are required to reach the result presented in Fig. 3.13(e) and the corresponding CPU time is 441.6 sec. Using LSD we only need 48.7 sec, since the CPU time is roughly proportional to the number of steps required. Figs. 3.15 - 3.17 reinforce that the explicit deblurring algorithm (3.30) with LSD step selection (3.31b) works very well and the improvement of 59 (a) True image (b) laplacian blur (c) Noise amplification (d) Pre-denoised (e) Deblurred (f) Sharpened Figure 3.19: type = ‘laplacian’, alpha = 0.2, η = 10, β = 10−4. For the relative error tolerance of 10−4, 10 denoising steps (3.13) with LSD (3.18b) and 27 deblurring steps (3.30) with LSD (3.31b) are required, respectively. Then 10 sharpening steps with Tukey function (3.23) are applied. The total CPU time = 15.4 sec. LSD over SD here is even more significant than in the case of denoising, especially when a small β has to be chosen or accuracy requires more than 20 or so steepest descent steps. When we add more noise into blurred images, say η ≥ 5, directly running the deblurring scheme (3.30) may fail due to the more severe effect of noise amplification; see, e.g., Figs. 3.18(c) and 3.19(c). Noise amplification is a common problem of the deblurring methods that attempt to fit defective data as closely as possible. After a few iterations, the restored image can have a speckled appearance, especially for a smooth object observed at low signal-to-noise ratios. These speckles do not represent any real structure in the image, but are artifacts of fitting the noise in the given image too closely. It can be reduced by increasing the value of β, but this may result 60 in a still blurry image, e.g., in Fig. 3.13(c). Since the effects of noise and blur are opposite, as we mentioned at the end of Chapter 2, a quick and to some extent effective remedy is splitting. Firstly, we only employ denoising under a coarser tolerance, e.g., 10−4, that can be done in just a few steps (3.13) with either SD or LSD step size selection. Secondly, apply the deblurring algorithm (3.30) with LSD step size choice and default setting β = 10−4 to the pre-denoised data, as, e.g., in Figs. 3.18(d) and 3.19(d), to correct SPF distortion. Obviously, even though some speckles still appear on the image cartoon components, the results pre- sented in Figs. 3.18(e) and 3.19(e) are much more acceptable than those in Figs. 3.18(c) and 3.19(c) deblurred by the same number of iterations without pre-denoising. Finally, we can use the Tukey regularization to further im- prove our reconstruction, carefully yet rapidly removing unsuitable speckles and enhancing the contrast. The results are demonstrated in Figs. 3.18(f) and 3.19(f). Such a post-processing is referred to as sharpening [118]. The total CPU time, given in the caption of Figs. 3.18 and 3.19, clearly shows the efficiency of this hybrid deblurring scheme. Besides common Gaussian noise the quantization of a digital image, i.e., approximating an infinite precision real world sample measurement by a finite precision one so that it can be represented in computer memory, leads to a loss in the signal quality and thus introduces quantization error. All image values shown in this chapter except for Fig. 3.20 are represented as 8-bit numbers. This means that there are 256 possible levels for each pixel. Thus, the resolution in the amplitude is finite. This gives rise to the following question. Suppose that a blurry image is quantized to lower levels, e.g., 64, 32, 16, which in some sense can be viewed as introducing more and more noise: does our deblurring algorithm still work well? To investigate the effect of image quantization, we need an easy proce- dure to manually control this quantization. Let bq = ⌊ b− bmin bmax − bmin × (2 q − 1) ⌋ × bmax − bmin 2q − 1 + bmin, where b is the given blurry image with all values in the range [bmin, bmax] and q is the number of bits used to represent the sample data. The data range is divided into 2q equally spaced levels, so this procedure is called uniform quantization. The equation above first translates the data to the [0, 1] range; then the result is scaled up to the [0, 2q − 1] range; subsequently the values are rounded down to the nearest integer (there are 2q integers in that range); 61 (a) 6-bit (64 levels) (b) 5-bit (32 levels) (c) 4-bit (16 levels) (d) Deblurred from (a) (e) Deblurred from (b) (f) Deblurred from (c) Figure 3.20: Reconstructing blurred and quantized images: (a - c) the image in Fig. 3.16(b) is uniformly quantized from 8-bit to 6-bit, 5-bit and 4-bit; (d) restored image from (a) using β = 10−4, 109 SD vs. 26 LSD iters; (e) restored image from (b) using β = 5 × 10−4, 38 SD vs. 24 LSD iters; (f) restored image from (c) using β = 10−3, 29 SD vs. 22 LSD iters. and finally everything is translated back to the [bmin, bmax] range. For the gray scale images considered in this thesis, [bmin, bmax] = [0, 255]. Taking the hill image in Fig. 3.16(a) as our working example, Figs. 3.20(a) – 3.20(c) show the quantized results of that in Fig. 3.16(b) from 8-bit to 6- bit, 5-bit and 4-bit, respectively. We can clearly see that for the 6-bit data, using the same value β = 10−4 for the regularization parameter, the recon- structed result in Fig. 3.20(d) is quite comparable with the one in Fig. 3.16(c) recovered from the 8-bit data. Again, the LSD step selection offers an effi- ciency improvement over SD by a factor of roughly 4. For the 5-bit and 4-bit data, since more information is lost, which is equivalent to the noise level becoming higher, we intuitively slightly increase the value of β. The cor- responding reconstructed images are presented in Figs. 3.20(e) and 3.20(f). The quality degradation is gradual. Here, the efficiency improvement caused 62 by replacing SD with the LSD step selection is not so dominating, because not too many SD steps are required due to the larger β values used. Even though the quality of the reconstruction from the low-bit data, e.g., 4-bit, is not very impressive, Fig. 3.20 demonstrates the robustness of our deblurring algorithm. Similar results have also been obtained in our experiments on other images, such as the boat, house and satellite. In Chapters 2 and 3, we have discussed implicit and explicit methods for 2D image reconstruction. In the following Chapters 4 and 5, we will embark upon a related yet different type of problem, namely, surface meshes in 3D. Many image processing strategies could be extended for this new setting, but such lifting is usually far from trivial. 63 Chapter 4 3D Surface Smoothing with Intrinsic Texture Preservation The problem of denoising, or smoothing 3D meshes has received a lot of attention in recent years due to the increasing use of 3D scanning and ac- quisition technology. Meshes supplied by laser scanning devices often carry high-frequency noise in the position of the vertices, so a mesh smoothing al- gorithm is required to rapidly remove noise while preserving original features in the acquired data. In contrast to two-dimensional images, vertex-points {xi}Ni=1 ⊂ IR3 are connected by edges to form a set of non-overlapping tri- angles used to represent a 3D body. Such data structure may give us more information for successful reconstructions, and yet introduces a variety of additional technical difficulties. 4.1 Surface Meshes with Intrinsic Texture Several algorithms for smoothing 3D meshes were proposed in the 1990s in- volving geometric diffusion [62, 128]. Later it was recognized that smoothing based on isotropic diffusion inherently smears out important features such as edges (creases), which correspond to height discontinuities in image process- ing, so methods based on anisotropic diffusion were introduced for various purposes such as fairing height fields and bivariate data, and smoothing surfaces and level sets [9, 10, 35, 36, 42, 70, 111, 127, 129]. Irregularity in mesh sampling introduces additional difficulties [41, 58, 84] that occasionally distort results and significantly slow algorithms down. The most recent of the above methods reconstruct large sharp edges and corners of surfaces particularly well; see especially [70]. Also, the more elaborate surface representation methods of [54, 104], which are designed for other purposes, generally can handle sharp features very well. These methods, however, typically come with a relatively high price tag, both 64 (a) (b) (c) Figure 4.1: Comparing with non-iterative bilateral filtering: (a) noisy dragon head model (100,000 Vertices); (b) smoothed model presented in Jones, Durand and Desbrun [77] (80 sec on a 1.4Ghz Athlon); (c) model smoothed by our multiscale algorithm, ξ = 1/2, 4 iters, 10 sec. in terms of cost per iteration and in the number of iterations required to achieve satisfactory results. Moreover, they often require the user to provide unintuitive parameter values, including a threshold value for the anisotropic diffusion process. Such a value may highly depend on the given model, with different threshold values possibly yielding very different fairing results. Thus, an automatic selection formula that can adaptively determine the threshold value for different meshes is desirable. Another approach—bilateral filtering (BF)—has given rise to methods that more rapidly yield results of a quality similar to anisotropic diffusion, albeit with less theoretical justification [55, 77, 131, 144]. These methods usually require very few, cheap iterations, and can cost but a tiny fraction of the computational effort required to carry out an elaborate anisotropic diffusion process. In fact it can be argued that an algorithm that terminates after what amounts to 2 or 3 forward Euler time steps does not capture the nonlinear dynamics of a complex diffusion process, i.e., that methods based on accurately capturing such dynamics are inherently slower. However, the results generated by these bilateral filtering variants may strongly depend on the range of a vertex neighborhood and the manner in which tangent planes are approximated. It is not obvious, and not clearly specified in the references, how to adaptively and automatically determine these important algorithm components for different models. Moreover, none of these algorithms perform very well for problems with significant fine-scale features. For instance, the hair and scales of the famous Stanford dragon typically disappear when applying a fast bilateral filtering type method; see Figs. 4.1(b), 4.7(d) and 4.7(e) as well as the relevant examples in [77, 127, 144]. We refer to such visually meaningful fine-scale components or details as intrinsic texture. Such texture, which may appear 65 (a) (b) Figure 4.2: Rapid unknown noise removal while preserving intrinsic tex- ture: (a) scanned angel model (25,000 Vertices); (b) model smoothed by our multiscale scheme, ξ = 1/2, 3 iters, 1.8 sec. either regularly or irregularly, is harder to deal with because its separation from noise depends on the local mesh resolution. (A different notion of texture is considered in [36].) And yet, this is where the simpler and faster methods often are most visibly unsatisfactory. The goal of this chapter is then to design a fast, robust denoising method that performs well also in the presence of significant intrinsic texture. A basic anisotropic Laplacian (AL) iteration is developed first that costs es- sentially half that of the basic bilateral filtering iteration and requires no user-specified parameters. Then, since intrinsic texture crucially depends on the local scale on which it noticeably varies, an adaptive procedure is developed in which the AL operator is repeatedly applied while the local fidelity to the data is gradually increased. This principle is inspired by, yet is distinct from the image hierarchical decomposition [126] we introduced in Chapter 3. The resulting multiscale anisotropic Laplacian (MSAL) al- gorithm produces significantly better results than AL or bilateral filtering type methods at a fraction of the cost of sophisticated methods such as [36, 70, 104]. See Figs. 4.1, 4.2, 4.6, 4.7 and 4.8, where important fine-scale features are well-preserved. This scheme is also advantageous in cases where an approximated vertex normal near a boundary erroneously points to an averaged direction, which may lead to a rounded edge or to a significant volume shrinkage. All our final results are achieved using very few mesh sweeps of the simplest and most directly parallelizable sort. Thus, the entire cost of the linear algebra portion of our algorithm is lower than, for instance, one V- cycle in a multigrid preconditioner for solving a linear system arising when 66 carrying out just one time step of an implicit scheme such as used in [35, 36, 70]. 4.2 Geometric Laplacian Operators 4.2.1 Continuous and discrete diffusion models To distinguish the two components of smoothing and data fidelity, we first revisit anisotropic geometric diffusion. Starting from a continuous model, one seeks a one-parameter family of embedded manifolds {M(t)0≤t≤T } in IR3 and corresponding parameterizations x(t), such that for 0 < t ≤ T ∂tx(t)− divM(G ∇M x(t)) = λ(x(0)− x(t)), (4.1) M(x(0)) = M0. Here M0 is the given, noisy data and G is usually a diffusion tensor act- ing on the tangent space of M to enhance large sharp edges. A detailed discussion on choosing this diffusion tensor for surface meshes can be found in [10, 35, 36]. For isotropic diffusion on surfaces, where G is an iden- tity, the counterpart of the Laplacian is the Laplace–Beltrami operator ∆M = divM ∇M, and −∆Mx = 2κ(x)n(x), where κ is the mean curvature and n is the normal direction; see [25, 35] for details. The final integration time T controls the amount of diffusion administered. The nonnegative pa- rameter function λ(x(t)) controls data fidelity. This equation can be viewed as describing a composition of mean curvature normal motion caused by κ(x), smoothing out noise, and a retrieving force toward the original surface caused by (x(0)− x(t)), recapturing fine-scale texture. The magnitude of λ determines the weights in this composition [137]. Next, a manifoldM is discretized by a triangular surface mesh S with its set of vertices V (S) = x = {xi; i = 1, . . . , N} and set of directed edges E(S). If two distinct vertices xi and xj are linked by an edge ei,j = xj−xi then we denote j ∈ N (i) and the edge length li,j = ‖ei,j‖. The given data is a noisy mesh of this sort, and we denote its vertices x(0) = v = {vi; i = 1, . . . , N}. For all methods considered in this chapter an iteration can be written as updating each vertex i by xi ←− xi + τ∆xi + λi(vi − xi), (4.2) in obvious analogy to (4.1). The first iteration starts with the given data. Unless otherwise noted we set τ = 1. Moreover, for all calculations of this section we set λ ≡ 0, the model thus becoming comparable to [42, 55, 77, 67 113]. The rest of this section describes various possibilities for defining ∆xi in (4.2). Note that in general ∆xi is not intended to be simply an approximation to a Laplacian, but we find this notation convenient nonetheless. Moreover, it is important to realize that (4.1) is artificial and must be regarded as pro- viding a guidance, rather than describing a true continuous physical process for denoising, unlike, e.g., Navier–Stokes equations that are indeed widely believed to describe physical fluid flow. The iteration (4.2) on the surface mesh S does not necessarily derive its validity as an approximation of (4.1); see also [7]. 4.2.2 Time discretization and discrete isotropic Laplacian Recall next three discrete isotropic Laplacian operators. The first is the umbrella operator [128] ∆xi = 1 wi ∑ j∈N (i) ei,j , wi = |N (i)|, (4.3) where wi is the cardinal number of one-ring neighbor set N (i). This is a linear form implying the assumption that all neighboring edge lengths are equal to one. Hence it can serve as an effective smoother if the targeted mesh is close enough to being regular. However, when the model has differ- ent discretization rates, significant local distortions are introduced by this umbrella operator, and a better choice is the scale-dependent version [58] ∆xi = 1 wi ∑ j∈N (i) ei,j li,j , wi = ∑ j∈N (i) li,j , (4.4) which tends to keep the lengths of the edges. Still, this weighting scheme does not solve problems arising from unequal face angles. Based on a better approximation to the mean curvature nor- mal, a third scheme was proposed in [41, 96] that does not produce vertex tangential drifting when surfaces are relatively flat and compensates both for unequal edge lengths and for unequal face angles. All three schemes are based on isotropic diffusion, though, and may easily smear sharp features. The basic iterative process of (4.2) is related to (4.1) using the forward Euler method in artificial time. This, however, may work much less effec- tively on irregular meshes, requiring significantly more computational effort because of the larger number of time steps (or iterations) as well as addi- tional cost per iteration for the mean curvature method. One could think of 68 implicit time integration methods, such as backward Euler, that do not have similar stability restrictions on their time step. However, our multiscale al- gorithm below requires such a small amount of computational work as to be much faster than any truly implicit method, and its stability restriction does not give rise to stiffness issues. Moreover, the time step that the finer part of the mesh mandates when using forward Euler may not necessarily produce an effective smoother for coarser parts of the mesh when using geometric diffusion, even though there is no stability problem. Thus, it is the discrete dynamics (see also Chapter 3), rather than the continuous dynamics corresponding to (4.1), that counts [7]. For geometric diffusion the time step should ideally be adjusted locally, corresponding to local edge lengths, but that is hard to do economically. Alternatively, it is possible to view the forward Euler discretization as a steepest descent method for the minimization of x∗ = argmin x { λ 2 ‖x− x(0)‖2 + ∫ M |∇Mx| } , in suitable norms. The time step is determined by a line search that somehow averages contributions of different mean curvatures, and the iteration is better conditioned hence more effective the closer the mesh is to uniformity. 4.2.3 Discrete anisotropic Laplacian To better reconstruct sharp features, we consider developing an anisotropic Laplacian (AL) operator on a surface mesh S, where updates are made at each vertex in the direction of its normal. This is similar to bilateral filtering (BF) and different from (4.3), (4.4) and the schemes of [70, 129]. Our idea behind the AL operator is simple. A given normal ni at a vertex xi defines a tangent plane at this vertex. Restricting the update ∆xi to be in the direction of ni, we can view it as updating the height intensity of a 2D image defined on this tangent plane, i.e., in local coordinates, where the offsets of the neighboring vertices, denoted by {hi,j = eTi,jni, j ∈ N (i)}, act as image intensities at pixels. Let us consider one vertex xi with its one-ring neighborhood first. Thus, we define ∆xi = 1 Ci  ∑ j∈N (i) g(hi,j)hi,j ni, where Ci = ∑ j∈N (i) g(hi,j) is a normalization factor and g is the feature indicator. In determining g we then follow image processing sources, specif- ically [19, 118]. 69 Several robust filter choices, e.g., Huber, Lorentz, Gaussian and Tukey, have been discussed and compared in [19, 46, 118] for images (see also Chap- ter 3). These studies conclude that Gaussian or Tukey filters are more robust in the presence of outliers, hence better preserve important features. Ac- cording to our numerical experiments, this conclusion holds also for surface meshes. So, we employ the Gaussian filter g(hi,j) = exp(− h2i,j 2σ2i ), chosen for its robustness, stability and simplicity. This filter clearly reduces the influence of neighbors that contain large discontinuities in normal space during the smoothing process. The point from which neighboring vertices are treated as outliers depends on the parameter σi. As mentioned in Chapter 2, the image processing literature [19, 118] uses some tools from robust statistics to automatically estimate this robust scale σi as the median absolute deviation (e.g., (2.9)) of the given image intensity gradient. The main idea is that σi should characterize the variance of the majority of data within a region, so outliers would then be determined relative to this background variation. However, our numerical results show that σi determined by the median absolute deviation is occasionally too small to allow enough smoothing on large flat regions. On the other hand, σi determined by the simple mean (average) can be too large to preserve important surface features. Combining these two observations in some sense, a better setting is yielded by the mean absolute deviation σi = 2 · mean(abs(hi − mean(hi))), (4.5) where hi = (hi,j . . . , j ∈ N (i)) and the leading constant is derived from the fact that the inverse of mean absolute deviation of a zero-mean normal distribution with unit variance is near 2. Thus, our discrete anisotropic Laplacian (AL) operator is finally defined and can be rewritten as ∆xi = 1 Ci  ∑ j∈N (i) g(hi,j)nTi ei,j ni = ∑j∈N (i) exp(− |hi,j |22σ2i )hi,j∑ j∈N (i) exp(− |hi,j | 2 2σ2i )  ni, (4.6) with σi given by (4.5). The normalization scales the operator such that a step size τ = 1 can be stably used in the explicit method (4.2). From 70 (a) (b) Figure 4.3: Fast denoising without losing significant features: (a) scanned lady face model (41,000 Vertices) with unknown noise; (b) model smoothed by AL, 3 iters, 3.0 sec. Figs. 4.3 and 4.4, it is clear that the above scheme is able to rapidly re- move noise while preserving significant surface features. To connect, at least formally, between the AL operator and the anisotropic diffusion oper- ator divM(G ∇M x), we can write (cf. [129]) ∆xi = ∑ j∈N (i) Wi,jei,j , Wi,j = 1 Ci g(hi,j)ninTi , where the edge weights Wi,j are symmetric, semi-definite, 3 × 3, rank-1 matrices. However, the validity of the AL operator is not derived from (4.1) directly. The method obtained by employing the AL (4.6) in (4.2) with τ = 1 may be viewed as a variant of BF where the domain filter used to measure closeness there is defined as the identity here. We refer to the resulting BF variant as our BF. Straightforward operation count shows that our BF iteration is basically twice as expensive as AL. Our BF, in turn, is a variant of the BF method presented in [55]. In numerical experiments carried out for all examples reported in this chapter, our BF does not generate signif- icantly better results than AL. In order to pursue a better reconstruction by using BF, one has to set an appropriate neighborhood radius ς and then get the neighboring set of vertices, i.e., N (i) is defined by the set of points xj that satisfy ‖xj − xi‖ < ς. So the included neighbors in [55] depend on 71 (a) (b) (c) (d) (e) Figure 4.4: Removing heavy Gaussian noise from a corrupted model: (a) original Max–Planck model (50,000 Vertices); (b) model corrupted by about 20% noise (1/5 of the mean edge length) in the normal direction; (c) model smoothed by explicit mean curvature flow, 20 iters, 128 sec; (d) model smoothed by AL without normal mollification; (e) model smoothed by AL with normal mollification, 4 iters, 5.1 sec. Euclidean distance, approximating geodesic distance, rather than on con- nectivity. Unfortunately, the questions of how to automatically choose ς for different models and how to better approximate geodesic distances be- tween vertices are far from clearly resolved and may easily result in heavy computation, even if there is no problem with recapturing intrinsic texture. The description of the AL operator is not complete until we specify how to approximate the normal ni at vertex xi. Since we want to work on a smooth but not over-smooth vertex normal field, we estimate the vertex normal ni as the average of the adjacent face normals, leaving it unchanged during the fairing process ni = normalize  ∑ j∈N (i) ei,j × ei,j+1 |ei,j × ei,j+1|  . Another possible choice is to weigh adjacent face normals by their face ar- eas, but there has been no visible improvement observed in our numerical experiments upon introducing this. More importantly, we must be aware that very noisy given data may lead to an unstable computation of initial vertex normals, since the nor- mals are first-order properties of the mesh. Thus, for extremely noisy data, such as the Max–Planck model in Fig. 4.4, we apply a mollification step: 72 employing the normalized average of neighboring vertex normals as ini- tial n = {ni; i = 1, . . . , N}, i.e., running the umbrella operator on the normal field. This guarantees enough smoothing over relatively large non- feature surfaces and requires much less computation than using the two-ring or a larger neighborhood as suggested in [55], comparing Fig. 4.4(d) with Fig. 4.4(e). For common scanned models without very heavy noise, such mollification turns out not to be necessary. 4.3 Handling Intrinsic Texture Although AL is very efficient for smoothing some models as demonstrated in Figs. 4.3 and 4.4, it cannot avoid over-smoothing in the presence of essential intrinsic texture, especially when using λ = 0 in (4.2). The same holds for various BF variants; see Figs. 4.1(b), 4.7(e) and 4.7(f). A simple first step toward a better solution is then to use λ > 0, thus increasing the influence of the given data over all iterates during the denois- ing process [137]. This also helps to reduce the effect of volume shrinkage over several iterations [41, 128] so as to be essentially unnoticeable in the method developed below. Rewriting (4.2) as xi ←− (1− λi)xi + τ∆xi + λivi, we may view ∆xi as the smoothing contribution of the current iteration. Ac- cording to standard differential equation analysis we must require 0 < λi < 2 for stability. However, here we require non-negativity as well, which means 0 < λi ≤ 1 at all vertices. This ensures that the smoothing contributions are accumulated monotonically, preventing undesirable mesh deformation. Moreover, when τ is relatively smaller, the influence of the mesh smoothing modification gets smaller too, hence finer scale components are recaptured from the given mesh. Thus, we keep 0 < λ ≤ 1 and reduce τ gradually by setting at the kth iteration τ = ξk, (4.7) where ξ is a scalar parameter varying in (0, 1), and k here means the kth power. To see how the iteration progresses, let us denote x = x(k) at the kth 73 iteration, with x(0) = v. We then unroll the recursion and obtain x(k+1)i − vi = (1− λ(k)i )(x(k)i − vi) + ξk∆x(k)i = (1− λ(k)i ) [ (1− λ(k−1)i )(x(k−1)i − vi) + ξk−1∆x(k−1)i ] + ξk∆x(k)i = k∑ p=0 c (k−p) i ξ k−p∆x(k−p)i , where cki = 1, p = 0 and c (k−p) i = ∏p−1 q=0(1 − λ(k−q)i ), p > 0. Note that with 0 < λ ≤ 1 we have nonnegative weights c(k−p) for the smoothing contributions. Moreover, these weights shrink as p increases. Furthermore, for ξ < 1 the later smoothing contributions weigh less than the earlier ones in the sum, and more strongly so the smaller ξ is. Our default value is ξ = 1/2, and this often works well, as evident in the numerical examples. Next we need determine the mesh function λ. In image processing, it can be estimated if some statistical information on the noise is known, especially if we keep λ constant on the whole domain, e.g., (3.28) in Chapter 3. On a surface, since noise is carried in the position of vertices and changes all the neighboring height intensities hi,j , we expect that λi should depend on this noise effect. In other words, we want to apply enough smoothing over domains where the surface varies little on the scale of the mesh while preserving more original information over surfaces with fine-scale features. Fortunately, we already have a robust estimator on local mesh features at hand, namely, the Gaussian parameter σi adaptively determined by (4.5). Recall that σi is larger over surface patches containing intrinsic features and smaller over non-feature regions, due to the dependence on local height intensity. This naturally gives rise to the formula λi = σi/σ̄, where σ̄ = max{σi; i = 1, . . . , N}, (4.8) and all quantities are evaluated at the current mesh k. Fig. 4.5(a) illustrates the magnitude of λ at each vertex with k = 0. Note how far from being constant the function λ is. In Fig. 4.5(b), we apply darker color the larger the value of λi is. This figure clearly shows that λ specified by (4.8) does roughly indicate the feature properties of the mesh surface. The combination of the normalized formula for λ and the geometric de- crease of τ yields a multiscale algorithm. In fact, the iteration (4.2) with (4.7) can be viewed as a forward Euler discretization of the ordinary differ- ential equation dx dt = ξt∆x+ λ(v − x), t ≥ 0, 74  (a)  (b) Figure 4.5: Illustrating the function λ at each vertex of the monk model (see also Fig. 4.9): (a) magnitude of λi at vertex xi in the first smoothing run; (b) the values of λ specify the colors of the vertices as RGB values. applied at each vertex i starting from t = 0 with step size equal to 1. Thus, at the start of the kth iteration, t = k. Clearly, as t → ∞ the process converges at steady state to the original data. More importantly, the same holds also for the discrete dynamics. In practice of course we stay well away from this limit by taking only a few iterations. Moreover, an attempt to take a step with large τ , applying say a backward Euler discretization to this differential equation, would reintroduce the noise without significant smoothing, thus missing the goal. Hence, our gradual multiscale approach is necessary. 4.4 Implementation, Numerical Results and Discussion 4.4.1 Implementation details The complete multiscale anisotropic Laplacian process is given in the MSAL algorithm. We have implemented it in a set ofMatlab routines. Run times reported in Figs. 4.1(b) and 4.8(b) are from the corresponding references. The instructions following the nested loops in the MSAL algorithm in- volve array operations, i = 1, . . . , N , and we follow Matlab convention in distinguishing element-wise array products from more general loops. The inner iteration is based on our AL operator defined by (4.2) and (4.6), al- though a BF variant could easily be employed instead. Clearly, the number 75 Multiscale Anisotropic Laplacian (MSAL) Algorithm: Data: vertices {vi; i = 1, . . . , N} and normals {ni; i = 1, . . . , N}; Initializing: x = v, neighborhood {xj , j ∈ N (i)}, wi = |N (i)|; Input parameters: Niter and ξ; for k = 0 : Niter − 1 do for i = 1 : N do for j = 1 : wi do hi,j = (xj − xi) · ni; end end Compute array σ by Equation (4.5); Compute array λ by Equation (4.8); gj = exp(−h2j ./(2σ2)); d = ∑ j∈N (i) (gj . ∗ hj)./ ∑ j∈N (i) gj ; x = x+ ξk(d. ∗ n) + λ. ∗ (v − x); end of floating point operations needed for one AL iteration is O(wN), where w = max{wi, i = 1, . . . , N} and w  N . There are two input parameters in the MSAL algorithm. As mentioned earlier, after a few iterations the relative importance of later smoothing contributions decreases, more aggressively the smaller ξ is. So the scale of ξ should be tuned to be small enough to capture intrinsic texture and large enough not to recapture noise too quickly. In practice the default value ξ = 1/2 often works very well. See Fig. 4.9 below for more on this issue. Regarding the choice of the total number of iterations Niter, the figures displayed generally represent our best results using the eye norm and occa- sionally knowing the ground truth, i.e., the true model. See Fig. 4.6 below for more on this issue. This approach is of course common; however, it is important to note that the sensitivity to the choice, as for the choice of ξ, is low here. For most cases we have found that Niter ≤ 4 is adequate, and Niter = 4 is a stable default value. If we allow Niter to get larger then the results generated by MSAL get noisier, as explained in the previous section. We emphasize that our algorithm is completely specified and easily pro- grammable. All results presented are reproducible by readers in a straight- forward manner. 76 4.4.2 Comparative results Comparing results to those of others in the literature is a delicate task, not only because this involves different platforms and programming levels but also because typically not all algorithm details and relevant data are available. Here, fortunately, in addition to the fact that it only requires two stable input parameters ξ and Niter, the computational complexity of AL is obviously significantly lower than that of leading anisotropic diffusion schemes [9, 10, 36, 70] in terms of computation (assembly) at each node at each iteration, and in either the linear algebra cost per iteration or the number of iterations required. Comparing against the BF alternatives, a basic AL step costs roughly half that of a BF method if the same neigh- borhood range is defined, and less more generally. Moreover, the multiscale scheme built on AL does not result in any significant extra cost. Hence we are essentially left to demonstrate the quality improvements obtained by the introduction of MSAL. In the first sets of results presented in Section 4.1, Fig. 4.1 compares our method with the non-iterative bilateral filtering method proposed in [77]. Clearly, MSAL recaptures significantly more intrinsic texture with less computational effort. Fig. 4.2 shows that three multiscale steps can preserve most fine-scale features such as the eyelids and the grids on the angel’s wings, while sufficiently smoothing flat regions such as the face and arms. The second set of results displayed in Figs. 4.3 and 4.4 shows how powerful and efficient our AL operator is for models that do not possess much noticeable fine-scale features. In Fig. 4.6, we investigate and compare denoising results of AL and MSAL step by step. Observe that AL is able to quickly remove noise, but it also blurs visually distinguished intrinsic texture on the bunny’s fur. MSAL, on the other hand, is capable of recapturing fine scales and simultaneously maintaining necessary smoothing with almost the same computational cost, comparing Fig. 4.6(d) with Fig. 4.6(g). Fig. 4.7 compares MSAL against bilateral filtering (BF), both for our BF and as in [55]. Obviously, BF does not produce better results than AL for models with intrinsic texture, while MSAL yields significantly better results than any of the three variants without a data fidelity term. Fig. 4.8 compares smoothing results for a fine mesh Igea model by non- linear anisotropic diffusion in [70] and by MSAL. Observe the better details of face and hair of the model in Fig. 4.8(c) obtained using much fewer itera- tions by our multiscale algorithm. We hasten to say, however, that compar- ing MSAL to anisotropic diffusion is trickier than comparing BF to AL and 77 (a) (b) (c) (d) (e) (f) (g) (h) Figure 4.6: Comparing AL with MSAL (each run 3 iters, AL - 2.3 sec, MSAL - 2.6 sec): (a) corrupted bunny model (35,000 Vertices); (b) smoothed model after one iteration of AL; (c) smoothed model after two iterations of AL; (d) smoothed model after three iterations of AL; (e) smoothed model after one iteration of MSAL with ξ = 1/2; (f) smoothed model after two iterations of MSAL; (g) smoothed model after three iterations of MSAL; (h) true bunny model with fine scales. 78 (a) (b) (c) (d) (e) (f) Figure 4.7: Comparing with bilateral filtering: (a) true dragon model (50,000 Vertices); (b) noisy model corrupted by about 10% random tangential and normal noise; (c) model smoothed by the multiscale algorithm MSAL, ξ = 1/5, 3 iters, 3.8 sec; (d) model smoothed by AL where intrinsic texture has been removed, 3 iters, 3.4 sec; (e) model smoothed by our version bilateral filtering with one-ring neighborhood and similarity σs determined by the standard deviation of local height intensities, 3 iters, 5.9 sec; (f) model smoothed by bilateral filtering in Fleishman, Drori and Cohen-Or [55] with smooth region radius ς = 8.8 and closeness σc = 4.4 selected by trial and error and similarity σs determined automatically, 3 iters, 8.4 sec. 79 (a) (b) (c) Figure 4.8: Comparing with nonlinear anisotropic surface diffusion: (a) fine igea model (135,000 vertices) with intrinsic texture corrupted by noise; (b) model smoothed by the nonlinear anisotropic diffusion method in Hilde- brandt and Polthier [70], 25 iters, 33 secs on the 1.7 GHz Centrino laptop (result courtesy of Hilderbrandt); (c) model smoothed by MSAL, ξ = 1/2, 4 iters, 13 secs. MSAL. On one hand our algorithm in this chapter is not capable of repro- ducing the results displayed in [70] for CAD-like models on coarse meshes, and on the other hand the choice of methods and parameters within an anisotropic diffusion family required to produce a result such as Fig. 4.8(b) is far more complex and delicate than what is required to produce or repro- duce Fig. 4.8(c). Different applications may well yield different requirements for retaining intrinsic texture. By simply adjusting the scalar ξ we can get different flavors. For example, in Fig. 4.9 one may want to keep wrinkles or other details on the monk’s face for a more natural look. For this, we simply decrease ξ; see, e.g., the models in Figs. 4.9(b) and 4.9(c). But caution must be exercised because too small a ξ may bring back unwanted noise too quickly. On the other hand, a larger value of ξ yields a smoother model, as in Fig. 4.9(d). In brief, under the same total number of MSAL iterations we can very easily get different results suited for different goals. 4.4.3 Handling mesh irregularity and volume shrinkage Since our discrete anisotropic Laplacian operator (4.6) is not scaled by the local Voronoi area [41, 96], during the flow the velocity of vertices strongly 80 (a) (b) (c) (d) (e) (f) (g) Figure 4.9: Controlling retained amount of intrinsic texture using MSAL (each run 4 iters, 1.8 sec): (a) scanned monk model (19,000 Vertices) containing unknown noise; (b) model smoothed with ξ = 1/4; (c) model smoothed with ξ = 1/2; (d) model smoothed with ξ = 1; subfigures (e)-(g) are mean curvature visualizations of (b)-(d), respectively. (a) (b) (c) (d) (e) (f) Figure 4.10: Scaling for mesh sampling irregularity: (a) original head mesh (2,500 Vertices) contains different discretization rates; (b) mesh corrupted by 10% Gaussian noise; (c) mesh smoothed by AL, τ = 1, 3 iters , 0.29 sec; (d) mesh smoothed by AL with lumped mass matrix scaling, τ = 0.02, 8 iters , 3.6 sec; (e) mesh smoothed by edge length scaled-AL, τ = 0.02, 8 iters, 1.4 sec; (f) mesh smoothed by MSAL, ξ = 1/3, 3 iters, 0.29 sec. 81 depends on the geometric discretization. Points adjacent to large triangles move faster than points adjacent to small triangles. For regular meshes such as most real scanned data sets, this is not an issue because one can relate (4.2) and (4.1) by a constant rescaling of time. However, for highly non- uniform meshes this may deform the surface in an undesirable way, as in Fig. 4.10(c). One option to make up for this is to multiply the operator (4.6) by the inverse of a lumped mass matrix [70] that is a diagonal matrix with elements equal to 1/3 of the one-ring area of the vertices. However, the smaller step size required for stability, resulting in the need for more iterations, and the higher computational cost of each iteration, make this scaling method relatively expensive. A cheaper alternative is to use edge length scaling, similar to the scale- dependent umbrella operator (4.4), because the edge lengths also contain discretization information and are much easier to compute. Although this scaling does not take angles between edges into account, it is sufficient for examples such as the one depicted in Fig. 4.10. A scale-dependent AL operator for irregular meshes is given by ∆xi = ∑j∈N (i)(exp(− |hi,j |22σ2i )hi,j/li,j)∑ j∈N (i)(exp(− |hi,j | 2 2σ2i )li,j)  ni. The results in Figs. 4.10(d) and 4.10(e) produced using these two dif- ferent scalings are very similar. Although the edge scaling still suffers from a smaller step size, hence the need for more iterations, the timing in Fig. 4.10 indicates that the weight computation in each iteration is sig- nificantly cheaper than using lumped mass scaling. Our MSAL scheme, unchanged, is cheaper than either of these scal- ing alternatives. In Fig. 4.10(f) we can detect no significant distortion or unwanted artifacts, almost like when lumped mass matrix or edge length scalings are used and even better at the intersection of finer and coarser segments. We observe that the real contribution comes from the stronger retrieving force toward the original surface on vertices adjacent to larger triangles in the multiscale evolution. The algorithm with geometrically de- creasing smoothing weight ξk and multiscale version of data fidelity term, i.e., λ varies on vertices, acts not only for retaining fine features but also for retaining the original structure of the mesh while still removing noise. Of course, if the mesh is much more severely non-uniform, with very large 82 Table 4.1: Volume comparison on the original, noisy and smoothed models. Dragon Bunny Fandisk Head Original 11175335 1550.7 20.243 215.0 Noisy 11197273 1551.9 20.246 215.3 AL 11110735 1545.1 19.984 202.8 MSAL 11156748 1547.5 20.180 210.5 0 2 4 6 8 10 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 Max Bunny Dragon Head Fandisk Figure 4.11: Plot of E(M,Mj) defined by (4.9) as a function of the iteration counter k, for smoothing different corrupted meshes. local mesh ratios and highly differing angles, then the above observations may change. In Table 4.1 we demonstrate the power for anti-shrinking of this multi- scale scheme. The interior volumes are calculated as in [41]. The volumes of four meshes smoothed by AL as well as MSAL are recorded respectively. Clearly, the multiscale algorithm helps toward restoring the original vol- umes. In cases where we have the true model, i.e., an original triangle mesh that we have corrupted in order to synthesize experimental data, we can measure the distance between our smoothing results and the original. An L2 normal-based mesh to mesh error metric was introduced in [13] for this 83 (a) (b) (c) (d) Figure 4.12: Restoring edge sharpness by our multiscale scheme (each run 3 iters, 0.56 sec): (a) original fandisk model (6,500 Vertices); (b) model cor- rupted; (c) model smoothed by AL smears sharp edges; (d) model smoothed by MSAL with ξ = 1/2. purpose, measuring deviations between the corresponding normals of the meshM (the true model) andMk that is the smoothing result at the kth iteration. Consider a triangle f̂ of the meshMk and find the triangle f of M closest to f̂ . Let n(f) and n(f̂) be the orientation unit normals of f and f̂ , respectively. The metric is then defined by E = √√√√ 1 A(Mk) ∑ f̂∈Mk A(f̂)|n(f)− n(f̂)|2, (4.9) where A(f̂) denotes the area of f̂ and A(Mk) is the total area of Mk. Fig. 4.11 shows the misfit between the original and the meshes smoothed by MSAL iterations. At first the misfit quickly drops, and then it slowly increases as more and more fine-scale components, including both intrinsic 84 texture and noise, are recaptured. This is consistent with the analysis at the end of Section 4.3. In general, 3 or 4 iterations are appropriate both visually and according to Fig. 4.11 to obtain a good reconstruction. To conclude, we first proceed to develop an AL operator that can be viewed as applying image processing techniques in local coordinates at each mesh vertex, yielding a procedure that adaptively determines the filter pa- rameters. The resulting quality of reconstruction of this AL operator is much better than that of isotropic Laplacian operators [41, 58, 62, 84, 128, 129] while the cost remains low. Next, we build our multiscale iterative scheme, which has a distinct advantage when intrinsic texture is significant or high irregularity in the mesh sampling need to be retained. However, besides intrinsic texture, long sharp edges and distinct corners that typically arise from CAD-like models are also very important for visualization. The AL algorithm with the mean absolute deviation scaling occasionally does not preserve such sharp features very well, e.g. in Fig. 4.12(b). Although this can be essentially improved using MSAL as in Fig. 4.12(c), some noticeable noise is also brought back on large flat regions by this algorithm, and edge sharpness, especially on the curved ridge, needs to be further enhanced. Therefore, we dedicate the next chapter to handle this situation. 85 Chapter 5 3D Surface Smoothing with Sharpness Preservation Polygonal surface models with features that are challenging to preserve can be roughly divided into two classes. Objects in the first class usually con- tain significant intrinsic texture and typically require very fine meshes to represent them well; see, e.g., the dragon model in Fig. 4.7. We have me- thodically developed a fast, adaptive, multiscale mesh denoising algorithm that performs particularly well for this kind of models in the previous chap- ter. In the present one, we focus on denoising piecewise smooth CAD-like models of the second class, which usually have large flat regions, long sharp edges and distinct corners; see, e.g., the double-torus model in Fig. 5.1. The meshes required for adequate representation of such models can often be much coarser and sampled very irregularly; see, for instance, the torus mesh in Fig. 5.2. The goal here then becomes to sufficiently smooth large feature- less regions while preserving original sharp features. Mesh regularization may be desirable and can be gained simultaneously while denoising. 5.1 Surface Meshes with Sharp Features All mesh smoothing methods mentioned in the previous chapter can be thought of as having just one stage or one pass. Another approach for adap- tive mesh smoothing is to recognize mesh features first. Shen, Maxim and Akingbehin [122] propose a method that consists of three stages: feature- preserving pre-smoothing, feature and non-feature region partitioning, and feature and non-feature region smoothing using two separate approaches. Shimizu, Date, Kanai and Kishinami [124] first use the eigen-analysis of a normal voting tensor to extract sharp edges, then apply bilateral filtering to smooth along the direction of the sharp edge, and finally, employ modi- fied bilateral filtering to the overall mesh to obtain a smoothed model with sharp features. The algorithm proposed below has a similar structure in which edges and corners are recognized at an earlier stage, but our tech- 86 (a) (b) 38 39 43 48 41 3 42 23 1 37 13 40 35 2 15 47 24 12 25 46 44 14 36 26 21 29 17 22 33 30 16 45 32 34 20 10 7 27 4 31 18 11 6 5 8 9 28 19 (c) 0 0.1 0.2 0.3 0.4 0 0.02 0.04 0.06 0.08 0.1 (d) Figure 5.1: Mesh smoothing: (a) noisy double-torus model; (b) model re- constructed; (c) crease detection; (d) data classification. niques for detecting features and denoising meshes are totally different. Our methods can be simply implemented, are computationally efficient, and can be easily extended to handle a wide variety of applications as described later in this chapter. In addition to using the notation and terminology introduced in Chap- ter 4, we refer to vertices in flat regions as non-feature vertices. Both edge and corner vertices are called feature vertices. Since different vertex groups should be dealt with in different ways, it is advantageous to cluster vertices accurately and then choose the most suitable discrete smoothing operator on each vertex cluster. Data classification on polygonal meshes is not new as such. A constant curvature region decomposition of 3D meshes follow- ing vertex classification was designed in [86]. Fuzzy clustering was applied on triangular mesh faces for hierarchical mesh decomposition in [82] and a 3D mesh segmentation algorithm through spectral clustering of mesh faces was proposed in [90]. In [31], Bayesian discriminant analysis was used to determine the decision boundary for separating potential feature and non- feature vertices in curvature space. However, we believe that the present chapter, corresponding to [72], is the first to develop a specific clustering technique using first-order height intensity and properly scaled Gaussian curvature, which is good enough to be applied directly on noisy meshes; see, e.g., Fig. 5.1. All related parameters are highly intuitive in the geometric context. Before introducing our clustering technique in the following section, we first recall several mesh denoising algorithms such that we are capable of choosing the most suitable operator to smooth each vertex cluster with a specific feature. As discussed in Chapter 4, the simplest discrete isotropic 87 (a) (b) (c) Figure 5.2: Mesh regularization: (a) original non-uniform torus mesh; (b) mesh corrupted by heavy noise, which is our input data; (c) mesh smoothed and regularized by our hybrid algorithm. Laplacian operator is the umbrella operator (4.3). This is a linear form implying the assumption that all neighboring edge lengths are roughly equal to one. Hence it can serve as an effective smoother if the targeted mesh is close enough to being regular. On the other hand, when the model has different discretization rates, significant local mesh regularization that may or may not be desirable is introduced by this linear operator; see Fig. 5.2 and also [105]. To retain mesh sampling irregularity, a better choice is the scale-dependent umbrella operator (4.4). Further, to solve problems arising from unequal face angles, a better approximation to the mean curvature normal was proposed in [41, 96]. All these schemes are based on isotropic diffusion though, which implies that they all easily smear sharp features during the smoothing process. To better reconstruct sharp features, MSAL is already available. There is only one slight yet important modification required. Recall that in Chapter 4 we have shown that MSAL with the mean absolute deviation scaling (4.5) performs rather well for recapturing fine-scale texture. However, we notice that MSAL with (4.5) sometime tends to over-smooth long sharp creases; see, e.g., the fandisk model in Fig. 5.6. So we turn back to the median absolute deviation scaling [19, 118] (see also (2.9) in Chapter 2), and extend it in the current context as σi = 1.5 · median(abs(hi − median(hi))). (5.1) Our numerical experiments show that for preserving edge sharpness, the scaling (5.1) works much better than (4.5). Similar conclusions are reached 88 in [142], where mesh smoothing algorithms via mean and median filters applied to face normals are compared. Unfortunately, median absolute de- viation scaling cannot generate enough smoothing on large flat regions and occasionally destroys corners due to its inherent limitations [30, 60]. This also suggests that different methods may best be applied for different pur- poses. Vertex classification is thus a good way to go further. For example, if we classify vertices on a CAD-like model into three groups, namely non- feature, edge and corner, we may apply different smoothing operators with different parameter selections to pursue a better reconstruction. 5.2 Vertex Classification 5.2.1 Data set computation for clustering One way to classify mesh vertices is according to their principal curvatures. The magnitude of both principal curvatures is small for non-feature vertices and large for corner vertices. For an edge vertex, the magnitude of one of its principal curvatures is small and the other is quite large. Thus, if we have curvature information of the mesh, we can partition mesh vertices into three clusters denoted as corner, edge and non-feature. In [96] the discrete mean curvature normal at vertex xi is defined by k(xi) = 3 2Ai ∑ j∈N (i) (cotαi,j + cotβi,j)(xi − xj)/2, (5.2) where αi,j and βi,j are the two angles opposite to the edge ei,j in the two triangles sharing it. Here we use one third of the simple one ring face area Ai at the vertex xi to approximate the complicated surface area Amixed defined in [96]. See [98, 141] for convergence properties of such an approximation. This yields the curvature expression κM (xi) = 1 2 ‖k(xi)‖. Let F(i) represent the one-ring-face set that is adjacent to the vertex xi, the discrete Gauss curvature κG at vertex xi is then approximated as κG(xi) = 3(2pi − ∑ j∈F(i) θj)/Ai, (5.3) where the sum is over the faces in the set F(i) and θj is the angle of the jth face at the vertex xi. Since κM = (κ1 + κ2)/2 and κG = κ1κ2, the discrete 89 principal curvatures at the vertex xi can be consequently computed by the quadratic formula κ1(xi) = κM (xi) + √ ∆(xi), κ2(xi) = κM (xi)− √ ∆(xi), (5.4) with ∆(xi) = max{κ2M (xi) − κG(xi), 0}. Note that in the continuous case and almost always in the discrete case, κ2M (xi) > κG(xi). The maximum principal curvature κ1 is alway positive, but the minimum principal curva- ture κ2 follows the sign of the Gaussian curvature κG that is negative at hyperbolic vertices. Since it is not necessary to differentiate ellipticity and hyperbolicity in our classification, we just consider the absolute value (| · |) of both Gaussian curvature κG and minimum principal curvature κ2. The quantitative 2D data matrix C ∈ IRN×2 for classification is then generated as C = [κ1(x1), κ1(x2), . . . , κ1(xN ); |κ2(x1)|, |κ2(x2)|, . . . , |κ2(xN )|]T . (5.5) Since we are directly working on noisy vertices, the second-order curva- ture information is more sensitive and more easily influenced by noise effects. This often results in undesirable vertex classification; see, e.g., Figs. 5.4 and 5.10. Moreover, computation of principal curvatures of each vertex is obviously not cheap. Fortunately, we already have first-order informa- tion to roughly partition noisy meshes. Recall the local height intensity hi,j = eTi,jni, which is the projection of ei,j along the vertex normal ni. Now define hmax(xi) = max j∈N (i) |hi,j |, hmin(xi) = min j∈N (i) |hi,j |. (5.6) Similarly to principal curvatures, for an edge vertex, hmin is small whereas hmax should be relatively quite larger; for a corner vertex, both hmin and hmax should be large; for a non-feature vertex, they should both be relatively small. Thus, for a given vertex matrix V = [x1,x2, . . . ,xN ]T , xi ∈ IR3×1, i = 1, . . . , N, we can generate a corresponding quantitative data matrix H ∈ IRN×2 with respect to height intensity H = [hmax(x1), . . . , hmax(xN );hmin(x1), . . . , hmin(xN )]T . (5.7) In pattern recognition terminology [45], the rows of C or H are called the patterns or objects and the columns are called the features or attributes. 90 The objective of clustering is to partition the data set into several clusters. Generally, a cluster is a group of objects that are more similar to one another than to members of other clusters. The term similarity should be understood as mathematical similarity, measured in some well-defined sense. In our case, considering the first column of data matrices as X-coordinate and the second column as Y-coordinate, the similarity can be measured simply by the Euclidean distance based on the geometric information we already have. The whole data set sits in the first quadrant. Objects that correspond to non-feature vertices should be near the origin; objects that correspond to edge vertices should be close to the X-axis and relatively far from the Y - axis; and objects that correspond to corner vertices should be close to neither X-axis nor Y -axis. However, a height intensity value depends on the approximation of the vertex normal. For some corner vertices, the value of their hmin could be very small where the average of the neighboring face normals might point along one edge such that they would be mistakenly grouped into the edge cluster; see, e.g., Figs. 5.4, 5.5 and 5.9. To address this, we employ the Gaussian curvature κG. There are two reasons. The first is that |κG| is larger at corner vertices and much smaller at both non-feature and edge vertices. The second reason is that computing κG by (5.3) is inexpensive. We can get θ and A very quickly by applying simple operations to the cross product of two edge vectors in each triangle. These cross products must be performed anyway when computing face normals. In order to generate the data matrix similar to C or H with respect to the Gaussian curvature, we set κG1(xi) = 6pi Ai + √ Θ(xi), κG2(xi) = 6pi Ai − √ Θ(xi), (5.8) with Θ(xi) = (6piAi ) 2 − κG(xi). The corresponding quantitative data matrix G ∈ IRN×2 for corner identification is then generated as G = [κG1(x1), . . . , κG1(xN ); |κG2(x1)|, . . . , |κG2(xN )|]T . (5.9) Clearly, |κG2| is larger at the corner vertices than at the rest. Therefore, objects in G corresponding to corner vertices should be relatively far above the X-axis, whereas objects corresponding to non-feature and edge vertices should be close to the X-axis. For a better clustering we linearly rescale the first column of G (X-coordinate in the Euclidean axis) such that all objects are bounded by a square box. 91 (a) (b) 0 0.02 0.04 0.06 0.08 0 0.002 0.004 0.006 0.008 0.01 0.012 (c) 0 0.02 0.04 0.06 0.08 0 0.002 0.004 0.006 0.008 0.01 0.012 (d) Figure 5.3: (a) Corrupted ring model; (b) model smoothed under the par- tition in (d): smaller pink dots → non-feature vertices, larger pink dots → feature (edge) vertices; (c) data set H computed for the noisy ring model in (a); (d) two clusters by K-means: red → non-feature, green → feature. 5.2.2 K-means clustering A classical unsupervised fast clustering algorithm is K-means [45]. It allo- cates each data point to one of c clusters to minimize the within-cluster sum of squares. Taking the partition data matrix H as an example, we seek a partition H1, . . . ,Hc to minimize the objective function c∑ k=1 ∑ ĥi∈Hk ‖ĥi − µk‖2, (5.10) where ĥi = [hmax(xi), hmin(xi)], Hk is a set of objects (data points) in the kth cluster and µk = mean( ∑ ĥi∈Hk ĥi) is the the center point over the kth cluster. The number of clusters c is usually two in the current context. For example, corner and non-corner clusters are always expected in the data matrix G, as in Figs. 5.4(i) and 5.5(c). After the corner cluster is identified and temporarily removed from H, nonempty edge and non-feature clusters are expected in the objects that remain; see e.g., Figs. 5.4(p) and 5.5(f). Or, as in the ring model in Fig. 5.3, there is no corner vertex at all, so we only need to classify vertices into the two clusters of feature and non-feature. One important issue here is the need to supply a good initial guess for the center points µk for starting the K-means clustering. According to our analysis of the distribution of data matrices G, H, or C above, we intuitively set µcorner = [κG1(xp), |κG2(xp)|], p = argmax 1≤i≤N |κG2(xi)|, µnoncorner = [κG1(xq), |κG2(xq)|], q = argmin 1≤i≤N |κG2(xi)|, 92 for corner identification in the data matrix G, and then µedge = [ max 1≤i≤N hmax(xi), min 1≤i≤N hmin(xi)], µnonedge = [ min 1≤i≤N hmax(xi), min 1≤i≤N hmin(xi)], for edge detection in the data matrix H. The same initial setting as above has been applied in C for comparison purpose. K-means clustering is simple and efficient, but it is based on estimating explicit models of the data. When the data distribution is arranged in a com- plex pattern or corrupted by heavy noise, K-means may not give us a very satisfactory vertex partition. A more advanced clustering approach, which has been shown to handle more complicatedly structured data, is spectral clustering [79, 99, 101, 112, 123, 139]. It does not require estimating an explicit model of data distribution, but rather employs a spectral analysis of the matrix of point-to-point similarities. The real power of spectral clus- tering, however, is not utilized in the present context. Furthermore, in our mesh smoothing application, especially for large meshes, spectral clustering may significantly slow the algorithm down, because it involves the calcu- lation of the c leading eigenpairs for the large and full affinity matrix. To maintain high efficiency in our algorithm, we propose another method to im- prove classification in a cheaper way, namely hierarchical K-means. That is, we first implement K-means clustering repeatedly on the unresolved clus- ter until all sub-clusters consist of vertices with only one type of feature: corner, edge or non-feature; then we simply combine sub-clusters that con- tain vertices with the same feature. When this process terminates, we have at most three clusters—corner, edge and non-feature—ready for smoothing and further processing. See the column labeled as Km-N in Table 5.2 for the number of clustering applications required for different models. Our numerical results show that, for denoising non-feature vertices, both the umbrella operator and AL with (4.5) perform well. The main difference between them is that the umbrella operator regularizes irregular meshes during smoothing, while AL with (4.5) keeps the mesh sampling irregularity relatively unchanged. Since the umbrella operator is simpler, faster, and produces simultaneous mesh regularization, we apply it by default on the non-feature cluster. Further, we apply MSAL with (5.1) on the edge cluster, and keep the corner cluster untouched; see e.g., Figs. 5.3(b) and 5.4(c). This approach is referred to as our hybrid mesh smoothing algorithm. The star example in Fig. 5.4 demonstrates a case where using the height intensity data matrix H instead of the principal curvature data matrix C 93 (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) (l) (m) (n) (o) (p) (q) Figure 5.4: (a) Corrupted star model; (b) model smoothed under the par- tition in (n); (c) model smoothed under the partition in (q); (d) 3D plot of feature vertices classified by principal curvatures (blue and green clusters in (n)); (e) 3D plot of feature vertices classified by height intensities (blue and green clusters in (q)); (f) data set H computed for the noisy star model in (a); (g) data set C; (h) data set G; (i) two clusters classified by K-means in data set G for corner identification: blue → corner; (j) marking corner vertices (blue) classified by Gaussian curvature in data set H; (k) marking corner vertices in data set C; (l) temporarily removing the corner cluster from data set C; (m) two clusters classified by K-means in (l): red → non- feature, green → edge; (n) adding the corner cluster back into (m); (o - q) same process as (l - n) performed on data set H instead. 94 is preferable. Even though in Figs. 5.4(l) and 5.4(m) edge and non-feature groups with respect to principal curvatures look well separated, Fig. 5.4(d) clearly demonstrates that vertices on 12 edges that form a cube are mistak- enly grouped into the non-feature cluster with the result that those 12 edges are over-smoothed in the reconstructed model; see Fig. 5.4(b). Even if we continue to do reclustering on the red non-feature cluster in Fig. 5.4(m), the result cannot be improved since principal curvature data points correspond- ing to vertices on the cube are highly mixed with those of vertices on flat regions. In comparison, height intensity data points in Fig. 5.4(o) provide a better distribution pattern such that only one application of K-means clustering generates a very good partition between edge and non-feature; see Figs. 5.4(p) and 5.4(e). Combining with the corner cluster identified in Fig. 5.4(i), the vertex classification depicted in Fig. 5.4(q) yields the high quality reconstruction in Fig. 5.4(c). 5.2.3 Classification refinement Since we are directly working with noisy data, we may not get exact ver- tex classification even when the more advanced clustering techniques are employed. The approximation of some vertex normals could be badly in- fluenced by heavy noise and likewise for local height intensities.1 Some non-feature vertices with high frequency noise might be wrongly grouped into the edge or corner clusters; some corner vertices, especially at saddle structures, might be wrongly grouped into the edge or non-feature clusters; some edge vertices on curved creases might be wrongly grouped into the non-feature cluster; and so on. These could result in undesirable bumps in flat regions, defects at edges and corners, or the vanishing of visually meaningful curved ridges, as e.g., in Fig. 5.6(b). Hence, a post-clustering procedure is proposed to refine the vertex classification. Take the fandisk model as an example. Firstly, partition the scaled Gaus- sian curvature data G into two groups to identify corner vertices; secondly, locate and temporarily remove corners from the data H, as in Figs. 5.5(c) and 5.5(d); thirdly, employ K-means clustering on the data remaining in H to get the non-feature and edge clusters, as in Figs. 5.5(e) and 5.5(f); fi- nally, combine these three clusters and apply the refinement procedure that we design below to improve clustering quality, as in Figs. 5.5(g - j). 1In our experiments, other methods for estimating the vertex normals [71, 75, 76] have not produced significant improvement in this regard. 95 (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) Figure 5.5: (a) data set H computed for the noisy fandisk model depicted in Fig. 5.6(a); (b) data setG; (c) corner identification in (b); (d) marking corner vertices classified by Gaussian curvature in (a); (e) temporarily removing the corner cluster from (d); (f) two clusters classified by K-means in (e): red → non-feature, green → edge; (g) adding the corner cluster back into (f); (h) recapturing potential edge vertices back into the edge cluster through the classification refinement procedure (ζ = 25); (i) 3D plot of edge and corner vertices based on clusters in (g) (the corresponding reconstructed model is in Fig. 5.6(b)); (j) 3D plot of edge and corner vertices based on refined clusters in (h) where red dots represent the recaptured vertices (the corresponding reconstructed model is in Fig. 5.6(c)); (k) detected creases represented in different colors. 96 (a) (b) (c) Figure 5.6: Reconstructing the curved ridges (see the true fandisk model in Fig. 4.12(a)): (a) noisy model; (b) model smoothed under the vertex parti- tion in Fig. 5.5(g); (c) model smoothed under the refined vertex partition in Fig. 5.5(h). Compare with the fandisk model in Fig. 4.12(d) reconstructed by MSAL without vertex classification. Refinement Algorithm: • A corner vertex should have at least three edge vertices in its one-ring neighbors, referred to as edge neighbors. Thus, for each unchecked vertex in the corner cluster: 1. send it into the non-feature cluster if it does not have any edge neighbor; 2. send it into the edge cluster if it has fewer than three edge neigh- bors. • An edge vertex should have at least two neighboring feature vertices, referred to as feature neighbors, each belonging to one of the edge cluster, corner cluster or potential edge list. Thus, for each unchecked vertex in the edge cluster with the empty potential edge list: 1. send it into the non-feature cluster if it does not have any feature neighbor; 2. if it only has one feature neighbor, apply recapturing: 2.1. add the non-feature neighbor with minimal local height in- tensity to the potential edge list; 2.2. if the total number of vertices on the list exceeds ζ, send all these vertices back into the non-feature cluster and then move on to the next unchecked edge vertex; otherwise continue; 97 2.3. begin to check the newest vertex added into the potential edge list: if it has more than one feature neighbor, recapture all vertices on the list from the non-feature cluster into the edge cluster and label them as checked edge vertices and then move on to the next unchecked edge vertex; otherwise do step 2.1 to pick another potential feature neighbor of this edge vertex. The confidence integer ζ controls the extent of edge searching. It is the only parameter appearing in our classification refinement algorithm. It should be neither too small nor too large, because we want to find as many potential edge vertices as possible whereas we must avoid recapturing non- feature vertices by mistake. Fortunately, it is very intuitive to adjust ζ up or down according to the 3D plot of feature vertices after running refinement, such as depicted in Fig. 5.5(j). In addition, the whole refinement procedure is very efficient (see Table 5.2), so there is no costly trial and error process. In our experience, for most cases the default setting ζ = 25 works very well. 5.3 Crease Detection and Mesh Segmentation Since we are able to mark all feature vertices, one quick extension is to detect all sharp creases on CAD-like models. Taking the donut model in Fig. 5.7(a) as one example, we consider its feature vertices as a complete, directed and weighted graph. If there is an edge in the original mesh from feature vertex xi to xj , then set the weight wi,j = hi,j ; otherwise set wi,j = ∞. Thus, by implementing the crease detection algorithm described below we are able to locate and label all visually distinct creases on the donut model; see Fig. 5.7(b). Subsequently, if some closed piecewise smooth curves have been deter- mined by ordering the labels of creases to form cutting paths, then we are able to carry out the corresponding specific mesh segmentation. This is dif- ferent from general mesh segmentation problems studied in various articles [8, 81, 90, 143], where the main challenge is to automatically produce seg- mentation results that are in close agreement with human shape perception. In the current context, the segmentability of a CAD-like shape is quite clear, and all creases that can define different cutting paths are already located. So what we really are concerned with here is how to efficiently make use of available data while respecting user specifications as much as we can. The algorithm is given below. Figs. 5.7(c) - 5.7(f) and 5.8(a) - 5.8(e) clearly demonstrate the flexibility of our mesh segmentation. Note that we are 98 0 0.005 0.01 0 1 2 3 4 5 6 7 x 10−3 (a) 25 23 28 27 24 22 26 21 84 99 83 87 36 47 1 13 75 50 66 57 90 98 81 88 34 46 33 35 48 67 51 12 15 2 3 58 49 55 54 56 74 71 52 53 72 31 44 14 61 96 82 95 89 73 4 30 60 43 32 62 80 59 64 45 29 19 20 37 38 77 65 39 68 70 41 40 91 85 93 97 63 76 79 78 5 16 42 69 100 92 86 94 8 17 6 18 9 11 10 7 (b) (c) (d) (e) (f) Figure 5.7: (a) Noisy donut model and vertex classification in data set H; (b) all creases detected and labeled; (c) segmenting the smoothed mesh into 3 patches; (d) 5 patches; (e) 10 patches; (f) 50 patches. making one major assumption, namely, that input cutting paths must be closed. Crease Detection Algorithm: • Input data: corner vertex set Vc and edge vertex set Ve; • Initializing: compute the weight matrix using mesh connectivity and height intensity; • Starting point: one random corner vertex in Vc; 1. from starting point mark edge vertices along path with the small- est weight until arriving at any corner vertex or marked edge vertex; 2. label the found crease and remove marked edge vertices from Ve; 99 (a) (b) (c) (d) (e) Figure 5.8: Segmenting the star model in Fig. 5.4 according to input cutting paths sketched in pink: (a) 2 patches; (b) 3 patches; (c) 6 patches; (d) 12 patches; (e) 24 patches. 3. if the starting corner vertex is not isolated, repeat step 1; other- wise mark it, set another unmarked corner vertex as the starting point and repeat step 1; • Check convergence if the whole set Vc is marked: exit successfully if Ve is empty; otherwise randomly pick a vertex in Ve as a new starting point and repeat step 1. Note that in this case the found crease may be closed without containing a corner; see, e.g., Figs. 5.3 and 5.10. Mesh Segmentation Algorithm: • Available data: vertex set V and connectivity structure; • Input data: the list of labels of the creases that form several closed cutting paths; • Initializing: Vcut = vertices on input creases and Vinterior = V \ Vcut; • Starting point: one random vertex in Vinterior; 1. send multiple-ring vertex neighbors of the starting point into one patch-vertex set Vp; stop neighbor-chasing at vertices in Vcut such that all vertices in the outermost ring are in Vcut; 2. paint one-ring face neighbors of all members in Vp ⋂ Vinterior with the same color and set Vinterior = Vinterior \ Vp; 3. check if there is any face with all three vertices in Vp ⋂ Vcut; if yes paint it too; • Checking convergence: exit successfully if Vinterior is empty; otherwise start a new patch painting with a different color. 100 5.4 Numerical Results and Discussion The first set of results presented in Figs. 5.1 and 5.2 demonstrates how well edge sharpness is preserved by our reconstructed 3D surface meshes and how efficiently mesh sampling is regularized. The second set of results in Figs. 5.3 - 5.6 shows step by step how the data set for the best vertex parti- tion is cheaply built, how K-means clustering efficiently works, how classifi- cation quality is optimized and how powerful our corresponding hybrid mesh smoothing algorithm can be. Next, in Figs. 5.7 and 5.8 we demonstrate the successful extension of our algorithm to handle sharp crease detection and adaptive mesh segmentation. A more complicated data pattern is considered in Fig. 5.9 and see also the model in Fig. 5.10. Just one application of K-means clustering using the data set G is not sufficient to give satisfactory corner identification, as in Fig. 5.9(d). Instead of using time-consuming spectral clustering technique, we employ hierarchical K-means developed in Sec. 5.2.2 to solve this problem in a much cheaper way. After reclustering, combining and refining, we obtain a perfect restoration of the bearing model in Fig. 5.10(c). Figs. 5.9(a) and 5.9(b) demonstrate that for the noisy bearing model, the height intensity data set H provides a better pattern distribution for vertex classification than the principal curvature data set C. Figs. 5.11 and 5.12 depict more examples where our algorithm is employed to yield rather pleasing results in terms of mesh smoothing, segmentation and crease detection. Figs. 5.13 and 5.14 demonstrate that our method not only preserves sharp features but also retains visually meaningful fine-scale components— intrinsic texture—even when the model contains large featureless regions. In this case we simply divide vertices into feature and non-feature groups, and then use the umbrella operator on the non-feature cluster to obtain enough smoothing while applying MSAL to the feature cluster to capture intrinsic texture. Note that for both two models MSAL can in fact be directly applied without any vertex classification, yielding good results that are comparable to those shown here at somewhat faster total execution times than those listed in Table 5.2. Our purpose here is to demonstrate that the hybrid method can successfully and efficiently combine different aspects, such as mesh regularization on flat regions and careful denoising in the presence of significant fine-scale features. In Chapter 4, we have demonstrated the power for anti-shrinking of the MSAL scheme. Since here we keep corner vertices unchanged and ap- ply MSAL on edges, the hybrid algorithm also helps toward restoring the original volumes, achieving even better results in this regard. Table 5.2 em- 101 (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) (l) Figure 5.9: Vertex classification on a more challenging CAD-like model in Fig. 5.10: (a) Height intensity data matrixH computed for the noisy bearing model in Fig. 5.10(a); (b) principal curvature data matrix C; (c) scaled Gaussian curvature data matrix G; (d) two clusters classified by K-means in data set G; (e) reclassifying the corner cluster (blue) in (d) by K-means; (f) replacing the corner cluster in (d) with new corner cluster generated in (e) and combining new non-corner cluster (green) with old non-corner cluster in (d); (g) locating corner vertices identified by Gaussian curvature in (a); (h) locating corner vertices in (b); (i) temporarily removing the corner cluster from (g); (j) two clusters classified by K-means in (i): red → non-feature, green → edge; (k) adding the corner cluster back into (j); (l) classification refinement. 102 (a) (b) (c) Figure 5.10: (a) Noisy bearing model; (b) model smoothed under the ver- tex partition in Fig. 5.9(k); (c) model smoothed under the refined vertex partition in Fig. 5.9(l). (a) (b) (c) (d) (e) 0 0.5 1 1.5 2 0 0.1 0.2 0.3 0.4 (f) 0 0.5 1 1.5 2 0 0.1 0.2 0.3 0.4 (g) Figure 5.11: Reconstruction with edge sharpness preservation: (a) noisy sliced-sphere model; (b) model smoothed under the refined vertex classifi- cation in (g); (c) another view of (a); (d) another view of (b); (e) detected creases represented in different colors; (f) data set H computed for the noisy model in (a); (g) refined vertex classification. 103 0 0.2 0.4 0.6 0.8 0 0.1 0.2 0.3 0.4 Height Intensity 0 0.05 0.1 0.15 0.2 0 0.05 0.1 0.15 0.2 Scaled Gaussian Curvature 0 0.05 0.1 0.15 0.2 0 0.05 0.1 0.15 0.2 Corner Identification 0 0.2 0.4 0.6 0.8 0 0.1 0.2 0.3 0.4 Final Vertex Classification (a) (b) 39 8 24 37 36 40 6 21 28 38 27 42 35 34 17 41 10 25 26 32 31 19 33 23 3 2 5 22 20 18 30 9 16 29 1 14 7 15 13 4 12 11 (c) (d) (e) (f) (g) Figure 5.12: (a) Vertex classification for the noisy torus model in (d); (b) 3D plot of edge and corner vertices based on clusters in (a); (c) crease detection; (d) noisy torus model; (e) model smoothed under the vertex partition in (a); (f) mesh segmentation (16 patches) when all detected creases in (c) used as input to construct cutting paths; (g) another view of (f). 104 (a) (b) (c) (d) Figure 5.13: Results reconstructed with large featureless regions and intrin- sic texture: (a) scanned idol model with noise; (b) model smoothed by our hybrid algorithm; (c) back view of (a); (d) back view of (b). (a) (b) (c) (d) Figure 5.14: (a) Scanned Maneki–Neko model containing unknown noise; (b) model smoothed by our hybrid algorithm with intrinsic texture preservation; (c) back view of (a): featureless region with noise; (d) back view of (b): featureless region without noise. 105 Table 5.1: Sizes of different example models. Model Vertices Faces Ring in Fig. 5.3 2,300 4,600 Torus in Fig. 5.12 2,700 5,400 Sliced-sphere in Fig. 5.11 4,300 8,600 Fandisk in Fig. 5.6 6,500 13,000 Idol in Fig. 5.13 10,000 20,000 Bearing in Fig. 5.10 14,000 28,000 Star in Fig. 5.4 28,000 56,000 Double-torus in Fig. 5.1 35,000 70,000 Maneki–Neko in Fig. 5.14 62,000 125,000 Table 5.2: Iteration counts and runtime for different models listed in Ta- ble 5.1. Km-N: number of applications of K-means clustering; Km-T: total time for K-means clustering; Re-T: time for classification refinement; Cr- T: time for crease detection; Sm-N: number of smoothing iterations; Sm-T: total time for hybrid smoothing; T-T: total execution time for the whole process. All reported CPU times are in seconds. Model Km-N Km-T Re-T Cr-T Sm-N Sm-T T-T Ring 1 0.04 0.06 0.06 4 0.38 0.54 Torus 2 0.13 0.09 0.06 4 0.92 1.20 Sliced-sphere 5 0.29 0.45 0.09 4 1.05 1.88 Fandisk 2 0.18 0.37 0.11 4 1.28 1.94 Idol 1 0.28 1.97 – 3 0.94 3.19 Bearing 3 0.48 1.41 0.29 4 1.67 3.85 Star 3 0.65 1.06 0.37 5 3.82 5.90 Double-torus 4 0.83 1.67 0.32 5 5.60 8.42 Maneki–Neko 3 3.76 7.94 – 3 5.41 17.1 106 phasizes the efficiency of our hybrid algorithm. The number of smoothing iterations Sm-N need not be large: four or five iterations are sufficient for most cases. Although the computation time for vertex classification and crease detection depends on the particular model structure, the total exe- cution time recorded in the last column increases almost linearly with the size of the mesh recorded in Table 5.1. 107 Chapter 6 Conclusions and Future Work 6.1 Conclusions In this thesis we have not only derived several highly efficient algorithms for deblurring and denoising 2D images and for smoothing and regularizing 3D surface meshes, but also justified and demonstrated the effectiveness of these algorithms on a variety of examples. 6.1.1 Reconstructing images Image reconstruction is an inverse problem. Due to either the ill-posedness or the presence of noise in the measured data, we must regularize by adding a priori information and isolating noise effects. The task then amounts to choosing, from all possible model functions, a model that appropriately fits the data and is close to the a priori information. A Tikhonov-type approach leads to the optimization problem (1.2) that depends on a regularization functional and a regularization parameter. In Chapter 2, we discretize this problem (1.2) into a linear algebraic system and solve it implicitly at each iteration. Regularization operators related to total variation (TV) are employed. Two popular methods are modified TV and Huber switching function. Both contain a parameter that must be selected. The Huber variant provides a more natural approach for selecting its parameter, and we use this to propose a scheme for both methods. Our selected parameter depends both on the resolution and on the model average roughness; thus, it is determined adaptively. Its varia- tion from one iteration to the next yields additional information about the progress of the regularization process. The modified TV operator has a smoother generating function; nonetheless we obtain a Huber variant with comparable, and occasionally better, performance. A global convergence for the employed lagged diffusivity algorithm, sim- ilar to that known for modified TV, has been proved for the Huber variant. 108 This gives further reinforcement for our proposed formula. For large prob- lems, e.g., high resolution, the resulting reconstruction algorithms can be very slow. We propose two mechanisms to improve efficiency. The first is a multilevel continuation approach aimed mainly at obtaining a cheap yet good estimate for the regularization parameter and the solution. The second is a special multigrid preconditioner for the conjugate gradient algo- rithm used to solve the linearized systems encountered in the procedures for recovering the model function. Artificial time embedding, especially for inverse problems and for im- age processing problems, has been found useful and beneficial by many re- searchers. They have gained good intuition leading to theory as well as to practical algorithms. Artificial time embedding has also proved a tool of convenience over several decades, allowing the extension of algorithms and software for solving a larger class of problems, albeit occasionally at the expense of reduced efficiency. In Chapter 3, we therefore consider the necessary condition of (1.2) as the steady-state equation of a time-dependent differential equation and proceed to discretize in time using an explicit method. However, we are not interested in that limit, but rather in finite time regularization [7]. We carry out only a few forward Euler steps and stop well short of the disastrous end. Such integration of an initial value differential equation using the forward Euler method can in turn be considered as gradient descent for the associated minimization problem (1.2). Of course the minimum is not what we pursue: that would be equivalent to integration to steady-state. But we do hope that just a few well-aimed gradient descent or forward Euler steps would significantly reduce high frequency noise, or rapidly correct SPF distortion in the case of deblurring. The steepest descent (SD) method, which is known for the regularizing effect that its first few steps have, is also known for its slowness as a stan- dard minimization method. So we further investigate the retention of this regularizing property using a faster gradient descent variant—lagged steep- est decent (LSD)—in the context of both deblurring and denoising. The difference between SD and LSD is only in the way for selecting step sizes. When the combination of regularization and accuracy requires more than, say, 20 SD steps, the alternative LSD offers an advantage, even though (in- deed because) the absolute stability limit of forward Euler is carefully yet severely violated. Starting from a basic, fast and adaptive algorithmic framework, some more advanced hybrid algorithms, e.g., an automatic explicit-implicit switch, an efficient discontinuity sharpening procedure, and a simple, yet effective, 109 denoising-deblurring splitting strategy, are intuitively designed to noticeably improve the quality of image reconstruction. Moreover, to carefully remove noise in the presence of significant fine- scale texture, two different iterative refinement techniques from [109] and [126] are considered and compared. We show how our scheme can be smoothly inserted into such multilevel algorithms and can be used to im- prove their efficiency. Some encouraging results are obtained. This applica- tion also provides inspiration when we step up to three-dimensional surface meshes as follows. 6.1.2 Smoothing surface meshes Chapters 4 and 5 are mainly concerned with deriving efficient denoising algorithms for triangular surface meshes. These two chapters are more self- contained than Chapter 3 and thus require less space to summarize them here, despite their centrality to this thesis. In Chapter 4, we methodically develop a multiscale algorithm that is capable of retaining intrinsic texture as well as handling mesh sampling irregularity while denoising 3D triangu- lar meshes. This approach does not rely directly on the availability of a continuous diffusion process analogue. In addition to specifying the very small total number of iterations, the only user-specified parameter is ξ in the MSAL algorithm. The latter parameter determines how much high fre- quency detail is retained in the reconstructed model, and it can be easily and intuitively adjusted to obtain various purpose-dependent results, as de- scribed in Sections 4.3 and 4.4; see, especially Fig. 4.9. The basic iteration employs the anisotropic Laplacian (AL) operator with a new, automatic, justified parameter choice. Then by gradually and lo- cally adding roughness scales to the admitted mesh, a multiscale anisotropic Laplacian (MSAL) smoothing scheme is defined. The high level of efficiency, parallelizability and transparency of the MSAL iteration are expressed in the MSAL algorithm, yielding a method that can almost automatically achieve high quality fairing results at rapid rates. All our results can be readily reproduced by a reader with adequate programming skills. Note that the use of AL as the inner iteration in our multiscale scheme, although usually rather satisfactory, is not mandatory. Not only bilateral filtering methods can immediately replace AL. Also a more sophisticated method such as those proposed in [36, 70] could be used to replace AL in the MSAL algorithm, obtaining a deluxe version that could for instance yield distinctively better edge sharpness reconstruction for certain models such as Fig. 4.12(a). The clarity of parameter specification, however, may 110 be lost, and the computational cost of such a combination could be much higher than that of our specified algorithm. In Chapter 5, after analyzing the strengths and weaknesses of the corre- sponding discrete operators, we introduce a refined clustering technique that incorporates K-means and geometric a priori information to do vertex clas- sification and subsequently choose the most suitable operator for smoothing on each vertex cluster with a specific feature. This procedure is capable of efficiently denoising surface meshes of CAD-like models while preserving original sharp features, and of generating sufficient smoothing while simul- taneously regularizing over featureless regions. Combined with the MSAL for models with intrinsic texture, we now have a set of algorithms that effi- ciently handle smoothing and regularization of meshes large and small in a variety of situations. Furthermore, based on the constructed vertex classification, we extend the algorithm to detect all sharp creases and segment meshes with respect to mesh features and user requirements. These operations can be carried out without any significant cost increase. Whereas our segmentation algorithm is new as far as we know, we emphasize its restricted nature. The assumption that cutting paths have been detected and are closed is a major one, and in this sense our algorithm is not comparable to the usual, more general mesh segmentation approaches. It is merely an inexpensive, direct extension of our system. Note that our method employs hierarchic K-means and a confidence integer ζ in the classification refinement. This implies that some user in- tervention in the form of parameter specification must be required. Since a fully automatic mechanism is always desirable, a future challenge is to deduce the optimal parameter ζ from the mesh structure itself and to in- troduce some computationally efficient fuzzy clustering techniques that may be more capable of classifying vertices with respect to their features while requiring little or no user intervention. 6.2 Future Work One general aspect that has not been our focus in this thesis is deriving a strong underlying theory. Of course we do rely on theory, developed in some of our references, e.g., [1, 25, 28, 29, 49, 70, 95, 109, 110, 119, 126, 135], as well as adding some of our own. But our emphasis on ultimate practical efficiency and effectiveness has often been at odds with deriving stronger guarantees. There is nothing unusual about this situation: especially for 111 the problems considered here, often the methods used in practice are dif- ferent from, and cannot be easily regarded as variants of, methods that allow rigorous analysis; see, e.g., [7, 113, 118] as well as all of our Siggraph references. All of our denoising work has been based on isolating and removing high frequency components in the given data. Other criteria for noise removal or smoothing may be more problem-specific and may be learned, as in e.g., [47, 91]. Also, we only discussed how to enhance and restore blurry images when the blur is linear and shift-invariant, and the PSF is known. There is certainly more exciting work to do in more challenging situations, for instance the blur generated by a complex physical system or when the PSF is unknown [51, 121]. Another possible extension of our work on 2D images is to design more efficient algorithms for other operations besides deblurring and denoising, such as inpainting, segmentation and compression [15, 38, 147]. Next we describe two particular directions in 3D that we plan to explore in the near future. 6.2.1 Spectral analysis on mesh Laplacian operators Since this Ph.D. work has primarily been concerned with effective methods for reconstruction problems arising from image processing and geometric modeling, one natural continuation of my research is to better analyze those discrete Laplacian operators on 3D surfaces in order to solve more problems, not only surface reconstruction but also mesh parameterization, segmenta- tion, compression, representation and so on. Spectral methods [33, 52, 97] are a promising approach that can be used to examine and manipulate the eigenvalues, eigenvectors, eigenspace projections, or a combination of these quantities, derived from Laplacians. The motivation comes from three per- spectives. Firstly, the eigenfunctions of the Laplacian can be viewed as an orthonor- mal basis of global Fourier functions. That is, by representing mesh geome- try using a discrete signal defined over the manifold mesh surface, a Fourier transform of such a signal can be obtained by an eigenspace projection of the signal along the eigenvectors of a mesh Laplacian. This analogy was first applied by Taubin [128] as a theoretical tool to design and analyze an approximation of a low-pass filter. Secondly, eigenvalues and eigenvectors of the discrete Laplace operator can reveal meaningful global information about the mesh shape. Thus if a matrix models the structure of a surface mesh, either in terms of topology or geometry, then we would expect its 112 eigenstructure to demonstrate an adequate characterization of the surface. This enables spectral techniques to provide high quality results for surface mesh parameterization and segmentation [61, 81, 90, 145, 146]. In particu- lar, taking spectral clustering [79, 99, 101] for surface reconstruction as an example, this spectral approach can not only present nonlinear patterns in the input data in a high-dimensional feature space, it may also unfold them into linear ones so that methods based on linear transformations and linear separators, e.g., principal component analysis (PCA) and K-means cluster- ing, can be applied. Last but not least, by properly selecting a small number of leading eigenvectors of the operator to construct an embedding, the ex- tremal properties of the eigenvectors ensure that the spectral embeddings are information preserving and so the dimensionality of the problem can be effectively reduced while the global characteristics of the original data set are still retained. Effective spectral approaches for mesh compression and representation have been developed and studied [80, 93, 117, 125]. Most spectral methods have a basic framework in common. A matrix M that represents a specific discrete operator based on the structure of the input mesh is first defined. Eigenvalues and eigenvectors ofM are then com- puted and further employed in a problem-dependent manner to obtain the desired solution. Therefore, the variations for the different spectral methods arise in the ways thatM is constructed and eigenvalues and eigenvectors are obtained and used. These lead to a two-fold concern, which is very interest- ing, meaningful and worthy of a further investigation. Mesh Laplacians are operators that act on functions defined on a mesh. A combinatorial mesh Laplacian is completely determined by the graph associated with the mesh while a geometric mesh Laplacian is a discrete approximation to the Laplace–Beltrami operator. A potential drawback of these linear Laplacian approximations used in current spectral methods is that they detect only global smoothness, and may poorly approximate a function that is not globally smooth but only piecewise smooth, or with dif- ferent smoothness in different regions, e.g., failure at preserving sharpness of mesh creases. Designing an anisotropic mesh Laplacian whose eigenfunc- tions may better localize the frequency of the creases is a possible solution to the challenge of improving preservation of sharp features in the spectral geometry processing. The Laplace operator used for spectral mesh analysis depends on the in- put mesh, which can become quite large and may not be sparse. Thus, eigen- value and eigenvector computations are known to be extremely intensive for large meshes, and speed-up techniques are definitely necessary. Koren, Carmel and Harel [85] proposed a fast multiscale eigenvector computation 113 for drawing huge graphs, referred to as the ACE method. They exploit spar- sity of the mesh Laplacians and apply algebraic multigrid to compute a few leading eigenvectors. An alternative method designed by Vallet and Levy [133], which allows computing a specific set of eigenvectors, is to introduce a spectral shift into the operator. This can be combined with iterative meth- ods to speed up the computation. Introducing a hierarchical factorization method to further accelerate the process is now under consideration. After a set of eigenvalues and eigenvectors is obtained, it can be directly used by an application to accomplish different tasks. Moreover, the eigen- vectors can also be used as a basis onto which a signal defined on a triangle mesh is projected. The resulting coefficients can be further analyzed and manipulated. One corresponding application we are currently interested in is interactive geometry filtering. Taking advantage of advanced filter design techniques in engineering to efficiently smooth, single out, exaggerate, or even edit useful underlying structure in the input data is in our future plan. 6.2.2 Anisotropic representation of unorganized points Another promising direction is to extend my study of polygonal models to point clouds: a topic that is receiving a rapidly growing amount of attention as a representation of physical models in computer aided geometric design and graphics. This is interesting since raw scanning data comes in form of points. Constructing meshes is a later stage, and maintaining them through deformations requires a lot of computation. It would be very useful if we are able to efficiently and faithfully approximate a surface implied by a cloud of points. Such point-based surfaces can be widely used for free-form deformation, interpolation, meshing, editing, shading and operations that arise naturally in simulations in many fields of science and engineering. Point-based surfaces [2] have emerged as a versatile representation for geometric models and have become the focus of digital geometry processing and analysis. There are three main reasons. Firstly, affordable and accurate scanning devices that capture points from the surfaces of real objects have become ubiquitous in recent years [88]. Secondly, highly detailed surfaces require a large number of small primitives that contribute to less than a pixel when displayed, so that points become an effective display primitive [114]. Last but not least, significant improvements in graphics hardware now allow us to handle such large numbers of primitives, so that modeling surfaces with clouds of points has become feasible. Point-based surface representations fall into two major categories: ex- plicit and implicit. One typical explicit representation—extremal surfaces 114 defined by Amenta and Kil [3]—is based on an energy function and a normal vector field. The other classic implicit representation—moving least-squares (MLS) defined by Levin [87]—involves local weighted least-squares polyno- mial approximations using a fast decaying weight function. Both methods have gained much popularity and are now well understood. Recently, Lip- man, Cohen-Or, Levin and Tal-Ezer [89] introduced a locally optimal pro- jection operator (LOP). This operator is parameterization free, in the sense that it does not rely on estimating a local normal, fitting a local plane, or using any other local parametric representation, and provides a second order approximation from unorganized points to smooth surfaces, without using any explicit or implicit approximation space. Such a direct and effective geometry reconstruction is very interesting and worth improving. We have noticed that LOP is density-dependent and only able to project points isotropically. In other words, the distribution of projected points by LOP is always locally regularized and appears not to respect curvature information of the surface implied by a given point-set, although the initial data density may provide some help in this regard. Our future goal is to develop a locally anisotropic projection operator (LAP), that can distribute points according to surface curvature without really computing the curvature and without having a surface whatsoever! If we succeed, a large variety of applications can take advantages of this LAP operator and the resulting anisotropic geometry representation. 115 Bibliography [1] H. Akaike. On a successive transformation of probability distribution and its application to the analysis of the optimum gradient method. Ann. Inst. Stat. Math. Tokyo, 11:1–16, 1959. [2] M. Alexa, J. Behr, D. Cohen-Or, S. Fleishman, D. Levin, and C. T. Silva. Point set surfaces. In Proceedings of the conference on Visual- ization, pages 21–28. IEEE Computer Society, 2001. [3] N. Amenta and Y. J. Kil. Defining point-set surfaces. ACM Trans. on Graphics (SIGGRAGH), 23(3):264–270, 2004. [4] U. Ascher, K. van den Doel, H. Huang, and B. Svaiter. Gradient descent and fast artificial time integration. M2AN, 2009. To appear. [5] U. Ascher and E. Haber. Grid refinement and scaling for distributed parameter estimation problems. Inverse Problems, 17:571–590, 2001. [6] U. Ascher, E. Haber, and H. Huang. On effective methods for im- plicit piecewise smooth surface recovery. SIAM J. Scient. Comput., 28(1):339–358, 2006. [7] U. Ascher, H. Huang, and K. van den Doel. Artificial time integration. BIT Numerical Mathematics, 47:3–25, 2007. [8] M. Attenea, B. Falcidieno, and M. Spagnuolo. Hierarchical segmen- tation based on fitting primitives. The Visual Computer, 22:181–193, 2006. [9] C. Bajaj and G. Xu. Adaptive surface fairing by geometric diffusion. In Proceedings of Symposium on Computer Aided Geometric Design, pages 731–737. IEEE Computer Society, 2001. [10] C. Bajaj and G. Xu. Anisotropic diffusion on surfaces and functions on surfaces. ACM Trans. Graphics (SIGGRAPH), 22(1):4–32, 2003. 116 [11] D. Barash. A fundamental relationship between bilateral filtering, adaptive smoothing and the nonlinear diffusion equation. IEEE Trans. PAMI, 24(6):844–847, 2002. [12] J. Barzilai and J. Borwein. Two point step size gradient methods. IMA J. Num. Anal., 8:141–148, 1988. [13] A. Belyaev and Y. Ohtake. A comparison of mesh smoothing methods. In Israel-Korea Bi-National Conference on Geometric Modeling and Computer Graphics, pages 83–87. Tel Aviv University, 2003. [14] E. van den Berg and M. Friedlander. Probing the Pareto frontier for basis pursuit solutions. SIAM J. Scient. Comput., 2009. To appear. [15] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester. Image in- painting. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques, pages 417–424, 2000. [16] A. Bhaya and E. Kaszkurewicz. Iterative methods as dynamical sys- tems with feedback control. In 42nd IEEE Conf. on Decision and Control, pages 2347–2380. IEEE, 2003. [17] G. Biros and O. Ghattas. Parallel Lagrange-Newton-Krylov-Schur methods for PDE constrained optimization. Part I : the Krylov-Schur solver. SIAM J. Scient. Comput., 27(2):687–713, 2005. [18] G. Biros and O. Ghattas. Parallel Lagrange-Newton-Krylov-Schur methods for PDE constrained optimization. Part II : the Lagrange- Newton solver and its application to optimal control of steady flows. SIAM J. Scient. Comput., 27(2):714–739, 2005. [19] M. J. Black, G. Sapiro, D. H. Marimont, and D. Heeger. Robust anisotropic diffusion. IEEE trans. image processing, 7(3):421–432, 1998. [20] L. Borcea, J. G. Berryman, and G. C. Papanicolaou. High-contrast impedance tomography. Inverse Problems, 12:835–858, 1996. [21] L. Bregman. The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming. U.S.S.R. Comput. Math. and Math. Phys., 7:200–217, 1967. 117 [22] M. Burger and S. J. Osher. A survey on level set methods for inverse problems and optimal design. European J. Appl. Math., 16:263–301, 2005. [23] D. Calvetti, B. Lewis, and L. Reichel. Smooth or abrupt: a comparison of regularization methods. In Advanced Signal Processing Algorithms, Architectures and Implementations VIII, volume 3461, pages 286–295. SPIE, 1998. [24] D. Calvetti and L. Reichel. Lanczos-based exponential filtering for discrete ill-posed problems. Numer. Algorithms, 29:45–65, 2002. [25] M. P. do Carmo. Riemannian Geometry. Birkhauser, Boston-Basel- Berlin, 1992. [26] A. Chambolle. An algorithm for total variation minimization and applications. J. Math Imaging and Vision, 20:89–97, 2004. [27] T. Chan, G. H. Golub, and P. Mulet. A nonlinear primal-dual method for total variation-based image restoration. SIAM J. Scient. Comput., 20:1964–1977, 1999. [28] T. Chan and P. Mulet. On the convergence of the lagged diffusivity fixed point method in total variation image restoration. SIAM J. Numer. Anal., 36:354–367, 1999. [29] T. Chan and J. Shen. Image Processing and analysis: variational, PDE, wavelet, and stochastic methods. SIAM, 2005. [30] C. Y. Chen and K. Y. Cheng. A sharpness dependent filter for mesh smoothing. Computer Aided Geometric Design, 22:376–391, 2005. [31] C. Y. Chen, K. Y. Cheng, and H. Y. Liao. Fairing of polygon meshes via bayesian discriminant analysis. J. WSCG, 12:1–3, 2004. [32] M. Cheney, D. Isaacson, and J. C. Newell. Electrical impedance to- mography. SIAM Review, 41:85–101, 1999. [33] F. R. K. Chung. Spectral Graph Theory. AMS, 1997. [34] J. Claerbout and F. Muir. Robust modeling with erratic data. Geo- physics, 38:826–844, 1973. 118 [35] U. Clarenz, U. Diewald, and M. Rumpf. Anisotropic geometric diffu- sion in surface processing. In Proceedings of IEEE Visualization, pages 397–405, 2000. [36] U. Clarenz, U. Diewald, and M. Rumpf. Processing textured surfaces via anisotropic geometric diffusion. IEEE Transactions on Image Pro- cessing, 13(2):248–261, 2004. [37] J. W. Cooley and J. W. Tukey. An algorithm for the machine calcula- tion of complex Fourier series. Mathematics of Computation, 19:297– 301, 1965. [38] D. Cremers, N. Sochen, and C. Schnörr. A multiphase dynamic label- ing model for variational recognition-driven image segmentation. Int. J. Computer Vision, 66(1):67–81, 2006. [39] Y. Dai and R. Fletcher. Projected barzilai-borwein methods for large- scale box-constrained quadratic programming. Numer. Math., 100:21– 47, 2005. [40] Y. Dai and L. Liao. R-linear convergence of the barzilai-borwein gra- dient method. IMA J. Numer. Anal., 22:1–10, 2002. [41] M. Desbrun, M. Meyer, P. Schröder, and A. H. Barr. Implicit fairing of irregular meshes using diffusion and curvature flow. ACM Trans. Graphics (SIGGRAPH), pages 317–324, 1999. [42] M. Desbrun, M. Meyer, P. Schröder, and A. H. Barr. Anisotropic feature-preserving denoising of height fields and bivariate data. In Proceedings of Graphics Interface, pages 145–152, 2000. [43] D. C. Dobson and C. R. Vogel. Convergence of an iterative method for total variation denoising. SIAM J. Numer. Anal., 34:1779–1791, 1997. [44] K. van den Doel and U. Ascher. On level set regularization for highly ill-posed distributed parameter estimation problems. J. Comp. Phys., 216:707–723, 2006. [45] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. 2nd Ed., John Wiley & Sons, INC, 2001. [46] F. Durand and J. Dorsey. Fast bilateral filtering for the display of high-dynamic-range images. ACM Trans. Graphics (SIGGRAPH), 21(3):257–266, 2002. 119 [47] M. Elad and M. Aharon. Image denoising via learned dictionaries and sparse representation. In Proceedings of the Conference on Com- puter Vision and Pattern Recognition, volume 1, pages 895–900. IEEE Computer Society, 2006. [48] I. Elbakri and J. Fessler. Statistical image reconstruction for polyen- ergetic X-ray computed tomography. IEEE Trans. Med. Imag., 21(2):89–99, 2002. [49] H. W. Engl, M. Hanke, and A. Neubauer. Regularization of Inverse Problems. Kluwer, 1996. [50] C. Farquharson and D. Oldenburg. Non-linear inversion using general measures of data misfit and model structure. Geophysics J., 134:213– 227, 1998. [51] R. Fergus, B. Singh, A. Hertzmann, S. T. Roweis, and W. T. Free- man. Removing camera shakefrom a single photograph. ACM Trans. Graphics (SIGGRAPH), 25(3):787–794, 2006. [52] M. Fiedler. Algebraic connectivity of graphs. Czech. Math. J., 23:298– 305, 1973. [53] M. Figueiredo, R. Nowak, and S. Wright. Gradient projection for sparse reconstruction: application to compressed sensing and other inverse problems. IEEE J. Special Topics on Signal Processing, 1:586– 598, 2007. [54] S. Fleishman, D. Cohen-Or, and C. T. Silva. Robust moving least- squares fitting with sharp features. ACM Trans. Graphics (SIG- GRAPH), 24(3):544–552, 2005. [55] S. Fleishman, I. Drori, and D. Cohen-Or. Bilateral mesh denoising. ACM Trans. Graphics (SIGGRAPH), 22(3):950–953, 2003. [56] G. E. Forsythe. On the asymptotic directions of the s-dimensional optimum gradient method. Numer. Math., 11:57–76, 1968. [57] A. Friedlander, J. Martinez, B. Molina, and M. Raydan. Gradi- ent method with retard and generalizations. SIAM J. Num. Anal., 36:1999, 275289. [58] K. Fujiwara. Eigenvalues of laplacians on a closed riemannian manifold and its nets. In Proceedings of AMS, volume 123, pages 2585–2594, 1995. 120 [59] G. H. Golub and C. F. van Loan. Matrix Computations. 3rd Ed., Johns Hopkins Series in the Mathematical Sciences, Johns Hopkins University Press, 1996. [60] R. C. Gonzalez and R. E. Woods. Digital image processing. 2nd Ed., Prentice-Hall, Engewood Cliffs, NJ, 2002. [61] C. Gotsman, X. Gu, and A. Sheffer. Fundamentals of spherical param- eterization for 3D meshes. ACM Trans. on Graphics (SIGGRAGH), 22(3):358–363, 2003. [62] I. Guskov, W. Sweldens, and P. Schröder. Multiresolution signal pro- cessing for meshes. ACM Trans. Graphics (SIGGRAPH), pages 325– 334, 1999. [63] E. Haber. Numerical Strategies for the Solution of Inverse Problems. PhD thesis, University of British Columbia, 1997. [64] E. Haber. A multilevel, level-set method for optimizing eigenvalues in shape design problems. J. Comp. Phys., 198:518–534, 2004. [65] E. Haber, U. Ascher, and D. Oldenburg. Inversion of 3D electromag- netic data in frequency and time domain using an inexact all-at-once approach. Geophysics, 69:1216–1228, 2004. [66] E. Haber, S. Heldmann, and U. Ascher. Adaptive finite volume method for distributed non-smooth parameter identification. Inverse Prob- lems, 23:1659–1676, 2007. [67] E. Haber, D. Oldenburg, and A. Celler. Recovery of dynamic param- eters in SPECT. IEEE on NUC SCI, 67:129–135, 1997. [68] P. C. Hansen. Rank Deficient and Ill-Posed Problems. SIAM, Philadel- phia, 1998. [69] J. W. Hardy. Adaptive Optics for Astronomical Telescopes. Oxford University Press, NY, 1998. [70] K. Hildebrandt and K. Polthier. Anisotropic filtering of non-linear surface features. EUROGRAPHICS, 23(3):391–400, 2004. [71] H. Huang and U. Ascher. Fast denoising of surface meshes with in- trinsic texture. Inverse Problems, 24:034003, 2008. 121 [72] H. Huang and U. Ascher. Surface mesh smoothing, regularization and feature detection. SIAM J. Scient. Comput., 31(1):74–93, 2008. [73] Y. Huang, M. K. Ng, and Y. W. Wen. A fast total variation min- imization method for image restoration. Multiscale Model. Simul., 7(2):774–795, 2008. [74] P. J. Huber. Robust estimation of a location parameter. Ann. Math. Stats., 35:73–101, 1964. [75] S. Jin, R. R. Lewis, and D. West. A comparison of algorithms for ver- tex normal computation. The Visual Computer, 21(1-2):71–82, 2005. [76] T. Jirka and V. Skala. Gradient vector estimation and vertex normal computation. In Technical Report No. DCSE/TR-2002-08. University of West Bohemia in Pilsen, 2002. [77] T. Jones, F. Durand, and M. Desbrun. Non-iterative, feature preserv- ing mesh smoothing. ACM Trans. Graphics (SIGGRAPH), 22(3):943– 949, 2003. [78] J. E. Dendy Jr. Black box multigrid. J. Comput. Phys., 48:366–386, 1982. [79] R. Kannan, S. Vempala, and V. Vetta. On spectral clustering - good, bad and spectral. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science, pages 367–380, 2000. [80] Z. Karni and C. Gotsman. Spectral compression of mesh geometry. ACM Trans. Graphics (SIGGRAPH), pages 279–286, 2000. [81] S. Katz, G. Leifman, and A. Tal. Mesh segmentation using feature point and core extraction. The Visual Computer, 21(8-10):649–658, 2005. [82] S. Katz and A. Tal. Hierarchical mesh decomposition using fuzzy clustering and cuts. ACM Trans. Graphics (SIGGRAPH), 22(3):954– 961, 2003. [83] S. Knapek. Matrix-dependent multigrid homogenization for diffusion problems. SIAM J. Sci. Comp., 20(2):515–533, 1999. [84] L. Kobbelt, S. Campagna, J. Vorsatz, and H. P. Seidel. Interactive multiresolution modeling on arbitrary meshes. ACM Trans. Graphics (SIGGRAPH), pages 105–114, 1998. 122 [85] Y. Koren, L. Carmel, and D. Harel. Ace: A fast multiscale eigen- vector computation for drawing huge graphs. In Proceedings of IEEE Symposium on Information Visualization, pages 137–144, 2002. [86] G. Lavoué, F. Dupont, and A. Baskurt. Constant curvature region decomposition of 3D-mesh by a mixed approach vertex-triangle. J. WSCG, pages 245–252, 2004. [87] D. Levin. The approximation power of moving least-squares. Journal of Computation and Mathematics, 67(224):1517–1531, 1998. [88] M. Levoy, K. Pulli, B. Curless, S. Rusinkiewicz, D. Koller, L. Pereira, M. Ginzton, S. Anderson, J. Davis, J. Ginsberg, J. Shade, and D. Fulk. The digital michelangelo project: 3D scanning of large statues. ACM Trans. Graphics (SIGGRAPH), pages 131–144, 2000. [89] Y. Lipman, D. Cohen-Or, D. Levin, and H. Tal-Ezer. Parameterization free projection for geometry reconstruction. ACM Trans. on Graphics (SIGGRAGH), 26(3):22, 2007. [90] R. Liu and H. Zhang. Segmentation of 3D meshes through spectral clustering. In Proceedings of Computer Graphics and Applications, pages 298–305. IEEE Computer Society, 2004. [91] J. Mairal, G. Sapiro, and M. Elad. Learning multiscale sparse repre- sentations for image and video restoration. SIAM Multiscale Modeling and Simulation, 7(1):214–241, 2008. [92] A. Marquina and S. Osher. Explicit algorithms for a new time de- pendent model based on level set motion for nonlinear deblurring and noise removel. SIAM J. Scient. Comput., 22:387–405, 2000. [93] D. Mateus, F. Cuzzolin, E. Boyer, and R. Horaud. Articulated shape matching by robust alignment of embedded representations. In ICCV’ 07 Workshop on 3D Representation for Recognition, 2007. [94] S. McCormick and J. Wade. Multigrid solution of a linearized, reg- ularized least-squares problem in electrical impedance tomography. Inverse problems, 9:697–713, 1993. [95] M. Meyer. Oscillating patterns in image processing and nonlinear evolution equations. In University Lecture Series, AMS, volume 22, 2002. 123 [96] M. Meyer, M. Desbrun, P. Schröder, and A. H. Barr. Discrete differential-geometry operators for triangulated 2-manifolds. In Pro- ceedings of VisMath, Berlin, Germany, 2002. [97] B. Mohar. Some applications of laplacian eigenvalues of graphs. In Graph Symmetry: Algebraic Methods and Applications, volume 23, pages 225–275. Kluwer, 1997. [98] J. M. Morvan and B. Thibert. Smooth surface and triangular mesh: comparison of the area, the normals and the unfolding. In Proceedings of the Seventh ACM Symposium on Solid Modeling and Applications, pages 147–158, 2002. [99] B. Nadler, S. Lafon, R. Coifman, and I. Kevrekidis. Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators. In Advances in Neural Information Processing Systems, volume 18, 2005. [100] J. Nagy and K. Palmer. Steepest descent, cg and iterative regulariza- tion of ill-posed problems. BIT, 43:1003–1017, 2003. [101] A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Anal- ysis and an algorithm. In Advances in Neural Information Processing Systems, volume 14, 2001. [102] J. Nocedal, A. Sartenar, and C. Zhu. On the behavior of the gradient norm in the steepest descent method. Comp. Optimization Applic., 22:5–35, 2002. [103] N. Nordstrom. Biased anisotropic diffusion - a unified regularization and diffusion approach to edge detection. Image and Vision Comput- ing, 8:318–327, 1990. [104] Y. Ohtake, A. Belyaev, and M. Alexa. Sparse low-degree implicit sur- faces with applications to high quality rendering, feature extraction, and smoothing. In Proceedings of Symposium on Geometry Processing, pages 149–158, 2005. [105] Y. Ohtake, A. Belyaev, and I. A. Bogaevski. Polyhedral surface smoothing with simultaneous mesh regularization. In Proceedings of Geometric Modeling and Processing, pages 229–237, 2000. [106] A. V. Oppenheim and R. W. Schafer. Discrete-Time Signal Precessing. Prentice-Hall, Engewood Cliffs, NJ, 1989. 124 [107] A. V. Oppenheim and A. S. Willsky. Signals and Systems. Prentice- Hall, Engewood Cliffs, NJ, 1996. [108] J. M. Ortega and W. C. Rheinboldt. Iterative solutions of nonlinear equations in several variables. Academic Press, 1970. [109] S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin. An iterative reg- ularization method for total variation based image restoration. SIAM J. multiscale model. simul., 4:460–489, 2005. [110] S. Osher and R. Fedkiw. Level Set Methods and Dynamic Implicit Surfaces. Springer, 2003. [111] J. Peng, V. Sterla, and D. Zorin. A simple algorithm for surface denoising. In Proceedings of IEEE Visualization, pages 107–112, 2001. [112] P. Perona and W. T. Freeman. A factorization approach to grouping. In Proceedings of the 5th European Conference on Computer Vision, volume I, pages 655–670, 1998. [113] P. Perona and J. Malik. Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7):629–639, 1990. [114] H. Pfister, M. Zwicker, J. van Baar, and M. Gross. Surfels: Surface el- ements as rendering primitives. ACM Trans. Graphics (SIGGRAPH), pages 335–342, 2000. [115] M. Raydan and B. Svaiter. Relaxed steepest descent and cauchy- barzilai-borwein method. Comp. Optimization Applic., 21:155–167, 2002. [116] L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D, 60:259–268, 1992. [117] R. M. Rustamov. Laplace-Beltrami eigenfunctions for deformation invariant shape representation. In Proceedings of Symposium on Ge- ometry Processing, pages 225–233, 2007. [118] G. Sapiro. Geometric Partial Differential Equations and Image Anal- ysis. Cambridge, 2001. [119] O. Scherzer. Scale-space methods and regularization for denoising and inverse problems. Advances in Image and Electron Physics, 128:445– 530, 2003. 125 [120] J. A. Sethian. Level Set Methods and Fast Marching Methods : Evolv- ing Interfaces in Geometry, Fluid Mechanics, Computer Vision, and Material Sciences. Cambridge, 1996. [121] Q. Shan, J. Jia, and A. Agarwala. High-quality motion deblurring from a single image. ACM Trans. Graphics (SIGGRAPH), pages 1– 10, 2008. [122] J. Shen, B. Maxim, and K. Akingbehin. Accurate correction of sur- face noises of polygonal meshes. Int. J. for Numerical Methods in Engineering, 64(12):1678–1698, 2005. [123] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000. [124] T. Shimizu, H. Date, S. Kanai, and T. Kishinami. A new bilateral mesh smoothing method by recognizing features. In Proceedings of the Ninth International Conference on Computer Aided Design and Computer Graphics (CAD-CG’05), pages 281–286, 2005. [125] O. Sorkine, D. Cohen-Or, and S. Toledo. High-pass quantization for mesh encoding. In Proceedings of of Symposium on Geometry Process- ing, pages 42–51, 2003. [126] E. Tadmor, S. Nezzar, and L. Vese. A multiscale image representation using hierarchical (BV,L2) decompositions. SIAM J. multiscale model. simul., 2:554–579, 2004. [127] T. Tasdizen, R. Whitaker, P. Burchard, and S. Osher. Geometric surface smoothing via anisotropic diffusion of normals. In Proceedings of IEEE Visualization, pages 125–132, 2002. [128] G. Taubin. A signal processing approach to fair surface design. ACM Trans. Graphics (SIGGRAPH), pages 351–358, 1995. [129] G. Taubin. Linear anisotropic mesh filters. In IBM Research Technical Report RC-22213, October 2001. [130] A. N. Tikhonov and V. Y. Arsenin. Methods for Solving Ill-posed Problems. John Wiley and Sons, Inc., 1977. [131] C. Tomasi and R. Manduchi. Bilateral filtering for gray and color images. In Proceedings of IEEE Intl. Conf. Computer Vision, pages 839–846, Bombay, India, 1998. 126 [132] U. Trottenberg, C. Oosterlee, and A. Schuller. Multigrid. Academic Press, 2001. [133] B. Vallet and B. Lévy. Spectral geometry processing with manifold harmonics. Computer Graphics Forum, 27(2):251–260, 2008. [134] C. R. Vogel. Solution of linear systems arising in nonlinear image deblurring. pages 148–158. Springer, Singapore, 1996. [135] C. R. Vogel. Computational Methods for Inverse Problem. SIAM, Philadelphia, 2002. [136] C. R. Vogel and M. E. Oman. Iterative methods for total variation denoising. SIAM J. Scient. Comput., 17:227–238, 1996. [137] J. Vollmer and R. Mencl. Improved laplacian smoothing of noisy sur- face meshes. EUROGRAPHICS, 18(3):131–139, 1999. [138] J. Weickert. Anisotropic Diffusion in Image Processing. B. G. Teubner, 1998. [139] Y. Weiss. Segmentation using eigenvectors: a unifying view. In Inter- national Conference on Computer Vision, pages 975–982, 1999. [140] E. Weiszfeld. Sur le point pour lequel la somme des distances de n points donnés est minimum. Tôhoku Math. J., 43:355–386, 1937. [141] G. L. Xu. Discrete laplace-beltrami operators and their convergence. Computer Aided Geometric Design, 21(8):767–784, 2004. [142] H. Yagou, Y. Ohtake, and A. Belyaev. Mesh smoothing via mean and median filtering applied to face normals. In Proceedings of Geometric Modeling and Processing, pages 124–131, 2002. [143] D. M. Yan, Y. Liu, and W. P. Wang. Quadric surface extraction by variational shape approximation. In Proceedings of Geometric Model- ing and Processing, pages 73–86, 2006. [144] Y. Yu, K. Zhou, D. Xu, X. Shi, H. Bao, B. Guo, and H.-Y. Shum. Mesh editing with Poisson-based gradient field manipulation. ACM Trans. Graphics (SIGGRAPH), 23(3):641–648, 2004. [145] H. Zhang and R. Liu. Mesh segmentation via recursive and visually salient spectral cuts. In Proceedings of Vision, Modeling, and Visual- ization, 2005. 127 [146] K. Zhou, J. Snyder, B. Guo, and H. Y. Shum. Isocharts: Stretch- driven mesh parameterization using spectral analysis. In Proceedings of Symposium on Geometry Processing, 2004. [147] M. J. Zukoski, T. Boult, and T. Iyriboz. A novel approach to medical image compression. Int. J. Bioinformatics Research and Applications, 2(1):89–103, 2006. 128


Citation Scheme:


Citations by CSL (citeproc-js)

Usage Statistics



Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            async >
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:


Related Items