Adaptive Contextual Regularization for Energy Minimization Based Image Segmentation by Josna Rao B.A.Sc., Simon Fraser University, 2007 A THESIS SUBMITED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Applied Science in THE FACULTY OF GRADUATE STUDIES (Electrical & Computer Engineering) THE UNIVERSITY OF BRITISH COLUMBIA (VANCOUVER) May 2010 © Josna Rao, 2010 ii Abstract Image segmentation techniques are predominately based on parameter-laden optimization processes. The segmentation objective function traditionally involves parameters (i.e. weights) that need to be tuned in order to balance the underlying competing cost terms of image data fidelity and contour regularization. Conventionally, the associated parameters are set through tedious trial and error procedures and kept constant over the image. However, spatially varying structural characteristics, such as object curvature, combined with varying noise and imaging artifacts, significantly complicate the selection process of segmentation parameters. This thesis contributes to the field of image segmentation by proposing methods for spatially adapting the balance between regularization and data fidelity in energy minimization frameworks in an autonomous manner. We first proposed a method for determining the globally-optimum spatially adaptive regularization weight based on dynamic programming. We investigated this weight with a basic minimum-path segmentation framework. Our findings indicated that the globally-optimum weight is not necessarily the best weight as perceived by users, and resulted in poorer segmentation accuracy, particularly for high noise images. We then investigated a more intuitive approach that adapts the regularization weight based on the underlying local image characteristics to more properly address noisy and structurally important regions. This method, which we termed contextual (data-driven) weighting, involved the use of a robust structural cue to prevent excessive regularization of trusted (i.e. low noise) high curvature image regions and an edge evidence measure, where both measures are gated iii by a measure of image quality based on the concept of spectral flatness. We incorporated our proposed regularization weighting into four major segmentation frameworks that range from discrete to continuous methods: a minimum-path approach [9], Graph Cuts [14], Active Contours Without Edges [24], and a contextual Mumford-Shah based approach [38]. Our methods were validated on a variety of natural and medical image databases and compared against the globally-optimum weight approach and to two alternate approaches: the best possible (least-error) spatially-fixed regularization weight, and the most closely related data-driven spatially adaptive regularization method. In addition, we incorporated existing texture-based contextual cues to demonstrate the applicability of the data-driven weights. iv Table of Contents Abstract .............................................................................................................................. ii Table of Contents ............................................................................................................. iv List of Tables ................................................................................................................... vii List of Figures ................................................................................................................. viii Acknowledgements ......................................................................................................... xii 1 Introduction and Background ................................................................................. 1 1.1 Motivation and Problem Statement ...................................................................... 2 1.2 Energy Minimization Segmentation Methods ...................................................... 6 1.2.1 General Framework ................................................................................................... 6 1.2.2 Discrete Methods ....................................................................................................... 8 1.2.3 Continuous Methods ................................................................................................ 13 1.3 Energy Terms ..................................................................................................... 19 1.3.1 External Terms ........................................................................................................ 19 1.3.2 Internal Terms .......................................................................................................... 22 1.4 Regularization of Image Segmentation Methods ............................................... 24 1.4.1 Role of Regularization ............................................................................................. 25 1.4.2 Conventional Methods for Setting Regularization .................................................. 26 1.5 Related Work ...................................................................................................... 27 1.5.1 Spatially Adaptive Regularization ........................................................................... 27 1.5.2 Estimation of Curvature ........................................................................................... 33 1.6 Thesis Contributions .......................................................................................... 35 2 Globally Optimal Regularization Weights ........................................................... 38 2.1 General Framework ............................................................................................ 38 2.2 Optimization Process .......................................................................................... 41 2.2.1 Djikstra’s Method in 3D .......................................................................................... 41 v 2.2.2 Implementation Details ............................................................................................ 42 2.3 Validation and Results ....................................................................................... 44 2.3.1 Performance Criteria ................................................................................................ 44 2.3.2 Synthetic Data Validation ........................................................................................ 46 2.3.3 Medical Data Validation .......................................................................................... 49 2.3.4 Natural Scene Data Validation ................................................................................ 52 2.4 Deficiencies in Globally Optimum Weight Model ............................................ 55 3 Data-driven Regularization Weights..................................................................... 59 3.1 Overview of method ........................................................................................... 60 3.2 Noise Evidence Cue ........................................................................................... 61 3.3 Edge Evidence Cue ............................................................................................ 65 3.4 Reliability Measure Formulation ....................................................................... 65 3.5 Curvature Cue .................................................................................................... 66 3.5.1 Curvature Formulation ............................................................................................. 66 3.5.2 Normalized Rescaled Curvature .............................................................................. 67 3.5.3 Noise-Gated Curvature ............................................................................................ 69 3.6 Data-Driven Regularization Weight Formulation .............................................. 71 3.6.1 Cue Combination and Mapping to Weight .............................................................. 71 3.6.2 Incorporation of Texture and Additional Cues ........................................................ 74 3.7 Incorporation of Regularization Weights into Segmentation Frameworks ........ 75 3.7.1 Minimum-path Frameworks .................................................................................... 76 3.7.2 Graph Cuts ............................................................................................................... 76 3.7.3 Active Contours without Edges ............................................................................... 77 3.7.4 Contextual Mumford-Shah Framework ................................................................... 85 3.8 Validation and Results ....................................................................................... 86 3.8.1 Performance Criteria ................................................................................................ 86 vi 3.8.2 Parameter Selection and Implementation Details .................................................... 88 3.8.3 Computational Performance .................................................................................... 88 3.8.4 Synthetic Data Validation ........................................................................................ 89 3.8.5 Medical Data Validation .......................................................................................... 95 3.8.6 Natural Scenes Data Validation ............................................................................. 111 4 Conclusions ............................................................................................................ 126 4.1 Discussion ........................................................................................................ 126 4.2 Limitations and Future Directions .................................................................... 128 Bibliography .................................................................................................................. 131 vii List of Tables Table 1: Average error over 25 noise realizations per image produced by least-error fixed weights and globally optimal weights for synthetic set of data. ....................................... 48 Table 2: Average error over 25 noise realizations per image produced by data-driven weights, least-error fixed weights, and globally optimal weights for synthetic set of data............................................................................................................................................ 93 viii List of Figures Figure 1: Examples of degradations in medical and natural images. ................................ 3 Figure 2: Example of role of regularization in degraded images. ....................................... 4 Figure 3: Example of structurally-important regions with high curvature. ........................ 5 Figure 4: Importance of the regularization weight in the segmentation process.. .............. 7 Figure 5: Diagram of wave-front expansion process in Djikstra’s method. ..................... 11 Figure 6: Value function in dynamic programming. ........................................................ 11 Figure 7: Segmentations on a synthetic image using spatially fixed regularization weights. ............................................................................................................................. 27 Figure 8: Texture cue devised in Erdem and Tari [35].. ................................................... 30 Figure 9: Curvature of an isointensity curve measured by the rate of change between the gradient angle and unit normal vector nnull. ........................................................................ 33 Figure 10: Graph search algorithm using Djikstra’s method. ........................................... 43 Figure 11: Segmentation of synthetic images with decreasing noise variance and decreasing Gaussian blurring from left to right.. .............................................................. 47 Figure 12: Profile of globally optimal weights along contour of Figure 11(a). ................ 49 Figure 13: Segmentation of corpus callosum (CC) structure from sagittal MR slice. ...... 50 Figure 14: Segmentation of cortical boundary. ................................................................. 51 Figure 15: Dice similarity coefficient (DSC) of segmentations from globally-optimal weights (GO) and least-error fixed weights for segmentation of white matter in coronal BrainWeb slices using the (a) T1 modality, (2) T2 modality, and (3) PD modality. ....... 52 Figure 16: Segmentation of leaf image from McGill database [71]. ............................... 53 Figure 17: Segmentation of flower image from McGill database.. .................................. 54 Figure 18: DSC of segmentations from least-error fixed weights (F) and globally-optimal weights (GO) on (a) 8 images from ImageNet database, (b) 11 images from PASCAL database, and (c) 24 images from McGill database .......................................................... 55 Figure 19: Spectral behavior of 1D signals....................................................................... 61 ix Figure 20: (a) Synthetic image with AWGN increasing in variance from left to right, and (b) resulting noise measure null(null,null) where black intensities correspond to a measure of 0 and white intensities to a measure of 1.. ........................................................................... 63 Figure 21: (a) Original synthetic image. (b) Average local spectral flatness versus variance of added AWGN for image in (a) with various levels of Gaussian blurring. ..... 64 Figure 22: (a), (b), (c) Synthetic images and (d), (e), (f) resulting gated curvature measure nullnull(null,null) where black regions correspond to a measure output of 0 and white regions to 1. ........................................................................................................................................... 70 Figure 23: (a) Original synthetic image with AWGN of variance 0.4. ............................ 71 Figure 24: Surface plot of null(nullnull,nullnull) based on the gated edge evidence (nullnull) and gated curvature cue (nullnull), and on different selections of the parameters nullnull and nullnull. ................... 73 Figure 25: (a), (b) Synthetic images with varying characteristics. (c), (d) Noise measure null(null,null). (e), (f) Noise-gated edge cue nullnull(null,null). (g), (f) Noise-gated curvature cue nullnull(null,null). (i), (f) Reliability weight null(null,null). ..................................................................... 74 Figure 26: Segmentation of grey object in synthetic image corrupted by AWGN with increasing standard deviation. ........................................................................................... 90 Figure 27: Segmentation accuracy (DSC) as increasing levels of noise are added to the synthetic image of Figure 26(a).. ...................................................................................... 90 Figure 28: Synthetic dataset testing with minimum-path approach. ................................. 91 Figure 29: Segmentation of a synthetic image using GC with adaptive regularization.. .. 94 Figure 30: Difference in average DSC between adaptive regularization GC and fixed regularization GC for images with increasing numbers of ellipses. ................................. 95 Figure 31: Segmentation from (a) proposed data-driven weight method for a corpus callosum MR image. ......................................................................................................... 96 Figure 32: CC 52-image dataset DSC results with minimum-path segmentation method. ........................................................................................................................................... 97 Figure 33: DSC results for IBSR dataset of (a) coronal and (b) transverse MR slices from 18 subjects when segmenting for white matter using minimum-path approach.. ............. 97 Figure 34: DSC results for slices from coronal BrainWeb dataset when segmenting for white matter using minimum-path approach.. .................................................................. 98 Figure 35: DSC results for segmentation of cancer tissue in 8 mammography images from DDSM database using minimum-path approach. .................................................... 98 x Figure 36: Segmentation of MR data from BrainWeb using GC with curvature-modulated regularization.. ................................................................................................................ 100 Figure 37: Segmentation of noisy MR data from BrainWeb using GC with curvature-modulated regularization. ............................................................................................... 101 Figure 38: DSC results for BrainWeb segmentation of white matter using graph cuts approach for (a) T1, (b) T2, and (c) PD modalities with increasing noise levels.. ......... 102 Figure 39: DSC results for segmentation of cancer tissue in 8 mammography images from DDSM database using the graph cuts approach.. ................................................... 103 Figure 40: DSC results for IBSR dataset of (a) coronal, (b) transverse and (c) sagittal MR slices from 18 subjects when segmenting for white matter (for (a) and (b)) and CC (for (c)) using graph cuts........................................................................................................ 103 Figure 41: Segmentation of white matter in PD coronal slice from BrainWeb with 5% noise.. .............................................................................................................................. 104 Figure 42: ACWE segmentation of mammography image from DDSM database [47, 48].......................................................................................................................................... 105 Figure 43: DSC results for BrainWeb segmentation of white matter using ACWE approach for (a) T1, (b) T2, and (c) PD modalities with increasing noise levels. .......... 106 Figure 44: DSC results for segmentation of cancer tissue in 8 mammography images from DDSM database using the ACWE approach. ........................................................ 106 Figure 45: DSC results for IBSR dataset of (a) coronal, (b) transverse and (c) sagittal MR slices from 18 subjects when segmenting for white matter (for (a) and (b)) and CC (for (c)) using ACWE. ........................................................................................................... 107 Figure 46: Segmentation of central ventricle structure from mid-volume coronal slices (BrainWeb). .................................................................................................................... 108 Figure 47: (a) T1 image with 9% noise (BrainWeb), and (b) curvature-modulated reliability. (c) Comparison of segmentations between data-driven weights (green) and fixed weights (red), and (d) between data-driven weights and the Erdem-Tari weights (blue) where yellow regions represent segmentation overlap and black contour represents the ground truth. .............................................................................................................. 109 Figure 48: DSC results for BrainWeb segmentation of central ventricle structure using contextual MS approach for (a) T1, (b) T2, and (c) PD modalities with increasing noise levels. ............................................................................................................................. 110 Figure 49: DSC results for segmentation of cancer tissue in 8 mammography images from DDSM database using the contextual MS approach.. ............................................ 110 xi Figure 50: DSC results for IBSR dataset of coronal MR slices from 18 subjects when segmenting for the central ventricle structure using contextual MS approach. ............. 111 Figure 51: (a) Original leaf image (McGill dataset [71]). (b) Reliability calculated by our proposed method. Contours produced by using (c) data-driven weights (blue), least-error fixed-weight (red), and globally optimal weight (cyan). ................................................ 112 Figure 52: Airplane image (PASCAL dataset [37]) segmented by minimum-path approach. ......................................................................................................................... 113 Figure 53: Segmentations from McGill database images of (a) a flower with a complex background and (b) a leaf with regions of the boundary obscured by water. ................. 114 Figure 54: DSC of segmentations from data-driven weights (DD), least-error fixed weights (F) and globally-optimal weights (GO) on (a) 8 images from ImageNet database, (b) 11 images from PASCAL database, and (c) 24 images from McGill database. . ..... 115 Figure 55: Graph cuts segmentation of flower image from ImageNet dataset [29]. ..... 116 Figure 56: Graph cuts segmentation of natural image ................................................... 117 Figure 57: Graph cuts segmentation of textured natural image (from ImageNet dataset [29]) using the curvature-and-texture modulated weight. ............................................... 118 Figure 58: Segmentation DSC from using data-driven weights (DD), least-error fixed weights (F), and data-driven weights with the texture cue (DD-Text) on natural images from (a) the ImageNet database (8 images), (b) McGill dataset (24 images), and (c) selected textured images taken from ImageNet database (10 images).. ......................... 119 Figure 59: Active Contours segmentation of a natural image. ....................................... 120 Figure 60: ACWE segmentation of flower. .................................................................... 121 Figure 61: Segmentation DSC from using data-driven weights (DD) and least-error fixed weights (F) on natural images from (a) the ImageNet database (8 images) and (b) McGill dataset (24 images). ........................................................................................................ 122 Figure 62: Segmentation of amoeba image from ImageNet. .......................................... 123 Figure 63: Segmentation of cheetah image presented in [35]. ........................................ 124 Figure 64: Segmentation DSC from using data-driven weights (DD), least-error fixed weights (F), data-driven weights with the texture cue (DD-Text), and Erdem-Tari weights (ET) on natural images from (a) the ImageNet database (8 non-textured images), and (b) selected textured images taken from ImageNet database (10 images). .......................... 125 xii Acknowledgements I would like to thank my supervisors, Dr. Rafeef Abugharbieh and Dr. Ghassan Hamarneh, for their guidance and expertise. I would also like to thank Miranda Poon, Shai Bagon, Yue Wu, Erkut Erdem, and Sibel Tari for providing code for segmentation methods. In addition, I would like to thank my colleagues at the Biomedical Signal and Image Computing Laboratory (BiSICL) for their help and support. Finally, I would like to thank Joe and my parents for their love and encouragement. JOSNA RAO The University of British Columbia May 2010 1 Chapter 1 1 Introduction and Background Image segmentation plays a key component in many fields, ranging from medical image analysis to popular image editing programs. Medical research studies often rely on accurate and robust segmentation techniques to provide key information about anatomical shapes. Alternately, image editing software such as the Adobe Photoshop Suite rely on segmentation methods that accurately capture object boundaries for accurate modification by users. In all these cases, the images encountered may often be corrupted by noise or imaging artefacts that often present difficulties to many image segmentation methods. For these reasons, robust automated image segmentation is a highly sought after goal that continues to defy solution. 2 1.1 Motivation and Problem Statement In medical images, natural and pathological variability as well as noise often result in unpredictable image and shape features that significantly complicate segmentation tasks. For example, MR images often contain measurement noise, partial volume effects, and image nonuniformity due to magnetic field inhomogeneities and magnetic susceptibility variations in the subject [88]. Furthermore, spatially nonuniform noise can result from numerous reconstruction and postprocessing techniques on MR images to correct for intensity inhomogeneity effects [90, 84], and from images obtained with partially parallel imaging (PPI) techniques [82, 18, 19]. Spatially varying noise levels can also result from images obtained with decreased acquisition times and high speedup factors [84]. In addition, regions with missing data or occlusions are commonly encountered in medical data, such as echo dropouts in ultrasound images that leads to irregularities along the feature boundary [99]. Other such degradations in medical images are due to tissue heterogeneity (“graded decomposition” [96]) and patient motion. In general photography applications, objects containing weak boundaries are quite common from poorly focused photos or from objects where sections in the image exceed the depth of field of the camera lens. Figure 1 shows some of these examples. 3 (a) (b) (c) (d) Figure 1: Examples of degradations in medical and natural images. (a) Patient movement during magnetic resonance (MR) scan [1]. (b) Intensity inhomogeneity in MR volumes from lack of bias field correction [89]. (c) Occlusions in natural images [75]. (d) Focus issues in natural images [75]. For all these cases, regularization, or smoothing, plays a crucial role in improving the robustness and accuracy of the resultant segmentations. Through the use of weighted regularization terms in conjunction with data fidelity terms, images plagued by high levels of deterioration, i.e. noise or poor edge contrast, are prevented from causing excessive irregularities and inaccuracies in the resultant segmentation. In order to increase segmentation robustness and accuracy, more regularization is needed in less reliable image regions which suffer from greater deterioration. One such example is shown for the noisy MR scan of a knee in Figure 2(a). A segmentation contour produced 4 with no regularization is erratic and inaccurate (Figure 2(b)). By including some smoothing (regularization) term, irregularities caused by the noise are reduced and a more accurate estimate of the original boundary is formed (Figure 2(c)). (a) (b) (c) Figure 2: Example of role of regularization in degraded images. (a) Noisy MR scan of knee. (b) Contour produced with no smoothing (regularization) term (red). (b) Contour produced with some level of regularization (green). Regularization removes segmentation irregularities to form a better estimate of the true object boundary. However, excessive regularization in regions of the image not plagued by deterioration can result in less detail and loss of key structures in the final segmentation. For example, magnetic resonance (MR) images of the brain typically feature highly detailed structures such as the cortical surface which contain many regions of high curvature, as shown in Figure 3. Excessive regularization of these high curvature regions results in segmentations which fail to capture key characteristics of the original object Figco acrosempirinadevariabgreatspatiashapeadvocrobustheseweighure 3: Exampronal MR bra Most repos an objecically. Thiquate reguility. Spatiaer control ovlly varying characterisate the strot, data-drive parameterst approachele of structurin scan (Brairted approat contour, s fixed aplarization flly adaptiner the segmnoise levelstics, e.g. smng need forn manner t. We also ds are often ially-importannWeb T1 imaches to segmi.e. one thaproach canor a singleg the regulaentation res and edge stooth in som spatially-ado relax the emonstrate nadequate f5 t regions withage [26]) prodentation kt does not result in object drization weiult, allowinrength, but e parts andaptive balanrequirementhow the gloor achieving high curvatuuce a highly veep a unifor vary spatboth excesepending oghts across g it to adaptalso to objec jagged in cing of cos on the usebally optim accurate se re. Cortical farying objectm level of rially and issive regulan the objea segmenta not only tots with spatothers. In tht terms in ar to painstaized approagmentation.olds in mid- boundary. egularizatio determinerization anct boundartion provide images witially-varyinis thesis, wn automatedkingly tweach and fixe n d d y s h g e , k d 6 1.2 Energy Minimization Segmentation Methods Our work focuses on modulating the regularization of energy-minimization segmentation methods. In this section, we will examine in detail several modern and popular segmentation approaches ranging from discrete to continuous, and we will discuss the tradeoff between data fidelity and regularization terms in these methods. 1.2.1 General Framework The vast majority of existing segmentation methods are predominantly based on parameter-laden optimization procedures designed to produce `optimal' segmentations at their minimum. These techniques represent the energy of segmentation as a combination of smoothing terms and data fidelity terms. The ‘optimum’ segmentation is the segmentation which represents the minimum energy of all possible segmentations. Regularization terms are called the internal energy, and data fidelity terms are referred to as the external energy. A simplified but general form of the cost or energy, null, of a segmentation, null, of an image, null, is null(null|null,null,null) =nullnullnullnullnull(null) +nullnullnullnullnull(null|null) (1) nullnullnullnull is the internal energy term contributing to the regularization of the segmentation, and nullnullnullnull is the external energy term contributing to the contour's conformity to desired image features, e.g. edges. These terms will be discussed in more detail in Section 1.3. The weights null (referred to as the regularization weight) and null in (1) control the highly sensitive tradeoff between the regularization terms and data fidelity terms. The resultexamoptimwill bthe mreflecdifferFigurean obshownweighthe obobject how In motypicalgordrivinant segmenple, setting ization proe optimal. Sinimum exts boundarent settings (a) 4: Importanscured-bound in green) prht (black). Thescured boun boundary. null and null into best balanst cases, thal non-technithm's inner g motivatiotations canthe weight nullcedure. Instimilarly, seternal energy regions oof the weigh ce of the reguary leaf [75]oduced by ( low regularizdary region. (1) are typce the requis is a veryical end usworking. An for our vary drast to zero resead, the segtting null to zy to be optr not. A dts is shownularization we. Segmentatiob) a low reguation weight The high regically set emirements fo difficult taer, e.g. a clvoiding thework here.7 ically basedults in the ementation wero will resuimal, regardemonstrativ in Figure 4(b) ight in the sens (by a seeularization wproduces a seularization wpirically byr regularizatsk and the inician, who practice of Addressing on how xternal enerhich minimlt in the segless of whee example. egmentation pded minimumeight (red) anegmentation weight fails to the users baion and adhparameters lacks knowad-hoc setti the issue this balancegy playing nizes the inmentation tther the exof segmen(c)rocess. (a) Or-path approd (c) a high ith insufficien capture key sed on theirerence to immay be uninledge of thng of such wof how to is set. Foo role in thternal energhat produceternal energtations from iginal image oach with seedregularizatiot smoothing iregions of th judgment oage contentuitive for e underlyineights is thbest balancr e y s y f s n n e f t. a g e e 8 competing cost terms is of great importance to many related algorithmic formulations in computer vision. More generally, this tradeoff is seen in likelihood versus prior in Bayesian methods [2] and loss versus penalty in machine learning [106, 105]. The role of regularization is discussed in more detail in Section 1.4. A wide variety of energy minimization segmentation methods exist, with each method differing in the type of energy functional used and in the technique used for optimizing the functional. In general, these methods can be classified into two groups based on the type of space the functional is defined on [79, 15]: 1) Functionals defined on a discrete space (discrete set of variables) 2) Functionals defined on a continuous space (continuous contour or surface) Depending on the type of segmentation method, the segmentation null in (1) can differ in representation. In addition, segmentation techniques differ in the choice of external and internal energy terms, as will be discussed in Section 1.3 1.2.2 Discrete Methods Discrete segmentation methods formulate the problem as a combinatorial optimization in a finite space nullnull (a finite set of integer-valued variables) [79]. These methods predominately use a graph-based representation of an image where graph vertices (nodes) correspond to image pixels. These methods can be further divided into explicit methods, such as path-based methods [9, 4, 43, 41, 55], where the object boundary is represented by a path in a graph, and implicit methods, such as graph-cuts 9 approaches [13, 15, 14] where the segmentation is represented functionally. These methods can guarantee finding the globally optimal segmentation. 1.2.2.1 Minimum Path Minimum-path approaches formulate segmentation through modelling the image as a graph that consists of nodes and edges that link to neighbouring nodes. Image pixels represent vertices and the links from each pixel to its eight neighbours represent edges. In these approaches, the segmentation is modelled as a path along the graph consisting of a set of nodes where each node is connected by an edge to a single ‘forward’ node and a single ‘backward’ node. The subgraph formed by a closed path along the graph represents the object or region of interest. Each edge in the graph is given a cost such that the cumulative cost of the path from a start vertex to a goal vertex is the sum of the individual cost of each edge along the path. The problem of finding the optimum segmentation is modelled as a graph-searching problem where the goal is to find the optimum, or least-cost, path between two vertices. Many segmentation methods, such as [15, 17, 108, 47], use a graph-based approach to modelling the segmentation problem. Common examples of minimum path approaches are Amir et al [4], Geiger et al [4], Falcao et al [41], and Jerymyn and Ishikawa [55]. In particular, one popular minimum-path segmentation approach is Livewire [41, 9] where dynamic programming in conjunction with user supplied seedpoints are used to find the optimum path on a graph-based representation of the image. Livewire uses the edges costs of the graph to ensure that the optimum path represents characteristics desired in the segmentation. Each edge cost, or local cost, is a weighted sum of data fidelity and 10 regularization terms. The local cost, nullnullnullnullnullnull(null,null), for the directed edge from pixel p to neighbouring pixel q is as follows: nullnullnullnullnullnull(null,null) =nullnullnullnull(null) +nullnullnullnull(null,null) (2) where a gradient magnitude measure nullnull(null) acts as a data fidelity term and nullnull(null,null) is the regularization term that penalizes longer paths (see Section 1.3 for further discussion of energy terms). The balance between data fidelity and regularization is controlled by the weights nullnull and nullnull which are typically set empirically. In order to determine the optimum contour, Livewire uses dynamic programming, which solves optimization tasks by recursively breaking down the tasks into similar but smaller subtasks which can be solved more easily. In the context of graph-based segmentation, the globally optimum path between vertices null and null is found by recursively finding the optimum path for smaller subgraphs. This process is accomplished by first initializing the vertices with the local cost at that pixel location as determined by equation (2). The target vertex null is then expanded by summing the cumulative cost of null into all adjacent vertices, and then summing the cost of those vertices into their corresponding neighbours. This expansion continuous in order of minimum cumulative cost and produces a ‘wavefront’ that extends over the graph, as illustrated in Figure 5 . At the end of this process, the graph is transformed into what is known as a value function that represents the cost of the globally optimum path from each vertex to the target vertex null, as illustrated in Figure 6. The optimum path from any vertex, including null, to the goal vertex can then be determined by a gradient descent. The benefit is that the globally optimum path from all vertices to the target vertex is now known. This optimization proceconsirefer FigstartFigureValueany nvalue node. 1.2.2. propodure has bests of pixel to Chapter 22 52 35 44 05 4ure 5: Diagraing node is su(a) 6: Value fun map formed ode (pixel) to function wheThe optimal p2 Graph CAnother psed for imen expandecoordinates where 3D g2 12 31 34 56 3m of wave-frommed into unction in dynby Djikstra’starget node. Cre z-axis reprath is then deuts opular graage segmend in recent and a third raph search2 521+= 5 44 05 4nt expansionvisited neighpermis namic program method wheosts are loweesents cost oftermined by ph-based setation in [111 work to a ‘scale’ varies are discu2 1341+2= 31+= 11+= 41+= 6 3 process in Djhbours, which sion from [81(b) ming. (a) Orre intensity vst along pixe the optimal pa gradient degmentation7], further 3D graph wable [81, 64ssed in mor343456232545ikstra’s methare in turn ex]) iginal image alues represenls containing ath from anyscent procedu approach expanded ihere the o]. For moree detail. +5= 82+3=534 34 10 44 6od. Cumulatipanded. (Rep(cwith target nt cost of optistrong edge e (null,null) locatiore. are s-t grapn [13, 16, ptimum pat information+1= 44463 ve cost from roduced with) ode in red. (bmal path fromvidence. (c) 3n to the targh cuts, fir15, 14], anh , ) D et st d 12 proposed outside the field of image segmentation in [48, 13, 83, 53, 60] . Our work will incorporate the implementation of graph cuts in [15]. Unlike minimum-path approaches which define the segmentation as a contour along the object boundary, graph-cuts approaches define the segmentation as a region where each pixel is represented as a binary variable, either inside or outside the object of interest. Graph-cuts defines the energy functional for a segmentation as follows: null(null) = null nullnullnullnull,nullnullnullnull,nullnullnullnullnull,nullnull∈null+nullnullnullnullnull(nullnull)null∈ null (3) null∈ L is the labeling for all pixels null∈ null where L is the label space and null is the set of pixels in image null. nullnull,null is the pairwise interaction penalty between pixel pairs (i.e. the penalty of assigning labels nullnull and nullnull to pixels null and null), null is the set of interacting pairs of pixels, and nullnull measures how well label nullnull fits pixel null given the observed data. nullnull and nullnull,null are discussed in more detail in Section 1.3. From the cost functional, we see that graph-cuts also involves a tradeoff between regularization and data fidelity through the weights nullnull and nullnull in (3). The optimization procedure consists of generating a labeling through two types of moves: expansions and swaps, which changes the labels of large numbers of pixels simultaneously. The method divides the image into nodes, which are the pixels or voxels, terminal nodes, which consist of a source node and a sink node that represent the background and object labels, null-links, which are edges connecting neighboring nodes, and null-links connecting nodes to terminal nodes, where links are assigned a cost based on the edge. The resulting contour is defined as a ‘cut’ through a subset of null-links, 13 separating the sink node (background) from the source node (object). The cut cost is defined as the sum of costs from the edges that were cut. 1.2.3 Continuous Methods Continuous segmentation methods formulate the problem in the continuous space nullnull and use a representation of the segmentation that deforms according to external and internal forces. Additionally, these methods tend to rely on a gradient descent procedure for optimization. These types of methods are divided into explicit models and implicit models. Explicit models are based on an explicitly defined parametric curve that is evolved and deformed, and consist of active contour and snake methods [56, 28, 29, 71, 65]. Implicit models are where the contour is represented as the level-set of a higher-dimensional scalar function, as seen in [21, 86, 76, 85, 77]. Unlike discrete methods, continuous methods can only guarantee finding the local minima of the energy functional. However, continuous methods do have the advantage that pixel connectivity in the final segmentation is guaranteed. Our work will focus on two such continuous methods that are both different approximations of the Mumford-Shah segmentation framework 1.2.3.1 Mumford-Shah Model Mumford and Shah [74] proposed a solution to the problem of image denoising by dividing an image into a smooth cartoon-like component and a noise component such that the image is smoothened but object edges are retained. This concept that has been further analyzed by [12, 44, 78]. To accomplish this, the Mumford-Shah (MS) model was proposed, where a segmentation consists of a piecewise-smooth approximation, null, of the 14 original image, nullnull, and an edge set, Γ. In this formulation, a segmentation has the following energy: nullnullnull(null,Γ) =nullnullnullnull(null,null)−null(null,null)nullnullnullnullnullnullnull+nullnull |∇null(null,null)|nullnullnullnull\null+nullnullnullnullnullℎ(Γ) (4) where Ω⊂ℜnull is a connected, bounded and open subset that represents the image domain, null is the original image defined on ℜ , Γ⊂Ω is the edge set segmenting Ω, null is the piecewise smooth approximation of null, and null and null are the scale space parameters of the model. The first term corresponds to the external energy and penalizes large differences between null and the original image. The regularization terms include null|∇null|nullnullnullnull\null which penalizes large edge sets which would result from more erratic segmentations, and nullnullnullnullnullℎ(Γ) which penalizes an excessively large edge set (and a low smoothened image). The weights null and null control the balance of regularization versus data fidelity. The original MS model is difficult to minimize due to the unknown representation of the edge set Γ. We will present two different approximations of the MS problem: contextual Ambrosio and Tortorelli (AT) approximation, and the Active Contours without Edges (ACWE) approximation. 1.2.3.2 Contextual Ambrosio-Tortorelli MS Approximation The AT approximation of the MS model simplifies the minimization process by introducing a smooth edge indicator function null [3], and has been utilized by many other 15 segmentation methods [8, 5, 37, 39, 87, 92, 94]. The original model incorporated a characteristic function nullnull as the edge indicator. In the AT approximation, the cardinality of Γ is approximated by: 12nullnullnull|∇null|null+(1−null)nullnullnullnullnullnull (5) where null is a parameter set such that as null→0, null(null) ≈0 if null∈Γ and null(null) ≈1 otherwise. This approximation modifies the MS functional to the following: nullnullnull(null,null) =nullnullnull(null−null)null+null(nullnull|∇null|null)+12nullnull|∇null|null+(1−null)nullnullnullnullnullnullnull (6) The AT approximation allows for the partial differential equations (PDEs) that dictate the segmentation evolution to be decoupled into separate evolution equations for the image process and edge set function as follows: nullnullnullnull=∇ ∙(nullnull∇null)−nullnull(null−null);nullnullnullnullnullnullnull=0 (7) nullnullnullnull=∇nullnull−2null|∇null|nullnullnull−(null−1)nullnull;nullnullnullnullnullnullnull=0 (8) where nullΩ is the boundary of Ω and null is the unit normal vector to nullΩ . The AT approximation adds the additional regularization term null|∇null|null that forces edges to be smooth. Although the parameters null and null control the data fidelity term and one of the regularization terms, respectively, null|∇null|nullis not modulated by a regularization weight. From the energy functional, it also difficult to separate regularization of the image 16 process and regularization of the edge process. Additionally, it is important to note that the edge term null itself acts as a weight for the regularization of the image process. From the term nullnull∇null, if an edge exists (null≈0), no regularization occurs. Thus, we follow the Erdem and Tari [38] modification of the AT MS model that allows for proper control of regularization of both the image process and edge process through modifying the evolution equation for the image process as follows: nullnullnullnull=∇ ∙((nullnull)null∇null)−nullnull(null−null);nullnullnullnullnullnullnull=0 (9) where the constant null controls how strongly the edge term null weights the level of smoothening in the image process. This segmentation approach is termed the contextual MS method. The evolution equations (8) and (9) are simultaneously solved for null and null by the Finite Differences numerical discretization technique iteratively where null is updated by the evolution equation while null is kept fixed, and vice versa, and where the iterations are stopped when the solution is stationary. It is important to note that the contextual MS approach is automated in that no initial contour or seeds are used to determine null and null at the initial iterations. Instead, null is initialized as the inverse of the gradient such that regions with high edge evidence result in null≈0. Additionally, the contextual MS approach is primarily used for denoising as it produces a cartoon-like version of the original image. For use as a segmentation method, a binary mask can be formed from closed contours in the edge set. 17 1.2.3.3 Active Contours Without Edges The active contours without edges (ACWE) segmentation approach, proposed by Chan and Vese [24, 23, 22, 97], represents a subset of the original MS model. Here, the original MS segmentation problem is restricted to piecewise constant functions, forming what is known as the minimal partition problem. The original MS formulation had a data fidelity term of null |null(null,null)−null(null,null)|nullnullnullnullnullnull. In the ACWE formulation, the image process is simplified to a binary piecewise constant function as follows: null=nullaverage(null) insideΓaverage(null) outsideΓ (10) null is thus represented by two constants, nullnull and nullnull, which represent the average of the original image inside and outside, respectively, of the object boundary Γ. Essentially, ACWE is a simplification of the original MS model to only segment for an object and background and producing a binary image null with the edge set simplified to an active contour representing the object boundary. As a binary mask is produced, ACWE is geared more towards the application of segmentation rather than denoising. The ACWE energy functional is: null(nullnull,nullnull,Γ) =nullLength(Γ)+null Areanullnullnullnullnullnullnull(Γ)null+ nullnull |null(null,null)−nullnull|nullnullnullnullnullnullnullnullnullnullnull(null)+ nullnull |null(null,null)−nullnull|nullnullnullnullnullnullnullnullnullnullnullnull(null) (11) 18 where the different external energy weights have been replaced with a single weight null and the regularization weights are null and null (in practice, null=0 typically). The weights null and null form the same tradeoff between data fidelity and regularization that is present in the other segmentation methods discussed. To optimize for the energy, the functional is first transferred to a level sets formulation as follows. The segmentation contour Γ⊂Ω is represented by the zero level set of a Lipschitz function null;Ω → ℜ where pixels null interior to the zero-level set of null are labelled as objects and exterior pixels as background. The length and area of the zero level set of null is determined through the use of the Heaviside function null and the one dimensional Dirac function nullnull as follows: nullnullnullnullnullℎ(null=0) =nullnullnullnullnull(null,null)null|∇null(null,null)|nullnullnullnullnull (12) nullnullnullnull(null≥0) =nullnullnullnull(null,null)nullnullnullnullnullnull (13) Through the use of null and nullnull, the energy function is written in level-sets form as follows: null(nullnull,nullnull,null) =nullnullnullnullnullnull(null,null)null|∇null(null,null)|nullnullnullnullnull+nullnullnullnullnull(null,null)nullnullnullnullnullnull + nullnull|null(null,null)−nullnull|nullnullnullnull(null,null)nullnullnullnullnullnull+ nullnull|null(null,null)−nullnull|nullnull1−nullnullnull(null,null)nullnullnullnullnullnullnull (14) 19 The corresponding Euler-Lagrange equation is derived by minimizing null with respect to null. The optimum contour is then determined by gradient descent where the descent direction parameterized by an artificial time null≥0 is: nullnullnullnull=null(null)nullnull nullnullnullnull∇null|∇null|null−null−null(null−nullnull)null+null(null−nullnull)nullnull=0 (15) The finite differences implicit scheme is first used to discretize null. The initial nullnull is then set by the user as an initial contour. The optimum contour is then determined by iteratively updating null to determine nullnullnullnull by adding the evolution equation scaled to a step size. The process continues until the solution is stationary. Unlike the contextual MS approach, the ACWE approach is not automated and allows user control through the initial contour to segment specific objects in an image. 1.3 Energy Terms We will next discuss the individual external and internal energy terms used in the segmentation methods of Section 1.2 and how these terms respectively enforce data fidelity and regularization of the segmentation and capture properties of the object of interest which is desired in the segmentation. 1.3.1 External Terms The external energy terms of a segmentation method contribute to data fidelity and ensuring that the final segmentation does not greatly differ from the original image, or in some cases, ensuring that the contour is aligned with high edge characteristics of the 20 original image. In general, external terms can be divided into two categories: boundary terms and region terms. 1.3.1.1 Boundary terms Methods such as Livewire and other minimum-path approaches, where the segmentation produces a path representing the object boundary, tend to use boundary terms that penalize paths, or contours, that do not consist of pixels containing high edge evidence. In the case of the simplified Livewire framework used in this thesis, the key edge evidence term is: nullnull=1−|∇null|max(|nullnull|) (16) where the gradient magnitude of the original image null is scaled and inverted using an inverse linear ramp function such that high image gradient regions correspond to a low edge energy or cost. This also acts as a first order positioning of the contour. The original Livewire framework uses additional edge evidence terms such as the Canny edge evidence measure [20] and the Laplacian of Gaussian second order term [9], but these were omitted in the simplified representation used here for the purpose of focussing on the regularization weight balance rather than the segmentation framework itself. 1.3.1.2 Regional terms The graph cuts segmentation method uses a region based term to reflect how well the intensity of a pixel null fits into given intensity models that are known a priori. These intensity models (or histograms) are created from seeds provided from the user representing the object labels and the background labels. The graph cuts method makes 21 use of a negative log likelihood function to determine the data fidelity region energy nullnull as follows: nullnull(nullnullnull) =−lnPrnullnullnullnull′nullnullnull′null nullnull(nullnullnull) =−lnPrnullnullnullnull ′nullnullnull′null (17) where the prior probability of a pixel null with intensity nullnull belonging to the object label is determined by an intensity model (histogram) created from the intensities of the object seed pixels, and similarly for the background label [15, 48]. The contextual MS model uses the least square error between the segmentation (image process null) and the original image as the data fidelity term as follows: nullnullnullnullnullnull=nullnullnullnull(null,null)−null(null,null)nullnullnullnullnullnullnull (18) where high differences between the smoothened piecewise-constant segmentation and the original image are assigned a high energy to minimize. The ACWE simplification of the MS model uses a similar least square error term as the external energy, but unlike the contextual MS method, the ACWE model simplifies the segmentation null to a binary image consisting of two constants, nullnull and nullnull, as follows (using the level sets formulation): 22 nullnullnullnullnullnullnullnull=nullnull(null(null,null)−nullnull)nullnullnullnull(null,null)nullnullnullnullnullnull+ nullnull(null(null,null)−nullnull)nullnull1−nullnullnull(null,null)nullnullnullnullnullnullnull (19) where nullnullnull(null) =nullnullnullnullnullnullnull(null)nullnull nullnull≥0nullnullnull(null) =nullnullnullnullnullnullnull(null)nullnull nullnull<0null (20) Large differences between the segmented image and the piecewise constant original image will be assigned large external energies. 1.3.2 Internal Terms Internal energy terms, or smoothening/regularization terms, range from simple penalizations of long contour lengths to terms that enforce shapes or prior knowledge [10]. Here, we will focus on the terms employed by the segmentation methods introduced in Section 1.2. The Livewire framework and the majority of minimum-path segmentation approaches use a regularization term that penalizes longer and jagged contours through estimating the contour length. In the Livewire framework, the local regularization cost of a vertex (pixel) null’s link to neighbouring vertex null is an estimate of the Euclidean distance to that neighbour: nullnull=null (nullnull−nullnull)null+nullnullnull−nullnullnullnull such that diagonal neighbours incur a higher regularization cost [81]. 23 In more general minimum-path frameworks, the internal energy is calculated as the piecewise length of the contour: nullnullnullnullnullnull(null)null=nullnullnull(null)nullnullnull (21) where null is the contour parameterized by some variable null (i.e., null(null) =nullnullnull(null),null(null)null:[0,1] →Ω⊂ℜnull in image null:Ω → ℜnull ). In addition, active contour methods, such as the classical snakes method [56], use a second order regularization term as follows: nullnullnullnull=nullnullnullnull(null)nullnullnullnull+nullnullnullnullnull(null)nullnullnullnullnull (22) In the graph cuts formulation, the internal energy uses a penalty term between neighbouring pixels where a certain penalty is assigned if the pixels are assigned to different labels, thereby favouring similar groupings between neighbouring pixels. This term is as follows: nullnullnullnull=null1nullnullnullnull≠ nullnull0nullnullnullnull= nullnull (23) where pixels null and null are assigned labels nullnull and nullnull, respectively. Typically, the penalty term is then weighted by a term that has a high penalty when neighbouring pixels with highly dissimilar intensities are assigned the same label. In the contextual MS approach, the internal energy consists of two regularization terms: 24 nullnullnullnull=null(nullnull|∇null|null)+12(null|∇null|null) (24) where the first term smoothes the image with a filter radius proportional to the values of nullnull and nullnull, which preserves edges during regularization [38, 94, 8, 44]. The second term, which comes from the AT approximation of the cardinality of the edge set Γ, enforces more smooth edges by assigning a high energy to large edge sets. The ACWE segmentation method uses the length of the level set contour and the total area within the contour as regularization terms as follows: nullnullnullnull=null Lengthnullnull=0null+null Areanullnull≥0null=nullnullnullnullnullnull(null,null)null|∇null(null,null)|nullnullnullnullnull+ nullnullnullnullnull(null,null)nullnullnullnullnullnull (25) which assigns a high energy if the zero level set of null is erratic and long in length, and an additional high energy if the interior of the contour is excessively large (in practice, the area regularization term is not used however). In additional, segmentation methods may have more specific terms designed to enforce data fidelity. For example, in addition to the standard contour length term, the classical snakes segmentation method [56] uses a curvature term for regularization, which will be further discussed in Section 1.5.2. 1.4 Regularization of Image Segmentation Methods In all the segmentation methods that have been discussed in Section 1.2, a tradeoff exists between the external and internal energy terms. In this section, we will discuss the 25 role the regularization weight plays in this balance and the consequences of inadequate or excessive regularization. Additionally, we will discuss the traditional methods for setting this regularization weight. 1.4.1 Role of Regularization Regularization plays a key role in reducing segmentation inaccuracies that arise from image degradations and from object boundary behaviour. Noise, ranging from simple impulse salt-and-pepper noise to complex average white Gaussian noise (AWGN), often arises in digital images during image acquisition. Often, environmental conditions such as low light contribute to noise, as well as image sensors, and corruption during image transfer [46]. The process of removing noise from images can often cause the object of interest to be degraded. For example, applying a low pass filter to remove the noise will often weaken the gradient of object edges. For these reasons, segmentation methods often encounter images that have not had any major noise removal preprocessing performed on them. The segmentation method must therefore by robust to noise levels. Complicating matters is the fact that noise is enhanced more than the image signal during edge detection processes [46, 42]. The end result is that noisy regions contribute high external energies and skew optimization processes to favour contours that contain these noisy regions [70]. By either penalizing longer erratic contours, or penalizing segmentations that differ in intensity or area from the original image object, a sufficient level of regularization is able to produce more robust segmentations. Unlike noise, weak gradients contribute to low external energy, causing optimization processes to favour contours or segmentations that contain pixels from stronger gradients in the image, even if these 26 strong gradients are located farther from the object of interest. Regularization by either penalizing the contour length or segmentation area, or the difference in the intensity from original object to segmentation, produces more accurate results. However, excessive regularization can remove detailed regions in the object if these are mistaken as noise or image deteriorations. We will next discuss the consequences of improper setting of the regularization weight. 1.4.2 Conventional Methods for Setting Regularization Determining the optimum balance between regularization and adherence to image content has predominantly been done empirically and in an ad-hoc manner. Most reported approaches to segmentation keep a uniform level of regularization across an object contour, i.e. one that does not vary spatially and is determined empirically. All possible weights are tested, and the weight which produces the least error segmentation is selected as the best fixed weight. However, compensating for image deteriorations by uniformly increasing the level of regularization until the most degraded region of the image is properly regularized may result in excessive smoothing in those regions that do not require that much regularization. Subsequently, this results in a loss in segmentation accuracy, particularly for objects with highly curved boundaries. For example, consider the synthetic example in Figure 7 of an object with highly varying boundary characteristics, and where the image has been corrupted by AWGN with spatially varying levels. When the example object is segmented by a simple minimum-path approach (simplified Livewire) with a low level of regularization, the highly detailed sections of the object are accurately segmented. However, the sections deteriorated by noise are not 27 regularized and thus produce an inaccurate segmentation. Alternately, if the level of regularization is set high, the structurally important region of the object are excessively regularized. The result is that neither segmentation weight is suitable for the entire image. Figure 7: Segmentations on a synthetic image using spatially fixed regularization weights of 0 (green), 0.3 (red), 0.7 (blue), and 0.9 (purple). Lower regularization weights result in erratic contour behavior in the high noise left region of the image, and higher regularization weights result in poor segmentation of the high curvature right region of the image. 1.5 Related Work 1.5.1 Spatially Adaptive Regularization As addressed in McIntosh and Hamarneh [72], adapting the regularization weights across a set of images is necessary for addressing the variability present in real image data. However, although an optimal weight can be found for a single image in a set, that weight may not be optimal for all regions of that image. In [91], a max-margin approach is used to learn the optimal parameter setting, but required the use of prior knowledge. In [60], Kolmogorov et al. characterized the types of energy functional that can be minimized via graph cuts and solved the optimization problem for a range of parameters rather than a single regularization weight. In Pluempitiwiriyawej et al. [80], the weights are changed as the optimization progresses, but in an ad-hoc predetermined manner. Some form of spatially adaptive regularization over a single image appeared in Dong et al. [35]. For segmenting an aneurysm, they varied the amount of regularization 28 based on the known surface curvature of a pre-segmented vessel. The results demonstrated improvements due to adaptive regularization. However, the regularization weights did not rely on the properties of the image itself, which limited the generality of the method. The closest related work to ours is by Erdem and Tari [38] who proposed a regularizer for the MS segmentation framework with feature preserving capabilities through the use of a contextual feedback. The segmentation method itself was discussed in Section 1.2.3.2. The framework features a term that modulates the level of regularization that the edge process null has on the image process null. This term, nullnull in (9), depends on contextual cues through two types of feedback: negative feedback for feature smoothening, where the regularization shifts towards the maximum value of 1 when a cue measure, null , is large, and positive feedback for feature preservation, where the regularization shifts to the minimum value of 0 when null is large. The two forms of feedback are achieved by nullnull = nullnull +(1−null)null (26) where null=1 for negative feedback, and null=0 for positive feedback. The purpose of the two forms of feedback is to adjust the regularization for the edge process and the image process separately such that certain features, like texture, can be preserved (not regularized) in the image process, and other features, such as texture edges, can be eliminated (regularized) in the edge process. Erdem and Tari focused on four data cues to modulate the regularization: (1) edge continuity, (2) edge (gradient) consistency, (3) a texture edge measure, and (4) a texture 29 region measure. The first cue was designed to create an edge set with linked edges by regularizing any edges that were not sufficiently linked to neighbouring edges, thus forming a coherent edge set and providing some level of weak noise detection, primarily against impulse noise. The level of directional consistency was measured through determining the angle between gradient direction vectors. The second cue estimated edge continuity to handle noise and weak edges that may cause breaking up of an edge contour. Through a support term that measures how many neighbouring pixels have an edge indicator, a measure of edge formation was calculated. As the edge is determined to be non-accidental in a region, positive feedback is used so that the diffusivity approaches 0 (feature preservation). The third cue estimated texture edges to prevent texture gradients from being included as part of the object boundary gradient set. The texture in a local window is estimated by comparing the similarity between the central patch, nullnull, and patches shifted to the left, right, above, and below, as shown in Figure 8(a). The similarity metric is the Euclidean distance transform between each shifted patch and the central patch (which acts as the template), producing the similarity distributions nullnullnullnull,nullnullnullnullnullnull,nullnullnullnullnullnull, and nullnullnullnullnullnullnull. If a pixel (null,null) lies in a textured region, the central patch will differ from the shifted patches, and thus nullnullnullnull will differ from nullnullnullnullnullnull and nullnullnullnullnullnull will differ from nullnullnullnullnullnullnull, as illustrated in Figure 8(b). To verify this difference, the Wilcoxon Mann-Whitney test [101] (a rank-sum test) is used to produce a p-value where p-values ≪0.05 indicate a significant difference. Thus the estimate of texture is determined as follows: 30 null(null,null) =1−expnull–nullnullnullminnullnullnull(null,null),nullnull(null,null)nullnullnull (27) where nullnull is a decay rate parameter and nullnull(null,null) and nullnull(null,null) represent the p-values returned from the Wilcoxon Mann-Whitney test between nullnullnullnull and nullnullnullnullnullnull and between nullnullnullnullnullnull and nullnullnullnullnullnullnull. If texture exists around the pixel at (null,null), nullnull(null,null) and nullnull(null,null) will be low, indicating a significant difference between nullnullnullnull and nullnullnullnullnullnull and between nullnullnullnullnullnull and nullnullnullnullnullnullnull, thus producing a low null(null,null). If the pixel at (null,null) lies in a piecewise constant region, the shifted patches will not differ from the central patch, and thus the distributions will not differ from one another, producing high p-values and therefore a high null(null,null). We note that this method suffers from mistaking some non-texture edges as texture since non-texture edges will cause the shifted distributions to slightly differ from one another. (a) (b) Figure 8: Texture cue devised in Erdem and Tari [38]. (a) In a piecewise-constant region of the image, the similarity distributions between the central patch, nullnull, and each of the patches shifted in 4 directions are similar to one another. Thus the p-value produced by the Wilcoxon Mann-Whitney test on nullnullnullnullnull and nullnullnullnullnullnull (similarity distributions to patches on the left and right) is high and on nullnullnull and nullnullnullnullnull is high. (b) In a textured region of the image, the similarity distributions between nullnull and each of the shifted patches differ from one another. Thus the p-value produced by nullnullnullnullnull and nullnullnullnullnullnull is low, and by nullnullnull and nullnullnullnullnull is low. 31 The fourth cue, local scale, preserved texture by preventing excessive smoothening in these regions. Using a patch around a pixel null, the median of the differences between the gradient magnitude at pixels in the patch and the median of the gradient magnitudes is determined. Texture regions will have large median gradient difference values (due to variation in the gradient magnitude in texture regions). This value is then used in an exponential decay function with a decay rate such that high values of the median difference results in the diffusivity warping towards 0 so that texture regions have feature preservation (no smoothening in these regions). After each cue is calculated, the cues are then combined by simply taking the product of the measures. Kokkinos et al [59] proposed a spatially adaptive texture estimation measure through an amplitude/frequency modulation model of images that allows for a probabilistic discrimination between edges, textured and smooth regions. In [59], a texture cue, a loosely defined context-based classifier cue, and an intensity cue were used to distinguish between texture edges and edges between different objects. Only the latter edges were then used to dampen the curve evolution and define the segmentation boundary. Gilboa et al [45] presented a graph-cut based segmentation framework with spatially varying regularization through edge weights in the graph using a gradient magnitude-based cue. Malik et al [69] proposed the Normalized Cuts segmentation framework to regularize segmentation in textured regions through the use of local texture and gradient cues. Their work was also significant for the use of ‘cue gating’, where gradient information is suppressed in high texture regions through the following: 32 nullnull(null,null) =null1−null(null,null)nullnull(null,null) (28) where null(null,null) is the orientation cue (estimate of gradient strength), null(null,null) is the texture cue, and nullnull(null,null) is the gated orientation cue. In addition, we note that the methods of [80, 35, 69, 38] that use continuous segmentation frameworks only modulate the regularization term and leave the data fidelity term as is. 1.5.1.1 Deficiencies in Existing Methods The key problem with the methods described in Section 1.5.1 is that none consider object structure when controlling the regularization, particularly object curvature. As demonstrated in Section 1.1, curvature can often play a key role in distinguishing objects. For example, white matter in MR images has a highly undulating surface that requires low regularization in regions of high curvature. In addition, no method has a strong method for estimation of image noise. The method for handling noise in Erdem and Tari is a weak measure since a noise model is not considered, and instead a measure is formulated based on the connectivity of edges, which can be easily disrupted by texture edges and by stronger forms of noise (such as AWGN). One last problem with the methods of Section 1.5.1 is that each method is tied to a particular segmentation approach and often requires prior knowledge. Our goal in this thesis is to seek a more unified approach to adaptive regularization that can be generalized to many energy minimization segmentation methods with minor modifications to the energy formulation. In addition, in Chapter 3, we will show improvements obtained from combining the texture cue of Erdem and Tari with our proposed data-driven regularization weight. 33 1.5.2 Estimation of Curvature A key section of this thesis (Chapter 3) focuses on a structural measure of object curvature. Existing curvature methods focus on estimating either the curvature of segmentation contours, or estimating the curvature of iso-intensity contours in the image in an attempt to estimate object curvature. Kitchen and Rosenfeld [57] introduced a basic 2D curvature measure for iso-intensity contours in an image (posing the problem as that of corner detection) which has been used by many others [36, 56, 52]. In [57], the curvature of a 2D contour null(null,null) was determined by measuring the rate of change between the gradient angle at a point in the image, null = tannullnull(nullnull,nullnull) , and the unit vector perpendicular to the gradient direction, nnull= (−sinnull,cosnull) , to form the curvature estimate Κ as follows Κ =nullnullnullnnull=nullnullnullnullnullnull−2nullnullnullnullnullnullnull+nullnullnullnullnullnullnullnullnullnull+nullnullnullnullnull/null (29) as shown in Figure 9. Figure 9: Curvature of an isointensity curve measured by the rate of change between the gradient angle and unit normal vector nullnull. tn Tangential circleIsointensity curve34 Segmentation methods such as the classical snakes active contour model [56] use this curvature term as an external energy to attract the segmentation towards high curvature regions of the image. The method calculates the curvature of the level contours and assigns a high energy for contours with high curvature. It is important to differentiate this from weighting the regularization by using the curvature. Instead of adapting how the regularization changes based on how the curvature changes, these methods instead use a curvature term in the energy functional such that the optimum contour will have low curvature if the level of regularization is high (reducing high curvature erratic regions). The methods do not verify if the high curvature regions are in fact due to noise or due to structure however. Cohen et al [27] presented a method to incorporate curvature information into the motion of deformable objects, and track high curvature points through a time series of images. The motivation was that high curvature points in an image are anatomically important and should be matched correctly to the deformable curve. Hermann and Klette [52] presented a survey of various common 2D curvature estimators using differential geometry. However, these methods involve knowledge of the entire curve and were not explicitly based on image intensity information. All methods involved a preprocessing step where the longest straight line segment at each pixel that is tangent to the curve is mapped such that the curve is discretized into a series of straight line segments. Many corner detection methods have been proposed through the use of the Curvature Scale Space (CSS), first described in [73] and expanded in [49, 103, 107]. These methods used the same basic corner detector as [57, 36, 56, 52], but also 35 determined the optimum scale that the curvature should be determined in. Depending on scale, the resulting curvature can change dramatically. If the goal is to highlight any sharp point of curvature in the image, the curvature must be determined in all scales. The CSS methods approached this problem by computing the curvature at the highest scale nullnullnullnullnull, thresholding the values to get possible corner candidates, and then tracking the corners to the lowest scale for localization. These methods suffer from the issue that the detection of the corner candidates is done in a single scale, nullnullnullnullnull, and involve the use of a global threshold to determine candidate corner regions. Other methods, such as Zhang et al [104] and Awrangjeb et al [6], focus on detecting corner points, which are defined as point of extreme curvature. However, these methods do not provide an accurate measure of curvature for non-extreme regions of the image (i.e., no continuous measure of curvature is provided, only a binary measure) which is not useful for the purposes in this thesis. The curvature estimation method most closely aligned to the method proposed in this paper is that of Lindeberg [68, 67] where a multi-scale approach is used for curvature detection and where a continuous estimate of curvature is provided for iso-intensity contours in the image. This method will be outlined in detail in Chapter 3. 1.6 Thesis Contributions In this thesis, we present two methods for spatially adaptive regularization for general energy-minimization segmentation frameworks and present validation results on a variety of current segmentation techniques. We focus on methods that determine the 36 regularization in an automated manner without the need for prior information. Additionally, we focus on using a single regularization weight as opposed to multiple weights which we leave to future work (see Section 4.2). In particular, we present the following: Contribution 1 In Chapter 2, we present a novel method for deriving globally optimal regularization weights for minimum-path segmentation methods. Through extensive validation with synthetic, medical, and natural scene datasets, we investigated the correlation of the corresponding results to human perception. Our results suggested that global optimums do not always result in ‘correct’ segmentations as determined by experts. We found the segmentations from globally optimal weights to suffer from bimodal behaviour of the weights, poor robustness to noise, and poor reflection of underlying image characteristics, which resulted in large inaccuracies. This contribution was published in the following work: Rao, J., Hamarneh, G., Abugharbieh, R.: Adaptive contextual energy parameterization for automated image segmentation. In: ISVC. Volume 5875-I. (2009) 1089-1100. Contribution 2 To address the shortcomings of our first contribution, in Chapter 3 we proposed the novel concept of locally adaptive regularization weights that incorporate both structure and noise reliability. These contextual weights are the first to adapt regularization in a manner that seeks to preserve structurally important high curvature features in images, and to estimate the reliability of image evidence through a novel and 37 robust signal-to-noise ratio (SNR) measure. Our work is also the first to propose general regularization weights that can be incorporated into any energy minimization segmentation method with minor changes, unlike other methods that tie regularization weights to a specific segmentation method. In addition, rather than using curvature measures as internal or external energy, we use these measures to adapt the importance of any general internal or external energy term in the optimization process. Through validation using synthetic, medical, and natural scene datasets, we demonstrate how the contextual weights produce significantly more accurate results than the traditional approach of least-error spatially-fixed weight, the previously discussed approach of globally optimal weights, and the existing adaptive regularization weights of Erdem and Tari [38]. We demonstrate the generality and applicability of the contextual weights by incorporation into discrete and continuous segmentation frameworks. In addition, we show how our method can easily incorporate contextual cues proposed in other works, such as texture cues. This contribution was published in the following works: Rao, J., Hamarneh, G., Abugharbieh, R.: Adaptive contextual energy parameterization for automated image segmentation. In: ISVC. Volume 5875-I. (2009) 1089-1100. Rao, J., Abugharbieh, R., Hamarneh, G.: Adaptive Regularization for Image Segmentation Using Local Image Curvature Cues. Submitted to: ECCV (2010). In the concluding Chapter 4, we will discuss limitations and areas of future work in this thesis. 38 Chapter 2 2 Globally Optimal Regularization Weights This chapter first discusses the motivation for determining the globally optimal regularization weight, and then discusses the implementation of a graph search in 3D space with references to existing techniques. This is followed by validation on standard synthetic and real databases, and concludes with an analysis of the performance of the method and an explanation of the limitations of this method. 2.1 General Framework A theoretically appealing and intuitive approach for setting the regularization weight is to determine the globally optimal weight by incorporating the weight into an optimization process that determines the globally optimal segmentation. One such way to achieve this is to focus on graph-based minimum-path approaches that allow for 39 modification of the graph such that additional dimensions (to optimize along) can be added. In our work, we will in particular focus on a simple minimum path approach where we will insert our regularization weight in a convex manner. The problem of optimizing minimum-path approaches for the path spatial coordinates and a third variable has been significantly researched in the field of segmentation of tubular structures, and in particular, anatomical vessel segmentation. One such approach is Li and Yezzi [64] who optimized for the spatial path and vessel radius variable simultaneously. Additionally, Wink et al [100] used Djikstra’s algorithm [34] to solve for the globally optimum medial (centerline of the vessel) path and vessel radius. Most recently, Poon et al [81] solved for the optimum vessel radius in conjunction with the median path using a modified version of Livewire, titled LiveVessel. LiveVessel focused on modifying the local cost term for vessel segmentation by adding the vessel radius as variable that played a large role in vessel cost terms. As discussed in Section 1.2, 2D minimum-path segmentation methods optimize a path or contour null which consist of nodes null with spatial coordinates (null,null). From each node null, the path choices are to a neighbouring node null(nullnull,nullnull). In [81], the 3D expansion of the graph search shifts the optimization space to (null,null,nullnullnullnullnull) by expanding the path choices from each node null(null,null,nullnullnullnullnull) to a neighbouring node null(nullnull,nullnull,nullnullnullnullnullnull) with the restriction that null≠nullnull and null≠nullnull in order to form a contour with consecutively linked nodes. Through this method, the globally optimum vessel radius (scale) at each node is guaranteed along with the globally optimum path. 40 In our implementation, we will differ from these methods by using a convex form of the local cost functional. Our formulation employs energy-minimizing boundary-based segmentation, where the objective is to find a contour that correctly separates an object from background. We embed a parametric contour null(null) = nullnullnull(null),null(null)null:[0,1] →Ω⊂ℜnull in image null:Ω → ℜ. We use a single adaptive weight null(null) ∈ [0,1] that varies over the length of the contour and re-write (2) as: nullnullnull(null),null(null)null =null null(null)nullnullnullnullnull(null)+null1−null(null)nullnullnullnullnullnullnull(null)nullnullnull nullnull (30) where nullnullnullnullnullnull(null)null = 1−null∇nullnullnull(null)nullnull/maxnullnull∇nullnullnull(null)nullnull (31) penalizes weak boundaries and nullnullnullnullnullnull(null)null=nullnullnull(null)nullnullnull (32) penalizes longer and jagged contours. Our approach for determining the globally optimum regularization weight is to optimize null in equation (30) for the weight null(null) itself in addition to optimizing the contour. In our discrete setting, this involves optimizing for nullnull(null) =null null(null),null(null),null(null)null. The external energy term of equation (31) remains the same, but the internal energy term of equation (32) now represents the length of the contour in the 3D space (null,null,null) where large 41 changes in the regularization weight are penalized with a higher internal energy. We will explain the optimization process next. 2.2 Optimization Process 2.2.1 Djikstra’s Method in 3D To minimize null with respect to null(null) in (30), we model a 3D graph of dimension null×null×nullnull where the image is of dimension null×null and we discretize the weight into nullnull levels. Each vertex nullnull represents pixel coordinates and a weight (null,null,null). Graph edges nullnullnull =〈 nullnull,nullnull 〉 represent vertex connectedness (e.g. 24-connectedness in 3D graphs). A local cost nullnullnull = nullnull nullnullnullnullnullnullnull,nullnullnull+null1−nullnullnullnullnullnullnull(nullnull) (33) is assigned to each edge nullnullnull, where nullnullnullnull nullnullnull,nullnullnull is the Euclidean distances between nullnull and nullnull. The contour that minimizes the total energy nullnullnullnullnullnull=nullnullnullnullnullnullnull∈null (34) represents the optimal solution for the segmentation. Note that the optimal path nullnull(null) =null null(null),null(null),null(null)null cannot pass through the same nullnull(null),null(null)null for different null, i.e. only a single weight can be assigned per pixel. Our graph search abides by this simple and logical constraint. The optimal null(null) and null(null) that globally minimize (30) are calculated using dynamic programming on this (null,null,null) graph. 42 2.2.2 Implementation Details The 3D graph search is implemented in a similar manner as the 2D graph search described in Section 1.2.2.1 with the additional dimension contributing to an null(nullnull) search process. The key difference is the expansion of a node’s set of neighbouring nodes from 8 to 8nullnull where nullnull is the number of discretized weight levels. In our implementation, we selected nullnull=11 in the range [0,1] with the distance between weight levels as nullnull=0.1. 1. Initialize a list of node costs to infinity. Initialize a list of visited nodes to empty. Initialize a queue of nodes to empty. 2. Set the initial node (start point) as visited, place it in the node queue, and assign it a cumulative cost of zero. 3. Retrieve the node in the queue with the smallest cumulative cost and term this the current node. Calculate the cumulative cost to each of the current node’s unvisited neighbours by adding the cost of the edge nullnullnull from current node null to the neighbour node null to the cumulative cost of the current node null. If this new cost null′null=nullnullnull+nullnull is less than the previously recorded cost, i.e. null′null<nullnull, than the old cost is overwritten with the new cost. Each neighbour is added to the node queue if it is not already there. Once all neighbours of the current node has been visited, mark the current node as visited. 4. Re-sort the node queue such that the node with the lowest cumulative cost is first out of the queue. 43 5. If the node queue is empty (all nodes has been visited), the algorithm is finished. Otherwise, repeat steps 3 and 4. The process is illustrated in the flowchart in Figure 10. Figure 10: Graph search algorithm using Djikstra’s method. The minimum-path segmentation method described in Section 1.2.2.1 is an interactive method since it requires user-entered seedpoints around the boundary of the object of interest. Since our focus in this thesis is not on user interactivity, we selected equally spaced seedpoints around the object boundary that were automatically determined from the ground truth segmentation. The seedpoints consisted of the spatial coordinates and the regularization weight at that node. The spatial coordinates were determined from the ground truth contour, and the regularization weight was simply set to 0.5. For each pair of seedpoints, the minimum-path contour was determined from the 3D graph search. 44 We used the same seedpoint set for the globally optimum weight segmentation and for the least-error fixed weight segmentation for proper comparison of the results. 2.3 Validation and Results We demonstrate the performance of the globally optimum regularization weights on a wide range of datasets, including synthetic images, medical datasets, and natural images. In addition, we validate the method quantitatively using analysis of variance (ANOVA). All our validation is performed against ground truth segmentation determined either functionally (for synthetic images) or by manual segmentation by experts. 2.3.1 Performance Criteria Our measure of error for our synthetic dataset is to compare the vertical (along the y-axis) distance between the contour and the sinusoidal functional that is used to make the synthetic image (See Section 2.3.2 for details about how the synthetic dataset is devised), as follows: nullnull=maxnull(null,null)|null−nullnull(null)| (35) where (null,null) are the spatial coordinates of each node null in the contour null(null), and nullnull(null) is the sinusoidal function representing the synthetic image. Our measure of error for the medical and natural datasets was to form a closed segmentation mask from the contour produced by the method, and to compare this mask to the ground truth segmentation by computing the Dice Similarity Coefficient (DSC) 45 measure that estimates the agreement in pixel labels between two segmentations as follows: nullnullnullnullnull,null=2(nullnullnullnullnull∩nullnullnullnullnull)(nullnullnullnullnull+nullnullnullnullnull) (36) where nullnullnullnullnull represents the binary segmentation mask produced from method null and nullnullnullnullnull represents the binary segmentation mask produced from method null . For the medical and natural datasets, the ground truth segmentation was that produced by manual tracing. We benchmarked the globally optimum weights against the conventional method of determining the least-error (or maximum dice similarity) spatially-fixed regularization weight. This is accomplished by determining: nullnullnullnullnullnull=arg maxnull∈[null,null]nullnullnullnullnullnull,nullnull (37) where nullnull is the segmentation produced by the weight null and nullnull is the ground truth segmentation. The weight is discretized into 11 levels with a step size of 0.1 between each level. We determined the segmentations for 25 trials for each image. This is of importance for synthetic images where noise was added and thus the noise profile is determined randomly for each image. We thus generated 25 noise profiles for each image and determined the segmentation for each image. The error is determined by averaging the resulting 25 segmentation accuracies. To determine if the globally optimum weights differ significantly from the least-error fixed weight, we then performed ANOVA over 46 the set of 25 segmentation accuracies for each image and analyzed the resulting null-value. Computationally, the 3D graph search method required less than 7 minutes for a 768× 576 image when run on a Pentium 4 (3.6GHz) machine using MATLAB code. 2.3.2 Synthetic Data Validation We first tested the method on a dataset of synthetically produced images. Through the synthetic images, we tested extreme cases of object boundary variability, noise variation, edge strength variation, and object curvature variation. We first modelled an object boundary as a sinusoidal function with spatially-varying frequency to simulate varying contour smoothness conditions. The images were produced by the functional: nullnull(null) = nullsinnullnullnull10null (38) where the sinusoidal is parameterized by null=[null2 nullnullnull,null2nullnullnull] and will vary in frequency from nullnull at the start and nullnull at the end of the contour. null represents the amplitude (set to 2) and null represents the frequency width (set to 10). We also added spatially-varying (non-stationary) AWGN patterns of increasing variance. We also spatially varied the gradient magnitude of the object boundary across each image by applying Gaussian blurring kernels at different scales in different locations. We created 16 of these synthetic images carefully designed to cover extreme shape and appearance variations. Two synthetic images with the resulting segmentations are shown in Figure 11. The contour obtained using the globally-optimal weights method is shown in green, along with the contour obtained using the spatially-fixed regularization weight from (37) shown in red. 47 (a) (b) Figure 11: Segmentation of synthetic images with decreasing noise variance and decreasing Gaussian blurring from left to right. (a) Low curvature case, and (b) higher curvature case. Contours produced by globally optimal weights (green), least-error fixed weight (red), and ground truth contour (black). Globally optimal weights are 0 over the whole contour, resulting in little difference between the contours and poor regularization to noise. In addition, Table 1 presents the modified Hausdorff error measure of (35) for segmentations produced by both regularization weights for all images. The error values represent the mean nullnull for 25 noise realizations for each synthetic image. The least-error fixed-weight method had a mean error of 12.05 ± 1.6 while the globally optimal weight method had a mean error of 33.06 ± 3.66. Ground truth contourFixed weight 0GO contourGround truth contourFixed weight 0.1GO contour48 Table 1: Average error over 25 noise realizations per image produced by least-error fixed weights and globally optimal weights for synthetic set of data. Segmentations from the globally-optimal weights have significantly less accuracy than segmentations from the least-error fixed weight. Average p-value ≪null.nullnull. Image # Mean nullnull error over 25 segmentationsFixed weights G.O. weights 1 13.02 17.682 15.04 45.243 8.92 17.44 13.8 38.65 6.8 7.166 12.96 42.27 14.4 42.48 13.28 40.929 11.8 35.7210 13.04 47.6811 14.84 40.0412 6.64 7.9213 10.62 27.2414 12.74 4115 11.92 44.216 13.02 33.64 From Figure 11, it is clear that the globally optimal (GO) weights produce a more erratic segmentation when compared to the least-error fixed weight segmentation. Although the GO weight is accurate for the noise-free regions of the image, the regions with high noise have insufficient regularization. The quantitative results in Table 1 confirm that the GO method produced significantly poorer results when compared to do the least-error fixed weight. We can further analyze the behaviour of the GO weight by plotting the weights over the contour as shown in Figure 12. 49 Figure 12: Profile of globally optimal weights along contour of Figure 11(a). Weights are either 0 or 1 indicating bimodal behaviour. The weight is significantly bimodal, oscillating between 0 and 1. We will discuss the difficulties the globally optimum weight has with noise, and the cause of the bimodal weight behaviour, in Section 2.4. 2.3.3 Medical Data Validation We also tested the globally optimal weights on various medical datasets validated with expert ground truth segmentations. We first used a dataset of sagittal slices from MR scans of the brain taken for 52 subjects. The focus of these images are on the corpus callosum (CC) structure. Segmentation of the boundary of the CC is difficult due to the weak edge separating the CC from the fornix, as labelled in Figure 13(a). Figureimagesegmethe laof the The throuHowecornethe glthe ccorresegmcontobimothe te with spatia 13: Segmene showing thentation from beled colormaweights leadsfornix sectigh the weakver, the regrs on both sobally optimolour mappsponds to entation, paur obtaineddal behaviourms at a timIn additiovarious molly varying (a) tation of cor weak delineglobally-optimp, and with g poor segmenon requires gradient raularization wides of the um weighting as shoa weight rticularly du using globr (either ble. n, we testeddalities fromnoise can bpus callosumation of theal weights wround truth tation of the wa high regther than foeight mustCC structur segmentatiown where of 1. The e to the bially optimaue or red in the method the Braine created w50 (CC) struct CC and forith contour cshown as daseak edge bouularization llowing the be low in oe. Figure 13n on one sublue correglobally omodal behal weights e Figure 13(b on a synthWeb databith this dataure from sagnix structureolour indicatihed black conndary. weight to a strong grarder to accu(b) demonstch CC imagsponds to ptimal weviour of thexhibits an o)) completeetic dataset ase [26, 63set by adju(b) ittal MR slics as labeled.ng weight levtour. The bimllow the codient of the rately segmrates the pee from this a weight oights produ weights. Nptimal, yetly favouringof MR brain, 62, 31]. sting the noe. (a) Origin (b) Resultinel according todal behaviontour to cufornix itselent the sharrformance odataset usinf 0 and reced a pooote how th undesirable only one o scans takeIn particulaise level anal ng o r t f. p f g d or e , f n r, d intenmodacontaintenattracratherspatiacaptuFigurenoise. truth matteweigh weighand psity inhomolity is showin weak edsity levels. ted to the s than the wlly-fixed wre cortical f 14: Segmen(b) Contours(black). Globr and cerebrht contour has In Figuret and the sproton densigeneity levn in Figure ge strengthsThe contoutrong edge beak edge eight produolds. (a) tation of cort from globallally optimal wal fluid regioexcessive regu 15, we preatially-fixedty (PD) moel. The seg14. These c, highly var from theoundary beboundary bces a contical boundaryly optimal weieights resultn rather thanularization in sent the DS weight ondalities wit51 mentationases presentrying object globally otween the getween the our with ex . (a) BrainWights (cyan), l in contour a edge betwehigh curvatuC of segmimages fromh increasingof the whi a difficult s boundariesptimal weigrey matter white mattecessive regeb T1 mid-voeast-error fixttraction to then white matre regions. entations fr the BrainW noise levete matter ucenario sinc, and varyihts, shownand cerebrar and greyularization (b) lume coronaled weight (ree strong edgter and greyom the globeb dataset uls (where ssing the Te the imageng noise an in cyan, il fluid regio matter. Ththat fails t slice with 9%d), and groune between gre matter. Fixeally optimasing T1, T2egmentation1 s d s n e o d ey d l , s 52 were determined for 25 noise realizations per noise level). As with the synthetic images, we averaged the DS over 25 segmentations for each image in each dataset. In the low noise cases, the globally optimum weight and the spatially-fixed weight produce accurate segmentations. However, as the level of noise increases, the segmentation accuracy degrades due excessive or inadequate regularization by both methods. (a) (b) (c) Figure 15: Dice similarity coefficient (DSC) of segmentations from globally-optimal weights (GO) and least-error fixed weights for segmentation of white matter in coronal BrainWeb slices using the (a) T1 modality, (2) T2 modality, and (3) PD modality. As the level of noise in the slices increases, the globally optimal weights perform poorer that the least-error fixed weights due to low regularization (from noise mistaken as high external energy). DSC values are averaged from segmentations of 25 noise realizations per noise level. Our findings indicated poor performance by the globally optimal weights. Further qualitative and quantitative performance evaluation of the GO weights, in comparison with the data-driven weights (discussed in Chapter 3), are provided in Section 3.8. 2.3.4 Natural Scene Data Validation We also tested the globally optimal weights on images of natural scenes from the following datasets: the McGill Calibrated Colour Image Database [75], the PASCAL object recognition database [40], and through the ImageNet database [32, 33] and validated through manual segmentations. 0% 3% 5% 7% 9%0.50.60.70.80.91Noise LevelDice SimilarityT1 BrainWeb Segmentation Accuracy Fixed weightsGO weights0% 3% 5% 7% 9%0.50.60.70.80.91Noise LevelDice SimilarityT2 BrainWeb Segmentation Accuracy Fixed weightsGO weights0% 3% 5% 7% 9%0.50.60.70.80.91Noise LevelDice SimilarityPD BrainWeb Segmentation Accuracy GO weightsFixed weights from curvafrom waterboundregulresultweighnullnullnullnullnullFigurecontouoptimleaf suWe first p[75]. Theseture properthe backgro, and Figuary. In botharization in s in predomt which fornull=0.3 for 16: Segmenrs produced al weight is pbmerged by wresent the s images preties of the ound. Figurere 17 prese cases, the gregions of winately zeromed a moreFigure 17. (a) tation of leaby globally oredominatelyater. Fixed wegmentationsent a segmbject, and 16 presentnts results lobally opteak edge bo levels of re accurate se f image fromptimal weigh 0 and produeight cannot53 results on entation chafrom the vas segmentatfrom a floimum methoundaries. Tgularizationgmentation u McGill dathts (cyan) andces segmenta accurately senatural flowllenge becaurying edgeion results wer with d performs he bimodal , unlike the sing nullnullnullnullnullnullabase [75]. ( least-error ftion with lowegment the toper and leaf se of the h strength anfrom a leafa highly vapoorly due behaviour oleast-error s=0.4 for F(b) a) Original iixed weights regularizatio tip of the leaimages takeighly varyind occlusion obscured brying objecto inadequatf the weightpatially fixeigure 16 an mage, and (b(red). Globaln in regions of. n g s y t e s d d ) ly f 54 (a) (b) Figure 17: Segmentation of flower image from McGill database. (a) Original image, and (b) contours produced by globally optimal weights (cyan) and least-error fixed weights (red). Inadequate regularization by both methods results in oversegmentation and leakage into the background. Figure 18 presents the quantitative results from images taken from the McGill, PASCAL, and ImageNet databases. As with the previous examples, the DSC represents the average over 25 segmentations for each image where each segmentation is produced from different seed placements (all equally spaced over the object boundary). Additional qualitative and quantitative results produced by the globally-optimal weights are presented in Section 3.8 and compared to the data-driven weights proposed in the next chapter. 55 (a) (b) (c) Figure 18: DSC of segmentations from least-error fixed weights (F) and globally-optimal weights (GO) on (a) 8 images from ImageNet database, (b) 11 images from PASCAL database, and (c) 24 images from McGill database. Average error over dataset (and over 25 segmentations for each image) was F = 0.9188, GO = 0.8629 for (a), F = 0.9241, GO =0.8782 for (b), and F = 0.9402, GO = 0.9024 for (c). All p-values ≪null.nullnull. Globally optimal weights produce segmentations significantly less accurate than the least-error fixed weight method. 2.4 Deficiencies in Globally Optimum Weight Model From the performance of the globally optimum method on the wide variety of datasets, we can conclude that the globally optimum weight segmentation produces poorer results than the spatially-fixed weight segmentation. The results highlight the three main drawbacks to this globally optimum (in (null,null,null)) method: (i) it encourages a bimodal behavior of the regularization weight; (ii) it does not explicitly encode image characteristics; and (iii) it combines the weight and segmentation optimization into one process, thus reducing the generality of the method. The first drawback, that the weights are bimodal, means that the weights either take on the extreme values of nullnullnull=0 or nullnullnull=1 but no value in between. This bimodal behavior stems from optimizing (30) when nullnullnullnull(null) >nullnullnullnull(null) and vise versa. In particular, we observe that the following occurs: F GO0.750.80.850.90.951Dice SimilarityImageNet DatabaseF GO0.750.80.850.90.951Dice SimilarityPASCAL DatabaseF GO0.70.750.80.850.90.951Dice SimilarityMcGill Database56 null(null) =null0nullnullnullnull(null) >nullnullnullnull(null)1nullnullnullnull(null) <nullnullnullnull(null) (39) A simple proof of the bimodal behavior of the optimal weight is as follows: If the total energy is represented as in (30) with the adaptive weight null(null) ∈ [0,1], we can rewrite the total energy as null=nullnullnull+(1−null)nullnull=null(nullnull−nullnull)+1 (40) where we assume energies nullnull∈ [0,1]and nullnull∈ [0,1]. Determining the optimum energy by gradient descent/ascent is done by first determining the rate of change of (40) with respect to the adaptive weight, and determining maximum/minimum regions of zero rate of change as follows: nullnullnullnull=nullnull−nullnull=0 (41) If nullnull>nullnull, the slope is positive and the minimum will occur when null=0. Alternately, if nullnull<nullnull, the slope is negative and the minimum will occur when null=1. Thus, regardless of the difference between nullnull and nullnull, the optimal weight will always be bimodal, either 0 or 1. Although large changes in the weight are penalized with a higher internal energy, the resulting behavior is still predominately bimodal since the weight shifts from 0 to 1 in a few steps rather than in a single step (i.e., from 0 to 0.5 to 1) which results in a lower internal energy per step. Bimodal weights cannot address the levels of regularization required in images. From the results in Section 2.3, the least-error spatially fixed weight was always a value between, but not including, 0 and 1. Extreme values of regularization weights mean that 57 the resulting segmentation will either have the full level of regularization, which is a straight line in the case of minimum-path boundary based methods, or have no regularization, which means that the optimal contour will contain any edge evidence in the vicinity of the seedpoints. This lack of regularization results in extremely poor results for images containing regions of high external energy that do not correspond to the object boundary; in particular, for noisy images and for images containing occlusions and weak object edges. The bimodality of the weights leads to the second drawback of this method: the weights have no relationship to the underlying image characteristics. Even though regularization is essential in regions of high image deterioration, the globally optimal weights method has no way of differentiating between non-noisy, or ‘reliable’/trusted, external energy, and unreliably external energy. The result is that high noise is confused as strong edge evidence through the external energy measure, resulting in nullnullnullnull<nullnullnullnull, which forces the globally optimal weight to 0. Essentially, the globally optimal weight is not necessary the ‘correct’ weight as determined by the user and as determined by the ground truth segmentation. In addition, the globally optimal weight does not encode structural information about the object of interest and does not adapt the regularization weight in accordance to regions of higher curvature or texture. This problem is likely due to deficiencies in the optimization function where the energy functional does not accurately represent the problem to be solved. The inclusion of additional energy terms specifically designed for each image segmentation problem could provide globally optimal weights that reflect correct segmentations. However, this reduces the generality of the segmentation method, and may require energy functionals with multiple 58 regularization weights, which is outside the scope of this thesis and is left to future work, as discussed in Section 4.2. The final drawback of the globally optimal weight is that the minimum-path 3D graph search process used to determine the weight is intrinsically combined with the segmentation process. This combination of the weight and segmentation optimization into one process reduces the generality of the method since finding the globally optimal weights for other segmentation frameworks would require significant changes to the energy minimization process. In conclusion, even though optimal with respect to null in (30), the solution proposed in this chapter is incorrect and, as we later demonstrate, inferior to the spatially adaptive balancing of energy cost terms based on deriving a relationship between the weights and image characteristics, as discussed next in Chapter 3. 59 Chapter 3 3 Data-driven Regularization Weights In this chapter, we discuss our approach for spatially adapting the regularization weights based on underlying image characteristics that are measured through a series of contextual cues, which we term ‘data-driven’ regularization. We will first discuss the general framework of the method and how each of the data cues are determined. We will then analyze the limits and performance of each data cue before presenting the validation results on a wide variety of synthetic, natural, and medical datasets. We will demonstrate the generality of the method by incorporating the weights into four popular segmentation methods. 60 3.1 Overview of method Our results from Chapter 2 motivated us to devise a relationship between the regularization weights and the underlying image characteristics. We found that the regularization weights need to vary in a manner that respects trusted edge and structurally important regions but disregards regions of high image deterioration. In particular, we will focus on three such measures: estimating the level of noise in the image, the level of trusted edge evidence, and the level of trusted curvature. We will incorporate concepts from cue gating, and we will later present results from incorporating existing data cues to show how our method is able to fit into existing techniques. We first discuss how regularization should behave under different image characteristic conditions. For noise conditions, regularization should be increased to prevent erratic segmentation behaviour. However, regions of high curvature that correspond to object boundaries should have low regularization to allow the segmentation to capture important structural details. In addition, regions with low edge strength that belong to the object boundary should have high regularization again. Boundary regions with high texture should have high regularization to prevent these texture edges from being included as the contour itself. In each case, we must make sure that the edge, curvature, and texture measure that we devise is a ‘trusted’ measure; i.e. these measures should be noise-free, which we can accomplish by using the noise measure to gate these measures. 61 3.2 Noise Evidence Cue In audio signal processing and compression applications, spectral flatness (SF) is a well known Fourier domain measure used to estimate noise [54, 93]. SF exploits the property that white noise exhibits similar power levels in all spectral bands and thus results in a flat power spectrum, whereas uncorrupted signals have power concentrated in certain spectral bands and thus result in a more impulse-like power spectrum [46]. This is shown in Figure 19 for two 1D signals, one clean (Figure 19(a)) and one noisy (Figure 19(c)). (a) (b) (c) (d) Figure 19: Spectral behavior of 1D signals. (a) Clean signal in spatial domain and (b) corresponding impulse-like behavior in spectral domain. (c) Noisy signal in spatial domain and (d) corresponding flatness in spectral domain. Noisy signals can be identified by examining flatness in the spectral domain. 0 10 20 30 40 50-1-0.500.51Y(x)0 100 200 300 400 50000.20.40.60.8Frequency|Y(f)|0 10 20 30 40 50-6-4-2024Y(x)0 100 200 300 400 50000.20.40.60.81Frequency|Y(f)|62 The original SF measure in audio processing is as follows: nullnullnull(null) =expnull12nullnull lnnullnullnull(null)nullnullnullnullnullnull12nullnull nullnullnull(null)nullnullnullnullnull(42) where nullnullnull(null) = |nullnullnull(null)|null is the power spectrum of the signal (where nullnullnull(null) are the Fourier coordinates of the signal), and null is the frequency. The measure nullnullnull essentially compares the ratio between the geometric mean (numerator of (42)) and the arithmetic mean (denominator of (42)), and is a general measure of flatness. If the Fourier spectrum contains an impulse, meaning that the spatial-domain signal is clean, the arithmetic mean will be much larger than the geometric mean, and nullnullnull→0 . If the original signal is noisy, the Fourier spectrum will be flat, and the arithmetic mean will be very similar to the geometric mean, resulting in nullnullnull→1. Assuming additive white noise, uncorrelated between pixels, we extend the SF measure to 2D and estimate the spatially-varying noise levels null(null,null) as null(null,null) =expnull14nullnullnullnulllnnullnullnullnullnull,nullnullnullnullnullnullnullnullnullnullnullnullnullnullnullnull14nullnullnullnullnullnullnullnullnull,nullnullnullnullnullnullnullnullnullnullnullnullnullnullnull (43) where nullnullnullnullnull,nullnullnull = nullnullnullnullnull,nullnullnullnullnull is the 2D power spectrum of a local window in the image, nullnullnullnullnull,nullnullnull is the Fourier spectrum of the image window and nullnullnull,nullnullnull are the two spatial radian frequencies. As with the 1D SF measure, null(null,null) ∈ [0,1] where null(null,null) =0 corresponds to low noise regions and null(null,null) =1 corresponds to high noise regions. This noise measure responds best to white noise-like patterns (i.e. occupying a 63 very wide and flat spectrum) which are typically the most difficult type of noise. However we note that this method cannot address speckle noise (multiplicative Gaussian noise). We estimate the local noise level by determining null(null,null) using local windows around each pixel and using those windows to calculate the Fourier spectrum. Figure 20(a) shows a synthetic image with added AWGN with increasing variance from right to left, and Figure 20(b) shows the resulting measure null(null,null). As the noise variance decreases, the noise measure decreases in accordance. In addition, the noise measure is lower for the region representing the object edge. (a) (b) Figure 20: (a) Synthetic image with AWGN increasing in variance from left to right, and (b) resulting noise measure null(null,null) where black intensities correspond to a measure of 0 and white intensities to a measure of 1. The noise measure is high for the high noise regions of the image without mistaking edges as noise. One of the benefits of the SF measure is its consistency and robustness to changes in the image edge strength. To demonstrate this, we use the simple synthetic image of Figure 21(a) and determine the mean local spectral flatness over the image versus increasing variances of added noise (Figure 21(b)). We compare the profile of SF versus noise variance for the image of Figure 21(a) with different levels of Gaussian smoothing. As we decrease the edge strength in the image, the profile of the mean estimated noise level over the noise variance remains the same. This allows the estimation of the noise to remain decoupled from the estimation of edge strength. 64 (a) (b) Figure 21: (a) Original synthetic image. (b) Average local spectral flatness versus variance of added AWGN for image in (a) with various levels of Gaussian blurring. As the edge strength weakens, the spectral flatness remains approximately constant, demonstrating how the noise detection is not affected by the edge strength. The Fourier spectrum for each window region is determined through the fast Fourier transform (FFT). The size of the window for estimating the local spectral flatness plays a role in the quality of the resulting noise estimate. When using a smaller sized window, some correlation may exist between noise pixels, resulting in lowered flatness in the spectral domain and a resulting inaccurate lower noise estimate. Alternately, using a larger window size reduces the local nature of the measure. This dilemma is due to the dependence of noise on the scale of the image. As a tradeoff, we selected a window size of 15×15. In addition, we note that we have targeted AWGN in our measure as it represents one of the more difficult noise types to filter for in pre-processing steps and thus must be considered in images used for segmentation. Additional noise models will be considered in the future, as discussed in Section 4.2. 0 0.2 0.4 0.6 0.8 100.10.20.30.40.50.60.7Variance of noiseSF SF for original imageSF for blurring level 1SF for blurring level 2SF for blurring level 365 3.3 Edge Evidence Cue We estimate the edge evidence in the image by using the measure: null(null,null) =maxnull|∇nullnull(null,null)|,null∇nullnull(null,null)nullnull (44) where ∇nullnull(null,null) and ∇nullnull(null,null) represent the x and y components of the image gradient. We chose this measure rather than the standard gradient magnitude for its rotational invariance in the discrete domain. With the typical gradient magnitude measure |∇null| =null∇nullnullnull+∇nullnullnull, the result if ∇nullnull=1 and ∇nullnull=1 is |∇null| = √2 instead of |∇null| =1. 3.4 Reliability Measure Formulation The edge evidence cue of (44) can be highly unreliable if the image is deteriorated by noise. Therefore, we use the noise measure null(null,null) of Section 3.2 to ‘gate’ the cue such that gradient information is suppressed in high noise regions. This concept of cue gating was first proposed in Malik et al [69] for the purposes of suppressing gradient information in high texture regions (described in further detail in Section 1.5.1). With this noise-gating, our edge evidence cue is as follows, nullnull(null,null) =null1−null(null,null)nullnull(null,null) (45) where nullnull(null,null) ∈ [0,1]. We term this noise-gated edge evidence cue as the image reliability as it represents how reliable an edge is. 66 3.5 Curvature Cue We will next discuss how we estimate the level of curvature of the object boundary through a measure termed the curvature cue. 3.5.1 Curvature Formulation We determine the curvature by a method similar to that of Section 1.5.2 [57, 36]. Let null(null,null;null) be a smoothened image such that null(null,null;null) =nullnull(null,null) ∗nullnull(null,null) (46) where nullnull(null,null) is the original image and null is the Gaussian scale parameter. The unit vector tangent to the isointensity contour, nullnull(null,null;null) passing through a point (null,null) is given as: null(null,null;null) =1nullnullnull,nullnull(null,null)+nullnull,nullnull(null,null)nullnullnull,null(null,null)−nullnull,null(null,null)null(47) where nullnull,null and nullnull,null are the image derivatives along null and null, respectively, at scale null. Denoting the Hessian matrix of null(null,null;null) by nullnull(null,null)as follows nullnull(null,null) = nullnullnullnull,null(null,null) nullnullnull,null(null,null)nullnullnull,null(null,null) nullnullnull,null(null,null)null(48) the local image curvature null(null,null;null) can be calculated as: 67 null(null,null;null) =|nullnullnullnullnull|nullnullnull,nullnull(null,null)+nullnull,nullnull(null,null).(49) Note that the absolute value of null(null,null;null) is used since we are not concerned with differentiating between convex and concave curvature. (49) expands to null(null,null;null) = nullnullnull,nullnullnullnullnull,null−2nullnull,nullnullnull,nullnullnullnull,null+nullnull,nullnullnullnullnull,nullnullnullnull,nullnull+nullnull,nullnullnullnull/nullnull .(50) We follow the method in [67] where null(null,null;null) is enhanced to have a stronger response near edges by multiplication with the gradient magnitude raised to some power, which we chose as 2. The edge enhanced curvature is then nullnull(null,null;null) =nullnullnull,nullnullnullnullnull,null−2nullnull,nullnullnull,nullnullnullnull,null+nullnull,nullnullnullnullnull,nullnullnullnull,nullnull+nullnull,nullnullnull.(51) 3.5.2 Normalized Rescaled Curvature The selection of the Gaussian kernel size null, also referred to as the scale of the image, when smoothening the image in (46) plays a role in determining what sized structures in the image we will obtain meaningful curvature values for. Generally, larger structures will have meaningful curvature values when the original image is smoothened by a larger Gaussian kernel, and vice versa for smaller structures. However, curvature values cannot be compared across scales since the amplitude of the image spatial derivatives decreases with increasing scale. Thus, to compare curvature values across different scales, the curvature must be scale normalized, which is accomplished through the normalized scale coordinates of [67]. Once the normalized rescaled curvature, nullnullnullnullnullnull, 68 is determined, Lindeberg approaches automated scale space selection by selecting scales at which the normalized rescaled curvature assumes maximum values. nullnullnullnullnullnull is determined through scale-normalized coordinates, null, where null=nullnull(52) and where the normalized derivative operators for these coordinates are nullnull=nullnullnull, nullnullnull=nullnullnullnull. (53) We substitute the scale normalized coordinates into (51) as follows: nullnullnullnullnullnull(null,null;null)=nullnullnullnullnullnull,nullnullnullnullnullnullnullnullnull,nullnull−2nullnullnullnull,nullnullnullnullnullnull,nullnullnullnullnullnullnullnull,nullnull+nullnullnullnull,nullnullnullnullnullnullnullnullnull,nullnullnullnullnullnullnull,nullnullnull+nullnullnullnull,nullnullnullnullnull=nullnullnullnullnullnull,nullnullnullnullnull,null−2nullnull,nullnullnull,nullnullnullnull,null+nullnull,nullnullnullnullnull,nullnullnullnullnullnull,nullnull+nullnull,nullnullnull (54) which can be simplified as: nullnullnullnullnullnull(null,null;null) =nullnullnullnull(null,null;null). (55) After the curvature values at each scale have been normalized, the final curvature cue at every pixel is determined by selecting the scale at which nullnullnullnullnullnull assumes a maximum value as follows: 69 nullnull(null,null) =maxnullnullnullnullnullnullnull(null,null;null).(56) 3.5.3 Noise-Gated Curvature Although robust to signal variations, the curvature measure nullnull(null,null) is sensitive to noise and might inaccurately give rise to a strong response at non-structure, high-noise regions of the image. Following the concept of cue gating, as we have previously used to gate the edge evidence cue in Section 3.4, we define a noise-gated curvature cue, nullnull(null,null), that suppresses our curvature cue in high noise regions as follows: nullnull(null,null) =null1−null(null,null)nullnullnull(null,null). (57) We demonstrate the curvature cue on a series of synthetic images shown in Figure 22. Figure 22(a) to (c) are synthetic images of objects with high curvature shapes, and Figure 22(d) to (f) are the resultant nullnull(null,null) measures. The spiral image of Figure 22(b) results in increased detected curvature towards the center region (Figure 22(e)), and the noisy regions of Figure 22(c) do not disrupt the curvature measure (Figure 22(f)). 70 (a) (b) (c) (d) (e) (f) Figure 22: (a), (b), (c) Synthetic images and (d), (e), (f) resulting gated curvature measure nullnull(null,null) where black regions correspond to a measure output of 0 and white regions to 1. In the spiral of (b), the curvature measure increases towards the center. In the noisy image of (c), the resulting gated-curvature is high for the extremities of the shape only. In addition, we demonstrate the need for noise gating on the synthetic image of Figure 23(a) which has added AWGN of variance 0.4. Figure 23(b) shows the non-gated curvature cue which poorly reflects the actual object curvature and is falsely high for noise regions. The noise-gated curvature in Figure 23(c) removes the false positives and enforces higher regularization for the non-edge regions. 71 (a) (b) (c) Figure 23: (a) Original synthetic image with AWGN of variance 0.4. (b) Curvature cue nullnull(null,null) (without noise gating) where intensities of 1 correspond to high curvature and intensities of 0 correspond to low curvature. High noise regions are incorrectly detected as high curvature levels. (c) Noise-gated curvature cue nullnull(null,null) where the sharp corners of the triangle are assigned the highest curvature and noisy regions are disregarded. 3.6 Data-Driven Regularization Weight Formulation We next discuss how the data cues are combined and mapped to the regularization weight such that the weight behaves as discussed in Section 3.1. 3.6.1 Cue Combination and Mapping to Weight To combine our noise-gated local image cues in a meaningful way, we define a mapping of those cues into our single adaptive weight, null(null,null), that satisfies the following requirements: (1) In high trusted (noise-gated) edge evidence, little regularization is needed, regardless of the curvature strength. (2) In regions with low edge evidence, we set the regularization to be inversely proportional to the trusted (noise-gated) curvature such that high curvature regions are not overly regularized. Note that `high' curvature or edge evidence means a value close to 1 as all our cues are normalized. Thus we form the adaptive weight as follows: 72 null(null,null) = 1 − nullnull(null,null)nullnullnullnullnullnullnullnullnull(null,null)null.(58) If nullnull(null,null) is large (approaching maximum value of 1), the exponent has little effect on the resulting weight, and requirements (1) is satisfied. If nullnull(null,null) is low and nullnull(null,null) is non-zero, the noise-gated edge evidence will be raised to a power null1 − nullnull(null,null)null≈ 0, resulting in a lower null(null,null), satisfying requirement (2). Note that the detrimental effects from noise are handled by this model through the noise-gating of the cues. We refer to nullnull(null,null)nullnullnullnull null nullnullnullnull(null,null)null as the curvature-modulated image reliability measure. We include the parameters nullnull and nullnull to allow minor adjustments of how strongly the edge evidence and curvature term should affect the regularization weight. We stress that these parameters are to allow for user tweaking and are set to a constant value over the image (i.e., the weights still vary spatially in an automated manner). Figure 24(a) shows a surface plot of the regularization weight null(nullnull,nullnull) as nullnull(null,null) and nullnull(null,null) vary from 0 to1, and where the parameters nullnull=1 and nullnull=1. As both of the cues increase, the regularization weight decreases since the local region of the image is considered more ‘reliable’ and regularization is not needed. However, as the cues decrease, the regularization weight increases since the edge evidence is no longer present or is not trusted (i.e. high noise is present). In addition, Figure 24(b) to (d) shows null(nullnull,nullnull) for parameters settings of (nullnull,nullnull) = (0.8,0.4) , (nullnull,nullnull) = (1,0.2) , and (nullnull,nullnull) = (0.2,1), respectively. The parameters change the level of concavity of the function, and the maximum weight produced. 73 (a) (b) (c) (d) Figure 24: Surface plot of null(nullnull,nullnull) based on the gated edge evidence (nullnull) and gated curvature cue (nullnull), and on different selections of the parameters nullnull and nullnull. (a) nullnull=null, nullnull=null. (b) Default setting nullnull=null.null, nullnull=null.null . (c) nullnull=null, nullnull=null.null , (d) nullnull=null.null, nullnull=null. The parameters change the level of concavity of the function and the maximum weight produced We demonstrate the cues and resulting weight measure for two synthetic images with a sinusoidal varying boundary (produced by (38) in Section 2.3.2), spatially varying AWGN and Gaussian smoothening with spatially varying kernel sizes, as shown in Figure 25(a) and (b). The resulting measures (Figure 25(c) to (j)) demonstrate robustness to noise and a final weighting that regularizes high noise regions but lower regularization in high curvature and strong trusted edge regions. Figure(f) Noweighcolumchanghas himeasu3.6.2 texturedges 25: (a), (b) Sise-gated edght null(null,null). Blan image has ing boundaryigher curvature. IncorpoIn many ne edges rat from being(a) (c) (e) (g) (i) Synthetic imae cue nullnull(null,nullck intensitiesspatially var smoothness (re and noise ration of atural imagher than fro included iages with vary). (g), (f) N correspond rying noise asmooth on thlevels. The rTexture aes, large gram edges repn the final 74 ing characteroise-gated cuto measures nd blurring e left and jaggesults confirmnd Additidients and lresenting oedge set ofistics. (c), (d)rvature cue of 0 and wh(increasing frged on the rigs the desiredonal Cuesarge curvatubject bound the image,(b) (d) (f) (h) (j) Noise measunullnull(null,null). (i),ite intensitiesom right to ht). The right behavior of re values caries. To pr we must ere null(null,null). (e (f) Reliabilit to 1. The leleft) and wit column imag the reliabilitan arise fromevent texturnsure greate), ty ft h e ty e r 75 regularization occurs in textured regions. We employ a texture measure from Erdem and Tari [38] that estimates the probability of a pixel being near a texture edge, discussed in Section 1.5.1 and shown again here for clarity: null(null,null) =1−expnull–nullnullnullminnullnullnull(null,null),nullnull(null,null)nullnullnull (59) We incorporate the texture cue into our framework by modifying nullnull in (45) to form the texture-gated and noise-gated edge evidence term as follows: nullnull,null(null,null) = null(null,null)null1−null(null,null)null|∇null(null,null)|. (60) Incorporating nullnull,null with our spatially adaptive weight produces the texture-dependent regularization weight, nullnull(null,null), as follows: nullnull(null,null) = 1 − nullnull,nullnullnull(null,null)nullnullnullnullnullnullnull(null,null)null.(61) In addition to texture, any other data cue can be combined into our regularization framework through multiplication into the gated edge evidence term of (60). 3.7 Incorporation of Regularization Weights into Segmentation Frameworks The regularization weight mapping of (58) produces a 2D map of weights over the image. This weight map can be input into any energy minimization segmentation framework with only minor modifications to allow for convex adaptive weighting. To demonstrate this, we will now incorporate the data-driven regularization weight of (58) into four existing segmentation frameworks with minor changes to the energy functional. 76 3.7.1 Minimum-path Frameworks We incorporate the data-driven regularization weight into a basic minimum-path framework that was described in Section 2.1 and repeated here as: nullnullnull(null),null(null)null =null null(null)nullnullnullnullnull(null)+null1−null(null)nullnullnullnullnullnullnull(null)nullnullnull nullnull (62) The external energy is the same as in Section 2.1, however the internal energy now reflects the length of the contour in the 2D space (null,null). The optimization process is through a 2D graph search using Djikstra’s method as described in Section 1.2.2.1. By using this framework, we are able to first test the data-driven weight against the globally optimal weight for validation. 3.7.2 Graph Cuts We incorporated our adaptive weights, null(null), into a graph cuts (GC) based segmentation process [7, 17]. The segmentation energy in this case becomes: null(null) =nullnull(null)nullnullnullnullnullnullnull,nullnullnullnull,null∈null+nullnull1−null(null)nullnullnullnullnullnullnullnullnullnull∈ null (63) where null∈ L is the labelling for all pixels null∈ null, where L is the space of all possible labellings, and null is the set of pixels in image null. nullnullnullnull is the interaction penalty between pixel pairs (i.e. the penalty of assigning labels nullnull and nullnull to pixels null and null), nullnullnullnull measures how well label nullnull fits pixel null given the observed data, and N is the set of interacting pairs of pixels. Refer to Section 1.3 for more details about nullnullnullnull and nullnullnullnull. 77 3.7.3 Active Contours without Edges We next present the ACWE segmentation framework with spatially adaptive regularization. Although we have used a convex weighting scheme in the discrete segmentation frameworks, we will only weight the regularization term itself for the continuous frameworks. This follows the formulation used by the majority of existing spatially adaptive regularization methods for continuous frameworks [38, 80, 35, 69]. We found through testing that a convex weighting results in impedance of curve evolution when the external terms are weighted to zero. However we will leave the analysis of curve evolution and curve stalling in level sets to future work (Section 4.2). We modified (14) in Section 1.2.3.3 to incorporate spatially adaptive regularization by replacing null with an adaptive weight, as follows: nullnullnull(null,null)null =nullnull(null,null)null nullnull(null,null)null|∇null(null,null)|nullnull nullnullnull + null|null(null,null) – nullnull|null nullnullnull(null,null)nullnull+|null(null,null) − nullnull|nullnull1−nullnullnull(null,null)nullnullnullnullnullnull (64) where the segmentation is represented here via a Lipschitz function, null(null,null):Ω→ℜ, where pixels null interior to the zero-level set of null(null,null) are labelled as objects and exterior pixels as background. Also note that we use the notation |∇null(null,null)| =nullnullnullnullnullnullnullnull+nullnullnullnullnullnullnull.(65) 78 We determine null(null,null) that minimizes (64) by using the Euler-Lagrange equation to solve the gradient descent PDE: nullnullnullnull=−nullnullnullnull=−nullnullnullnullnull−nullnullnullnullnullnullnullnull−nullnullnullnullnullnullnullnullnull =0(66) where null = null(null,null)nullnullnull(null,null)null|∇null(null,null)| + |null(null,null)−nullnull|nullnullnullnull(null,null)null+ |null(null,null)−nullnull|nullnull1−nullnullnull(null,null)nullnull (67) and where we use the notation nullnull=nullnullnullnull. We first determine the partial derivative nullnullnullnull as follows: nullnullnullnull = nullnullnullnull(null,null)nullnullnull(null,null)null|∇null(null,null)| + nullnullnullnull|null(null,null)−nullnull|nullnullnullnull(null,null)null+|null(null,null)−nullnull|nullnull1−nullnullnull(null,null)nullnullnull (68) which in expanded form is: 79 nullnullnullnull = nullnullnullnull(null,null)nullnullnull(null,null)null|∇null(null,null)| + nullnullnull|null(null,null)−nullnull|nullnullnullnull(null,null)null+ nullnullnull|null(null,null)−nullnull|null − nullnullnull|null(null,null)−nullnull|nullnullnullnull(null,null)null. (69) We use the property nullnullnullnullnullnull(null,null)null=nullnullnull(null,null)null(70) and the fact that nullnullnull|null(null,null)−nullnull|null=0(71) to simplify (69) as follows: nullnullnullnull = null(null,null)nullnullnullnull(null,null)null|∇null(null,null)| + nullnullnull(null,null)null[|null(null,null)−nullnull|null−|null(null,null)−nullnull|null] (72) where we use the notation nullnullnullnullnullnull(null,null)null=nullnullnullnull(null,null)null. We then determine nullnullnullnullnull from (67) as follows: 80 nullnullnullnullnull = nullnullnullnullnull(null,null)nullnullnull(null,null)null|∇null(null,null)| + nullnullnullnullnull|null(null,null)−nullnull|nullnullnullnull(null,null)null+ |null(null,null)−nullnull|nullnull1−nullnullnull(null,null)nullnullnull. (73) We note that nullnullnullnullnull|null(null,null)−nullnull|nullnullnullnull(null,null)null+|null(null,null)−nullnull|nullnull1−nullnullnull(null,null)nullnullnull = 0. (74) and that nullnullnullnullnull(null,null)nullnullnull(null,null)null|∇null(null,null)| =null(null,null)nullnullnull(null,null)nullnullnullnullnull|∇null(null,null)|= null(null,null)nullnullnull(null,null)null2nullnull2nullnullnullnullnullnullnullnull+ nullnullnullnullnullnullnull= null(null,null)nullnullnull(null,null)nullnullnull|∇null(null,null)| (75) Thus we can simplify (73) as follows: nullnullnullnullnull= null(null,null)nullnullnull(null,null)nullnullnull|∇null(null,null)|.(76) Similarly we obtain the following for nullnullnullnullnull: 81 nullnullnullnullnull= null(null,null)nullnullnull(null,null)nullnullnull|∇null(null,null)|.(77) We then take the derivative of (76) with respect to null as follows: nullnullnullnullnullnullnullnullnullnull = nullnullnullnullnull(null,null)nullnullnull(null,null)nullnullnull|∇null(null,null)|null.(78) Using the product rule null(nullnull)nullnull= nullnullnullnullnull+nullnullnullnullnull,(79) and noting that due to the chain rule, we can make the expansion nullnullnullnullnullnull(null,null)null=nullnullnullnull(null,null)nullnullnull,(80) we expand (78) as follows: nullnullnullnullnullnullnullnullnullnull = nullnullnullnullnull(null,null)nullnullnull(null,null)nullnullnullnull|∇null(null,null)|+null(null,null)nullnullnullnullnullnullnull(null,null)nullnullnullnull|∇null(null,null)|+ null(null,null)nullnullnull(null,null)nullnullnullnullnullnullnull|∇null(null,null)|null = nullnull(null,null)nullnullnull(null,null)nullnullnull|∇null(null,null)|+ null(null,null)nullnullnullnull(null,null)nullnullnullnullnull|∇null(null,null)|+null(null,null)nullnullnull(null,null)nullnullnullnullnullnullnull|∇null(null,null)|null (81) 82 = nullnull(null,null)nullnullnull(null,null)nullnullnull|∇null(null,null)|+null(null,null)nullnullnullnull(null,null)nullnullnullnull|∇null(null,null)|+null(null,null)nullnullnull(null,null)nullnullnullnullnullnullnull|∇null(null,null)|null . Similarly, we take the derivative of (77) with respect to null as follows: nullnullnullnullnullnullnullnullnullnull = nullnull(null,null)nullnullnull(null,null)nullnullnull|∇null(null,null)|+ null(null,null)nullnullnullnull(null,null)nullnullnullnull|∇null(null,null)|+null(null,null)nullnullnull(null,null)nullnullnullnullnullnullnull|∇null(null,null)|null . (82) We then combine (81) and (82). We first note that nullnullnull+nullnullnull|∇null(null,null)|= |∇null(null,null)|(83) and thus that null(null,null)nullnullnullnull(null,null)nullnullnullnullnull|∇null(null,null)|+nullnullnull|∇null(null,null)|null=null(null,null)nullnullnullnull(null,null)null|∇null(null,null)|. (84) We also note that nullnull(null,null)nullnull|∇null(null,null)|+nullnull(null,null)nullnull|∇null(null,null)|=∇null(null,null)∙∇null(null,null)|∇null(null,null)|. (85) Additionally, using the product rule (79), we note that: 83 nullnullnullnullnullnull|∇null(null,null)|null+nullnullnullnullnullnull|∇null(null,null)|null =nullnullnullnullnullnull1|∇null(null,null)|null+1|∇null(null,null)| nullnullnullnullnull+ nullnullnullnullnullnull1|∇null(null,null)|null+1|∇null(null,null)| nullnullnullnullnull =nullnullnullnullnullnullnull1|∇null(null,null)|null+nullnullnullnullnullnull1|∇null(null,null)|nullnull+1|∇null(null,null)|nullnullnullnullnullnullnull+nullnullnullnullnullnullnull =∇null∙∇1|∇null(null,null)|+1|∇null(null,null)|div (∇null). (86) Since the divergence of a scalar function null and a vector null is as follows: div(nullnull) = (∇null)∙null+nulldiv(null), (87) we can simplify (86) by using the substitution null=null|∇null(null,null)| and null=∇null as follows: nullnullnullnullnullnullnull1|∇null(null,null)|null+nullnullnullnullnullnull1|∇null(null,null)|nullnull=nullnullnullnullnullnullnull|∇null(null,null)|null+nullnullnullnullnullnull|∇null(null,null)|nullnull=divnull∇null(null,null)|∇null(null,null)|null. (88) We use (88) to derive: 84 null(null,null)nullnullnull(null,null)nullnullnullnullnullnullnullnull|∇null(null,null)|null+nullnullnullnullnullnull|∇null(null,null)|nullnull=null(null,null)nullnullnull(null,null)nulldivnull∇null(null,null)|∇null(null,null)|null (89) From (84), (85), and (89), we combine (81) and (82) as follows: nullnullnullnullnullnullnullnullnullnull+nullnullnullnullnullnullnullnullnullnull= null(null,null)nullnullnullnull(null,null)null|∇null(null,null)| +nullnullnull(null,null)null∇null(null,null)∙ ∇null(null,null)|∇null(null,null)|+null(null,null)nullnullnull(null,null)nulldivnull∇null(null,null)|∇null(null,null)|null (90) We substitute (72) and (90) into the Euler-Lagrange (66) to obtain the final evolution equation that corresponds to the spatially adaptive ACWE function (64) as follows: nullnullnullnull= nullnullnull(null,null)null∇null(null,null)∙∇null(null,null)|∇null(null,null)|+null(null,null)nullnullnull(null,null)nulldivnull∇null(null,null)|∇null(null,null)|null−nullnullnull(null,null)null[|null(null,null)−nullnull|null+|null(null,null)−nullnull|null] (91) When we compare this new form of the evolution equation to the original form in (15) from Section 1.2.3.3, we see that we obtain the new term nullnullnull(null,null)null∇null(null,null)∙ ∇null(null,null)|∇null(null,null)|. We optimize the spatially adaptive ACWE energy functional by first using finite differences to discretize the system, then initializing null(null,null) to a user-entered contour in the vicinity of the object to segment, and then iteratively adding the image evolution term of (91) to null(null,null) until null(null,null) is stationary. 85 3.7.4 Contextual Mumford-Shah Framework In order to compare our method against the closest existing adaptive regularization technique, we incorporated our weights into the Mumford-Shah based segmentation framework of Erdem and Tari [38] as described in Section 1.5.1 where contextual regularization is used. We note that the regularization weights of the Erdem and Tari (ET) method is not general and can only be used and compared against within this framework. We incorporated our data cues into the ET method by using the negative feedback method (described in Section 1.5.1 and reprinted here for clarity) nullnull = nullnull +(1−null) (92) where null represents the combined cues used in the ET method. We replace null with our curvature-modulated reliability term (see Section 3.6.1) as follows: nullnull = nullnullnullnull(null,null)nullnull – nullnullnullnull(null,null)nullnull+null1−nullnullnullnull(null,null)nullnullnullnullnullnullnull(null,null)null null (93) which simplifies to nullnull = nullnullnullnull(null,null)nullnull–nullnullnullnull(null,null)null(null−1). (94) In addition, we also tested incorporating the texture edges term of [38] such that nullnull = nullnull,nullnullnull(null,null)nullnullnullnullnullnullnull(null,null)null(null−1). (95) where nullnull,null(null,null) is described in Section 3.6.2. The nullnull term modifies the evolution equations of the functional (as described in Section 1.5.1) which are then used to 86 iteratively update the image process and edge process to form the segmentation (as described in Section 1.2.3.2). 3.8 Validation and Results We demonstrate the performance of the data-driven regularization weights on the synthetic, medical, and natural datasets that we first introduced in Section 2.3 with the globally optimal weight. In addition to the least-error spatially fixed weight, which we will compare our results against for all methods, we will compare our method to the globally optimal weight for the minimum-path segmentation framework, and to the ET method for the contextual Mumford-Shah segmentation framework. 3.8.1 Performance Criteria We tested each segmentation method of Section 3.7 with databases suitable for the method. For example, natural images with object comprising of a wide intensity range cannot be used for segmentation methods with regions-based external terms since these methods assume that the object of interest is approximately piecewise constant. However, these non-piecewise constant objects can be segmented by methods with boundary external terms, such as the minimum-path framework of Section 3.7.1. In our tests using the minimum-path segmentation framework, we first analyze results on the sinusoidal synthetic dataset described in Section 2.3.2 that was used for the globally optimal weight validation. We measure our error for this dataset using the same nullnull error as we did for the globally-optimal weight validation (see (35) in Section 2.3.1). We also tested the minimum-path segmentation framework on medical and natural 87 datasets where we used the DSC error metric from (36) in Section 2.3.1. For the graph cuts framework, we first analyze results on a synthetic dataset testing detection of objects with decreasing signal-to-noise ratio (SNR). Since the process involves matching null labels between the ground truth (original noise-free shape) and the segmentation, we determine the error by using the multi-labelling Hungarian method [61] to solve the label assignment problem, and then using the DSC measure of (36) in Section 2.3.1. The remainder of the testing on the medical and natural datasets using all four of the aforementioned method used the DSC measure. We compared our method against existing techniques suitable for each segmentation framework. For all frameworks, we compared our data-driven weight against the least-error spatially fixed weight nullnullnullnullnullnull as described in Section 2.3.1. For the minimum-path framework, we additionally compared our method against the globally optimal weight as discussed in Chapter 2. For the contextual Mumford-Shah framework, we additionally compared our method against the ET regularization cues. Within our method, we also compared the weights produced by only the noise-gated edge cue to the weights produced by the curvature-and-noise modulated edge cue, and by addition of the texture cue into our framework. For each set of results, we determined 25 segmentations for each image and averaged to produce the quantitative results. In addition, we performed ANOVA to determine if the data-driven weights produced significantly more accurate results than nullnullnullnullnullnull, nullnullnull, and the ET method. 88 3.8.2 Parameter Selection and Implementation Details For the segmentation approaches, we used a graph cuts wrapper from [7], an implementation of ACWE from [102], and an implementation of the contextual MS method from [38], all of which were modified as proposed in Section 3.7. The minimum-path segmentation framework determines the optimal contour between seedpoints, which we entered as equidistant points determined automatically around the object boundary as determined by the ground truth segmentation, and where each seedpoint consists of the pixel coordinates (null,null). For the graph cuts framework, we selected a low number of random seeds (0.3% of image pixels for each label) automatically by using the ground truth. For the ACWE and contextual MS frameworks, we used an initial contour of a null/4×null/4 square placed in the center of the object of interest, where null×null is the image size. For all frameworks, we used the same initializations/seedpoints for the comparison methods. For our data-driven weights, we set the parameters nullnull=0.8 and nullnull=0.4. 3.8.3 Computational Performance In our tests, we used un-optimized MATLAB code on a PC with 3.6 GHz Intel Core Duo processor and 2GB of RAM. Computationally, the proposed method required less than 3 minutes to calculate the regularization weights for a 768 × 576 image. The most computationally intensive aspect of our work in the noise measure as determined through spectral flatness (see Section 3.2) because the fast Fourier transform (FFT) must be determined for each local window region in the image. 89 3.8.4 Synthetic Data Validation 3.8.4.1 Noise limitation tests We first analyzed the robustness of our data-driven weights against increasing levels of noise (AWGN) to determine the limit at which the weights are no longer meaningful. Figure 26(a), (b), and (c) show synthetic images corrupted by increasing levels of AWGN. The graph cuts adaptive weight segmentation for the images corrupted by noise levels of 0.05 and 1.05 std. dev. (Figure 26(d) and (e), respectively) adheres to the corners of the object and does not leak outside of the object, unlike the fixed weight segmentation in red. At an extremely high noise level of 1.90 std. dev. shown in Figure 26(c), the resulting adaptive weight segmentation Figure 26(f) begins to show holes and degradation. Analysis of the DSC of adaptive weight graph cuts segmentations for the synthetic image of Figure 27 over various noise levels (25 noise realizations for each noise level) found that segmentation accuracy begins to drop at noise levels greater than 1.75 std. dev. Figurestanda(e), (f)fixed w(c), thFigureimage(a) (d) 26: Segmenrd deviation. Correspondeight red whe segmentatio 27: Segmene of Figure 26(DSCtation of gre (a), (b), (c) Oing segmentaere yellow regn (f) begins totation accuraa). Accuracy 00.50.550.60.650.70.750.80.850.90.951 ey object in sriginal imagetions from thions are whe form holes acy (DSC) as begins to drop0.5Std90 (b) (e) synthetic images with std. dee proposed are the segmennd inaccuraciincreasing le at noise leve1 1.5. dev. of noisee corrupted v. of 0.05, 1.0daptive weightations overlaes. vels of noisels greater tha2(c(f)by AWGN w5, and 1.90, reht green and p. At the hig are added ton 1.75 std. dev2.5) ith increasinspectively. (dthe least-erroh noise level o the synthetiation. ng ), r f ic 3.8.4. sameand ablurriglobashowSectiFigurenoise on theobtainweighhigh n2 MinimumWe tested sinusoidal sppearance ng with splly-optimal n in Figure on 3.6.1): 28: Synthetiand blurring left and jaged from: (blht. The data-doise regions a-Path Synt the performynthetic datvariations watially varyweight segm28(a) and (bc dataset test(increasing frgged on the rue) data-drivriven weightnd accuratelyhetic Testsance of theaset of Sectith AWGNing kernel entation an) for the twing with miniom right to light). (b) Imen weights, (s produce th captures hig91 minimum-pion 2.3.2 tha of spatiasizes. Wed the leasto synthetic(a) (b) mum-path apeft) and withage with highred) best fixee only segmeh curvature rath segment is designelly varying compared -error fixed images of Fproach. (a) I changing boher curvatured weight, anntation that egions of the tation framed to cover e variance aour resultsweight segmigure 25(a)mage with spundary smoo and noise led (cyan) gloaccurately reboundary. work for thxtreme shapnd Gaussia against thentation, a and (b) (seatially varyinthness (smootvels. Contourbally optimumegularization ie e n e s e ng h s n 92 In Figure 28(a), the least-error fixed weight is low (0.2) in order to segment the high curvature region of the image, and thus is not able to accurately segment the high noise region of the object. The data-driven weight produces an accurate segmentation by lowering regularization in the high curvature and reliable edge region of the image, and increasing regularization is the unreliable high noise regions. The curvature-modulated reliability measure (see Figure 25(g) and (h) in Section 3.6.1) confirms this behaviour. We quantitatively examined our method's performance using ANOVA testing on 25 noise realizations of each image in the dataset, where the error was determined by nullnull. Our method resulted in a mean error (in pixels) of 6.33 ± 1.36, whereas the best fixed-weight method had a mean error of 12.05 ± 1.61, and the globally-optimum weight method had a mean error of 33.06 ± 3.66. Furthermore, for each image, we found our method to be significantly more accurate with all p-values << 0.05. Our error results are presented in Table 2. 93 Table 2: Average error over 25 noise realizations per image produced by data-driven weights, least-error fixed weights, and globally optimal weights for synthetic set of data. Segmentations from the data-driven weights have errors less than alternate methods with p-values from ANOVA testing ≪null.nullnull. Image # Mean nullnull error over 25 segmentations Fixed weights Data-driven weights G.O. weights 1 13.02 7.2 17.68 2 15.04 7.08 45.243 8.92 6.04 17.4 4 13.8 8.64 38.65 6.8 3.64 7.16 6 12.96 5.92 42.27 14.4 7.4 42.4 8 13.28 7.56 40.929 11.8 3.12 35.72 10 13.04 4.76 47.6811 14.84 8.16 40.04 12 6.64 3.32 7.9213 10.62 5.4 27.24 14 12.74 7.48 4115 11.92 6.32 44.2 16 13.02 9.28 33.64 3.8.4.3 Graph Cuts Synthetic Tests We validated graph cuts with our proposed method on simulated noisy images of variably-sized ellipses with complicated background patterns, e.g. with image contrast decreasing from right to left, as in Figure 29(a). The leftmost ellipses with lower contrast have a lower SNR than the rightmost ellipses and thus require greater regularization. Note how our resulting reliability measure Figure 29(b) indicates lower image reliability for low contrast ellipses. When comparing our segmentation results to those from the spatially-fixed weight, as shown in Figure 29(c), 6 ellipses out of 14 were mislabelled, whereas GC with adaptive regularization correctly labelled 12 ellipses (Figure 29(d)). To quantify the advantage of our approach, we tested a synthetic dataset of images containing 2 to 40 ellipses at various noise levels. We calculated the DSC of the 94 segmentation to the ground truth for each ellipse and averaged over all the ellipses in the image. Figure 30 plots the difference in average DSC between adaptive regularization GC and standard GC for images of increasing ellipse numbers. The same images were also tested at various noise levels. Note that a positive difference in the DSC indicates that our proposed regularization method with GC had greater success detecting low contrast ellipses. (a) (b) (c) (d) Figure 29: Segmentation of a synthetic image using GC with adaptive regularization. (a) Synthetic image of 14 ellipses with image contrast increasing from left to right. (b) Reliability calculated by our data-driven weights. (c) Segmentation from fixed weights, where each color represents a separate label. (d) Segmentation from data-driven weights, which result in greater numbers of low-contrast ellipses successfully segmented. 95 Figure 30: Difference in average DSC between adaptive regularization GC and fixed regularization GC for images with increasing numbers of ellipses. Different curves represent different noise standard deviations as shown in legend. Positive DSC difference indicates segmentations from data-driven weights are more successful than fixed weights at labelling ellipses with low image quality. 3.8.5 Medical Data Validation We performed tests on the following medical datasets: set of 52 sagittal slices centered around the CC structure, 8 mammography images from the Digital Database for Screening Mammography (DDSM) [51, 50], 15 images from the BrainWeb database [63, 62, 31] using the T1, T2, and PD modalities with varying amounts of noise and intensity inhomogeneity, and coronal and transverse slices from the 18-subject Internet Brain Segmentation Repository (IBSR) database (provided by the Center for Morphometric Analysis at Massachusetts General Hospital and available at http://www.cma.mgh.harvard.edu/ibsr/). 3.8.5.1 Minimum-Path Sagittal MR images of the CC exhibit the known problem of a weak boundary where the CC meets the fornix (as discussed and labelled in Figure 13(a) of Section 2.3.3). Unlike the globally optimal weights which produced bimodal behaviour (either 0 5 10 15 20 25 30 35 40-0.200.20.40.60.811.2Number of distinct objects in imageDifference in DSC 0.10.150.20.250.30.4blue (stronas seeFigureimagecorresregulaContoglobal on impresewas awere 34 shnoiseperforesultor red in Fger red in Fn in the seg 31: Segmene. The coloringponding to nullrization in turs producedly-optimum wThe quanages from tnted Figure veraged ovefor the bounows how th levels for rmed the ses for segmeigure 13(b)igure 31(a)mentation re(a) tation from ( of the conto=null and phe difficult f by using theight (cyan). titative resulhe CC datas32, Figure 3r 25 segmedary of the e segmentatthe BrainWgmentationsnting cancer), our meth) at the CCsults (Figura) proposed durs reflects thure red to nullornix regione proposed adts of the seget, IBSR da3, Figure 3ntations. Thcortical whiions from teb database for 25 nois tissue in m96 od automa-fornix boune 31(b)). ata-driven we value of the=null. The and has a aptive weighmentationstaset, Brain4, and Figure segmentate matter inhe data-driv using T1,e realizationammographtically boosdary produeight method spatially-adadata-driven smooth tranht (blue), best using the mWeb datasee 35, wheretions of the transverse aen weights T2, and Ps per noise y images frots up the rcing a bette (b) for a corpusptive weight weights resusition betwee fixed-weightinimum-patt, and DDSM the DSC foIBSR datasend coronal do not degrD modalitielevel. Figurm the DDSMegularizatior delineation callosum Mwith pure blults in greaten weights. (b (red), and thh framewor dataset arr each imagt (Figure 33slices. Figurade at highes, where we 35 present dataset. n , R e r ) e k e e ) e r e s 97 Figure 32: CC 52-image dataset DSC results with minimum-path segmentation method. Graph shows average DSC over dataset for segmentations from DD = data-driven weights, F = least-error fixed weights, and GO = globally optimal weights. DSC for each image was averaged from 25 segmentations with different seeds. DD average DSC (over total dataset) was 0.9224, F average DSC was 0.8984, GO average DSC was 0.8657. Average p-values for DD vs. F and for DD vs. GO were << 0.05. Segmentations from the DD weights were significantly more accurate. (a) (b) Figure 33: DSC results for IBSR dataset of (a) coronal and (b) transverse MR slices from 18 subjects when segmenting for white matter using minimum-path approach. Graph shows mean and variance for segmentations produced by data-driven weights (DD), least-error fixed weight (F), and the globally optimal weights (GO). The DD average DSC (over 18 image dataset) was 0.9323 for the coronal dataset and 0.9278 for the transverse dataset. The F average DSC was 0.9036 and 0.9094 for the coronal and transverse datasets, respectively, and the GO average DSC was 0.8806 and 0.8886 for the coronal and transverse. All p-values were ≪null.nullnull. Although the difference between the segmentations from the DD weights and the fixed weights were low, the DD weight segmentations were still significantly more accurate. 0.650.70.750.80.850.90.95123Dice SimilarityDD F GO0.70.750.80.850.90.95123Dice SimilarityDD F GO0.750.80.850.90.95123Dice SimilarityDD F GO98 (a) (b) (c) Figure 34: DSC results for slices from coronal BrainWeb dataset when segmenting for white matter using minimum-path approach. (a) T1 data, (b) T2 data, and (c) PD data. Data-driven weights (DD) produce segmentation accuracies that decrease less than fixed weights (F) and globally optimum weight (GO) as the level of noise in the image increases from 0% to 9% (averaged over 25 noise realizations per noise level). Figure 35: DSC results for segmentation of cancer tissue in 8 mammography images from DDSM database using minimum-path approach. The data-driven weights (DD) resulted in a mean error of 0.8958, the fixed-weight (F) in a mean error of 0.7961 and the globally-optimal weights (GO) in 0.7370 with all p-values ≪null.nullnull. Segmentations from the DD weights show a clear improvement over segmentations from F and GO weights. 3.8.5.2 Graph Cuts We next present qualitative results using graph cuts with our data-driven regularization weights on images from BrainWeb. Figure 36(a) shows a T1 image with an intensity inhomogeneity of 20%. High curvature-modulated reliability in the cortical folds (Figure 36(b)) results in lower regularization in these regions. The overlayed GC segmentations (Figure 36(c)) using the adaptive regularization weight versus the least-0% 3% 5% 7% 9%0.50.60.70.80.91Noise LevelDice SimilarityT1 BrainWeb Segmentation Accuracy DDFGO0% 3% 5% 7% 9%0.50.60.70.80.91Noise LevelDice SimilarityT2 BrainWeb Segmentation Accuracy DDFGO0% 3% 5% 7% 9%0.50.60.70.80.91Noise LevelDice SimilarityPD BrainWeb Segmentation Accuracy DDGOFDD F GO0.70.750.80.850.9Dice SimilarityDDSM Segmentation Error99 error fixed weight shows greater segmentation accuracy in high curvature regions. Additionally, the curvature-modulated regularization weights show improvements over the non-curvature weights of Section 3.4 (Figure 36(d)). Figure 37(a) shows the same T1 image of Figure 36(a) but with a noise level of 7%. The resulting curvature-modulated reliability map (Figure 37(b)) is not corrupted by the noise and still enforces greater regularization in high curvature cortical folds, as seen in the resultant segmentation comparisons of Figure 37(c) and Figure 37(d). At higher noise levels, our data-driven weights results in a more accurate segmentation than the standard least-error uniform weight, and even more accurate than our non-curvature image reliability system, thus verifying the importance of the noise-gated curvature cue. 100 (a) (b) (c) (d) Figure 36: Segmentation of MR data from BrainWeb using GC with curvature-modulated regularization. (a) T1 slice with 20% intensity non-uniformity. (b) Curvature-modulated reliability. Black intensities corresponds to 0 (low reliability/high regularization) and white to 1. Note higher reliability in cortical folds. (c) Comparison of segmentations from the curvature-modulated weight (green) to the least-error fixed weight (red), and (d) to the non-curvature modulated image reliability weight (blue). Yellow regions are where the segmentations overlap, and ground truth contour is shown in black. Segmentations from the curvature-modulated DD weights result in no leakage into the background and accurately capture the high-curvature cortical folds. 101 (a) (b) (c) (d) Figure 37: Segmentation of noisy MR data from BrainWeb using GC with curvature-modulated regularization. (a) T1 slice with 7% noise level. (b) Curvature-modulated reliability. (c) Comparison of segmentations from the curvature-modulated adaptive weight (green) to the least-error fixed weight (red), and (d) to the non-curvature modulated image reliability weight (blue). We present quantitative results for our graph cuts tests with the BrainWeb, DDSM, and IBSR datasets in Figure 38, Figure 39, and Figure 40. On average, these results are lower than the results from the minimum-path segmentation approach because the lack of seedpoints reduces the ability to target the object of interest for each image. For the BrainWeb dataset (Figure 38), as the level of noise increases (where we performed the segmentations for 25 noise realizations per noise level), the data-driven weights produce more accurate segmentations than the least-error fixed weights. For the 102 DDSM dataset (Figure 39), both methods produce segmentations with low DSC which is due the complex background in the mammography images, where the non-cancerous tissue has similar intensity values to the cancerous tissue. Unlike minimum-path approaches where we can target the tissue object of interest with seedpoints, the seeding used for graph cuts is only for intensity profile estimation. The result is that many objects in the background (breast tissue) are mistaken as cancerous tissue due to similarities in intensity values. For the IBSR segmentations (Figure 40), the data-driven weights produced significantly more accurate segmentations for the coronal and transverse slices (when segmenting for white matter). When segmenting for the CC structure in the sagittal slices (Figure 40(c)), the data-driven weights were not significantly more accurate due to similarities in intensity between the CC structure and the fornix. (a) (b) (c) Figure 38: DSC results for BrainWeb segmentation of white matter using graph cuts approach for (a) T1, (b) T2, and (c) PD modalities with increasing noise levels. Data-driven weights (DD) produce less degradation in results at high noise levels when compared to least-error fixed weights (F). DSC values are averaged over segmentations for 25 noise realizations per noise level. As the noise level increases, the segmentations from the DD weights result in greater DSC than the fixed weight segmentations. 0% 3% 5% 7%0.750.80.850.90.95Dice SimilarityNoise LevelT1 BrainWeb Segmentation DDF0% 3% 5% 7% 9%0.70.750.80.850.90.95Dice SimilarityNoise LevelT2 BrainWeb Segmentation DDF0% 3% 5% 7% 9%0.70.80.91Dice SimilarityNoise LevelPD BrainWeb Segmentation DDF103 Figure 39: DSC results for segmentation of cancer tissue in 8 mammography images from DDSM database using the graph cuts approach. As these images represent difficult scenarios due to noise, weak edges, and non-piecewise constant objects, both methods for regularization weights produce poorer results. However, the data-driven weights (DD) produce significantly improved results over the fixed weights (F) segmentation. DD average DSC = 0.3576, F average DSC = 0.2909. (a) (b) (c) Figure 40: DSC results for IBSR dataset of (a) coronal, (b) transverse and (c) sagittal MR slices from 18 subjects when segmenting for white matter (for (a) and (b)) and CC (for (c)) using graph cuts. The DD average DSC (over 18 image dataset) was 0.8999 for the coronal dataset, 0.8199 for the transverse dataset, and 0.7149 for the sagittal dataset. The F average DSC was 0.8627, 0.7462, and 0.6611 for the coronal, transverse, and sagittal datasets, respectively. Average p-values were 0.0118, 9.89E-9, 0.052. For the CC segmentations in the sagittal plane, the proposed method did not produce significantly improved results. This is since the fornix and CC have the same intensity and region-based external terms are not successful in differentiating the structures. 3.8.5.3 Active Contours Without Edges We next present results using the ACWE segmentation framework on medical datasets. Figure 41(a) shows a PD coronal slice with 5% noise from BrainWeb. The resulting curvature-modulated reliability (Figure 41(b)) is bright for the high curvature tips of the central ventricles and for the cortical folds of the white matter. The resulting segmentation (Figure 41(c)) from the data-driven weights (green) for the white matter is DD F00.20.40.60.81Dice SimilarityDDSM Segmentation by Graph CutsDD F0.50.60.70.80.91Dice SimilarityDD F0.50.60.70.80.91Dice SimilarityDD F0.50.60.70.80.91Dice Similarity104 able to capture the cortical folds while not oversegmenting into the ventricle region, unlike the least-error fixed weight result (red). (a) (b) (c) Figure 41: Segmentation of white matter in PD coronal slice from BrainWeb with 5% noise. (a) Original image, (b) curvature-modulated reliability, and (c) comparison of segmentations from data-driven weight (green), least-error fixed weight (red), where overlapping regions are in yellow and ground truth contour shown in black. Data-driven segmentation results in less over-segmentation into ventricle region and better segmentation of high curvature tips. In addition, we present segmentations of cancer tissue from mammography images (DDSM database) such as in Figure 42, where the ACWE contour was initialized as a square shown in Figure 42(a) and iterations were run until the contour evolution converged (at most 700 iterations). The segmentation from the data-driven weights more accurately grows into the left region of the tissue. Neither segmentation accurately contains the right region of the tissue which is due to the cancer object containing differing intensities. Figureimagesegmeoverlacapturlow-in Figurdatabdifferweightransvresultsimil (a) 42: ACWE se with initial ntations fromp and groune the white-itensity region We presee 43, Figurease were loent intensityts provide erse datases for the IBarities in integmentation ACWE cont data-drivend truth is shntensity regio of the tissue nt quantitati 44, and Figw for both values in tsignificantlts (Figure SR sagittalensity betweof mammogrour, (b) cur weights (greown in blacn of the cancwhich represeve results fure 45. As w the fixed he cancer my more acc45) when dataset when the CC a105 (b) aphy image frvature-modulen) and fixek. Segmentater tissue. Neints a difficulor the Brainith graph cweight andasses as shourate segmesegmenting en segmentnd fornix. om DDSM daated reliabild-weights (reion from DDther segmentt case. Web, DDSMuts, the ACW data-drivenwn in Figuntations fofor white ing for the (c)tabase [50, 5ity, and (c) d) where yel weights moation correct, and IBSE results o weight dure 42(a). Thr the IBSRmatter, but CC structur 1]. (a) Origincomparison olow represenre successfullly captures thR datasets in the DDSMe to greatle data-drive coronal aninsignificane due to thal f ts ly e n y n d t e 106 (a) (b) (c) Figure 43: DSC results for BrainWeb segmentation of white matter using ACWE approach for (a) T1, (b) T2, and (c) PD modalities with increasing noise levels. Data-driven weights (DD) produce less degradation in results at high noise levels when compared to least-error fixed weights (F). DSC values averaged over segmentations from 25 noise realizations per noise level. Segmentations from the DD weights are more accurate than fixed weight segmentations at high noise levels. Figure 44: DSC results for segmentation of cancer tissue in 8 mammography images from DDSM database using the ACWE approach. The data-driven weights (DD) produce significantly improved results over the fixed weights (F) segmentation. DD average DSC = 0.3565, F average DSC = 0.2833. Average p-value was 6E-4. Although the DD weight segmentations are more accurate, both methods produce low accuracies since neither method could segment regions of the cancer tissue with greatly differing intensity values. 0% 3% 5% 7% 9%0.50.60.70.80.91Dice SimilarityT1 segmentation DDF0% 3% 5% 7% 9%0.50.60.70.80.91Dice SimilarityT2 segmentation DDF0% 3% 5% 7% 9%0.50.60.70.80.91Dice SimilarityPD segmentation DDFDD F00.10.20.30.40.5Dice SimilarityDDSM segmentation107 (a) (b) (c) Figure 45: DSC results for IBSR dataset of (a) coronal, (b) transverse and (c) sagittal MR slices from 18 subjects when segmenting for white matter (for (a) and (b)) and CC (for (c)) using ACWE. The DD average DSC (over 18 image dataset) was 0.9019 for the coronal dataset, 0.7895 for the transverse dataset, and 0.6345 for the sagittal dataset. The F average DSC was 0.8466, 0.7196, and 0.6196 for the coronal, transverse, and sagittal datasets, respectively. Average p-values were 3.303E-7, 3E-4, 0.482. The segmentation from the DD weights was significantly more accurate for the coronal and transverse datasets, but was not significantly more accurate for the sagittal dataset when segmenting for the CC structure. This is due to the intensity similarities between the CC and the fornix structures which produces problems for region-based external terms. 3.8.5.4 Contextual Mumford-Shah Method We validated the contextual Mumford-Shah method by comparing results to both the least-error fixed weight segmentation and to the segmentation produced the ET adaptive regularization method. The contextual MS method is automated and does not allow for user input to focus the segmentation on certain objects. As such, we chose to demonstrate segmentations of the central ventricle structure using the BrainWeb and IBSR databases rather than segmentations of the white matter which is a complex structure encompassing the entire image. We targeted the ventricles by cropping the image around the structure. Figure 46(a) and Figure 47(a) show the central ventricle region taken from coronal slices from the BrainWeb dataset, where Figure 46(a) depicts a PD image with a noise level of 5% and Figure 47(a) depicts a T1 image with a noise level of 9%. For both cases, the curvature-modulated reliability (Figure 46(b) and Figure 47(b)) is higher for the tips of the ventricle structure, and the resulting segmentations DD F0.50.60.70.80.91Dice SimilarityDD F0.50.60.70.80.91Dice SimilarityDD F0.50.60.70.80.91Dice Similarity(FigumethoFigure(a) Psegmedrivenoverlaregula re 46(c) andds due to lo 46: SegmenD image wintations betw weights andp and black rization in th (d), and Fiwer regular(a) (c) tation of centth 5% noiseeen data-driv Erdem-Taricontour repre tips of the cgure 47(c) aization in th ral ventricle , and (b) cen weights (ET) weightesents the gentral ventric108 nd (d)) are e tip regionsstructure frourvature-mo(green) and fis (blue) wherround truth. le structure amore accur. m mid-volumdulated reliaxed weights (e yellow regiThe data-drnd the weak eate against b(b) (d) e coronal slicbility. (c) Cred), and (d)ions represeniven weights dge sides. oth alternat es (BrainWebomparison o between datat segmentatioprovide lowee ) f -n r FigureCompbetwesegmeprovidWe pdatassignifdo no 47: (a) T1 arison of segen data-driventation overle a segmentaresent the ets in Figuicantly mort degrade ov(a) (c) image with 9mentations ben weights anap and black tion that captquantitativere 48, Figue accurate ser increased % noise (Brtween data-dd the Erdemcontour reprures the high results fore 49, and egmentation levels of n109 ainWeb), andriven weight-Tari weightesents the grocurvature tipr the BrainFigure 50.s for the coroise for the (b) curvatus (green) and s (blue) wheund truth. Os. Web, DDS The dataonal IBSR vBrainWeb s(b) (d) re-modulatedfixed weightsre yellow regnly the data-M, and ISB-driven weientricle appegmentation reliability. ( (red), and (dions represendriven weighR (coronaghts providlication, ans. c) ) t ts l) e d 110 (a) (b) (c) Figure 48: DSC results for BrainWeb segmentation of central ventricle structure using contextual MS approach for (a) T1, (b) T2, and (c) PD modalities with increasing noise levels. Data-driven weights (DD) produce less degradation in results at high noise levels when compared to least-error fixed weights (F) and Erdem-Tari weights (ET). DSC values are averaged over segmentations from 25 noise realizations per noise level. Figure 49: DSC results for segmentation of cancer tissue in 8 mammography images from DDSM database using the contextual MS approach. The data-driven weights (DD) produce significantly improved results over the fixed weights (F) segmentation. DD average DSC = 0.3092, F average DSC = 0.2433, ET average DSC = 0.2863. All results are poor due to the non-piecewise constant cancer object. 0% 3% 5% 7% 9%0.50.60.70.80.91Dice SimilarityT1 segmentation DDFET0% 3% 5% 7% 9%0.50.60.70.80.91Dice SimilarityT2 segmentation DDFET0% 3% 5% 7% 9%0.50.60.70.80.91Dice SimilarityPD segmentation DDFETDD F ET00.10.20.30.40.5Dice SimilarityDDSM segmentation111 Figure 50: DSC results for IBSR dataset of coronal MR slices from 18 subjects when segmenting for the central ventricle structure using contextual MS approach. Graph shows mean and variance for segmentations produced by data-driven weights (DD), least-error fixed weight (F), and the Erdem-Tari weights (ET). The DD average DSC (over 18 image dataset) was 0.9783, the F average DSC was 0.9509, and the ET average DSC was 0.9617. All averaged p-values were ≪null.nullnull. Segmentation from the DD weights are significantly more accurate than the segmentations from the F and ET weights. 3.8.6 Natural Scenes Data Validation We next present results from using the following datasets of natural scenes: the McGill Calibrated Colour Image Database [75], the PASCAL object recognition database [40], and through the ImageNet database [32, 33]. 3.8.6.1 Minimum-Path We first demonstrate results on an image from the McGill database, the tree leaf on a complicated background shown in Figure 51(a). The resulting curvature-modulated reliability measure (Figure 51(b)) indicates higher regularization at the regions of the leaf obscured by snow and lower regularization in the leaf tip regions. The resulting segmentation from the data-driven segments the obscured shape of the leaf accurately (Figure 51(c)). DD F ET0.750.80.850.90.951Dice SimilarityIBSR Ventricle segmentation112 (a) (c) (b) Figure 51: (a) Original leaf image (McGill dataset [75]). (b) Reliability calculated by our proposed method. Contours produced by using (c) data-driven weights (blue), least-error fixed-weight (red), and globally optimal weight (cyan). The GO contour fails to capture the leaf tip region, and the fixed weight contour fails to regularize in the obscured edge region, as highlighted. Additionally, we segmented the airplane image of Figure 52(a) (from the PASCAL database) which consists of high curvature structures that require lower regularization, and weak-edged regions that require high regularization. The curvature-modulated reliability measure (Figure 52(b)) shows that the plane tips in the image are assigned low regularization weights (high reliability). The resulting segmentation using the data-driven weights correctly captures these regions (Figure 52(c)) whereas the least-error fixed weight, 0.5 in this case, correctly regularizes the weak edge regions but is too excessive to capture the airplane tips. The globally optimum weight, which was 113 predominately zero for this segmentation, correctly segments the tips but inaccurately includes the weak edge regions. (a) (b) (c) Figure 52: Airplane image (PASCAL dataset [40]) segmented by minimum-path approach. (a) Original image, (b) curvature-modulated reliability, and (c) contours from data-driven weight (blue), least-error fixed weight (red) and globally optimal weight (cyan). Segmentation from the GO weights inadequately regularizes in the weak edge upper region of the plane. Segmentation from the fixed weight fails to capture the wing tips. We also present the contours produced by the data-driven weights for the examples shown in Section 2.3.4 of Chapter 2. Figure 53 shows segmentations of a complex-boundary flower and a submerged leaf. In Figure 53(a), only the data-driven weight in blue prevents oversegmenting of the image and background leakage. In Figure 53(b), the data-driven weight segmentation accurately captures the weak-edge leaf tip region. 114 (a) (b) Figure 53: Segmentations from McGill database images of (a) a flower with a complex background and (b) a leaf with regions of the boundary obscured by water. The segmentation from the data-driven weights (blue) provide greater regularization in weak edge regions and lower regularization in high curvature regions, unlike the segmentations from the globally-optimal weight (cyan) and the spatially fixed weight (red). We present the quantitative results on the McGill, PASCAL, and ImageNet databases in Figure 54. The ANOVA results indicate the significant improvement the data-driven weights provide when compared to existing methods. We note that these databases do not contain texture images, which we will address in our graph cuts and contextual MS tests. 115 (a) (b) (c) Figure 54: DSC of segmentations from data-driven weights (DD), least-error fixed weights (F) and globally-optimal weights (GO) on (a) 8 images from ImageNet database, (b) 11 images from PASCAL database, and (c) 24 images from McGill database. Average error over dataset (and over 25 segmentations for each image) was DD = 0.9489, F = 0.9188, GO = 0.8629 for (a), DD = 0.9455, F = 0.9241, GO =0.8783 for (b), and DD = 0.9619, F = 0.9402, GO = 0.9025 for (c). All p-values ≪null.nullnull. For all datasets, the segmentations from the DD weights produce significantly higher DSC than the segmentations from the GO weights and fixed weights. 3.8.6.2 Graph Cuts We present results of the graph cuts segmentation method with our proposed regularization framework on a flower image from the McGill database, as shown in (Figure 55(a)) where this image has been corrupted by AWGN with a standard deviation of 0.3. From this image, we produced the curvature-modulated reliability mapping in (Figure 55(b)). The higher curvature-modulated reliability in the petal tip regions allows for a more accurate segmentation when compared to the least-error fixed weight segmentation (Figure 55(c)). In addition, we investigate the key role of the curvature cue by comparing the curvature-modulated segmentation to the non-curvature reliability weight segmentation (Figure 55(d)) which, as expected, required higher regularization in the detailed petal tip regions, resulting in leakage into the background. DD F GO0.70.80.91Dice SimilarityImageNet DatabaseDD F GO0.70.80.91Dice SimilarityPASCAL DatabaseDD F GO0.70.80.91Dice SimilarityMcGill Database116 (a) (b) (c) (d) Figure 55: Graph cuts segmentation of flower image from ImageNet dataset [32]. (a) Original image with AWGN of standard deviation 0.3. (b) Curvature-modulated reliability (higher in petal tip and crevice regions) (c) Comparison of segmentations from the curvature-modulated reliability weight (green) to the least-error fixed weight (red), and (d) to the non-curvature reliability weight (blue) with overlapping regions in yellow. The curvature-modulated DD weights provided the best segmentation of the petal tips and had the least amount of leakage. We segmented another flower image from the McGill database with corruption by AWGN of standard deviation 0.3 (image values normalized to range between 0 and 1), as shown in Figure 56(a). The curvature-modulated reliability (Figure 56(b)) produces lower regularization weights in the petal tips and petal crevices. In Figure 56(c), the fixed-weight segmentation excessively regularizes in the petal region, resulting in leakage (shown in red). Our method does not leak into the background and is able to capture the petal tips (shown in green). 117 (a) (b) (c) (d) Figure 56: Graph cuts segmentation of natural image (a) Original image corrupted by AWGN with standard deviation of 0.3. (b) Curvature-modulated reliability. (c) Comparison of segmentations from the data-driven weight (green) to the least-error fixed weight (red) and to non-curvature reliability weight (blue) with overlapping regions in yellow. In (c), high regularization in the background prevents the segmentation from the curvature-modulated DD weights from leaking, unlike the fixed-weight method in red. In (d), the non-curvature modulated weights (see overlapped region in yellow) fail to capture all petals. We demonstrate the ability of the texture-modulated weight nullnull(null,null) from (61) (see Section 3.6.2) to segment the textured image of Figure 57(a) (from ImageNet) where we set the parameter nullnull in (59) to 0.1. The curvature modulated reliability shown in Figure 57(b) is erroneously large for regions with texture. The curvature-and-texture modulated reliability shown in Figure 57(c) is lower for the texture edges. The resulting GC segmentation from using nullnull(null,null) is shown in Figure 57(d). Incorporation of a texture cue reduces leakage into the background, and the curvature cue reduces regulhow whencontomethoFigurethe cucalculCompweighthese weigh McGtrialsthe rearization in the curvatur in fact thesur. We are d. 57: Graph crvature-and-tated by our marison of seght (red) with ovregions fromhts in green capWe preseill databases. In additionsults, the dathe protrusioe measure ce regions shable to solv(a) (c) uts segmentaexture modulethod with nomentations frerlapping re inclusion intures plant tnt quantitati in Figure , we includeta-driven apn region ofan mistake ould be rege the issuestion of textuated weight. texture gatinom the propgions in yellowto the segmeips. ve results o58(a) and (bd results froproach is si118 the plant. Htexture regiularized to p by incorpored natural im(a) Original imng. (c) Curvatosed adaptiv. Reduced rntation resun non-textu) where them a set of tegnificantly owever, thions for strurevent inclrating existage (from Iage. (b) Curure-and-textue weight (greeliability in hlt. Only the red images error is thextured imagmore accuras example dcturally impusion into thing texture (b) (d) mageNet datavature-modulre modulateden) to the leigh-texture resegmentation from the Im DSC averes in Figurete when comoes highlighortant edgee segmentecues into ouset [32]) usinated reliabilit reliability. (dast-error fixeegions preven from the DageNet anaged over 2 58(c). Frompared to tht s d r ng ty ) d ts D d 5 e 119 fixed-weight approach for all datasets except for the textured dataset (see DD result in Figure 58(c)). However, the inclusion of the texture cue (DD-Text in Figure 58(c)) leads to significantly improved results. (a) (b) (c) Figure 58: Segmentation DSC from using data-driven weights (DD), least-error fixed weights (F), and data-driven weights with the texture cue (DD-Text) on natural images from (a) the ImageNet database (8 images), (b) McGill dataset (24 images), and (c) selected textured images taken from ImageNet database (10 images). Average DSC over dataset (and over 25 segmentations per image) are as follows: DD = 0.9651, F = 0.9066 for (a), DD = 0.9727, F = 0.9236 for (b), and DD = 0.8916, F = 0.8922, DD-Text = 0.9421 for (c). All p-values ≪null.nullnull. Addition of texture cue in (c) resulted in improved performance over standard DD weights and fixed weights. 3.8.6.3 Active Contours Without Edges We demonstrate the ACWE segmentation method with our regularization framework on the octopus image of Figure 59(a) (from the ImageNet database). Iterations were run until the contour evolution converged (1000 iterations). The low curvature-modulated reliability (Figure 59(b)) in regions outside the octopus prevents the resulting segmentation from including shadows in the background, unlike the fixed weight segmentation (Figure 59(c)) which leaked into the background of the image (see red region). DD F0.70.80.91Dice SimilarityImageNet DatabaseDD F0.70.80.91Dice SimilarityMcGill DatabaseDD-Text DD F0.70.80.91Dice SimilarityTexture Database120 (a) (b) (c) Figure 59: Active Contours segmentation of a natural image. (a) Original image. (b) Curvature-modulated reliability calculated by our method. (c) Comparison of segmentations from the data-driven weight (green) to the least-error fixed weight (red) where yellow regions are where segmentations overlap. Segmentation from fixed weight (red) leaks into the background, unlike segmentation from DD weights (see overlapped region in yellow). Another such example from the McGill database is an image of a flower containing shadow regions as shown in Figure 60(a). The shadow inner folds of the flower have a low reliability measure as shown in Figure 60(b) due to the weak edge and low curvature. The resulting fixed-weight segmentation in Figure 60(c) has inadequate regularization in the weak edge shadow regions, whereas the data-driven weights produce a more accurate segmentation that regularizes in the shadows but reduces regularization in the high curvature region where the flower is obscured. 121 (a) (b) (c) Figure 60: ACWE segmentation of flower. (a) Original image, (b) curvature-modulated reliability measure, and (c) comparison of segmentation from data-driven weights (blue) to the least-error fixed weight (red) where (yellow) regions represent segmentation overlap and the ground truth is shown as the black contour. Segmentation from fixed weight fails to capture shadow regions of flower, unlike segmentation from DD weights (see green region). We present the quantitative results for the ImageNet and McGill databases in Figure 61. As seen in Figure 61(b), the ACWE method with data-driven weights did not produce significantly more accurate segmentations than the fixed weights on the McGill database (p-value was 0.0514). This is likely because for certain images in the dataset, both methods performed poorly due to non-optimal placement of the initial contour. The ACWE segmentation method is sensitive to initial contour placement which is not related to our proposed work. 122 (a) (b) Figure 61: Segmentation DSC from using data-driven weights (DD) and least-error fixed weights (F) on natural images from (a) the ImageNet database (8 images) and (b) McGill dataset (24 images). Average DSC over dataset (and over 25 segmentations per image) are as follows: DD = 0. 9563, F = 0. 9144 for (a) and DD = 0. 9435, F = 0. 9044 for (b). Average p-value for (a) is 2E-4, but average p-value for (b) is 0.0514. Segmentations on the McGill database using the DD weights were not significantly more accurate due to non-optimal placement of the initial contour. 3.8.6.4 Mumford-Shah Method For the contextual Mumford-Shah method, we tested textured natural images, such as the image of an amoeba shown in Figure 62 (from the ImageNet database). The curvature-modulated reliability is falsely high for the inner textured region of the amoebas, resulting in a disconnected segmentation as shown in Figure 62(b) in green. However, by addition of the texture term from the ET framework, the texture-and-curvature modulated reliability is smooth for these sections of the image, and the resulting segmentation is more accurate than the fixed weight and ET weights (Figure 62(c) and (d)). Although the segmentation produced by the ET method correctly regularizes the textured region, it is unable to segment the protrusions of the amoeba which is due to the framework lacking a structural curvature cue. DD F0.80.850.90.951Dice SimilarityImageNet DatabaseDD F0.80.850.90.951Dice SimilarityMcGill Database123 (a) (b) (c) (d) Figure 62: Segmentation of amoeba image from ImageNet. (a) Original image. (b) Comparison of segmentation from non-texture data-driven weight (green) to segmentation from fixed weight (red). (c) Comparisons of segmentation from texture data-driven weight (green) to fixed weight (red), and (d) to segmentation from ET cues (blue). Segmentation from the DD weights without texture mistakes textured regions as separate objects, resulting in a fragmented segmentation (see green segmentation in (b)). Segmentation from the DD weights with texture correctly captured the full objects, including protrusions, unlike the fixed weights (see overlapped yellow in (c)) and the ET weights (see overlapped yellow in (d)). In addition, we segmented the texture image of a cheetah shown in Figure 63(a). The non-texture reliability in Figure 63(b) is falsely high in the spotted region of the cheetah. However, by incorporating the ET texture cue, the resulting texture-and-curvature modulated reliability is lower in the spotted region (Figure 63(c)). The resulting segmentation from the texture data-driven weight in green is more accurate than the segmentations from the fixed weight in red (Figure 63(d)) and the ET weight in blue (Figure 63(e)). Figuremoducue infixed withouundessegmeweigh the Dmethomoreaccur 63: Segmenlated reliabilito our frameweight (red),t texture inirable behavntation (greenhts (blue) whicThe quantSC betweend and the g accurate seate segment(b) (d) tation of chety. (c) Textuwork. (d) Com and (e) ET (b) inaccurior is remov) more accuh result in leaitative resul the binaryround truth gmentationsations for th etah image pre-and-curvatparison of sweight (blueately detectsed in (c) wrately capturekage. ts are prese mask creasegmentatio for the none textured 124 (a) resented inure modulateegmentations) where yello textured reith the addis the cheetahnted in Figuted from then. We find -textured daset (DD in F [38]. (a) Origd reliability results fromw regions reegions as higtion of the than the fixre 64 where edge-procthat the datataset (Figurigure 64(b)(c) (e) inal image. (from incorpo data-driven present overlh curvature texture cue. ed weights (re we measuress produce-driven weie 64(a)) but). However b) Curvaturrating texturweight (greenap. Reliabilitregions. ThThe resultind) and the Ee the error ad by the Mghts produc produce les, the additioe-e ), ty is ng T s S e s n 125 of the textured cue from the ET framework results in more accurate segmentation than the non-curvature-modulated ET framework (DD-Text in Figure 64(b)). (a) (b) Figure 64: Segmentation DSC from using data-driven weights (DD), least-error fixed weights (F), data-driven weights with the texture cue (DD-Text), and Erdem-Tari weights (ET) on natural images from (a) the ImageNet database (8 non-textured images), and (b) selected textured images taken from ImageNet database (10 images). Average DSC over dataset (and over 25 segmentations per image) are as follows: DD = 0.8311, F = 0.6482, and ET = 0.7719 for (a), and DD = 0.6107, DD-Text = 0.8681, F = 0.6004, and ET = 0.8323 for (b). Addition of texture cue in (b) resulted in improved performance over standard DD weights, fixed weights, and ET weights. All p-values of DD versus fixed and ET methods in (a) and DD-text versus fixed method and ET method in (b) were ≪null.nullnull. Segmentations from the DD weights on the non-textured ImageNet dataset were significantly more accurate than the fixed weight and ET weights. Addition of the texture cue resulted in significantly more accurate performance for the textured set in (b). DD F ET0.50.60.70.80.91Dice SimilarityImageNet segmentationDD DD-Text F ET0.50.60.70.80.91Dice SimilarityTexture Set segmentation126 Chapter 4 4 Conclusions In this chapter, we discuss conclusions for this thesis and how our work fits into the general field of image segmentation. We discuss the limitation of this method and present an overview of the areas for future research in this work. 4.1 Discussion In this thesis, we presented novel approaches for addressing a ubiquitous problem that plagues most energy minimization-based segmentation techniques; how to properly balance the weights of competing data fidelity and regularization energy terms. We focused on automated methods to determine a single regularization weight for convex energy functionals. We first presented a method for determining the globally optimized values of the object function parameters. During validation, we found this method to suffer from bimodal weights, poor response to image characteristics, and decreased 127 generality to other segmentation methods. We then proposed spatially adaptive weights that depended on contextual cues that gauge image reliability and structural evidence. This method employed a novel and robust local measure of signal reliability through estimating the spectral flatness of the image. In addition, our method used a local scale-invariant curvature cue for modulating regularization in conjunction with edge evidence, where both cues were made robust to noise through gating by the local signal reliability. We demonstrated the applicability of our contextual weights by incorporating the weights into a variety of continuous and discrete segmentation frameworks, including minimum-path, graph cuts, and two forms of the Mumford-Shah model: active contours without edges, and the AT approximation of the Mumford-Shah model with feedback regularization. We validated the contextual weights on a wide variety of datasets, the majority of which are publically available: functionally-derived images, DDSM mammography dataset, 52-subject sagittal slices of CC structure, 3-modality BrainWeb data with varying noise and intensity inhomogeneity, 18-subject IBSR dataset, images from the McGill Colour Calibration database, PASCAL object recognition database, and the ImageNet database. We compared the contextual weights against the least-error spatially fixed weight, the globally optimal weights, and the spatially adaptive weights proposed by Erdem and Tari [38]. In addition, we incorporated existing texture-based contextual cues into our method to show how it can be incorporated with current techniques in the field. 128 4.2 Limitations and Future Directions Although we have demonstrated the significant improvements in segmentation accuracy our contextual weights provide, there are limitations to our methods and areas for future research, which we list here: Energy Functionals with Multiple Weights In our work, we have focused on using a single regularization weight and adapting this weight accordingly. However, many energy minimization segmentation techniques use multiple weights, i.e. as follows: nullnullnullnullnullnull=nullnullnullnull+nullnullnullnull+nullnullnullnull+⋯ (96) Our work is limited to single weight energy formulations, which we form by grouping the terms into either nullnullnullnull or nullnullnullnull and using a single weight in a convex combination for graph cuts and minimum-path approaches, and non-convex for ACWE and MS. However, single weights are limited since many segmentation techniques give greater importance to certain energy terms nullnull through assigning that term a higher nullnull. Determining how to best set multiple regularization weights will greatly increase the generality and usability of this method. Additional Cues for Non-Spectrally Flat Noise Models Our method for noise estimation is limited to types of noise that are spectrally flat, i.e. white noise. However, many images may be corrupted by other noise models, particularly in medical imaging. For example, ultrasound images are often corrupted by 129 multiplicative noise (speckle noise) which our method is not able to estimate. A future area of research would be to devise new noise cues to handle various noise models and combine them with the existing noise cue. Expansion to 3D Our current method is limited to 2D images. However, in medical imaging, 3D data is far more common and is segmented by various 3D energy minimization techniques, predominately deformable model methods. Thus our contextual regularization method must be modified by (a) expanding the spectral flatness, edge evidence, and curvature cues to the 3D space, and (b) incorporating these cues into 3D segmentation frameworks. Convex Functionals for Continuous Segmentation Methods In our tests with continuous segmentation frameworks, such as ACWE and the MS model, we had difficulties using a convex energy functional with respect to the adaptive weight as we had done for the discrete frameworks; instead, we only weighted the regularization term. We found that weighting the external terms resulted in the stalling of the curve evolution. However, we were unable to determine exactly why that occurred. Further research would be beneficial in this area so that our method could be incorporated into segmentation frameworks in a more unified manner. Investigating Globally Optimal Weights for Additional Segmentation Methods We presented a method for determining the globally-optimal weight for the minimum-path segmentation framework, and showed that it did not provide accurate 130 results. However, we did not determine the globally-optimal weights for additional segmentation frameworks, such as continuous methods. A key area of future research is to determine if the globally optimum weight for other segmentation frameworks are also insufficient in addressing regularization needs. This would be important in proving that the globally optimal weights do not necessarily reflect correct segmentations. Further Validation with Additional Segmentation Frameworks and Comparison Methods We presented results from validation with four segmentation techniques, and comparisons against two existing methods (spatially-fixed, and the ET weights). Future work should focus on exploring the use of our weights with additional segmentation frameworks, such as the robust higher order potentials method by Kohli and Ladicky [58], normalized cuts [69], and shape priors and discrete MRFs by Besbes et al [10]. In addition, future work should focus on expanding the number of methods we compared our work against. The contextual texture-based regularization method of Malik et al [69] using Normalized Cuts should be validated, along with the texture-based regularization of Kokkinos et al [59] using segmentation by weighted curve evolution. In addition, a key family of segmentation approaches which our existing validation work does not cover are clustering techniques, such as null-means clustering [30, 95, 98], fuzzy c-means clustering [11, 25], spectral clustering (i.e., Normalized Cuts [69]), and expectation-maximization (EM) clustering [66]. 131 Additional Contextual Cues Incorporation of additional contextual cues would benefit our work, particularly in the area of texture estimation, additional noise estimation methods, and shape information. These additional cues can be incorporated into our method through multiplication with the noise-gated edge evidence as we have done with the Erdem and Tari texture cue in Section 3.6.2. In particular, the structural curvature cue would benefit from use of a structure tensor matrix where the eigen-decomposition of this matrix may capture more precise gradient features. 132 Bibliography [1] Propeller MRI. Southern MRI Research Institute. [Online]. Available: http://southernmri.org/?q=node/13. Accessed: May 1, 2010. [2] A. Akselrod-Ballin, M. Galun, M. John Gomori, A. Brandt, and R. Basri. Prior knowledge driven multiscale segmentation of brain MRI. In MICCAI (2), volume 4792 of Lecture Notes in Computer Science, pages 118–126. Springer, 2007. [3] L. Ambrosio and V.M. Tortorelli. Approximation of functional depending on jumps by elliptic functional via t-convergence. Communications on Pure and Applied Mathematics, 43(8):999–1036, 2006. [4] A. Amini, T. Weymouth, and R. Jain. Using dynamic programming for solving variational problems in vision. IEEE Transactions on pattern analysis and machine intelligence, pages 855–867, 1990. [5] C. Aslan and S. Tari. An axis-based representation for recognition. In International Conference on Computer Vision, pages II: 1339–1346, 2005. [6] M. Awrangjeb and G. Lu. Robust image corner detection based on the chord-to-point distance accumulation technique. IEEE Transactions on Multimedia, 10(6):1059–1072, October 2008. [7] S. Bagon. MATLAB wrapper for graph cuts. http://www.wisdom.weizmann.ac.il/bagon, 2006. [8] L. Bar, N. Kiryati, and N. Sochen. Image deblurring in the presence of impulsive noise. International Journal of Computer Vision, 70(3):279–298, 2006. [9] W. A. Barrett and E. N. Mortensen. Interactive live-wire boundary extraction. Medical Image Analysis, 1:331–341, 1997. 133 [10] A. Besbes, N. Komodakis, G. Langs, and N. Paragios. Shape priors and discrete MRFs for knowledge-based segmentation. In Computer Vision and Pattern Recognition, pages 1295-1302, 2009. [11] J. C. Bezdek, L. O. Hall, and L. P. Clarke. Review of MR image segmentation techniques using pattern recognition. Medical Physics, 20(4):1033–1048, 1993. [12] A. Blake and A. Zisserman. Visual Reconstruction. MIT Press, Cambridge, MA, 1987. [13] Y. Boykov and M.P. Jolly. Interactive graph cuts for optimal boundary and region segmentation of objects in ND images. In International Conference on Computer Vision, volume 1, pages 105–112. Citeseer, 2001. [14] Y. Boykov and O. Veksler. Graph cuts in vision and graphics: Theories and applications. The Handbook of Mathematical Models in Computer Vision. Springer, 2006. [15] Y. Boykov and G. Funka-Lea. Graph cuts and efficient N-D image segmentation. International Journal of Computer Vision, 70(2):109–131, 2006. [16] Y. Boykov and V. Kolmogorov. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE transactions on Pattern Analysis and Machine Intelligence, 26(9):1124–1137, 2004. [17] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12):1222–1239, 2001. 134 [18] M. Bydder, D.J. Larkman, and J.V. Hajnal. Simultaneous acquisition of spatial harmonics (SMASH): ultra-fast imaging with radiofrequency coil arrays. Magnetic Resonance in Medicine, 38(1):591–603, 1997. [19] M. Bydder, D.J. Larkman, and J.V. Hajnal. Generalized SMASH imaging. Magnetic Resonance in Medicine, 47(1):160–170, 2002. [20] J. Canny. A computational approach to edge detection. In RCV87, pages 184–203, 1987. [21] V. Caselles, R. Kimmel, and G. Sapiro. Geodesic active contours. International Journal of Computer Vision, 22(1):61–79, 1997. [22] T. F. Chan, B. Y. Sandberg, and L. A. Vese. Active contours without edges for vector-valued images. Journal of Visual Communication and Image Representation, 11(2):130–141, 2000. [23] T. F. Chan and L. A. Vese. An active contour model without edges. In Scale Space, pages 141–151, 1999. [24] T. F. Chan and L. A. Vese. Active contours without edges. IEEE Trans. Image Processing, 10(2):266–277, 2001. [25] K. S. Chuang, H. L. Tzeng, S. Chen, J. Wu, and T. J. Chen. Fuzzy c-means clustering with spatial information for image segmentation. Computerized Medical Imaging and Graphics, 30(1):9–15, 2006. [26] C. A. Cocosco, V. Kollokian, R. Kwan, and A. Evans. BrainWeb: Online interface to a 3D MRI simulated brain database. In NeuroImage, volume 5. Academic Press, 1997. 135 [27] I. Cohen, N. Ayache, and P. Sulger. Tracking points on deformable objects using curvature information. In ECCV, pages 458–466, 1992. [28] L. D. Cohen. On active contour models and balloons. CVGIP: Image Understanding, 53(2):211–218, 1991. [29] L. D. Cohen and R. Kimmel. Global minimum for active contour models: A minimal path approach. International Journal of Computer Vision, 24(1):57–78, 1997. [30] G. B. Coleman and H. C. Andrews. Image segmentation by clustering. Proceedings of the IEEE, 67(5):773–785, 1979. [31] D.L. Collins, A.P. Zijdenbos, V. Kollokian, J.G. Sled, N.J. Kabani, C.J. Holmes, and A.C. Evans. Design and construction of a realistic digital brain phantom. IEEE Transactions on Medical Imaging, 17(3):463–468, 1998. [32] J. Deng, W. Dong, R. Socher, L.J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR, 2009. [33] J. Deng, K. Li, M. Do, H. Su, and L. Fei-Fei. Construction and Analysis of a Large Scale Image Ontology. Vision Sciences Society, 2009. [34] E. Dijkstra. A note on two problems in connection with graphs. Numerical Mathematics, 1(5):269–271, 1959. [35] B. Dong, A. Chien, Y. Mao, J. Ye, and S. Osher. Level set based surface capturing in 3D medical images. In MICCAI (1), volume 5241 of Lecture Notes in Computer Science, pages 162–169. Springer, 2008. [36] M. Donias, P. Baylou, and N. Keskes. Curvature of oriented patterns: 2-D and 3-D estimation from differential geometry. In ICIP, pages I: 236–240, 1998. 136 [37] E. Erdem, A. Erdem, and S. Tari. Edge strength functions as shape priors in image segmentation. In EMMCVPR, pages 490–502, 2005. [38] E. Erdem and S. Tari. Mumford-Shah regularizer with contextual feedback. Journal of Mathematical Imaging and Vision, 33(1):67–84, 2009. [39] S. Esedoglu and J. H. Shen. Digital inpainting based on the Mumford-Shah-Euler image model. European Journal of Applied Mathematics, 13:353–370, 2002. [40] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2009 (VOC2009) Results. http://www.pascal-network.org/challenges/VOC/voc2009/workshop/index.html. [41] A.X. Falcão, J.K. Udupa, S. Samarasekera, S. Sharma, B.E. Hirsch, and R.A. Lotufo. User-Steered Image Segmentation Paradigms: Live Wire and Live Lane. Graphical Models and Image Processing, 60(4):233–260, 1998. [42] O. Faugeras. Three-Dimensional Computer Vision: A Geometric Viewpoint. MIT Press, Cambridge, Massachusetts, 1993. [43] D. Geiger, A. Gupta, LA Costa, and J. Vlontzos. Dynamic programming for detecting, tracking, and matching deformable contours. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(3):294–302, 1995. [44] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6):721–741, 1984. [45] G. Gilboa, J. Darbon, S. Osher, and T. Chan. Nonlocal convex functionals for image regularization. UCLA CAM-report, pages 06–57, 2006. [46] R. C. Gonzales and R. Woods. Digital Image Processing. Addison Wesley, 1992. 137 [47] L. Grady. Random walks for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(11):1768–1783, 2006. [48] D. Greig, B. Porteous, and A. Seheult. Exact maximum a posterori estimation for binary images. Journal Royal Statistical Society, B: 51(2):271–279, 1989. [49] X. C. He and N. H. C. Yung. Curvature scale space corner detector with adaptive threshold and dynamic region of support. In ICPR, pages II: 791–794, 2004. [50] M. Heath, K. Bowyer, D. Kopans, P. Kegelmeyer, R. Moore, K. Chang, and S. Munishkumaran. Current status of the digital database for screening mammography. Computational Imaging and Vision, 13:457–460, 1998. [51] M. Heath, K. Bowyer, D. Kopans, R. Moore, and P. Kegelmeyer. The digital database for screening mammography. In Proceedings of the 5th International Workshop on Digital Mammography, pages 212–218, 2000. [52] S. Hermann and R. Klette. A comparative study on 2D curvature estimators. In ICCTA, pages 584–589. IEEE Computer Society, 2007. [53] H. Ishikawa and D. Geiger. Occlusions, discontinuities, and epipolar lines in stereo. In ECCV, page I: 232, 1998. [54] N. S. Jayant and Peter Noll. Digital Coding of Waveforms. Prentice-Hall, Englewood Cliffs, 1984. [55] I.H. Jermyn and H. Ishikawa. Globally optimal regions and boundaries. In Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on, volume 2, 1999. [56] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. International Journal of Computer Vision, 1(4):321–331, 1988. 138 [57] L. Kitchen and A. Rosenfeld. Gray-level corner detection. Pattern Recognition Letters, 1(2):95 – 102, 1982. [58] P. Kohli, L. Ladicky, and P.H.S. Torr. Robust higher order potentials for enforcing label consistency. International Journal of Computer Vision, 82(3):302–324, 2009. [59] I. Kokkinos, G. Evangelopoulos, and P. Maragos. Texture analysis and segmentation using modulation features, generative models, and weighted curve evolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1):142–157, 2009. [60] V. Kolmogorov, Y. Boykov, and C. Rother. Applications of parametric maxflow in computer vision. ICCV, 8, 2007. [61] H.W. Kuhn. The Hungarian method of solving the assignment problem. Naval Res. Logistics Quart., 2:83–97, 1955. [62] R. Kwan, A. Evans, and G. Pike. An extensible MRI simulator for post-processing evaluation. In Visualization in Biomedical Computing, pages 135–140. Springer, 1996. [63] R. Kwan, A. Evans, and G. Pike. MRI simulation-based evaluation of image-processing and classification methods. IEEE Transactions on Medical Imaging, 18(11):1085–1097, 1999. [64] H. Li and A. Yezzi. Vessels as 4-D curves: Global minimal 4-D paths to extract 3-D tubular surfaces and centerlines. IEEE Transactions on Medical Imaging, 26(9):1213–1223, 2007. 139 [65] J. Liang, T. McInerney, and D. Terzopoulos. United snakes. Medical Image Analysis, 10(2):215–233, 2006. [66] Z. Liang, R.J. Jaszczak, and R.E. Coleman. Parameter estimation of finite mixtures using the EM algorithm and information criteria with application to medical image processing. IEEE Transactions on Nuclear Science, 39(4):1126–1133, 1992. [67] T. Lindeberg. On scale selection for differential operators. In Scan. Conf. on Image Analysis, pages 857–866, 1993. [68] T. Lindeberg. Edge detection and ridge detection with automatic scale selection. International Journal of Computer Vision, 30(2):117–154, 1998. [69] J. Malik, S. Belongie, T. K. Leung, and J. Shi. Contour and texture analysis for image segmentation. International Journal of Computer Vision, 43(1):7–27, 2001. [70] P. Martin, P. Réfrégier, F. Goudail, and F. Guérault. Influence of the noise model on level set active contour segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6):799–803, 2004. [71] T. McInerney and D. Terzopoulos. Deformable models in medical image analysis: A survey. Medical Image Analysis, 1(2):91–108, 1996. [72] C. McIntosh and G. Hamarneh. Is a single energy functional sufficient? Adaptive energy functionals and automatic initialization. In MICCAI (2), volume 4792 of Lecture Notes in Computer Science, pages 503–510. Springer, 2007. [73] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12):1376–1381, 1998. 140 [74] D. Mumford and J. Shah. Optimal approximations by piecewise smooth functions and variational problems. Comm. on Pure and Applied Math., XLII(5):577–685, 1988. [75] A. Olmos and F. Kingdom. McGill calibrated colour image database. http://tabby.vision.mcgill.ca. 2004. [76] S. J. Osher and R. Fedkiw. Level Set Methods and Dynamic Implicit Surfaces. Springer, 2002. [77] S. J. Osher and N. Paragios. Geometric Level Set Methods in Imaging, Vision, and Graphics. Springer-Verlag, 2003. [78] P. Perona and J. Malik. Scale space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7):629–639, 1990. [79] D.L. Pham, C. Xu, and J.L. Prince. Current methods in medical image segmentation. Annual Review of Biomedical Engineering, 2(1):315–337, 2000. [80] C. Pluempitiwiriyawej, J. M. F. Moura, Y. Lin Wu, and C. Ho. STACS: New active contour scheme for cardiac MR image segmentation. IEEE Transactions on Medical Imaging, 24(5):593–603, 2005. [81] M. Poon, G. Hamarneh, and R. Abugharbieh. Live-vessel: Extending livewire for simultaneous extraction of optimal medial and boundary paths in vascular images. In MICCAI (2), volume 4792 of Lecture Notes in Computer Science, pages 444–451. Springer, 2007. 141 [82] K. P. Pruessmann, M. Weiger, M. B. Scheidegger, and P. Boesiger. SENSE: sensitivity encoding for fast MRI. Magnetic Resonance in Medicine, 42(5):952–962, 1999. [83] S. Roy and I. J. Cox. A maximum-flow formulation of the N-camera stereo correspondence problem. In ICCV, pages 492–502, 1998. [84] A. A. Samsonov and C. R. Johnson. Noise-adaptive nonlinear diffusion filtering of MR images with spatially varying noise levels. Magnetic Resonance in Medicine, 52(4):798–806, 2004. [85] G. Sapiro. Geometric partial differential equations and image analysis. Cambridge Univ Pr, 2001. [86] J. A. Sethian. Level Set Methods and Fast Marching Methods. Cambridge University Press, 1999. [87] J. Shah. A common framework for curve evolution, segmentation and anisotropic diffusion. In CVPR, pages 136–142. IEEE Computer Society, 1996. [88] D. W. Shattuck, S. R. Sandor-Leahy, K. A. Schaper, D. A. Rottenberg, and R. M. Leahy. Magnetic resonance image tissue classification using a partial volume model. NeuroImage, 13(5):856–876, 2001. [89] J. G. Sled, A. P. Zijdenbos, and A. C. Evans. A nonparametric method for automatic correction of intensity nonuniformity in MRI data. IEEE Transactions on Medical Imaging, 17(1):87–97, 1998. [90] M. Styner, C. Brechbuhler, G. Szckely, and G. Gerig. Parametric estimate of intensity inhomogeneities applied to MRI. IEEE Transactions on Medical Imaging, 19(3):153–165, 2000. 142 [91] M. Szummer, P. Kohli, and D. Hoiem. Learning CRFs using Graph Cuts. In ECCV, pages 582–595. Springer, 2008. [92] Z. S. G. Tari, J. Shah, and H. Pien. Extraction of shape skeletons from grayscale images. Computer Vision and Image Understanding, 66(2):133–146, 1997. [93] D. S. Taubman and M. W. Marcellin. JPEG 2000: Image Compression Fundamentals, Standards and Practice. Kluwer Academic Publishers, Norwell, MA, USA, 2001. [94] S. Teboul, L. Blanc Feraud, G. Aubert, and M. Barlaud. Variational approach for edge-preserving regularization using coupled PDEs. IEEE Transactions on Image Processing, 7(3):387–397, 1998. [95] J. Theiler and G. Gisler. A contiguity-enhanced k-means clustering algorithm for unsupervised multispectral image segmentation. In Proc. SPIE, volume 3159, pages 108–118. Citeseer, 1997. [96] J. K. Udupa and G. J. Grevera. Go digital, go fuzzy. Pattern Recognition Letters, 23(6):743–754, 2002. [97] L. A. Vese and T. F. Chan. A multiphase level set framework for image segmentation using the Mumford and Shah model. International Journal of Computer Vision, 50(3):271–293, 2002. [98] K. Wagstaff, C. Cardie, S. Rogers, and S. Schroedl. Constrained k-means clustering with background knowledge. In Proc. 18th International Conf. on Machine Learning, pages 577–584. Morgan Kaufmann, San Francisco, CA, 2001. 143 [99] I. Wendelhag, Q. Liang, T. Gustavsson, and J. Wikstrand. A new automated computerized analyzing system simplifies readings and reduces the variability in ultrasound measurement of intima-media thickness. Stroke, 28(11):2195, 1997. [100] O. Wink, WJ Niessen, and MA Viergever. Multiscale vessel tracking. IEEE Transactions on Medical Imaging, 23(1):130–133, 2004. [101] L. Wolf, X. Huang, I. Martin, and D. Metaxas. Patch-based texture edges and segmentation. In ECCV, pages 481–493, 2006. [102] Y. Wu. Chan Vese active contours without edges. http://www.mathworks.com/matlabcentral/fileexchange/23445. 2009. [103] X. H. Zhang, M. Lei, D. Yang, Y. Wang, and L. T. Ma. Multi-scale curvature product for robust image corner detection in curvature scale space. Pattern Recognition Letters, 28(5):545–554, 2007. [104] X. H. Zhang, H. Wang, M. Hong, L. Xu, D. Yang, and B. C. Lovell. Robust image corner detection based on scale evolution difference of planar curves. Pattern Recognition Letters, 30(4):449–455, 2009. [105] P. Zhao and B. Yu. Boosted lasso. Journal of Machine Learning Research, to appear, 2004. [106] Peng Zhao and Bin Yu. Stagewise lasso. Journal of Machine Learning Research, 8:2701–2726, 2007. [107] B. J. Zhong and W. H. Liao. Direct curvature scale space: Theory and corner detection. IEEE Trans. Pattern Analysis and Machine Intelligence, 29(3):508–512, 2007. 144 [108] S.C. Zhu and A. Yuille. Region competition: Unifying snakes, region growing, and Bayes/MDL for multi-band image segmentation. In ICCV, pages 416-423, 1995.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Adaptive contextual regularization for energy minimization...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Adaptive contextual regularization for energy minimization based image segmentation Rao, Josna 2010-06-02
pdf
Page Metadata
Item Metadata
Title | Adaptive contextual regularization for energy minimization based image segmentation |
Creator |
Rao, Josna |
Publisher | University of British Columbia |
Date Issued | 2010 |
Description | Image segmentation techniques are predominately based on parameter-laden optimization processes. The segmentation objective function traditionally involves parameters (i.e. weights) that need to be tuned in order to balance the underlying competing cost terms of image data fidelity and contour regularization. Conventionally, the associated parameters are set through tedious trial and error procedures and kept constant over the image. However, spatially varying structural characteristics, such as object curvature, combined with varying noise and imaging artifacts, significantly complicate the selection process of segmentation parameters. This thesis contributes to the field of image segmentation by proposing methods for spatially adapting the balance between regularization and data fidelity in energy minimization frameworks in an autonomous manner. We first proposed a method for determining the globally-optimum spatially adaptive regularization weight based on dynamic programming. We investigated this weight with a basic minimum-path segmentation framework. Our findings indicated that the globally-optimum weight is not necessarily the best weight as perceived by users, and resulted in poorer segmentation accuracy, particularly for high noise images. We then investigated a more intuitive approach that adapts the regularization weight based on the underlying local image characteristics to more properly address noisy and structurally important regions. This method, which we termed contextual (data-driven) weighting, involved the use of a robust structural cue to prevent excessive regularization of trusted (i.e. low noise) high curvature image regions and an edge evidence measure, where both measures are gated by a measure of image quality based on the concept of spectral flatness. We incorporated our proposed regularization weighting into four major segmentation frameworks that range from discrete to continuous methods: a minimum-path approach [9], Graph Cuts [14], Active Contours Without Edges [24], and a contextual Mumford-Shah based approach [38]. Our methods were validated on a variety of natural and medical image databases and compared against the globally-optimum weight approach and to two alternate approaches: the best possible (least-error) spatially-fixed regularization weight, and the most closely related data-driven spatially adaptive regularization method. In addition, we incorporated existing texture-based contextual cues to demonstrate the applicability of the data-driven weights. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-06-02 |
Provider | Vancouver : University of British Columbia Library |
DOI | 10.14288/1.0070979 |
URI | http://hdl.handle.net/2429/25340 |
Degree |
Master of Applied Science - MASc |
Program |
Electrical and Computer Engineering |
Affiliation |
Applied Science, Faculty of Electrical and Computer Engineering, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2010-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 24-ubc_2010_fall_rao_josna.pdf [ 11.57MB ]
- Metadata
- JSON: 24-1.0070979.json
- JSON-LD: 24-1.0070979-ld.json
- RDF/XML (Pretty): 24-1.0070979-rdf.xml
- RDF/JSON: 24-1.0070979-rdf.json
- Turtle: 24-1.0070979-turtle.txt
- N-Triples: 24-1.0070979-rdf-ntriples.txt
- Original Record: 24-1.0070979-source.json
- Full Text
- 24-1.0070979-fulltext.txt
- Citation
- 24-1.0070979.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0070979/manifest