Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Slant from texture : computational methods for recovering surface slant from images of textured scenes Breu, Heinz 1980

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1980_A6_7 B74.pdf [ 4.94MB ]
Metadata
JSON: 831-1.0051595.json
JSON-LD: 831-1.0051595-ld.json
RDF/XML (Pretty): 831-1.0051595-rdf.xml
RDF/JSON: 831-1.0051595-rdf.json
Turtle: 831-1.0051595-turtle.txt
N-Triples: 831-1.0051595-rdf-ntriples.txt
Original Record: 831-1.0051595-source.json
Full Text
831-1.0051595-fulltext.txt
Citation
831-1.0051595.ris

Full Text

SLANT FROM TEXTURE: Computational methods for recovering surface slant from images of textured scenes by Heinz Breu B.Sc, University of British Columbia, 1978 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate Studies (Department of Computer Science) We accept this thesis as conforming to the required standard The University of British Columbia March, 1980 Heinz Breu, 1980 I n p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f t h e r e q u i r e m e n t s f o r an a d v a n c e d d e g r e e a t t h e U n i v e r s i t y o f B r i t i s h C o l u m b i a , I a g r e e t h a t t h e L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e a n d s t u d y . I f u r t h e r a g r e e t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p u r p o s e s may be g r a n t e d by t h e Head o f my D e p a r t m e n t o r by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l n o t be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . D e p a r t m e n t o f / oywpv "e(Z Gcitncs„ The U n i v e r s i t y o f B r i t i s h C o l u m b i a 2075 W e s b r o o k P l a c e V a n c o u v e r , C a n a d a V6T 1W5 D E - 6 B P 75-51 I E i i Abstract This thesis examines the problem of computationally recovering or determining the slant of a surface from an image of that surface. Images are restricted to those of planar surfaces produced by orthographic projection. This thesis is concerned only with those cues obtainable from the image texture. These cues arise primarily due to the foreshortening property of orthographic projection. Texture measures have typically been partitioned into three classes: statistical approaches, micro-structural approaches, and macro-structural approaches. In this thesis, measures from each of these classes are used to develop algorithms capable of detecting surface orientation. It is concluded that these three classes are not distinct and, indeed, are arti f i c i a l l y rendered by the prevailing definition of texture. A new definition involving nested structures is suggested. Acknowledgements I would like to thank my advisor Alan Mackworth for his many suggestions which I have incorporated in this thesis and for his friendly guidance when my work seemed to be at an impasse. Thanks also to my wife Wendy for a l l her support while I was in school and especially for her extensive help in the production of the final copy of this thesis. iv Table of Contents 1. INTRODUCTION 1 1.0. PHYSICAL ASSUMPTIONS 1 1.1. DEFINITIONS 2 Texture 2 Surface Slant 3 1.2. MOTIVATION 4 1.3. THESIS OUTLINE 9 1.5. READING PATHS 10 2. RELATED RESEARCH '. 12 2.1. THE DEFINITION OF TEXTURE 12 2.2. APPROACHES TO TEXTURE MEASURES 14 Gray-level Statistics 14 Structural Analysis 18 Frequency Domain 19 Hough Transforms 20 2.3. APPLICATIONS 31 Discrimination 31 Surface Slant 32 Depth Cue 33 2.4. SUMMARY 34 3. APPROACHES TO COMPUTATIONAL DETERMINATION OF SURFACE SLANT FROM TEXTURE MEASURES 35 3.1. INTRODUCTION 35 Theorem 3.1.1: The Foreshortening Effect of Orthographic Projection 39 V 3.2. STATISTICAL APPROACHES 42 Regular Dots 42 Circles 43 Other Micro-structures 49 Uniform Dots 49 Theorem 3.2.1: The Non-directionality of Random Textures ....52 Conclusions 53 3.3. STRUCTURAL APPROACHES: THE MICRO-STRUCTURE 54 The Single Surface 56 One Diagonal 56 Theorem 3.3.1: a S.l.d. is a U.i.l 61 One or More Diagonals 63 Theorem 3.3.2: Effect of more than one S.l.d 65 Single Surface Results 70 More Than One Surface 70 Details of the Implementation 90 Conclusions 93 3.4. STRUCTURAL APPROACHES: THE MACRO-STRUCTURE 94 "Grid-like" Textures 94 Results 106 4. AN INTERPRETATION OF THE RESULTS 112 4.1. SUMMARY 112 4.2. FALLACIES, OR WHY 3.2, 3.3, AND 3.4 ALL START WITH 3 113 The Fallacy of the Statistical-Structural Dichotomy 113 The Fallacy of the Micro/Macro-Structure Dichotomy 114 vi 5. CONCLUSION 117 5.1. DIRECTIONS FOR FURTHER RESEARCH 119 Extensions 119 Formalize the Definition 121 Application to the Origami World 121 5.2. SYNOPSIS 124 6. APPENDICES 125 6.1. APPENDIX TO 2.2 125 Derivation of the Rho-theta Equation of a Line 125 6.2. APPENDIX TO 3.3 127 Least Squares Fit of Data to An Ellipse 127 6.3. APPENDIX TO 3.4 130 Orientation of a Surface from True Shape and Projected Shape 130 Bibliography 133 v i i Table of Figures Figure f1.1.1: Picture and Gradient of a Line 5 Figure f1.2.1: Labellings of a "cube" 7 Figure fl.2.2: Gradients of Figure fl.2.1 8 Figure f2.2.1: Divided Squares 15 Figure f2.2.2: Checkerboard 15 Figure f2.2.3: Virtual Line 17 Figure f2.2.4: "Mesh-like" Textures 21 Figure f2.2.5: Rho-theta Parametrization of a Line 23 Figure f2.2.6: Hough Transform of "City Streets" 24 Figure 12.2.1\ Polar Plot of Hough Transform of "City Streets" 27 Figure f2.2.8: Derivation of T 29 Figure f3.3.1: Orthographic Projection of a Line 36 Figure f3.1.2: Surface Slant Notation 38 Figure f3.1.3: Figure for theorem 3.1.1 40 Figure f3.2.1: Examples of Deltas 44 Figure f3.2.2: Second Order Statistics of a Circle 46 Figure f3.2.3: Intercircle Deltas 47 Figure f3.2.4: Digitization of a Circle 48 Figure f3.2.5: Effect of Rotation on Co-occurrence Matrices 50 Figure f3.2.6: Effect of Rotation on Inter-texel Distances 51 Figure f3.3.1: "Cube" Textured with Squares 55 Figure f3.3.2: Tilted Surface Textured with Line Segments 57 Figure f3.3.2: Polar Plot of L-0-list Taken of Figure f3.3.2 59 Figure f3.3.4: Examples of Micro-structures with Exactly One Sufficiently Long Diagonal 62 v i i i Figure f3.3.5: Examples of Micro-structures Containing One Or More Sufficiently Long Diagonals 64 Figure f3.3.6: Figure for theorem f3.3.2 66 Figure f3.3.7: Tilted Surface Textured with Squares 68 Figure f3.3.8: Polar-plot of L-6-list Taken of Figure f3.3.7 69 Figure f3.3.9: "Cube" Textured with Line Segments 73 Figure f3.3.10: Polar of the L-6-list Taken of Figure f3.3.9 74 Figure f3.3.11: Ellipse Detected for Noisy Points 77 Figure f3.3.12: Iyy v.s. I x v Taken of Figure f3.3.1 81 Figure f3.3.13: Application of the Clustering Algorithm 82 Figure f3.3.14: Flow of Control of the Micro-structural Analysis Algorithm 91 Figure f3.3.15: Data Structures for the Micro-structural Analysis Algorithm 92 Figure f3.4.1: "Complete" Grid Texture 95 Figure f3.4.2: "Line-segments" Grid Texture 96 Figure f3.4.3: "Boxes" Grid Texture 97 Figure f3.4.4: The Relationship Between Lj and dj 100 Figure f3.4.5: Illustration of p^and ^2 Figure f3.4.6: Slanted Surface Textured with "boxes" Grid 104 Figure f3.4.7: Polar Plot of Hough Transform of Figure f3.4.6 105 Figure f3.4.8: Flow of Control for the Macro-structural Analysis Algorithm 108 Figure f3.4.9: Data Structures for the Macro-structural Analysis Algorithm 109 Figure f4.2.1: Hierarchical Texture I 115 Figure f4.2.2: Hierarchical Texture II 117 Figure f5.1.1: "Cube" Labelling and Gradients 122 ix Figure f5.1.2: Two "possible" Gradient Configurations of Figure f5.1.1 123 Figure f6.1.1: Rho-theta Equation of a Line 126 Figure f6.3.1: "Squashing" Effect of the Orthographic Projection ..131 1 1. INTRODUCTION "There is more to seeing a pikestaff than is commonly believed ..." (Boden, *77) This thesis is about deducing certain properties of a scene from an image of that scene. These properties are the orientations of the surfaces making up the scene and the deduction of these properties is made on the basis of knowledge about the texture present in the image. 1.0 PHYSICAL ASSUMPTIONS In order to study how perceived texture can enable a re-construction of the scene, the properties and the events to be analysed must be isolated. One way of accomplishing this isolation is to simplify the world. The following assumptions and restrictions, then, outline our simplification. They will be mentioned again in the thesis as they become appropriate. 1) All surfaces are planar. 2) All images are produced by an orthographic projection. 3) The scenes contain only continuous surfaces. 4) Some preprocessing of the image will always be assumed. The nature of this preprocessing will be mentioned when required. 5) "Nearby" regions of the image that have "similar" textures are really the same region. The term, "similar", will be defined when i t is needed. The term, "nearby", is not defined nor is it used by us. This assumption is made largely to allow the following assumption. 6) The images are segmentable into regions, each of which corresponds to a unique surface of the scene. In most cases, we will assume that this segmentation has taken place. 7) Assumptions will be made throughout the thesis as to the nature of the actual texture. So assumptions 1) - 3) simplify the physical "reality", 4) - 6) ? represent work already done which we assume and may call upon, and 7) is our major heuristic, for, without 7), any given image could have been produced from an infinite number of scenes. 1.1 DEFINITIONS This thesis, then, confronts the task of recovering three-dimensional scene information (namely, surface slant) using texture as a cue from a two-dimensional image produced by an orthographic projection of the scene. To proceed, the terms "texture" and "surface slant" require clarification. Texture Texture certainly deserves to be a nebulous concept (there is a pun in there somewhere) and no definition to date has made it very clear. Texture in an image normally consists of intensity variations in the image over space. These variations may be produced by two distinct means: 1) Variations in albedo. 2) Variations in the surface normal. This thesis assumes the former. This assumption is made as a further simplification of the world, for i t is now the case that perceived  texture does not depend on the position of the scene's light source(s). The variations in albedo and the surface markings are referred to 3 as the actual or scene texture and the variations in the image's intensity as the apparent or image texture. This will be discussed further in Chapters 2 and 4. For now, i t is sufficient to think of texture as being a large number of similar (visible) patterns or elements, each of which is small relative to the textured surface, arranged over the surface according to some set of placement rules. Note that the apparent texture then depends on the scale of the image since different things will become the basic elements as the scale varies. For example, wheat fields viewed aerially from one kilometer have a definite apparent texture quite distinct from the same wheat fields viewed from six centimeters. Image resolution is often associated with scale, resolution decreasing with the image-to-scene ratio. Some Artificial Intelligence paradigms (Kelly, 1971) (Tanimoto and Pavlidis, 1975) vary resolution with the level of attention . Hence the apparent texture also depends on the level of attention. The determination of what constitutes the basic elements for any given textured image is quite difficult and we will largely assume that this has been done for us i f we need i t (assumption 4). It is felt that this determination problem is one created by an insufficient definition of texture. This opinion is discussed in Chapter 4.2. Surface Slant Fortunately, the definition of "surface slant" is somewhat more concrete. We will generally talk about the gradient vector (Huffman, 1971) (Mackworth, 1973) of a surface or minor variations of i t . The gradient, (p,q), of a line in a scene is a vector "whose direction is in 4 the direction of the picture line and whose length is the tangent of the angle the scene line makes with the picture plane." (Mackworth, 1976). See Figure fl.1.1, from (Mackworth, 1976). We will say that the gradient of a surface is the gradient of the set of parallel lines having the steepest inclination to the picture plane. We will refer to the direction of this gradient vector as o and to the arctan of its  length as 0 . The cartesian co-ordinates, (p,q), and these angles are related by the following equations: p = tanjzJ cosa q = tanjzJ sincr In most cases, we will find i t more convenient to determine a and than to compute the cartesian co-ordinates of the gradient directly as Kender (see Chapter 2.2) does. We will also call a the direction of  slant and the amount of slant . 1.2 MOTIVATION It is not within the scope of this thesis to give motivations for recovering 3-D scenes from 2-D images, so such motivations will be assumed. But why study texture and surface slant? We will use an application to illustrate our motives. The origami world is a model of the visual world where surfaces, which may stand alone, (Kanade, 1978) are the basic elements rather than polyhedra. That is, the origami world includes as a subset the solid-objects studied by (Huffman, 1971), (Clowes, 1971), (Waltz, 1972) and (Mackworth, 1973) but allows other objects as well. The origami world is (a) A picture of an edge. .. > picture and gradient of a line Figure f1.1.1 6 defined on line-drawings and an object i s "understood" by assigning one of four l a b e l s , + (convex edge), - (concave edge) and, <-, and -> (occluding edge) to each l i n e i n the image of the object. Kanade's system yi e l d s the 3 (up to rotation) l a b e l l i n g s shown i n Figure f l . 2 . 1 for the l i n e drawing shown. We can think of a) as a cube-l i k e configuration, b) as a concave corner and c) as a "roof" placed on a plane. Kanade uses edge p r o f i l e s to determine which l a b e l l i n g i s "correct" for a given image, but l e t us see how texture can help us here. Let us f i r s t note that the gradients of the surfaces for Figure f l . 2 . 1 a), b) and c) are related as shown i n Figure f l . 2 . 2 a), b) and c) respectively. Suppose that a l l the surfaces have been textured and on t h i s basis we have determined the gradient vector of each surface ( i t i s the purpose of t h i s thesis to demonstrate that t h i s can be done). Then our findings w i l l e a s i l y distinguish between Figures f l . 2 . 1 a) and c ) , for example. Note that we w i l l not be able to distinguish figures f l . 2 . 1 a) and b) on t h i s basis since texture cues w i l l only y i e l d the surface orientation to within Necker reversal, that i s , texture w i l l give us two gradients for any given surface, the " r e a l " gradient and i t s negative or r e f l e c t i o n about the o r i g i n . Note that textures are e a s i l y carried over into l i n e drawings whereas int e n s i t y p r o f i l e s are not. Hence, i t i s not necessary to return to the o r i g i n a l image, which may be a photograph for example, to determine surface slant from texture. This i s especially useful i f the photograph i s not of s u f f i c i e n t resolution to allow u t i l i z a t i o n of edge p r o f i l e s . 7 a) b) 1 A A c) q / \ gradients of Figure f1.2.1 Figure fl.2.2 9 1.3 THESIS OUTLINE I t i s proposed, then, that texture i s a useful cue for the determination of surface slant from orthographic images. Three di f f e r e n t "kinds" of textures w i l l be examined and di f f e r e n t techniques w i l l be used to analyse them. Programs w i l l be implemented to determine surface slant for two of these texture classes. Let us now give a summary of the thesis by chapter. Chapter 1 Introduction This chapter opens by presenting the major assumptions made t h i s thesis. Section 1.1 discusses concepts c r u c i a l to t h i s work. Section 1.2 discusses our motivation with an example of a possible application of t h i s work. Section 1.3 gives an outline of the thesis. Chapter 2 Related Work Section 2.1 presents h i s t o r i c a l attempts at defining texture. These are discussed here mainly as references to l a t e r discussions. Section 2.2 introduces the main texture measures used. We w i l l use a l l but the Fourier transform in our work. Section 2.3 presents applications of these measures that lend credence to assumption 6) by demonstrating texture discrimination. Also discussed i s previous work on surface slant detection and depth cues. in Chapter 3 The Solution Section 3.1 defines "orthographic projection" and proves an important theorem about the nature of texture under t h i s imaging process. Section 3.2 examines the use of s t a t i s t i c a l measures for determining surface orientation and proves a theorem indicating the l i m i t s of t h i s approach. Section 3.3 undertakes a structural analysis of the texture's "basic element" and presents an implemented algorithm for detecting surface slant. Section 3.4 makes use of the Hough transform to analyse the texture's macro-structure. Again, an implemented algorithm i s presented. Chapter 4 An Interpretation of the Results Section 4.1 summarizes the results of Chapter 3. Section 4.2 then examines a common myth about textures i n l i g h t of these results. Chapter 5 Conclusions This chapter gives a b r i e f summary of what has been accomplished. Several directions for further study are proposed. 1.5 READING PATHS This thesis w i l l , naturally, be of d i f f e r e n t interest to d i f f e r e n t readers. So, l e t us sketch a few "paths" through the thesis. 11 D e f i n i t i o n of Texture I f the reader i s interested i n knowing only what we think texture i s , much of the work we have done may be glossed over. But read Sections 1.1 to get acquainted with texture. 2.1 to get a h i s t o r i c a l perspective. 3.1 to understand the imaging process. 4 and the conclusions to 3.2, 3.3 and 3.4 to get our opinions on texture. 5 the f i r s t part, to get our conclusions. S t a t i s t i c a l Approaches Read 1.1, 2.2, 3.2, 4.1, 4.2, 5. Micro-structural Approaches 1.1, 2.3, 3.3, 4.1, 4.2, 5. Macro-structural Approaches 1.1, 2.4, 3.4, 4.1, 4.2, 5. The reader interested i n working with texture i n any context i s advised to at least browse through the entire thesis. In t h i s endeavour, may I wish the reader f r u i t f u l studies. 12 2. RELATED RESEARCH 2.1 THE DEFINITION OF TEXTURE Everyone i s fa m i l i a r with v i s u a l texture but no one seems to be able to define i t adequately. Texture i s d i f f e r e n t things to dif f e r e n t people. Certainly, texture depends on one's point of view, or l e v e l of attention, as was pointed out i n the previous chapter. I t appears reasonable, however, to define what one i s working with, and several stabs have been made at i t . The lack of successful d e f i n i t i o n s i s due largely to the practice of viewing textured surfaces with the picture plane p a r a l l e l to the surface and to the confusion caused by considering texture to be composed of micro-structures and macro-structures. This l a t t e r problem is discussed i n chapter 4. The former results i n a conceptual blurring of the d i s t i n c t i o n between scene and image. That i s , d e f i n i t i o n s have been proposed which describe the actual texture i n terms of the image (regions, gray-tones, arrays). Such d e f i n i t i o n s are acceptable only so long as fron t o - p a r a l l e l views are maintained. I f the surface slant i s non-zero, however, the question of texture i d e n t i t y and problems i n terminology quickly a r i s e . The image-based d e f i n i t i o n s would say that a non-tilted surface and a t i l t e d surface have di f f e r e n t textures. There i s no way, i n such a d e f i n i t i o n , of expressing the notion of the same actual or surface texture. Our own d e f i n i t i o n i s scene based, hence we would say that the slanted surface has the same actual texture as the unslanted surface. We would agree, of course, that a di f f e r e n t apparent texture i s present. I t should be realized that a l l of the following researchers have proposed image-based d e f i n i t i o n s . 13 In an excellent review of the role of texture i n object perception, Pickett defines v i s u a l textures to be "two-dimensional arrays of variations" (Pickett, 1970). A more e x p l i c i t d e f i n i t i o n , "texture i s composed of large numbers of si m i l a r 'basic elements' or 'pieces' each of which i s small r e l a t i v e to the size of the textured region", i s proposed i n (Rosenfeld, Lee, and Thomas, 1970). Haralick et a l fe e l that "texture i s concerned with the s p a t i a l ( s t a t i s t i c a l ) d i s t r i b u t i o n of gray tones" (Haralick, Shanmugan and Dinstein, 1973). Along these l i n e s , Zucker has proposed a general model for texture (Zucker, 1976). This model i s based on a Chomsky-style grammar. Zucker proposes an "alphabet" of micro-structures. These micro-structures are then arranged on a surface according to a set of placement rules (syntax). The result i s then distorted i n some way (transformational component) to y i e l d the f i n a l texture. I t should be noted that the placement rules are chosen to y i e l d periodic structures. I t i s often f e l t that a suitable description of a texture i s one which w i l l enable the re-generation of that texture. Rosenfeld and Lipkin have developed a system allowing texture synthesis (Rosenfeld and Lip k i n , 1970). The required texture description i s also in the form of (1) the nature of the micro-structures and (2) the appropriate placement rules. 14 2.2 APPROACHES TO TEXTURE MEASURES Gray-level S t a t i s t i c s Gray-level s t a t i s t i c s can be c l a s s i f i e d into 1 s t , 2n& or, i n general, i t h order s t a t i s t i c s . The order corresponds to the number of pix e l s or points considered i n i n t e r - r e l a t i o n . So, 1 s t order s t a t i s t i c s include gray-level average and variance (Hawkins, 1970) for example. Note that these values depend on the individual gray-level values only and not on their arrangement. Nothing can be deduced about micro-structure size or shape from 1 s t order s t a t i s t i c s . For example, figures f2.2.1 and f2.2.2 below y i e l d the same average and variance values. In fa c t , a l l f i r s t order s t a t i s t i c s are invariant under any p i x e l permutation. 2 n d order s t a t i s t i c s characterize i n t e r - r e l a t i o n s of pairs of pixe l s or dots. These may be viewed as "dipole s t a t i s t i c s " or the pr o b a b i l i t i e s of "needles" of fixed length and orientation thrown on the image having given gray-level values at th e i r t i p s . I t has been conjectured by Julesz (Julesz, 1973) that 2 n d order s t a t i s t i c a l differences are necessary but not s u f f i c i e n t for human vi s u a l texture discrimination. That i s , a pair of textures may d i f f e r i n 2 n d order s t a t i s t i c s and yet not be discriminable by humans. However, Julesz himself has produced counter-examples to his o r i g i n a l conjecture; that i s , discriminable textures with i d e n t i c a l 2 n d order s t a t i s t i c s were exhibited ( C a e l l i and Julesz, 1977). Julesz modified his psychological model (and conjecture) by introducing pseudo-colinearity detectors that 15 Checkerboard Figure f2.2.2 16 come into play only when 2nc^ order s t a t i s t i c s are i d e n t i c a l . Schatz modified t h i s procedure by considering only those dipoles connecting endpoints of real or l o c a l " v i r t u a l " l i n e s (Schatz, 1978). An example of a l o c a l v i r t u a l l i n e i s the dotted l i n e i n figure f.2.2.3. Schatz's s t a t i s t i c s , do y i e l d i d e n t i c a l values for some pairs of textures where 2n<3 order s t a t i s t i c s are d i f f e r e n t . These textures are not subject to human discrimination. Hence, Schatz feels that h is method comes closer to providing a s u f f i c i e n t condition for human texture discrimination. Schatz's method assumes that the image has been preprocessed to y i e l d " l i n e drawings" or "primal sketches" (Marr, 1976). Of course, Schatz assumes that the o r i g i n a l image i s so preprocessable. This i s not always the case, as with texture elements composed of curved l i n e s or dots only. In these cases, a l l l i n e s are v i r t u a l l i n e s . 2n<3 order s t a t i s t i c s are computed by Haralick with the help of "gray-tone spatial-dependence" or co-occurrence matrices (Haralick, Shanmugan and Dinstein, 1973). Consider a d i g i t i z e d image and a displacement d' =(DX,DV) defined on i t . Then the co-occurrence matrix Pcj is defined such that Pcj(i,j) = pr o b a b i l i t y that a point with gray-level j occurs at a displacement o from a point with gray-level i Haralick defines 14 di f f e r e n t texture s t a t i s t i c s that may be computed from these matrices and suggests possible psycho-physical interpretations for some of them (e.g., "coarseness", "entropy"). Note that the displacements, cj, correspond to Julesz's dipoles. Note also that since i s fixed i n orientation, such matrices can be used to measure textural features i n a given d i r e c t i o n . This i s useful for our work i n detecting surface slant and also for detecting d i r e c t i o n a l i t y i n a texture. In f a c t , we w i l l examine "blind" application of t h i s 17 V i r t u a l l i n e figure f2.2.3 18 technique i n chapter 3.2. More to the point i s our use of dipoles i n chapter 3.2 (called "diagonals" in that chapter). Our detection of these dipoles, however, i s implemented by a structural analysis . This type of analysis i s developed next. Structural Analysis (Tomita et a l , 1973) have used structural analysis for region detection. The assumptions made here are that the image i s made up of c l e a r l y outlined atomic elements (such as c i r c l e s or squares) and that each region to be detected i s made up of a homogeneous set of these elements. A certain l e v e l of noise (input i s v i a a TV camera) i s also assumed and a set of i j t n order moments (see pgs. 78 to 79 for d e f i n i t i o n s ) , M j j , are chosen as shape descriptors, p a r t i a l l y for t h e i r resistance to such noise. This process, while used only for texture discrimination here, would be useful to our task as we could determine how the atomic elements have been deformed by the apparent surface slant. Nevertheless, we w i l l not use them in t h i s context, but for region detection only (see Chapter 3.3). Hawkins describes the use of l o c a l "matched f i l t e r s " for texture discrimination (Hawkins, 1970). This i s a structural approach i n the sense that structures or shapes (such as edges, l i n e s and wedges) are being searched f o r , but no attempt i s made to iso l a t e possible micro-structures or to describe t h e i r shapes. As Hawkins points out, these measures are insensitive to too many image properties. In p a r t i c u l a r , they are insensitive to changes due to surface slant since no new edges result from t i l t i n g the surface. Schatz's work may also be construed as s t r u c t u r a l . He assumes that 19 a l l real l i n e s i n the image have already been found. As w e l l , v i r t u a l l i n e s are constructed between real l i n e terminators with a notion of " l o c a l " . However, Schatz only looks at the overall s t a t i s t i c s of these l i n e s rather than t h e i r s t r u c t u r al properties. Frequency Domain Consider a d i g i t a l image i n matrix form, g(x,y), where x and y are integers over the domain (0...k). The image's discrete Fourier transform i s given by k-1 k-1 F(n,m) = (1/k 2) J 1 g ( x , y ) e - ( 2 P i ) i [ ( x n + y m ) A ] x=0 y=0 The power spectrum of F i s given by P(n,m) = |F(n,m)|. The power spectrum i s invariant with respect to translation i n the s p a t i a l domain but not with respect to rotation. Hawkins points out that textures may d i f f e r only i n t h e i r phase relationships and yet appear very di f f e r e n t (Hawkins 1970), indicating a shortcoming of power spectra as texture descriptors. However, Bajcsy (Bajcsy, 1972,1973) uses the d i r e c t i o n a l s e n s i t i v i t y of power spectra to some advantage. The power spectrum i s transformed from cartesian coordinates to polar coordinates, P(r,#). Then for each d i r e c t i o n 0 and for each frequency r we have the one dimensional functions P 0(r) and Pr(jrf) respectively. She then defines k P(r) = 25Ptf(r) and jz>=0 w/2 Q(&) = ZPr(^) where w i s the window size r=l the basis for her texture descriptors i s then the pair <P(r),Q(0)>. 20 Bajcsy defines many descriptors from t h i s pair. For the i r d e f i n i t i o n and further d e t a i l s of the above see (Bajcsy, 1972,1973). Textural features derived from the Fourier domain have been found to give poorer results than s t a t i s t i c a l features i n some comparative studies (Dyer, Weszka and Rosenfeld, 1975a, 1975b, 1975c). I t has been suggested (Rosenfeld,1975) that t h i s may be due to "edge effects" of the discrete Fourier transform. Some work has been done using reflected patterns to minimize t h i s problem (Dyer and Rosenfeld, 1975). The Fourier-based approach may be useful for our work since d i r e c t i o n a l features show up strongly. However, structural information i s disguised by the transform, yet t h i s i s quite important information, as we have seen i n the previous chapter. The Fourier transform could be used in the macro-structural analysis presented i n Chapter 3.4. Nevertheless, we found the rho-theta Hough transform i n t u i t i v e l y more transparent for t h i s application. This transform i s developed h i s t o r i c a l l y i n the following section. Hough Transforms The problem of detecting surface slant from texture gradients i n the perspective projection case has been tackled by (Kender,1979). Consider the special case where the micro-structures (true shape) are " r e s t r i c t e d to be one-dimensional and l i n e - l i k e ; they are arranged i n a regular mesh-like fashion." Examples of such textures appear i n Figure f2.2.4 below. Kender enlisted help from the techniques of a related d i s c i p l i n e , that of l i n e detection i n images. To see how Kender uses t h i s technique 21 22 to some advantage, i t may be useful to set the stage by quickly reviewing the technique's development. This procedure for detecting l i n e s i n an image was f i r s t proposed by (Hough,1962) and i s now commonly referred to as the use of Hough  Transforms. B a s i c a l l y , i t involves transforming each point i n the pattern into a l i n e i n a parameter space, i n Hough's case, into slope-intercept space. The f i r s t r e a l l y useful development occurred several years l a t e r (Duda and Hart,1972). Here, angle-radius (also referred to as theta-rho or rho-theta) space i s used. At t h i s stage, a closer look i s warranted. Duda and Hart decided to use normal parameterization. Referring to Figure f2.2.5 for the d e t a i l s of the parameters, we note that a straight l i n e can be expressed as xcos(0) + ysin(O) = rho (see the Appendix to 2.2 for derivation) After correlating t h e i r d i g i t i z e d image with a differencing operator (a way of detecting intensity changes), Duda and Hart then map each figure point (x^y^) to the sinusoidal curve i n (quantized) theta-rho space: rho = XJCOS(0) + yjSin(G) each c e l l i n the space being augmented by 1 every time i t i s "written" into. Since c o l l i n e a r points i n the image map into l i n e s intersecting at a point corresponding to the l i n e they a l l l i e on, points of maximal accumulation (detected by histogramming) each correspond to a detected l i n e . Figure f2.2.6 below i s the theta-rho Hough transform of the " c i t y streets" of Figure f2.2.4. Rho-theta parameterization of a line Figure f2.2.5 TT_ X Hough transform of "city streets" Figure f2.2.6 25 The next major step was made by (O'Gorman and Clowes,1973). Following a suggestion by A.K. Mackworth and J . Francis, they decided to make use of l o c a l evidence to determine uniquely the value of ©. In eff e c t , they considered each p i x e l i n the image to be an "edgel" (having d i r e c t i o n and position but no length). Let us see how t h i s i s accomplished. We can estimate the inte n s i t y gradient G=(DXXV,DYXV) at each point (x,y) by correlating the image with the mask -1/6 0 1 +1/6 -1/6 0 1 +1/6 -1/6 0 1 +1/6 to obtain DX Y V and with the mask +1/6 +1/6 +1/6 0 0 0 -1/6 -1/6 -1/6 to obtain DY X V. That i s , for every point (x,y): +1 +1 DX X V = 1/6(T I(x+l,y+i) - 2 I(x-l,y+i) ) i=-l i = - l +1 +1 DY x y = 1/6(2 I(x+i,y+l) - 2 I(x+i,y-l) i = - l i = - l where I(x,y) i s the value of the image's int e n s i t y array at point (x,y). Now we can approximate (tan(0) = DY Xy/DX x v) for each "edgel" since the ?6 i n t e n s i t y gradient i s generally perpendicular to the picture edge. So for each "edgel" we can increment a unique histogram bucket indexed by rho and ©. Having developed the necessary to o l s , we are f i n a l l y ready to look at Render's work with texture. Kender used the observation that true p a r a l l e l l i n e s i n a perspective image converge to a vanishing point (or points) to note that such l i n e s would map into points i n the theta-rho parameter space l y i n g on the sine-curve corresponding to the vanishing point. Sinusoidal curves being rather d i f f i c u l t to detect, Kender proposed two modifications to the transform. The f i r s t i s to plot the theta-rho transform in polar coordinates. The c i t y streets example of Figure f2.2.4 i s plotted below i n t h i s space (Figure f2.2.7). Kender noted that p a r a l l e l l i n e s i n the image now map into points l y i n g on a straight l i n e going through the o r i g i n and converging l i n e s map into points on a c i r c l e through the o r i g i n . I t should be noted at t h i s stage that the distance between points corresponding to p a r a l l e l (in the image) l i n e s i s precisely the normal distance between these l i n e s i n the image. This feature w i l l be examined i n Section 3.4 of the next chapter. Kender also noted that i f we describe a point .in the parameter space i n terms of the rectangular co-ordinates T x and T v then i t i s not necessary to compute 6 e x p l i c i t l y . In f a c t , no trigonometry at a l l i s required. Consider an "edgel" defined by a position P=(x,y) and an "edge vector" E=(E X,E V). The edge vector i s a vector perpendicular to the l i n e segment that i t refers to. The value of the edge vector's length 27 T y A Polar plot of Hough transform of "city streets" Figure f2.2.7 28 i s l e f t open to interpretation i n Render's paper but may be most conveniently thought of as the "strength" of the l i n e . That i s , i f the image i s correlated with the masks of O'Gorman and Clowes, then the edge vector may be taken to be E = G = (DX X V,DY X V). The parametric form of t h i s edgel, T = (T X,T V), can then be computed to be: T = ((E»P)/|E|2)E That i s , T i s i n the d i r e c t i o n of E, perpendicular to the l i n e segment, and i s of length (E»P)/|E| which i s just the length of P projected onto the unit vector i n the di r e c t i o n of E and T, namely rho (see Figure f2.2.8). Render's second modification i s simply to map every point, (rho,©), to (K/rho,9) for some constant K. So, T i s s t i l l i n the di r e c t i o n of E, but i s now of length R/rho. That i s , T = (K/rho)(E/IEI) -= (K/[(E-P)/|E|]) (E/IEI) = (K|E|E)/([(E-P)|E|] T = (R/(E«P))E Converging l i n e s now map into points l y i n g on a straight l i n e , p a r a l l e l l i n e s into points l y i n g on a straight l i n e through the o r i g i n . Derivation of T Figure f2.2.8 30 Render then proves the following theorem: "Suppose an image contains the perspective projection of a planar surface defined by two or more coplanar sets of p a r a l l e l l i n e segments. Let the o r i g i n of the transform space correspond to the coordinates of the focal point i n the image (that i s , where the camera i s "aimed"). Let R equal the focal distance. Then T = R E / (E-P) transforms edges so that the intersection of l i n e s of accumulation points i n the transform space i s at (p,q), the gradient vector of the surface." And so the surface's slant i s detected. Notice, however, that t h i s method f a i l s i f the projection i s orthographic or i f the vanishing points are very far away since the intersection point w i l l not be distinquishable from the o r i g i n . This method also r e l i e s on our distinguishing between p a r a l l e l l i n e s that converge due to perspective and non-parallel l i n e s . This, as was pointed out e a r l i e r , i s not always that easy to do. Nevertheless, Render has introduced modifications that make the Hough transform computationally more desirable. Chapter 3 w i l l discuss the uses of the Hough transform for orthographic projections. 31 2.3 APPLICATIONS Discrimination The most common application of texture i s the discrimination of homogeneous textured regions. This i s t y p i c a l l y concerned with planar surfaces viewed with the surface normal perpendicular to the picture plane. In t h i s case the image texture i s the same as the scene texture. As was mentioned before, t h i s i d e n t i t y i s the source of a great deal of confusion i n defining texture. These are generally "real-world" applications such as t e r r a i n c l a s s i f i c a t i o n from s a t e l l i t e images (Dyer, Weszka and Rosenfeld, 1975a,1975b) and micro-biological images (Hawkins, 1970). An exception to the above i s the work of Tomita et a l (Tomita, Yachida and T s u j i , 1973) mentioned i n the preceding section under "Structural Analysis". The input picture here could be of a textured cube on a d i f f e r e n t l y textured table-top, for example. The image i s then segmented into regions on the basis of several textural features. An important aspect of t h i s work i s that segmentation occurs i n stages or le v e l s as determined by the "supervisor". For example, the cube image i s f i r s t segmented into the "cube" region and the "table-top" region on the basis of one feature. The cube region i s then further segmented into regions corresponding to the three v i s i b l e surfaces of the cube. Note that at the top l e v e l the cube i s seen as being one region. The scene texture i s indeed the same over the whole cube. The image texture, however, i s d i f f e r e n t for each region representing a face of the cube. I t i s a strength of t h i s program that i t detects these 32 l e v e l s . Immediate discrimination of more abstract (and perhaps a r t i f i c i a l ) textures has been studied by Julesz (Julesz et a l , 1973) (C a e l l i and Julesz, 1978) from a psychological point of view and by Schatz (Schatz, 1978) from a computational point of view. Only absolute discrimination (are the two textures d i f f e r e n t or not?) i s considered by a l l of these researchers, although recently ( C a e l l i and Julesz ,1978) have mentioned "ease of discrimination". No interpretation (such as apparent texture from true texture) was placed on differences i n apparent texture. In the above cases, discrimination was performed by generating a histogram of the discriminating feature, setting thresholds from the histogram, and segregating textures as to how the feature applied over the texture related to the threshold. For example, a l l atomic regions with area greater than the threshold are grouped into one region (Tomita et a l , 1973). We w i l l demonstrate t h i s i n Chapter 3.3. Surface slant Gibson has studied the use of texture gradient (the change, due to perspective, i n some textural features such as coarseness) in human "space perception" (Gibson, 1950). Gibson has shown that, i n the case of perspective images, d i s c o n t i n u i t i e s i n texture gradient give r i s e to the perception of edges, such as caused by corners or di s c o n t i n u i t i e s i n the surface ("occluding edges"), and that the texture gradient could be used to determine the orientation or shape of a surface. Computationally, many problems a r i s e . Among them: i t i s d i f f i c u l t to establish i f the texture gradient over a region i n an image i s caused by perspective or the curve of the surface. Gibson also observed that the 33 monocular perception of depth depends on texture gradient to some degree. We have already mentioned the work by Render i n t h i s f i e l d , and we now note that, he also, exploits texture gradient i n perspective images by computing the vanishing point. This thesis does not exploit texture gradient, as w i l l be discussed l a t e r . Both of Gibson's and Render's theories do not determine surface slant i n orthographic images. I t i s the purpose of t h i s thesis to present a theory that succeeds i n t h i s respect. Depth cue Computations based on texture gradient have actually been used as a depth cue (Bajcsy and Lieberman, 1976) in perspective images. The assumption i s made that a l l surfaces are planar. "Real-world" scenes (such as a grassy f i e l d and an ocean view) were considered and images of these were used as input to the program. Features derived from the Fourier domain were used and r e l a t i v e distances were co r r e c t l y deduced. I t was conjectured i n t h i s work that absolute distances could be calculated i f the program contained a camera model with information such as the lens focal length. Some problems were encountered when the planar surface assumption was violated (the grassy f i e l d r o l l e d a b i t ) . This work, may have benefited i f discrimination of surface slant had been incorporated into the program. 34 2.4 SUMMARY Texture i s usually defined as many basic elements (micro-structures) on a surface. Each micro-structure may or may not have a structure associated with i t . As w i l l be further explained i n Chapter 3.1 and 3.3, texture w i l l be viewed as micro-structures placed on a surface according to a macro-structure (placement ru l e s ) . The word "texel" i s used to refer to a projected image of a micro-structure. The shortcomings of t h i s approach w i l l be discussed i n Chapter 4.4. Many texture measures have been developed and they f a l l into three broad classes: s t a t i s t i c a l , micro-structural and macro-structural. A l l three classes w i l l be used i n the next chapter, but, as we s h a l l see i n Chapter 3 and discuss i n Chapter 4, these class d i s t i n c t i o n s are somewhat ambiguous. I t w i l l be assumed that texture discrimination i s possible i n the remainder of t h i s d i s s e r t a t i o n . Furthermore, work concerned with determining surface slant from images of textured scenes has been re s t r i c t e d to perspective images. The rest of t h i s thesis w i l l extend t h i s work by examining texture i n orthographic projections. 35 3. APPROACHES TO COMPUTATIONAL DETERMINATION OF SURFACE SLANT FROM TEXTURE MEASURES 3.1 INTRODUCTION I t i s demonstrated i n t h i s chapter that information about a surface's orientation can be extracted from the apparent, texture of the surface's orthographic image. This demonstration i s constructive i n that we w i l l a c tually specify algorithms to accomplish t h i s task (and i n some cases our computer w i l l accomplish i t ) . With t h i s end i n mind, l e t us now look at the orthographic picture taking process. Let us suppose that we have represented a given object i n terms of the Cartesian co-ordinates of a l l of the points on i t s surface. This object, and the space that i t i s i n w i l l be referred to as the scene. Now l e t us position a picture plane i n such a way that the x- and y-axes of the picture plane are p a r a l l e l to the X- and Y-axes of the scene's frame of reference and the picture plane o r i g i n l i e s on the Z-axis of the scene's reference frame. We w i l l say that a point, (x,y), on the picture plane i s an orthographic projection of a point (X,Y,Z), in the scene i f and only i f x = kX and y = kY for some constrant k. This process i s i l l u s t r a t e d for a l i n e i n figure f3.1.1. Suppose now that our scene consists of a planar surface with l i n e s and dots on i t . A surface i s considered to be " i n f i n i t e " in a l l directions (the extent of any l i n e s and dots on i t , however, i s f i n i t e ) . I t i s easy to see that the surface, unless i t i s p a r a l l e l with the picture plane, intersects the picture plane along a straight l i n e . Without loss of generality, we can think of t h i s intersection l i n e as being p a r a l l e l to the x-axis ( i f i t i s not, we need simply rotate the Orthographic projection of l i n e Figure f3.1.1 37 frame of reference about the z-axis u n t i l i t i s ) . We w i l l also assume for s i m p l i c i t y that the surface passes through the reference frame o r i g i n and that any l i n e on the surface also passes through the o r i g i n unless otherwise mentioned. These l a s t two assumptions are made simply to allow referring to absolute co-ordinates, (x,y), rather than being forced to use r e l a t i v e co-ordinates (delta x, delta y). We now want to talk about how the surface i s slanted with respect to the picture plane. To make t h i s task somewhat easier, we define the following terms. We define the amount of slant, tf, of a surface as the angle between the X-Y plane (or the picture plane) and the surface's gradient vector. I f the frame of reference i s rotated as we assume i t to be, then & i s the angle between the Y-axis and that l i n e on the surface which i s normal to the X-axis. For any l i n e (passing through the origin) we define 0 a (a for actual) to be the angle between the l i n e and the X-axis. We also define 0 to be the angle between the projected image of the l i n e and the x-axis. Note that 0 < 0 a i n general. These concepts are i l l u s t r a t e d i n figure f3.1.2. I t i s the goal of t h i s chapter to recover & from information obtained only from the image (and the assumptions we make about i t ) . In general, for a given image and a given surface, the surface w i l l not intersect the picture plane at a l i n e p a r a l l e l to the x-axis, contrary to our assumption. In f a c t , the intersection l i n e w i l l be at an angle c n (which may be 0) from the x-axis of the picture plane. Recall that a i s the di r e c t i o n of the surface's gradient vector, hence the notation a n to indicate that the intersection l i n e i s normal to the di r e c t i o n of the gradient. I t i s also our goal to recover c from the same information. For the purpose of i l l u s t r a t i o n , however, l e t us s t i c k to the a n = 0 Surface slant notation Figure f3.1.2 39 assumption. The feature of t h i s projection that i s most useful to our work i s the effect of "foreshortening" of lengths. To see what t h i s means, l e t us prove the following theorem. Theorem 3.1.1 Assume that a l l of the l i n e s on the surface are of length L and that these l i n e s take on a l l orientations (-ir/2 < ©a < ir/2). Let the surface be t i l t e d by an angle jzf. Let L 1 be the length of a projected image of a l i n e . Then a polar plot of L" versus © i s an e l l i p s e centred on the o r i g i n with a major axis a = L and a minor axis b = Lcos(jrf). Proof: Referring to Figure f3.1.3, i t i s noted that X = OA = Lcos©a x = O'A1 =L'cos© and that X = x since OA = O'A' so Lcos©a = L'cos© (1) I t i s also observed that Y = BP = CD = OE ODcos^ APcosjzf (OPsin©a)cos^ LsinOgCosjrf y A'P' L'sin© and Y y since Y = BP = A'P' = y so Lsin©acosjzS = L'sin© (2) Then, dividing (2) by (1) we get tan© = tan©acos^ (3) To see the e l l i p s e i n L 1, we now want to express L 1 as a function of a, b, and 6. Since a = L and b = Lcos^, we want to express L' as a function of L, cos^, and ©. Now 40 Figure for theorem f3.1.1 Figure f3.1.3 4 1 L 2COS 2© a L « 2 =  cos 20 by (1) L 2 = (trigonometric equality) (l+tan 2 e a)cos 26 L 2 = b y ( 3 ) (l+tan 2 e)cos 2 e COS2Jf) L2cos2j?) cos 29cos 2jzi + s i n 2 0 then multiplying by L 2 / L 2 we get (L 2) (L2cos2)zJ) L 2 s i n 2 6 + (L 2cos 2jzi) cos 2 e a 2 b 2 a 2 s i n 2 9 + b 2cos 20 which i s the equation of an e l l i p s e . Q.E.D. I t i s now f a i r l y easy to see that any given l i n e i n the scene w i l l be maximally foreshortened i n the d i r e c t i o n of the surface slant (the di r e c t i o n of the gradient vector) and not foreshortened at a l l i n a d i r e c t i o n normal to i t . Furthermore, theorem 3.1.1 allows us to calculate the amount of sl a n t , jrf, as: 0 = cos-1(b/a) There i s , nevertheless, a defect i n the method of determining jrf from foreshortening. From the theorem just proven, l i n e s i n the scene are foreshortened by a factor cosjzJ in the d i r e c t i o n of the surface's gradient. But due to the invariance of the function, cos, over the sign of i t s argument, i t i s possible to determine only the absolute value of 42 f6 from the foreshortening. That i s , since cos^ = cos ( - t f ) , we cannot distinguish between the image of a scene and the image of i t s Necker reversal. With these observations i n hand, l e t us now look at texture. 3.2 STATISTICAL APPROACHES Second order s t a t i s t i c s have been used by Julesz, Haralick, and Schatz (see Chapter 2 for d e t a i l s ) . These s t a t i s t i c s are examined i n the hope that they may be useful for the detection of surface slant and the results obtained are presented i n t h i s section. Input i s i n the form of a 64X64 matrix with two possible gray-levels, 0 (blank) and 1 (a mark). 2X2 co-occurrence matrices (see Section 2.2 for definitions) are calculated for displacement vectors 6 = (x,y) where 0 < x < 10, 0 < y < 10. Regular dots Consider a surface textured with dots (with, t h e o r e t i c a l l y , no size) arranged on the vertices of an ( i n v i s i b l e ) square g r i d . The surface i s then t i l t e d back by = ir/4 about the horizontal axis of the g r i d (which i s also the picture plane horizontal). This rotated surface i s then 'imaged1. Analysis of the second order s t a t i s t i c s for t h i s image indicates that p J ( l , l ) w i l l have high values for d's i n the horizontal d i r e c t i o n having lengths equal to multiples of the interpoint distance. High values for P r i ( l , l ) w i l l also occur for d's i n the v e r t i c a l d i r e c t i o n and having lengths equal to cos(jrf) times the interpoint distance. For d's of other dir e c t i o n s , Po^(l,l) w i l l y i e l d high values for those lengths of d' that correspond to distances between 43 points having the same direction of cf between them (see Figure f3.2.1). The plot of Pcf(l,l) for cf = k(l,0) where k=l,2,3,... ,10 peaks at cfs corresponding to multiples of the interpoint distance. The plot of Pcf(l,l) for cf = k(0,l) where k is as above, while somewhat muddled, gave recognizable peaks at cos(fi) times the interpoint distance as was expected. Thus could be determined to be: length (cf) at 1 s t peak of Pc»(l,l) fi = k(l,0) = cos-1  length(d) at 1 s t peak of PcJ(l,l) 6 = k(0,0) Note that i f this technique were to be used on an image of a surface with unkown slant, we would have to be sure to "pick" the cfs such that they f a l l on adjacent points, otherwise we could not determine the size of the squares of the grid. The problem would be alleviated i f the grid were "filled in", that is, if we had a grid of lines as well as vertices. In this case we could assume that a polar plot of length (cf) for smallest peaks would give a square and we could then f i t our data accordingly. Note that we are actually looking at a grid and that "connecting the dots" as in (Schatz,1977,1978) would not change our result. In fact, the dominant characteristic here seems to be the structure of the grid, rather than the statistics such a grid presents. Circles The surface here consists of rings arranged on the vertices of a square grid. The circles are small enough that no overlap occurs. Again, the surface is slanted by jz» = nr/4. My collegues (actually barely willing fellow graduate students) and I had no difficulty seeing the surface slant ourselves (human perception), however, an analysis of 2no" 44 Examples of deltas Figure f3.2.1 45 order s t a t i s t i c s for the resulting d i g i t i z e d image proved very messy. The s t a t i s t i c s of just a single c i r c l e are shown i n Figure f3.2.2 below. I f there were no other contributors to P ^ ( l , l ) when |o^ |<d_iam c i r c l e and diam c i r c l e i s s u f f i c i e n t l y large for the plot of Pd'ci/l) to be recognizable, t h i s method would pick out the "diameter" of the c i r c l e i n any d i r e c t i o n . Since c i r c l e s are deformed into e l l i p s e s by the surface slant, we could measure the major(a) and minor(b) axis and set = cos-1(b/a). We could also say (correctly) that the surface i s t i l t e d about the major axis. Unfortunately, neither of our conditions are met, i n general. F i r s t o f f , since the minimum i n t e r c i r c l e distance i s less than diam c i r c l e for most textures (and i n p a r t i c u l a r for our test case) involving " c i r c l e s " (e.g. Corkboard, concrete walkways), these i n t e r c i r c l e pairs of points also contribute to P r i ( l , l ) in the region of interest (see Figure f3.2.3 below). D i g i t i z a t i o n attributes (the top, bottom, and sides of the c i r c l e s are flattened) also crept i n to deform our experimental plots (see Figure f3.2.4). The implementation v e r i f i e d t h i s analysis and c i r c l e diameters did not seem to be detectable without p r i o r knowledge of the arrangement. Again we find that structural properties dominate. This makes sense to some extent since our micro-structures (circles) are highly structured objects indeed. Perhaps not so obvious i s the effect of the arrangement, or macro-structure, on the second order s t a t i s t i c s . I t i s the macro-structure that i s muddying the plot of P6(1,1) by introducing the i n t e r c i r c l e distances into the i n t r a c i r c l e part of the p l o t . Yet i t was precisely the macro-structure of the regular dot texture that allowed us to say anything at a l l about the surface slant for that pattern (as dots do not have a useful micro-structure). In t h i s case at l e a s t , a s t r u c t u r a l analysis of some sort appears to be necessary. 46 47 Intercircle deltas Figure f3.2.3 D i g i t i z a t i o n o f a c i r c l e F i g u r e £2.2A 49 Other Micro-structures C i r c l e s were chosen i n the previous example for t h e i r s i m p l i c i t y . They are r o t a t i o n a l l y invariant under Po^(l,l) and d i r e c t i o n a l measurements on them are e a s i l y described due to the simple polar equation of a c i r c l e (r = c ) . More complex regular shapes (e.g. squares) are not r o t a t i o n a l l y invariant under P r i ( l , l ) in that Figure f3.2.5a w i l l y i e l d a lower value (0) for P r i ( l , l ) where d' = (d,0) than Figure f3.2.5b. I t i s now the case that i n t e r - t e x e l distances depend not only the macro-structure of the texture but also on the rotation of the micro-structures (which, i t i s true, may be construed to be a part of the macro-structure) as shown i n Figure f3.2.6 below. I t i s clear that second order s t a t i s t i c s quickly become u n i n t e l l i g i b l e i n t h i s case. I n t u i t i v e l y , however, squares, c i r c l e s and other regular shapes are i d e a l l y suited for detection of surface slant as c i r c l e s are deformed into e l l i p s e s and squares into parallelograms i n a c l e a r l y specified way. Uniform Dots Consider, now, a surface textured with dots randomly distributed over the plane (e.g., a high-contrast stucco w a l l ) . Suppose that the surface i s again t i l t e d by an angle away from the picture plane i n a d i r e c t i o n a. One would expect the dots to be "compressed" in the image of t h i s surface, and t h i s i s indeed the case. But t h i s observation i s misleading. Suppose the d i g i t i z e d image, I, has a resolution of N*N and that the dots are now uniformly distributed i n the Effect of rotation on co-occurrence matrices Figure f3.2.5 v V, V S v. -v-y -> Effect of rotation on inter-texel distances Figure f3.2.6 52 discrete case. The average density (# dark pi x e l s / N*N) i s higher i n the image I than i n an image of the u n t i l t e d surface. But, can we use the knowledge of t h i s compression to detect the surface slant from I? Equivalently, i s there a d i r e c t i o n a l dependence detectable i n the s t a t i s t i c s of the projected image? Suprisingly, the answer i s "no". Theorem 3.2.1: In an orthographic image of a surface textured by dots positioned randomly and uniformly there i s no d i r e c t i o n a l dependence i n the co-occurence matrices. Proof: Consider a square section of the surface of dimensions a x a aligned with the scene axes X and Y. Suppose the prob a b i l i t y density function (p.d.f) of the dot d i s t r i b u t i o n on the surface i s f(x,y) = g(x)g(y) where g(s) = 1/a for 0 < s < a. In the image, the p.d.f of the dot d i s t r i b u t i o n w i l L be f'(x,y) = g(x)h(y) where g(x) i s as above and h(y) = 1/acos^ for 0 < y < acostf The p.d.f. F 1 can be regarded as a constant, uniform p.d.f. for independently sampled points over the rectangle i n the image of dimensions a x acos)Z$. I f f' (x,y) i s re-expressed i n polar co-ordinates r,6 then i t i s c l e a r l y a function of r alone. Hence there can be no di r e c t i o n a l dependence i n the co-occurence matrices. Q.E.D. As i t seems reasonable to assume that d i r e c t i o n a l i t y i s necessary for the detection of surface sl a n t , i t i s concluded that i t i s not 53 possible to determine the slant of t h i s surface from second order s t a t i s t i c s of the texture. I t should be pointed out, however, that the amount of s l a n t , 0 , can be determined from the image i f the parameters of the probabilty density function are known. The theorem indicates that a plot of P<$(1,1) would be f l a t for such an image. This has been experimentally v e r i f i e d using the implementation mentioned. Conclusions The results obtained from the implementation indicate that second order s t a t i s t i c s on structural (in the micro sense) textures quickly become muddled and u n i n t e l l i g i b l e . This i s not too suprising, as i t seems appropriate to use structural techniques on structural textures and s t a t i s t i c a l techniques on " s t a t i s t i c a l " textures. However, we have seen that for t r u l y random texture, second order s t a t i s t i c s do not help us at a l l ! Furthermore, second order s t a t i s t i c s seem to apply best to regular dots. But here, the dots are arranged in a pattern, or structure. Again, structural techniques are appropriate, t h i s time for analysis of the macro-structure of the texture. In t h i s case the micro-structures are simply dots and t h e i r analysis need not concern us here. The shortcomings of s t a t i s t i c a l analysis strongly suggest the use of s t r u c t u r a l analysis, and t h i s w i l l be examined i n the next two sections. 54 3.3 STRUCTURAL APPROACHES: THE MICRO-STRUCTURE Consider a surface textured by placing copies of a given micro-structure over i t . We w i l l use the term micro-structure to refer to an abstract d e f i n i t i o n of shape and s i z e . These are the patterns out of which to construct a texture. We w i l l use the term texel to refer to a texture element i n the image. Thus, the image of a copy of a micro-structure that has been placed on a surface i s a t e x e l . We w i l l assume that the micro-structure i s not altered i n any way (other than rotation and translation) and we w i l l not concern ourselves with the d e t a i l s of the macro-structure (the pattern i n which the micro-structures are placed). The purpose of t h i s section, then, i s to examine the problem of surface slant detection from texel analysis. Consider Figure f3.3.1 below. I n t u i t i v e l y we would say that the figure represents a cube (or corner), each surface of which has been textured with square micro-structures. Furthermore, we would p a r t i t i o n the image into regions corresponding to the three v i s i b l e surfaces of the cube. We would also say that the texels i n any given region are squashed i n the d i r e c t i o n of the surface's gradient vector. We make these observations even though we do not know, a p r i o r i , that the micro-structure i s a square, neither do we know i t s s i z e , and we do so i n the presence of a f a i r amount of noise ( a l l the micro-structures would have mapped into parallelograms i n a noiseless image). Any shape-from-texture system should be just as v e r s a t i l e . The following algorithms attempt to embody some of the i n t u i t i v e notions mentioned above. I t should be mentioned, however, that they do not detect the shape of the true micro-structure. For example, for Figure f3.3.1, they do not recognize the texels as slanted squares. o ° ^ ° o ° O o O 0 o ° a o 0 0 o ° o *> * V * o * * * * * * * * * * * * a /7 0 o Q 4 0 a o 0 "cube" textured with squares Figure f3.3.1 56 The Single Surface Before we act u a l l y tackle an image l i k e Figure f3.3.1, l e t us f i r s t look at some simpler cases and work our way up. Consider the image of Figure f3.3.2. The scene represented here i s a single surface textured with randomly rotated straight l i n e segments of a given length. This surface has been rotated (slanted) by an angle about a l i n e at an angle c n from the X-axis. That i s , i t s gradient vector i s of length tan(jzS) and has d i r e c t i o n a. Before setting about analysing t h i s image, l e t us set down a few assumptions (in addition to our usual ones). These assumptions w i l l be retained for the duration of t h i s section (3.3). -there i s only one micro-structure used. -the surface i s textured by placing copies of the micro-structure rotated to make a l l orientations equally l i k e l y over i t . Note that no assumptions are made as to where these copies are placed. For example, the placement may be random or orderly. -the image (of the scene) i s pre-processable to y i e l d a l i s t of texels ( s u f f i c i e n t l y well represented, by a l i s t of vertex coordinates, for example). -the image has been so pre-processed. One Diagonal For each texel (l i n e segment, i n t h i s case) l e t us measure i t s (Euclidean) length, L, and i t s angle of orientation, 0, from the x-axis (-pi/2 < 0 < pi / 2 ) . Let us refer to t h i s set of (L,0) pairs as the L-0-l i s t . Since each micro-structure copy i s randomly rotated before being / \ / \ \ \ \ \ \ / _ / I \ 1 I Tilted surface textured with line segments Figure f3.3.2 58 placed on the surface, we can now appeal to theorem 3.1.1 to see that the measured data w i l l l i e roughly on an e l l i p s e . The image, of course, i s not a perfect orthographic projection, but i t i s a good approximation to one within a noise factor (caused by d i g i t i z a t i o n ) . The data, therefore, a c t u a l l y represents a noisy e l l i p s e . A polar plot of t h i s data taken of Figure f3.3.2 i s shown i n Figure f3.3.3. Our task i s now reduced to finding the e l l i p s e factors a, b, and c n from the data, L-0-l i s t . Since we have determined that those l i n e segments making an angle 0 = a n with the x-axis are not foreshortened (theorem 3.1.1 again), we can reasonably estimate a n to be that 0 corresponding to the largest L in L - 0 - l i s t . This i s c l e a r l y a guess, the point may be very noisy and therefore i n error, but i t i s hoped that i t i s accurate enough (and i t does, a f t e r a l l , save us some work). A more noise resistant approach would be to treat a as a parameter i n the least-squares f i t t i n g process. We would, however, l i k e the function that we are trying to f i t the data to be l i n e a r with respect to i t s parameters (see the appendix to t h i s section). This i s not the case for the function that we have decided to use. To make o n a l i n e a r parameter then, i t w i l l be necessary to factor out a n and then find a s i m i l a r function which does depend l i n e a r l y on a l l three parameters, a, b, and a n . Let us now think of the data, L - 9 - l i s t , as a function of 6 (each pair i s a mapping of some 0 to some L). Now assume that the pr o b a b i l i t y density function of the v a r i a t i o n i n L i s a symmetric function about a mean of 0. We can now determine the factors a and b by f i t t i n g the data to an e l l i p s e with a least-squares approximation (Conte and de Boor, 1965; pg. 241-246). The d e t a i l s may be found i n the appendix to t h i s section. We are now nearly done. Appealing yet again to theorem 3.1.1, we see that 59 -j * * •% t * 1 * » * « i i i I S . 0 0 0 4-4 4 4 Polar plot of L - 6-list taken of Figure f3.3.2 Figure f3.3.3 60 = cos— L (minor-axis/major-axis) where minor-axis = MIN(a,b) and major-axis = MAX(a,b). Also, o n has already been determined. Thus the gradient vector (see Chapter 1) of the surface can be determined to be: p = -tan^ s i n a n p = +tanjrf s i n a n or q = +tanjtf cosa n q = -tanjzJ cosa n This algorithm would be of l i t t l e interest i f line-segments were the only micro-structures that we could handle. However, i f our micro-structure had some consistently i d e n t i f i a b l e l i n e s , then we could work with these l i n e s only. That i s , i f l i n e s of some given length i n the micro-structure were marked i n a way (coloured, highlighted, topologically unigue, etc.) that allows determination of the corresponding length i n any given t e x e l , then we could treat t h i s l i n e as i f i t were a single line-segment micro-structure. One way of ensuring that we have an i d e n t i f i a b l e l i n e i s to require our micro-structure to have a s u f f i c i e n t l y long diagonal ( s . l . d . ) . We s h a l l say that d i s a s u f f i c i e n t l y long diagonal (for the image i n question) i f for a l l other diagonals dj (we w i l l use the term diagonal to refer to both true diagonals and edges) in the micro-structure, length(d) = length(dj) or length (d) * cos(jzS) > length (dj) where i s the angle of t i l t (with respect to the picture plane) of the surface. The idea here i s to make sure that we have a diagonal that w i l l not be foreshortened "too much". That i s , we would l i k e i t s image to be i d e n t i f i a b l e by i t s length. Note that a s. l . d . i s either unique or, i f i t i s not, the same length as a l l other s.l.d.s. The i d e n t i f i a b i l i t y of a s u f f i c i e n t l y long diagonal depends on the angle of 61 t i l t , tf. Hence, a s.l. d . i s a weaker concept than a uniquely i d e n t i f i a b l e length ( u . i . l . ) since t h i s dependence i s not required. Indeed, other usable u . i . l . s such as topologically unique l i n e s do not suffer from t h i s r e s t r i c t i o n . Consider the class of micro-structures that have precisely one s u f f i c i e n t l y long diagonal. Examples of such micro-structures (for some images) appear i n Figure f3.3.4 below. Note that the micro-structures need not be convex, polygonal, closed, nor even contain any l i n e segments at a l l . To go any further, i t i s necessary to f i r s t prove the following theorem for texels containing one or more s u f f i c i e n t l y long diagonals. This theorem gives the reason that the diagonal i s detectable. Theorem 3.3.1: The longest diagonal of a texel corresponds to one of the s u f f i c i e n t l y long diagonals of the micro-structure. Proof: Without loss of generality, l e t the surface be t i l t e d about the X-axis by an angle jzJ (Figure f3.1.1). Then any diagonal of length L and dir e c t i o n 6 i n the micro-structure w i l l be mapped into a diagonal of length L'2 = L 2sin 2ecos 2(z$ + L2cos2© = L 2(sin 20cos 2^ + cos 20) Thus L 1 i s longest when 0 = 0 or © = i r (L'= L) and i s shortest when 0 = ir/2 or © = 3ir/2 (L 1 = Lcosjz)'). But since a s u f f i c e n t l y long diagonal i s such that LCOSJZJ" > for a l l other L-_, we see that L' at i t s shortest i s s t i l l longer than L-_* at i t s longest. The theorem follows immediately. So, for the pa r t i c u l a r case of precisely one s u f f i c i e n t l y long diagonal, we know that the longest diagonal of each texel i s the image 62 Examples of micro-structures with exactly one sufficiently long diagonal Figure f3.3.4 63 of the s u f f i c i e n t l y long diagonal of the micro-structure. This means that we can e f f e c t i v e l y ignore a l l of the points of any texel except those belonging to the longest diagonal, and i n doing so, ignore a l l of the diagonals of the micro-structure except the s u f f i c i e n t l y long one. Since the micro-structure i s randomly rotated before being placed on the surface, so i s the s u f f i c i e n t l y long diagonal. I f we then measure and record the length, L, and the d i r e c t i o n , 0, of each texel's longest diagonal, we w i l l have a l i s t , L - 0 - l i s t , s a t i s f y i n g a l l of the assumptions of theorem 3.1.1. We can then extract and a n from L-0-l i s t i n exactly the way we did for the single line-segment case. To summarize b r i e f l y , an algorithm has been developed which extracts surface slant information from images of surfaces textured with 1) "Dipoles" or micro-structures consisting of just a single l i n e -segment 2) "Thick dipoles" or micro-structures containing precisely one s u f f i c i e n t l y long diagonal. Furthermore, we note that 1) i s just a special case of 2). To allow treatment of a more natural class of micro-structures, we now relax our r e s t r i c t i o n on them to include micro-structures containing one or more s u f f i c i e n t l y long diagonal(s). One or More Diagonals Some examples of such micro-structures are shown i n Figure f3.3.5. The s i t u a t i o n i s now somewhat more d i f f i c u l t . The d i f f i c u l t y arises primarily because the longest diagonal of a texel may now be the image of any one of the s u f f i c i e n t l y long diagonals of the micro-structure. 64 Figure f3.3.5 65 This affects us i n the following way. Theorem 3.3.2: Let and 62 be two of the s u f f i c i e n t l y long diagonals of the micro-structure. Let the angle 1 between them (in the micro-structure) be p. I f d i s the longest diagonal of a t e x e l , then the angle between d and the major axis of the e l l i p s e i s less than u/2. Proof: See Figure f3.3.6. Since dj and d2 are d i s t i n c t , p < it*. Without loss of generality, l e t the surface be t i l t e d about the x-axis. That i s , the major axis of the e l l i p s e i s the X-axis. Let the angle between 61 and the X-axis be 0 a. Also without loss of generality, l e t 6 a be less than or equal to u/2 ( i f t h i s i s not true for d]_, i t w i l l be for 62)- Then d must be the image of dj since the image of d j , d]/, w i l l be at least as long as the image of 62, $2'• L e t e2 ^ e t n e angle between d2 and the X-axis. Then Id-.' I = x 2 + y 2 = L 2 c o s 2 0 a + L2sin2©acos2jr> |d 2 ' l = L 2cos 202 + L 2sin 292Cos 2j?) Hence ^ ' l 2 < Id]/I2 since ea < 62 Note that |dj_'| = |d2'l only when © a = ©2 --n which case i t can be assumed that d i s the image of d^ without loss of generality. Referring back to equation (3) of Theorem 3.3.1, we know that tan© = tan©acos^ So tan© < tan(p/2)cosjz>" since © a < u/2 < ir/2 tan© < tan (u/2) since 0 < jz5 < ir/2 © < u/2 since 0 < ©, p/2 < ff/2 Q.E.D. and d2 are considered to be l i n e s rather than vectors. The "angle between them" i s considered to be the major angle of th e i r intersection (unless otherwise noted). 66' Figure for theorem f3.3.2 Figure f3.3.6 67 This i s e a s i l y generalized to more than two diagonals. Note also that u/2 i s not the tightest bound, but i t i s our purpose merely to show that such a r e s t r i c t i o n e x i s t s . From theorems 3.3.1 and 3.1.1 we know that a l l of the data points i n L - 0 - l i s t s t i l l l i e on an e l l i p s e , but theorem 3.3.2 shows that not a l l of the e l l i p s e w i l l be f a i r l y represented. In f a c t , there may be no points at a l l at the minor axis of the e l l i p s e . Figure f3.3.7 shows an image of a surface textured with squares (which have two s u f f i c i e n t l y long diagonals). A polar plot of the L-0-l i s t information for t h i s image i s shown i n Figure f3.3.8 i l l u s t r a t i n g the theorem. We w i l l , however, ignore t h i s gap i n our data and hope that enough data are available to accurately f i t an e l l i p s e to what we do have. I t should now be clear why we have decided to use a l e a s t -squares, trigonometric approximation, which i s computationally a b i t expensive (especially when using LISP), rather than a computationally inexpensive smoothing of data by averaging, for example. The l e a s t -squares method makes s t a t i s t i c a l assumptions only about the noise. A smoothing-by-averaging method, on the other hand, also assumes that the pr o b a b i l i t y of any given point on the e l l i p s e being represented i s equal to that of any other point. Theorem 3.3.2 shows us that t h i s i s not a wise assumption to make. I t should be noted that as the number of s u f f i c i e n t l y long diagonals of the micro-structure increases, the amount of e l l i p s e represented by the L - 0 - l i s t decreases. This leads to the paradoxical case of the micro-structure with i n f i n i t e l y many diagonals, a c i r c l e . For t h i s case, the algorithm f a l l s apart completely. The only point i n the L - 0 - l i s t for t h i s image i s the point at the end of the ° ° o o- • o o o o o o o 0 • o o o o a ° ° a a o O o o o0<>o0 o o o o ° o D P O °o a9 - ? • O d d Tilted surface textured with squares Figure f3.3 . 7 69 • * 1 * * * A m * * t t * » I- . t * * * * •» •» » 4 * V •# *• i — * - • »- * * •* *—V : •» i . . . • * * 21.021 •» * *• * * *• * * * * Polar-plot of L-6-list taken of Figure f3.3.7 Figure f3.3.8 70 major axis i n the range (-rf/2 to Tf/2) and yet we can e a s i l y determine the surface slant simply by f i t t i n g an e l l i p s e to any given texel (which w i l l be an e l l i p s e ) and analysing t h i s f i t i n the same way as we have been doing. The algorithm can be made to handle t h i s case simply by doing exactly that when the L - 0 - l i s t represents only one point on the e l l i p s e . Nevertheless, i t i s unpleasant to have to deal with a special case i n t h i s manner. Single surface results Pages 71-72 show two sample runs of the algorithm on some representative synthesized images. Given also are the actual values of and a n for comparison. More than One Surface In the beginning of t h i s section, i t was hinted that several surfaces (actually the image regions corresponding to these surfaces) could be segmented on the basis of t h e i r apparent texture. As was mentioned i n Chapter 1, such segmentation i s assumed. Nevertheless, to see i f i t i s possible, t h i s problem i s examined for t h i s texture class. Since the L - 0 - l i s t data of any given region corresponding to a surface l i e on an e l l i p s e unique to the surface, we may reasonably suggest that t h i s data may be used for the region detection task. Figure f3.3.9 shows an image of a cube made up of surfaces l i k e Figure f3.3.2. A polar plot of the L - 0 - l i s t data for t h i s image i s shown i n Figure f3.3.10 which can be seen as three superimposed e l l i p s e s . For t h i s image, i t i s conceivable to detect the e l l i p s e s i n i t s L - 0 - l i s t . One way to detect these e l l i p s e s i s with a Hough transform (Hough,1962) that * (DISKIN grad) > 30 FORMS READ FROM: grad > * (gradient (image> * * * * * * * * * * * * * * J N P U l S U R F A C E * * * n , • O o . o o o o o o o " A C E o o > Input surface ok? * T > This surface is tilted by an angle phi = ±0.474918 > about a line an angle sigma_n = -.444419 from the X-axis > Therefore, its gradient vector is given by > p = -0.221058 or p = 0.221058 > q =0.464222 q = 0.464222 > The actual value of sigma__n = -.523598 > The actual value of phi = ±.523598 72 * (gradient (image)) * * * * * * _ - / * N -* * * * * / \ ..... x \ \ \ \ / * * * * * / \ I > Input surface ok? Input surface > This surface i s t i l t e d by an angle phi = ±0.767956 > about a l i n e an angle sigma_n = 0.571337 from the X-axis > Therefore, i t s gradient vector i s given by > p = 0.522215 or p = -0.522215 > q = 0.812335 q = -0.812335 > The actual value of sigma_n = 0.523598 > The actual value of phi = ±0.785398 — / / — - / \ _ / \ x r \ \ / / \ I / ^ i IT / i i ^ / / / / \! -' ' ' / 7 N. / " c u b e " t e x t u r e d w i t h l i n e s e g m e n t s F i g u r e f3.3.9 74 -» •» -* •i * * * * * *-* *• X *-S * * •• S- * * t * 4 * * -i * 1 5 . 1 3 3 Polar of the L-6-list taken of Figure f3.3.9 Figure f3.3.10 75 maps e l l i p s e s into points i n some transform space. One need then simply look for accumulation points i n the transform space. However, two problems immediately become apparent. The f i r s t i s one of computational complexity. I f we assume that we have no information about the e l l i p s e s , except that they a l l have t h e i r origins at the co-ordinate o r i g i n , then the transform space must necessarily be three-dimensional (since there are three remaining parameters). The second problem i s much more f a t a l and persistent. I f the e l l i p s e were noiseless, t h i s Hough transform technique would work w e l l . As i t i s , we have found, experimentally, that the transform i s very sensitive to noise, and gives unexpected results. This second problem persists even i f we t r y to eliminate the f i r s t . We can do t h i s by noting that, since the same micro-structure i s used over the entire image, the length of the major axis w i l l be the same (and equal to the length of the micro-structure's s u f f i c i e n t l y long diagonal) for each e l l i p s e . We can estimate t h i s length to be the maximum L value found i n the L - 0 - l i s t . Hence, i f the e l l i p s e s are parameterized i n terms of the lengths of the major- and minor-axis and the rotation of the major-axis, o n, a two-dimensional transform space can now be used. Let us further assume that a has somehow been cor r e c t l y determined as w e l l . The transform space i s now one-dimensional and, i f a l l were well i n the world, i t would now be a simple histogram problem. Nevertheless, i t i s easy to see why the second problem i s s t i l l present by looking at the following analysis. The transform w i l l 76 consist of evaluating the minor-axis length for each a n . Let us assume that we only have one e l l i p s e , then l e t us see i f we can detect i t . Consider those points near the ends of the major axis. Note that the lengths of a l l of these w i l l be less than the length of the major axis (as they should be) but due to the selection of the major axis length and the noise i n the points themselves, some lengths w i l l be much smaller. I t can be seen that, for these points, very small values of the minor-axis w i l l be obtained. Note that these are the correct values for these points; they do indeed l i e on very eccentric e l l i p s e s and not on the e l l i p s e we would l i k e to detect. Figure f3.3.11 i l l u s t r a t e s t h i s . We could t r y to correct t h i s problem by finding a better estimate for the major-axis length and by smoothing the data (we w i l l have to be careful that the smoothing algorithm does not t r y to smooth the e l l i p s e into a c i r c l e as averaging smoothers w i l l ) . We did not have much success with any of these solutions, however, so we decided that another approach i s c l e a r l y i n order. "We a l l have our moments"anonymous Fortunately, texture discrimination i s an old problem and much work has already been done on i t (see Chapter 2). As was mentioned i n the l a s t chapter, however, most of the work has been concerned with distinguishing d i f f e r e n t textures on the same surface (such as wheat f i e l d s from grass f i e l d s i n an a e r i a l image). The task considered i n t h i s thesis i s better expressed as distinguishing d i f f e r e n t surfaces that have the same (actual) texture on them. Before proceeding, l e t us re-emphasize that, although we are using the term "surface", we are act u a l l y practicing region detection at t h i s Ellipse detected for noisy points Figure f3.3.11 78 stage of the analysis. Since the surface i s part of the scene (and not part of the image) i t w i l l be maintained that a surface has not been detected u n t i l i t s three-dimensional attributes (such as surface slant) can be described. Several researchers (Tomita et a l , 1973) (Wang et a l , 1979) have found that the second order moments of i n e r t i a of the texels form a good feature space for surface detection of t h i s kind. We can see why t h i s should be so by considering the case where the micro-structures are a l l c i r c l e s . Then the texels w i l l a l l be e l l i p s e s whose major axis i s i n the d i r e c t i o n of the l i n e that the surface has been t i l t e d about and whose minor axis i s i n the d i r e c t i o n of the surface's gradient vector. In such a case, the minimum angular i n e r t i a of a l l the texels on a given region w i l l be that taken about the major axis. The domain of micro-structures for t h i s section may be thought of as polygons approximating c i r c l e s . So, rather than getting unique points i n the moment space for each class of texels on a region, s l i g h t l y scattered clusters i n the space are obtained. I f a texel i s described as a l i s t of points ( ( x l r Y l ) (X2'Y2) ••• ( xn*yn)) which are the co-ordinates of the vertices within the t e x e l , then the moments of these vertices are given by: n Ixx = d/n) 2 ( Y i - y m ) 2 i = l n I w = (1/n) 2 (xi - x m ) 2 i=l n Ixy = d/n) 2 (*i - % ) (yi - y m) i=l where (x m,y m) are the co-ordinates of the vertices' centre of mass. For a more general d e f i n i t i o n see (Wang et al,1979). In Chapter 2, in the 79 section on structural analysis, we referred to the i j t n order moments, M i j . These are given by n Mij = l / n l ( x k - x m ) i ( y k y m ) J k=l Thus Mu = I x y M02 = Jxx M20 = lyy So, assume that we these moments have been computed for every texel i n the image. The task now i s to pick out the regions i n the image corresponding to the surfaces. Rather than constructing one-dimensional histograms of each moment (Tomita et al,1973), or computing the ec c e n t r i c i t y and d i r e c t i o n of each texel (Wang et a l , 1979), l e t us simply view the set of computed moments as points i n the three-space, I x x X Iyy X I X y , and then look for clusters i n t h i s space. Note that, i n so doing, we are making a l l of the usual clustering assumptions. One of these i s that the feature space w i l l indeed produce clusters separated by some threshold. In t h i s case, the metric (distance function) i s simply Euclidean distance, and the two works previously referenced give us reason to believe that the feature space i s useful. We are also relying heavily on our previous assumptions and our notion of texture. For example, the scene could conceivably be composed of as many surfaces as texels with one texel to a surface. This case c l e a r l y contradicts the d e f i n i t i o n of texture outlined i n Chapters 1 and 2. And of course, the assumption i s s t i l l made that only one micro-structure i s used. Note that i f the scene consists of two d i s t i n c t (in space) but i d e n t i c a l l y slanted surfaces, the clustering algorithm w i l l only see one surface. To ensure that segmentation occurs 80 in a meaningful way, i t w i l l be necessary to go back to the image and v e r i f y that the detected clus t e r indeed defines a single region. I f not, further processing w i l l have to take place. The discussion of these problems i s not within the scope of t h i s paper and t h i s implementation does not tackle them. The problem does e x i s t , however, and any complete scene analysis system w i l l need to apply i t s e l f to i t . ' Keeping i n mind that these assumptions have been made, i t i s now possible to set about implementing a cluster analysis algorithm. Any algorithm, such as one of those described i n (Duda and Hart, 1973) or (Tou and Gonzalesz, 1974), would do here, since we have found the clusters to be widely separated. In the implementation, a simple algorithm known as the maximin-distance algorithm (Tou and Gonzalesz, 1974) i s used. This served well for purposes of demonstration. For a large scene analysis system, however, a more dependable graph-theoretic algorithm, such as one of those described i n (Zahn, 1971) i s recommended. Figure f3.3.12 shows a plot of Iyy versus I x v taken of Figure f3.3.1 showing the good separation obtained even for t h i s two-dimensional case. An example of the algorithm at work i s shown i n Figure f3.3.13. Now that the image i s divided up into regions, i t i s necessary only to determine the slant of each surface. This can be done one surface at a time since the techniques for handling the single surface case have already been described. An example of the entire process at work i s shown on pages 86 to 89. 81 yy 233.$ 00 * * *• » * • jajL LQJL * * 4 * 4 * * * * » * % xy -120.000 122.100 T w v.s. I w taken of Figure f3.3.1 Figure f3.3.12 a ° O a a ° O a O a /7 * * **** *& 0 O 0 4 0 0 THIS IS THE INPUT }HAGE Application of the clustering algorithm Figure f3.3.13 a ° O o o ° ° 0 o & o o o ° a o o DETECTED SURFACE Application of the clustering algorithm Figure f3.3.13 <?0 a o & # 0 O Q 0 Q 0 DETECTED SURFACE Application of the clustering algorithm Figure f3.3.13 •0-DETECTED SURFACE Application of the clustering algorithm Figure f3.3.13 86 * (grad i) * * * * H— * T _ + ~ + ~ + _ * ++ _+ =- T- _ * + _ + + + ++ T ~ -+ -* _=- TTT- +-+ _ _-_ _=+ T T +-= * +T T=_ T H—+ - T ++ * + - -_ —_T _-_ T+T -_- TT_ * TT + + _ ++ _ +=_ _T+_T T * T4- T++ T+ - H—+ ++ + +T++ T * -=_+-_TTT+ =+ ++ T + +_TT+= ++++ T+=-T * +-H-TT TT += ++ T T += T+ ++ + +T ++ * -TTTT ++ T TT =- +T ++ +TT — * T- _-_ ++ _~ T_+ ~ +T 4+ T | ++ -* += +T- ++ T ++- +T4+ +T TT * -_- TT +T+ T+ * T+ +_T+T * T * This is the input image * > This is the input image ok? * T 87 * FREE SPACE EXPAND * * * * +- _ _ * T_+ + +_ * ++ _+ =- T- _ * + _ + + + + + T ~ -+ -* _=- TTT- +-+ _ _-_ _=+ T T +-= * T=_ T H—+ - T * ~_ —_ T _~_ T + T * + + _ + = _ * +-+ ' * T T * * * * * * * * Detected surface > Detected surface ok? * T > This surface is tilted by an angle phi= ±0.767009 > about a line an angle sigma_n = -3.964055E-3 from the X-axis > Therefore, its gradient vector is given by > p = 3.820867E-3 or p = -3.820867E-3 > q = 0.963873 q = -0.963873 > > > actual value of phi = ±fl/4 » ±0.785398 > > actual value of sigma n = 0 88 > > > > > > > > +T > + -> IT + + _ > T-t— T++ T+ -> -=_+-_TTT+ =+ -_ ++ > +++TT TT += ++ > -TTTT 4+ T _ TT > T- _-_ ++ _ T_+ > += +T- ++ T > -_- TT > T+ ^ _ > > Detected surface > > Detected surface ok? * T > This surface i s t i l t e d by an angle phi= ±1.021549 > about a l i n e an angle thetaO = -0.94459 from the X-axis > Therefore, i t s gradient vector i s given by > p = -1.323798 or p = 1.323798 > q = 0.95753 q = -0.95753 > > actual value of phi = ±rf/3 » ±1.047197 > > actual value of sigma_n = -tan - 1(SQRT(2)) * -0.955316 > 89 > > > > > > > > ++ > TT_ > _ T + _ T T > ++ + +T++ T > + +_TT+= ++-H- T+=-T > T T += T+ ++ + +T ++ > _=- +T ++ +TT — > +T 4+ T | ++ -> ++- +T++ +T TT > +T+ T+ > +_T+T > T > Detected surface > > Detected surface ok? * T > This surface is tilted by an angle phi= ±0.923183 > about a line an angle sigma_n = 0.868427 from the X-axis > Therefore, its gradient vector is given by > p = 1.00908 or p = -1.00908 > q = 0.854031 q = -0.854031 > > actual value of phi = ±ir/3 » ±1.047197 > actual value of sigma_n = tan-1(SQRT(2)) » 0.94459 > 90 Details of the implementation The process of detecting surface slant from texture i s composed of three major steps i n t h i s program: load the image, segment the image into regions corresponding to surfaces, and f i n a l l y compute the slant information for each surface. The "load image" module i s responsible for the pre-processing of the input image. The entire process i s i l l u s t r a t e d i n Figure f3.3.14. The purpose of the supervisor i s two-f o l d : i t must maintain the flow of control and i t must transform data structures to allow communication between modules. Flow of control i s simple. The image i s f i r s t loaded. This information i s then passed to the region detector. The supervisor then passes one region at a time to the gradient vector calculation routines. The real need for the supervisor i s ensuring that each module receives the necessary information. The data structures used are shown in Figure f3.3.15. From the flow of control diagram, we see that the maximin cluster algorithm, for example, requires a l i s t of moments as input data. Output would consist of a l i s t of c l u s t e r s , each cluster being a l i s t of moments. However, a l i s t of clustered moments i s of no help i n computing the gradient vectors since they are not used nor even needed for t h i s task. Therefore, the fourth element of the "moments" data structure i s a "texel" data structure. This element i s simply carried along (but otherwise ignored) by the maximin cluster algorithm. The supervisor then "extracts" the texels from each cluster to form a region. / < 1 Supervisor |< \ v I Length and | I Direction I I c a l c u l a t i o n | I L - 0 - l i s t I v I Least-squares I f i t to an I e l l i p s e I a,b,6 I v I E l l i p s e I parameters I to gradient I vector v I V (p,q) I I /->—/ I I image I v Compute | moments I I moments l i s t I v Maximin | cluster |-algorithm I I I \ < \ I I I image I I I I I Load image | I I clusters I I I ->--/ Flow of Control of the Micro-structural Analysis Algorithm Figure f3.3.14 92 I image I I texel texel . . . texel | I point point . . . point I I I x y I I moment-list I I I moments moments . . . moments I I I Ixx ^ yy ^ xy texel I I clusters I I I moment-list moment-list . . . moment-list | I L - 6 - l i s t I I I L-d L-d . . . L-d I I length 6 | Data Structures for the Micro-structural Analysis Algorithm Figure f3.3.15 93 Conclusions We have seen that surface slant can be computed with a structural analysis of the texels i n an image of the surface. This i s found to be possible under the assumptions that the surfaces i n the scene are planar, that exactly one micro-structure i s used, that the micro-structure i s randomly rotated before being placed on the surface, and that the (orthographic) image i s pre-processable to y i e l d a l i s t of texels. The t i t l e of the section i s "Structural Approaches", but perhaps the algorithm should be examined a l i t t l e more cl o s e l y to see just why i t should be considered s t r u c t u r a l . While i t would be incorrect to say that the technique i s s t a t i s t i c a l ( p r o b a b i l i t i e s or other s t a t i s t i c a l values are not calculated at any time), i t would be correct to point out that i t r e l i e s on statistical^assumptions. In p a r t i c u l a r , the findings of t h i s section agree with Schatz 1 (Schatz,1977) r e s t r i c t e d version of Julesz' (Julesz, 1973) hypothesis concerning texture discrimination. That i s , the data, L - 0 - l i s t , that i s used to calculate the surface's gradient vector corresponds to the second order s t a t i s t i c s used by Julesz and Schatz. We also r e s t r i c t e d ourselves to those points connected by edges or diagonals (lines and v i r t u a l l i n e s (Schatz,1977)). For d e t a i l s of the work by Julesz and by Schatz see Chapter 2. The difference between our approach and thei r approach i s our treatment of the L - 0 - l i s t data as a function, L = F(0). We are not interested i n how often a given point, (L,0), occurs, as s t a t i s t i c a l approaches are, rather we are interested i n the nature of the function, F. I t i s t h i s interest that makes our approach a structural one. 94 3.4 STRUCTURAL APPROACHES; THE MACRO-STRUCTURE In the l a s t section an algorithm was developed for detecting surface slant from an image which has been pre-processed to y i e l d a l i s t of texels, where each texel had a structure containing surface slant information. This pre-processing stage i s a v a l i d assumption for the textures examined i n the l a s t section, except, perhaps, for the unconnected micro-structures, but i t i s not possible for a l l images. For example, the square texels i n Figure f3.4.1 are not e a s i l y discernible ( i t would be necessary to compute intersections of l i n e s , e t c . ) , and the l i n e texels of figure f3.4.2 do not appear to be very useful. Indeed, the useful information i n both cases seems to l i e i n the macro-structure of the texture. In t h i s case, the texels can be viewed as being l i n e segments only. Many successful l i n e finding algorithms have been proposed ( S h i r a i , 1973) (Horn, 1971) (O'Gorman and Clowes, 1973) so the texel detection assumption now s i t s on a firmer foundation than i t did before. In the section, one technique, a modified Hough transform, for analyzing the macro-structure w i l l be examined. "G r i d - l i k e " Textures I t w i l l be assumed that the image i s of a single surface, relying on the texture discrimination work described i n Chapter 2 to do so. The micro-structures are r e s t r i c t e d to l i n e segments of a r b i t r a r y length. There may be several such micro-structures for any given image. These micro-structures are placed on the l i n e s of an i n v i s i b l e square g r i d of fixed s p a t i a l frequency. This frequency i s the same i n both directions of the g r i d . Both Figures f3.4.1 and f3.4.2 are examples of t h i s kind "complete" g r i d texture Figure f3.4.1 "line-segments" grid texture Figure f3.4.2 "boxes" grid texture Figure f3.4.3 98 of texture. Another example i s shown i n Figure f3.4.3. Note that t h i s c l a s s of textures i s very s i m i l a r to that of (Render, 1979) discussed i n Chapter 2.2. The texture examined i n t h i s t h e s i s i s d i f f e r e n t i n that a square g r i d and a f i x e d s p a t i a l frequency are reguired, which i s not the case f o r Render's textures. Render uses Hough transforms to e x p l o i t the phenomenon of converging p a r a l l e l l i n e s i n persepective images. Since t h i s t h e s i s i s r e s t r i c t e d to orthographic images t h i s feature cannot be r e l i e d on. Nevertheless, Hough transforms are so natural for t h i s domain of textures that other features can e a s i l y be exploited, as was hinted i n Chapter 2.2. The plan here i s to examine the image and determine the nature of the parallelogram texels as if they existed. That i s , the nature of the images of the square micro-structures of Figure f3.4.3. Of course, only an image of t h i s p a r t i c u l a r f i g u r e w i l l a c t u a l l y contain such texels; the proposed program w i l l be required to " h a l l u c i n a t e " texels f o r the other images considered. This w i l l be done by using information obtained from a rho-theta Hough transform of the image. Then, assuming that the micro-structure i s a square, the gradient of the surface can be determined by analysing the transformation from micro-structure to t e x e l . In preparation, the next few paragraphs w i l l develop the r e l a t i o n s necessary to do so. Note that the lengths of the sides of such texels are related i n some way to the normal distances between the l i n e segments. The d e t a i l s of t h i s r e l a t i o n are required because the normal distances mentioned are r e a d i l y measurable i n the Hough transform of the image. The lengths, |LjJ and 11-21, of the sides of a parallelogram are given by: 99 |L]J = Idjl/cosoi and |L 2 I = |d 2l/cosor 2 where Id^l i s the normal distance between sides of length | L 2 | |d 2| i s the normal distance between sides of length IL I^ 01 i s the angle between L4 and the normal to L 2 ( i . e . between L4 and dj) 0 2 i s the angle between L 2 and the normal to ( i . e . between L 2 and d 2) This i s demonstrated i n Figure f 3 . 4 . 4 . I t w i l l be seen that |dj| and i t s d i r e c t i o n , 0 j , can be computed from the Hough transform of the image. The goal i s to use these values to determine the shape of the texels and, from t h i s and the knowledge of the micro-structure shape, to determine the gradient of the surface. Lj_ i s normal to d 2 and L 2 i s normal to dj_. So, referring to Figure f 3 . 4 . 5 , the, directions of L i and L 2 , pi and p 2 respectively, are given by e2 + tf/2 i f e2 < 0 F i = e2 - ir/2 i f e2 > 0 and Q1 + ir/2 i f Bi < 0 F2 = 01 - ir/2 i f 0 i > 0 where 0j i s the d i r e c t i o n of d j . I t w i l l be seen shortly that d} and 0j are e a s i l y obtained from the Hough transform. I t i s , therefore, simple to calculate the cartesian co-ordinates of an "idealized" t e x e l , one The relationship between and d^ Figure f 3 . 4 . 4 101 7" //•I a Illustration of u^and u 2 Figure f 3 . 4 . 5 102 corner of which is at the origin, from the polar co-ordinates, (Li,pi) The columns of P are interpreted to be cartesian vectors representing the sides of the texel. So P will be viewed as the picture or image of the square micro-structure. Now, (Mackworth, 1974) has shown that, knowing the "true shape" of a "surface" (read "micro-structure"), one can deduce the gradient vector of the surface from its image (read "texel"). In this case the true shape is given by: and the apparent shape is given by P (S for square, P for picture). So the task of determining surface slant is accomplished by applying Mackworth"s algorithm to S and P. The details involved in extracting the measures mentioned will now be discussed. The application of theorem 3.4.1 is required to get the values of |L]_|, IL2I, pi, and p 2 since a texel may not exist. However, i t is s t i l l necessary to compute d^ and d 2, the normal-distance-between-parallel-lines vectors. Note first that i t is a property of the orthographic imaging process that line segments map into line segments and parallel line segments map into parallel line segments. This assures that the domain of discourse is not vacuous. The intention is to extract di and d 2 from a rho-theta Hough transform (see Chapter 2 for details of the transform). So each line and (L 2,p 2). If p 2 < pi, we can represent the texel by the matrix 103 segment is first parameterized in terms of rho and 0. In our implementation this step must actually be carried out since the input is a l i s t of line-segments represented as a l i s t of end points (the data structures used are described later in this section). However, if the input image is in the form of an array of pixels, then i t is conceivable that the line detection stage, assumed in this thesis, will be performed with the aid of a rho-theta Hough transform. See Chapter 2 for details of using the Hough transform as a line finder. In this case i t would not be necessary to reconvert the data into the endpoint l i s t representation used in this thesis. Input may simply be the accumulation points in rho-theta space, as this is precisely the information we want. So, having done this, the broken lines vanish and only solid lines result in the parameterization For example, a polar-plot of the parameterization of Figure f3.4.6 is shown in Figure f3.4.7. Notice that this transform is exactly that of the complete grid "underlying" Figure f3.4.6. Each set of parallel lines in the image contains identical, unique to the set, values of theta (0). It is now pointed out that the normal distance between lines is simply the difference, delta-rho, in the value, rho, between adjacent points in the transform. This task, then, is simple; group the parameterized points by 0 and, for each group, measure delta-rho. Then d^ and 62 are set to the the two delta-rhos and 0j and © 2 are set to the corresponding 0s from the transform. The angles u^ and U 2 are then set to the complements of © 2 and ©^ respectively. Then 104 • • • D D D ! n p • a a a D n S • a D D D Slanted surface textured with "boxes" grid Figure f3.4.6 1 0 5 T 6SI.UE2 Polar plot of Hough transform of Figure f3.4.6 Figure f3.4.7 i I 106 o i = Hi - e l and C 2 = H2 " e 2 (see theorem 3.4.1 and Figures f3.4.4 and f3.4.5 again) F i n a l l y , L]_ = d]_ / cos c?i L 2 = d 2 / cos a2 as dictated by theorem 3.4.1. The matrix P i s then constructed as described previously. Then using a specialized version of Mackworth's algorithm (see the appendix to t h i s section), the amount of s l a n t , jzJ, and the d i r e c t i o n of the gradient vector, a are calculated. Results This system has been implemented and tested on several images. We found the results to be somewhat more accurate than the results of the section on micro-structural analysis. In that section, the angles and an were obtained to within ±0.09 radians. The macro-structural analysis implementation obtained the angles tf and a to within ±0.01 radians. This difference i n accuracy may be attributed to the s t a t i s t i c a l nature of the algorithm of that section. No such s t a t i s t i c a l assumptions are made here (the assumption of "enough" texels i s not made nor are values such as crn guessed at.) Some sample runs are given on the following pages. The entire algorithm i s i l l u s t r a t e d i n figure f3.4.8. The supervisor serves the same purpose here as i n the l a s t section. For a noiseless image, the values of the 0s and delta-rhos for each group would be i d e n t i c a l . Due to noise, however, i t was found necessary to 107 calculate the value of theta for each group's centroicTand the average value of delta-rho. The data-structures used are illustrated in figure f3.4.9. Note that the "image" data structure is the same as in the last section, except that each texel contains only two points, the end-points of each line segment. Two sample runs are shown on pages 110-111. 108 / < 1 Supervisor |<-I calculate I I picture I I matrix P I P I v I Mackworth's | I Algorithm | I I output I V W,a) / >~/ V I \— cluster I / + \ I I I I V V —<—\ I find | | find I I average I I centroid I I delta I I theta I I rho I V I V -/ cluster I I cluster I I about I I e I I parameterization I I Parameterize I I l i n e segments I I I image I I Load image Flow of Control for the Macro-structural Analysis Algorithm figure f3.4.8 I image I I I texel texel . . . texel I I I point point | I I x y I I parameterization I I I rho-theta rho-theta . . . rho-theta | I I rho e I I cluster-pair I I I parameterization parameterization I I P I I I row row I I I x y I Data Structures for the Macro-structural Analysis Algorithm Figure f3.4.9 * (supervisor (clusters (parametrize (image> * THERE ARE 1 PLOTS IN THE FILE. * ENTER NUMBER OF PLOT TO BE DISPLAYED. * 1 * * VECR CONTAINS EIGENVECTORS * 0.710198 0.704001 * -0.704002 0.710199 * . * EVR CONTAINS EIGENVALUES * 4.417268 2.195629 * * * > The actual value of phi = ±1.047196 * > The actual value of sigma = -0.785398 * * (supervisor (clusters (parametrize (image> * THERE ARE 1 PLOTS IN THE FILE. * ENTER NUMBER OF PLOT TO BE DISPLAYED. * 1 * * VECR CONTAINS EIGENVECTORS * 0.881348 -0.472472 * 0.472467 0.881346 * EVR CONTAINS EIGENVALUES * 17.906174 14.963252 * * 1 p P P 1 p P P 1 p P P 1 p P P 1 p P P 1 p P P phi = ±0.581486 > * > sigma = 1.078706 p L-l p P P p. P P p P P p P P p P P n P P * > The actual value of phi = ±0.523598 * > The actual value of sigma = 0.942477 * 112 4. AN INTERPRETATION OF THE RESULTS 4.1 SUMMARY The previous chapter examined texture from three different points of view: - statistical analysis - micro-structural analysis - macro-structural analysis Section 3.1, the introduction to the chapter, defined "orthographic projection" and proved a theorem illustrating the foreshortening property of this projection. It was also noted that this property is not sufficient to distinguish between Necker reversals of a scene. Statistical approaches to detecting surface slant from texture measures were discussed in Section 3.2. These measures were in the form of 2x2 co-occurence matrices and were applied to images of regular dots and of circles. It was concluded that such a blind application of statistical techniques is not very useful to the problem of detecting surface slant, due to data from complex images that proved difficult to interpret. A theorem was then proved stating that there is no directionality in the co-occurence matrices computed for images of truly random textures. It was concluded from this theorem that co-occurrence matrices can not be used to determine surface slant from such images. Section 3.3 examined the analysis of the texels 1 structures. It was assumed that each texel contained a line (or virtual line) which is identifiable as the image of a given line (or virtual line) in the micro-structure. In the algorithm developed within this section, the 113 presence of a sufficiently long diagonal was required to assure an identifiable line in the texel. The use of the macro-structure of a texture was examined in Section 3.4. Grid-like textures were assumed; they were analysed by determining the shape of the texels as _if they existed. This was accomplished by applying a Hough transformation to the image and retrieving the needed information from the resulting transformed image. Then, assuming the micro-structure to be a square, the gradient of the surface was computed with an algorithm due to Mackworth. 4.2 FALLACIES, OR WHY 3.2, 3.3, AND 3.4 ALL START WITH 3 At first glance, these 3 approaches are quite independent of one another. A closer look reveals marked similarities in the underlying assumptions. The Fallacy of the Statistical-Structural Dichotomy The texture of randomly, uniformly distributed dots considered in theorem 3.2.1 (the non-directionality of random textures) can be described as one having a micro-structure consisting of a dot of negligible size and a macro-structure defined as randomly distributed micro-structures on a surface. In light of Sections 3.3 and 3.4, neither of these structures is useful. Hence, i t is not surprising that statistical techniques f a i l on this image as well. Throughout Chapter 3.2, the use of second order statistics was "blind", that is, they were applied over the whole image just to "see what could be seen". 114 Admittedly, not much is seen. But suppose that more judicious use is made of them, as i t is in (Schatz, 1978). Schatz (see Chapter 2 for details) considers only the second order statistics taken for end points of lines or "virtual" lines. This is, effectively, the same information that is calculated and used in Chapter 3.3. In fact, this thesis maintains that second order statistics are best thought of as dipole structures in the image. It is true that i t would be hard to extract just what is needed from Schatz1 data since the concept of "longest diagonal" will get buried in a l l the other length measurements and that Schatz is interested in the occurrence statistics, whereas our interest is in the length as a function of direction, but these are largely matters of proposed application. So Schatz' statistical work is, in some sense, micro-structural analysis since statistics are confined to pairs of points connected by structures (lines) and the micro-structural analysis (Chapter 3.3) is, in some sense, statistical since the L-0-list can be reegarded as statistical measures taken of the texels. The Fallacy of the Micro/Macro-Structure Dichotomy Similarily, the distinction between micro- and macro-structure becomes less well defined as i t is examined. There was l i t t l e effort involved in distinguishing micro- from macro-structure in Chapter 3.3. The texels were simply considered to be the "connected" line segments (although we did wave our hands and allow unconnected texels). The problem arises when the connections are not so convenient. Figure f4.2.1, for example, may be considered to be dots arranged in a complex Hierarchical texture I Figure f4.2.1 116 macro-structure. However, i t may be more profitable to consider i t as square texels, made up of dots, arranged in a simple macro-structure, namely a square. Note that the labelling of macro- and micro- structure depends on the scale of the image. For example, Figure f4.2.1 may just be a micro-structure of Figure f4.2.2. In fact, this thesis proposes that a macro-structure is best expressed as a micro-structure, and this is precisely what is done in Chapter 3.4 when the dimensions of a parallelogram texel are found, even though i t may not exist. This thesis also maintains that the labels "micro-structure" and "macro-structure" are misleading and may be forcing the acceptance of an unnatural view of texture. Certainly, a l l of the views discussed in Chapter 2.1 have fallen into this trap. Since there is no real distinction between micro- and macro- structure, i t is suggested that i t would be best to express a texture as nested structures, a l l of which are expressed as micro-structures. For example, Figure f4.2.2 should be described as "a (square of (squares of (squares of dots)))". It would then be possible for a program (or user) to decide on what level a "texel" should exist. Under this interpretation, then, a l l of the techniques of the last chapter are structural. They appear different at first because each constructed an art i f i c i a l boundary to the level of structural analysis i t would undertake. Now, the claim is not made that i t is impossible to do any sort of non-structural statistical analysis of texture, but i t is claimed that such an analysis has no place in the domains this thesis is restricted to. This appears to be due to the nature of the orthographic imaging process and is discussed in the next chapter. 117 e . * o 4 9 O « O 9 • o o Hierarchical texture II Figure f4.2.2 118 5. CONCLUSION Texture has been shown to be a useful cue for the detection of surface slant in orthographic images. Texture is used by applying structural analysis techniques to i t , determining surface slant by relying on the "foreshortening" property of the orthographic imaging process (see theorem 3.1.1). In a l l cases this has been done without considering texture gradient at a l l . As mentioned in Chapter 2, a l l previous works described have used texture gradient since they assumed a perspective imaging process. Similarily, purely statistical techniques were found not to be useful in the domain of apparent textures under orthographic projection. Let us point out, however, that i f the statistics (notably first and second order statistics) changed in some determinable way with the distance of the textured object from the viewer, then surface slant could be determined by a purely statistical analysis. This _is the case for perspective images and Bajcsy uses the change in first-order statistics (texture gradient) in her work, as does Gibson (see Chapter 2). This is not the case for orthographic projections since the location of the viewer is indeterminate on the Z-axis (see Chapter 3.1). We have put forth several opinions suggesting a "nested" structural description of texture. It was found that this notion clashes with previous definitions (see Chapter 2) which defined texture in terms of a micro-structure and a macro-structure. It was also found that the new definition makes difficulties such as texel determination vanish by making them non-problems. One reason for the lack of greater progress in texture and the lack of acceptance of texture by related concerns in scene analysis is the unavailability of such a definition. It is in the 119 spirit of providing a better basis for texture research, then, that we propose i t . 5.1 DIRECTIONS FOR FURTHER RESEARCH Extensions Immediate extensions of this work would be to relax the various assumptions made throughout the paper. Such extensions might include: Incorporation of "Real-World" Textures This would basically take the form of introducing variation into the strictly structural textures mentioned. For example, random, many-sided polygons distributed uniformly over a surface could simulate cork-board or plastered-wall texture. Pebble walls and concrete walks are of the type of texture studied in Chapter 3.3 but with uneven "texels". However, many man-made surfaces, such as tiled floors, woven cloth, and windowed building walls, are textured in the way studied in Chapter 3.4 with l i t t l e or no changes. These, too, should be studied more closely. Non-planar Surfaces This would probably present a very difficult problem for the class of texture examined in Chapter 3.3 since so l i t t l e information is available "locally" in the image. In Chapter 3.3, we computed the surface slant from texture measures made over the entire surface. We relied on many such measures to accurately f i t an ellipse to the data. Hence, the technique of that chapter would not be applicable to non-120 planar surfaces. As (Kender, 1979) also points out, such processing would be in the manner of (Woodham, 1977). However, non-planar surfaces should present l i t t l e difficulty i f the texture is "continuous", as the grid-like textures of Chapter 3.4 are. This class of textures then, would be a logical starting point for such work. Such textures are often used in line drawings by artists and workers in computer graphics precisely because the surface gradient and curvature are so easily recovered. Perspective Projections This paper has dealt solely with orthographic projections and, while this is a good approximation to many "real" images that one may want to deal with, i t is not a valid assumption for a l l cases. In fact, a l l single-camera, fixed-viewpoint images are perspective images and we should be able to deal with these. It is not clear to us how the algorithms in this paper could be generalized but we do not doubt that they could be since the foreshortening aspect of orthographic projections is also a part of perspective projections. Generalization to perspective images is a promising extension, for, as Bajcsy and Kender have discovered, there is much more information present due to the fixed position of the viewer. 121 Formalize the D e f i n i t i o n Our d e f i n i t i o n of texture as "nested" structures i s very informal. We f e e l that t h i s d e f i n i t i o n i s important enough to require formalization. I t would then be necessary to explore the power and r e s t r i c t i o n s of the d e f i n i t i o n and to apply i t to such tasks as texture description and generation. Application to the Origami World The origami world was introduced to i l l u s t r a t e our motivations i n Chapter 1. However, the application of our work to t h i s domain i s not as straight forward as we may have led the reader to believe. The d i f f i c u l t y i s due mainly to the i n a b i l i t y of texture analysis to distinguish between Necker reversals. We are not saying that i t should, though. Nevertheless, t h i s does introduce some complications. Let us look again at our "cube" example. Figure f5.1.1 i l l u s t r a t e s the s i x gradient values that could be obtained from such a cube. But how are we to determine that the gradient configurations for the l a b e l l i n g of Figure f5.1.1 are given by Figure f5.1.2 a) and not by Figure f5.1.2 b)? The answer i s that we must look again at the l a b e l l i n g and see which of the possible gradient configurations are "consistent" (Mackworth, 1973). Note that none of the configurations may be consistent. In t h i s case, we have determined that the l a b e l l i n g i s not correct for the given image. 122 "cube" labelling and gradients Figure f5.1.1 0 A Two "possible" gradient configurations of Figure f5.1.1 Figure f5.1.2 124 5.2 SYNOPSIS Summing up, then, several algorithms operating on several classes of textures have been presented. It has been decided that these classes are not actually distinct, but that the distinction is forced by the currently held definition of texture. In light of these results, we have proposed a new definition of texture which avoids this class distinction. Several directions for further study, including an application to a problem which served as a motivation for this thesis, have been suggested. It is hoped that the reader will think seriously about at least one of these problems, for only by confronting such problems will (s)he gain a real feeling for the problems inherent in determining the nature of a scene from its image. 125 6. APPENDICES 6.1 APPENDIX TO 2.2 Derivation of the rho-theta Equation of a_ Line Referring to Figure f6.6.1, we note that a point X = (x,y) i s on the l i n e precisely when the vector, X-N i s perpendicular to the vector, N. That i s , when, (X-N)'N = 0 X«N - N'N =0 X-N = N«N X«N = |N|2 X'N = rho 2 (x,y)•(rho*cos©,rho*sin©) = rho 2 x*rho*cos© + y*rho*sin© = rho 2 xcos© + ysin© = rho Which i s therefore the rho-theta equation of the l i n e . Rho-theta equation of a line Figure f6.1.1 127 6.2 APPENDIX TO 3.3 Least Squares Fit of Data to an Ellipse Theorem 3.1.1 gives us good reason to believe that our data, L-0-l i s t , lies on ellipse with the equation: F(6;a,b) = r 2 = (a2b2) / [a 2sin 2 (0-crn) + b 2cos 2 (0-cn) ] (1) where a n is the angle that the major axis of the ellipse makes with the line, 6=0. Our task here is to determine a and b. In order to do so, i t will be convenient to express F(0) in a form that depends linearly on its parameters. Thus, let us not use F(0) = r 2 , but rather F(0) = 1/r2. Then F(0;a,b) = 1/r2 = [sin 2(0-a n)] / b 2 + [cos2(0-on)] / a 2 F(0;c 1,c 2) = ctfi(9) + c2)rf2(0) (2) where c^ = l/b2 c 2 = 1/a2 ^(0) = sin 2(0-a n) tf2(0) = cos2(0-an) So we will be content to determine c^ and c 2 and, from them, a and b. To get a least-squares approximation, we wish to minimize the sum of the squares of the errors. That is, we wish to minimize 128 n E(clfc2) = 2 [fi - F(e i;c 1,c 2)] 2 i=l (Conte and de Boor,1965; pg.241-246) where = l / L ^ 2 n = number of elements in L-d-list (Li,6i) is the i t n element of L-d-list It should be pointed out that we have made a major compromise by insisting on the linearly dependent form. Ideally, we would like to minimize n 1 [Li - r ] 2 . 1=1 With F(0) = 1/r2 we are actually minimizing n 2 [ l / L i 2 - 1/r 2] 2. 1=1 This "weights" the small values of Lj and r more than large values, hence the data will f i t the minor-axis part of the ellipse found by this method better than the rest of the ellipse. Nevertheless, we will continue using this method because of its ease of applicability. Since E( C},c 2) is continuously differentiable with respect to c^ and c 2, we can detect the minimum at the point where the 1 s t partial derivatives vanish. That is n dE(c 1,c 2)/dc i = T d[fi - F(6 ic 1,c 2)] 2/dc i i=l n = "2 2 [fi - F(e ic 1,c 2)] dF(6 i;c 1,c 2)/dcj (3) 129 Now dF(e i;c 1,c 2 ) / d C j = d[(cinJi(ei) + c 2 * * 2 ( e i ) ] / d C j = ^ j ( 6 i ) (4) So substituting (4) into (3) we get n -2 T [fi - F(e ic 1,c 2)](zJ i(e i) = 0 i= l And substituting (2) into t h i s we get n n n c i l ^ j ( 6 i ) ^ i ( 6 i ) + c 2 2 r f j ( e i ) jtJ 2(9i) = I f i ^ j ( 6 i ) i=l i =l i=l Hence we have a system of two li n e a r equations (j=l,2) in two unknowns ( c i , c 2 ) . This system i s then solved using Gaussian elimination (Conte and de Boor,1965;pg 110-127). 130 6.3 APPENDIX TO 3.4 Orientation of a Surface from True Shape and Projected Shape This is taken nearly verbatim from (Mackworth, 1974). The orthographic imaging process may be given as a matrix transformation by: P= (tilt)(rotation)(scale)F where P is of dimensions 2*(n-l) and contains the co-ordinates of a l l n vertices except the pair at the origin. F is like P but contains the vertices of the true face. Ik 0^  (scale) = [ I 0 k cos(r) sin(r) (rotation) = -sin(r) cos(r) < \ fcos(o) -sin(a) \ fcos(fi) o\ / cos (a) sin (a) (tilt) = ; ^sin(a) cos (a) J ^0 1^  y-sin(a) cos (a) See Figure f6.3.1 (also from (Mackworth, 1974)). In our case, F represents a square and is given by: 1 0 F = [ | which is the identity matrix 0 1 So P = (tilt)(rotation)(scale). Now from P, we would like to obtain the direction and magnitude of the gradient vector, a and tan(0) "squashing" effect of the orthographic projection Figure f6.3.1 132 respectively. We first compute r as follows: (tilt)(rotation)(scale) = P (tilt)(scale)(rotation) =P (tilt)(scale) = P(rotation)- 1 _ / j l l 312 \ 321 322 Since (tilt) and (scale) commute and are both symmetric then the right hand side must also be symmetric: 1 hn n12\ /cos(r) -sin(r) ^ h21 h22/ ysin(r) cos(r) That is, -pjjsinCr) + p 1 2cos(r) = p 2icos(r) + p 2 2sin(r) so tan(r) = (p 1 2 - P21) / (Pn + P22) Now, from J we compute the eigenvectors and the eigenvalues which must be Ei = (cos (a), sin (a)) and X i = kcos(jrf) E2 = (-sin (a), cos (a)) and X2 = k (see Figure f6.3.1 again) Hence tf = cos-1 ( V ] / . X 2 ) and a = (E]_ (1)/E2(2)) . 133 Bibliography Bajcsy, R. Computer identification of textured visual scenes. Technical-report AIM-180, Stanford Artificial Intelligence Laboratory, Stanford University, Oct. 1972. Bajcsy, R. Computer description of textured surfaces. IJCAI3(1973),572-579. Bajcsy, R. and Lieberman, L. Texture gradient as a depth cue. Computer  Graphics and Image Processing 5,1 (March 1976),52-67. Caelli, T. and Julesz, B. Perceptual analyzers underlying visual texture discrimination: part I. Biological Cybernetics 28 (1978),167-175. Carton, E. J.,Weszka, J. S., and Rosenfeld, A. Some basic texture analysis techniques. Technical-report tr-288, Computer Science, University of Maryland, Jan. 1974. Clowes, M.B. On seeing things. Artificial Intelligence 2,1 (1971),79-116. Conte, S.D. and deBoor, C. Elementary Numerical Analysis: an Algorithmic  Approach. McGraw-Hill, New York, . Davis, L.S., Johns, S.A., and Aggarwal, J.K. Texture analysis using generalized co-occurrence matrices. IEEE Transactions on Pattern  Analysis and Machine Intelligence PAMI-1,3 (July 1979), 251-259. Deutsch, E.S. and Belknap, N.J. Texture descriptors using neighborhood information. Computer Graphics and Image Processing 1,2 (Aug. 1972),145-168. Duda, R.O. and Hart, P.E. Use of the Hough transform to detect lines and curves in pictures. CACM 15,1 (Jan. 1972),11-15. Dyer, C. R. and Rosenfeld, A. Fourier texture features- suppression of aperture effects. Technical-report tr-391, Computer Science, University of Maryland, July 1975. Dyer, C.R., Weszka, J.S. and Rosenfeld, A. Experiments in terrain classification on Landsat imagery by texture analysis. Technical-report tr-383, Computer Science, University of Maryland, June 1975a. Dyer, C.R., Weszka, J.S., and Rosenfeld, A. Further experiments in terrain classification by texture analysis. Technical-report tr-417, Computer Science, University of Maryland, Sept. 1975b. Dyer, C.R., Weszka, J.S., and Rosenfeld, A. Detection of 'hazy anomalies' in Landsat imagery by texture analysis. Technical-report tr-429, Computer Science, University of Maryland, Dec. 1975c. 134 Gibson, J.J. The Perception of the Visual World. Houghton Mifflin Co., Boston,Mass., 1950. Haralick, R.M., Shanmugan, K., and Dinstein, I. Textural features for image classification. IEEE Transactions on Systems, Man, and  Cybernetics SMC-3,6 (Nov. 1973),610-621. Hawkins, J.K. Textural properties for pattern recognition. In (Lipkin,B. and Rosenfeld,A.,Eds.),347-370. Horn, B.K.P. The Binford-Horn line finder. Memo 285, Al Lab, MIT, July 1971. Hough, P.V.C. Method and means for recognizing complex patterns. U.S. patent 3,069,654, Dec. 18, 1962. Huffman, D.A. Impossible objects as nonsense sentences. In Machine Intelligence 6(1971), Meltzer, B., Michie, D. (Eds),295-323. Julesz, B. Foundations of Cyclopean Perception. University of Chicago Press, Chicago, 1971. Julesz, B., Gilbert, E.N.,Shepp, L.A. and Frisch, H.L. Inability of humans to discriminate between visual textures that agree in second-order statistics-revisited. Perception 2 (1973),391-405. Julesz, B. Experiments in the visual perception of textures. Scientific  American ,232 (1975),34-43. Kanade, T. A theory of origami world. Technical-report cmu-cs-78-144, Computer Science, Carnegie-Mellon University, Sept. 20, 1978. Kelly, M.D. Edge detection by computer using planning. In Machine Intelligence 6(1971), Meltzer, B., Michie, D. (Eds),397-410. Kender, J.R. Shape from texture: an aggregation transform that maps a class of textures into surface orientation. IJCAI-79,475-480. Lipkin, B. and Rosenfeld, A. (Eds.) Picture Processing and Psycho- pictorics. Academic Press, New York, 1970. Mackworth, A. K. Interpreting pictures of polyhedral scenes. Artificial  Intelligence 4,2 (1973),121-137. Mackworth, A.K. On the interpretation of drawings as three-dimensional scenes. D.Phil, thesis , University of Sussex, 1974. Mackworth, A. K. Model-driven interpretation in intelligent vision systems. Perception 5 (1976),349-370. Nevatia, R., Price, K., and Vilnrotter, F. Describing natural textures. IJCAI 79,642-644. O'Gorman, F. and Clowes, M.B. Finding picture edges through collinearity of feature points. IJCAI3(1973),543-555. 135 Pickett, R.M. Visual analysis of texture in the detection and recognition of objects. In (Lipkin,B. and Rosenfeld,A.,Eds.),289-308. Rosenfeld, A., Lee, Y.H., and Thomas, R.B. Edge and curve detection for texture discrimination. In (Lipkin,B. and Rosenfeld,A.,Eds.),381-393. Rosenfeld, A. Visual texture analysis- an overview. Technical-report tr-406, Computer Science, University of Maryland, Aug. 1975. Rosenfeld, A. and Lipkin, B.S. Texture synthesis. In (Lipkin,B. and Rosenfeld,A.,Eds.),309-345. Schatz, B.R. The computation of immediate texture. PhD thesis AIM-426, Department of Artificial Intelligence, MIT, Aug. 1977. Schatz, B.R. The computation of immediate texture discrimination. Technical-report cmu-cs-78-152, Computer Science Dept., Carnegie-Mellon University, Dec. 1978. Shirai, Y. A context sensitive line finder for recognition of polyhedra. Artificia l Intelligence 4,2 (1973),95-119. Tamura, H., Mori, S. and Yamawaki, T. Textural features corresponding to visual perception. IEEE Transactions on Systems, Man, and  Cybernetics SMC-8,6 (June 1978),460-473. Tanimoto, S. and Pavlidis, T. A hierarchical data structure for picture processing. Computer Graphics and Image Processing 4^,2 (June 1975),104-119. Tomita, F., Shirai, Y. and Tsuji, S. Descriptions of textures by a structural analysis. IJCAI 79,884-889. Tomita, F., Yachida, M. and Tsuji, S. Detection of homogeneous regions by structural analysis. IJCAI3(1973),364-371. Tou, J.T. and Gonzalesz, R.C. Pattern Recognition Principles. Addison-Wesley, Reading,Mass., 1974. Tsuji, S. and Tomita, F. A structural analyzer for a class of textures. Computer Graphics and Image Processing 2,3/4 (Dec. 1973),216-231. Waltz, D.L. Understanding line drawings of scenes with shadows. In the Psychology of Computer Vision, Winston, P. (Ed),19-91. Wang, S.,Velasco, F.R.D., and Rosenfeld, A. A comparison of some simple methods for extracting texture primitives and their effectiveness in texture discrimination. Technical-note tr-759, Computer Science Centre, Univ. of Md., April 1979. Winston, P.H.(Ed.) The Psychology of Computer Vision. McGraw-Hill, New York, 1975. 136 Woodham, R.J. A cooperative algorithm for determining surface orientation from a single view. IJCAI5(1977),635-641. Yachida, M., Ikeda, M. and Tsuji, S. Boundary detection of textured regions. IJCAI 79,992-994. Zahn, C.T. Graph-theoretic methods for detecting and describing gestalt clusters. IEEE Trans. Computers vol. c-20,no. 1 (1971),68-86. Zucker, S.W., Rosenfeld, A. and Davis, L.S. Picture segmentation by texture discrimination. IEEE Transactions on Computers C24 (Dec. 1975),1228-1233. Zucker, Steven W. Toward a model of texture. Computer Graphics and Image  Processing 5,2 (June 1976),190-202. 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051595/manifest

Comment

Related Items