UBC Theses and Dissertations

Sketch-based instancing of parameterized 3D models. Xiao, Dan. 2004.
Sketch-based Instancing of Parameterized 3D Models

by Dan Xiao
M.Sc., Zhejiang University, 2002

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE STUDIES (Department of Computer Science)

We accept this thesis as conforming to the required standard

The University of British Columbia
September 2004
© Dan Xiao, 2004

Library Authorization

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Abstract

We present a system for constructing 3D models from simple hand-drawn sketches. The system exploits a priori knowledge about the type of object being sketched. In particular, we assume that the class of object being drawn is known (e.g., a rocket ship) and that the object will be drawn using a fixed number of known components. By using these assumptions, we show that we can build a sketch-based modeling system that, while object specific, is capable of quickly creating geometric models that are beyond the scope of current sketch-based modeling approaches.
Key to our approach is the use of a K-means classifier for labeling each drawn stroke as being a particular component in the generic model. We demonstrate our approach by applying this classifier to face and rocket sketches.

Contents

Abstract
Contents
List of Tables
List of Figures
Acknowledgements

1 Introduction
1.1 Motivation
1.2 System Overview
1.3 Application of Sketch Recognition
1.4 Thesis Outline

2 Related Work
2.1 Attributes of a Sketch Recognition System
2.2 Algorithms for Object Recognition
2.3 2D Sketch Recognition
2.4 3D Sketch-based Modeling
2.5 Summary

3 Single-stroke Classification
3.1 Problem
3.2 The Features
3.3 Methods for Stroke Classification
3.3.1 Least Squares Method
3.3.2 K-means Method
3.3.3 Expectation-Maximisation Method
3.4 Classifier Comparison
3.5 Summary

4 Sketch Recognition and 3D Model Construction
4.1 Sketch Recognition: Training
4.1.1 Shapes
4.1.2 Choice of Features
4.1.3 Sketch Recognition: Matching
4.2 3D Model Construction
5 Results
5.1 Face Sketching
5.2 Rocket Sketching
5.3 Discussion

6 Conclusions and Future Work

Bibliography

Appendix A User Interface

List of Tables

1.1 Correspondences between strokes and components of a rocket
4.1 Description of components for face
4.2 Description of components for rocket
5.1 Comparisons of drawing and recognized face components with incorrect recognition

List of Figures

1.1 An example of sketch recognition and modeling
1.2 System overview
2.1 Design process [13]
2.2 Example of user interface: Teddy
2.3 Another example of user interface: Chateau
3.1 Features extracted from a gesture for identification
3.2 Examples from the gesture set used for evaluation
3.3 Recognition rate vs. training size for least squares method
3.4 Recognition rate vs. training size for K-means method
3.5 Recognition rate vs. training size for EM method
4.1 Shapes of basic symbols: arc, ellipse, triangle, square, open rectangle
4.2 Features of head used for classification
4.3 Illustration of face components
4.4 Illustration of rocket components
4.5 Modeling parameters for tapered cylinder and ellipsoid
5.1 Face modeling with correct recognition
5.2 Rocket modeling with correct recognition
5.3 Face modeling with incorrect recognition
The interface of the modeling system

Acknowledgements

I would like to give my great gratitude to my supervisor, Dr. Michiel van de Panne, for his invaluable advice on my thesis work. Without him, this would never have been completed. I am also very grateful to Dr. Nando de Freitas, who gave the interesting machine learning course that stimulated the ideas in this thesis. To Jason Harrison, who contributed useful books and papers.
Also to Peng Zhao, Yizhen Cai and Xiaojing Wu for many helpful discussions.

DAN XIAO
The University of British Columbia
September 2004

Chapter 1 Introduction

1.1 Motivation

"A picture speaks a thousand words." People often rely on sketches when they try to convey ideas to others. We all have the experience of drawing something. Of all who draw, children are among the most enthusiastic. They like to convert what they see and think into pictures. Their pictures may be simply drawn, but at the same time they are capable of conveying much information. Creating and animating 3D models, however, is currently not possible without significant training and expertise.

New technologies are thriving in our modern society. One common characteristic of successful technologies is their ease of use, so that people can grasp them in a short period of time and use them in an easy way. With devices such as PDAs and Tablet PCs, there exists hardware that largely replicates the feel and function of a pencil or pen on paper. What now remains is for new software applications to exploit these new capabilities. Our system can interpret a limited class of 2D sketches as 3D objects, and thus represents such a new application.

Figure 1.1: An example of sketch recognition and modeling.

Figure 1.2: System overview. Input strokes pass through stroke classification (Chapter 3) and 3D model construction (Chapter 4) to produce the 3D model.

Stroke   Resulting component
1        cone
2        body
3        tail
4        left wing
5        right wing

Table 1.1: Correspondences between strokes and components of a rocket.

1.2 System Overview

Figure 1.1 shows an example drawn and recognized by our system. Each pen stroke in the sketch is recognized as a component of the object and is then used to construct a component of the 3D model. In this example, a rocket is drawn with 5 pen strokes, and a 3D rocket model with 5 components is then displayed.
Each pen stroke represents one component; the correspondences are illustrated in Table 1.1. The overall system structure is given in Figure 1.2.

1.3 Application of Sketch Recognition

The application domains of a Freehand Sketch Recognition System (FSRS) include:

• 3D modeling based on a sketching interface, such as computer aided design (CAD).
• Query by sketch: image database retrieval.

A long-standing dream in computer-aided design is the possibility of using freehand sketching as the language for interactive design. The ability to sketch a 3D object, predict its performance, and re-design it interactively based on physics-based feedback would bring the power of state-of-the-art CAD tools into the critical, early design phase. The enormous potential of sketch-based interfaces is widely recognized, and has been broadly pursued. However, the practical use of such attempts has remained limited because these interfaces have been primarily 2D, losing much of the benefit of mainstream 3D CAD. In order to become truly 3D, a sketch interface must automatically be able to reconstruct the spatial geometry from a single 2D sketch in near real-time, much like a human observer does implicitly.

1.4 Thesis Outline

The rest of this thesis is organized as follows. In Chapter 2, we discuss work related to this project. Chapter 3 explores how to label individual strokes using classification and compares different classifiers. Chapter 4 describes our implemented system. Chapter 5 presents results. We draw conclusions and discuss future work in Chapter 6.

Chapter 2 Related Work

2.1 Attributes of a Sketch Recognition System

Freehand sketching can play an important role in the realm of user interfaces for graphics systems. Sketching appears to be a natural communication language, enabling fast conveyance of qualitative information while not burdening the creativity of the user or disrupting the flow of ideas.
As illustrated in Figure 2.1, sketching is a key element in modeling, developing and defining a design result. During the whole design process, sketching often provides the input for conceptual design.

Figure 2.1: Design process [13].

[17] suggests that a sketch recognition technology must meet three main requirements: it must deal reliably with the pervasive variability of hand sketches, provide interactive performance, and be easily extensible to new configurations.

User interfaces based on sketching can generally be divided into three categories, according to the level of information they intend to gather from the input sketch [22]:

1. Drawing pads. Basic sketching is allowed for general-purpose drawings. Input strokes are smoothed and many other graphic tools are provided.

2. 2D sketch applications. These applications smooth sketch strokes and classify them into 2D primitives, such as lines, arcs, and splines. Some geometric constraints and relationships among the entities are utilized to further refine the sketch.

3. 3D sketchers. Sketches are analyzed as representing rough projections of 3D scenes. Sketch strokes are identified as basic geometrical shapes, such as lines, arcs and corners. However, some of the sketch strokes do not necessarily represent what they appear to be, since the analyzed sketch represents a rough projection of a three-dimensional scene [11, 14].

2.2 Algorithms for Object Recognition

The sketch recognition problem is strongly related to the object recognition problem in computer vision. A broad class of such object recognition problems has benefited from statistical learning machinery. The use of Principal Component Analysis (PCA) has produced good results [25] in the area of face recognition. On more general object recognition tasks, several other learning methods such as Bayes classifiers [21] and decision tree learning [28] have been applied.
The technique of boosting is proven to be a viable method of feature selection in [26]. Support vector machine methods have demonstrated success in template matching problems such as recognizing pedestrians in [16].

[2] and [31] present a novel approach to measure similarity between 2D shapes and exploit it for object recognition. In order to solve the correspondence problem, they attach a descriptor, the shape context, to each point. The shape context at a reference point captures the distribution of the remaining points relative to it, thus offering a globally discriminative characterization. In this way, they treat recognition in a nearest-neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image.

2.3 2D Sketch Recognition

Research on sketch recognition and interpretation has a long history. A variety of systems have incorporated gesture recognition into their user interfaces. The development of a freehand sketch recognition system involves three stages:

1. Stroke-level recognition, which interprets the pixels and produces low-level geometric primitives such as lines and curves.

2. 3D modeling based on 2D sketches.

3. Multi-stroke understanding.

Each sketch of an object consists of multiple strokes. The first step in interpreting a sketch is to classify individual pen strokes by processing them. Usually there are two assumptions in this area. One is that each pen stroke symbolizes a single shape, such as a single curve segment or triangle segment, whichever fits the stroke best. Much of the previous work has relied either on single-stroke methods, in which an entire symbol must be drawn as a single stroke [20, 10, 4], or single-primitive methods, in which each stroke must be a single line, arc, or curve [18, 6, 27]. The other assumption, used in [3], is that multiple primitives can be drawn in the same pen stroke.
In this way, a square can be drawn as four individual pen strokes or as a single pen stroke with three 90° bends. The key challenge is determining which bumps and bends are drawn intentionally and which are unintentional.

Arvo and Novins described a kind of stroke-level recognition algorithm [1]. Unlike traditional ways of recognizing sketches, their method continuously morphs raw input strokes into ideal geometric shapes. Another stroke-level recognition algorithm is a domain-independent system for sketch recognition [29]. Users are allowed to draw sketches as naturally as they do on paper. The system then recognizes the drawing through imprecise stroke approximation, which is implemented in a unified and incremental procedure. This method can handle smooth curves and hybrid shapes as gracefully as it does polylines. With a feature-area verification mechanism and intelligent adjustment in a post-process, the system can produce the results intended by the user.

Rubine presented a trainable gesture recognizer which uses a "linear machine" classifier to discriminate gestures [20]. Each gesture class is associated with a linear evaluation function using 13 features. For training, appropriate weights are learned for each attribute in the linear function. The attributes consider aggregate properties of a pen stroke, and it is possible that two different gestures would have the same aggregate properties.

Recent studies show some interesting gesture-based interfaces in which the user specifies commands by simple drawings. GRANDMA, a tool for building gesture-based applications introduced in [19], supports the integration of gestures and direct manipulation. It allows views that respond to gestures and views that respond to clicks and drags to coexist in the same interface.
Our work is closely related to this in that we also use a set of stroke features and then apply a classifier to discriminate different gestures.

Landay and Myers advanced an interactive sketching tool called SILK in which designers can quickly sketch out a user interface and transform it into a fully operational system [12]. As the designer sketches, the recognizer of SILK matches the pen strokes to symbols representing various user interface components, and returns the most likely interpretation. Their recognizer is limited to single-stroke shapes drawn in certain preferred orientations. [23, 24] have developed a program called SketchIT that can transform a sketch of a mechanical device into working designs.

The Electronic Cocktail Napkin [5] employs a trainable recognizer that works for multi-stroke shapes. The recognition process consists of glyph (low-level) and configuration (high-level) recognition. A glyph is described by a state transition model of the pen path, the aspect ratio and size of the bounding box, and the number of corner points. Configuration recognition takes the spatial relationships between the glyphs into consideration. This method is sensitive to changes in orientation, and the 3x3 grid may be inadequate for symbols containing small features.

2.4 3D Sketch-based Modeling

Zeleznik et al. have extended the use of gesture recognition for 3D modeling [30]. The SKETCH system they proposed attempts to combine 2D image with 3D computer modeling systems to create an environment for rapidly conceptualizing and editing approximate 3D scenes. All interaction with SKETCH is via a three-button mouse, occasional use of one modifier key on the keyboard, and a single orthographic window onto the 3D scene.

The Teddy system [8] in Figure 2.2 is aimed especially at the modeling of simple free-form objects.
The system is able to inflate 2D closed curves into 3D objects, which can then be edited using simple gestures. The interface shown in Figure 2.3 is Chateau [7], a prototype system in which the user gives hints to the system by highlighting related lines, and the system suggests possible operations based on the hints, shown in thumbnails at the bottom.

Figure 2.2: Example of user interface: Teddy.

Figure 2.3: Another example of user interface: Chateau.

2.5 Summary

In this chapter, we presented a review of sketch recognition systems and 2D sketch recognition methods, along with some 3D sketch-based modeling examples. In the next chapter, we will present some machine learning methods used for gesture recognition.

Chapter 3 Single-stroke Classification

One of the goals of our work is to obtain an efficient classifier as well as a rich and meaningful feature vector. In this chapter, we compare several machine learning methods that classify strokes into one of several categories based upon the stroke feature vectors [20]. The selection of useful features and a good classifier will determine the effectiveness of our gesture recognizer.

3.1 Problem

Each gesture is defined as an array g of coordinate values and times:

g = (x_p, y_p, t_p), \quad 0 \le p < P

The gesture recognition problem is stated as follows. There is a gesture set with C classes. Given a gesture, we need to determine to which class it belongs. This is done by first extracting a limited set of features. Features are extracted from the gesture g and are used as the input for classification. The feature vector f = [f_1 \ldots f_F] is taken as the training data for classification.

3.2 The Features

As an example, a stroke can be characterized by a set of 11 geometric and 2 dynamic features [20]. This is the set of features we shall use for testing several classification algorithms. Note that many other feasible feature sets could also have been used, with possibly varying results.
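The gesture definition above maps directly onto code. A minimal sketch of the representation (the names here are illustrative, not from the thesis implementation):

```python
import math
from typing import List, Tuple

# A gesture is stored exactly as defined in Section 3.1: P time-stamped
# sample points (x_p, y_p, t_p).
Gesture = List[Tuple[float, float, float]]

def path_length(g: Gesture) -> float:
    """Total arc length of the pen path (feature f8 of Section 3.2)."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0, _), (x1, y1, _) in zip(g, g[1:])
    )

def duration(g: Gesture) -> float:
    """Elapsed time between the first and last sample (feature f13)."""
    return g[-1][2] - g[0][2]
```

The remaining features of Section 3.2 are computed from the same (x, y, t) triples.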
Figure 3.1 shows how to extract these specific features from a stroke. The descriptions of the features are:

• Feature 1 (f_1): the cosine of the initial angle of the gesture.
• Feature 2 (f_2): the sine of the initial angle of the gesture.
• Feature 3 (f_3): the length of the bounding box diagonal.
• Feature 4 (f_4): the angle of the bounding box diagonal.
• Feature 5 (f_5): the distance between the first and the last point.
• Feature 6 (f_6): the cosine of the angle between the first and last point.
• Feature 7 (f_7): the sine of the angle between the first and last point.
• Feature 8 (f_8): the total gesture length.
• Feature 9 (f_9): the total angle traversed.
• Feature 10 (f_10): the sum of the absolute value of the angle at each mouse point.
• Feature 11 (f_11): the sum of the squared values of those angles.
• Feature 12 (f_12): the maximum speed (squared) of the gesture.
• Feature 13 (f_13): the duration of the gesture.

Figure 3.1: Features extracted from a gesture for identification.

The features are computed as follows:

f_1 = \cos\alpha = \frac{x_2 - x_0}{\sqrt{(x_2 - x_0)^2 + (y_2 - y_0)^2}}

f_2 = \sin\alpha = \frac{y_2 - y_0}{\sqrt{(x_2 - x_0)^2 + (y_2 - y_0)^2}}

f_3 = \sqrt{(x_{max} - x_{min})^2 + (y_{max} - y_{min})^2}

f_4 = \arctan\frac{y_{max} - y_{min}}{x_{max} - x_{min}}

f_5 = \sqrt{(x_{P-1} - x_0)^2 + (y_{P-1} - y_0)^2}

f_6 = \cos\beta = (x_{P-1} - x_0)/f_5

f_7 = \sin\beta = (y_{P-1} - y_0)/f_5

Let \Delta x_p = x_{p+1} - x_p and \Delta y_p = y_{p+1} - y_p. Then

f_8 = \sum_{p=0}^{P-2} \sqrt{\Delta x_p^2 + \Delta y_p^2}

Let \theta_p = \arctan\frac{\Delta x_p \Delta y_{p-1} - \Delta x_{p-1} \Delta y_p}{\Delta x_p \Delta x_{p-1} + \Delta y_p \Delta y_{p-1}}. Then

f_9 = \sum_{p=1}^{P-2} \theta_p

f_{10} = \sum_{p=1}^{P-2} |\theta_p|

f_{11} = \sum_{p=1}^{P-2} \theta_p^2

Let \Delta t_p = t_{p+1} - t_p. Then

f_{12} = \max_{p=0}^{P-2} \frac{\Delta x_p^2 + \Delta y_p^2}{\Delta t_p^2}

f_{13} = t_{P-1} - t_0

3.3 Methods for Stroke Classification

We now introduce several possible classification methods with the purpose of determining which one works best. Three machine learning classification schemes are presented in this section: least squares, K-means, and expectation-maximisation.
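The thirteen features above can be computed directly from the sampled points. The following is a sketch, not the thesis implementation; it assumes a stroke of at least four points with nonzero extent, and uses atan2 for the turning angle for numerical robustness:

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, t), as in Section 3.1

def rubine_features(g: List[Point]) -> List[float]:
    """Compute the 13 stroke features of Section 3.2."""
    xs = [p[0] for p in g]; ys = [p[1] for p in g]; ts = [p[2] for p in g]
    # f1, f2: cosine/sine of the initial angle (using points 0 and 2).
    d0 = math.hypot(xs[2] - xs[0], ys[2] - ys[0])
    f1 = (xs[2] - xs[0]) / d0
    f2 = (ys[2] - ys[0]) / d0
    # f3, f4: length and angle of the bounding box diagonal.
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    f3 = math.hypot(w, h)
    f4 = math.atan2(h, w)
    # f5, f6, f7: first-to-last point distance and its direction.
    f5 = math.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    f6 = (xs[-1] - xs[0]) / f5
    f7 = (ys[-1] - ys[0]) / f5
    dx = [b - a for a, b in zip(xs, xs[1:])]
    dy = [b - a for a, b in zip(ys, ys[1:])]
    dt = [b - a for a, b in zip(ts, ts[1:])]
    # f8: total gesture length.
    f8 = sum(math.hypot(a, b) for a, b in zip(dx, dy))
    # f9-f11: turning angles theta_p, summed, in absolute value, squared.
    thetas = [
        math.atan2(dx[p] * dy[p - 1] - dx[p - 1] * dy[p],
                   dx[p] * dx[p - 1] + dy[p] * dy[p - 1])
        for p in range(1, len(dx))
    ]
    f9 = sum(thetas)
    f10 = sum(abs(t) for t in thetas)
    f11 = sum(t * t for t in thetas)
    # f12: maximum squared speed; f13: duration.
    f12 = max((a * a + b * b) / (c * c) for a, b, c in zip(dx, dy, dt))
    f13 = ts[-1] - ts[0]
    return [f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13]
```

A production version would need to guard the degenerate cases (coincident points, zero time steps) that the formulas above leave undefined.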
The least squares method is a supervised learning technique, while the latter two use unsupervised learning.

3.3.1 Least Squares Method

[20] uses a linear machine algorithm for stroke classification, which is a kind of least squares approach.

Given C as the number of classes and F as the number of features, each gesture class c has parameters w_{ci} for 0 \le i \le F. The evaluations v_c are calculated as follows:

v_c = w_{c0} + \sum_{i=1}^{F} w_{ci} f_i \qquad 0 \le c < C

The class of a gesture g is the c which maximizes v_c.

During the training process, we need to determine the weights w_{ci} from the example gestures. Let f_{cei} be the i-th feature of the e-th example of gesture class c, 0 \le e < E_c, where E_c is the number of training examples of class c. The sample estimate of the mean feature vector per class, \bar{f}_c, is defined as:

\bar{f}_{ci} = \frac{1}{E_c} \sum_{e=0}^{E_c - 1} f_{cei}

The sample estimate of the covariance matrix of class c, \Sigma_{cij}, is computed as:

\Sigma_{cij} = \sum_{e=0}^{E_c - 1} (f_{cei} - \bar{f}_{ci})(f_{cej} - \bar{f}_{cj})

The \Sigma_{cij} are averaged to yield \Sigma_{ij}, an estimate of the common covariance matrix:

\Sigma_{ij} = \frac{\sum_{c=0}^{C-1} \Sigma_{cij}}{-C + \sum_{c=0}^{C-1} E_c}

We can invert the common covariance matrix and use it to get the weights w_{cj} as follows [20]:

w_{cj} = \sum_{i=1}^{F} (\Sigma^{-1})_{ij} \bar{f}_{ci} \qquad 1 \le j \le F

w_{c0} = -\frac{1}{2} \sum_{i=1}^{F} w_{ci} \bar{f}_{ci}

3.3.2 K-means Method

We use K-means clustering, introduced in [15], for classification. In K-means clustering, the number of clusters, K, needs to be determined at the onset. The goal is to divide the objects into K clusters such that some metric relative to the centroids of the clusters is minimized. Various metrics related to the centroids can be minimized, including:

• The maximum distance to its centroid for any object.
• The sum of the average distance to the centroids over all clusters.
• The sum of the variance over all clusters.
• The total distance between all objects and their centroids.

The metric to minimize and the choice of a distance measure will determine the shape of the optimum clusters.
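The basic assign-and-update loop at the heart of K-means can be sketched as follows. This is a minimal pure-Python version of Lloyd's algorithm (names, the data-point initialisation, and the iteration cap are our illustrative choices, not the thesis implementation):

```python
import math
import random
from typing import List

Vector = List[float]

def kmeans(points: List[Vector], k: int, iters: int = 100,
           seed: int = 0) -> List[int]:
    """Assign each point to the nearest centroid, recompute centroids
    as cluster means, and repeat until the assignment stops changing."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    assign: List[int] = []
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        new_assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assign == assign:
            break  # converged: no point changed cluster
        assign = new_assign
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assign
```

This minimizes the sum of squared distances of objects to their centroids, the last of the metrics listed above.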
Two different algorithms are available to search for the optimum set of clusters.

In the first procedure, the objects are randomly assigned to one of the K clusters. Once this is done, the position of each of the K centroids is determined, as is the value of the metric to minimize. A global optimization method is then used to reassign some of the objects to different clusters. New centroids are determined, as is the metric to minimize. This procedure is continued until the optimum assignment of objects to clusters is found.

In the second procedure for K-means clustering, placement of the K centroids can be done as follows:

1. Place K points into the space represented by the objects that are being clustered. These points represent the initial group centroids.

2. Assign each object to the group that has the closest centroid.

3. When all objects have been assigned, recalculate the positions of the K centroids.

4. Repeat Steps 2 and 3 until the centroids no longer move.

This produces a separation of the objects into groups from which the metric to be minimized can be calculated. A global optimization method is then used to move the position of one or more of the centroids. The above procedure is repeated and a new metric value is determined. The objective is to move the centroids into a position such that an optimum separation of objects into groups occurs.

For K-means clustering, it is necessary to calculate a "distance" between either two objects (one of which may be a cluster seed point) or an object and a group centroid. In this discussion, it is assumed that each object is described by an array of real-valued metrics.

Since we need to minimize the largest or average distance, or the variance, during clustering, it is important that each metric contribute equally to the total distance.
In other words, if one metric spans the range [0.0, 0.5] and another spans [0.0, 100.0], the maximum deviation in the first would have little effect on the total distance, while even a modest separation in the second would have a much larger effect. To remove this dependency on the range spanned by each metric, it is important to first standardize the values. This means that each metric, when compared over the full set of objects, should have a mean of 0.0 and a variance (or standard deviation) of 1.0. The following steps standardize each metric describing the objects:

1. Sum the values of the metric over all objects and divide the sum by the number of objects.

2. Subtract this average value from the metric in all objects.

3. Sum the square of these new values over all objects, divide the sum by the total number of objects, and take its square root. This is the standard deviation of the new values.

4. Divide the metric by the standard deviation in each object.

3.3.3 Expectation-Maximisation Method

This method supposes that the stroke features have normal Gaussian distributions. The EM method can be applied to build a gesture classifier. The detailed steps are given by Jordan [9]:

1. Initialise the parameters p(c), \mu_c, \Sigma_c.

2. E step: at iteration t, compute the expectation of the indicators for each i and c:

\zeta_{ic} = \frac{p(c) \, N(x_i \mid \mu_c, \Sigma_c)}{\sum_{c'} p(c') \, N(x_i \mid \mu_{c'}, \Sigma_{c'})}

and normalize it.

3. M step: update the parameters p(c), \mu_c, \Sigma_c:

p(c) = \frac{1}{n} \sum_{i=1}^{n} \zeta_{ic} \qquad \mu_c = \frac{\sum_{i=1}^{n} \zeta_{ic} x_i}{\sum_{i=1}^{n} \zeta_{ic}} \qquad \Sigma_c = \frac{\sum_{i=1}^{n} \zeta_{ic} (x_i - \mu_c)(x_i - \mu_c)^T}{\sum_{i=1}^{n} \zeta_{ic}}

The EM method is used to find the parameters of the normal distribution for each class, and this distribution is then applied to classify new, unseen feature vectors. The distribution with the highest probability is defined to be the class to which the gesture belongs.

3.4 Classifier Comparison

The gestures used for evaluation of the three classification algorithms are shown in Figure 3.2.
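The E and M updates above can be sketched compactly. For brevity this illustrative version assumes one-dimensional features (so the covariance matrices reduce to scalar variances) and a crude evenly-spread initialisation; the names are ours:

```python
import math
from typing import List, Tuple

def em_1d(xs: List[float], k: int,
          iters: int = 50) -> Tuple[List[float], List[float], List[float]]:
    """Minimal EM for a 1-D Gaussian mixture: returns (priors, means, vars)."""
    n = len(xs)
    pi = [1.0 / k] * k  # uniform priors
    span = max(xs) - min(xs)
    mu = [min(xs) + (c + 0.5) * span / k for c in range(k)]  # spread means
    var = [1.0] * k
    for _ in range(iters):
        # E step: responsibilities zeta[i][c], normalized over classes.
        zeta = []
        for x in xs:
            w = [pi[c] * math.exp(-(x - mu[c]) ** 2 / (2 * var[c]))
                 / math.sqrt(2 * math.pi * var[c]) for c in range(k)]
            s = sum(w)
            zeta.append([wc / s for wc in w])
        # M step: re-estimate priors, means and variances.
        for c in range(k):
            nc = sum(z[c] for z in zeta)
            pi[c] = nc / n
            mu[c] = sum(z[c] * x for z, x in zip(zeta, xs)) / nc
            var[c] = max(1e-6,  # floor keeps the variance from collapsing
                         sum(z[c] * (x - mu[c]) ** 2
                             for z, x in zip(zeta, xs)) / nc)
    return pi, mu, var
```

A new feature value is then assigned to the class whose weighted Gaussian density is highest, as described above.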
In this figure, four different gesture styles of each class are given. Performance is evaluated on 3 gesture classes. Figure 3.3 shows the results using the algorithm presented by [20]. The plot shows the recognition rate as a function of the number of training examples per class; one line is for 2 classes and the other is for 3 classes. Figure 3.4 illustrates the results with K-means. The results when applying EM are shown in Figure 3.5.

From these three figures, we can see that the K-means method gives the best performance. For the K-means method, in the cases where 3 gesture classes are recognized by a classifier trained with 10 or more examples per class, at least 96% of the test gestures are classified correctly. When there are only 2 gesture classes, the classifier trained with 10 or more examples per class yields 100% correct classification.

Figure 3.2: Examples from the gesture set used for evaluation.

Figure 3.3: Recognition rate vs. training size for the least squares method.

Figure 3.4: Recognition rate vs. training size for the K-means method.

Figure 3.5: Recognition rate vs. training size for the EM method.

The linear machine classifier used by Rubine, trained with 10 or more examples, can recognize 3 gesture classes with correctness > 60%, and for 2 gesture classes the correctness is > 85%.
This method gives better performance when there are 40 training examples per class, but the correctness falls as the number of examples increases further. The recognition accuracy of the EM method is low when the number of training examples is less than 30 per class; however, it increases when more training examples are added. The more training examples per class, the higher the correctness at recognizing gestures. The classifier trained with 30 or more examples can recognize 3 gesture classes at above 85% correctness, and 2 gesture classes at above 80% correctness. It is surprising to note that it gives better performance for 3 gesture classes than for 2 gesture classes when the number of training examples is large.

3.5 Summary

The generation of a meaningful, extensible and effective feature set is essential. How many features are enough to describe a gesture without redundancy, and what kind of features should be extracted, is an important problem to be tackled. In this chapter, the test strokes comprise lines and arcs, and we can see that thirteen features are quite effective for classification. Given different gesture sets, some features may be useless, and a smaller feature set may generate equally good results. In the next chapter, we classify strokes of shapes different from the ones used in this chapter. We choose the K-means method for our sketch recognition system. Those strokes are generally simpler, and thus we will use a feature set with a smaller number of attributes.

Chapter 4 Sketch Recognition and 3D Model Construction

This chapter will discuss the main contributions of our work: a system for sketch recognition and 3D model construction. Sketch recognition consists of assigning feature labels to individual pen strokes, i.e., identifying the object features they correspond to. It includes two phases: training and matching.
We have chosen the K-means method for stroke classification, and we will show how we select features and apply this classifier to sketch recognition. Lastly, we will go through the details of 3D model construction with a generic example.

4.1 Sketch Recognition: Training

4.1.1 Shapes

In our system, each pen stroke symbolizes a single shape, such as a single curve or triangle. The advantage is that this facilitates shape recognition. The disadvantage is that the system requires the user to sketch features in a fairly specific fashion. We define a shape set of typical symbols, comprising curve, ellipse, triangle, square and open rectangle, as shown in Figure 4.1. We regard a line as a kind of curve.

Figure 4.1: Shapes of basic symbols: arc, ellipse, triangle, square, open rectangle.

We use the x and y values of the stroke points to determine its shape style, which involves two steps. For a curve or open rectangle, the distance between the begin point and the end point is large, whereas for an ellipse, triangle or square, the begin point is close to the end point. In this way, we can categorize the five shape types into two classes: open strokes and closed strokes.

The next step is to employ the geometrical properties of the different shape types to distinguish them. Every triangle, right-angled or not, will have at least two acute angles. A square contains four right angles, and an open rectangle has two angles, whether acute or obtuse. On a stroke, the line segments between a point and its two neighbouring points form an angle. These three points form a triangle; the length of each edge can be calculated, and from these the cosine of the angle. Since a sketch is drawn freehand, it may contain a lot of minor noise; for instance, a line may contain some sharp angles. A good way to avoid this noise is to include more neighbouring points in the calculation.
With this method, angle values along a stroke can be computed point by point. These values change smoothly for curves and ellipses, because they do not have the obvious angle traits of triangles, squares and open rectangles.

4.1.2 Choice of Features

Four features, the intrinsic properties of a stroke, are defined to interpret a single stroke:

• Feature 1 (f1): Normalized location in x direction.
Detailed information is listed i n Table 4.1 for face and Table 4.2 for rocket. In these tables, we also show possible drawing shapes for each component and the required shapes for some component during recognition. We set required shapes to facilitate recognition. After we aquire all the features for each stroke, we use a K - m e a n classifer to distinguish them.  25  Figure 4.2: .Features of head used for classification.  Order 1 2 3 4 5 6 7 8 9  Components  Possible Shapes  Required Shapes  face left eye right eye left eyebrow right eyebrow  ellipse ellipse ellipse curve curve  ellipse ellipse ellipse curve curve  nose mouth left ear right ear  curve, curve, curve, curve,  triangle, ellipse ellipse ellipse ellipse  Table 4.1: Description of components for face.  Order 1 2 3 4 5  Components  Required Shapes  head body tail left wing right wing  triangle open rectangle open rectangle open rectangle open rectangle  Table 4.2: Description of components for rocket.  26  4 •* *• 3 *• 9 *• 6  7  Figure 4.3: Illustration of face components.  4.1.3  Sketch Recognition: Matching  T h e gesture recognition can be regarded as classification problem for which we can apply K-means scheme as described i n Chapter 3 . There are C classes for classification, C = 9 for face and C = 5 for rocket. E a c h gestures i n one class should have some similarity and can be clustered together.  4.2  3D M o d e l Construction  In the modeling phase, the recognized strokes are used to construct 3 D models. Once classified, a stroke plays a key role i n determining model's geometrical construction. For our current system, curves and ellipses are rendered into ellipsoids, trianglea become cones and rectangles into cylinders. Figure 4 . 5 illustrates parameters used for modeling cylinder and ellipse sketches into 3 D models. There are 4 parameters for modeling a tapered cylinder: rl, h and nslices.  
h represents the cylinder's height, r1 and r2 are the radii of the two ends of the tapered cylinder, and nslices is the number of polygons used to represent the cylinder, which can be considered open on either end. If the shape of a sketch is a triangle, r1 is zero. The parameters representing an ellipsoid are r, h and nslices. The cross section of an ellipsoid along the z axis is an ellipse with minor axis r and major axis h. All radius and height values are taken from the axis-aligned bounding box around a stroke.

Figure 4.5: Modeling parameters for tapered cylinder and ellipsoid.

After sketch recognition in the previous stages, we have determined each component of an object and its shape. Because we have a priori knowledge about what kind of object is being modeled, once we identify each stroke we can locate its position relative to the other components; for instance, the nose and eyes are on the face while the ears are at the two sides of the face.

The following is an example description of how we model a face from raw sketches. Figure 4.3 shows the 9 strokes of a face. Stroke 1 is the head outline, and the resulting 3D head is modeled using an ellipsoid. Stroke 2 is the left eye and is modeled using an ellipsoid. Stroke 4 is the left eyebrow, modeled using a 3D curve. We assume the eyes, eyebrows, nose and mouth are all located on the surface of the head and that the ears are located on the two sides of the head. Even if the ear strokes are not drawn this way, we still model them as on a normal face. Since these models are positioned relative to the head model, their z values are calculated from the model of the head outline. For the head, the cross sections of the ellipsoid along the z and x axes are ellipses and the cross section along the y axis is a circle. The position (x, y, z) of any point on the ellipsoid can be computed from the mathematical equations of the ellipse and circle.
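The mapping from a classified stroke's bounding box to primitive parameters can be sketched as follows. The function name primitive_params, the (xmin, ymin, xmax, ymax) bounding-box format and the default nslices = 32 are illustrative assumptions, not the system's actual interface.

```python
def primitive_params(shape, bbox, nslices=32):
    """Map a classified stroke to 3D primitive parameters, following the
    text: curves/ellipses -> ellipsoids, triangles -> cones (tapered
    cylinders with r1 = 0), squares and open rectangles -> cylinders.
    bbox = (xmin, ymin, xmax, ymax); radii and heights come from this
    axis-aligned bounding box."""
    xmin, ymin, xmax, ymax = bbox
    w, h = xmax - xmin, ymax - ymin
    if shape in ("curve", "ellipse"):
        # ellipsoid: minor axis r and major axis h from the bounding box
        return {"type": "ellipsoid", "r": w / 2, "h": h / 2,
                "nslices": nslices}
    if shape == "triangle":
        # cone: a tapered cylinder whose top radius is zero
        return {"type": "tapered_cylinder", "r1": 0.0, "r2": w / 2,
                "h": h, "nslices": nslices}
    # square / open rectangle -> cylinder with equal end radii
    return {"type": "tapered_cylinder", "r1": w / 2, "r2": w / 2,
            "h": h, "nslices": nslices}
```

The object-specific knowledge then only has to place these primitives, e.g. projecting eye and nose models onto the head ellipsoid's surface.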
In the next chapter, we show some concrete examples of human faces and rockets that go from raw sketches to 3D models.

Chapter 5

Results

We test our methods with sketch-based modeling of simple faces and rockets. Results from our system are demonstrated in this chapter; we show examples of both successful and unsuccessful sketch-based modeling. We assume that a face always takes 9 strokes and a rocket 5 strokes; otherwise, the system will not work properly.

5.1 Face Sketching

Figure 5.1 shows three examples of faces drawn and recognized by our system. From left to right are the sketches, the front view of the 3D models and a rotated view of the 3D models. Figure 5.3 shows two faces drawn in our system with incorrect recognition; Table 5.1 lists the drawn and recognized components for these two examples.

Drawing   Example 1                        Example 2
Order     Drawn          Recognized        Drawn          Recognized
1         face           face              left eyebrow   left eyebrow
2         left ear       left ear          right eyebrow  right eyebrow
3         right ear      right ear         left eye       left eyebrow
4         left eyebrow   left eyebrow      right eye      right eye
5         left eye       left eye          nose           nose
6         right eyebrow  right eyebrow     mouth          mouth
7         right eye      right eye         face           face
8         nose           nose              left ear       left ear
9         mouth          left ear          right ear      right ear

Table 5.1: Comparison of drawn and recognized face components with incorrect recognition.

In example one, the "mouth" is mistaken for a "left ear" because the two strokes are drawn in the same shape and their other feature data are similar, which confuses the classifier. In example two, the "left eye" is wrongly recognized as a "left eyebrow" because the two are drawn too close to each other. One possible solution is to make each component unique during recognition. Each component can be marked, and if
there is another component also marked as "left eyebrow", we can choose the label with the smaller error according to the classifier.

5.2 Rocket Sketching

Three examples of rockets are shown in Figure 5.2. From left to right are the raw sketches, the front image of the rendered rockets and the rocket images after rotation.

5.3 Discussion

We set up a training library of sketches including faces and rockets; the size of this database affects recognition correctness. From Chapter 3, we know that the more examples the database contains, the more accurately sketches will be recognized. Our training data consists of 40 face sketches and 30 rocket sketches. Classifiers are obtained by applying K-means clustering to this data.

Figure 5.3: Face modeling with incorrect recognition.

Chapter 6

Conclusions and Future Work

The system has been tested on simple drawings such as human faces and rockets. To make the system work for a new class of objects, we have to specify how many components the object has and, from some training examples, collect the shapes that best symbolize each component. With this information, the system can create 3D models from simple sketches.

Our current algorithms and implementation are efficient enough for experimental use. However, they can fail or generate unintuitive results when the user draws unexpected strokes. We must devise more robust and flexible algorithms to handle a variety of user inputs. In particular, we plan to enhance the algorithm to allow multi-stroke input. Currently, we assume that each stroke represents one shape; if we allowed pen strokes to represent any number of connected shape primitives, the system could potentially cope with more complicated sketches. Another important research direction is to develop additional features to support a wider variety of shapes with arbitrary topology, and to allow more precise control of the shape.
Second, in the training phase, we would like to test additional classifiers in order to construct a more accurate classifier for matching strokes.
Appendix A

User Interface

The physical user interface of this sketch system is implemented with FLTK, a platform-independent graphics toolkit. The interface consists of a main display window on the left side and a control panel on the right. Most control actions are focused on drawing gestures and displaying the object; we use a two-button mouse for drawing and control. The top panel, called "Object Display", contains a number of buttons, whose functions are given below.

• sketch: prepare for sketching a new object or adding more sketches to a rendered object.
• solid shading: show the object with solid shading.
• render: render the whole set of sketches after finishing drawing an object.
• show normal: display the normal at each point of the object meshes.
• reset: clear the display window, ready to draw other objects.
• redisplay: show the original sketches.

In the middle there is an animation panel designed for showing some simple animations. Different training and drawing choice buttons are listed at the bottom of the control panel. In addition to gestures, we also use a few button widgets for auxiliary operations, such as save and load.

Figure A.1: The interface of the modeling system.

