UBC Theses and Dissertations


Interactive animation of the eye region Libório Cardoso, João Afonso 2016

Interactive Animation of the Eye Region

by

João Afonso Libório Cardoso

B.Sc., University of Coimbra, 2014

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate and Postdoctoral Studies (Computer Science)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

September 2016

© João Afonso Libório Cardoso 2016

Abstract

Humans are extremely sensitive to facial realism and spend a surprisingly large amount of time focusing their attention on other people’s faces. Thus, believable human character animation requires realistic facial performance. Various techniques have been developed to capture highly detailed actor performance or to help drive facial animation. However, the eye region remains a largely unexplored field and automatic animation of this region is still an open problem. We tackle two different aspects of automatically generating facial features, aiming to recreate the small intricacies of the eye region in real-time.

First, we present a system for real-time animation of eyes that can be interactively controlled using a small number of animation parameters, including gaze. These parameters can be obtained using traditional animation curves, measured from an actor’s performance using off-the-shelf eye tracking methods, or estimated from the scene observed by the character using behavioral models of human vision. We present a model of eye movement that includes not only movement of the globes, but also of the eyelids and other soft tissues in the eye region. To our knowledge this is the first system for real-time animation of soft tissue movement around the eyes based on gaze input.

Second, we present a method for real-time generation of distance fields for any mesh in screen space. This method does not depend on object complexity or shape, being only constrained by the intended field resolution. We procedurally generate lacrimal lakes on a human character using the generated distance field as input.
We present different sampling algorithms for surface exploration and distance estimation, and compare their performance. To our knowledge this is the first method for real-time or screen space generation of distance fields.

Preface

Versions of Chapter 3 have been published in the following:

• Debanga R. Neog, João L. Cardoso, Anurag Ranjan, and Dinesh K. Pai. Interactive gaze driven animation of the eye region. In Proceedings of the 21st International Conference on Web3D Technology, Web3D ’16, 2016.
• Debanga Raj Neog, Anurag Ranjan, João L Cardoso, and Dinesh K Pai. Gaze driven animation of eyes. In Proceedings of the 14th ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2015.

The ideas described in Chapter 3 are part of the EyeMove project, started before I joined the U.B.C. Department of Computer Science. The EyeMove project was funded in part by grants from NSERC, Peter Wall Institute for Advanced Studies, Canada Foundation for Innovation, and the Canada Research Chairs Program. This project was supervised by Prof. Dinesh K. Pai, and worked on by myself, Anurag Ranjan and Debanga Raj Neog. We worked independently on different parts of the project according to our fields of competence. I focused on the real-time aspect of the project, while Anurag Ranjan and Debanga Neog focused on capturing real world data and training generative models. Debanga Neog and I collaborated intensively in the intersection of these two realms of the project.

With that in mind, Sections 3.4 and 3.5 contextualize the remainder of the chapter by briefly describing work developed entirely by the remainder of the group. I was responsible for the majority of the implementation, with Section 3.6 being mostly implemented by Debanga Neog. Regarding writing, Sections 3.1, 3.2 and 3.3 are based on the aforementioned publications, where the corresponding writing had been done mainly by Prof.
Dinesh Pai.

Chapter 4 describes work that has never been submitted for publication. I alone worked on this chapter, which was reviewed by Prof. Dinesh Pai.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
  1.1 Overview of Contributions
2 Related Work
  2.1 Anatomy Background
    2.1.1 Structure
    2.1.2 Movement
  2.2 Geometry Reconstruction
  2.3 Interactive Animation
  2.4 Real-Time Rendering
3 Interactive Gaze Driven Animation of the Eye Region
  3.1 Introduction
  3.2 Skin Movement in Reduced Coordinates
  3.3 Factors Affecting Skin Movement
  3.4 Movement Capture and Tracking
  3.5 Generative Model of Skin Movement
  3.6 Transferring Animations
  3.7 Interactive Model Control
  3.8 Deformation by Globe Movement
  3.9 Interactive Motion Synthesis
  3.10 Results
  3.11 Legacy Hardware Considerations
4 Screen Space Distance Fields
  4.1 Motivation
  4.2 Background
    4.2.1 Traditional Screen Space Ambient Occlusion
    4.2.2 Image-Space Horizon-Based Ambient Occlusion
  4.3 Problem Definition
  4.4 Sampling Methods
    4.4.1 Equiangular Sampling on a Circle
    4.4.2 Exponential Line Search
    4.4.3 Backtracking Line Search
  4.5 Results
5 Conclusion
  5.1 Discussion and Future Directions
Bibliography

List of Tables

3.1 Blinking distributions
3.2 WebGL applications performance
4.1 Sampling algorithm performance

List of Figures

2.1 Structure of the eye globe
2.2 Anatomy of the extraocular muscles
2.3 Eye globe representation
2.4 Task based blink frequency
2.5 Effect of subsurface light transport
2.6 Iridal chromatic variations
3.1 WebGL application portability
3.2 Example of automatic facial animation generation
3.3 Overview of the spaces used for modeling skin movement
3.4 Body is the union of “skull” and globes
3.5 Gaze and skin tracking
3.6 Generative skin motion model
3.7 Reconstruction error
3.8 Mesh registration
3.9 Animation transfer
3.10 Globe mesh fitting
3.11 Spherical mapping
3.12 Fine details
3.13 Overall architecture of my applications
3.14 Application flow chart
3.15 Facial expressions wrinkles generated interactively
3.16 Skin deformation in eye closing during a blink
3.17 Static scene observation
4.1 Volumetric distance fields
4.2 Screen space ambient occlusion: sampling
4.3 Horizon based ambient occlusion: theory vs reality
4.4 Overview of the spaces used for sampling
4.5 Equiangular sampling distance fields
4.6 Exponential line search distance field
4.7 Two-pass metric schematic
4.8 Backtracking line search anti-aliasing
4.9 Aliasing
4.10 Visible geometry artifacts
4.11 Lacrimal lakes

Acknowledgements

First of all, I would like to thank those who have worked with me during my graduate studies. Special thanks to my supervisor, Dr. Dinesh K.
Pai, and Debanga Raj Neog, both of whom accompanied me throughout the whole journey. Your constructive criticism was always invaluable. I’d like to thank Dr. Dinesh K. Pai for teaching me about the research process and scientific writing, and Debanga Neog for all the times we brainstormed and chatted together. I would also like to thank Anurag Ranjan, whom I worked with, even if only for a brief period of time, and Paul Debevec from the WikiHuman Project, for providing the Digital Emily model used in our examples.

This goes out to my laboratory: thank you for having me around. To Prashant, for always being open to share his experience. To Cole, for putting up with our hardware requests. To Darcy, for your most valuable comments on my presentations in the laboratory meetings.

To Dr. Paul Kry, for having introduced me to the world of research and computer animation, inviting me to your lab and putting up with me even when I was just starting my undergraduate studies. I would never have pursued graduate studies in Canada if it were not for you. To Dr. Derek Nowrouzezahrai, for introducing me to physically based rendering and letting me study in your lab.

Lastly, I’d like to thank my family and friends. To all my longtime friends back in Europe and to the more recent ones in Vancouver. But especially to my parents, for without them I’d not be the same. Thank you for visiting me so often, even with me on the other side of the world. Thank you for all the guidance and insufferable nagging. Finally, to my sister, for by growing you always remind me of how long I’ve been gone.

Chapter 1

Introduction

Humans communicate by a mixture of body language, voice and facial expressions. Still, the most important visual communication aid is arguably the eyes: humans spend a surprisingly large amount of time looking at people’s eyes, as shown by the work of Alfred Yarbus (1967).
The eyes are extremely important because they communicate information regarding a person’s mental and physical state (e.g., attention, intention, emotion, health, fatigue). Hence humans are extremely perceptive of the region surrounding the eyes and all the small details it conveys.

Nowadays, computers can generate very realistic environments and highly detailed characters. Yet, human facial animation is still not on par with the rest of character animation. We are very sensitive to human realism, especially faces, and hence even small discrepancies in the appearance are glaringly obvious to us. These small discrepancies are quite worrisome, as they can cause a response of revulsion from observers - a phenomenon known as the uncanny valley, first described by Masahiro Mori et al. (2012) and coined by Jasia Reichardt (1978). This is a problem that the film and interactive media industries frequently struggle with. For example, the film Final Fantasy: The Spirits Within received negative reactions due to its near photo-realistic yet imperfect visual depictions of human characters (Eveleth, 2013). The humans in The Polar Express were heavily criticized by reviewers. Anderson (2004) described them as “creepy and dead-eyed (...) zombies”.

These discrepancies are consistently found in the ocular region. We define the ocular region as the combination of the eye globes, the upper and lower eyelids, the eyelashes, the canthus, the conjunctiva, the tear film and the periorbital soft tissues that surround the orbit. Besides the “dead eye” look caused by imperfect shading and/or animation of the globes, the ocular region is full of small features which, if not animated properly, will lead to uncanniness.

In the past few years, there have been some cases of exceptional digital doubles created for film: a famous example is Paul Walker’s double for the movie Fast and Furious 7, which was made when the actor died before the movie was completed.
Letteri (2015) explained that the features and performance were reconstructed by hand from old footage. The realism was further enhanced by blending the character performance with the real footage. Even still, 10% of the shots were made by imposing existing footage of the actor. Ed Ulbrich (2009), talking of the film The Curious Case of Benjamin Button, described the human head as the “holy grail of the film industry”. Although his team managed to create a remarkable digital character, it took intensive actor performance recording and facial scans, per-expression manual tweaking and 155 people over two years to achieve it. For example, an artist worked exclusively on the ocular region over two years. Automating this process remains a challenging problem. This is especially troublesome in interactive media, where the performance limitations are much more severe, actor performance cannot be directly used to animate characters, and not every possible shot can be handcrafted by an artist.

1.1 Overview of Contributions

As previously discussed, there is a very high time investment associated with animating all the small intricacies of the ocular region by hand. While research on modeling gaze - and hence globe motion - is quite mature, research on automatically animating the other components of the ocular region is almost non-existent. In this dissertation I explore models for interactive animation of some of these components in real-time based on real-world data or biophysical models.
In particular, I describe:

• A system for real-time animation of the eyelids, canthus and periorbital soft tissues that can be interactively controlled using a small number of animation parameters, including gaze.
• A system for real-time computation of distance fields in screen-space of any arbitrary geometry that does not depend on scene complexity.

From these two systems, I highlight as my personal major contributions:

• A real-time algorithm for skin sliding over the globes, which is able to recreate corneal displacement and reconstruct fine details from simplified skin representations.
• A model for intuitive interactive control of the factors that affect skin movement in the eye region.
• Three sampling algorithms for real-time generation of distance fields in screen-space.
• A proof of concept algorithm for procedural lacrimal lake generation based on distance to skin.

Chapter 2

Related Work

The synthesis of a realistic ocular region requires accurate modeling, animation and rendering of its underlying structure. In this section, I discuss background knowledge and relevant related work regarding each of these three topics.

2.1 Anatomy Background

2.1.1 Structure

According to Ruhland et al. (2014), the “eye globe is one of the most complex organs in the human body”, with multiple layers, each one devoted to performing a specific task. The transparent cornea, located in front of the eye, is the first refractive surface to the light entering the eye. The tear film moistens and softens the cornea’s surface to minimize distortion and scattering of light. The cornea is embedded in the opaque white sclera and separated by the limbus. The sclera tissue preserves the shape of the eye and protects against substances and pathogenic elements. The light passes through the pupil and is focused by the lens behind it onto the retina.
The primary function of the iris is to regulate the amount of light that reaches the retina as a function of the prevailing light conditions. The retina forms the inner, light sensitive part of the eye. The light captured at the retina is converted into electrical signals and transmitted to the brain for processing.

The pupil shape and diameter change with the contraction of the iris muscles. The lens shape changes with the contraction of the ciliary muscle, thereby increasing the optical power and accommodating the projection of nearby objects on the retina.

External to the globe, the medial canthus and lateral canthus are, respectively, the inner and outer corners of the eye where the upper and lower eyelids meet (Dorland, 1980). The lacrimal apparatus is the physiological system containing the structures for tear production and drainage (Gray, 2009). The lacrimal gland, close to the lateral canthus, secretes the tears onto the globe. The conjunctiva, located inside of the eyelids and covering the sclera, helps lubricate the eye by producing mucus and a smaller volume of tears. The nasolacrimal duct, near the medial canthus, drains the fluid into the cavity of the nose.

Figure 2.1: Structure of the eye globe. Reproduced from National Eye Institute (2016).

2.1.2 Movement

It is not commonly appreciated that the eyes have three full degrees of freedom, even though pointing the optical axis at a target requires only two. Rotations about the optical axis, called “torsion”, have been known and modeled at least since the 19th century. The six separate muscles that control eye globe motion, the extra-ocular muscles, are a surprisingly complex muscular system that allows the globe to perform a wide repertoire of movements (Leigh and Zee, 2015). While complex, these movements have been extensively studied by neurologists, psychologists and neuroscientists (Ruhland et al., 2014). Therefore, both their characteristics and the conditions in which they occur are well known and extremely well documented.

“Saccades”, arguably the most noticeable type of globe movement, are the rapid shifts in globe rotation that focus the gaze on targets of interest. They are characterized by extremely rapid initial acceleration and final deceleration (Becker, 1989). Humans make an average of three saccades every second, and use slow movements called “smooth pursuit” to track small moving targets. The “vestibulo-ocular reflex” is responsible for stabilizing the eyes during head motion. It occurs with extremely short latency and hence it can be considered as effectively simultaneous with head movement (Leigh and Zee, 2015). The smooth pursuit system serves to stabilize moving images on the retina. It has a higher latency than the vestibulo-ocular reflex but a lower one than saccades. Unlike these, smooth pursuit is situational and consequently less common. Vergence occurs when a target lies near the visual mid-line. It consists of a convergence of the globes’ rotations to maintain single binocular vision. Vergence movements are far slower than saccades.

The two most studied lid movement types are blinks and lid saccades. There are three types of blinks: spontaneous, voluntary, and reflexive. While there is a high variation in blink rates, their frequency has been the subject of a wide variety of studies. They have been linked to cognitive state and activity (Skotte et al., 2007; Stern et al., 1984), fatigue (Anderson et al., 2010; Johns et al., 2007), lying (Burgoon et al., 2016), speech production (Nakano and Kitazawa, 2010) and others. Blinks are characterized by a quick down-phase, which causes an almost complete closure of the lids, followed by an up-phase that is approximately twice as slow.
Lid saccades, on the other hand, do not exhibit as marked an asymmetry between down and up-phases as blinks do.

2.2 Geometry Reconstruction

Due to the aforementioned complexity of the ocular region, modeling it manually is no simple task. Consequently the region is commonly grossly simplified. Eyes are traditionally modeled as spherical shapes and high resolution pictures of human eyes are used as texture maps (Itti et al., 2004; Weissenfeld et al., 2010). According to Bérard et al. (2014), these typically generic eye models are insufficient for capturing the individual identity of a real human.

Figure 2.2: Anatomy of the extraocular muscles. Reproduced from OpenStax (2013).

Some methods that produce more anatomically correct models of the eye, particularly the iris, have been proposed. Sagar et al. (1994) presents a model for surgery simulation, where the iris is represented as two layers (ciliary and pupillary) of fibers with opposite curvatures. A Gaussian perturbation is used on the ciliary fibers as the pupil dilates. Retinal blood vessels are generated on a spherical sclera using fractal trees (Oppenheimer, 1986). Lefohn et al. (2003) is able to synthesize human eyes, most notably the iris, using knowledge from the field of ocular prosthetics. Eyes are synthesized by stacking multiple layers of dots, radial smears or radial spokes, in a similar fashion to what ocularists do. François et al. (2007) presents a method to recover the iris structure and scattering features from a single photograph, taking into account reflection and refraction at the corneal surface based on the ambient light.

Moriyama et al. (2006) presents a different take on the problem. It suggests a parametrization of the individuality of the ocular region as a simplified set of features, such as iris size and eyelid skewness.
It is then capable of identifying such parameters from real world images while tracking gaze and eyelid opening across multiple frames. This data can be used to reconstruct simplified geometry.

More recently, there have been major improvements on capture and reconstruction of the ocular region. Notably, Bérard et al. (2014) is able to capture and reproduce all the intricacies of a subject’s sclera, cornea and iris. Bermano et al. (2015) captures and outputs time-varying high-resolution eyelids and is able to reproduce the eyelid fold, even under complex deformation, folding and strong self-occlusion.

Considerable research in the past decade has been focused around facial simulation and performance capture. Physically based deformable models for facial modeling and reconstruction include the seminal work of Terzopoulos and Waters (1990). Synthesis of high definition textures using a generative Bayesian model has been discussed in Tsiminaki et al. (2014). The majority of recent work has been focused on data driven methods. Some state-of-the-art methods focus on obtaining realism based on multi-view stereo (Beeler et al., 2011; Bickel et al., 2007; Furukawa and Ponce, 2010; Ghosh et al., 2011; Wu et al., 2011). This data can be used to drive blendshapes (Fyffe et al., 2013). Some of the work is based on binocular (Valgaerts et al., 2012) and monocular (Garrido et al., 2013; Shi et al., 2014) videos. Recent work by Li et al. (2013b) described a system for real-time and calibration free performance driven animation.

Figure 2.3: Left: generic spherical eye representation. Right: higher order approximation of an individual eye, as presented in Bérard et al. (2014).

Several methods are specifically targeted towards robust tracking of facial movements. Some of the recent work includes using Active Appearance models (Koterba et al., 2005; Xiao et al., 2004) and physically based methods (Decarlo and Metaxas, 2000).
Much of facial tracking includes sparse tracking of different features across the face in a learning based approach. Sparse tracking methods, such as feature based tracking and AAMs, fail to capture the detailed and complex motion of skin, especially around the eyes.

2.3 Interactive Animation

Most of the attention on eyes has been on modeling gaze, especially the gaze behavior of a character in a virtual environment. Bahill et al. (1975) correlates the magnitude of saccades with some of their properties, such as duration and peak velocity, by looking into experimental data. Harwood et al. (1999) presents a model for the shape of saccade trajectories as a function of their duration. Blohm et al. (2006) proposes a model of the smooth pursuit system, which makes use both of saccadic and smooth movements to track moving objects. Yeo et al. (2012) implement a smooth pursuit system to animate human characters performing fast visually guided tasks.

Eye blinks are also a widely studied phenomenon. Flash and Hogan (1985) describes the well-known bell-shaped profile of muscle-propelled motion. Evinger et al. (1991) describes the relation of blink amplitude to blink maximum velocity and phase duration. Trutoiu et al. (2011) tracks inter-eyelid distance on test subjects using a high-speed video camera. It then studies blink duration and eyelid closure profile variation across different subjects. Ishimaru et al. (2014) analyzes head motion patterns and blink frequency associated with performing different activities. It is then able to recognize activities with some success given the head motion and blink frequency.

Figure 2.4: Blink frequency when performing different tasks. Reproduced from Ishimaru et al. (2014).

The dynamic behavior of the pupil has also been studied. Pamplona et al. (2009) builds a model of pupil deformation as a function of the lighting of the environment. Agustin et al.
(2006) measures pupil brightness in multiple subjects as a function of gaze direction.

On the other hand, very few papers deal specifically with animating the ocular region. Pinskiy and Miller (2009) presents an anatomically motivated approach for the region. It is able to procedurally produce deformations of the skin surrounding the eye due to eyelid closure and gaze direction. The same lack of attention has been given to human tears animation. Most research on this topic deals almost exclusively with the animation of teardrops flowing on the human face. Hence the great majority of existing work applicable to human tears consists of generic methods for droplet flow on surfaces. For example, Müller et al. (2003); Wang et al. (2005); Zhang et al. (2012) present different physically-based models to simulate small-scale fluid phenomena in contact with solid surfaces, which could be used to simulate teardrops. Jung and Behr (2009) present an image-space method for real-time simulation of droplet flows on 3D surfaces, optimized for GPU processing, and show its effectiveness on teardrop simulation. Chen et al. (2012) presents a hybrid method between a particle system and image-space model for simulating water droplets on a glass pane at interactive rates. van Tol and Egges (2012) develops a variant of the Smoothed Particle Hydrodynamics method optimized for real-time tear generation and control.

2.4 Real-Time Rendering

One of the biggest challenges of rendering human skin stems from the fact it is composed of multiple different semi-translucent layers: the epidermis, the dermis and the hypodermis. When light reaches the surface of the skin, part of it is scattered by interacting with the translucent materials, traverses the layers, and exits the surface at different points. This effect is known as subsurface scattering.

There are well established methods for physically based rendering of scattering inside materials.
A traditional approach is to use the classical dipole scattering model from radiative transport and medical physics, in the two part form proposed by Jensen and Buhler (2002). This model is based on the assumption that light entering a material will go through many internal scattering events. In this situation, the diffusion theory is applicable and an analytical solution to the subsurface scattering profile becomes possible. This makes it possible to simulate the scattering effect without having to account for the large number of individual internal propagation events. However, the dipole model does not address the multilayered structure of the skin or similar materials. For that purpose, Donner and Jensen (2005) proposed a multipole model, which allows the rendering of layered thin translucent slabs.

However, handling multiple scattering events is still too computationally expensive for real-time rendering. Luckily, research on real-time skin rendering is quite mature, and several methods for approximating the effect of skin subsurface scattering under a time constraint have been proposed. Mertens et al. (2005) approximates the dipole scattering model as a Gaussian filter blurring operation on a 2D diffuse irradiance texture. d’Eon and Luebke (2007) enhances this idea by approximating the multi-pole model with a sum of Gaussians. Yet these methods scale poorly with scene complexity, since the subsurface scattering shading needs to be performed on a per-object basis. Jimenez et al. (2009) solves this issue by performing the operation in image space, which also limits computations to the visible parts of objects. Jimenez et al. (2015) further optimized the method, making it implementable in a post-processing stage using two 1D convolutions and a low sample rate.
Penner and Borshukov (2011) avoids Gaussians entirely by pre-integrating the illumination effects of subsurface scattering due to curvature into the shading model, assuming screen space object normals have been pre-rendered. Curvature is estimated using two mipmap levels of the normal map.

Figure 2.5: The effect of subsurface light transport, reproduced from Jimenez et al. (2015). (a) No subsurface scattering. (b) Subsurface scattering.

A major issue with all the previous real-time models is that they do not convey the effect of light traversing an object from its back to its visible surface. Jimenez et al. (2010b) addresses this issue separately. It presents an approximate model of the ratio of light transmitted given the distance traversed, and approximates that distance as the distance between the point and its projected location on the light source shadow map.

Recent work takes real-time rendering one step further. For example, Jimenez et al. (2010a) proposes a model of skin rendering taking into account concentrations of two chromophores, melanin and hemoglobin, which allows for dynamic control of skin color. Nagano et al. (2015) synthesizes the effects of skin micro-structure deformation by anisotropically convolving the micro-structure displacement map.

Eye rendering is not as mature as skin rendering. As described in Section 2.1, the complexity of the eye allows for very different geometrical models of the eye. These, in turn, lead to different rendering approaches, not all of which are usable for interactive rendering. For example, Lefohn et al. (2003) renders the eye by ray tracing through cones with transparent and opaque textures, which is not practical for interactive rendering.
In contrast, Vill (2014) models the eye as a single outer surface and ray traces from it an explicit procedural concave surface, which represents both the iris and the pupil.

Figure 2.6: Synthesized iridal chromatic variations, reproduced from Lam and Baranoski (2006).

Lam and Baranoski (2006) presented the first biophysically based light transport model of the human iris. It takes into account the iridal morphological and optical characteristics to compute the light scattering and absorption processes occurring within the iridal tissues. Biophysical attributes can be controlled to achieve different iris coloration. Yet it does so by dividing the iridal tissues into four layers (aqueous humour, ABL, stromal layer and IPE) and using a Monte Carlo ray-tracing approach.

Chapter 3
Interactive Gaze Driven Animation of the Eye Region

3.1 Introduction

In this chapter I describe our system for real-time animation of the eyelids, canthus and periorbital soft tissues that can be interactively controlled using a small number of animation parameters, including gaze. My first two contributions are part of this system. Our goal is to model the movements of the skin around the eyes, because it is the most important part of the face for conveying expression. Therefore it is worthwhile to design a model specifically for this region, while other parts of the face may be modeled by more traditional methods.

Our system has two motivations. First, since there are no articulating bones in the eye region, the skin slides on the skull almost everywhere. Therefore, we would like to efficiently model this skin sliding. The exception is where it slides over the globes, which is the subject of my first contribution. Second, we would like the model to be easily learned using single-camera videos of real human subjects.

Our system also makes use of my second contribution, a generalized control model of the factors that affect skin movement.
Thus, the animation parameters used by our model can easily be obtained using traditional keyframed animation curves, measured from an actor's performance using off-the-shelf eye tracking methods, or estimated from the scene observed by the character, using behavioral models of human vision.

Sections 3.2 and 3.3 clarify the core concepts used in the entire EyeMove project. Sections 3.4 and 3.5 contextualize the remainder of the chapter by briefly describing our actor capture system and skin motion generative model. Section 3.6 describes our system for transferring animation from one character to another. The remainder of the chapter focuses on my main contributions to the EyeMove project, which I alone worked on.

Figure 3.1: My WebGL application renders a character in real time and runs on most computers (a) and mobile devices (b).

Figure 3.2: Example of automatic facial animation generation while watching a hockey video. The character starts watching the match with a neutral expression (a) and becomes concerned when a goal is scored (b). Eye movements were generated automatically using salient points computed from the hockey video.

3.2 Skin Movement in Reduced Coordinates

Figure 3.3: Overview of the spaces used for modeling skin movement: skin space, physical space, skin atlas, and camera coordinates.

To represent the motion of skin, we use the reduced coordinate representation of skin introduced by Li et al. (2013a). This representation constrains the synthesized skin movement to always slide tangentially on the face, even after arbitrary interpolation between different skin poses. This avoids cartoonish bulging and shrinking and other interpolation artifacts. We will see in Section 3.9 that deformation perpendicular to the face can be achieved where needed, for example in the movement of the eyelid.
This representation also reduces the data size in most computations.

Skin is represented by its 3D shape in a reference space called skin space. This space is typically called “modeling space” in graphics and “material space” in solid mechanics. The key idea is that since skin is a thin structure, we can also represent it using a 2D parameterization π, using an atlas of coordinate charts. In our case a single chart is sufficient (see Figure 3.3). We call this chart the skin atlas. It can be thought of as the skin’s texture space.

Skin is discretized as a mesh S = (V, E), a graph of $n_v$ vertices V and $n_e$ edges E. In contrast to Li et al. (2013a), this is a Lagrangian mesh, i.e., a point associated with a vertex is fixed in the skin space. Since most face models use a single chart to provide texture coordinates, these coordinates form a convenient parameterization π. In a slight departure from the notation of Li et al. (2013a), a skin material point corresponding to vertex i is denoted $X_i$ in 3D and $u_i$ in 2D coordinates. We denote the stacked arrays of points corresponding to all vertices of the mesh as X in 3D skin space and u in the 2D skin atlas.

The skin moves on a fixed 3D body corresponding to the shape of the head around the eyes. Instead of modeling the body as an arbitrary deformable object as in Li et al. (2013a), we account for the specific structure of the hard parts of the eye region. We model the body as the union of two rigid parts:

• The skull, a closed mesh corresponding to the anatomical skull with the eye sockets closed by a smooth surface.
• The mobile globes, two closed meshes corresponding to the outer surface of the eyeballs.

This representation allows us to efficiently parameterize changes in the shape of the body using the rotation angles of the globe and train the model independently of globe geometry.
It is also useful for synthesizing gaze-driven deformation, as described in Section 3.8.

Figure 3.4: Body is the union of “skull” and globes.

The skin and body move in the physical space, which is the familiar space in which we can observe the movements of the face, for instance, with a camera. For modeling, we assume there is a head-fixed camera with projective transformation matrix P that projects a 3D point corresponding to vertex i (denoted $x_i$) into 2D camera coordinates $u_i$, plus a depth value $d_i$. This modeling camera can be either a real camera used to acquire video, as described in Section 3.4, or a virtual camera used as a convenient way to parameterize the skin in physical space. We note that P is invertible, since P is a full projective transformation, and not a projection. We denote the stacked arrays of points corresponding to all vertices of the mesh as x in 3D physical space and u in the 2D camera coordinates.

During movements of the eyes, the skin in the eye region slides over the body. It is this sliding we are looking to model. Following the standard notation in solid mechanics, the motion of the skin from 3D skin space to 3D physical space is denoted φ (see Figure 3.3). Therefore, we can write x = φ(X). However, directly modeling φ is not desirable, as it does not take into account the constraint that skin can only slide over the body, and not move arbitrarily in 3D. Instead, the key to our reduced coordinate model is that we represent skin movement in modeling camera coordinates. In other words, we model the 2D transformation:

$$u = P(\phi(\pi(u))) \overset{\mathrm{def}}{=} f_g(u) \tag{3.1}$$

Our goal is to directly model the function $f_g$ as a function of input parameters, g, such as gaze and other factors that affect skin movement around the eyes.
This representation has the dual advantage of enforcing the sliding constraint and making it easy to acquire video data from which to learn how the skin moves, as shown in Section 3.4.

3.3 Factors Affecting Skin Movement

We now examine the different input variables g that determine skin movement in the eye region. The most important and dynamic cause is eye movements that change gaze, in other words, that change what the character is looking at. However, other parameters, such as eyelid aperture and expressions, also affect the skin. We combine these into the “generalized” gaze vector g.

In the rest of this chapter, we assume g is an $n_i \times 1$ column matrix, where $n_i$ is the total number of possible inputs. Submatrices are extracted using Matlab-style indexing, e.g., $g_{1:3}$ is the submatrix comprised of rows 1 to 3, and $g_{[4,5]}$ is a submatrix with just the fourth and fifth elements.

Gaze. We represent eye motion as a 3D rotation around the center of the globe. Any parameterization of 3D rotations could be used, but we use the coordinates from Fick (1854), which are widely used in the eye movement literature to describe the 3D rotation of the eye, since they factor the torsion in a convenient form. These are a sequence of rotations: first horizontal ($g_1$), then vertical ($g_2$), finally torsion ($g_3$).

Eyelid Aperture. Eyelid movements are affected by both gaze and other factors. When our gaze shifts, eyelids, especially the upper eyelids, move to avoid occluding vision. We also move our eyelids to blink, and when expressing mental states such as arousal, surprise, fatigue, and skepticism. The upper and lower eyelids move in subtly different ways. Therefore, we use two additional input parameters to define aperture. One is the displacement of the midpoint of the upper eyelid above a reference horizontal plane with respect to the head ($g_4$); the plane is chosen to correspond to the position of the eyelid when closed.
The other input is the displacement of the midpoint of the lower eyelid below this plane ($g_5$).

Expressions. The skin in the eye region is also affected by facial expressions, such as surprise, anger, and squint. We can optionally extend the input parameters g to include additional parameters to control complex facial expressions. Expressions may be constructed using action units (AUs), defined by the Facial Action Coding System (FACS), first proposed by Ekman and Friesen (1977). In our implementation, action units are used in a similar way to blend shapes: they may be learned from ‘sample poses’ that a subject is asked to perform, or they could be specified by an artist. The strength of the i-th action unit used in the model contributes an additional input parameter, $g_{i+10} \in [0, 1]$. Note that we defined five parameters per eye (3 gaze and 2 aperture), which together contribute the first 10 inputs.

3.4 Movement Capture and Tracking

To train a model, we first need to capture real-world data. I briefly describe the system we used for tracking skin motion and gaze of human subjects, purely to provide context.

Our setup (shown in Figure 3.5) only requires a single high frame rate RGB camera. We seat subjects with their chins firmly rested in front of the camera and ask them to look at markers set up around the environment. We then ask them to follow a point on a screen placed between them and the camera, and finally to perform some expressions. This process takes about 40 seconds.

We also generate a 3D mesh of the subject using a Kinect camera and the commercial program Faceshift. By manually selecting a few landmarks on the recorded video and the mesh, we are able to project the mesh into the camera coordinates. We then track some of the mesh vertices across the frames while, at the same time, also tracking the gaze direction.
Additionally, we have a system capable of tracking wrinkle lines and reconstructing their shape for any video frame.

Figure 3.5: Left: gaze and skin tracking setup. Right: tracked vertices displayed on top of a recorded video.

3.5 Generative Model of Skin Movement

I now describe the skin motion model learned from the training data, purely to provide context.

Learning the model directly from gaze parameters resulted in overfitting and did not perform well for wide ranges of gaze parameters. We observed that the deformation of skin in the eye region is well correlated with the shape of the eyelid margin. This makes biomechanical sense, since the soft tissues around the eye move primarily due to the activation of muscles surrounding the eyelids, namely the orbicularis oculi and levator palpebrae muscles. Following these observations, we factored the generative model into two parts: an eyelid shape model and a skin motion model. Each of these two models can be constructed or learned separately. A schematic diagram of the implementation is shown in Figure 3.6.

Figure 3.6: We predict the eyelid shape from gaze and aperture. We then predict skin motion from the eyelid shape and expression affects.

We explored two different modeling approaches: neural networks of radial basis functions and multivariate linear regression (MLR). Although the neural networks achieved a lower normalized reconstruction error with just one radial basis function layer (see Figure 3.7), the difference did not justify the significant additional computational cost, as the reconstruction error from the linear approach is already quite low. Errors were computed using cross-validation with randomly picked data points from the training data. Hence, to achieve real-time performance, we chose multivariate linear regression to exploit GPU computation and keep evaluation cost low.
The advantages of MLR are shown in greater detail in Section 3.7.

We also reduced the dimension of the training data before training the models. We tested a variety of sophisticated dimensionality reduction methods, including probabilistic principal component analysis, neighborhood component analysis and maximally collapsing metric learning. Yet simple principal component analysis provided the best results.

Figure 3.7: Reconstruction error for the two models depending on the number of principal components used. (a) Eyelid shape model. (b) Skin motion model.

Eyelid Shape Model. We define the eyelid shape for each eyelid as a piecewise cubic spline curve. We found that using between 17 and 22 control points for each spline faithfully captures the shape of the eyelids. The eyelid shape depends on both gaze and aperture. The general form of the model is:

$$l = K \ddot{l} = K L \, g_{1:10} \tag{3.2}$$

where l is the column matrix of coordinates of all control points for the eyelid and $\ddot{l}$ is its reduced principal component representation. We found that the results are well approximated using either 5 or 13 principal components, as additional components do not considerably improve the results (see Figure 3.7).

Skin Motion Model. The skin of the eye region is modeled at high resolution (using about a thousand vertices in our examples) and is deformed based on the movement of the eyelids and expression data. Note that the skin motion depends on all four eyelids. The stacked vector of coordinates of all eyelids and expression action units is denoted l. To generate expression data for training, we manually mark sample poses for each expression and the preceding neutral poses (see Section 3.4). We linearly vary each action unit from 0 to 1 between the two corresponding marked poses. The resulting model is:

$$u = u_0 + J e = J M l \tag{3.3}$$

where e is the reduced principal component representation of u and $u_0$ is the position of the skin at the origin of the principal component space. We found that the results are well approximated using only 4 principal components, as additional components do not considerably improve the results (see Figure 3.7).

3.6 Transferring Animations

Figure 3.8: Overview of mesh registration. The target character mesh (red) is registered non-rigidly on the capture subject mesh (blue) shown in the top row. Image coordinates of the target mesh are computed from the image coordinates of the model output using the barycentric mapping computed during registration.

Figure 3.9: Models trained on one subject can be used to generate animations on arbitrary character meshes of any topology.

The skin motion model of Section 3.5 is constructed for the captured subject and a specific facial scan of his or her face. While facial scanning is now becoming widely available on commodity hardware (Weise et al., 2011), training a skin motion model still requires our more complex capture setup (see Section 3.4). Animating artist-made fictional characters is also an issue. Finally, the skin model may not include some parts of a given character, such as the inner eyelid margins, which are difficult to see and track in video (see Figures 3.5 and 3.8).

Here we discuss how the information in the generative model can be transferred to other target characters and to untracked parts of the ocular region.
The majority of the work in this section was done by Debanga Raj Neog, with me contributing ideas mostly towards the movement extrapolation process.

Character Transfer. Given a new target character mesh with topology different from the captured subject mesh (the 3D face mesh of the subject for whom the model was constructed), we have to map the model output u to new image coordinates $\tilde{u}$ representing the motion of the new mesh in image coordinates. The map is computed as follows: we first use a non-rigid ICP method (Li et al., 2008) to register the target mesh to the captured subject mesh in 3D. The resulting mesh is called the registered mesh. The vertices of the registered mesh are then snapped to the nearest faces of the captured subject mesh. We compute the barycentric weights of the registered mesh vertices with respect to the captured subject mesh, and construct a sparse matrix B of barycentric coordinates that can transform u to $\tilde{u}$ such that:

$$\tilde{u} = B u \tag{3.4}$$

Finally, the registered mesh is projected to the image space using the projection matrix P to obtain $\tilde{S}$.

Movement Extrapolation. Some vertices of the captured subject mesh, particularly those of the inner eyelid margin, are not included in the skin movement model computed in Equation 3.3. This occurs because such vertices are difficult to track in video. As such, a data-driven model cannot be built for them. Instead, we compute the skin coordinates of untracked vertices as a weighted sum of nearby tracked vertices. The general form is:

$$\tilde{u}^{+} = B^{+} \tilde{u} \tag{3.5}$$

where $\tilde{u}^{+}$ is a vector containing both the image coordinates of the tracked vertices in the new mesh, $\tilde{u}$, and those of the newly added untracked vertices.
For extrapolation, we use normalized weights proportional to the inverse distance to the neighboring points in the starting frame.

3.7 Interactive Model Control

One issue with our model, as defined in Section 3.5, is its input parameters. While powerful, they are neither consistent nor easy to control: first, aperture is defined in the capture camera space, which is not consistent between different subjects or data capture sessions; second, gaze and aperture need to be controlled simultaneously to achieve realistic skin movements. Additionally, the skin motion model is not optimized for evaluation performance, making it unusable in real-time situations. The animation transfer operations defined in Section 3.6 create additional overhead.

In this section I describe my second contribution, which presents solutions to these issues.

Aperture Normalization. I want to specify aperture in a format $\tilde{g}$ that is independent of capture conditions and training subject performance. I define aperture ∈ [0, 1] as we did for expressions. That is, an aperture input of 0 must correspond to having the eye closed, while an aperture input of 1 corresponds to having the eye wide open.

Recall that we asked subjects to close and open their eyes during capture (see Section 3.4) and that eyelid aperture is defined by the four input parameters $g_{[4,5,9,10]}$ (see Section 3.3). I define the following heuristic for eye closure:

$$\iota = |g_4 - g_5| + |g_9 - g_{10}| \tag{3.6}$$

I find the frames with the highest and the lowest ι scores in the training data. These two correspond to the moments when the eyes were closed and open, respectively. Let $g^c$ and $g^o$ be the input parameters associated with these two frames. Conversion from the general input parameters $\tilde{g}$ into the parameters g used by the eyelid shape model becomes, for the left eye:

$$g_{[4,5]} = g^c_{[4,5]} + (g^o_{[4,5]} - g^c_{[4,5]}) \, \tilde{g}_{[4,5]} \tag{3.7}$$
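As a minimal sketch of Equations 3.6 and 3.7 (an illustration, not the thesis implementation), the normalization could be written as follows. The function names and the plain-list layout of g with 0-based indices are assumptions made here. The frame selection takes the convention stated in the text at face value (highest ι for the closed frame, lowest for the open frame), and only the left-eye rows [4, 5] are remapped, as in Equation 3.7.

```python
def closure_score(g):
    """Equation 3.6: iota = |g4 - g5| + |g9 - g10| (1-based in the text, 0-based here)."""
    return abs(g[3] - g[4]) + abs(g[8] - g[9])


def pick_reference_frames(frames):
    """Return (g_closed, g_open) from a list of captured input vectors.

    Follows the text's stated convention: the frame with the highest score is
    taken as 'closed' and the frame with the lowest score as 'open'.
    """
    g_closed = max(frames, key=closure_score)
    g_open = min(frames, key=closure_score)
    return g_closed, g_open


def denormalize_aperture(g_tilde, g_closed, g_open, rows=(3, 4)):
    """Equation 3.7: map normalized aperture entries in [0, 1] to capture space.

    rows holds the 0-based indices of the aperture entries to remap
    (left eye by default); all other entries are copied unchanged.
    """
    g = list(g_tilde)
    for r in rows:
        g[r] = g_closed[r] + (g_open[r] - g_closed[r]) * g_tilde[r]
    return g
```

With this mapping, a normalized aperture of 0 reproduces the closed reference frame and 1 reproduces the open one, so the same control values can be reused across subjects and capture sessions.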
Optimized Motion Model. Notice that all of the operations described so far are linear, and were represented as vector additions or matrix multiplications. Consider g to be a point in a Euclidean space of dimension 10 + i, where i is the number of action units used. One can then use a homogeneous coordinate representation to replace vector addition operations with matrix multiplications as well. If I were to pre-multiply all the aforementioned matrices, I would obtain a new matrix U such that:

$$\tilde{u}^{+} = U \tilde{g} \tag{3.8}$$

This implies U is an $\tilde{n}_v$ by 10 + i matrix, where $\tilde{n}_v$ is the number of vertices in the target character mesh. Now the entire pipeline can be run in parallel on a per-vertex basis, which is ideal for exploiting GPU computational power. Yet recall that we represented vertex motion in 4 principal directions, as described in Section 3.5. Thus I can further optimize computations. I define instead the general model:

$$\tilde{u}^{+} = \tilde{U} W \tilde{g} \tag{3.9}$$

where

$$e = W \tilde{g} \tag{3.10}$$

This results in a small 4 by 10 + i matrix W and a tall $\tilde{n}_v$ by 4 matrix $\tilde{U}$, which reduces per-vertex data to a 4-dimensional vector, perfect for GPU computations.

Baseline Aperture Model. As discussed previously, the eyelid aperture is influenced by gaze direction. Testing the system showed that maintaining a static aperture produces unrealistic results, but manually controlling aperture to match gaze movements is impractical. I want aperture to depend on gaze, but still be able to control eyelid opening and closing to produce voluntary actions such as partial eye closure or wide eye opening.

We used multivariate linear regression to train a linear baseline aperture model, A, for predicting the aperture due to gaze. Since the torsion angle of the globe does not have a significant effect on aperture, I only use the first two components of gaze.
As a consequence of the aforementioned normalization of the aperture to the range [0, 1], the baseline aperture can then be scaled by an eye closing factor c ≥ 0 to simulate blinks, frowning, arousal, etc. The resulting model for the left eye is:

$$\tilde{g}_{[4,5]} = c \, A \, \tilde{g}_{[1,2]} \tag{3.11}$$

3.8 Deformation by Globe Movement

Figure 3.10: Visualization of the effects of globe mesh fitting. Without deforming (a), intersections between the globe and skin geometry are glaringly obvious. Eyelid depth and corneal subsurface deformations are not present.

Our skin motion model predicts motion of the skin in 2D modeling camera coordinates $\tilde{u}^{+}$, which correspond to the 3D coordinates $\tilde{x}^{+}$ of the character skull mesh. However, recall that we model the body as the union of the skull and the globes (see Section 3.2). Our skin motion model does not take into account the geometry of the globes. The reason behind this separation of the body is that, as described in Section 2.2, the globe is not spherical and has a prominent cornea. Thus, rotations of the globe should produce subsurface deformations of the skin. I want to recreate these effects in a computationally efficient manner, while also preventing geometry intersections between the skin and the globes. However, computing geometry intersections would be far too computationally intensive for this task.

To address this challenge, I present my first contribution: an efficient method that relies on computing the distance from any point to the globe’s surface.

Deformation in Polar Space. As the globe is still similar to a sphere, I represent the shape of the globe in polar coordinates in the globe’s reference frame. Define a radial distance map D of the globe’s outer surface to the globe center in polar coordinates.
This map can be sampled to obtain the globe’s radial distance δ as a function of polar coordinates α:

$$\delta \approx D(\alpha) \tag{3.12}$$

Then, for any point with polar coordinates α and Cartesian coordinates r in the globe’s reference frame, the distance to the eye surface can be approximated as:

$$|r| - (\delta + t) \tag{3.13}$$

A negative value means the point is inside the globe. Here t is a user-defined skin thickness at the point. For each frame, I radially displace vertices inside the globe at its current orientation, which allows the skin to slide on the globe and prevents geometry intersections. Additionally, I am able to dynamically control skin thickness.

Figure 3.11: Spherical coordinates are computed for each globe vertex (a). Distance to the globe center is mapped according to the remaining coordinates (b).

Preserving Fine Details. Some regions of the face should not slide directly on the skull or globe geometry. For example, the canthus is located deeper in the skull than the surrounding regions, while the eyelid margins must be at a distance from the globes. Luckily, all these details are located in the extrapolated regions. As such, I handle vertices in these regions differently. For each extrapolated vertex, I precompute a heuristic $\delta_0$, which is the radial distance between the original vertex position and the globe surface:

$$\delta_0 = |r_0| - D(\alpha_0) \tag{3.14}$$

where $r_0$ is the original position of the vertex in the globe’s reference frame and $\alpha_0$ is the same position in polar coordinates. I then radially offset the skin vertex position by $\delta_0$, which allows me to approximately reconstruct the eyelid margin thickness and canthus depressions.

Figure 3.12: Examples of details that would be lost if all the skin slid directly on the skull and globes. (a) Canthus. (b) Eyelid margin.

3.9 Interactive Motion Synthesis

Our framework is well suited for real-time synthesis using GPUs.
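The globe displacement of Section 3.8 illustrates why: each vertex needs only one map lookup and a few arithmetic operations. The following Python sketch shows that per-vertex step under stated assumptions. It uses the target radius D(α) + t + δ0 and the displacement test from Algorithm 1. The function names, the callable `radial_map` standing in for the texture D, and the use of `atan2` in place of tan⁻¹(r_x / r_z) are choices made here for illustration; the shader code itself is not given in the text.

```python
import math


def to_polar(r):
    """Return (azimuth, inclination, length) of r in the globe's reference frame.

    Assumes r is not the zero vector. atan2 is used instead of a plain
    arctangent of rx / rz so that the quadrant is preserved (an assumption).
    """
    rx, ry, rz = r
    length = math.sqrt(rx * rx + ry * ry + rz * rz)
    azimuth = math.atan2(rx, rz)
    inclination = math.acos(ry / length)
    return azimuth, inclination, length


def displace_on_globe(r, radial_map, thickness=0.0, delta0=None):
    """Radially push a skin vertex out of the globe.

    r          -- vertex position in the globe's reference frame
    radial_map -- callable (azimuth, inclination) -> radius, standing in for D
    thickness  -- user-defined skin thickness t
    delta0     -- precomputed offset for extrapolated vertices, or None

    Returns (new_position, displaced_flag).
    """
    azimuth, inclination, length = to_polar(r)
    target = radial_map(azimuth, inclination) + thickness + (delta0 or 0.0)
    ratio = target / length
    # Displace when the vertex is inside the offset surface, or always for
    # extrapolated vertices, mirroring the test in Algorithm 1.
    if ratio > 1.0 or delta0 is not None:
        return tuple(c * ratio for c in r), True
    return tuple(r), False
```

For example, with a constant-radius toy globe such as `lambda az, inc: 12.0`, a vertex lying inside the sphere is scaled outward along its own direction, while a vertex outside it is returned unchanged unless it carries a `delta0` offset.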
I now discuss how our model can be used in a real-time rendering pipeline to generate realistic skin motion and shading.

As a proof of concept, I implemented two different WebGL browser applications. Both synthesize animations of the ocular region in real time using our system. I used the Digital Emily data provided by the WikiHuman project (Ghosh et al., 2011; WikiHuman), with permission, to create our main virtual character. The applications start by downloading all the required mesh, texture, and optimized model data. Then they run offline and perform all computations without communicating with a server.

Figure 3.13: Overall architecture of my applications.

The two applications differ only in their model input sources: one is a fully interactive, user-controlled application, while the other plays a cutscene defined using keyframed animation curves. Head movements can be added for realism. Different expressions, such as surprise and anger, can also be added at key moments.

Reconstructing 3D Geometry. Recall that we represent the 3D coordinates $\tilde{x}^{+}$ of the character skull mesh using the 2D skin coordinates $\tilde{u}^{+}$ corresponding to the neutral pose. Yet, due to animation transfer, this operation might not correspond to a projective or affine transformation. Hence,
Hence,to determine 3D vertex positions x˜+ given u˜+, I pre-render a texture of x˜+in 2D skin space using natural neighbor interpolation on the vertices values.This texture T , which I name the skull map, can be sampled to obtain x˜+as a function of u˜+:x˜+ ≈ T (u˜+) (3.15)Following the same procedure, I pre-render a texture Υ to sample theskull surface normals η˜+ given u˜+, which we name the skull normal map:η˜+ ≈ Υ(u˜+) (3.16)The aperture model and the PCA-reduced motion model W are per-formed on the CPU. The PCA reconstruction using U , the more compu-tationally expensive portion of the model, is performed on the GPU on aper vertex basis, as a matrix-vector multiplication in the vertex shader (seeFigure 3.14).323.9. Interactive Motion SynthesisgazeStochastic blink modelInputAperture modelMotion modelOutput mesh Bump mapblink affectgaze and affectaperturePCA reconstruction Bump map blending2D to 3D ReconstructionCPUGPUmotion eigenvalues wrinkle weightsFigure 3.14: Flow chart of how inputs are handled in the applications andwhere computations are performed.Surface Normal Generation So far, I have described how to computethe vertices positions in real-time. To perform surface shading, one mustalso compute the surface normals η˜+. I compute normals on a per-vertexbasis. As described in Equation 3.16, the surface normals can be easilyretrieved for the portions of the skin that are sliding in the skull mesh usingthe skull normal map Υ. 
However, the same is not true for vertices that have been radially displaced, whether because they are sliding on a globe or because they are extrapolated vertices with a $\delta_0$ heuristic.

For any vertex sliding on a globe, I take the surface normal τ of the globe at the point where the vertex is sliding, which I name the globe sliding normal:

$$\tilde{\eta}^{+} = \hat{\tau} \tag{3.17}$$

τ could be sampled by precomputing a surface normal map in polar coordinates, in a similar fashion to the map D defined for globe mesh fitting. Yet, it is enough to approximate the surface of the globe as a sphere in this case. The surface normal in head space then becomes:

$$\tau = \tilde{x}^{+} - ø \tag{3.18}$$

where ø is the globe’s center in head space.

For points that are sliding neither on the skull nor on a globe, that is, for vertices with a $\delta_0$ heuristic, I propose a different method. Recall that extrapolated vertices are always radially displaced. As in Equation 3.18, I compute the current globe sliding normal τ, but also compute the globe sliding normal $\tau_0$ at the vertex’s original position. I then compute the smallest rotation τ that aligns these two normals:

$$\tau = \hat{\tau}_0 \times \hat{\tau} \tag{3.19}$$

This rotation serves as an estimate of how the surface around the vertex has been rotated from its original position.
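The alignment in Equation 3.19 can be read as an axis-angle rotation. The following Python sketch is an illustration under assumptions, not the shader code: it takes the cross product of the two unit sliding normals as the rotation axis, recovers the sine and cosine of the angle from the cross and dot products, and applies Rodrigues' rotation formula to the original normal η0. All function names are hypothetical, and the degenerate case of parallel or antiparallel normals, where the axis is undefined, simply returns η0 unchanged.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def rotate_normal(eta0, x0, x, center):
    """Rotate the original normal eta0 by the rotation taking tau0 to tau.

    x0, x  -- original and current vertex positions in head space
    center -- globe center in head space (the symbol ø in Equation 3.18)
    """
    tau0 = _unit(_sub(x0, center))    # sliding normal at the original position
    tau = _unit(_sub(x, center))      # current sliding normal
    axis = _cross(tau0, tau)          # Equation 3.19, unnormalized axis
    sin_a = math.sqrt(_dot(axis, axis))
    cos_a = _dot(tau0, tau)
    if sin_a < 1e-12:
        # Axis undefined: normals are parallel or antiparallel (not handled).
        return tuple(eta0)
    k = (axis[0] / sin_a, axis[1] / sin_a, axis[2] / sin_a)
    k_cross_v = _cross(k, eta0)
    k_dot_v = _dot(k, eta0)
    # Rodrigues: v cos(a) + (k x v) sin(a) + k (k . v)(1 - cos(a))
    return tuple(eta0[i] * cos_a + k_cross_v[i] * sin_a
                 + k[i] * k_dot_v * (1.0 - cos_a) for i in range(3))
```

Because only the two sliding normals are needed, the rotation can be recomputed per vertex and per frame without storing any extra per-vertex orientation data.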
Finally, I rotate the original vertex surface normal $\eta_0$ using τ.

Algorithm 1: My vertex animation algorithm for a single globe, executed per frame.

Input: Motion eigenvalues e
Output: Head space 3D vertex coordinates $\tilde{x}^{+}_i$ and normal $\tilde{\eta}^{+}_i$
  Compute skin coordinates: $\tilde{u}^{+}_i = \tilde{U}_i \cdot e$
  Sample skull map: $\tilde{x}^{+}_i = T(\tilde{u}^{+}_i)$
  Sample skull normal map: $\tilde{\eta}^{+}_i = \Upsilon(\tilde{u}^{+}_i)$
  Compute globe sliding normal: $\tau = \tilde{x}^{+}_i - ø$
  Compute position in globe space: $r = Ø \cdot \tilde{x}^{+}_i$
  Compute polar angle: $\alpha = (\tan^{-1}(r_x / r_z), \cos^{-1}(r_y / |r|))$
  Compute globe radial distance: $\delta = D(\alpha) + t + \delta_0$
  Compute distance ratio: $\hat{\delta} = \delta / |r|$
  If ($\hat{\delta} > 1$ or $\delta_0$) then radially displace: $\tilde{x}^{+}_i = Ø^{-1}(r \cdot \hat{\delta})$
  If ($\hat{\delta} > 1$) then $\tilde{\eta}^{+}_i = \hat{\tau}$
  If ($\delta_0$) then $\tilde{\eta}^{+}_i = \eta_0$ rotated by $\hat{\tau}_0 \times \hat{\tau}$

Wrinkle Reconstruction. As mentioned in Section 3.4, we have a system that is able to reconstruct wrinkle geometry from any captured pose, represented as spline geometry. We can also synthesize bump map textures that represent these wrinkles for any given pose, and project them into the target character’s texture space. To generate facial expression wrinkles interactively, we synthesize a bump map texture for each of the expression sample poses marked in Section 3.5. I then combine these bump maps using the corresponding expression affect parameters as blending weights. The blending operation is performed in the fragment shader.

Realistic Input Synthesis. In the user-controlled application, blink input is handled by a stochastic blink model that generates spontaneous human blinking. Our model determines blinking intervals and blink amplitude according to a normal distribution, as this is often used to represent natural occurrences. We estimated the blink frequency distribution based on data collected by Ishimaru et al. (2014) and the blink amplitude distribution based on the data of Trutoiu et al. (2011) and Evinger et al. (1991). See Table 3.1. The relation of amplitude to phase duration was based on Evinger et al.
(1991) and the blink shape profile was based on Flash and Hogan (1985).

                   µ            σ
Blink frequency    0.1339 Hz    0.1605 Hz
Blink amplitude    34.5895°     3.805°

Table 3.1: Normal distribution of blink frequency and amplitude while performing a watching activity.

Meanwhile, eye movement is modeled by a biologically based saccade model, which controls gaze shifts due to user requests to change the visual target. It was based on the work of Bahill et al. (1975) and Harwood et al. (1999). We also implemented a target tracking model, which makes use of both smooth pursuit and saccadic movements, based on the work of Blohm et al. (2006). Yet we never used it, as a smooth pursuit system did not complement our application's user interface well.

3.10 Results

To our knowledge, this is the first system for data-driven real-time animation of soft tissue movement around the eyes based on gaze input. As described in Section 2.3, almost all previous work in animation of eyes has been on animating gaze, with some recent attention paid to the kinematics of blinks and the appearance of the globe and iris. For instance, Ruhland et al. (2014), an excellent recent survey of eye modeling in animation, does not even mention the soft tissues or wrinkles surrounding the eyes.

Figure 3.15: Facial expression wrinkles generated interactively.

Performance. The applications run in any modern browser, at 60fps on a desktop with an Intel Core i7 processor and an NVIDIA GeForce GTX 780 graphics card, at 24fps on an ultraportable laptop with an Intel Core i7 processor and integrated Intel HD 5500 Graphics, and at 6fps on a Nexus 5 Android phone with an Adreno 330 GPU (see Figure 3.1). The majority of the workload is for rendering; the model itself is very inexpensive (see Table 3.2).

                          Static             Animated           Overhead
File download (MB)        3.7                5.3                1.6
Memory usage (MB)         240                370                130
GPU memory (MB)           390                417                27
Runtime per frame (ms)    0.5450 ± 0.1553    0.6717 ± 0.1564    0.1267 ± 0.0011

Table 3.2: Performance overview of the applications.
Note that the animation framework is run twice per frame, as described in Section 3.11.

Eyelid Deformation during Blink. We can generate realistic skin deformation in a blink sequence using my stochastic blink model described in Section 3.9. A blink sequence is shown in Figure 3.16.

Saliency Map Controlled Movement. When we observe object motion in real life or in a video, our eyes produce characteristic saccades. We computed saliency maps, a representation of visual attention, using the method proposed by Itti et al. (1998). Points in the image that are most salient are used as gaze targets to produce skin movements around the eyes. Figure 3.2 shows an example of skin movement controlled by gaze, using salient points detected in a video of a hockey match.

Static Scene Observation. Our generative gaze model can be controlled by gaze data obtained from any eye tracking system. We used gaze data of a subject observing a painting to drive our system. This produces very realistic movements of the eyelids and the skin around the eyes, as can be seen in Figure 3.17.

Figure 3.16: Skin deformation in eye closing during a blink.

3.11 Legacy Hardware Considerations

One of the major changes in rendering paradigms in the interactive media industry over the past few years was the adoption of multi-pass rendering. This feature is used for less traditional rendering methods, such as deferred shading, or to achieve post-processing effects, such as depth of field (e.g. Demers, 2004), screen-space reflections (e.g. Tatarchuk, 2009) or, more importantly for my case, subsurface scattering methods (e.g. Jimenez et al., 2015).

One of the most important requirements for efficient multi-pass rendering is support for multiple render targets, that is, the ability to render into multiple draw buffers in a single pass. While multi-target rendering has become prevalent on desktop and laptop devices, this is still not true on mobile devices.
For example, one of the main limitations of the WebGL specification is that multi-target rendering is not part of the core specification. While the majority of computers support the multi-target rendering extension, the majority of mobile devices still do not.

As described in Section 2.4, to achieve realistic real-time skin rendering it is standard to apply an approximation of the subsurface scattering properties of the skin. While some of these approximations do not require multi-pass rendering, the majority do. In our demo application, I implement the method proposed by Jimenez et al. (2015). It requires 3 rendering passes: the first projects the geometry to generate screen space diffuse, specular and depth maps of the skin. This is followed by two screen space passes that perform a Gaussian blur on the diffuse illumination map, taking into account the depth map. The last of these passes also sums the diffuse and specular illuminations to achieve the final result.

Figure 3.17: Skin movement driven by gaze during static scene observation. The red circle on the left represents the image point the subject is looking at.

In my implementation, while I still only perform 3 passes, I need to perform two geometric passes (instead of one): the first rendering pass generates the diffuse illumination and depth maps (depth is stored in the alpha channel). The second one is a screen space Gaussian blur. The third, although it performs the screen space blur, also computes specular illumination and adds it to the diffuse illumination for the final result. Hence, my optimized subsurface scattering system for WebGL requires the vertex shader program to run twice per frame.

Chapter 4
Screen Space Distance Fields

4.1 Motivation

As explained in Section 2.3, while there is some work regarding interactive animation of tear drops flowing on human faces, there is no work regarding the interactive animation of the tear film.
I hypothesize that small scale details, such as the tear film, might be reproducible procedurally at the post-rendering stage. In other words, I hypothesize that the visual effects of the tear film might be reproducible in screen space. Alexander et al. (2013) suggest that the main noticeable visual effects of the lacrimal lake are:

• Darkening due to light absorption during light traversal
• Specular reflections

The first effect can be parameterized by the amount of fluid visible at each pixel, that is, the depth of the lacrimal lake from the camera perspective (screen space). The second requires the positions and normals of the film surface to be computed per pixel. Given a screen-space depth map of the scene and the aforementioned depth of the fluid at each screen point, the film surface position can be reconstructed using the sum of the two and the inverse projection matrix. Similarly, the surface normal can be computed using a screen-space normal map and the derivative of the fluid depth.

Hence, I want to define a function of fluid depth per pixel. A possible strategy would be to parameterize this function according to the distance from the surface point to the nearest skin point. This distance could be efficiently computed if a volumetric distance field of the skin was available. This is a common strategy in real-time lighting computations, as volumetric distance fields can be precomputed for rigid objects. However, the skin is not a rigid object, and computing a volumetric distance field of it in real time is not a viable solution, as seen in Erleben and Dohlmann (2008) and Sanchez et al. (2012).

Figure 4.1: Volumetric signed distance fields (a) and corresponding computation times (b). Reproduced from Erleben and Dohlmann (2008).

To solve this problem, I explored computing distance fields of an object (in this case, the skin of a human face) in screen space.
In this chapter I describe my third contribution: three different algorithms for real-time procedural generation of distance fields in screen space, and a study of their comparative performance and quality. I show that it is a viable strategy, both in terms of performance and quality.

These three different sampling strategies for estimating distance fields are described in detail in Section 4.4. To elucidate how and why I designed these algorithms, Section 4.2 provides some background on problems that differ in purpose but are similar in nature. Section 4.5 provides a comparison of these three algorithms and shows them being used to procedurally generate lacrimal lakes in real time.

4.2 Background

It is common in games to use feature outline rendering to draw contours around objects. This is generally done either to highlight objects in the scene or for artistic purposes. One traditional method to achieve outline rendering is the one proposed by Rossignac and van Emmerik (1992), which consists of rendering the objects first in the stencil buffer, and then rendering thick wireframes of the outlined objects. One could potentially compute distance fields by rendering multiple overlapping outlines for each skin object, with each outline having a different scale and a color intensity corresponding to the distance it represents. Yet, this would scale poorly with screen resolution and, more importantly, scene complexity.

I choose to explore computing distance fields without resorting to additional geometry passes. Instead, I resort to searching the screen-space neighborhood. This strategy shares a premise with all screen space ambient occlusion methods, which is that the properties of a surface point can be approximated by the visible surrounding geometry.
Hence, while the purpose of this method is very different and the final sampling methods presented diverge considerably from the techniques used for computing ambient occlusion, it is still relevant to understand these existing screen space methods.

4.2.1 Traditional Screen Space Ambient Occlusion

Figure 4.2: Screen sampling distribution as in Mittring (2007) (a). Black points are identified as occluders. Using the surface normal (b) allows the sampling space to be reduced to a more relevant hemisphere. Reproduced from Chapman (2011).

Screen space ambient occlusion (Mittring, 2007) is generally accepted as the first method of this kind. Its essential concept is to approximate an occlusion factor for each point on a surface by sampling points in a sphere centered on the point and estimating occluders.

Let $P$ be the camera projection matrix, as in Chapter 3, and $x$ a visible surface point in the camera view space. Randomly choose $n_s$ samples in view space inside a sphere of user defined radius $\mu$ centered around $x$. Thus, the set of all possible samples $\Psi$ is defined as:

$$\Psi = \{ s_i \mid \mu > |x - s_i|,\ 1 \le i \le n_s \} \tag{4.1}$$

where $s_i$ is the $i$th random sample. The goal is to determine whether each sample $s_i$ lies behind the visible geometry. This can be achieved by casting a ray from the camera to each sample $s_i$ to find the intersecting visible geometry position $\bar{s}_i$ along the ray. If $s_{i,z} > \bar{s}_{i,z}$, then $s_i$ is not visible and contributes to the occlusion of $x$. The ambient occlusion of $x$ is estimated by the number of occluded samples found.

To efficiently find the surface point $\bar{s}_i$ corresponding to a sample point $s_i$, Mittring (2007) avoids performing ray-to-geometry intersections.
Instead, the sample $s_i$ is first projected to screen image space:

$$u_i = P s_i \tag{4.2}$$

The screen space depth buffer, $D(u)$, is then used to reconstruct the surface position $\bar{s}_i$ rendered at $u_i$:

$$\bar{s}_i = P^{-1} [u_i, D(u_i), 1] \tag{4.3}$$

Later methods also make use of the surface normal to sample within a hemisphere oriented along the surface normal at that pixel (Bavoil et al., 2008; Chapman, 2011). This improves the relevance of the sampling space, increasing the fidelity of the results. It has the disadvantage of requiring a per-fragment normal map, but this is already available when performing deferred shading.

4.2.2 Image-Space Horizon-Based Ambient Occlusion

Figure 4.3: Theoretically, Bavoil et al. (2008) raymarch the depth buffer in a number of equiangular directions across a circle in image space (a). In reality, sampling is made at texel centers (b).

Horizon-based ambient occlusion (Bavoil et al., 2008) is an example of a more recent influential method for screen space ambient occlusion. One of the improvements in comparison with the traditional method described before is that it splits the unit sphere by a horizon line, defined by computing a signed horizon angle at every surface point. Yet, for the purposes of my problem, I am only concerned with the sampling strategy used, and hence I will not cover the horizon computation.

Instead of sampling points in a sphere in view space, Bavoil et al. (2008) pick $n_s$ directions in image space around the current pixel, which correspond to directions around the z axis in eye space, and raymarch along those directions. The raymarching is used to keep track of the elevation angle in each direction. Each time the angle is larger than the previous maximum, a new chunk of occluding geometry has been found.
A static step size is used for raymarching, but sampling is always made on texel centers to avoid depth discontinuity artifacts.

4.3 Problem Definition

My goal is to compute, for any visible surface point $x$, the shortest distance $\lambda$ from $x$ to the skin object. In other words, I want to produce a screen space image of $\lambda$, which will be the screen space distance field.

My method is intended to be used on the second stage of a deferred shading pipeline. This means that one can assume information such as the surface albedo, normal and view depth have been rendered and stored into screen-space texture buffers. I compute $\lambda$ in screen space by making use of some of these textures.

Distance Metric. To efficiently compute $\lambda$ and not depend on scene complexity, I approximate it by searching the visible geometry. I estimate $\lambda$ by finding the closest visible skin surface point to $x$ and computing the distance between the two points in camera view space:

$$\lambda \approx |s - x| \tag{4.4}$$

where $s$ is the ideal skin surface position I would like to find. Let $n_s$ be the number of skin surface samples, and $s_i$ the $i$th sample. Then intuitively the distance $\lambda$ can be approximated as:

$$\lambda \approx \min_i \{ |s_i - x|,\ 1 \le i \le n_s \} \tag{4.5}$$

Material Detection. How to determine whether a given sample $s_i$ lies on the skin surface or some other material must also be addressed. Let $u_i$ be the screen space coordinates of $s_i$. Define a screen space binary function, $M(u)$, which returns whether a given sample $u_i$ corresponds to a skin point. Similar functions can be defined to identify other materials.

This function can be implemented as a screen-space material mask texture, interactively rendered alongside the depth buffer. This is quite efficient, as a single material mask texture can support up to 256 different materials using a single 8-bit channel. That is, it can implement up to 256 different binary functions.

Figure 4.4: Overview of the spaces used for sampling.
All sample points lie in the plane $\Psi$, defined by the current point $x$, and have corresponding screen space and visible surface positions.

Sampling Space. I am only interested in sampling points along the visible surface. Hence, sampling on a sphere in view space as in Mittring (2007) is not appropriate. On the other hand, one of the major differences between this distance problem and ambient occlusion is that the scale of the lacrimal lake I want to recreate is known. Thus the maximum distance at which the lacrimal lake can be from the eyelids, $\Lambda$, is known as well. However, sampling in image space, as in Bavoil et al. (2008), does not take into account the distance of the camera to the eyeball. This can lead to inconsistent results at different distances and does not allow taking advantage of prior knowledge. So, instead, I define the set of potential sample points $\Psi$ as:

$$\Psi = \{ s_i \mid s_{i,z} = x_z,\ 1 \le i \le n_s \} \tag{4.6}$$

where $x_z$ is the z coordinate of $x$ in view camera coordinates. That is, the samples are picked in a plane that passes through $x$, parallel to the camera projection plane. I then cast, as in Mittring (2007), a ray from the camera to each sample $s_i$ in the $\Psi$ plane to find the intersecting visible geometry position $\bar{s}_i$ along the ray. Equations 4.2 and 4.3 describe how this surface position can be efficiently computed.

The biggest advantage of this approach over the image space sampling approach of Bavoil et al. (2008) is that the process becomes independent of the distance from the camera to the eyeball, while still avoiding the redundancy of the spherical sampling space used by Mittring (2007).

4.4 Sampling Methods

The major bottleneck of my approach is sampling, as randomly accessing textures on the GPU is computationally costly. Sampling across the entire screen is not an option. Thus, the sampling strategy chosen is of extreme importance.

In this section I describe three different sampling algorithms I developed, which vary greatly in the strategies used.
However, they share the premise of searching in the plane $\Psi$, resorting to the same distance metric and detecting skin points as described in the previous section.

4.4.1 Equiangular Sampling on a Circle

This first algorithm represents a naive approach heavily inspired by ambient occlusion methods. While it differs from previous work, due to the nature of the sampling space and the problem, I tried to keep it as close as possible to related work. It is presented as an example of why the sampling strategies used for ambient occlusion methods cannot be transferred to a distance computation problem.

Sampling Directions. Pick $n_s$ equiangular directions in view space along the $\Psi$ plane and sample once in each direction around $x$. This results in the general formula:

$$s_{i,[x,y]} = x_{[x,y]} + h_i \mu_i \tag{4.7}$$

where $s_{i,[x,y]}$ are the x and y axis coordinates of $s_i$ in view camera space, $h$ is a set of 2D unit vectors defined equiangularly around the origin, and $\mu_i \in\ ]0, \Lambda]$ is a scalar that indicates the sampling distance from $x$.

Figure 4.5: Distance field of skin on a right eye region using different quality levels of equiangular sampling: (a) 8 samples per pixel, (b) 16 samples per pixel, (c) 32 samples per pixel, (d) 64 samples per pixel. As expected, results are not ideal.

Sampling Randomization. Sampling in the same $n_s$ directions on every pixel will produce biased results for a low number of directions, which in turn can lead to artifacts. This is a very common problem in ambient occlusion algorithms. As increasing sampling per pixel is costly, a common strategy to solve the problem is randomizing sampling on a per pixel basis (Bavoil et al., 2008; Chapman, 2011; Mittring, 2007). This results in trading bias for high frequency noise, which is not problematic, as high frequency noise can be greatly reduced by a post process blurring step (see Figure 4.5).
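
The sampling formula in Equation 4.7, combined with per pixel randomization, can be sketched as follows. This is a minimal CPU-side illustration, not the shader used in this work: the function name `equiangular_offsets`, the use of Python's `random.Random` as the pseudo-random source and the uniform distribution of distances are all assumptions made for the example.

```python
import math
import random

def equiangular_offsets(n_s, max_dist, rng):
    """Return n_s 2D offsets h_i * mu_i in the sampling plane (Eq. 4.7).

    n_s      -- number of equiangular directions
    max_dist -- maximum sampling distance (Lambda in the text)
    rng      -- random.Random instance standing in for a per-pixel noise source
    """
    # One random phase per pixel rotates the whole direction kernel.
    phase = rng.uniform(0.0, 2.0 * math.pi)
    offsets = []
    for i in range(n_s):
        angle = phase + 2.0 * math.pi * i / n_s
        # rng.random() lies in [0, 1), so mu lies in (0, max_dist].
        mu = max_dist * (1.0 - rng.random())
        offsets.append((math.cos(angle) * mu, math.sin(angle) * mu))
    return offsets
```

Each offset would be added to the x and y coordinates of the current view space point before projecting the sample to screen space.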
Figure 4.6: Distance field of skin on a right eye region using exponential line search with 8 samples per axis direction: (a) initial vertical pass, (b) final horizontal pass. Quality is superior to the previous method for the same number of samples.

I rotate the $n_s$ directions at each surface position $x$ using a randomly rotated kernel. I also randomize the sampling distances $\mu$ from $x$ per sample, making the probability the same for all distances. All random values were obtained using a pseudo-random function that appears random in the screen image space.

4.4.2 Exponential Line Search

The main problem with the equiangular sampling approach, described in Section 4.4.1, is that it is better suited for estimating overall properties of the surrounding geometry, like ambient occlusion. However, I am looking for a local minimum, which is a fundamentally different problem. One could take advantage of previous samples of a pixel for the subsequent samples. In this section I describe the use of line search to find the minimum along the view space axis directions. It provides less noisy results and much better distance metric precision than the previous algorithm.

Search Algorithm. To perform line search I make one assumption: that a pixel closer to another in screen image space is likely to correspond to closer positions in view camera space. While this does not apply to all possible surfaces, the regions I am working with are reasonably smooth and orthogonal to the view direction, and as such it is a reasonable heuristic.

Let $h_{i,0}$ be a 2D vector in the $\Psi$ plane, with some user defined length. My objective is to find the point, in direction $h_{i,0}$, at which the other objects end and skin starts by searching along that direction. Let $n_m$ be a user defined number of iteration steps, and $m \in [1, n_m]$. Define the first sample $s_{i,0}$ as:

$$s_{i,0,[x,y]} = x_{[x,y]} + h_{i,0} \tag{4.8}$$

Then, take up to $n_m$ exponentially increasing steps until a skin point is sampled.
That is, until $M(u_{i,m})$ holds true:

$$h_{i,m} = h_{i,m-1}\, \mu \tag{4.9}$$

$$s_{i,m,[x,y]} = s_{i,m-1,[x,y]} + h_{i,m} \tag{4.10}$$

where $\mu > 1$ is a user defined constant and $u_{i,m}$ are the screen space coordinates of $s_{i,m}$. Once a skin point is found, it implies that the boundary of the skin is between $s_{i,m}$ and $s_{i,m-1}$. At that stage, invert the tracing direction and take exponentially decreasing steps:

$$h_{i,m} = \frac{h_{i,m-1}}{\mu} \tag{4.11}$$

$$s_{i,m,[x,y]} = s_{i,m-1,[x,y]} - h_{i,m} \tag{4.12}$$

Once a non-skin point is found (in other words, $M(u_{i,m})$ no longer holds true), invert the direction once again but keep taking decreasing steps. This direction inversion process continues until $n_m$ samples have been made.

Axis Directions. One could search in equiangular directions, as in Section 4.4.1. However, I am now making $n_m$ samples in each direction, and thus I need to be more conservative about the number of sampled directions.

Thankfully, one can take advantage of the search directions by dividing the algorithm into two passes. On a first pass, I sample along one of the view space axes and store the result, which I call $\tilde{\lambda}$. Notice that any point in $\tilde{\lambda}$ contains not only information about that point, but also information pertaining to the neighboring samples, which were located along the view space axis chosen. I exploit this fact: on a second pass, I sample perpendicularly to the previous sampling direction and use $\tilde{\lambda}$ for my line search boundary condition.

Figure 4.7: 2D schematic of the modified distance metric for a two-pass implementation.

To make use of $\tilde{\lambda}$, I define a modified version of the distance metric specified in Equation 4.4. $\tilde{\lambda}(u_{i,m})$ constitutes the distance from $s_{i,m}$ to a near skin point along the sampled axis. I approximate the distance between $x$ and that unknown point by assuming the three points make a 90° angle, as this angle is efficient to compute:

$$\lambda_i \approx \sqrt{|s_{i,m} - x|^2 + \tilde{\lambda}(u_{i,m})^2} \tag{4.13}$$

My objective is to find the local minimum of $\tilde{\lambda}_i$.
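
The search of Equations 4.8 to 4.12 and the two-pass metric of Equation 4.13 can be sketched as follows. This is a one-dimensional simplification written for illustration only: positions are scalar distances from $x$ along a single search direction, `is_skin` is a hypothetical callable standing in for the mask lookup $M(u)$, and the function names are not taken from the implementation described here.

```python
import math

def exponential_line_search(is_skin, h0, mu, n_m):
    """Return the nearest skin position found along one direction, or None.

    is_skin -- callable t -> bool, stand-in for the material mask M(u)
    h0      -- length of the initial step (Eq. 4.8)
    mu      -- step growth factor, mu > 1
    n_m     -- number of further samples taken after the first one
    """
    t = h0
    step = h0
    on_skin = is_skin(t)
    bracketed = on_skin          # becomes True once any skin sample is seen
    nearest = t if on_skin else None
    for _ in range(n_m):
        if not bracketed:
            step *= mu           # Eq. 4.9: exponentially increasing steps
            t += step            # Eq. 4.10
        else:
            step /= mu           # Eq. 4.11: exponentially decreasing steps
            # Eq. 4.12: move back while on skin, forward again otherwise.
            t += -step if on_skin else step
        on_skin = is_skin(t)
        if on_skin:
            bracketed = True
            if nearest is None or t < nearest:
                nearest = t
    return nearest

def two_pass_distance(dist_to_sample, first_pass_value):
    """Eq. 4.13: combine both passes assuming a right angle between them."""
    return math.hypot(dist_to_sample, first_pass_value)
```

For example, with `is_skin = lambda t: t >= 5.3`, `h0 = 1.0`, `mu = 2.0` and `n_m = 8`, the samples overshoot to 7.0 and then close in on the boundary from both sides.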
Thus, I perform a line search on $\tilde{\lambda}_i$: I sample in exponentially increasing steps, and then decreasing steps as described for the first pass; however, I choose the sampling direction based on the variation of $\tilde{\lambda}_i$.

Direction Randomization. Similarly to the method described in Section 4.4.1, sampling on the image axes can cause aliasing along those axes. Because I implemented $M(u)$ as a material mask texture, the problem is further intensified by the aliasing already present in the texture.

Hence, I randomly rotate the search directions per pixel, in the same manner I did for individual samples in Section 4.4.1. This produces manageable noise, which can be decreased with a blurring step, while still strongly reducing aliasing effects.

4.4.3 Backtracking Line Search

The major problem with the line search algorithm described in Section 4.4.2 is that it presents us with a strict compromise between three factors: maximum sampled distance $\Lambda$, metric precision $\mu$ and the number of samples per direction $n_m$. For example, when keeping $n_m$ low, a low value of $\Lambda$ will allow for a good distance metric precision, but will limit the maximum size of the lacrimal lake. Increasing the value of $\Lambda$ will increase search space, but reduce precision. Furthermore, the exponential sampling does not treat search space equally, as points near the target material will always present higher distance precision than points farther away.

Figure 4.8: Distance field using backtracking line search: (a) 8 directions x 8 samples, (b) 16 directions x 16 samples, (c) 32 directions x 32 samples, (d) 64 directions x 64 samples. The indicated number of samples denotes the maximum number of samples for each search direction, and not the actual number of samples performed.

In this section, I improve upon the line search algorithm from Section 4.4.2.
I discuss the use of backtracking to treat search space equally and make better use of samples by taking into account $\Lambda$, preemptively ignoring unpromising sampling directions and using strategies to reduce overall texture access per sample.

Screen Space. Notice that the material detection is performed in screen space (see Section 4.3). Thus, it would be more efficient to line search in screen space, as computing sample positions in view space requires accessing the depth buffer (see Equation 4.3). However, the sampling needs to be independent of the distance between the camera and the skin object, and that information is not present in screen space.

Similarly to the method described in Section 4.4.1, let $h$ be a set of 2D unit vectors defined equiangularly around the origin. For each vector $h_i$, representing a sampling direction in the $\Psi$ plane, traverse from $x$ by a length of $\Lambda$ and project into screen space:

$$s_{i,[x,y]} = x_{[x,y]} + h_i \Lambda \tag{4.14}$$

$$u_{i,0} = P s_i \tag{4.15}$$

If $M(u_{i,0})$ is false, meaning the initial sample $s_i$ is not a skin point, assume that there is no skin between $x$ and $s_i$, and thus do not search in that direction. Otherwise, define our search region as a line in screen space between $x$ and $s_i$. As this line always has a length of $\Lambda$ in view space, the searching process is independent of the distance between the camera and the skin object even in screen space.

Search Algorithm. To treat search space equally, I perform line search using a "divide and conquer" approach. Let the extremities of this initial search space be $a_{i,0}$ and $b_{i,0}$:

$$a_{i,0} = P x, \quad b_{i,0} = u_{i,0} \tag{4.16}$$

At each iteration step $m \in [1, n_m]$, sample the center of the search region:

$$u_{i,m} = \frac{a_{i,m-1} + b_{i,m-1}}{2} \tag{4.17}$$

Then, divide the search space in half. If $u_{i,m}$ is a skin point, take the first half of the region as our new search region:

$$a_{i,m} = a_{i,m-1}, \quad b_{i,m} = u_{i,m} \tag{4.18}$$
Otherwise, take the second half:

$$a_{i,m} = u_{i,m}, \quad b_{i,m} = b_{i,m-1} \tag{4.19}$$

Having completed this process, the closest skin sample to $x$ in screen space is $b_{i,n_m}$. I make the assumption that this is also true in view space. Thus, proceed to reconstruct the surface position rendered at $b_{i,n_m}$ and compute the distance metric:

$$s_{i,n_m} = P^{-1} [b_{i,n_m}, D(b_{i,n_m}), 1] \tag{4.20}$$

$$\lambda_i = |s_{i,n_m} - x| \tag{4.21}$$

4.5 Results

Overall, the three sampling algorithms behaved as expected. Equiangular sampling produced unsatisfactory distance fields, confirming that existing screen space sampling techniques were not appropriate to our problem. Backtracking line search proved to be an improvement over the simple line search along the axes.

The quality of the results of backtracking line search on a single pass was favorable enough that dividing the algorithm into two passes as in Section 4.4.2 was not worth the overhead. In fact, blurring was no longer very effective in reducing noise. It proved to be much more efficient to use an anti-aliasing method such as Fast Approximate Anti-Aliasing (Lottes, 2011). This suggests that the constraint on the quality of the sampling field might no longer be so much the sampling strategy as the aliasing already present in the material mask.

Figure 4.9: Distance field using backtracking line search without direction randomization: (a) 4 sampling directions, (b) 6 sampling directions. Aliasing is visible along the search lines.

Performance. For each of the three sampling methods, I measured the computation time per frame, the number of texture accesses performed per texel in the area of relevancy of the field, and the per texel error of the obtained distance field. For the ground truth field, I used backtracking sampling with 128 samples per 128 directions as an approximation. The tests were run on an early 2011 15-inch MacBook Pro with an AMD Radeon HD 6490M GPU. I used the Digital Emily data provided by the WikiHuman project (Ghosh et al., 2011; WikiHuman) for my skin mesh, and these metrics were computed from multiple camera perspectives. Their average is shown in Table 4.1.

Testing the method on meshes with different numbers of faces or vertices is not relevant, as my method does not depend on object complexity. As I only took into account texels in the area of relevancy, distance to the character is also not relevant.

                Equiangular 64      Exponential 8x8      Backtracking 8x8
Runtime (ms)    0.0601 ± 0.0413     0.0215 ± 0.0206      0.0192 ± 0.0159
Accesses        94.0233 ± 6.9828    101.8739 ± 4.0530    46.0344 ± 7.1043
Visual Error    3.3858%             0.4597%              0.3681%

Table 4.1: Average performance comparison of the three sampling algorithms running on a single render pass. Sampling algorithm settings were chosen to allow for a maximum of 64 samples per texel.

Interestingly, exponential line search presented a higher texture access count than equiangular sampling. This implies that exponential line search is finding more relevant samples (skin points) and, as such, computing the distance metric more frequently than equiangular sampling does.

Figure 4.10: Distance field at two slightly different view angles. Our method can handle sharp angles of view (a). However, if not enough geometry is visible for the method to sample, artifacts will become visible (b).

Meanwhile, as expected, backtracking line search presented the best performance, both in computation time and memory access. This is justified by two reasons: first, it can preemptively ignore sampling directions, while the other two methods always perform the same number of samples regardless. Second, it searches in screen space, thus avoiding accessing the depth buffer on each sample.
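
To make the access pattern behind these two reasons concrete, the backtracking search of Section 4.4.3 can be sketched as follows. This is a CPU-side illustration under stated assumptions, not the shader used for the measurements above: `is_skin` is a hypothetical callable standing in for the mask lookup $M(u)$, screen positions are plain 2D tuples, and the lookup counter exists only to show how many mask reads a direction costs.

```python
def backtracking_search(is_skin, a, b, n_m):
    """Bisect the screen space segment [a, b] for the skin boundary.

    is_skin -- callable (u, v) tuple -> bool, stand-in for M(u)
    a       -- screen position of the current point (P x in Eq. 4.16)
    b       -- screen position of the far sample at distance Lambda (u_{i,0})
    n_m     -- number of bisection steps
    Returns (closest skin position or None, number of mask lookups).
    """
    lookups = 1
    if not is_skin(b):
        # Early out: assume there is no skin along this direction.
        return None, lookups
    for _ in range(n_m):
        mid = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)   # Eq. 4.17
        lookups += 1
        if is_skin(mid):
            b = mid          # Eq. 4.18: keep the half nearer to x
        else:
            a = mid          # Eq. 4.19: keep the half nearer to the far sample
    return b, lookups
```

In this sketch the depth buffer would be read only once per direction, when the returned position is reconstructed in view space as in Equation 4.20, while every bisection step costs a single mask lookup.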
The other two methods require two texture accesses per sample: once on the mask texture, to check for skin, and once on the depth buffer, to compute the surface position.

Lacrimal Lake Generation. As a use case and proof of concept, I present my fourth contribution: a method for procedurally creating lacrimal lakes using screen space distance fields. Figure 4.11 shows lacrimal lakes rendered on the Digital Emily character (Ghosh et al., 2011; WikiHuman) using each of the sampling methods.

I naively defined the lacrimal lake depth as a first order linear function of the distance to the skin. Surface normals were computed by differentiating the visible surface view space positions in the fragment shader. As an added efficiency bonus, when rendering lacrimal lakes one only needs to compute the distance fields on the surface of the globes.

Figure 4.11: Lacrimal lakes generated procedurally using the three methods. From top to bottom: no method, equiangular sampling, exponential line search, backtracking line search.

Limitations. As mentioned before, the major limitation of my method comes from the aliasing already present in the material mask $M$. This can create slight value changes along the sampling directions (see Figure 4.9, for example). While these are very minor, they can be problematic when computing the derivatives of distance fields. Thankfully, I was able to overcome this problem in all three sampling algorithms using randomization.

Another issue is that our method only takes into account the visible geometry. Sharp angles with the skin surface can create visible artifacts that flicker as the view perspective changes (see Figure 4.10). This happens when not enough skin geometry is visible, and thus being sampled, to generate informed results.

Chapter 5
Conclusion

In this thesis I have presented work for automatically animating facial details which, while small in scale, play a crucial role in character realism.
I have shown that our methods can be used in real-time scenarios and interactively affected by user control. Generalization to different digital characters or globes has also been explored.

The major advantage of our methods is that they improve current interactive human facial animation without additional artist input. The work described in Chapter 3 automatically ensures that skin in the eye region and gaze direction always match. My contributions allow our system to automatically animate all the smaller features of the region and to be controlled with little to no artist intervention. As recorded actor performance is not viable for animation of interactive characters, recreating these intricacies would otherwise require an artist to meticulously build blendshapes to represent them, with various degrees of success. The same is true for the work described in Chapter 4, which can be easily applied on top of existing rendering frameworks to generate the lacrimal lakes of all characters in the scene. As performing fluid simulation is too computationally costly for real-time performance, providing a character with a lacrimal lake would normally require an artist to manually animate a liquid mesh, which would have to react to all the possible character facial poses.

5.1 Discussion and Future Directions

I now make a number of general observations about possible future work in the topics of this thesis.

One of the limitations in my analysis of the system described in Chapter 3 is the one-camera capture system mentioned in Section 3.4, which is unable to track the skin vertices close to the edges of the eyelids or in the canthus. These vertices had to be extrapolated, as described in Section 3.6. Thus, it would be of some significance to see how the rest of the system, in particular my contributions, would scale to more costly capture setups, which could track those vertices or provide higher precision.
On the same note, the major limitation of our reduced coordinate representation of skin motion, described in Section 3.2, is that it cannot represent the finer details of the human face. These had to be reconstructed using my heuristic model, described in Section 3.8. Possible research paths could be to explore different heuristics and compare their respective effectiveness in the model, or to find alternative representations that do not suffer from the same problem.

Regarding Chapter 4, distance field generation in screen space opens many unexplored possibilities. One clear research direction is to explore other use cases for these fields, besides procedural generation of lacrimal lakes. Another is to improve upon the distance field generation itself: for example, I have yet to analyze the effects of using 2D screen space distance as the distance metric. This would reduce the number of texture accesses. On the same note, one could explore rendering distance fields at half resolution. Combining our screen space approach with the geometry-based approaches for mesh border generation, mentioned in Section 4.2, has also never been explored, even though this would make field generation dependent on mesh complexity.

Finally, considering the case of lacrimal lake generation, my results were only a proof of concept, and defining an effective procedural function for this effect is still an open problem.

Bibliography

Javier San Agustin, Arantxa Villanueva, and Rafael Cabeza. Pupil brightness variation as a function of gaze direction. In Proceedings of the 2006 Symposium on Eye Tracking Research & Applications, ETRA ’06, pages 49–49, New York, NY, USA, 2006. ACM. ISBN 1-59593-305-0.

Oleg Alexander, Graham Fyffe, Jay Busch, Xueming Yu, Ryosuke Ichikari, Andrew Jones, Paul Debevec, Jorge Jimenez, Etienne Danvoye, Bernardo Antionazzi, et al. Digital ira: Creating a real-time photoreal digital actor. In ACM SIGGRAPH 2013 Posters, page 1. ACM, 2013.

Clare Anderson, AW Wales, and James A Horne.
Pvt lapses differ according to eyes open, closed, or looking away. Sleep, 33(2):197–204, 2010.

John Anderson. ’polar express’ derails in zombie land. Newsday, 2004.

A Terry Bahill, Michael R Clark, and Lawrence Stark. The main sequence, a tool for studying human eye movements. Mathematical Biosciences, 24(3):191–204, 1975.

Louis Bavoil, Miguel Sainz, and Rouslan Dimitrov. Image-space horizon-based ambient occlusion. In ACM SIGGRAPH 2008 Talks, SIGGRAPH ’08, pages 22:1–22:1, New York, NY, USA, 2008. ACM. ISBN 978-1-60558-343-3.

Wolfgang Becker. The neurobiology of saccadic eye movements. metrics. Reviews of oculomotor research, 3:13, 1989.

T. Beeler, F. Hahn, D. Bradley, B. Bickel, P. Beardsley, C. Gotsman, R.W. Sumner, and M. Gross. High-quality passive facial performance capture using anchor frames. ACM Trans. Graph., 30(4):75:1–75:10, July 2011.

Pascal Bérard, Derek Bradley, Maurizio Nitti, Thabo Beeler, and Markus Gross. High-quality capture of eyes. ACM Trans. Graph., 33(6):223:1–223:12, November 2014. ISSN 0730-0301.

Amit Bermano, Thabo Beeler, Yeara Kozlov, Derek Bradley, Bernd Bickel, and Markus Gross. Detailed spatio-temporal reconstruction of eyelids. ACM Trans. Graph., 34(4):44:1–44:11, July 2015. ISSN 0730-0301.

Bernd Bickel, Mario Botsch, Roland Angst, Wojciech Matusik, Miguel Otaduy, Hanspeter Pfister, and Markus Gross. Multi-scale capture of facial geometry and motion. ACM Transactions on Graphics (TOG), 26(3):33, 2007.

Gunnar Blohm, Lance M Optican, and Philippe Lefevre. A model that integrates eye velocity commands to keep track of smooth eye displacements. Journal of computational neuroscience, 21(1):51–70, 2006.

Judee K Burgoon, Laura K Guerrero, and Kory Floyd. Nonverbal communication. Routledge, 2016.

John Chapman. Screen space ambient occlusion tutorial. http://john-chapman-graphics.blogspot.ca/2013/01/ssao-tutorial.html, 2011.

Kai-Chun Chen, Pei-Shan Chen, and Sai-Keung Wong. A hybrid method for water droplet simulation.
In Proceedings of the 11th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and Its Applications in Industry, VRCAI ’12, pages 341–344, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1825-9.

Douglas Decarlo and Dimitris Metaxas. Optical flow constraints on deformable models with applications to face tracking. International Journal of Computer Vision, 38(2):99–127, 2000.

Joe Demers. Depth of field: A survey of techniques. GPU Gems, 1(375):U390, 2004.

Eugene d’Eon and David Luebke. Advanced techniques for realistic real-time skin rendering. GPU Gems, 3(3):293–347, 2007.

Craig Donner and Henrik Wann Jensen. Light diffusion in multi-layered translucent materials. ACM Trans. Graph., 24(3):1032–1039, July 2005. ISSN 0730-0301. doi: 10.1145/1073204.1073308. URL http://doi.acm.org/10.1145/1073204.1073308.

William Alexander Newman Dorland. Dorland’s medical dictionary. Saunders Press Philadelphia, PA, 1980.

Paul Ekman and Wallace V Friesen. Facial action coding system. Consulting Psychologists Press, Stanford University, Palo Alto, 1977.

Kenny Erleben and Henrik Dohlmann. Signed distance fields using single-pass gpu scan conversion of tetrahedra. In Gpu Gems 3. Addison-Wesley, 2008.

Rose Eveleth. Robots: Is the uncanny valley real? BBC Future, 2013.

Craig Evinger, Karen A Manning, and Patrick A Sibony. Eyelid movements. mechanisms and normal data. Investigative ophthalmology & visual science, 32(2):387–400, 1991.

A. Fick. Die bewegung des menschlichen augapfels. Z. Rationelle Med., (4):101–128, 1854.

Tamar Flash and Neville Hogan. The coordination of arm movements: an experimentally confirmed mathematical model. The journal of Neuroscience, 5(7):1688–1703, 1985.

Guillaume François, Pascal Gautron, Gaspard Breton, and Kadi Bouatouch. Anatomically accurate modeling and rendering of the human eye. In ACM SIGGRAPH 2007 Sketches, SIGGRAPH ’07, New York, NY, USA, 2007. ACM.

Yasutaka Furukawa and Jean Ponce. Dense 3d motion capture from synchronized video streams.
In Image and Geometry Processing for 3-D Cinematography, pages 193–211. Springer, 2010.

Graham Fyffe, Andrew Jones, Oleg Alexander, Ryosuke Ichikari, Paul Graham, Koki Nagano, Jay Busch, and Paul Debevec. Driving high-resolution facial blendshapes with video performance capture. In ACM SIGGRAPH 2013 Talks, page 33. ACM, 2013.

Pablo Garrido, Levi Valgaerts, Chenglei Wu, and Christian Theobalt. Reconstructing detailed dynamic face geometry from monocular video. ACM Trans. Graph., 32(6):158, 2013.

Abhijeet Ghosh, Graham Fyffe, Borom Tunwattanapong, Jay Busch, Xueming Yu, and Paul Debevec. Multiview face capture using polarized spherical gradient illumination. ACM Transactions on Graphics (TOG), 30(6):129, 2011.

Henry Gray. Gray’s Anatomy: With original illustrations by Henry Carter. Arcturus Publishing, 2009.

Mark R Harwood, Laura E Mezey, and Christopher M Harris. The spectral main sequence of human saccades. The Journal of neuroscience, 19(20):9098–9106, 1999.

Shoya Ishimaru, Kai Kunze, Koichi Kise, Jens Weppner, Andreas Dengel, Paul Lukowicz, and Andreas Bulling. In the blink of an eye: combining head motion and eye blink frequency for activity recognition with google glass. In Proceedings of the 5th augmented human international conference, page 15. ACM, 2014.

Laurent Itti, Christof Koch, and Ernst Niebur. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on pattern analysis and machine intelligence, 20(11):1254–1259, 1998.

Laurent Itti, Nitin Dhavale, and Frederic Pighin. Realistic avatar eye and head animation using a neurobiological model of visual attention. In Optical science and technology, SPIE’s 48th annual meeting, pages 64–78. International Society for Optics and Photonics, 2004.

Henrik Wann Jensen and Juan Buhler. A rapid hierarchical rendering technique for translucent materials. ACM Trans. Graph., 21(3):576–581, July 2002. ISSN 0730-0301. doi: 10.1145/566654.566619.
URL http://doi.acm.org/10.1145/566654.566619.

Jorge Jimenez, Veronica Sundstedt, and Diego Gutierrez. Screen-space perceptual rendering of human skin. ACM Transactions on Applied Perception (TAP), 6(4):23, 2009.

Jorge Jimenez, Timothy Scully, Nuno Barbosa, Craig Donner, Xenxo Alvarez, Teresa Vieira, Paul Matts, Verónica Orvalho, Diego Gutierrez, and Tim Weyrich. A practical appearance model for dynamic facial color. ACM Transactions on Graphics (TOG), 29(6):141, 2010a.

Jorge Jimenez, David Whelan, Veronica Sundstedt, and Diego Gutierrez. Real-time realistic skin translucency. IEEE Computer Graphics and Applications, 30(4):32–41, 2010b.

Jorge Jimenez, Károly Zsolnai, Adrian Jarabo, Christian Freude, Thomas Auzinger, Xian-Chun Wu, Javier von der Pahlen, Michael Wimmer, and Diego Gutierrez. Separable subsurface scattering. Computer Graphics Forum, 2015.

Murray W Johns, Andrew Tucker, Robert Chapman, Kate Crowley, and Natalie Michael. Monitoring eye and eyelid movements by infrared reflectance oculography to measure drowsiness in drivers. Somnologie-Schlafforschung und Schlafmedizin, 11(4):234–242, 2007.

Yvonne Jung and Johannes Behr. Gpu-based real-time on-surface droplet flow in x3d. In Proceedings of the 14th International Conference on 3D Web Technology, Web3D ’09, pages 51–54, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-432-4.

Seth Koterba, Simon Baker, Iain Matthews, Changbo Hu, Jing Xiao, Jeffrey Cohn, and Takeo Kanade. Multi-view aam fitting and camera calibration. In Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, volume 1, pages 511–518. IEEE, 2005.

Michael Lam and Gladimir Baranoski. A predictive light transport model for the human iris. In Computer Graphics Forum, volume 25, pages 359–368. Wiley Online Library, 2006.

Aaron Lefohn, Brian Budge, Peter Shirley, Richard Caruso, and Erik Reinhard. An ocularist’s approach to human iris synthesis. Computer Graphics and Applications, IEEE, 23(6):70–75, 2003.

R John Leigh and David S Zee.
The neurology of eye movements. Oxford University Press, USA, 2015.

Joe Letteri. Weta digital on their digital paul walker. FXGuide. Retrieved from https://www.fxguide.com/fxpodcasts/fxpodcast-298-weta-digital-on-their-digital-paul-walker/, Dec. 2015.

Duo Li, Shinjiro Sueda, Debanga R Neog, and Dinesh K Pai. Thin skin elastodynamics. ACM Transactions on Graphics (TOG), 32(4):49, 2013a.

Hao Li, Robert W Sumner, and Mark Pauly. Global correspondence optimization for non-rigid registration of depth scans. Computer graphics forum, 27(5):1421–1430, 2008.

Hao Li, Jihun Yu, Yuting Ye, and Chris Bregler. Realtime facial animation with on-the-fly correctives. ACM Transactions on Graphics (Proceedings SIGGRAPH 2013), 32(4), July 2013b.

Timothy Lottes. Fxaa. NVIDIA white paper. Retrieved from http://goo.gl/fr3plh, 2011.

Tom Mertens, Jan Kautz, Philippe Bekaert, Frank Van Reeth, and Hans-Peter Seidel. Efficient rendering of local subsurface scattering. In Computer Graphics Forum, volume 24, pages 41–49. Wiley Online Library, 2005.

Martin Mittring. Finding next gen: Cryengine 2. In ACM SIGGRAPH 2007 Courses, SIGGRAPH ’07, pages 97–121, New York, NY, USA, 2007. ACM. ISBN 978-1-4503-1823-5.

Masahiro Mori, Karl F MacDorman, and Norri Kageki. The uncanny valley [from the field]. Robotics & Automation Magazine, IEEE, 19(2):98–100, 2012.

Tsuyoshi Moriyama, Takeo Kanade, Jing Xiao, and Jeffrey F Cohn. Meticulously detailed eye region model and its application to analysis of facial images. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 28(5):738–752, 2006.

Matthias Müller, David Charypar, and Markus Gross. Particle-based fluid simulation for interactive applications. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, SCA ’03, pages 154–159, Aire-la-Ville, Switzerland, Switzerland, 2003. Eurographics Association. ISBN 1-58113-659-5.

Koki Nagano, Graham Fyffe, Oleg Alexander, Jernej Barbic, Hao Li, Abhijeet Ghosh, and Paul Debevec.
Skin microstructure deformation with displacement map convolution. ACM Transactions on Graphics (TOG), 34(4):109, 2015.

Tamami Nakano and Shigeru Kitazawa. Eyeblink entrainment at breakpoints of speech. Experimental brain research, 205(4):577–581, 2010.

National Eye Institute. Diagram of eye, 2016.

OpenStax. Anatomy & physiology, June 2013.

Peter E Oppenheimer. Real time design and animation of fractal plants and trees. In ACM SIGGRAPH Computer Graphics, volume 20, pages 55–64. ACM, 1986.

Vitor F. Pamplona, Manuel M. Oliveira, and Gladimir V. G. Baranoski. Photorealistic models for pupil light reflex and iridal pattern deformation. ACM Trans. Graph., 28(4):106:1–106:12, September 2009. ISSN 0730-0301.

Eric Penner and George Borshukov. Pre-integrated skin shading. Gpu Pro, 2:41–54, 2011.

Dmitriy Pinskiy and Erick Miller. Realistic eye motion using procedural geometric methods. SIGGRAPH 2009: Talks, page 75, 2009.

Jasia Reichardt. Robots: fact, fiction+ prediction. Thames & Hudson Ltd., London, 1978.

Jareck Rossignac and Maarten van Emmerik. Hidden contours on a frame-buffer. In Proceedings of the 7th workshop on computer graphics hardware, 1992.

K Ruhland, S Andrist, JB Badler, CE Peters, NI Badler, M Gleicher, B Mutlu, and R McDonnell. Look me in the eyes: A survey of eye and gaze animation for virtual agents and artificial systems. In Eurographics 2014-State of the Art Reports, pages 69–91. The Eurographics Association, 2014.

Mark A Sagar, David Bullivant, Gordon D Mallinson, and Peter J Hunter. A virtual environment and model of the eye for surgical simulation. In Proceedings of the 21st annual conference on Computer graphics and interactive techniques, pages 205–212. ACM, 1994.

Mathieu Sanchez, Oleg Fryazinov, and Alexander Pasko. Efficient evaluation of continuous signed distance to a polygonal mesh. In Proceedings of the 28th Spring Conference on Computer Graphics, SCCG ’12, pages 101–108, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1977-5. doi: 10.1145/2448531.2448544.
URL http://doi.acm.org/10.1145/2448531.2448544.

Fuhao Shi, Hsiang-Tao Wu, Xin Tong, and Jinxiang Chai. Automatic acquisition of high-fidelity facial performances using monocular videos. ACM Trans. Graph., 33(6):222:1–222:13, November 2014. ISSN 0730-0301.

JH Skotte, Jacob Klenø Nøjgaard, LV Jørgensen, KB Christensen, and G Sjøgaard. Eye blink frequency during different computer tasks quantified by electrooculography. European journal of applied physiology, 99(2):113–119, 2007.

John A Stern, Larry C Walrath, and Robert Goldstein. The endogenous eyeblink. Psychophysiology, 21(1):22–33, 1984.

Natalya Tatarchuk. Advances in real-time rendering in 3d graphics and games i. In ACM SIGGRAPH 2009 Courses, page 4. ACM, 2009.

D. Terzopoulos and K. Waters. Physically-based facial modelling, analysis, and animation. The Journal of Visualization and Computer Animation, 1(2):73–80, December 1990. ISSN 1049-8907.

Laura C Trutoiu, Elizabeth J Carter, Iain Matthews, and Jessica K Hodgins. Modeling and animating eye blinks. ACM Transactions on Applied Perception (TAP), 8(3):17, 2011.

Vagia Tsiminaki, Jean-Sébastien Franco, Edmond Boyer, et al. High resolution 3d shape texture from multiple videos. In International Conference on Computer Vision and Pattern Recognition, 2014.

Ed Ulbrich. How benjamin button got his face. TED talk. Retrieved from https://www.ted.com/talks/ed_ulbrich_shows_how_benjamin_button_got_his_face, 2009.

Levi Valgaerts, Chenglei Wu, Andrés Bruhn, Hans-Peter Seidel, and Christian Theobalt. Lightweight binocular facial performance capture under uncontrolled lighting. ACM Trans. Graph., 31(6):187, 2012.

Wijnand van Tol and Arjan Egges. Real-time crying simulation for 3d characters. 2012.

Artur Vill. Eye texture raytracer. https://www.chromeexperiments.com/experiment/eye-texture-raytracer, Feb. 2014.

Huamin Wang, Peter J. Mucha, and Greg Turk. Water drops on surfaces. ACM Trans. Graph., 24(3):921–929, July 2005.
ISSN 0730-0301.

Thibaut Weise, Sofien Bouaziz, Hao Li, and Mark Pauly. Realtime performance-based facial animation. ACM Transactions on Graphics (Proceedings SIGGRAPH 2011), 30(4), July 2011.

Axel Weissenfeld, Kang Liu, and Jörn Ostermann. Video-realistic image-based eye animation via statistically driven state machines. The Visual Computer, 26(9):1201–1216, 2010.

WikiHuman. http://gl.ict.usc.edu/Research/DigitalEmily2/, 2015.

Chenglei Wu, Bennett Wilburn, Yasuyuki Matsushita, and Christian Theobalt. High-quality shape from multi-view stereo and shading under general illumination. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 969–976. IEEE, 2011.

Jing Xiao, Simon Baker, Iain Matthews, and Takeo Kanade. Real-time combined 2d+ 3d active appearance models. In CVPR (2), pages 535–542, 2004.

Alfred L Yarbus. Eye movements during perception of complex objects. Springer, 1967.

Sang Hoon Yeo, Martin Lesmana, Debanga R Neog, and Dinesh K Pai. Eyecatch: simulating visuomotor coordination for object interception. ACM Transactions on Graphics (TOG), 31(4):42, 2012.

Yizhong Zhang, Huamin Wang, Shuai Wang, Yiying Tong, and Kun Zhou. A deformable surface model for real-time water drop animation. IEEE Transactions on Visualization and Computer Graphics, 18(8):1281–1289, August 2012. ISSN 1077-2626.
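
Supplementary sketch. The lacrimal lake depth and surface normal computation described under "Lacrimal Lake Generation" in Section 4.5 can be illustrated with a few lines of code. This is only a minimal sketch under stated assumptions, not the implementation used in the thesis: the function names, the constants `max_depth` and `falloff`, and the choice that depth decreases with distance from the skin are all hypothetical. The normal is obtained by crossing finite differences of view space positions, which is analogous to crossing screen space derivatives in a fragment shader.

```python
# Illustrative sketch only: names and constants are assumptions, not thesis values.

def lake_depth(distance_to_skin, max_depth=1.0, falloff=0.5):
    """First-order (linear) depth model, clamped at zero.

    Assumes the lake is deepest at the skin and thins out linearly
    as the distance to the skin grows.
    """
    return max(0.0, max_depth - falloff * distance_to_skin)


def sub(a, b):
    """Component-wise difference of two 3D vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    """Scale a non-zero 3D vector to unit length."""
    length = (v[0] ** 2 + v[1] ** 2 + v[2] ** 2) ** 0.5
    return (v[0] / length, v[1] / length, v[2] / length)


def surface_normal(p, p_right, p_up):
    """Normal from the view space position of a pixel and of its
    right and upper neighbours (finite-difference derivatives)."""
    return normalize(cross(sub(p_right, p), sub(p_up, p)))
```

For example, `surface_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))` yields a normal along the positive z axis. The sketch does not handle degenerate neighbours (zero-length cross product), which a real shader would need to guard against.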

