A DESIGN SPACE FOR WHOLE-HAND FLAT SURFACE INTERACTION

by

TIMOTHY TIEN-HUA CHEN

B.Sc., The University of British Columbia, 2002

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF
Master of Applied Science in Electrical & Computer Engineering
in
THE FACULTY OF GRADUATE STUDIES
(Electrical & Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA

April 2005

© Timothy T. Chen, 2005

Abstract

Touch interfaces offer interaction possibilities that are not viable with common computer user input devices, yet many touch interfaces only sense single-point touch. Whole-hand input devices capture more complexities of the hands and fingers, but are not frequently used to capture touch. The confluence of these areas describes whole-hand touch interfaces that are capable of capturing rich hand actions performed with respect to a surface. While prevailing interaction paradigms suggest that constraining input to two or three dimensions yields superior usability and efficiency, other domains, such as art and music, lend themselves to input with many more degrees of freedom. This research proposes an organization of whole-hand surface interaction that comprises a classification of existing touch interfaces, a taxonomy of flat surface-constrained hand actions, and a design space for whole-hand flat surface touch interaction. An objective of this research is to better understand which properties of touch interfaces are conducive to capturing certain actions. Finally, we present some case studies that illustrate examples of whole-hand touch applications and technologies in the context of this research.

Table of Contents

Abstract
Table of Contents
List of Figures
List of Tables
Acknowledgments
Chapter 1: Introduction
  1.1 Motivations
  1.2 Contributions
  1.3 Contents of the Thesis
Chapter 2: Touch Control in Interactive Systems
  2.1 Physiology of Touch
  2.2 Touch Interaction with Physical Tools
  2.3 Conventional Human-Computer Input
  2.4 Conventional Human-Computer Touch Input
  2.5 Summary
Chapter 3: Related Work
  3.1 Design Methods for Interfaces
    3.1.1 Appropriateness of Input
    3.1.2 Taxonomy of Hand Input
    3.1.3 Evaluation Guide
    3.1.4 Device Capabilities
  3.2 Classifying User Interfaces
    3.2.1 Logical Devices
    3.2.2 Focus on Pragmatics
    3.2.3 Design Space of Input Devices
    3.2.4 Input Device Models
  3.3 Touch Technologies
    3.3.1 Interfaces Using Electrical Properties
      3.3.1.1 Single-point Sensing
      3.3.1.2 Multiple-point Sensing
    3.3.2 Interfaces Using Optical Properties
      3.3.2.1 Video Capture
      3.3.2.2 Light Transmission
    3.3.3 Interfaces Using Mechanical Properties
      3.3.3.1 Haptic Devices
  3.4 The Human Sense of Touch
  3.5 Summary
Chapter 4: A Taxonomy for Touch Interfaces
  4.1 Motivation
  4.2 Basis for New Taxonomy
  4.3 Classifying Touch Input Devices in Existing Taxonomy
    4.3.1 Explanation of Device Placement in Taxonomy
      4.3.1.1 TouchScreen
      4.3.1.2 Laptop Touch Pad
      4.3.1.3 Wacom Drawing Tablet
      4.3.1.4 DiamondTouch
      4.3.1.5 MTS (FingerWorks)
      4.3.1.6 SmartSkin
      4.3.1.7 MTC Express
      4.3.1.8 STC-1000
      4.3.1.9 GelForce
    4.3.2 Limitations to Existing Taxonomy
  4.4 Updating Taxonomy
    4.4.1 Single-point, Multi-point, Higher-order Input
    4.4.2 Pressure Sensing
    4.4.3 Sensor Resolution
    4.4.4 Other Changes
  4.5 Important Insights
  4.6 Summary
Chapter 5: The Space of Hand Actions
  5.1 Motivation
  5.2 Examples of Hand Gestures
  5.3 Touch Interaction in the Physical Context
    5.3.1 Modes of Physical Touch Interaction
    5.3.2 Physical Touch Interaction in Interactive Systems
    5.3.3 Generalized Touch Interaction
    5.3.4 Characterization of Hand Actions
  5.4 Methodology
  5.5 Atomic Single Hand Actions
    5.5.1 Hand-centric Actions
      5.5.1.1 Hand Press
      5.5.1.2 Hand Wipe
      5.5.1.3 Hand Twist
      5.5.1.4 Hand Roll
    5.5.2 Finger-centric Actions
      5.5.2.1 Finger Press
      5.5.2.2 Finger Drag
      5.5.2.3 Coordinated Finger Drag
        5.5.2.3.1 Pinch/Squeeze
        5.5.2.3.2 Stretch/Spread
      5.5.2.4 Scratch
      5.5.2.5 Finger Twist
      5.5.2.6 Coordinated Finger Twist
        5.5.2.6.1 Finger-centred Twist
        5.5.2.6.2 Centroid-centred Twist
      5.5.2.7 Finger Roll
  5.6 Additional Semantics of Touch Interaction
    5.6.1 Bimanual Hand Actions
    5.6.2 Dominant/Non-dominant Hand Actions
    5.6.3 Compound/Repetitive Actions
    5.6.4 Pressure Variation
    5.6.5 Onset/Release and Duration Semantics
  5.7 Summary
  5.8 Figures
    5.8.1 Hand Press
    5.8.2 Hand Wipe
    5.8.3 Hand Twist
    5.8.4 Hand Roll
    5.8.5 Finger Press
    5.8.6 Finger Drag
    5.8.7 Coordinated Finger Drag (Pinch/Squeeze; Stretch/Spread)
    5.8.8 Scratch
    5.8.9 Finger Twist
    5.8.10 Coordinated Finger Twist
    5.8.11 Finger Roll
Chapter 6: Design Space for Touch Interfaces
  6.1 Generating the Design Space
    6.1.1 Primitive Movement Vocabulary
      6.1.1.1 Manipulation Operator (M)
      6.1.1.2 Current State (S)
      6.1.1.3 Input Domain (In)
      6.1.1.4 Output Domain (Out)
      6.1.1.5 Resolution Function (R)
      6.1.1.6 Work Properties (W)
    6.1.2 Composition Operators
      6.1.2.1 Merge Composition
      6.1.2.2 Layout Composition
      6.1.2.3 Connect Composition
    6.1.3 Design Space for Touch Interfaces
      6.1.3.1 Placement of Devices
  6.2 Testing Points in the Design Space
    6.2.1.1 Expressiveness
    6.2.1.2 Effectiveness
  6.3 Summary
Chapter 7: Whole-hand Interactive Applications
  7.1 FlowField - Exploring Interface Limitations
    7.1.1 FlowField Experience
    7.1.2 System Architecture
    7.1.3 Mapping of Touch Input
    7.1.4 Evaluation of FlowField
    7.1.5 Relating FlowField to Thesis Contributions
  7.2 Three Investigations Motivated by FlowField
    7.2.1 Investigation of Improving Data Produced by Interface
      7.2.1.1 Layered Framework for Touch Interface Application Design
      7.2.1.2 Interpolation of Raw Data
      7.2.1.3 Relating to Design Space
      7.2.1.4 Improving FlowField using Interpolated Data
    7.2.2 Investigation of Suitable Hand Actions for the Interface
      7.2.2.1 3D Model Viewer
      7.2.2.2 Labyrinth Game
      7.2.2.3 Relating to Design Space
    7.2.3 Investigation of Different Interface Attributes
      7.2.3.1 System Architecture
      7.2.3.2 Proof-of-Concept Application
      7.2.3.3 Relating to Design Space
  7.3 Summary
Chapter 8: Conclusions and Future Work
  8.1 Contributions
    8.1.1 Taxonomy of Touch Interfaces
    8.1.2 Enumeration of Surface-constrained Hand Actions
    8.1.3 Design Space of Touch Interfaces
    8.1.4 Applications and Related Contributions
    8.1.5 Publications
  8.2 Future Work
  8.3 Conclusion
References
Appendix A: Glossary
Appendix B: FlowField Experience Questionnaire

List of Figures

Figure 0.1: "Piled Higher and Deeper" by Jorge Cham (www.phdcomics.com); originally published February 24, 2005
Figure 2.1: Standard keyboard and mouse
Figure 2.2: Transition diagram for tracking and dragging mechanisms of a mouse
Figure 2.3: Transition diagram for tracking and dragging mechanism for touch pad (grey for extension if auxiliary button is present)
Figure 2.4: Examples of laptop touch pads; centre figure also shows an isometric joystick
Figure 2.5: Transition diagram for tracking and dragging mechanisms with a pressure-sensitive touch pad
Figure 3.1: Design method for whole-hand input; shows design flow for developing whole-hand input for any specific application or set of tasks
Figure 3.2: Enumeration of some common interactive techniques and interface technologies in terms of logical action primitives
Figure 3.3: Taxonomy based on design space analysis of input devices
Figure 3.4: Parameterization of input device properties
Figure 3.5: Cirque touch pad (left); Wacom Graphire3 drawing tablet (right)
Figure 3.6: Multi-point sensing desk interfaces: SmartSkin (left), DiamondTouch (right)
Figure 3.7: Range information generated by Haptic Lens (left); interacting with GelForce (right)
Figure 3.8: MTC Express (left); STC-1000 (right)
Figure 3.9: Immersion CyberGrasp
Figure 4.1: Input device taxonomy based on design space by Card et al. Rows denote property type; columns indicate dimensionality and resolution of the measured input
Figure 4.2: Classification of touch interfaces in the Mackinlay and Card taxonomy
Figure 4.3: Proximity sensing in SmartSkin
Figure 4.4: Taxonomy of touch interfaces. Circles denote single-point devices; squares denote multi-point/field sensing devices. Sub-columns denote scale of sensor resolution.
Figure 5.1: Axes of rotation for hand actions with respect to touch interactive surface
Figure 5.2: Examples of hand press action (1): left: force applied directly into the surface (orthogonal hand press); right: force applied in an off-normal direction (directional hand press)
Figure 5.3: Examples of hand press action (2): left: closed fist; right: side of palm
Figure 5.4: Hand wipe action: hand remains static throughout wipe
Figure 5.5: Hand wipe action: hand changes configuration during wipe
Figure 5.6: Hand wipe action: using two fingers (cf. finger drag where focus is on individual fingertips)
Figure 5.7: Hand twist action: axis of rotation is orthogonal to the surface and centred in the middle of the palm (dot); axis can be located anywhere
Figure 5.8: Hand twist action involving side of palm; axis of rotation denoted by dot
Figure 5.9: Hand roll action: closed hand rolls from left to right
Figure 5.10: Hand roll action: open hand rolls from left to right
Figure 5.11: Hand roll action: closed hand rolls from bottom to top
Figure 5.12: Hand roll action: two fingers roll from left to right (cf. single finger roll)
Figure 5.13: Multiple variations of a finger press action; in all cases contact is made by individual fingers or discrete points on the hand, rather than with the whole hand
Figure 5.14: Examples of finger drag actions: top: single finger; middle: two fingers; bottom: all fingers and thumb
Figure 5.15: Finger pinch action between thumb and index finger with both moving
Figure 5.16: Finger pinch action between thumb and index finger with finger moving towards thumb
Figure 5.17: Finger pinch action between two fingers and thumb
Figure 5.18: Finger pinch action between all fingers and thumb; can be considered a squeeze action to emphasize less precision compared with the pinch actions diagrammed above
Figure 5.19: Finger pinch action between two fingers and thumb; palm planted on surface
Figure 5.20: Finger pinch action between little finger and thumb; side of palm planted on surface
Figure 5.21: Finger scratch action with index finger
Figure 5.22: Finger twist action of finger pointed directly into surface
Figure 5.23: Finger twist action of finger contacting the surface through the finger pad
Figure 5.24: Finger twist action of thumb
Figure 5.25: Coordinated finger twist action of two fingers that touch the surface; neither one stays in the same spot, and therefore considered a centroid-centred twist
Figure 5.26: Finger-centred twist of two fingers; middle finger remains in place while index finger moves; can be considered a drag action from the perspective of the index finger
Figure 5.27: Finger-centred twist of two fingers and thumb; thumb remains in place while two fingers move about thumb
Figure 5.28: Centroid-centred twist with two fingers; neither finger remains in place; they rotate about an axis that lies roughly between the two points marked by the fingertips
Figure 5.29: Centroid-centred twist with two fingers and thumb; fingers and thumb all rotate about a point located within the area demarcated by the points of contact
Figure 5.30: Finger roll action
Figure 6.1: Laptop touch pad and MTC Express in the design space for touch interfaces; columns denote type of hand actions (SF = single-finger, MF = multi-finger)
Figure 6.2: Visualization of design space for touch interfaces; spheres for GelForce represent ability to capture pressure in all directions, not just into the surface
Figure 7.1: System diagram of FlowField
Figure 7.2: The virtual field of particles arranged in a cylinder (left); circular obstructions in flow (right)
Figure 7.3: Data flow diagram for FlowField; boxes in each computer represent concurrently running threads
Figure 7.4: Obstructions seen in field of virtual particles; mapped to one half of cylinder only
Figure 7.5: Layered framework for touch interactive applications
Figure 7.6: Raw data values from the MTC Express
Figure 7.7: Results of bilinear processing of raw data
Figure 7.8: Results of bicubic interpolation of raw data
Figure 7.9: Results of Gaussian interpolation of raw data
Figure 7.10: Pressure value fluctuations due to traversal of finger over and between sensor locations (circles)
Figure 7.11: Snapshots of FlowField 2: force vectors are represented in yellow
Figure 7.12: 3D model viewer using hand-centric roll as input: centred teapot (top); rotated teapots (left) with corresponding pressure input data (right) represented as greyscale image (dark corresponds to high pressure)
Figure 7.13: Screenshot of Xtreme Labyrinth
Figure 7.14: Images of Malleable Surface Interface: video camera positioned underneath interface (left); interface with tracking dots visible (top right); deforming the interface (bottom right)
Figure 7.15: KineticsKit spring mass model: interfaced with Malleable Surface Interface (left); interfaced with MTC Express (right)

List of Tables

Table 3.1: Evaluation guide
Table 3.2: Taxonomy of input devices centred around pragmatic functionalities
Table 3.3: Physical properties sensed by input devices
Table 3.4: Existing touch interfaces organized by technology
Table 3.5: Touch research concepts important to this thesis
Table 5.1: Hand actions classified according to hand- or finger-centric focus and transformation
Table 6.1: Terms of representation of an input device

Acknowledgments

Figure 0.1: "Piled Higher and Deeper" by Jorge Cham (www.phdcomics.com); originally published February 24, 2005.

Why is there a comic strip about grad students spinning their wheels trying to graduate that also happens to have a character that looks kind of like me in it? I do not know. But this comic and my journey over the last few years have made me realize that no matter what kind of difficulties I face and situations I find myself in, I am not alone. Accordingly, there are many people to whom I need to express my gratitude.

First and foremost, of course, I need to thank my parents for their unending support and understanding, not to mention food and shelter. To my brother, Chris, who continually shatters the monotony of my life with his attention-grabbing antics and what passes for music these days. I love you all.

To Sid, my supervisor, for all the obvious reasons: accepting me as a student, all the advice, encouragement, and the opportunities of being part of the lab. Oh, and funding, too.

To Drs. Booth and Iverson: for agreeing to serve on my committee. I appreciate all your insights and comments.

To the HCT Lab, for making my experience over the past three years memorable. In particular: Sarah, for all the work and help with many of the contributions of this thesis, most importantly the interpolation work, as well as some interesting demos using the MTC Express; Florian, for all the technical help and collaborations; Ashley, for his sense of humour and boundless enthusiasm, as well as showing me that one can accomplish a lot besides finishing a thesis; Grace, Edgar, and Farhan, for showing that graduation with Sid is possible, as well as all the good times cobbling the nascent lab together; Nelson and Tony, for giving feedback for my defence.

To the two Patricias, whose incessant incredulity over my inability to finish this degree was expressed as somewhat facetious encouragement, which actually worked.
To my previous two employers: ATR and NewMIC, for the opportunity to meet Sid, and to jointly collaborate with him, respectively. One company has changed considerably, and the other is no more, but they were both important in my journey. And to my current employer, Electronic Arts Canada, and particularly Steven, a student I TAed in EECE494, who referred me in.

To all my friends from school and church, who actually kept remembering that I was doing a degree, motivating me to just get it done.

And speaking of church, I feel I should give thanks to God, for where I am today. No way could I have anticipated or planned the turn of events that led me to meet Sid in Japan, work with him to finish my undergrad degree at NewMIC, switch over to EECE to do a Masters, and begrudgingly help TA his EECE494 class for a third time, which in turn led me to my position at EA, keeping me in Vancouver. This journey was not without obstacles or challenges, but He has seen me through them to this point, and I am happy and grateful. I can only pray and wonder about what is in store for me in the future, but I know it will, in the end, be better than I can possibly imagine.

Chapter 1: Introduction

The human hands are wondrous instruments that are essential to our everyday existence. The range of motion and dexterity of our hands allow them to perform an amazing variety of tasks. We use our hands to explore and manipulate our environment, to hold and control physical tools and machines, as well as to convey information through gesture and touch. These actions are guided by our sense of touch, which is especially acute in our hands.

In the context of human-computer interaction (HCI), a primary means of using computers is through our hands: we push buttons on a keyboard and mouse, as well as grip the mouse to physically move it. However, when compared with the wide range of non-computing tasks that involve our hands, conventional keyboard-and-mouse interaction seems a rather limited use of our manual abilities. Touch interaction includes these simple actions of manipulating a keyboard and mouse, as well as actions that require our hands to engage in complex movements. Examples of touch interfaces include touch pads and touch screens; these interfaces often capture movements of the human hand, rather than detecting the actuation of physical mechanisms such as buttons or sliders.

Because of the complex nature of our sense of touch and the prodigious abilities of our hands, the study and use of touch in interactive systems is a diverse field of research, encompassing areas such as gesture recognition and haptics, as well as psychology and physiology when we consider the cognitive and physical aspects of touch. We focus our research on the use of touch in interactive systems from the dual perspectives of interface technologies and interactive techniques. This thesis organizes research and developments concerning touch interaction in a framework that promotes greater understanding and guides future efforts in this area. This framework consists of classifications of touch interfaces as well as hand actions, which are synthesized in a design space that associates technological attributes with the semantics involved in touch interaction. To focus our efforts further, we concentrate on touch interaction that is constrained to a flat, physical surface, since many of the touch interfaces we study have touch-sensitive surfaces which are flat.
This also constrains the set of hand actions that can be performed, but we will see that even flat surface interaction has considerable depth. This introduction provides an overview of the major motivations and contributions of the thesis, as well as a detailed organization of its contents. 1.1 Motivations Much research into exploring touch interaction is device-driven. One approach is to investigate novel interaction paradigms using an established touch interface. Another is to invent an application that makes use of a particular interface with novel attributes. These approaches are problematic in that there is a fundamental disconnect between interface attributes and touch semantics. There is difficulty answering the question: what can be done with this particular interface? We were tasked to answer precisely this question when we investigated the M T C Express during the development of FlowField, described in Chapter 7. The M T C Express (Figure 3.8) captures multiple pressure values across an instrumented surface, which is a unique combination of attributes. Its touch pad form factor, coupled with the claim of "multi-touch" support, suggested that it can capture the trajectories of multiple fingers. This was an invocation of intuition, which proved to be problematic as we had issues with choosing an appropriate mapping and ultimately, supporting the desired interaction was difficult. Again, our intuition suggested a few potential directions for investigation: we can process the raw data to better recognize the input we desired; we can explore what kind of other hand actions can be more readily captured; or we can choose or design another interface. These are all valid directions, but deciding between them seemed to be a haphazard proposition, dependent on how much effort we were willing to expend and how much time we had to complete the project. What was missing was a systematic way of addressing issues encountered during the development of touch interactive applications for a particular interface with novel attributes. We needed to identify relevant interface attributes and also understand the space of hand actions that we can consider, given these attributes. Our design space of whole-hand flat-surface interaction aims to fill this role, which builds on the useful compilation and organization of existing touch interfaces as well as the study of how we can use touch with these interfaces. The goal is for interface designers and application developers working in the area of surface touch interaction to be guided with the knowledge of successful past efforts as well as potential difficulties and pitfalls. This work should also be revisited and updated upon 2 relevant new discoveries, though the hope is that it will eventually reduce the overlap of effort in future works. 1.2 Contributions Our exploration of touch interaction has grown from the nascent roots of an application design exercise to a comprehensive framework organizing relevant technologies and semantics. This framework is presented in the following three contributions: 1. Taxonomy of touch interfaces (Chapter 4): this classifies interfaces according to several salient attributes. With this taxonomy, we can relate different interfaces with their capturing characteristics, particularly in terms of the capabilities of the human hand. 2. Organization of hand actions (Chapter 5): we are interested in whole-hand actions that are constrained to a flat surface. 
We classify them between hand- and finger-centric foci, as well as transformation properties, to establish the extent of the capabilities of the human hand. 3. Design space of touch interaction (Chapter 6): integrating the first two contributions, we propose a design space that will correlate interface properties and hand actions. This will offer application designers a basis for selecting which interface properties to use in order to support a desired type of interaction. Conversely, this will also be instructive in understanding which interactions are possible given a particular device. The original design exercise, FlowField, is described in Chapter 7, along with several other applications that were motivated by our increasing understanding of touch interaction. These applications were driven by some of the directions proposed in the previous section, but benefited from the systematized knowledge embodied in our touch interface design space. These developments and supporting applications comprise a tangible contribution that supports the theoretical findings in this thesis. 1.3 Contents of the Thesis The thesis is presented in eight chapters. The major contributions listed in the previous section are presented in separate chapters as indicated. However, before we present this material, some background information must be established. Chapter 2 presents a high-level overview of touch interaction from the perspectives of physiology and human-machine interaction. Concrete examples of touch interactive applications and technology are presented in the related works, in Chapter 3, along with some theoretical background on classifying user input devices, which is instrumental to our own efforts. Chapters 4 to 7 are described as above. The thesis is concluded in Chapter 8 with a summary of the major findings and contributions, as well as a treatment on the vast possibilities of future directions. 3 Chapter 2: Touch Control in Interactive Systems This thesis presents a framework for developing interfaces and applications that are controlled by touch. In this chapter, we build towards the contributions in the thesis by describing touch interaction in terms of touch from the perspective of human physiology as well as the conventions of how touch is commonly used in current interactive systems. We will see that while touch is an integral part of our everyday existence, methods for touch input in computing systems are comparably limited. Our research investigates this area of human-computer touch interaction so we can better understand how to design interfaces and applications that take advantage of our abilities to convey touch. Describing touch interaction in terms of input and output, as is common in some literature, can lead to ambiguities arising from whether the focus is human-centric or interface-centric. We will focus on the conveying of touch from the human onto a touch-sensitive surface; this is typically referred to as touch input, and we will continue to use this meaning. In contrast, we consider tactile input or feedback to be the conveying of sensations from an interface to the human touch sensors. These and other related terms are clarified in Appendix A. 2.1 Physiology of Touch The human sense of touch is essential to our understanding and manipulation of the environment. Like our visual and auditory senses, the sense of touch gathers pertinent information about our environment. 
Many attributes are determined most effectively through direct contact, and recognition of tactile information is achieved through intuition and experience. The sense of touch is not only crucial for recognizing physical attributes of the environment, but it is important to our abilities to physically manipulate our environment, as 4 well. In fact, many of the physical actions we perform would be made difficult i f the sense of touch were deficient or disabled. This is especially true with our hands, which are our primary means of physical interaction with our environment. Our hands have a wide range of motion which is useful for performing a variety of distinct actions. When in contact with a physical medium, the increased sensor density of the human hands, compared with other parts of the anatomy, facilitates precise and fine control, which is important for many basic manual tasks. Many tools have been designed to effectively leverage the abilities of the hands, perhaps amplifying or transforming the manual input to execute actions that would be difficult to perform using the hands directly. More complex tools use the hands for invoking an action and fine-tuning, such as pushing a button or adjusting a dial. In human physiology, touch is processed through skin (cutaneous) receptors which sense contact and pressure, as well as related qualities such as texture and heat. In addition to these, kinaesthesia and proprioception are important contributors to our abilities to perceive and manipulate the environment and objects within it. Proprioception is the implicit knowledge of the existence and configuration of the parts of the body without visual confirmation. With this, actions such as moving our limbs to certain locations are trivial and not dependent on explicit feedback cues. Kinaesthesia is related; it is the implicit knowledge of the human body in motion without external cues. The human hands and fingers are among the areas of the human body which have a high degree of sensitivity to contact and pressure. This is not surprising because we use these most frequently to actively touch our environment. The sensitivity to pressure allows us to recognize not only that we are touching something, but what forces the object imparts onto us due to friction, gravity or other sources. This, in turn, is invaluable to determine various qualities, such as whether we have a sufficient grip on an object to move it, or the texture or hardness of a surface. The way humans usually and instinctively explore the world is through touching objects using their hands and fingers in a dynamically moving manner. This is called active touch. In fact, using the tactile acuities of our hands and fingers together with our visual sense is essential to extracting information about space-occupying objects; using our touch senses alone is second in effectiveness [Schiff82]. A l l parts of the human body are touch-sensitive to varying degrees, but none are nearly as articulate or dextrous as the hands and fingers, and thus cannot experience sensations through active touch. Passive touch tends to generate more subjective evaluations of the experience [Gibson62]. As a result, much of haptic research is focused on providing sensations to the hands and fingers (i.e. vibrating game controllers, mice, etc.), often called tactile or haptic feedback. This is the focus of haptic research; however, we are more concerned with the conveying of touch onto touch-sensitive surfaces for the purposes of human-machine interaction. 
In particular, we focus on systems that have the potential to capture and process touch by sensing contact and/or pressure through a flat touch-sensitive surface, rather than those that 5 use touch only to actuate physical mechanisms. From the perspective of human physiology, the conveyance and sensing of touch are not independent, as every action we make on a touch interface is exerted back to us as well. The same biological systems that are key to our abilities to sense the physical world are also critical to our abilities to manipulate and apply touch with our hands. The channels of cutaneous contact and pressure sensing, kinaesthesia, and proprioception are interdependent; we collectively process information from these channels, in order to know how we are touching something. Our human hands have evolved to give us unique abilities to manipulate our environment that are rarely found among other living organisms. In the next sections, we describe touch interaction in the context of interacting with physical objects in everyday tasks as well as human-computer interactive tasks. This section outlines some of the basic mechanisms for conveying touch to interactive systems. There is far more information regarding the biological and psychological foundations of the way we perceive and use touch than can be summarized here. In the next chapter, some references of seminal contributors to this diverse area are introduced. We now turn to examining how we use touch with physical objects and interactive systems. 2.2 Touch Interaction with Physical Tools Throughout history, humans have created increasingly complex tools to help them do work— to perform tasks that would be otherwise difficult or impossible with our own abilities alone. These tools invariably are held, controlled, and/or invoked using postures and movements of the human body; moreover, it is often the human hands that perform the useful input. These abilities are learned through training and also our own exploration with our touch senses. We are able to intuitively manipulate simple tools, such as those employing combinations of one or more of the six simple machines (lever, inclined plane, wheel and axle, screw, wedge, and pulley). These tools transfer the work done by forces applied by the human body or by gravity along different directions to perform useful work. For example, a screwdriver is used to drive a screw into wood by gripping it with a hand and applying a twisting motion. More complex tools employ powered systems to perform additional work so that we do not have to continuously apply forces on our own (i.e. a power drill with a screw bit). Human work is done to perform tasks that invoke or facilitate the operation of these machines, rather than to perform the effective work itself. Now, with electronics, we use machines that perform computational work, rather than physical work. We find ourselves increasingly interacting with such systems, but still primarily with our hands. We maintain our abilities to manipulate objects in complex manners, yet these are typically not leveraged by many machines, in the interest of saving work. In fact, many controls of interactive systems use only very basic and low impact hand actions; consider the many common physical interfaces: buttons, dials, levers, sliders, among others. 
Even with these basic controls, we can see how the abilities of the hand are beneficial when we need to manipulate more than one control at the same time, such as pushing a bank of sliders or a series of buttons (i.e. musical instruments). 6 2.3 Conventional Human-Computer Input Despite the dexterity of the human hands, there are only two predominant tasks that humans usually perform when interacting with computers. One is typing on the keyboard to provide character information, and the other is moving a mouse for pointing, dragging, and selecting interactions with a graphical interface (Figure 2.1). The latter action is sometimes substituted by touch pad interaction; in either case, an on-screen cursor is moved. The actions involved in performing these tasks are not complex, usually involving button pressing and pushing a physical device around (or dragging a finger across a touch pad). Figure 2.1: Standard keyboard and mouse These interaction methods have persisted through many years of computer innovation. While the vast majority of computer users prefer a keyboard and mouse setup, this is more due to the fact that there are few alternatives rather than any purported superiority these interfaces have over other technology. Nevertheless, there is incremental innovation for these devices (i.e. multimedia buttons, scroll wheels, etc.) as well as some alternatives which become accepted because of special considerations—an example is the laptop touch pad used commonly in mobile computing. In terms of physically operating a keyboard and mouse, pushing buttons is not challenging, and only rudimentary motor skills are required to move the mouse, though the latter action does require considerable coordination to correlate the physical movement to the onscreen effect. The mouse provides the semantics of 2 D control of an onscreen element, typically a cursor. From there, the ubiquitous WIMP (Windows, Icons, Menus, Pointer)-based user interface was designed to take advantage of the mouse. The mouse is currently the universally accepted technique for 2 D control. There are some alternatives to the mouse, such as trackballs, joysticks, and touch pads. None of these approach the mouse in dominance; however, each of these still remain used in some specialized applications. Touch pads, in particular, are ubiquitous in mobile applications, most notably in laptop computers where using a mouse may be cumbersome. We claim that manual interaction with computers does not make full use of the abilities of 7 the human hand. There clearly are reasons for the widespread use of the mouse and keyboard. It can be argued from a variety of perspectives that the mouse and keyboard are sufficient for manipulating conventional computer software—historical, functional, and ergonomic. User interface innovations range from the development myriad variants of existing devices to the invention of novel interfaces; examples include ergonomic keyboards, keyless keyboards [WestermanOl], and 3D mice. Of course, for each new keyboard or mouse-based device, there is a new exotic interface that is touted as the next great thing, only to meet limited success for various reasons. Mainly this is because its usefulness is not enough to supplant the keyboard and mouse as primary means of input. Nevertheless, there are many examples of applications for which the keyboard and mouse are insufficient, driving research into more effective means of interaction and control. 
While the specialized device market is important, we see value in general-purpose interfaces that have the potential to leverage the complex possibilities of human touch, more than what is currently possible with common devices.

2.4 Conventional Human-Computer Touch Input

Our capacity for conveying touch information with our hands is not engaged when using the mouse and keyboard; the contact between our hands and fingers and the input devices does not affect anything more than the activation of buttons (or other simple mechanisms). Moving the mouse is performed by physically positioning the entire hand. One of the functionalities of the mouse has been formalized by Buxton, who presented a two-state transition diagram describing the basic tracking and dragging mechanisms of a mouse [Buxton85]:

Figure 2.2: Transition diagram for tracking and dragging mechanisms of a mouse

Touch input for a mouse consists of the contact between the hand and the mouse body and buttons; the mouse has no notion of the fact that there is this contact. This is in contrast with touch pads (Figure 2.4), which rely on detecting and tracking a finger while it is contacting the touch surface. These devices are used as substitutes for a mouse, and therefore are intended to manipulate the position of a single cursor. The finger tracking trajectory is computed from the initial point of contact to the final release of the finger; hence only binary contact sensing is required. This tracking interaction is represented by states 1 and 2 of Figure 2.3 [Buxton85].

Figure 2.3: Transition diagram for tracking and dragging mechanism for touch pad (grey for extension if auxiliary button is present)

A laptop touch pad is usually accompanied by one or more buttons in order to provide the same functionality buttons afford a mouse. Unlike on the mouse, these buttons are usually located below the touch pad itself, to be pressed by the thumb while the finger performs the tracking activity. This arrangement is awkward when performing the equivalent of the dragging task (state 3 in Figure 2.3), as the thumb remains stationary while the forefinger moves about.

Figure 2.4: Examples of laptop touch pads; centre figure also shows an isometric joystick

Some newer touch pads produce a mouse-click event upon a tap of the finger on the touch pad itself, thereby alleviating the need to control the thumb independently of the finger. A finger tap, however, can only be mapped to a full press/release button action; there is no way to separate the button-up and button-down events. Therefore, it is not possible to achieve the functionality outlined by the 3-state diagram in Figure 2.3 with only the touch surface as it is found in most laptops. With a touch pad that senses more than two values for contact, or pressure, it is possible to achieve the full tracking and dragging functionality. Applying "hard" pressure to the touch surface translates to the button-down event, and reverting to "light" pressure signals button-up (Figure 2.5). While this works, it can be problematic due to the arbitrary nature of when the applied pressure is considered "hard" versus "light", but this can be alleviated with some form of visual feedback.

Figure 2.5: Transition diagram for tracking and dragging mechanisms with a pressure-sensitive touch pad
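As a concrete illustration of the pressure-thresholding scheme in Figure 2.5, the following minimal sketch implements the three states (no contact, tracking, dragging) driven by a normalized pressure signal. The threshold values, the hysteresis margin, and the class and state names are assumptions made for this example; they are not details from the thesis or from any particular touch pad driver.

```python
# Minimal sketch of the three-state model in Figure 2.5: a pressure-sensitive
# touch pad emulates button-down/button-up by thresholding applied pressure.
# Thresholds and the hysteresis margin below are illustrative assumptions.

NO_CONTACT, TRACKING, DRAGGING = "no contact", "tracking", "dragging"

class PressurePadStateMachine:
    def __init__(self, contact_threshold=0.05, drag_threshold=0.6, hysteresis=0.1):
        self.contact_threshold = contact_threshold  # below this: finger released
        self.drag_threshold = drag_threshold        # above this: "hard" press -> dragging
        self.hysteresis = hysteresis                # avoids flicker around the drag threshold
        self.state = NO_CONTACT

    def update(self, pressure):
        """Advance the state machine with a new normalized pressure sample in [0, 1]."""
        if pressure < self.contact_threshold:
            self.state = NO_CONTACT                 # release ends tracking or dragging
        elif self.state == DRAGGING:
            # stay in dragging until pressure drops well below the drag threshold
            if pressure < self.drag_threshold - self.hysteresis:
                self.state = TRACKING
        else:
            self.state = DRAGGING if pressure > self.drag_threshold else TRACKING
        return self.state

# Example: a press that ramps up ("hard"), eases off ("light"), then lifts.
pad = PressurePadStateMachine()
for p in [0.0, 0.2, 0.7, 0.65, 0.4, 0.2, 0.0]:
    print(p, pad.update(p))
```

The hysteresis band is one simple way to soften the arbitrariness of "hard" versus "light" noted above; visual feedback, as the text suggests, is a complementary remedy.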
Another alternative pointing device that uses touch to some degree is the isometric joystick (centre image in Figure 2.4). This device accelerates the cursor at a rate proportional to the amount of force applied on the joystick, in the same direction. It relies on the finger's subtle sensitivity to cutaneous pressure and proprioception in order to finely control the cursor.

One advantage of touch interfaces over the human sense of touch is that they are not susceptible to masking of tactile stimuli [Schiff82]. In humans, masking is a failure of the sensory system to process incoming information; in the case of tactile stimuli, this involves the presence of noise in incoming signals. This noise could come in the form of a stimulus of different position and/or pressure, or one that arrives slightly before or after the intended stimulus. In contrast, conventional touch interfaces are passive touch sensing devices only. The sensing surface is usually fixed and immovable, in contrast to how humans usually sense touch: by actively moving their hands and fingers to perceive the world through dynamic sensations, and integrating the information into a complete description of the touched object.

2.5 Summary

With the sense of touch being as important to the human sensory system as the senses of sight and hearing [Heller91], it is disappointing to see that, in contrast to tremendous advances in sound and vision technologies, devices that make use of touch remain few and impractical. This can be attributed to the fact that, since touch almost always involves physical forces, devices that stimulate sensations indicative of touch are difficult to produce, due to the sheer sensitivity and locations of human touch sensors. Our focus on touch input, however, explores the opposite direction: capturing touch conveyed from the human to a surface. Technology to detect and process rudimentary touch is already employed by a few interfaces, such as the laptop touch pad and others which will feature in subsequent chapters. The most common application of touch capture is position tracking, but it is difficult to say what other semantics of touch can be usefully captured by touch interfaces. To further explore this question, we will look at the space of touch interfaces in Chapter 4, identifying important attributes and what they capture. Together with a taxonomy of hand actions in Chapter 5, we can better ascertain what forms of touch can be captured with current interfaces; this will be useful for developing touch applications as well as for developing interesting interfaces that take fuller advantage of the capabilities of the human hand.

Chapter 3: Related Work

The previous chapter outlined how touch is most commonly used in our interaction with computers. While certain interaction techniques and interface technologies have seen widespread adoption, there is still work that explores various aspects of touch and its possibilities for human-computer interaction. Our thesis proposes a framework that organizes the efforts in this field, and in this chapter, we present some of these contributions and related works.

First, we look at Sturman's design approach for whole-hand input devices [Sturman93] as a comparable process for designing and evaluating interaction styles and application requirements when designing interfaces. This work comprises several stages expanded upon in our research, including taxonomies of interaction styles, hand actions, and identification of device capabilities. To assist our efforts to organize touch interfaces, we examined some of the seminal works on classifying input devices.
A functional approach derived from examining techniques used in interactive applications is found in Foley's development of virtual devices and subsequent graphics-oriented taxonomy [Foley74][Foley82]. Categorizing by the type and dimensionality of the measured quantities of input devices is part of Buxton's work [Buxton83], which took a user-centric perspective. The notion of an input design space, proposed by Mackinlay et al. [Mackinlay90], forms the basis of our own interface classification and design space for whole-hand surface interaction.

Naturally, developing a systematic method for organizing touch interfaces is only useful if there exist devices that support touch interaction. We will introduce some of the important interfaces, from both the research and commercial arenas, that will figure prominently in our efforts to formulate a taxonomy and design space, and also some that are specifically explored in our application development work.

In Chapter 2, we introduced some important concepts of touch interaction from a physiological perspective. This area is significant and diverse, and we will introduce some of the more important concepts and seminal contributors. Among these are studies of gestures and prehension, which are important aspects of human touch and are important to our consideration of a space of hand actions in Chapter 5.

3.1 Design Methods for Interfaces

Our research proposes a systematic framework to organize the various aspects of whole-hand flat surface interaction, ultimately to guide interface and application design. Sturman presents a similar approach for the area of whole-hand interaction [Sturman93], which is instructive for our efforts. Figure 3.1 shows the various stages of selecting, evaluating, and testing interfaces for an application, identifying aspects of interactive application design that can be formalized and systematically executed.

Figure 3.1: Design method for whole-hand input; shows design flow for developing whole-hand input for any specific application or set of tasks

This method is broken down into several stages: determining appropriateness, selecting an interaction style from a taxonomy, evaluating an application's tasks based on task primitives and hand action capabilities, selecting a device, and finally testing and evaluation, which leads to a completed interface. Indeed, these are all stages that are useful for evaluating touch interfaces for applications, and thus we shall examine each stage in more detail and show how they can be relevant to our research.

3.1.1 Appropriateness of Input

This stage determines the suitability of an application for a particular type of interaction. This is evaluated based on several criteria derived from the features of the hand. Naturalness considers how a desired task can take advantage of preacquired sensorimotor skills, an existing lexicon of hand signs, the absence of an intermediary device, and how well the task control maps to hand actions, in order to reduce learning time and cognitive load. Adaptability considers how the hand can be used to switch between diverse modes rapidly and smoothly, so that a single interface can be used to increase the overall efficiency of task control.
Coordination ascertains whether a task requires the coordination of many degrees of freedom (DOF), since this increases the cognitive workload, but can be addressed by using the natural dexterity of the hand to reduce complexity.

These criteria are definitely applicable to touch interaction, as we are interested in evoking our natural abilities to manipulate and interact with surfaces.

3.1.2 Taxonomy of Hand Input

A taxonomy of hand input describes styles of use of a particular type of interaction. Sturman's work considers whole-hand input and distinguishes different styles based on the associations between the interpretation of hand actions by applications and the hand actions themselves. The space of hand actions is relevant to our organization of surface-constrained actions in Chapter 6.

3.1.3 Evaluation Guide

In order to choose the type of interaction for a task, a set of measures is specified as analogues between the hand action and task domains. By decomposing the application task into task primitives, designers can select hand actions based on prior experience and knowledge of how the hand is used in similar functions. This process can be performed iteratively to refine the selection of an appropriate input action. Table 3.1 shows these corresponding sets of measures associated with task characteristics and hand action capabilities. Each measure is compared individually, and if a particular hand action is insufficient for any one measure that the task primitive requires, then the hand action can be modified or another action selected to correct that deficiency.

  Task characteristics and requirements    Hand action capabilities
  Degrees of freedom                       Degrees of freedom
  Task constraints                         Hand constraints
    Degrees of freedom                       Range of motion
    Physical constraints                     Coupling
    Temporal constraints                     Spatial interference
    External forces                          Strength
  Coordination                             Coordination
  Resolution                               Resolution
    Spatial                                  Spatial
    Temporal                                 Temporal
  Speed                                    Speed
  Repeatability                            Repeatability
  Steadiness                               Steadiness
  Endurance                                Endurance
  Expressiveness                           Expressiveness
  Modality                                 Adaptability
  Task analogy                             Familiarity
    Comparison to existing methods           Similarity to existing skills
    Similarity to other tasks                Similarity to everyday motions

  Table 3.1: Evaluation guide

3.1.4 Device Capabilities

This part of Sturman's work ties into a major focus of our research, the classification of user interfaces, which we present in the next section.

3.2 Classifying User Interfaces

There have been several efforts to classify user interfaces. Earlier taxonomies [Foley82][Sherr88] focussed on the mechanical and electrical properties of each device, such as differentiating between trackballs, joysticks, etc. When input devices became more numerous and new interactive applications started emerging, the need for classifications along other dimensions drove several important developments in classifying user interfaces.

3.2.1 Logical Devices

One important development is the idea of device independence, or logical or virtual input devices [Foley74]. This was developed in an attempt to define and standardize interaction paradigms with respect to computer graphics development. The GKS (Graphical Kernel System) contributes to this by abstracting the workstation from the actual graphics hardware devices and thus defining several logical devices as interfaces to the abstract workstation.
These devices are specified in terms of the values each generates:

• locator: a pair of values giving a position in a virtual coordinate system;
• pick: indicates a unit that is being selected;
• choice: allows selection of one option from a set of options;
• stroke: a sequence of x- and y-coordinates in a virtual coordinate system;
• string: a set of characters;
• valuator: a single value of type real.

With these logical devices, developers can experiment with different devices by modifying the device driver rather than the software application itself, since the application accepts input from these abstract logical device types. This is an example of a taxonomy based on use, as opposed to one based on physical properties.
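To make the idea of device independence concrete, here is a minimal sketch in Python. It is not the actual GKS API; the class and method names are assumptions made purely for illustration. The point is that the application consumes values from logical device types, so swapping the physical hardware only means supplying a different driver.

```python
# A minimal sketch of device independence in the spirit of logical devices.
# NOT the real GKS API; names are assumptions made for illustration only.
from dataclasses import dataclass

@dataclass
class Locator:
    """A position in a virtual coordinate system (the 'locator' logical device)."""
    x: float
    y: float

@dataclass
class Valuator:
    """A single real value (the 'valuator' logical device)."""
    value: float

class MouseDriver:
    """One possible driver. A tablet or touch-screen driver could replace it
    without changing any application code, because the application only sees
    the logical device types above."""
    def read_locator(self) -> Locator:
        return Locator(0.25, 0.75)   # stub: a real driver would poll the hardware

    def read_valuator(self) -> Valuator:
        return Valuator(0.5)         # stub: e.g. a dial or scroll wheel position

def position_object(driver) -> Locator:
    # Application logic is written against the logical device, not the hardware.
    return driver.read_locator()

print(position_object(MouseDriver()))
```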
Foley et al. [Foley84] took the concept of logical devices and presented them from the perspective of the user rather than of the application. The six devices listed above are therefore represented as actions, each providing the type of value its corresponding device is intended to produce:

• orient an object;
• position an object;
• select an object;
• ink: draw a line;
• enter text;
• value: specify a scalar value.

The correspondence between each action primitive and device is not direct, but we can now more clearly differentiate between important interaction types, as these terms draw on human actions. Foley et al. use this taxonomy of interactive techniques to classify a comprehensive set of technologies, shown in Figure 3.2. Foley et al. developed these taxonomies as a means to organize input devices for the purpose of interaction with graphics applications (cf. interaction using a character-based terminal). Therefore, devices and actions related to orientation and position are major categories that classify the input(s) provided by each device with respect to these major tasks. Devices developed for graphics manipulation have gained mainstream acceptance for interaction with computer graphical user interfaces in general, e.g. the mouse as an indirect locator.

3.2.2 Focus on Pragmatics

A similar approach is described in work by Foley et al. and expanded by Buxton in terms of four layers: conceptual, semantic, syntactic, and lexical [Foley74][Buxton83]. The conceptual layer encompasses the main concepts of a system from the perspective of the user, in other words, a user model; the functionality of the system is incorporated in the semantic level; the syntactic level defines the grammatical structure of the tokens used to articulate the semantics; and the structure of these tokens is defined by the lexical level. A benefit of this classification is that it works well for system analysis at the design stage. Buxton explains that this approach, however, has one major drawback: the lexical layer is still too broad to describe attributes such as the spatial location of items on a display, the location of devices in the workstation, and the type of physical gestures required to effect an action.

Figure 3.2: Enumeration of some common interactive techniques and interface technologies in terms of logical action primitives

To address this issue, Buxton separates the lexical layer into two components: the lexical component, which deals with the spelling of lexical symbols or icons; and the pragmatic, referring to issues of space, devices, and gestures. It is this pragmatic level that comprises the primary level of contact with an interactive system, and thus has a strong connection with the user and an effect on how the user perceives the system. From this concept of pragmatics, Buxton proposes new axes along which input devices can be classified. One is whether devices inherently measure continuous or discrete quantities. Another classifies devices according to their primary agent of control, typically a biological mechanism such as a hand or a voice. Properties being sensed by the device can be categorized as position, pressure, or motion. Finally, the number of dimensions being sensed also comprises an important axis. As an example, Buxton provides a taxonomy that classifies continuous manual input devices (Table 3.2). The rows categorize the property sensed by each device, further distinguishing between devices that require a physical intermediary and those that sense touch directly. The columns classify according to dimensionality, and subdivide among devices that use comparable motor control.
The primitive movement vocabulary is formalized as the following elements: • manipulation operator (M) • input domain (In) • output domain (Out) • current state of the device (S) • resolution function (R) that maps from input domain to output domain • general purpose set of device properties (W) that describe additional aspects. Manipulation operators correspond to dimensionality and sensing property in Buxton's taxonomy, which Mackinlay et al. extend to give the following set of possible combinations in M (Table 3.3). In the linear domain these are absolute position (P), movement (dP), force (F), and delta force (dF); in the rotary domain, rotation (R), delta rotation (dR), torque (T), and delta torque (dT). Table 3.3: Physical properties sensed by input devices. Other properties can also be measured, such as heat and voice, though the vast majority of input devices make use of only the above physical mechanisms. Composition operators take these movement elements and combine them to generate more complex device input and output properties. The merge composition combines multiple devices to provide one whose input domain is a cross product of each device's input domain; the layout composition occurs when multiple devices are co-located on the same physical panel (parallel input); and the connect composition takes the output domain of one device and maps it to the input domain of another device (sequential input). The design space Mackinlay et al. present is described as "the set of possible combinations of the composition operators with the primitive vocabulary". In Figure 3.3, a visualization of this space is presented. A device is represented by a set of circles connected together—lines indicate a composition operator, and each circle represents a physical property the device measures. This figure was adapted from the amalgamated reclassification of devices listed in the works from Foley et al. and Buxton. The taxonomy bears similarity to Buxton's taxonomy of manual continuous input devices shown in Table 3.2. It categorizes according to the type, dimensionality, and resolution of the physical properties sensed by a device. The rows denote the property type, but the actual type represented by a particular grid is also dependent on the dimensionality of the property. The rows indicate position, force and their derivatives for devices that measure linear input (left half). For rotary input (right half), the same rows indicate angle, torque, and their derivatives, which are simply the analogues to their counterparts in the linear domain. The columns denote the dimension of the measured property in 3-space; again there is a different set of dimensions for each measurement domain. Linear inputs can be made along the X-, Y-, and Z-dimensions, whereas rotary inputs are made about the X-, Y-, and Z-axes, denoted by rX, rY, and rZ, respectively. The columns are further subdivided to indicate the resolution of the measured input values, going from unit to infinite resolution from left to right. This is marked by column footers. Each device is represented by circles which are placed in the appropriate locations on the grid, connected by a solid or dotted line. In the Mackinlay and Card design space, these lines indicate a type of composition operator: solid lines for a merge composition, dotted lines for a layout composition. A merge composition denotes an intrinsic link between two or more of a device's input mechanisms.
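Before the concrete walkthrough that follows, a minimal sketch may help make the primitive vocabulary and the composition operators tangible. The code below is our own illustration, not code from Mackinlay et al.; the class names, fields, and resolution figures are hypothetical. It encodes a two-button mouse as a set of (property, dimension, resolution) primitives joined by merge and layout compositions, the same device the prose example below walks through.

```python
# A minimal sketch (our own, not from Mackinlay et al.) of how the primitive
# movement vocabulary and composition operators could be modelled as data.
# All names and numbers here are hypothetical illustrations, not an established API.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Primitive:
    """One circle in the design space: a sensed physical property."""
    prop: str        # "P", "dP", "F", "dF" (linear) or "R", "dR", "T", "dT" (rotary)
    dimension: str   # "X", "Y", "Z", "rX", "rY", "rZ"
    resolution: int  # number of distinguishable values; 2 means a binary button


@dataclass
class Device:
    name: str
    primitives: List[Primitive] = field(default_factory=list)
    # Composition operators recorded as (kind, index_a, index_b) triples,
    # where kind is "merge", "layout", or "connect".
    compositions: List[Tuple[str, int, int]] = field(default_factory=list)


# A two-button mouse: relative X/Y position sensing joined by a merge
# composition, with the buttons attached through a layout composition.
mouse = Device(
    name="mouse",
    primitives=[
        Primitive("dP", "X", resolution=1000),  # high-resolution movement sensing
        Primitive("dP", "Y", resolution=1000),
        Primitive("P", "Z", resolution=2),      # button: binary press
    ],
    compositions=[("merge", 0, 1), ("layout", 0, 2)],
)

if __name__ == "__main__":
    for kind, a, b in mouse.compositions:
        print(kind, mouse.primitives[a], mouse.primitives[b])
```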
For example, a mouse provides linear input in the X- and Y-dimensions simultaneously; it is difficult and unintuitive to provide input to either of those dimensions independently. This represents a perceptually integral task [Jacob94]. Figure 3.3: Taxonomy based on design space analysis of input devices. A layout composition occurs when two or more sets of input mechanisms are collocated on the same input panel, and are intended to be manipulated by the same hand. A mouse contains a layout composition, as the buttons (linear input in the Z-dimension) provide input that is independent of the position sensing mechanism. In fact, many devices in Figure 3.3 that are also commonly used as pointing devices (trackball, joystick) can also show a layout composition if they have accompanying buttons. Variables in the circles corresponding to buttons indicate their number, representing independent input mechanisms. A connect composition is represented by an arrow, and can be seen linking the rotary inputs of the Skedoodle and the Etch-a-Sketch to the cursors they provide input to. Upon determining the type and dimensionality of the input mechanisms of a device, the circles' horizontal position within the grid is important. For many pointing devices, high-resolution input is usually desired. Thus we can see the circles placed at the far right of the column. For button controls, such as the mouse buttons and the keyboard, the circles will be placed at the far left, indicating the lowest resolution possible: binary on/off. Note that the scale in the footers of each column represents a continuum between discrete and continuous, and not any particular metric (i.e. 10 units/cm). To understand the diagram, consider a common input device, the mouse. The mouse is represented as two circles in the dP row, indicating that it senses change in linear position in the X and Y dimensions, which is consistent with the operation of a mouse. The solid line linking the two circles indicates the use of a merge composition, meaning the position sensing of both dimensions is intrinsically linked. The circles are aligned towards the infinite resolution scale because a mouse has high position sensitivity. Buttons on the mouse are represented as a circle in the linear Z column (the variable y refers to the number of buttons), but in the unit resolution, due to each button's binary nature. This circle is linked by a dotted line denoting a layout composition. Using this taxonomy to model the space of input device designs, Mackinlay et al. can now evaluate specific designs according to two important criteria: expressiveness and effectiveness. Expressiveness describes how the input conveys exactly and only the intended meaning.
Effectiveness is how well this input can convey the intended meaning, including factors such as pointing speed and precision, errors, time to learn, time to grasp the device, user preferences, desk footprint and cost. 3.2.4 Input Device Models The above contributions are organizations of input devices from different contexts and purposes. Bleser [Bleser91] motivates the development of such models by identifying several objectives and requirements. Six general objectives are defined as the purpose of a model of input devices: • support the design of effective user interfaces • emphasize the user's view of the interface • organize the input space • incorporate an extended definition of an input device • provide a declarative rather than functional model • emphasize general interaction. These objectives in turn suggest requirements for the structure and content of a model of input devices, which are summarized as follows: Structural requirements • general description that covers a variety of devices • extensible model • shared descriptions • separation of input from output • separate description of the physical package • flexibility in binding between descriptions of input actions and their interpretations • flexibility in binding between descriptions of input actions and feedback. Content requirements • data characteristics of a device • emphasize physical use of devices • physical characteristics of the device • characteristics that are used in human factors-based design decisions • human factors knowledge regarding use of specific devices. In order to fulfill these objectives and requirements, Bleser proposes a model of input devices that is structured to clearly characterize each input device by its functional and physical properties. As with other taxonomies, functional characteristics describe the input and output domains of the device, and physical qualities describe packaging, form factor, and the human actions recognized by the device. One particular parameterization of input device properties is shown in Figure 3.4. Based on some of these interesting ways to organize input devices, we will propose our own classification in Chapter 4 to classify specifically touch interfaces. Before that, however, there actually needs to be a set of touch interfaces to organize, and this is what follows in the next section. 3.3 Touch Technologies Most human-computer interfaces require the use of touch. Our focus is on interfaces that capture touch as an input modality, rather than interfaces that have physical mechanisms that are manipulated through touch. This section is organized into three parts which describe the most common technologies used to capture touch interaction: electrical, optical, and mechanical. Table 3.4 summarizes the devices found in each category. 3.3.1 Interfaces Using Electrical Properties Early touch-sensitive controls relied on the capacity of the human skin to conduct electricity. These controls would activate when they detect a change in the resistance of the touch-sensitive surface due to the contact of human skin. Such controls can be found in some television sets and elevators which have buttons that do not rely on a physical activation mechanism. Today, there are several touch interfaces used in interactive systems that employ electrically-based sensing systems, predominantly ones that capture single-point touch.
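As a rough illustration of the kind of processing behind the single-point electrical sensing just described, the sketch below shows one simplified way a controller could turn a grid of capacitance readings into a single reported contact: subtract a no-touch baseline, threshold the result, and report the signal-weighted centroid. This is our own simplification with hypothetical names and thresholds, not the algorithm of any particular product; it also reproduces the behaviour noted below for laptop touch pads, where two fingers collapse to a single point between them.

```python
# A simplified, hypothetical sketch of single-point capacitive touch detection:
# readings above a calibrated baseline are treated as contact, and one (x, y)
# point is reported as the signal-weighted centroid. Real controllers are
# considerably more involved (filtering, debouncing, palm rejection).
from typing import List, Optional, Tuple


def detect_single_touch(
    readings: List[List[float]],   # current capacitance samples, row-major grid
    baseline: List[List[float]],   # no-touch calibration for the same grid
    threshold: float = 5.0,        # minimum change that counts as contact
) -> Optional[Tuple[float, float]]:
    total = 0.0
    cx = cy = 0.0
    for y, row in enumerate(readings):
        for x, value in enumerate(row):
            delta = value - baseline[y][x]
            if delta > threshold:
                total += delta
                cx += x * delta
                cy += y * delta
    if total == 0.0:
        return None                   # no finger on the surface
    return (cx / total, cy / total)   # centroid of all activated cells


# Note: two separated fingers yield one point between them, which is exactly
# the centroid behaviour described for conventional laptop touch pads.
```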
Figure 3.4: Parameterization of input device properties. Electrical/capacitive: laptop touch pad; Wacom drawing tablet; Multi-Touch Surface [WestermanOl]; the fast multiple-touch-sensitive input device of Lee et al. [Lee85]; SmartSkin [Rekimoto02]; DiamondTouch [DietzOO]. Optical: EnhancedDesk [Oka02]; Liquid Haptics [White98]; Haptic Lens [Sinclair97]; GelForce [Kamiyama04]; MTC Express; STC-1000. Mechanical: FEELEX [IwataOl]; some haptic devices. Table 3.4: Existing touch interfaces organized by technology. 3.3.1.1 Single-point Sensing The ubiquitous laptop touch pad was first developed in 1988 by George E. Gerpheide, who founded Cirque Corporation to develop touch pad technology, which was called GlidePoint. These touch pads were to be used for pointing tasks, as a substitute for the mouse, and soon were adopted by laptop manufacturers, the first being Apple. GlidePoint and most subsequent laptop touch pads employ a capacitive-sensing technology which detects a change in capacitance when a finger touches the surface. The location of the touch is then determined. This happens robustly, in real time. The device typically provides an output of a single coordinate in order to control a single cursor. Input of more than one finger typically results in the output of a point that represents the centroid of the applied fingers. Wacom produces a popular line of drawing tablets of a variety of sizes and formats. Each tablet provides a rectangular sensing surface with which a stylus can be used. Its most useful feature is very accurate pressure sensitivity—essential with many pencil and paintbrush drawing techniques. Using capacitive sensing technology between the interactive surface and passive circuits embedded in the stylus, pressure information is inferred from the precise position of the stylus. In addition, tracking and dragging semantics are also possible, due to the detection of proximity which does not require contact. Figure 3.5: Cirque touch pad (left); Wacom Graphire3 drawing tablet (right). 3.3.1.2 Multiple-point Sensing While single-point touch is very useful for interacting with computing systems today, the capabilities of the human hand are not restricted by user interface conventions. Namely, we are capable of touching with more than one finger, and while they are less common, there exist technologies that explore the capture and processing of multi-point touch. The first evidence of a hardware input device capable of sensing multiple-point input was from the work by Lee et al. [Lee85]. Up until this point, touch devices remained constrained to a single-point domain, with some devices experimenting with pressure sensitivity [Minsky84]. The work by Lee et al.
proposed a hardware description for a fast multiple-touch-sensitive input device (FMTSID) which uses capacitive measurements between the finger and metal to determine the touch input. They also proposed some optimized scanning and interpolation algorithms to improve performance given limited computing capabilities. The MTS (Multi-Touch Surface) technology developed by Westerman and Elias [WestermanOl] is currently being used in FingerWorks' line of keyboard and touchpad products. For keyboarding, the input surface interprets asynchronous touches; chorded (simultaneous) inputs are interpreted for gestural purposes such as pointing and clicking. It relies on capacitive proximity sensing, and was developed to facilitate both typing and pointing on the same control surface, as an alternative to the conventional keyboard-mouse setup. Among larger-scale devices, SmartSkin by Rekimoto [Rekimoto02] uses a capacitive sensing architecture and a desk form factor. Developed for collaborative gesture recognition, it utilizes proximity sensing to enhance the input space. For example, mouse emulation is achieved by using proximity to map to button press states; variations in the potential field created by the hand's proximity to the table are used to effect repelling actions. Figure 3.6: Multi-point sensing desk interfaces: SmartSkin (left), DiamondTouch (right). Mitsubishi Electric Research Laboratories developed the DiamondTouch large-scale touch surface to facilitate multiple-user collaborative applications [DietzOO]. It also uses capacitive sensing technology, and a set of insulated antennae serves to differentiate between input provided by each user. The large form factor of the DiamondTouch makes it more like a touch screen input device than a touch pad; graphics projected from above onto the interaction surface indicate a one-to-one mapping between position and graphics coordinates. 3.3.2 Interfaces using Optical Properties An alternative to capacitive sensing technology is to use devices which detect changes in optical properties. Most commonly, these involve camera-based systems. These are approaching reasonable cost and speed, but there are other devices which use optical sensing in different ways. Advantages of optically-based systems over electrically-based systems include not requiring a conductive object to provide input, and being less prone to interference effects. 3.3.2.1 Video Capture A system for tracking gestures of multiple fingertips is seen in Oka's EnhancedDesk augmented desk interface [Oka02]. Using vision-based methods, the EnhancedDesk is able to resolve multiple fingertips and apply tracking and correspondence algorithms to detect and process gestures. The fingertip detection is performed by a binarization scheme on a properly thresholded image taken from an infrared camera, which precludes difficulties due to complex backgrounds and changing illumination. Most importantly, it attempts to optimize the search for fingers by applying predictive filtering based on the physical constraints of the human hand. Another interface that uses video capture is the Liquid Haptics work by White [White98]. A liquid-filled bladder serves as a deformable surface not unlike the human skin. A video camera positioned underneath captures the varying levels of light that pass through the translucent bladder as it is being manipulated. The result is an input device that detects not only the shape of the hand and fingers applied to it, but also the pressure, indirectly.
In addition to passive haptic feedback due to the properties of the bladder, White also experimented with placing the bladder on top of electromagnetically-driven actuators in a grid. Figure 3.7: Range information generated by Haptic Lens (left); interacting with GelForce (right). Similarly, the Haptic Lens by Sinclair [Sinclair97] uses a flexible membrane screen to act as the input surface. Video capture of reflected light levels as the hands or other objects are pressed on the membrane provides 3D input generated from the captured range images. Proposed for applications such as telemedicine and 3D digitizing, it provides real-time manipulation of 3D control points and also senses shear input, an example of an action that would be difficult to capture using a conventional touch surface. The GelForce [Kamiyama04] system determines the distribution of both the magnitude and direction of forces as applied onto the sensing surface, which consists of a silicone layer embedded with dots. These forces affect the position of the dot layers differently, and these changes are captured by a video camera which is mounted underneath the touch surface. These changes are mapped to virtual forces which drive a real-time sheet simulation. The sheet deforms according to the forces applied by the hand with compelling results. However, even at present, this system has high system requirements. 3.3.2.2 Light Transmission Tactex Controls, Inc. is a company that creates touch interfaces using their proprietary Kinotex technology to sense touch input. Kinotex works by measuring changes of light transmission properties through fibre optics. Applying pressure to the interface deforms these embedded fibre optics, and deformation of the surrounding material causes variations in light transmission. This technology is suitable for multiple-point pressure sensing, and was originally designed for environments that are hostile to devices using electromagnetic technology. A disadvantage is that there are limitations to the density of the sensor placement to prevent interference between individual fibres. One of the products from Tactex Controls, the MTC Express, is a desktop touch pad with 36 sensors. The MTC Express has been used in various art and music applications, as well as driving the sensing technology of an extensible mixer device, SurfaceOne. Figure 3.8: MTC Express (left); STC-1000 (right). The STC-1000 uses similar technology to provide an interface that is suited for percussive input. The interface is separated into sixteen impact-sensitive zones, which can be programmed using MIDI (Musical Instrument Digital Interface); conventional synthetic drum pads offer this functionality, but only with at most two zones. The technology also allows it to be useful for finger-based touch input, such as dragging operations. 3.3.3 Interfaces using Mechanical Properties Project FEELEX by Iwata et al. [IwataOl] is a haptic feedback device that presents a spatially continuous surface on which users can touch a projected image with any part of their hand. A flexible screen is placed over an array of linear actuators which work in concert to provide haptic feedback over a surface, such as sensations associated with shape and rigidity. This device does not actually record any input information provided by the hand, though its haptic output, which clearly is meant for the whole hand, demonstrates a suitability for whole-hand applications.
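Returning to the grid-based pressure pads described above (MTC Express-style devices with coarse, centimetre-spaced sensors), the sketch below illustrates one simple way a frame of such readings might be summarized: total applied force plus the grid positions of local pressure maxima as candidate contacts. The function and variable names, the noise floor, and the 4x4 example frame are all hypothetical, not drawn from any vendor documentation; the point is that nearby fingers merge into a single maximum, which foreshadows the resolution distinctions drawn in Chapter 4.

```python
# A hypothetical sketch of how a coarse pressure-sensor grid might be
# summarized: total applied force plus the grid positions of local pressure
# maxima as candidate contacts. Nearby fingers merge into one maximum,
# illustrating the limits imposed by sparse sensor spacing.
from typing import List, Tuple


def summarize_pressure(grid: List[List[float]], noise_floor: float = 0.1
                       ) -> Tuple[float, List[Tuple[int, int]]]:
    rows, cols = len(grid), len(grid[0])
    total_force = 0.0
    peaks: List[Tuple[int, int]] = []
    for y in range(rows):
        for x in range(cols):
            p = grid[y][x]
            if p <= noise_floor:
                continue
            total_force += p
            # Collect the 8-neighbourhood and keep (x, y) if it is a local maximum.
            neighbours = [grid[ny][nx]
                          for ny in range(max(0, y - 1), min(rows, y + 2))
                          for nx in range(max(0, x - 1), min(cols, x + 2))
                          if (ny, nx) != (y, x)]
            if all(p >= n for n in neighbours):
                peaks.append((x, y))
    return total_force, peaks


# Example: two fingers pressing a small 4x4 pad in different regions.
frame = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 0.9, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.7],
         [0.0, 0.0, 0.2, 0.9]]
print(summarize_pressure(frame))   # reports the total force and two peaks
```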
3.3.3.1 Haptic Devices Many interfaces that involve mechanical actuation will invariably be able to provide some form of tactile feedback, as opposed to other interfaces which provide a static surface. These haptic displays and force feedback devices are popular areas of research and commercial development. Despite this, the best haptic feedback devices use physical actuation which can only simulate a limited range of tactile sensations. Haptic research generally focuses on conveying touch sensations to users in order to simulate interaction with real objects or to enhance interaction with elements within a virtual environment. A convincing simulation of touch forces can go a long way to improving realism and accuracy when performing virtual tasks, since we have natural abilities to recognize when something "feels" right based on prior experiences. Considerable research is focused on simulating the forces that act on the body in order to provide a reasonable facsimile of the body being touched in a recognizable manner. If we can replicate the forces that a certain action normally exerts on the human body, without actually performing the action itself, the body can be fooled into believing that the action had been performed on it. This deception can be made stronger by coupling force simulation with visual and aural cues, as is often the case. Direct mechanical actuation is employed by many haptic devices (i.e. Immersion CyberGrasp). In some applications these devices can provide a reasonable sense of touch (i.e. gripping a steering wheel), but even these are made possible through either complex and intrusive apparatus, or through means that substitute for a direct application of force, such as using vibration packs. Figure 3.9: Immersion CyberGrasp. The simulation of arbitrary forces on the human body using mechanical actuation is difficult given the sheer number, types, and sensitivity of human skin receptors that are involved. There are other techniques using less intrusive means such as electrical impulse-driven vibrations that may prove to be useful, but these are still experimental in nature. The reality is, until we have science-fiction-style remote projection of forces, it would be difficult to replicate forces other than by performing the original action itself. Haptic interfaces only represent one way in which touch information can be transferred—namely from the environment to touch receptors on the body. The other direction, how the body can apply touch onto a surface, is relatively unexplored. Therefore, while we recognize the importance of haptic research, we are focused in the opposite direction. 3.4 The Human Sense of Touch So far we have introduced several methods of organizing the space of input devices, as well as some important touch technologies themselves. In order to conduct research on touch-sensitive devices, we needed to gain an understanding of the human sense of touch from a physiological and psychological standpoint. Naturally, this is a huge field of study. The following table summarizes some of the important terms and concepts used in touch research literature that are relevant to our own research. Concept Relevance Reference Active touch vs. passive touch Touch interaction typically involves active movements of the hands and fingers. Hence, a user touching a touch interface is engaging in active touch. However, the perspective of touch interfaces is that of passive touch. [Gibson62] Surface touch vs.
immersed touch Touch interfaces present a physical surface that a user can contact—clearly this is surface touch. However, in Chapter 7, we wi l l show an application that attempts to map surface touch to an experience that recalls immersed touch, to mixed success. [Katz89] Prehension Prehension, a form of touch interaction, is the use of hand postures to manipulate objects. This contrasts with our study of surface touch in which we are touching in order to provide information onto a touch-sensitive surface. [MacKenzie94] Intermodality The human senses are all interconnected. We use active touch to explore characteristics of our environment that supplement the information we gain from our other senses. Many touch interactive applications also closely correlate touch input with feedback from other modes (i.e. visual display). [Heller91] Gestures Using active touch comprises certain movements that can be exploratory, manipulative, or intentional. The last type of movement can be considered gestures, which has many forms and purposes. [McNeill92] Table 3.5: Touch research concepts important to this thesis We are specifically interested in surface touch, which is defined by Katz as contact between the skin and any object with a solid physical structure [Katz89]. This contrasts with immersed touch, which is the experience we get when moving the hand through liquid or streams of air. A s mentioned previously, active movement has a significant role for the perception of touch. This touch sensation is called volume touch and refers to when touch is directed to judge shape and dimensions, rather than surface properties. This emphasis is also expanded by Gibson [Gibson62] who distinguishes between touching and being touched, the former necessarily involving kinaesthesia. Accuracy of perception is better achieved through such active touch, whereas more subjective sensations are 29 experienced through passively being touched. Gibson also observes that many studies concentrate on providing passive touch stimulation to subjects. He uses the term haptics in circumstances where an object is actively held for perception purposes. Since our perception of tactile input is so acute, we often do not have to rely on the other senses to guide and execute a touch-related task, even though such secondary feedback is often present and helpful. This is helped i f the surface we are touching can provide feedback through movement (i.e. deformation, displacement), no matter now small. Interaction with an immovable, fixed surface wi l l provide less feedback, so reliance on other input from other senses may be necessary. Concerning a particular type of touch interaction, prehension, a book from MacKenzie and Iberall [MacKenzie94] describes not only the physiological mechanisms through which grasping actions are effected, but also details the physical mechanics of grasp. Among its contributions are an organization of prehensile actions, differentiating between power and precision grasps, a classification of opposition spaces in grasp actions, and the serial phases of a prehensile action. This work is important in that it provides an example of a classification of a certain subset of hand actions. Touch interaction is also related to gestures, as they both involve the hands. M c N e i l l describes gestures in the context of human communication [McNeill92]. 
Gestures can be considered as iconics or metaphorics, which depict an object, event or idea, especially by bearing a close relationship to the semantic content of speech. Gestures can also be considered deictic, or pointing to something or someone concrete or abstract; they can be executed in beats which emphasizes a phrase as being significant, also in the context of speech. A thorough treatment of the various classifications and modalities of hand movements and gestures is given by Mulder [Mulder96]. Some gestures are often connected and complementary to speech (semiotic), while others are more suited towards manipulation of the environment (ergotic). Touch interaction involves both, even though we are not dealing with speech. While we are considering the general space of physical hand actions and not restricting ourselves to studying gesture languages as the only type of input onto touch interfaces, classifications such as these are helpful to understand the different contexts for applying touch. In Sturman's design method for whole-hand input [Sturman93], a taxonomy of whole-hand input that describes styles of use of whole-hand input. These are derived from a categorization of hand actions and the interpretation of hand actions by a task. Hand actions are defined as the position, motion, and forces generated by the hand, and these are interpreted as functional tasks by an application. We wi l l revisit this work in Chapter 5, where we wi l l organize hand actions. The physiology and psychology of the human sense of touch is complex and there is still much to be discovered. Our focus on interaction with surfaces wi l l address a specific type of touch that is not often investigated in touch research: that of the conveyance of touch input on surfaces, as opposed to tactile feedback to our touch sensory system. These two directions 30 are intricately linked, however, and we need to understand how we process touch to understand how we are to perform hand actions. 3.5 Summary In this chapter, we described some of the prior research and developments that are relevant to this thesis. Our main contributions of organizing the space of touch interfaces is inspired and guided by many established works in the field of general input devices. Such an organization requires a set of devices to be classified, and we have introduced many of the important technologies that are representative of this area. Finally, we contextualized our research in terms of the area of study of the human sense of touch. 31 Chapter 4: A Taxonomy for Touch Interfaces This chapter proposes a taxonomy of touch interfaces which capture touch applied onto a physical surface. This taxonomy is an integral component of the framework for whole-hand touch interaction. This results from adapting an existing approach for classifying input devices to be more relevant to touch interfaces. Some of the touch interfaces that are representative of whole-hand touch interaction wi l l be classified. Next, several key classification criteria w i l l be added to for the updated taxonomy based on important characteristics common to touch interfaces. Finally, the updated taxonomy using these new criteria w i l l be presented, and some insights from this process w i l l be summarized. 4.1 Motivation Our experience with the M T C Express suggested that our intuition about what types of interactions can be supported by a particular interface may not be sufficient. 
This intuition is based on our experience with similar devices, and in the case of touch interfaces, there are not too many in common use. Therefore, our intuition about touch interfaces are, for most of us, typically informed by our experience with laptop touch pads, touch screens, and drawing tablets—all single-point interfaces used to manipulate a screen cursor. To be able to gauge the suitability of a touch interface from a perspective not limited to our experience of single-point devices, we must identify important attributes that set one touch interface apart from another. Only then can we start to consider how different variations of these attributes determine what actions can be captured and supported. We identify these attributes through organizing touch interfaces in a systematic way, and since the space of touch interfaces is part of a larger space of input devices, we can begin by considering efforts to classify input devices. 32 One motivation for classifying input devices is to make sense of the large variety of such devices, and also to assist in the development of new devices [Mackinlay90]. The taxonomy introduced in Chapter 3 have classified devices according to logical operations [Foley74], as well as dimensionality of the physical properties sensed by the device [Buxton83] [Mackinlay90], among other attributes. Our taxonomy aims to expand on the previous work and organize devices according to physical properties that are relevant to touch devices. The motivation remains the same: to better understand the space of touch interfaces. B y classifying existing devices according to attributes that are relevant to touch interaction, we can identify the commonalities among different interfaces, and also see where certain attributes are not currently sensed, suggesting possible areas for future work. One taxonomy stands out as being quite complete and effective at classifying many of the common input devices today—this shall be used as a basis for a taxonomy for touch interfaces. 4.2 Basis for New Taxonomy We base our taxonomy on the work by Mackinlay et al. [Mackinlay90] described in Chapter 3. Their taxonomy comprises part of a design space for input devices in general. A representation is shown in Figure 4.1 that maps many conventional user interfaces. Devices are represented by one or more circles; each circle denotes a physical property sensed by the device by its placement in the table. Figure 4.1 is similar to Figure 3.3 but with several less relevant or archaic devices removed. This taxonomy distributes many common input devices throughout the figure; the criteria chosen is representative of the gamut of physical properties typically sensed. Many touch interfaces also capture input in terms of these physical properties, but since many of these are relatively recent developments, they have not been classified in the existing taxonomy. We do this in the next section. 4.3 Classifying Touch Input Devices in Existing Taxonomy The taxonomy by Mackinlay et al. specifically classifies the dimensionality and resolution of input from a device, which can be explicitly quantified. It effectively illustrates the differences between many input technologies in use today as well as representing the correlation between input dimensions of a particular device. For our research, we are interested in where touch-related input devices fit into this classification. We already see a few existing devices already classified, such as the touch screen and the pressure pad. 
Since this taxonomy was developed, however, many significant new devices have been developed, some of which were introduced in Chapter 3. A s we w i l l see, the taxonomy is less useful when classifying such devices as it was created to organize input devices in general. Figure 4.2 shows the classification of touch interfaces in the same taxonomy. 33 X Linear Y z rX Rotary rY rZ position O a o touch screen o touch tablet © keyboard O o absolute joystick angle CH position VJ Vj Fastrak VJ 0 VJ VJ VJ angle CH movement r~\ r> O continuous pot > CD movement VJ VJ mouse VJ VJ trackball force o pressure | o ad o isometric joystick torque Aforce TI Atorque t-1 10 100 °° 1 10 100 °° 1-10 100 °° 1 10 100 °° 1 10 100 °° 1 10 100 °° Figure 4.1: Input device taxonomy based on design space by Card et al. Rows denote property type; columns indicate dimensionality and resolution of the measured input 4.3.1 Explanation of Device Placement in Taxonomy This section w i l l justify the placement of each device within the Mackinlay and Card taxonomy. First, it should be easy to see that in general there are no touch devices that capture rotary input. Rotary controls are typically associated with physical mechanisms like dials and those found in trackballs and joysticks. While in Chapter 5 we wi l l see that there are touch interfaces that can capture a rotation about Z by the hand, since there is no physical mechanism that encodes such an action, we do not consider placing a circle in the rZ column. Devices span a broad spectrum from commercial tools to research prototypes, including interfaces using different sensing technologies and form factors. What they all have in common is that the functionality of each device is expressly dependent on touch, and not through any physical controls such as buttons or sliders. Many of these devices have been introduced in the related works section, and here we emphasize the properties that lead to their classification. 34 X Linear Y z rX Rotary rY rZ position a Wacon o o a a a— MTC o — STC-1000 o touch screen n drawing tablet =3 DiamondTouch o FingerWorks a SmartSkin Express \^ -O \ ° P \ S B R CO C D movement TJ a laptop — ouch pad \ > C D force o- GelForce o--\—O torque Aforce T| \) Atorque -o 1 10 100 °° 1 10 100 oo 1 10 100 oo 1 10 100 oo 1 10 100 oo 1 10 100 oo Figure 4.2: Classification of touch interfaces in the Mackinlay and Card taxonomy 4.3.1.1 Touchscreen A touch screen usually acts to provide direct manipulation on a graphical display device, typically on a computer monitor. If a display shows buttons that look like they can be pressed, it is extremely intuitive to have a touch-sensitive area overlaying the button graphics so users can activate the button control. Most touch screens use a capacitive technology that senses single points upon contact, hence their placement in the linear X and Y columns. The sensing resolution is usually high (i.e. comparable to the display resolution), as with most capacitive devices, and thus the circles are aligned towards the right of each column. 4.3.1.2 Laptop Touch Pad Perhaps the most ubiquitous touch device is the capacitive touch pad found on laptop computers. These also employ capacitive sensing and hence their placement in the linear X and Y columns as with touch screens. However, their output is typically used to manipulate a 35 mouse cursor. 
Since it is impractical to employ a control/display ratio of 1:1 with a touch surface that is so small, or conversely, make the touch pad surface large enough to facilitate a 1:1 C / D ratio, the laptop touch pad provides relative position information, hence their placement in the dP row. This placement is the same as where the mouse was placed in the original taxonomy, and this seems obvious. However, with some recent additions to touch pad functionality, such as tapping on the surface to represent a button activation, or a virtual scrolling mechanism at the edge of the touch area, the current representation may be insufficient. 4.3.1.3 Wacom Drawing Tablet Drawing tablets employing light pens or other tethered styli have existed for some time. These often function similarly to a touch screen, except without a coupled underlying display. Since such drawing tablets using styli as an input mechanism, they cannot be considered as a true touch interface per se. However, ignoring the fact that we do not directly touch a drawing tablet with our fingers, it is not difficult to extrapolate that a drawing tablet would be represented in the taxonomy similarly to a touch screen. More recently, drawing tablets have gained popularity due to innovative products from Wacom. These use untethered styli to provide pressure input that, coupled with the appropriate software, can be used to produce much more expressive strokes than with a simple contact-sensing tablet. Capacitive sensing technology also facilitates manipulation of the cursor (or virtual drawing tablet) without contact between the stylus and the tablet surface. The pressure sensing aspect makes Wacom tablets more relevant to touch interaction, since we can intuitively apply pressure-varying touch with our own fingers. However, we wi l l see that it is difficult to sense such touch interaction with the same precision primarily due to difficulties capturing multiple fingers. St i l l , Wacom tablets represent an excellent example of a device that senses force (pressure). The circle in the force row is aligned closer to the middle as the pressure sensitivity is comparatively less than absolute position sensitivity. A fourth circle in the linear Z position grid denotes its ability to track the stylus when it is close, which is essentially a binary value (close enough/not close enough). 4.3.1.4 DiamondTouch The sensing resolution of the DiamondTouch [DietzOO] is similar to that of a touch screen, with many applications projecting graphics directly onto the interactive surface for the purposes of manipulation tasks. Hence this device is represented identically on the taxonomy. Note that there is no way of differentiating a touch screen from the DiamondTouch in this taxonomy, even though they have significant differences, though multiple independent input mechanisms can be represented by a variable, as seen with the keyboard and mouse. 4.3.1.5 MTS (FingerWorks) FingerWorks' Multi-Touch Surface (MTS) [WestermanOl] is used in a line of keyless keyboards and gesture pads. They also support multiple-point touch for the purposes of effective keyboarding (supporting use of modifier keys), and recognizing a gesture vocabulary which is richer when involving multiple fingers. However, since the M T S also uses capacitive sensing and thus senses high-resolution position input, its representation is 36 the same as the touch screen and even the DiamondTouch. 4.3.1.6 SmartSkin The SmartSkin [Rekimoto02] is yet another interface that employs capacitive sensing. 
Like the DiamondTouch, it uses a large-scale tabletop architecture intended for collaborative applications. Its capacitive sensor grid is adapted to measure the potential field created by the proximity of the hand and fingers, thus it is able to determine information in the Z position axis, though at a lower sensitivity than with X and Y position detection. This additional capability is represented with a circle in the Z linear position column. Figure 4.3: Proximity sensing in SmartSkin 4.3.1.7 MTC Express The M T C Express is one interface that does not use capacitive sensing and therefore does not rely on the changes of electrical properties between the hands and the contact surface. This device instead uses the measurement of light transmission levels through a translucent deformable material to determine applied pressure on the sensors. This technology is capable of capturing multi-point touch, like many other devices, but also detects pressure, much like a Wacom tablet. A major difference between the M T C Express and most other capacitive sensing devices is the sensor resolution. Due to the nature of the technology, the sensors have a considerable separation—lcm in the case of the production model. A s such, the circles that represent linear position sensing in the X and Y dimensions are placed to the left relative to those of other devices. Pressure sensing is represented by a circle in the force row, aligned relatively left o f the circle corresponding to the Wacom tablet's pressure sensing capability. 4.3.1.8 STC-1000 The STC-1000 employs the same technology as the M T C Express but tuned to provide output suitable for music application. Its sensor resolution is even less dense—there are sixteen distinct zones, but these can capture percussive input and produce an attack parameter. This device is clearly not intended for pointer control as with touch screens or tabletop interfaces; however, it does illustrate a particular area of the taxonomy that is not filled with any other device, that is, change in force (dF). 37 4.3.1.9 GelForce The GelForce [Kamiyama04] architecture is also a departure from capacitive sensing technologies. GelForce relies on video capture of dot arrays embedded in a deformable substrate to capture force input. The harder the user pushes onto the surface, the more the substrate deforms, and these forces are accurately computed and presented for display. While position sensing of the applied input can be inferred, this device clearly illustrates force input in all directions. 4.3.2 Limitations to Existing Taxonomy Upon classifying these touch interfaces with the taxonomy by Mackinlay et al., it is evident that several characteristics important to touch interfaces are inadequately differentiated. These include the sensing of multiple input points as well as a measure of the spatial sampling resolution of a device. Some aspects of the existing taxonomy are also less important, such as rotation and torque sensing. In the next section, some of these limitations w i l l be discussed and adjustments suggested to help produce and updated taxonomy for touch interfaces. 4.4 Updating Taxonomy AAA Single-point, Multi-point, Higher-order Input The most apparent observation that can be made is that most touch interfaces are found exclusively in the upper-left quadrant, controlling values in the linear positional domain. This makes sense, since touch interfaces are most often controlled by pointing a finger or stylus on a surface. 
Most interfaces provide continuous position information along the X - and Y -dimensions, corresponding to dragging a pointing device across a surface in order to manipulate an on-screen cursor. The current taxonomy, however, does not distinguish interfaces that can support multi-point sensing. Such interaction is represented in the same way as single point sensing, which clearly should not be the case. In addition, when dealing with the potential complexities of touch by the human hand, further distinction should be made for capture of individual discrete points (DiamondTouch, M T S ) and more general information (SmartSkin, GelForce). This distinction needs to be made because we see interfaces that are not intrinsically intended for manipulating W I M P widgets in traditional user interfaces. Even in the original organization of general user interfaces in Figure 3.3, there are very few devices that are used to manipulate more than one control element at a time. A s examples, devices such as the mouse, joystick, and trackball typically control a single pointer unit (cursor); the Polhemus Cube manipulate one set of 6DOF coordinates. Indeed, the input from a mouse is composed of linear position input in the X and Y dimensions, which theoretically can be used to drive two separate control elements; however, this is not the typical usage. We contrast this with many of the touch interfaces classified in Figure 4.2. Multiple-point capture suggests independent control of separate control elements, such as multiple cursors. Devices such as SmartSkin and GelForce are not involved with controlling distinct control elements at all, but rather a continuum of values in 3-space. Interfaces that intended for 38 multiple pointer control may not be used in the same way as devices which capture higher-order information (i.e. areas, volumes). Our taxonomy needs to address these differences. We therefore separate the linear position columns further into point sensing and field sensing. Fields represent a 2D analogue of the I D point, which is useful for representing information on a 2D plane. The major distinction is that single-point and multi-point devices explicitly produce information in terms of individual points (for the purposes of pointing tasks), whereas devices that capture fields produce information about an area. In addition to using columns to differentiate between point and field sensing, we use circles to denote single-point devices, and squares to indicate all other devices, since single-point control remains an important form of interaction. 4.4.2 Pressure Sens ing In addition to where touch is applied, another important quality is the pressure of touch. In many devices used as conventional computing input, such as touch pads and touch screens, touch pressure may not be important as typical hardware and software does not support such interaction. However, there exist many devices that support pressure-sensing, for applications in art and music that benefit from the fine control offered by the human hand. While rows for indicating force and change in force exist in the Mackinlay and Card taxonomy, there are different ways for capturing force are also important. O f the devices classified in our taxonomy, the M T C Express, Wacom technology, and GelForce all measure pressure; however, each of these do so in different ways. The Wacom tablets actually infer pressure from the position of the stylus relative to the tablet—the actual force applied to the tablet is not measured. 
Thus the pressure sensing capability is restricted to stylus use. The sensors in the M T C Express do directly measure applied pressure due to the resulting deformations which cause changes in light transmission. Pressure applied in GelForce causes changes in dots embedded in a deformable surface which is captured using video. Other interfaces can also infer pressure information through measuring the deformation of a human finger as pressure is applied. Such interfaces necessarily w i l l fall within the contour sensing category with high sensor resolution to be able to capture the changes in area. SmartSkin demonstrates the viability of this approach on the scale of an arm and a hand, though it is not sufficient on the scale of fingers. We introduce the concept of inferred pressure to denote such cases, and add a sub-row in the force row in the taxonomy. We also introduce an additional sub-row to account for those devices which can register the hand in proximity, without requiring actual contact. This is demonstrated to be useful in Wacom technology when moving the cursor using the stylus without touching the tablet, as in drawing applications it is useful to begin upon contact, and not having to invoke a secondary switch. SmartSkin also senses the proximity of the hands and arms and useful depth information (Z) can be produced. 4.4.3 Sensor Resolution Sensor resolution can be defined in several ways, depending on the context. Many devices wi l l use an underlying technology which has a measurable spacing of sensor elements. The 39 current taxonomy does provide a measure for the resolution of input values in the form of a continuum that serves as the column footers. We w i l l continue to use this approach, but we propose to use more meaningful values to represent this range for position measurements. Precision of position measurements is dependent on the spacing of the sensors. Therefore it makes sense to use standard measurements to represent the range of sensor resolution. Devices that use capacitive sensing or video capture have a very high sensor resolution, capable of capturing sub-millimetre-scale changes in position. Some interfaces represent the applied contact at a scale greater than 1mm. We divide this part into two segments: finger-scale and hand-scale. A finger-scale sensor resolution is capable of capturing enough information to discern fingers. SmartSkin is one such device. A hand-scale sensor resolution is anything can only capture larger areas, possibly due to sparse sensor spacing, but can capture some meaningful shapes of the hand. The M T C Express and GelForce are examples of this. Note that these categories of sensor resolution only apply to those interfaces that support field sensing; this simply does not make sense with devices that simply produce points. 4.4.4 Other Changes The columns that measure rotary input are discarded since touch interfaces do not typically have mechanical actuators to encode rotation, as mentioned previously. When interacting with a surface, we are usually interested in providing touch input in two dimensions. Therefore it makes sense to group the X and Y dimensions in our taxonomy since there w i l l be no cases in which we would be able to provide input in either dimension exclusively. These measurements are considered integral, as they are combined perceptually [Jacob94]. With the adjustments described here and the previous sections, an updated taxonomy for touch interfaces is presented in Figure 4.4. 
4.5 Importan t Insigh ts From the updated taxonomy, we can gain several important insights into the space of touch interfaces. Each of these insights also raises some interesting questions which can be explored in further work. • Field-sensing interfaces are well suited for capturing information that is conveyed from the hand and fingers to a surface, and a useful representation of this information would be a series of contours or boundaries. Even though none of the devices in the category actually provide output in this form, some provide a point field that can potentially be processed to generate contours, whereas devices that provide single points cannot. o Can contour-sensing devices be effective at capturing single- or multi-point information? o What are the intrinsic limitations of resolving multiple points in terms of the sensing resolution? 40 Point X+Y z Field X+Y z position contact HS FS mm touch screen DiamondTouch FingerWorks (^2) laptop tou HS FS mm position contact s rc-iooo o • • NT ;h pad \ s C Express martSkin proximity j /( ) Wacom dra\ ring tablet movement 0 le inferred fore direct \ GelForce fore direct Aforce 1 10 100 1 10 100 °° 1 10 100 1 10 100 °° Figure 4.4: Taxonomy of touch interfaces. Circles denote single-point devices; squares denote multi-point/field sensing devices. Sub-columns denote scale of sensor resolution. • Many technologies that capture single- and multi-point input specifically for the control of mouse pointers all have high-resolution sensing capabilities, and they all use some form of capacitive sensing. A l l other technologies capture at a lower resolution and as a consequence do not provide information in the form of points. o Does the use of capacitive sensing automatically guarantee high precision? o Does the use of any other technology preclude high precision? • There are no devices that can capture multi-point pressure-sensitive input; only the SmartSkin captures any combination of non-single-point input and pressure, though pressure is only inferred from position. 41 o If pressure input is desired, does this rule out capacitive-sensing technology? • The vast majority of touch interfaces capture absolute position information, typically due to an interactive surface that is large enough to be mapped proportionally to a visual display. With these insights and questions, we can see that this taxonomy already stimulates discussion about the properties of touch interfaces. Interface designers can consider the possibilities of combining different capture attributes that are not exhibited by existing devices. 4.6 Summary In this chapter we have presented a new taxonomy that classifies touch interfaces based on several important attributes, in addition to the attributes that were classified using the general input device taxonomy by Mackinlay and Card. The original taxonomy categorized the dimensionality (linear, rotary) and type of physical measurement captured by the input device. The physical measurements include position, movement, force, and change in force, which are still relevant for touch interfaces. To improve the taxonomy we differentiated based on the types of position input captured by each device, whether it was suited for capturing points or higher-order information (fields). Point capturing devices were further distinguished between single-point and multi-point interfaces. We added sub-rows for proximity and inferred pressure sensing to distinguish from direct pressure sensing, which is actually quite rare. 
Finally we qualified sensor resolution in terms of suitability for resolving finger shapes, since we are interested in the interaction between the human hands and these interfaces. The study of this taxonomy, however, is from the standpoint of the physical properties that are measured by a device, and thus can be correlated with application semantics. For example, i f a device measures single-point contact, then it can be mapped to a cursor function in an application. What remains is how these properties are provided to the interface, or in other words, what types of hand actions are possible to produce certain changes in the properties. This is the subject of the next chapter, which examines the space of possible hand actions. 42 Chapter 5: The Space of Hand Actions So far, we have presented a taxonomy for touch interfaces that are typically acted upon using the human hands. Before we proceed to proposing a design space for touch interaction, we need to specify the types of input that are to be provided to these interfaces. Naturally, since we are interested in touch interaction, these actions are performed by the human hand. In this chapter, we propose an organization of hand actions that are relevant to interacting with a surface. We begin contextualizing such interaction in terms of biological constraints and interaction modes, differentiating between generalized tactile input and prehensile manipulation. With this distinction, an enumeration of atomic hand actions is presented, and these are primarily distinguished between hand- and finger-centric actions, and further qualified according to the resultant motion. This serves as a movement vocabulary for the design space. We conclude with some important aspects of touch interaction that are built up from these basic actions. 5.1 Motivation During our development of FlowField, we considered the types of actions that can be supported by the M T C Express. Our intuition suggested capturing multiple fingers being dragged across its surface, based on knowledge of what touch pads usually capture (finger tracking), as well as its claim to support "multi-touch". During this stage it would have been useful to have an organized and finite set of hand actions to consider, even i f they were grouped into vague categories. This chapter proposes such an organization. With touch interfaces, the space of physical properties is often not constrained by any physical mechanisms—only the surface itself. This opens the door to the realm of gesture, 43 which brings many expressive and semantic complexities. However, due to the flat form factors o f many existing touch interfaces, we constrain our study to include only postures and gestures that are physically possible with a flat surface. Despite these constraints, the inclusion of gestures into any possible movement vocabulary used in our design space for touch interaction wi l l present additional purposes for using touch that cannot be summarized by the space of physical properties used in previous work. B y enumerating and classifying surface-constrained gestures using physical and semantic categories, we wi l l improve our design space to better reflect the incentives and benefits of using touch interfaces. 5.2 Examples of Hand Gestures There are innumerable possible hand movements, and there are many expressions to describe them. 
In order to appreciate the diversity of how we use our hands, Mulder provides a list of random hand actions which spans the gamut of human experience [Mulder96] as part of his study of gestures in human computer interaction: • praying (two flat hands up together) • begging (flat hand) • expressing anger (raising a fist) • derogation (middle finger up) • accusation (index pointing) • hitch hiking (thumb up, hand moving sideways) • legal and business transactions (handshake, judge hammering) • waving and saluting • counting (fingers and/or hand) • pointing to real and abstract objects and concepts (index, hand) • conducting of an orchestra (variety of gestures with both arms and body) • traffic control of cars and airplanes (hands flat pointing or moving) • shaping of imagined objects (hands tracing out curves and shapes) • martial arts, fighting (variety of movements of arms and body) • gesturing by singers (hand and body movements) • stock exchange operations (various hand shapes) • affective gestures (hand touching) • rejective (index up moving left & right)/appreciative (hand clapping) gestures • game playing (hand signs to communicate with partner in card games) • game scoring (cricket, basketball, soccer, rugby, football) • dinner table actions (commanding waiter to refill wine glass) • positioning of real (remote or close) and abstract objects • control panel operations (mousing, steering a vehicle) • moving, touching and interacting with objects • silent and non-verbal communication (shrugging, holding one's own earlobe, scratching) • "italianate" gestures (two hands open shaking) • mimicry and pantomime (actions and objects are depicted with hand/body movements) • sign language (a complete linguistic communication system) 44 Needless to say, such a list cannot possibly be exhaustive, and neither w i l l our attempt to enumerate surface-constrained hand actions. We have italicized in the above list some actions that are relevant in the context of touch interaction. Even though these seem to form a small subset of the list, there is still much complexity to performing these actions, especially when compared to the physical movements required to manipulate most common user interfaces. Touch indeed can be as complex and capable of conveying meaning as gesture even though the terminology to describe various applications of touch is not as apparent. Nevertheless, there are many phrases in the English language that have obvious connections and origins in the human experience of touch [Schiff82]: • keep in touch • being tactful/tactless • having a soft touch • being touchy • the personal touch • rub someone the wrong way • making skin crawl • itching to go • touch and go • only skin deep • the Midas touch In the following section, we wi l l begin our study of the richness of touch by examining the possible surface-constrained hand actions from a physical perspective. 5.3 Touch Interaction in the Physical Context The human hands have over 25 degrees of freedom [MacKenzie94]. Many of these are not independent of each other, due to tendon and ligament structures in the hand. The precise number of degrees of freedom for the human hand is dependent on the individual and also the particular task performed. Each finger has three degrees of freedom (three joints per finger), though these are not independently controllable, i.e. there is only one natural way to bend a finger—it simultaneously involves all three joints, and the bend angles are correlated. 
Furthermore, bending one finger can also have effects on the other fingers; it is often impossible to bend one finger while keeping all the others stationary.

While touch interaction implies a certain amount of movement, these actions can be viewed as built up from a vocabulary of static hand configurations, or postures. Gesture research necessarily includes postures that are unconstrained, but since we are interested in surface touch, we consider contact modes, including prehension, which describe the applying of forces to objects to perform a task.

5.3.1 Modes of Physical Touch Interaction

The set of postures and gestures that are constrained to a surface form a small subset in many gesture and prehension taxonomies, and this is not characterized in great detail. In fact, since prehension deals with actions such as the grasping of tools and gripping of handles, stationary physical contact is the only type of touch interaction involved. With the touch interfaces relevant to our research, we are interested in how the human hand is used to convey input features onto a surface, rather than how it is used to apply forces for physical manipulation. The difference is subtle: surface interaction deals with the capture of gestures applied by the hand onto a surface and how they are interpreted; prehensile interaction involves applying forces to perform work.

To further clarify, consider the many devices classified in the original taxonomy by Mackinlay et al. (Figure 3.3) that can be argued to be touch interfaces but that we expressly do not include in the updated taxonomy (Figure 4.4). Rotary pots, joysticks, and trackballs are devices intended to be manipulated by hand, and therefore, by touch. However, these devices do not interpret touch in any way—touch is required only to effect forces to physically move the moving parts. In contrast, many touch interfaces not only lack physical mechanisms that require moving, they do not constrain or afford a certain type of touch action (e.g. requiring the hand to grip a joystick or dial). The movement and forces imparted by touch are themselves the useful quantities that are captured and interpreted.

Note that these two contrasting modes of touch interaction are not necessarily mutually exclusive. The objects that are engaged during prehension can have touch sensitivity themselves, which can be captured and interpreted just as in many touch interfaces. A good example is when we grasp and move human body parts, such as another person's arm. We are applying prehension to move their arm, while the other person is feeling our grasp.

5.3.2 Physical Touch Interaction in Interactive Systems

When considering touch interaction, it is often difficult to capture prehensile hand input as a general-purpose input method. There are many different possible postures and each is useful for performing numerous tasks. Object attributes such as size and shape influence the type of postures used to effect the action. An example: a mug handle typically facilitates a hook grip with one or more fingers wrapped around the handle and opposed to the palm. Systems that make use of prehensile input as an input method are often restricted to capturing specific actions because of these physical constraints. Each "interface" (e.g. handles, buttons, or switches) is used to capture a specific type of action. For many applications, this is sufficient, but for more complicated systems, physical limitations lead to a remapping of natural actions to virtual ones.
For example, in aircraft and spacecraft simulators, cockpits are typically recreated in great detail, including all the buttons, dials, and readouts. This is critical for the trainees to gain familiarity with the interface layout and functions without needing to actually train in a real craft. The simulator would be far less effective i f the necessary functions are remapped to a presumably less expensive interface set. 46 There are many examples of physical interfaces that act as control interfaces for interactive video games. These include steering wheels, guns, and interfaces derived from musical instruments. Even the use of such specific physical interfaces is largely only cost effective for public arcades where one machine is dedicated for one game. In the consumer segment, we find generalized interfaces since game appliances support many different games. It is at the consumer level that having specialized interfaces for prehensile and whole-hand input becomes impractical because of cost and space concerns. Most general-purpose input devices capture input that is constrained to one or two dimensions, not taking much advantage of the capabilities of the human hand. 5.3.3 General ized Touch Interaction Because prehensile input spans a considerable range of possible hand postures, and because it is impossible for any general purpose input device to capture prehensile input in a generalized sense, we consider touch input over simpler input topologies—namely that of a plane. Touch generally involves surfaces, and the most degenerate surface is flat. From a technological perspective, instrumenting a flat surface with touch sensors is easier that embedding an irregularly-shaped object with sensors. Therefore, having a flat surface serving as an input space lends itself to much more general-purpose usage. Representing the possible contact space for an irregular surface can also be difficult. A flat surface can easily be represented by a familiar coordinate system. Most importantly, while flat-surface interaction by nature excludes most prehensile input tasks, there is still a rich set of actions that can be captured by a flat surface. A few of these are used in the context of mouse interaction, but there is so much more that is possible between the hand and a surface. However, we lose the advantages of having physical affordances guide hand actions, and also the intuitiveness that comes with physical manipulation of familiar structures. 5.3.4 Characterization of Hand Act ions The position and motion of the hand can be organized into two classes: continuous features and discrete features [Sturman93]. Continuous features include degrees of freedom; derived features such as fingertip position, joint velocities and volume enclosed by the fingers; and forces generated by the hand, characterized by normal and tangential forces on the contact areas of the hand. Discrete features are input tokens that are represented by postures and gestures. Characterizing hand actions based on such complex attributes is useful for a study of hand usage independent of any particular application. This is important to our design space when we need to represent hand configurations and actions in a parameterized form. However, for the purposes of evaluating an interface's suitability to capture certain hand actions, we focus on tasks, rather than a precise parametric representation. 
Thus, we begin with an enumeration and classification of the basic semantics of hand actions performed with respect to a flat surface in the next section.

5.4 Methodology

Our approach to generating the space of hand actions is through brainstorming, similar to the listing of linguistic gesture-related phrases at the beginning of this chapter [Mulder96]. This approach was evocative of the thought process that occurred during the development of FlowField, where there was no formalized space of hand actions to consider, nor was this deemed necessary. As a preliminary step, we brainstormed how touch is used in common applications:

Petting animals, Applying makeup, Medical diagnosis, Braille, Feeling Mahjongg tiles, Scratch'n'win, Feeling texture properties, Wiping windows, Smudging

Naturally, there are many applications that involve touch, and it was necessary to separate applications that use touch to manipulate controls from those that actually use touch as a meaningful input; the list above contains examples that are closer to the latter. We now need to extract the basic actions that are performed during these applications, and we considered action words that suggest hand usage (listed in no particular order):

Push, Squeeze, Stroke, Swat, Pull, Stretch, Flick, Trace, Knead, Wipe, Pound, Ping, Poke, Rub, Slap, Drag

We then looked for commonalities among these actions in order to classify them into meaningful categories, as well as establishing constraints to limit the scope of the space. First we constrained our examination to one hand. There are certainly many applications that deal with bimanual input; we will address these in the section on additional semantics. Next we also disregarded terms that suggest temporal dependence, such as knead and tap, since these are applications of more basic actions. The next section continues these classification efforts as we establish the space.

5.5 Atomic Single Hand Actions

With a single hand, we introduce the distinction between hand-centric and finger-centric actions. These are not mutually exclusive—a hand action can contain elements of both. In general, however, there are different cognitive requirements for performing each category. Within each category, actions are classified according to transformation properties: in-place, translation, and rotation. Rotation is differentiated between rotation about the X- or Y-axes and rotation about the Z-axis, illustrated in Figure 5.1:

Figure 5.1: Axes of rotation for hand actions with respect to touch interactive surface

Each action is named based on its resultant motion, described by an action word, such as press or twist. The following table presents a map of the actions enumerated in the following sections. Illustrations for each action are presented in the last section of the chapter for reference.

In-place: Press (hand-centric); Press (finger-centric)
Translation: Wipe (hand-centric); Drag, Coordinated drag, Scratch (finger-centric)
X,Y Rotation: Roll (hand-centric); Roll (finger-centric)
Z Rotation: Twist (hand-centric); Twist, Coordinated twist (finger-centric)
Table 5.1: Hand actions classified according to hand- or finger-centric and transformation

5.5.1 Hand-centric Actions

Hand-centric actions typically are performed by hand configurations that contact the surface through a connected area, such as the side of the hand or a palm, as opposed to several discrete contact areas, such as the fingers.

5.5.1.1 Hand Press

The simplest case of a hand press occurs when the hand is pressed up against a surface and pressure is applied (Figure 5.2).
In many cases, the surface typically is a button or some physical structure, and therefore pressing against it merely acts to physically move that surface in a certain direction. This direction can be orthogonal to the surface (orthogonal hand press), or off-axis to the normal (directional hand press). The directional aspect of the hand press is also dependent on the friction between the hand and surface; any slippage will lead to a different classification of the hand action.

5.5.1.2 Hand Wipe

A wipe is characterized by the motion of the hand along some trajectory on the surface. The hand posture can remain static throughout the duration of the wipe (Figures 5.4, 5.6) or it can change from one configuration to another (Figure 5.5). Even if the contact area changes during a wipe, one part of the contact area should remain or end up stationary relative to the surface.

5.5.1.3 Hand Twist

The hand twist is a variation of a wipe. The hand is not moved from one location to another, but rotates in place about an axis that is orthogonal to the surface. It is difficult to perform a twist without performing a wipe at the same time, but the distinction is important. Again, a multitude of hand configurations can perform the twist (Figures 5.7, 5.8), as long as the contact area remains centred about the hand as a whole.

5.5.1.4 Hand Roll

This involves the changing of hand configurations in a fashion that evokes rolling. The hand is pressed down in a certain configuration, and then rotated about an axis parallel to the surface such that the contact area shifts in one direction. This contrasts with a hand wipe, where the shift in contact area is achieved by overcoming static friction with the surface. There are many variations such as rolls with closed fists (Figures 5.9, 5.11), open palms (Figure 5.10), and even groups of fingers (Figure 5.12).

5.5.2 Finger-centric Actions

Finger-centric actions involve the use of fingers rather than the hand as a whole. Whereas with hand-centric actions there is usually a single, large contact area to consider, finger-centric actions provide multiple, smaller contact areas that can be moving independently. The cognitive requirements for moving several fingers in a coordinated manner are different than when using the hand as a whole.

5.5.2.1 Finger Press

The finger press is the action of applying force with a finger without moving, similar to a hand press. A very common occurrence of a finger press is the activation of buttons. The finger press can be qualified further based on which fingers are involved; there are many possible combinations (Figure 5.13).

5.5.2.2 Finger Drag

A finger drag is the analogue of a hand wipe. Again, there are many possible combinations of fingers that can be used (Figure 5.14), but a drag trajectory can be defined in terms of individual fingers. As a result, the dragging action of multiple fingers can be broken down into single-finger dragging actions, and therefore each finger can be applied and released independently. The main condition for multi-finger dragging actions is that each finger must remain in motion. The anchoring of any particular finger will indicate a different action (see: finger twist, coordinated finger drag). A drag can be performed with any part of the finger: the fingertip, fingernail, knuckle, etc. However, when the full length of a finger is involved, then the contact area is not focused on single points, and the action should therefore be considered more of a hand wipe.
A drag can transition into a hand wipe or other actions involving finger movement.

5.5.2.3 Coordinated Finger Drag

A coordinated drag of multiple fingers in the same general direction is only one possibility. Under such a situation, the hand itself is also moved in order to bring the fingers from one location to another. When the fingers move but the hand remains stationary, there are other possibilities that should be elaborated.

5.5.2.3.1 Pinch/Squeeze

A pinch/squeeze action occurs when two or more fingers are brought towards each other. The difference between a pinch and a squeeze is the number of fingers that are involved. Typically, the drawing together of two fingers, most often one finger and the thumb, is considered a pinch. A squeeze involves more fingers in an action that requires less precision. The line between the two is far from definitive—we can consider using our thumb, index finger, and middle finger to perform a pinch. In a pinch/squeeze action, each finger is undergoing its own drag action with a particular trajectory. One or more of these fingers can remain stationary, depending on whether the eventual focus of the pinch/squeeze is to be a central point (Figure 5.15), or towards a stationary finger (Figures 5.16, 5.17, 5.18). Another variation would be performing the pinch/squeeze with the base of the hand planted for support (Figures 5.19, 5.20). Other possibilities include pinching with two fingers, rather than with a thumb.

5.5.2.3.2 Stretch/Spread

The opposite of a pinch/squeeze would be the stretch/spread action. All of the above conditions and variations also apply here.

5.5.2.4 Scratch

A very specialized form of a finger drag is the scratch, which involves scraping the fingernail along the surface (Figure 5.21). This produces a unique sensation that distinguishes it from a conventional drag.

5.5.2.5 Finger Twist

The finger twist is performed when a single finger is moved in a rotational manner about an axis parallel to the surface normal. It is closely related to a drag, much like a hand twist is similar to a hand wipe, with the distinguishing feature being the act of rotating rather than translating. The rotation should be centred about a single contact area, such as the fingertip. The most easily visualized twist is when the finger is pointed directly into the surface and twisted in place. This can be performed by any of the fingers and even the thumb. Again, as with the drag action, a twist can also be performed by any single part of the finger such as the fingertip (Figures 5.22, 5.23), thumb (Figure 5.24), or knuckle. Using the full length of a finger again blurs the line between a finger twist and a hand twist.

5.5.2.6 Coordinated Finger Twist

We may consider that the rotation of multiple fingers about a central axis should also be considered a twist; however, it is not so simple. In these cases, the axis of rotation can be either centred about one of the fingers involved, or about none of them. For example, when we use our index and middle fingers to perform a twist (Figure 5.25), we have the option to pivot around either one of the fingers, or around a point in between. In fact, in both cases, one or more of the fingers is undergoing a drag action, from the perspective of individual fingers. Therefore, such actions are already accounted for in the list of actions presented thus far. These actions are nonetheless quite common and can be clarified by assigning this additional category; a rough sketch of how the two cases might be told apart from finger trajectories follows.
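The sketch below classifies a coordinated motion of several fingertips from their trajectories alone. It is an illustration added here rather than part of any device or implementation described in this thesis: the function name and the stillness threshold are invented, and a real recognizer would also have to account for sensor noise and resolution.

```python
import math

def classify_coordinated_motion(trajectories, still_tol=2.0):
    """Classify a multi-finger motion from per-finger (x, y) trajectories.

    trajectories: one list of (x, y) samples per finger.
    still_tol: hypothetical threshold (in the same units as the samples)
               below which a finger or the group centroid counts as stationary.
    """
    def displacement(points):
        (x0, y0), (x1, y1) = points[0], points[-1]
        return math.hypot(x1 - x0, y1 - y0)

    # A finger is anchored if its start and end positions nearly coincide.
    anchored = [displacement(t) <= still_tol for t in trajectories]

    if all(anchored):
        return "finger press (no motion)"
    if any(anchored):
        # One finger stays put while the others move about it.
        return "finger-centred twist"

    # Otherwise compare the fingers' motion with the motion of the group
    # centroid: fingers that move while the centroid barely does suggest a
    # rotation about an interior point rather than a common translation.
    n = len(trajectories)
    cx0 = sum(t[0][0] for t in trajectories) / n
    cy0 = sum(t[0][1] for t in trajectories) / n
    cx1 = sum(t[-1][0] for t in trajectories) / n
    cy1 = sum(t[-1][1] for t in trajectories) / n
    if math.hypot(cx1 - cx0, cy1 - cy0) <= still_tol:
        return "centroid-centred twist"
    return "coordinated drag"
```

For example, two fingertips tracing opposite arcs around a point between them report a nearly stationary centroid and are labelled a centroid-centred twist, while the same arcs performed around a planted middle finger are labelled finger-centred.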
51 5.5.2.6.1 Finger-centred Twist When one finger acts as the pivot point, we distinguish this as a finger-centred twist. Common examples include the dragging the index finger about the middle finger (Figure 5.26) and pivoting o f the fingers about the thumb (Figure 5.27). 5.5.2.6.2 Centroid-centred Twist When no finger acts as the pivot point, but the rotation happens around a location in between the fingers involved (Figures 5.28, 5.29), we distinguish this as a centroid-centred twist. The point of rotation may not necessarily be the mathematical centroid, but should be a point within the line or polygon demarcated by the finger points. A rotation about a point outside this centroid region is considered a drag. 5.5.2.7 Finger Roll The conditions for a finger roll are similar to that of the above finger actions. It is best characterized when a single finger is rotated about an axis that is parallel to the surface. The contact area can range from the tip of the finger to the entire length (Figure 5.30); however, when more than one finger is involved, the action becomes more of a hand roll , as the focus does not remain on the same part of the hand. 5.6 Additional Semantics of Touch Interaction The hand actions described above were from the perspective of a single hand used as the sole input. For some interfaces, such as a touch pad, this is indeed the regular course of action. However, since touch interfaces do not usually impose any physical affordances that restrict the usage of the hands, there are other possibilities of touch interaction that are supplementary to single hand interaction. 5.6.1 Bimanual Hand Act ions The actions listed previously were all single-hand actions. We are, however, quite adept at using both of our hands together to perform many tasks that are difficult or impossible when using one hand. Many touch interfaces support the use of both hands simply due to their multi-point sensing capabilities. Bimanual hand actions are just combinations of the atomic hand actions, from the perspective of constrained-surface input. Much like what we described as a coordinated twist can be characterized as independent drag actions of one or more fingers occurring in concert, bimanual actions can also be said to be composed of independent single-hand actions. If we are to assign additional semantics to certain actions that are not evocative of the physical transformations undergone by the hand(s) involved, then the distinction between single-hand and dual-hand can be important. For example, i f a finger pinch and spread action is to correspond to a "resize" function in some application, then we can consider that the pinch/spread action performed by fingers on the same hand is different i f performed by fingers on both hands. In this chapter, however, we are specifically not dealing with application-specific semantics, but rather, human-centric semantics. In our design space the mapping between input domains 52 (hand action) and output domains (application semantics) is important and we wi l l revisit this in Chapter 6. 5.6.2 Dominant/Non-dominant Hand Act ions The usage of the dominant hand versus the non-dominant hand is also dependent on application semantics, as the transformations characterized in our hand action list can be physically performed by either hand. The application dictates the preference of using one hand over another, such as the precision required by a pointing task. 
This distinction is even more relevant when performing bimanual tasks, since it is posited that the dominant hand is more precise and therefore should be used to perform active manipulation tasks, while the non-dominant hand is better suited for contextual positioning. 5.6.3 Compound/Repetit ive Act ions With the exception of hand and finger press actions, all the actions listed so far have a temporal component to them—that is, it w i l l take a certain non-instantaneous amount o f time to complete the action. Each non-static action consists of one start state, one end state, and a transition between them. Actions can be strung together in sequence to form compound actions to perform more complicated tasks or to generate a gestural language. Other actions can have additional meaning when performed repetitively. For example, repeating a finger drag action back and forth can be considered a "rub" action, or applying a hand roll repeatedly can be called a "knead". We wi l l refrain from these additional terms since they can be defined as compounded atomic actions. 5.6.4 Pressure Variation When performing the hand actions enumerated in this chapter, applying pressure variation is useful for expressive touch interaction, at least in a form that is compatible with our own sense o f touch and experience o f being touched. Inputting pressure can also serve to manipulate position in the Z dimension. The hand actions by themselves are not dependent on pressure variation; they should all be recognizable by an interface that does not support pressure sensing. There are also no hand actions requiring pressure variation that needs to be considered atomic. The interpretation of pressure in the context of a captured hand action is once again the domain of the application. 5.6.5 Onset/Release and Duration Semant ics While the atomicity of hand actions does not take into account pressure variation, another aspect of touch that can be used in application is time-related semantics. The onset of a hand action is considered to be the pressure/time characteristics at the beginning of the action. For example, a finger press can be performed by a gradual increase of pressure after contact over a long period of time; it can also be executed by an abrupt impact of the finger. The same value of maximum pressure may be achieved in both cases, but the time it takes to reach this pressure is different. These can be interpreted differently in applications. 53 The duration of a hand action to complete, from onset to release, is related, and these three properties, i f measurable by the interface, can indicate a sense of urgency on the part of the action performer, or coupled with pressure variation, can be integral to an expressive form of touch interaction (caress). 5.7 Summary In this chapter we have presented a set of possible hand actions when considering interaction with a flat surface. This comprises a movement vocabulary in our design space, which we wi l l examine in the next chapter. The basic hand actions are divided between hand-centric and finger-centric actions, and are described in terms of a physical task that each action can perform. We concluded this chapter with several important semantic aspects of touch interaction that further increase the possibilities of touch input and foreshadow some attributes that w i l l influence interface selection. 
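Before moving on, a brief illustration of the onset/release and duration semantics discussed in Section 5.6.5 may be useful: because they depend only on a pressure/time trace, they can be summarized in a few lines once an interface reports pressure at a known rate. The sketch below is purely illustrative and assumes a single contact sampled at a fixed rate; the function name, contact threshold, and returned fields are invented here and do not correspond to any interface discussed in this thesis.

```python
def onset_and_duration(pressure_samples, sample_rate_hz, contact_threshold=5):
    """Summarize onset/release semantics of one contact from a pressure trace."""
    above = [i for i, p in enumerate(pressure_samples) if p >= contact_threshold]
    if not above:
        return None  # no contact detected

    onset_i, release_i = above[0], above[-1]
    peak = max(pressure_samples)
    peak_i = pressure_samples.index(peak)

    time_to_peak = (peak_i - onset_i) / sample_rate_hz   # abrupt vs. gradual onset
    duration = (release_i - onset_i) / sample_rate_hz    # onset to release

    return {"peak_pressure": peak,
            "time_to_peak_s": time_to_peak,
            "duration_s": duration}
```

A gradual press and an abrupt tap can then reach the same peak pressure while differing sharply in time to peak, which is exactly the distinction an application might interpret as urgency or expressiveness.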
5.8 Figures

5.8.1 Hand Press
Figure 5.2: Examples of hand press action (1): left: force applied directly into the surface (orthogonal hand press); right: force applied in an off-normal direction (directional hand press)
Figure 5.3: Examples of hand press action (2): left: closed fist; right: side of palm

5.8.2 Hand Wipe

5.8.3 Hand Twist

5.8.4 Hand Roll
Figure 5.9: Hand roll action: closed hand rolls from left to right
Figure 5.10: Hand roll action: open hand rolls from left to right
Figure 5.11: Hand roll action: closed hand rolls from bottom to top
Figure 5.12: Hand roll action: two fingers roll from left to right (cf. single finger roll)

5.8.5 Finger Press
Figure 5.13: Multiple variations of a finger press action; in all cases contact is made by individual fingers or discrete points on the hand, rather than with the whole hand

5.8.6 Finger Drag

5.8.7 Coordinated Finger Drag (Pinch/Squeeze; Stretch/Spread)
Figure 5.15: Finger pinch action between thumb and index finger with both moving
Figure 5.16: Finger pinch action between thumb and index finger with finger moving towards thumb
Figure 5.17: Finger pinch action between two fingers and thumb
Figure 5.18: Finger pinch action between all fingers and thumb; can be considered squeeze action to emphasize less precision compared with pinch actions diagrammed above
Figure 5.19: Finger pinch action between two fingers and thumb; palm planted on surface
Figure 5.20: Finger pinch action between little finger and thumb; side of palm planted on surface

5.8.8 Scratch

5.8.9 Finger Twist
Figure 5.22: Finger twist action of finger pointed directly into surface
Figure 5.23: Finger twist action of finger contacting the surface through the finger pad
Figure 5.24: Finger twist action of thumb

5.8.10 Coordinated Finger Twist
Figure 5.25: Coordinated finger twist action of two fingers that touch the surface; neither one stays in the same spot, and therefore considered a centroid-centred twist
Figure 5.26: Finger-centred twist of two fingers; middle finger remains in place while index finger moves; can be considered a drag action from the perspective of the index finger
Figure 5.27: Finger-centred twist of two fingers and thumb; thumb remains in place while two fingers move about thumb

5.8.11 Finger Roll
Figure 5.30: Finger roll action

Chapter 6: Design Space for Touch Interfaces

In this chapter, we propose a design space for touch interfaces. The design space is intended to systematize information about touch interfaces, particularly ones that exist today. Such a design space would be useful in guiding the selection of interfaces for applications, as well as to suggest directions for novel interface designs.

We began this task by revisiting the design space presented by Mackinlay et al. [Mackinlay90], which organized information about input devices in general and provided descriptive tools for describing the semantics of a device. One part of this design space was a taxonomy of input devices based on possible measurement properties as well as a set of combination operators. We suggested modifications and additions for properties relevant to touch interfaces and formulated a dedicated taxonomy for touch interfaces in Chapter 4. Another component of a design space is a set of possible input actions, or a movement vocabulary. In the context of touch interaction, this is the set of human hand actions, which we have constrained to a subset that contains only actions performed with respect to a flat, physical surface.
We have also identified several additional semantic categories which are not captured by the enumeration of actions but remain important aspects of touch interaction. Using these components we w i l l generate a design space that is relevant to touch interaction. We wi l l then outline some criteria to test points in this design space, following existing techniques described in previous works. We emphasize that our research is focused specifically on touch interaction with a flat surface, and not with physical widgets or volumes. Our design space should be extensible to support all touch interaction of any context, though it would require a thorough revising of 68 the movement vocabulary and input device taxonomy. This is beyond the scope of our research but is an important direction to investigate. 6.1 Generating the Design Space Our taxonomy of touch interfaces is derived from a design space based on the movement vocabulary developed for general input devices [Mackinlay90]. Our work in Chapter 5 has provided a basis to update this movement vocabulary as it applies to touch interaction. In addition, we have identified important semantic concepts of touch interaction that w i l l also have an influence on composition operators. 6.1.1 Primitive Movement Vocabulary A n input device can be represented by the following tuple: <M, In, S, R, Out, W> The following table explains each term in this representation: Symbol Name Description M Manipulation operator A physical property that is adjusted by the input device S Current state Current state of input device, typically expressed as a variable In Input domain The input domain is denoted by a set range of values that span the possible input values Out Output domain The output domain is denoted by a set or range of values that span the possible output values R Resolution function Mapping from input domain to output domain W Work properties General set of device properties describing additional aspects of how the device works Table 6.1: Terms of representation of an input device We see no need to modify this method of formalizing input devices because a touch interface can certainly be considered a mechanism that remaps data from an input domain to an output domain. Pertinent attributes and aspects specific to touch interfaces can be elaborated in the W variable. We wi l l address some examples of these in the section on testing the design space. 6.1.1.1 Manipulation Operator (M) From our discussion of possible hand movements constrained to a surface, we have enumerated a basic set of atomic hand actions that serve as the building blocks for touch interaction in this context. A major distinction of hand-centric and finger-centric actions serves as the first level o f categorization, with the actions within each category further differentiated by their transformational properties (Table 5.1). This serves as a more logical set of manipulation operators as opposed to the set suggested by Buxton [Buxton85] that includes rotation and torque as measurable properties (Table 3.3). 69 We do not deny the assertion that the properties that any input device can measure should be a combination of these properties, and indeed, each of the hand actions can be represented in terms of position. We propose that using a set of actions to serve as pseudo-properties better indicates what forms of touch are supported by a particular interface. We also shall retain a set of physical properties that serves as a second level of categorization. 
These properties are: proximity, contact, pressure, and change in pressure (Apressure). The latter two are from Table 3.3, and proximity and contact are introduced in our proposed taxonomy for input devices (Figure 4.4) as different qualities of position. While these serve as primary classifiers in the design space by MacKin lay et al., we shall consider these as secondary classifiers. 6.1.1.2 Current State (S) A s a result of using hand actions as a basis for manipulation operators, the range of possible values for the current state is not easily specified. The alternative of using lower-level property measurements such as position and force would be equally challenging; it would also be less intuitive to represent hand actions as a set of position and force values. Since there are no natural metrics to quantify the state and transformations corresponding to a hand configuration, we need to create a formalization. One way to represent a configuration of the hand is to use bend angles of the joints in the hand. The problem with this approach is the inherent physical differences and capabilities of individual hands. This is not only practically difficult (glove interfaces can encode a limited set of bend angles), but it is unnecessary—we do not perceive the state o f our hand as a set o f bend angles. To simplify, we can consider the effects touch input has on the surface, namely a contact profile consisting of an area denoting where the fingers and hand touch the surface and a three dimensional vector field representing the forces that are applied. The various states of the hand can be represented by instantaneous states. This is still too much information to express concisely as a finite set of values. There are no touch interfaces that actually capture the detail required to produce such a representation. Many touch interfaces still capture discrete points for the purposes of conventional applications; even the interfaces that we consider belonging to our proposed field-input category only produce low resolution images (SmartSkin). The only device that produces a reasonable vector field is the GelForce system, however, position information is not represented. 6.1.1.3 Input Domain (In) Specifying the input domain should be in terms of the possible configurations of the hand, using the representations presented above. O f course, it is not possible to explicitly formalize this domain in terms of a finite range. The input domain also should contain a representation for all the possible actions. Many of these actions have been given descriptive names in our examination of hand actions in Chapter 5, though these do not quantify the motion itself. A hand action can be considered to consist of the following: a start configuration, an end 70 configuration, and a parameterization of the change in configuration over time. Again, it is possible to represent the states using physical characteristics such as bend angles, and producing a transition graph of the values as the change during the action. Such a representation is again unnecessary, and also does not address the changes of forces. Our representation using a contact profile and a vector field is sufficient for encoding the begin and end states, as well as the transition, in the form of a series of snapshots. Practically, this seems just as daunting as determining and capturing bend angles, but we emphasize that the specification of the input domain is independent of the technological capabilities of current interfaces. 
Using this representation, all important characteristics of a transition can be theoretically extracted, given a sufficient capture rate of the information. Such characteristics include the variation of pressure, as well as the rate of change for both hand configuration and pressure.

6.1.1.4 Output Domain (Out)

The output domain corresponds to the topology and range of values supported by a particular input device. For example, a laptop touch pad produces a single (x,y) coordinate, most likely in integers of a certain range. Devices that capture multiple points can be specified accordingly. Additional properties that are captured can be specified in a similar way, as so far all devices quantize all incoming information into numeric values in a set of a definable topology. For example, the MTC Express captures pressure information over sensors arranged in a fixed grid. A sample representation can be [0, 127]^72, corresponding to 72 pressure values.

6.1.1.5 Resolution Function (R)

This is the function that maps information from the input domain to the output domain. With simple devices, it is easy to define a single or multivariable function that specifies the relationship between input and output, such as the mouse transforming hand position information (in some metric) to x,y coordinates. For touch interaction, such a function would transform from an input domain consisting of formalizations for hand configurations and actions to an output domain represented by a certain topology of values. This, in essence, is the function of the input device itself, but it would be difficult to specify the relationship in any formalized manner. Note that the resolution function has nothing to do with how the information provided by the input device is interpreted by an application. The resolution function does not transform hand input into, say, a resize command in some paint application. The separation between input device functionality and application functionality should be emphasized.

6.1.1.6 Work Properties (W)

The general-purpose set of device properties, W, can include such relevant touch interface attributes as underlying technology, sensor resolution, lag and hysteresis effects, among others, but these may be represented by other elements such as input domain or resolution function.

6.1.2 Composition Operators

In our revised input device taxonomy for touch interfaces, we maintain the use of the composition operators proposed by MacKinlay and Card [MacKinlay90] whenever relevant.

6.1.2.1 Merge Composition

This is the combination of two devices such that the resulting input domain is a cross product of the input domains for the two devices, i.e. the mouse can be considered a combination of two orthogonal one-dimensional valuators. Touch interfaces that capture input in the form of points can also be considered to employ merge compositions of one or more sets of orthogonal 1D valuators. Additional properties, such as pressure sensing, can also be considered a merge composition, as long as they are captured using the same interactive surface. For example, Wacom drawing tablets capture limited Z-dimension information (proximity), as well as pressure information on the same drawing surface; thus these properties are linked by the merge composition.

6.1.2.2 Layout Composition

This is the co-location of two devices on a common physical interface. The buttons of a laptop touch pad are an example of a layout composition.
Layout compositions are typically rare for touch interfaces, since the touch-sensitive area is used to capture the whole hand, and therefore separate, independent input mechanisms are usually unnecessary. 6.1.2.3 Connect Composition A connect composition maps the output provided by one device into the input of another. This is not a common composition among physical interfaces; it has much more relevance when dealing with virtual devices, for example, when using two sliders to control a cursor location. This does illustrate an important point that devices in the design space do not actually have to be physical interfaces, but can also be virtual interfaces simulated in software. 6.1.3 Design Space for T o u c h Interfaces The design space for input devices is the set of possible combinations of composition operators with a primitive vocabulary. Based on the movement vocabulary we proposed we show a schematic representation of this design space in Figure 6.1. In the design space of general input devices visualized by the taxonomies shown in Figure 3.3, circles represented a transducer in the device, plotted according to the physical property it transduces. For the design space of touch interfaces, we do not consider the physical properties an interface captures, but rather the manipulation operators we have proposed, which is set of hand actions. These serve as the columns in the design space, replacing dimensionality and measurement types, since these are integrated into the hand actions (i.e. a finger twist is a combination of linear position changes in the X and Y dimensions). We further divide the columns in the finger-centric section into single-finger and multi-finger. We also then categorize the finger-centric columns by the precision the device can capture the fingers. This is represented by a continuum of coarse to fine values, similar to that shown 72 in the Mackinlay and Card taxonomy. Coarse capturing of finger input indicates very rough recognition of a finger, and difficulty with resolving multiple fingers in proximity and fine movements. Fine capturing should theoretically be able to capture all the fingers even at their closest possible configuration. The rows wi l l retain the familiarity of the taxonomies seen in previous chapters, with property types we selected as secondary classifiers for the hand actions a device captures. These include proximity, contact, pressure and change in pressure. Most existing devices capture finger position for the purposes of pointing tasks, so many devices wi l l be found in the position row. This corresponds to the crowdedness of the linear position sensing section in our plotting of touch interfaces in the original taxonomy (Figure 4.2). A s an example, we plot the laptop touch pad in the design space representation in Figure 6.1. A laptop touch pad can capture single-point contact, as well as capture single-finger drag actions. Two circles are placed in the appropriate columns in the figure, aligned towards right (fine resolution), and connected by a merge composition, as the detection of physical contact and the capture of a trajectory associated with dragging can be considered orthogonal mechanisms. Even though it is physically possible to perform any hand action on a laptop touch pad (given a big enough area for some actions), single-finger static and translation transformations are the only ones that are captured by the device to produce useful output. 
Even though many laptop touch pads (capacitive-sensing touch pads) have the ability to produce a centroid point from multiple finger contacts, this is not useful output.

Figure 6.1: Laptop touch pad and MTC Express in the design space for touch interfaces; columns denote type of hand actions (SF=single-finger, MF=multi-finger). (The columns cover finger-centric and hand-centric in-place, translation, rotation about X/Y, and rotation about Z actions; the rows are proximity, contact, pressure, and change in pressure.)

A more interesting example is the plot corresponding to the MTC Express. The MTC Express captures multiple pressure-sensitive points, though the sensors are spaced quite far apart. All circles are found in the pressure row; naturally, all finger-centric and hand-centric actions are supported. Single-finger actions are generally more readily captured because there is no interference from input from other fingers. Thus the circles for single-finger actions are aligned more towards the right than those for multi-finger actions.

With this design space, we now proceed to place the touch interfaces we have considered in our research (Figure 6.2):

Figure 6.2: Visualization of design space for touch interfaces; spheres for GelForce represent ability to capture pressure in all directions, not just into the surface. (The devices plotted are the touch screen, laptop touch pad, FingerWorks, DiamondTouch, Wacom tablet, MTC Express, GelForce, STC-1000, and SmartSkin.)

6.1.3.1 Placement of Devices

From Figure 6.2 we can see there are only three major classes of touch interfaces that exist currently: devices that detect single-point in-place and translation actions; devices that sense single- and multi-point in-place, translation, and twist (rotate about Z) actions; and devices that sense all actions. The degree of precision to which each device captures these actions varies, but some commonalities are evident.

All existing touch interfaces that are used for pointing tasks have their contact sensing indicators aligned towards the fine part of the continuum. The MTC Express and STC-1000 are the only exceptions, and they are not intended for pointing tasks. These are, however, two of the few devices that support the capture of all actions, albeit with low precision. Devices that capture single points obviously cannot process actions that require multiple fingers, nor can they detect any possible configuration of a hand, except for perhaps providing a centroid for any multi-point input. Interfaces that capture multiple points can support multi-finger in-place and translation actions, as well as multi-finger twists, because these can be characterized as multiple independent drags. Multi-point interfaces cannot, however, adequately capture single-finger twists, unless the device produces a dense point field. In such a case, our movement vocabulary dictates that this is actually a hand-twist action, since there is no single point of contact from the perspective of the interface.
A device that produces such a dense point field w i l l actually be considered one that can capture all hand actions. The only such device seen in our taxonomy is the SmartSkin, and even this is in the row denoting proximity. This is due to the SmartSkin architecture which does not actually sense contact, but the electric field changes due to nearby body masses. We can see that there is room in the design space for a device that actually captures contact, but represented with a dense point grid, rather than a set of pointers. GelForce is a unique device in our design space, as it captures forces in arbitrary directions. We represent this capability with a sphere, rather than a circle. This is a new and interesting direction that includes interfaces with deformable surfaces or other technology that can capture forces applied non-orthogonally. A n extension of this type of interaction is touch interfaces that do not employ only a flat surface, but of an arbitrary shape. The Wacom tablet senses proximity as well as pressure for single-points, thus its placement in their respective rows. This device is also unique among the devices in that it does not capture hand-based input, but rather through a stylus. Consequently, the precision afforded by this device can exceed that provided by other devices, from the perspective of the object that contacts the surface. It should be evident that all the devices are connected through merge compositions. Connect compositions refer mostly to virtual devices, and it would be impractical to consider all such devices that can be manipulated through touch interaction. Layout compositions are also often irrelevant, especially when performing multi-finger and hand actions. There simply is no need to provide nearby additional controls. The lone exception is the laptop touch pad, but it is understood that our design space is concerned with touch interaction with a surface. 6.2 Testing Points in the Design Space A s with the original design space of input devices, we wi l l consider the evaluation of mappings of specific input devices by two basic criteria: expressiveness and effectiveness. Expressiveness refers to a device's ability to convey exactly and only the intended meaning, and effectiveness is concerned with a device's ability to convey the intended meaning 75 appropriately and efficiently (with felicity [Card91]). 6.2.1.1 Expressiveness One way the expressiveness of a device can be evaluated is by considering its input and output domains. If the number of items in both sets does not match, then there is a disparity that can lead to one of two situations: i f the projection of the Out set includes elements not in the In set, the user can provide illegal input; i f the In set includes values not in the projection, then there exists legal values that the user cannot specify [Mackinlay90]. In the context of our analysis of touch interfaces, we have established that the output domain is often a flat parameterized set of values, whereas the input domain is actually an infinite set of hand actions. Therefore, the expressiveness of touch interfaces cannot be evaluated in the above sense. It is even not necessary, as we know that the input set is always restricted to hand actions, and not arbitrary physical mechanisms represented by all input devices. 
In our reformulated taxonomy for touch interfaces (Figure 4.4), we considered three relevant attributes of that we used to categorize touch devices: ability to process single or multiple points, or higher order constructs (field); sensor resolution; and pressure sensitivity. These can be considered when determining the expressiveness of an interface. The most common combination of attributes for touch interfaces is single point sensing with a high sensor resolution. These interfaces support the precise capture of single-finger in-place and translation tasks. Depending on the scale of the interface, the sensor resolution can vary and still be effective for capturing single-finger actions. For example, the sensor resolution of a touch screen may be much higher than that of the DiamondTouch because the finger has less room to travel in a touch screen to effect the same movement. When capturing multiple fingers, sensor resolution not only has an impact on precision, but also on the ability to resolve fingers close together. A low sensor resolution may be sufficient for applications that require the detection of multiple touches over a large area. The detection of fingers close together or even touching wi l l require a much higher sensor resolution to resolve the multiple points, or even higher-order capture to unambiguously identify the individual fingers. For touch interaction involving pressure, the detection of forces is also important. Most pressure-sensitive devices capture orthogonal pressure. The ability to capture pressure from other directions is a variation that is possible. The resolution and range of pressure values that can be captured may also be important; consider the sensitivity of the human skin. 6.2.1.2 Effectiveness Effectiveness can be evaluated in terms of various figures of merit. Several were identified by MacKinlay et al. including: • Pointing speed (device bandwidth) • Pointing precision • Errors • Time to learn 76 • Time to grasp the device • User preference • Desktop footprint • Cost For touch interfaces we introduce the following figures of merit: • Data rate • Representation size/topology • Responsiveness • Hysteresis effects • Interference effects Data rate refers to the rate at which information can be requested from the device; this naturally influences pointing speed. Most applications would naturally desire a rate as fast as possible, but this is limited by hardware protocols and is also related to representation size. The amount of data that is sent per record is the representation size. This is dependent on the representation topology. For example, the M T C Express sends 72 pressure values in each record, corresponding to its grid of sensors; most pointing devices send data in the form of individual points, which is much more compact. Given the complexity of hand actions, hardware limitations of the interface and the destination system wi l l govern the data representation size and rate. Responsiveness is also dependent on the representation size and data rate. This is the measure of delay between input and output, which influences how quickly an action can be performed and/or repeated. Hysteresis effects are present in technologies that provide analog measurements with an easily affected baseline value. Some devices automatically compensate for this by using thresholding or filtering. Position or pressure sensors embedded in deformable materials are subject to hysteresis. 
Examples of interference effects are found in many technologies that are employed in touch interfaces. Capacitive sensing is inherently dependent on the manipulation of electrical signals, and so is sensitive to electromagnetic interference. Optical sensors can be affected by light levels. The technology used by the touch interface also governs the necessity of human input, as opposed to touch input by other objects. 6.3 Summary With the insights gained from developing a taxonomy for touch interfaces, as well as the identification of hand actions relevant to touch interaction with a surface, we have proposed a design space for touch input devices. The generation of the design space included a tuple formalization of a touch interface. This formalization included manipulation operators that were based on the set of possible hand actions, which complicated representations for state as well as input and output domains. The design space demonstrates a clustering of existing devices into three groups: those that 77 support single-point touch and dragging; those that support multi-point touch, dragging, and twisting; and finally, those that support all possible actions. The support for these actions vary depended on interface attributes, which can be evaluated as a means of testing the design space by examining expressiveness. Some important metrics for touch interface effectiveness were also introduced. 78 Chapter 7: Whole-hand Interactive Applications In order to motivate and demonstrate the contributions presented in this thesis, we focus this chapter on describing the applications and interactive systems that we have developed to study touch interaction. The motivation began with FlowField, a multimedia, immersive installation that made use of a multi-point pressure-sensitive touch interface, the M T C Express. This exercise yielded some interesting observations and questions not only about the suitability of this device for a particular task, but about the general space of surface-constrained touch interaction. The issues and difficulties experienced with FlowField led to three distinct but important directions for investigation, which are explored with applications described subsequently in the chapter. First, an inherent limitation of the M T C Express was the low sensor resolution. We addressed this by implementing interpolation and applying it to an updated version of FlowField in the context of an abstract layered framework for touch interface application design. The second major direction involved the development of applications that exploited hand actions that can be suitably captured by the M T C Express—we demonstrated the hand-centric roll action, which highlights the use of our design space presented in Chapter 6. Finally, the third direction investigated the development of a novel touch interface, a deformable surface captured using video technology, which is an area shown to be vacant in the taxonomy in Chapter 4. The contributions presented in this thesis were inspired by our experiences with designing and developing applications that supported whole-hand surface interaction. In this chapter, we describe our efforts creating these applications and how they contributed to shaping the design space, and how the evolution of these ideas has guided and supported our further 79 development of applications in this area. We therefore wi l l contextualize each application to emphasize its location in our framework. We begin with the initial motivating application— FlowField. 
7.1 FlowField - Exploring Interface Limitations We developed FlowField for the purpose o f creating an interesting application that used the M T C Express for input 1 . FlowField was to be one of the first applications that took advantage of the device's unique properties. A t that point, we understood the M T C Express to be a device that provided pressure-sensitive input over multiple points, which was in direct contrast with the single-point contact-sensing touch pads found on laptops. A s such, we considered the types of actions that would be possible with this device that would be difficult or impossible with such laptop touch pads. The comparatively large touch area suggested possible interaction with a single hand and its fingers. The application concept called for interaction with a virtual simulation of moving particles, and thus it was natural to associate hand input on the M T C Express to directly manipulate the virtual particles in some fashion. We selected a metaphor of moving fingers through water as a basis for the mode of interaction—we believed the multi-point nature of the device would be suitable for capturing the input of multiple finger trajectories. The moving fingers were envisioned to serve as disruptions to the virtual particle field, which was to occur and be displayed in real-time. The venue for this application was a large-scale immersive display known commonly as a C A V E (Cave Automatic Virtual Environment), and thus the virtual particles would be perceived as surrounding the participant, who interacts with the system with the M T C Express held in hand. We did not anticipate the specialized display infrastructure for FlowField to affect the exploration of an interactive application with the M T C Express. 1 Chen, T., Fels, S. & Schiphorst, T. (2002). FlowField: Investigating the Semantics of Caress. Conference Abstracts and Applications of ACM SIGGRAPH, 185. Linux PC MTC | Express SGI Taxel data Onyx • 3200 Figure 7.1: System diagram of FlowField 80 A simplified system diagram is shown in Figure 7.1. The system architecture and data flow wi l l be explained in more detail in a later section, but first we w i l l describe the participant experience of FlowField. 7.1.1 FlowField Experience In FlowField, a field of moving particles is displayed in the C A V E ; this field is arranged in a cylinder (Figure 7.2) and projected in stereo. Participants would see that they are enveloped in this cylinder of particles. These particles are in constant motion, revolving around the participant. Holding the M T C Express, the participant can introduce obstacles into the flow of moving particles, disrupting them in a physically intuitive manner. Thus the experience consists of interplay between dynamic particles and obstacles which are directly controlled by input applied on the interface. Figure 7.2: The virtual field of particles arranged in a cylinder (left); circular obstructions in flow (right) The stereoscopic imagery is displayed on a four-screen C A V E in a closed cube configuration, with screens to the front, left, right, and below the participant. Participants wear shutter glasses which resolve the stereo images. While interacting with the M T C Express, they experience feedback in the form of visual obstructions which appear in the particle flow, as well as a dynamic M I D I accompaniment which is related to the collision activity. 
Various parameters of the virtual environment can be adjusted in real-time by the operator, including particle size, flow speed, impact characteristics, and colour profiles to add customizability to the installation. The obstructions are manifested as blue circles whose diameter lies across the flow path, such that the curved edges deflect the incoming particles as the pass (Figure 7.2). The diameters of the obstructions are mapped to the pressure values of the M T C Express; the higher the pressure, the larger the obstruction. Each of the 72 sensors corresponds to an obstruction, and these are arrayed in a regular grid to mimic the placement of sensors on the interface. Thus, the pressure profile of the hand and fingers is represented by the 72 obstructions. Participants are able to enjoy the constant motion of colourful particles and directly 81 manipulate them using input from their whole-hand. The feedback from tactile input to visual output is quick, and thus the association can be readily made. 7.1.2 System Architecture This section wi l l provide a technical overview of the architecture of the FlowField system. Implementation details such as data structures and class hierarchy can be found in the author's B.Sc. thesis 2. The primary display of FlowField is a C A V E . A s mentioned previously, the model we used consists of four screens, arranged in a cube fashion, with screens to the left, front, and right, and one screen for the floor. The images for each of these screens are projected from stereo-capable C R T projector, and are driven by an SGI Onyx 3200 housed in an adjacent room. Graphics are rendered using OpenGL, and support for stereoscopic projection onto multiple screens is provided by the VRJuggler A P I . The SGI Onyx is responsible for driving the particle simulation. Despite its formidable processing power, it should be noted that rendering four stereoscopic displays means that essentially eight images needs to be generated for each frame, which considerably limits the frame rate when compared to rendering for a single display. With some optimization, we can achieve a particle system with around 300 spheres at reasonable frame rates. The obstructions are rendered as flattened cylinders. Since the SGI Onyx is housed in a separate room, and is already burdened with the graphics rendering, we used a separate computer to capture and process the input from the M T C Express. The M T C Express sends its data through a serial interface at 115,200 bits per second. This is sufficient to capture the pressure information for the 72 sensors at a reasonable rate, as each value is represented by an 8-bit integer. Since we simply map each pressure value to obstruction size in the simulation, we parse the pressure values from each data record and send the 72 integers as a token-delimited string to the SGI Onyx via sockets (Figure 7.3). Graphics t MTC Express Whole-hand input 72 8-bit integers Serial interface Linux PC Read from serial port / S H A R E D \ B U F F E R Send to SGI Onyx Token-deliminated strings T C P connection SGI Onyx Update obstruction sizes S H A R E D B U F F E R Parse incoming packets Figure 7.3: Data flow diagram for FlowField; boxes in each computer represent concurrently running threads 2 Chen, T. (2002). Applications for the MTC Express. B.Sc. Thesis, The University of British Columbia. 82 Both the Linux and SGI computers are running parallel threads to ensure concurrent execution. 
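The Linux-side capture-and-forward pipeline, which is described in more detail below, can be sketched roughly as follows. The record framing, serial port name, and host name are placeholders rather than the actual MTC Express protocol details, but the structure (a serial reader thread, a shared buffer, and a sender thread) mirrors Figure 7.3. The bandwidth arithmetic is worth noting: at 115,200 bits per second with 8-N-1 framing, roughly 11,520 bytes per second are available, so a 72-byte record can be delivered at most about 160 times per second before protocol overhead.

import socket
import threading
import time

import serial  # pyserial

NUM_TAXELS = 72                    # 6 by 12 grid of 8-bit pressure values

latest = [0] * NUM_TAXELS          # shared buffer holding the newest frame
lock = threading.Lock()

def reader(port="/dev/ttyS0"):
    """Read fixed-size records from the serial port into the shared buffer.
    (Hypothetical framing: one 72-byte payload per record.)"""
    ser = serial.Serial(port, baudrate=115200)
    while True:
        payload = ser.read(NUM_TAXELS)
        with lock:
            latest[:] = list(payload)

def sender(host="onyx", tcp_port=5000, hz=30.0):
    """Send the latest frame to the SGI Onyx as a token-delimited string."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, tcp_port))
    while True:
        with lock:
            frame = list(latest)
        sock.sendall((" ".join(str(v) for v in frame) + "\n").encode())
        time.sleep(1.0 / hz)

threading.Thread(target=reader, daemon=True).start()
sender()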
The Linux machine has one process to read data from the serial port and store it into a shared buffer; one that reads the latest pressure values from the shared buffer, composes a token-delimited string and sends it through sockets. The SGI Onyx has one thread that listens for incoming packets from the Linux machine and parses any strings that contain pressure values into integers and stores them into a shared buffer; multiple separate processes comprise the graphics subsystem and one of them updates the obstruction sizes from the data stored in the shared buffer. Our architecture design represents a logical subdivision of interface processing and graphics rendering tasks, using concurrency to reduce blocking execution in either input or output components. This separation of input processing from graphics processing is a critical step in abstracting input device processing from application requirements. We w i l l see that this separation allows for the execution of interpolation algorithms in attempts to refine the data input. A s is, this initial version of FlowField uses the raw integer values, and this has consequences for mapping to obstructions. 7.1.3 Mapping of Touch Input One of the important stages of creating a touch interactive application is how to map the input captured by the interface to some domain in the application space. In FlowField, we chose to arrange a grid of obstructions along one half of the cylinder—the half that is visible from the participant's perspective. This ensured that participants would not have to look around the entire cylinder to see where some of the obstructions are appearing. If obstructions were mapped to the rear o f the cylinder, they would never be visible because there is no display there. Figure 7.4: Obstructions seen in field of virtual particles; mapped to one half of cylinder only. The cylinder of virtual particles represents a 2D domain, so the obstructions should be interacting in that 2D space. Without moving obstructions, which is prohibitive due to the arrangement and number of sensors, a mapping of pressure to obstruction size seemed logical: the higher the pressure, the larger the obstruction. The obstructions are allowed to overlap, so 83 areas of high pressure would be represented by a cluster of larger obstructions, deflecting the particles as a coherent mass (Figure 7.2). Understandably, the impression of the hand in the scene does not precisely recreate the shape of the participants' actual hand; nevertheless, it was sufficient for a strong impact. 7.1.4 Evaluation of F lowField FlowField was demonstrated in the C A V E at the N e w Media Innovation Centre (NewMIC) Immersive Media Lab facility. Over several months, including an open house event, many visitors had the chance to experience FlowField. From an experiential standpoint, FlowField was well received as an engaging, dynamic showcase of both the C A V E and interaction with the M T C Express device. We conducted a small user study with ten subjects to substantiate the anecdotal reports. We sought to evaluate subjects' approach to using the touchpad interface and how they felt about the input device's effectiveness in facilitating the interactive task. Each subject was asked to simply to enter the installation and utilize the M T C Express any way they wished. To supplement the experimenter observations, subjects were also asked to complete a brief questionnaire after their participation. 
The questionnaire posed several statements to which the subjects would rank on a 5-point scale: strongly agree (SA), mildly agree ( M A ) , neutral (N), mildly disagree (MD) , strongly disagree (SD). The questionnaire form can be found in Appendix B . Every subject, without exception, initially used a single finger to touch the M T C Express, even after the multi-point property had been introduced to them. From the questionnaire responses, all subjects had worked with laptop touch pads before, so their approach of emulating touch pad usage is not surprising. The subjects eventually discovered the multi-point capability and how to apply it to the system. They had little difficulty understanding that the applied pressure of their hand affected the obstruction sizes (6SA, 2 M A , 2N). This effect is not obvious a priori considering the indirect mapping and low resolution of input data (1cm sampling interval). In general, the questionnaire results indicated that the provided interface worked well for performing the flow manipulation (4SA, 3 M A , 2N), and that they enjoyed their experience (7SA, 3 M A ) . While they were positive about the usefulness of the M T C Express' unique properties, many of them maintained that there were other ways the task could be performed (3SA, 3 M A , 3N). 7.1.5 Relating FlowField to Thes i s Contr ibut ions In this section, we w i l l show how FlowField motivated and influenced the contributions of this thesis, as well as led to more work described later in this chapter. The fundamental issue with FlowField is the search for an appropriate mapping between the input data and the application domain. The circular obstruction scheme was chosen as a simple means for mapping the shape of the hand based on applied pressure. This mapping was chosen due to the realization that it would be difficult to capture dynamic motion from 84 the low number and sparse arrangement of pressure sensors. A s such, we considered the best way to use the data values as is. This issue illustrates the need to examine the relevant physical properties of an interface and how these impact the suitability for a particular application. In the case of the M T C Express, we quickly identified that the spatial arrangement of the sensors was a critical factor in choosing the input mapping; in fact, the sparse arrangement precluded the reliable capture of finger-scale motion. These findings motivated two related tracts: the exploration of the space of surface-constrained touch interfaces with respect to a defined set of relevant characteristics; and the enumeration of physical hand actions that can be performed with these interfaces. The first, the exploration of touch interfaces, is addressed with our taxonomy of touch interfaces, presented in Chapter 4. With this taxonomy, we can see touch input capture capabilities of various interfaces, in terms of physical properties (position, force, etc.), input topologies (single/multi-point, contour), and resolution (hand-scale, finger-scale, and sub-millimetre scale), all characteristics which are relevant to touch interfaces. In particular, the M T C Express is clearly shown to capture hand-scale position of a high-order topology (rather than points), as well as direct pressure. The second, the enumeration of hand actions, is the subject of Chapter 5, where we identify basic classes of hand actions possible to perform with touch interfaces such as the M T C Express. 
With FlowField, we concluded that static actions using the whole-hand and one or two fingers were best supported by the interface, and this was reflected in the mapping chosen to represent the input data. These actions are a subset of the space of all possible surface-constrained hand actions, and Chapter 5 elaborates this space more fully, so that future interfaces and applications can be evaluated with respect to an established set of hand actions. Finally, with these two contributions, we can now refer to our design space and identify that the M T C Express is capable of capturing all hand- and finger-centric actions, but with low precision, which hampers the resolution of multiple fingers. Since its position measuring property is classified as hand-scale, the M T C Express is best used for capturing basic hand-actions, single-finger actions, and some multi-finger actions, provided the fingers remain far apart. The capture of dynamic actions is difficult due to the lack of point tracking, thus static hand shapes and simple single- and two-finger configurations are the most suitable forms of input. 7.2 Three Investigations Motivated by FlowField Our experience with FlowField motivated our subsequent efforts in three directions. First, we introduced an intermediate stage of data processing in order to address the limitations of the raw data the M T C Express produced. The topology and resolution of the raw data itself indicates, through our design space, that certain actions are possible; however, by processing the data, we explored whether additional actions can be recognized, thereby increasing the usefulness of the device. Our second direction is informed by our organization of hand actions and the resulting design space. We selected a hand action that can be suitably captured by the M T C Express and 85 devised applications that specifically leverage these actions in an intuitive manner. The final direction explores an interface with different properties altogether, as an investigation of interface possibilities indicated by openings in the touch interface taxonomy. 7.2.1 Investigation of Improving Data P roduced by Interface A major effort that proceeded after FlowField was the development of an architecture that abstracted application requirements away from input device data3. Wi th such an abstraction, we can consider the M T C Express and FlowField as separate entities, and can independently pursue development paths for each. This approach is not only motivated by our findings with FlowField, but is prevalent in software and hardware architectures. We propose a multi-layer, hierarchical framework for touch interface application design to identify logical stages and data paths between the hardware and application. 7.2.1.1 Layered Framework for Touch Interface Application Design We propose three stages that occur between the hardware and application stages o f a touch interactive system. The Interface stage represents the source of the data from the hardware device. These are typically in some discrete, finite representation, and can also include built-in routines to compress or normalize the data before output. The Processing stage performs numerical algorithms to adjust the data to a different format or topology, such as interpolation or parameterization. The Interpretation stage specifically deals with touch interaction as it performs the recognition of hand postures and gestures. 
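One possible way to realize these stages in software is as a chain of simple callables, as sketched below; the function names and placeholder algorithms are illustrative only and do not correspond to the actual implementation.

import numpy as np

def interface_stage(raw_bytes):
    """Interface: unpack and normalize one 72-value, 8-bit device record."""
    grid = np.frombuffer(raw_bytes, dtype=np.uint8).astype(float)
    return grid.reshape(6, 12) / 255.0

def processing_stage(grid):
    """Processing: reformat the data, e.g. upsample the coarse grid (placeholder)."""
    return np.kron(grid, np.ones((4, 4)))

def interpretation_stage(grid):
    """Interpretation: recognize higher-order constructs (placeholder classifier)."""
    return "press" if grid.max() > 0.2 else "idle"

def application(action):
    """Application: react to the interpreted hand action."""
    print("action:", action)

# Data may traverse one or more stages, generally in hierarchy order:
record = bytes(np.random.randint(0, 256, 72, dtype=np.uint8))
application(interpretation_stage(processing_stage(interface_stage(record))))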
Figure 7.5: Layered framework for touch interactive applications. Stages from bottom to top: Hardware (MTC Express, other technologies); Interface (data compression, normalization); Processing (interpolation, parameterization); Interpretation (gesture recognition, action classification); Application (FlowField).

Figure 7.5 shows the framework with examples of components that fall into each stage. Arrows indicate the possible paths that data can follow from the hardware device to the application itself; data can traverse one or more stages, but generally in order of the hierarchy, from bottom to top. There is no requirement that each stage be explicitly separated from each other or from either the hardware or application. For example, the Interface stage is typically performed in the hardware, but it can be initially implemented in software for testing. The converse can also be true: once interpolation and recognition algorithms have been designed, they can be integrated into the hardware to augment the device's capabilities.

3 Chen, T., Fels, S. & Min, S.S. (2003). FlowField and Beyond: Applying Pressure-sensitive Multi-point Touchpad Interaction. Proceedings of IEEE International Conference on Multimedia and Expo (ICME2003), 49-52.

With this framework, we recognized that the major issue with FlowField is the absence of a reliable means to interpret the incoming pressure data as any coherent hand action. This is the responsibility of the Interpretation stage, but we felt that this was truly hindered by the sparseness of the sensors, and therefore some processing of the raw data would be helpful. The next section describes the efforts made in this regard.

7.2.1.2 Interpolation of Raw Data

The Processing stage is responsible for reformatting the incoming data into forms that can be more useful. MTC Express data from the Interface stage is in the form of 72 8-bit pressure values, corresponding to a 6 by 12 grid of Taxels. It is these 72 pressure sensors that capture the information provided by the hand and fingers. A representation of the data from the MTC Express is shown in Figure 7.6. Each square represents one Taxel, and the intensity of the square corresponds to the pressure value. The data represents a hand being placed on the interactive area of the MTC Express. Four fingers can barely be discerned.

Figure 7.6: Raw data values from the MTC Express

It is evident that it is difficult to determine the number and positions of the fingers with such a representation, even with our human optical recognition abilities. In the hope of improving this situation, we applied some image processing algorithms to attempt to better represent the incoming data, at least visually. We applied bilinear, bicubic, and Gaussian filters to what can be considered a 6 by 12 pixel image. Clearly, this step falls within the Processing domain as we are adding information to the original data, and we are not interpreting any higher-order constructs such as gestures. Figures 7.7, 7.8, and 7.9 show the results of interpolation. From these figures we can see that the images are markedly improved from a visual perspective, and the various regions of the hand can be readily identified. These results, however, come at increased computation cost and increased memory and storage requirements from the application perspective.
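A minimal sketch of this kind of Processing-stage interpolation, treating the 72 values as a 6 by 12 image, is shown below; the exact filters and upsampling factors used to produce Figures 7.7 to 7.9 are not reproduced here, so the parameters are placeholders.

import numpy as np
from scipy.ndimage import zoom, gaussian_filter

def interpolate_frame(values, factor=8):
    """Upsample one 72-value MTC Express frame (a 6 by 12 grid of pressures)."""
    grid = np.asarray(values, dtype=float).reshape(6, 12)
    bilinear = zoom(grid, factor, order=1)                    # bilinear
    bicubic = zoom(grid, factor, order=3)                     # bicubic
    smoothed = gaussian_filter(bicubic, sigma=factor / 2.0)   # Gaussian filtering
    return bilinear, bicubic, smoothed

# Example: a synthetic frame with two pressed regions.
frame = np.zeros(72)
frame[14], frame[40] = 200.0, 150.0
for image in interpolate_frame(frame):
    print(image.shape)   # each is (48, 96)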
One issue that cannot be addressed with these techniques is the poor recognition of actions involving fingers. This is again due to the spacing of the sensors, as well as the pressure-distributing substrate that serves as the interactive surface. The effect of these hardware attributes is that pressure applied by a finger can alternate between falling directly on a sensor and falling between sensors, resulting in a fluctuating representation of the pressure values. This phenomenon is clarified in Figure 7.10.

Figure 7.10: Pressure value fluctuations due to traversal of a finger over and between sensor locations (circles)

Nevertheless, the improvement offered by these interpolation techniques shows promise in processing the data provided by the MTC Express for better use in applications. We demonstrate a usage of this interpolated data in an updated version of FlowField.

7.2.1.3 Relating to Design Space

The layered framework proposed in this section is a means to logically separate important components in a touch interactive system, and is a practical guide for design and implementation. Its focus is on the handling of actual data and the various levels of processing applied to it. Applications can certainly benefit from techniques and algorithms that are developed to enhance the usability of data from a touch interface, and these new capabilities can be reflected in our design space. However, if efforts to enhance the capture capabilities of an interface prove difficult, this can serve to reinforce the design space mapping. For example, in the next section we will see that despite the use of the interpolated data, FlowField is still only able to make use of a limited set of hand actions, similar to the original application. This indicates that our efforts, while having yielded a visual improvement, have done little to enhance the usefulness of the MTC Express. Therefore, we can conjecture that further efforts to process and interpret the data may likewise be ineffective.

7.2.1.4 Improving FlowField using Interpolated Data

To illustrate the use of the interpolated data from the MTC Express, we developed a new version of FlowField. In this version, we used the interpolated data with an updated disruption mechanism where the pressure values are mapped to force vectors which accelerate and decelerate particles accordingly. We eliminated the use of obstructions and emphasized the disruption aspect. Each of the interpolated values is mapped to a force vector whose magnitude depends on the pressure associated with that location. High pressure values act to slow particles down. The direction of the force vector depends on its location within the area of contact. The net effect is that particles are deflected around the areas of high pressure caused by applying the hand and fingers onto the MTC Express.

Figure 7.11: Snapshots of FlowField 2: force vectors are represented in yellow

To emphasize the disruption effects, the number of particles in the system is substantially increased, and the speed of the particle flow is also increased. Static disruptions arranged in the particle flow with greater density result in a compelling flow interruption effect (Figure 7.11). This version of FlowField is important in two ways. First, it is an instance where an application, FlowField, obtains data from an input device, the MTC Express, via a different path in our layered framework, going through the Processing stage.
This is achieved with the adaptation of the application to use interpolated data, which is processed from the raw data and provided through an overloaded read routine. We can anticipate further improvements to 90 FlowField i f advances in the Interpolation stage are made, thus resulting in the recognition and tracking of hand shapes. Secondly, the introduction of interpolated data has no effect on the design space mapping of the M T C Express. The same actions that can be performed with the original FlowField can be used with this newer version. It is the virtual environment and mapping of the pressure information that was updated to better use the new set of data. Also from our design space, we recognize that there are additional actions that are not exploited in FlowField, namely those of dynamic motions. We proceed in the next section to outline some applications that take advantage of these actions. 7.2.2 Investigation of Suitable Hand Act ions for the Interface So far, FlowField has demonstrated the usage of static hand and finger press actions with the M T C Express. This has motivated exploring ways to refine the raw data from the device, but this has not enabled the capturing of the actions we desired for the original FlowField concept. Nevertheless, from the design space, some dynamic actions should be possible with the M T C Express, and in this section we explore one particular type of action that mitigates the inherent low resolution of the sensor. Dynamic hand-centric actions maybe more readily captured by the M T C Express, since such a large area should still impact on a sufficient number of sensors. We focused on the hand-centric roll , which characterized by the rocking of a part of the hand along the surface (Figure 5.9). The sensing surface should see the contact area move smoothly from one location to another, with the cross section changing due to the constantly variable area of contact between the hand and the surface. A hand roll can be easily differentiated from a wipe with pressure information—it is easier to see that one area of lighter pressure w i l l be "roll ing" into an area of higher pressure as the roll progresses, than i f it were just a uniform-pressure blob morphing across a distance. It is not obvious what a roll action, when captured, could actually control in the context of an interactive application. Sculpting actions, including clay and dough manipulation, are possibilities, as well as massage. These tasks are fairly complex in nature and involve more than just hand roll actions. Another roll-related task is the balancing of a surface on an up-facing palm, such as with a waiter's tray, though usually the wrist and arm are involved as well . We call this the "balanced-sheet" metaphor. The latter task inspired an interaction metaphor that can be controlled by a rolling action. Instead of balancing a tray to counteract the effect of gravity, imagine now controlling a rigid sheet that is either balanced on an inverted sphere or bisecting a sphere. The hand can apply forces on any part of the sheet, rotating it in that direction. A roll action is suitable to control this type of task since the hand can act as a proxy for the sheet, able to finely control its roll direction and magnitude. Two applications using this method of control are now presented. 7.2.2.1 3D Model Viewer Conventional 3D model viewer interaction is often treated as a virtual trackball, since it is necessary to perform a 3D transformation given 2D input. 
A mouse click-and-drag action is used to rotate this trackball from one orientation to another in order to view a model from different angles. Even for users who are adept at mouse usage, this type of control is sometimes problematic and unintuitive. We proposed an alternative to this established method for 3D model transformation which can be performed using the balanced-sheet metaphor. Using the MTC Express, the centroid (Cx, Cy) of applied pressure was computed, along with a magnitude Mag given by the total applied pressure:

Cx = Σ(p_i · x_i) / Σ p_i,   Cy = Σ(p_i · y_i) / Σ p_i,   Mag = Σ p_i

where p_i is the pressure reported by the sensor at location (x_i, y_i), with coordinates measured from the centre of the pad. The sheet is then rotated on an axis intersecting the origin that is orthogonal to the vector defined with endpoints at the origin and the centroid. The amount of rotation depends on the applied pressure and can be adjusted with a linear or non-linear multiplier as desired. The rotation is implemented with the following OpenGL call:

glRotatef(Mag, -Cy, Cx, 0.0);

The result is a viewing mechanism that has all the functions of a mouse-controlled viewer. The user need only adjust the applied pressure in a certain direction in order to effect a rotation in the same general direction. In the same action, the pressure is used to control the magnitude of the rotation. It remains future work to evaluate the performance of this mechanism against conventional mouse-based control. Discarding factors related to mouse familiarity, it can be seen that this type of control provides an alternative to mouse usage, which may address some ergonomic concerns.

The roll action that is used in this application can take many forms, including using the palm of the hand or a closed fist. This application can conceivably also be performed with a single- or multiple-finger push. With pressure sensitivity, the amount of rotation is coupled to the applied pressure; otherwise, this magnitude can only be coupled to some relative distance measure such as drag distance. This application demonstrates one such use of a hand roll, although a roll is not the exclusive action that can provide control in this case. However, it does demonstrate that the attributes the MTC Express provides (low sensor resolution, multi-point sensing, and pressure sensing) are suitable for capturing a hand roll action. In order to consider controlling this application with a device of different attributes, non-roll actions may need to be considered. For example, a laptop touch pad provides only position information. As mentioned above, the magnitude of rotation must then be tied to the relative drag distance of the finger, so a finger drag action is used in this case. For a pressure-sensitive drawing tablet, the rotation magnitude can be controlled by pressure, but the direction of rotation depends on pointer position, which can only be changed through a drag action. Only with a multi-point, pressure-sensitive input device can a hand roll truly be performed.

Figure 7.12: 3D model viewer using hand-centric roll as input: centred teapot (top); rotated teapots (left) with corresponding pressure input data (right) represented as greyscale image (dark corresponds to high pressure)

7.2.2.2 Labyrinth Game

As a further exploration into how a roll action can be applied in an interactive application, the balanced-sheet metaphor was implemented in a game. The objective of the labyrinth game is to control the slope of a playing surface in order to navigate a marble through a maze lined with walls while avoiding holes.
The physical incarnation of this game involves two rotary controls, one of which controls left-right rotation of the board; the other the front-back rotation. Collectively, these slope the maze surface in any orientation in order to roll the marble in the desired direction. 93 Figure 7.13: Screenshot of Xtreme Labyrinth Naturally, this is quite a challenge, not only with the need to avoid the holes, but with the control mechanism which decomposes the task of balancing a 2 D surface to a pair of 1D input controls. One may consider tipping the playing field directly by means of pushing on the corners, but this is confounded by the typical construction of the maze from two separately rotating sections, one enclosed in the second. Xtreme Labyrinth is a computer version of this game programmed by Ryan Westphal. It is freely available on the web including the source code. It was created for Windows and uses OpenGL and S D L (Simple DirectMedia Layer) libraries. The rotation of the two sections was controlled by using the mouse: left-right rotation by horizontal mouse movement; front-back rotation by vertical mouse movement. The rotation dimensions were originally independent of each other, at least in the physical implementation, but they are intrinsically coupled because of the usage of a mouse. This makes the game quite difficult to control. O f course, there can be many schemes devised that can make the game easier, but that is not the goal of this exercise. Instead, we implement the balanced-sheet rolling mechanism as an alternative input method. The game was first ported to run under the Linux environment, which was not difficult due to the availability of the S D L libraries for Linux. The M T C Express-controlled plane rotation mechanism was inserted into the code in place of the mouse-based control. Again, the centroid of applied pressure dictated the direction and magnitude of the rotation. The result is an application that is clearly is suitable to be controlled by this balanced-sheet 94 mechanism, and thus possible by a roll action, though not exclusively. O f course, the game still remains a challenge even with the new mechanism, but the control mechanism may be considered more intuitive than a mouse or keyboard control scheme. Future work can explore this possibility through user studies. 7.2.2.3 Relating to Design Space These applications demonstrate the use of a hand-centric action with the M T C Express, something that was not useful in FlowField. In our design space, this action is deemed to one that is supported by the device with relative success despite the low sensor resolution. Other dynamic actions may be explored, but these may be less successfully captured. 7.2.3 Investigation of Different Interface Attributes Our experiences with the M T C Express suggest that there would be difficulty capturing dynamic finger-centric actions. In previous sections we have formalized a layered approach to touch interface design, which was expanded through refining the incoming raw data, in an effort to create applications that take better advantage of the device's capabilities. In this section, we describe a new interface that has vastly different capture characteristics. One of the purposes of our touch interface taxonomy is to show what types of properties existing devices can capture, as well as indicate where some potential as-yet-created devices may fit in. 
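To make the balanced-sheet control shared by these two applications concrete, the following sketch computes the pressure centroid and rotation magnitude from a 6 by 12 frame and prints the arguments that would be handed to glRotatef; the sensor spacing and gain are assumptions for illustration only.

import numpy as np

ROWS, COLS, SPACING = 6, 12, 1.0   # assumed 1 cm sensor spacing

def balanced_sheet(frame, gain=0.05):
    """Return (Mag, Cx, Cy): a rotation magnitude and the pressure centroid,
    with coordinates measured from the centre of the pad."""
    p = np.asarray(frame, dtype=float).reshape(ROWS, COLS)
    total = p.sum()
    if total <= 0.0:
        return 0.0, 0.0, 0.0
    ys, xs = np.mgrid[0:ROWS, 0:COLS] * SPACING
    xs -= xs.mean()                 # centre the coordinate frame
    ys -= ys.mean()
    cx = (p * xs).sum() / total
    cy = (p * ys).sum() / total
    return gain * total, cx, cy     # gain plays the role of the multiplier

frame = np.zeros(72)
frame[5] = 180.0                    # a press near one corner of the pad
mag, cx, cy = balanced_sheet(frame)
# In the renderer this would drive: glRotatef(mag, -cy, cx, 0.0)
print(mag, -cy, cx)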
The taxonomy shows that there are few devices that can capture precise multi-finger input, and none that can capture directly applied pressure as well . We recognized that capacitive sensing, while suitable for capturing precise position information, may make pressure sensing difficult; the converse is true with the technology found in the M T C Express. Pressure can best be captured with a sensing mechanism that can physically bend depending on applied forces. This mechanism should be tracked for minute changes that can be computed into position information. We decided on using a deformation surface whose deformations are tracked using video technology. 7.2.3.1 System Architecture Our malleable input surface is made of a latex glove 4 . This rubber sheet is stretched tautly over a rigid frame to form a flat interaction surface. A video camera is positioned underneath to capture input information applied to the rubber sheet (Figure 7.14). B y applying the hands and fingers onto the interaction surface, deformations are made that can be appropriately lit and captured easily by video techniques. B y patterning the surface underneath with dots or other markers, position tracking and deformation characteristics such as surface normals and depth can be determined. Our input device exhibits many important attributes of whole-hand interfaces involving surfaces. Computer vision facilitates multi-point sensitivity, suitable for interaction with the whole-hand, instead o f a single point. Pressure sensing and can be determined through video processing of deformation images. Furthermore, passive haptic feedback is inherently possible due to the malleable nature of the surface. 4 Vogt, F., Chen, T., Hoskinson, R. & Fels, S. (2004). A Malleable Surface Touch Interface. Sketches and Applications of ACM SIGGRAPH. 95 Figure 7.14: Images of Malleable Surface Interface: video camera positioned underneath interface (left); interface with tracking dots visible (top right); deforming the interface (bottom right) 7.2.3.2 Proof-of-Concept Application To demonstrate an application of our device, a virtual simulation of a spring-mass model was used. This simulation is derived from the KineticsKit Python module. Because a plane best corresponds to the input topology of our device, we chose a planar grid of masses connected by springs to comprise the interactive domain. A multi-point device can be used to manipulate multiple masses simultaneously, something that would be difficult with a mouse. Our interface also supports squeezing and stretching actions, which are types of manipulation of the virtual grid that simple downward pressure cannot provide. Although position tracking of multiple points can detect such movement-based actions as a stretch, only a malleable surface can actually provide an analogous passive haptic feedback. For example, lateral translation of the input domain by means of stretching the surface material can be mapped to x- and y-translations of the system objects in the simulation. A s a comparison, this spring-mass simulation was interfaced with the M T C Express. It was only sensible to map the applied pressure to the downward movement of the masses. The lack of a malleable surface does not provide much useful haptic feedback; the semantics to produce xy-plane movement would also be different, requiring tracking of fingers sliding across the surface. 
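Returning to the interface itself, the conversion from tracked dots to deformation estimates can be sketched as follows. This is a hypothetical illustration only (the tracking and correspondence algorithm is not detailed here), and it assumes that dot correspondences between the rest frame and the current video frame are already known.

import numpy as np

def deformation_estimates(rest_dots, current_dots):
    """Given matched dot positions (N x 2 image coordinates) for the undeformed
    surface and the current video frame, return per-dot displacement vectors and
    a scalar push-depth proxy per dot (the displacement magnitude)."""
    rest = np.asarray(rest_dots, dtype=float)
    current = np.asarray(current_dots, dtype=float)
    displacement = current - rest
    depth_proxy = np.linalg.norm(displacement, axis=1)
    return displacement, depth_proxy

# Example: a 3 by 3 patch of tracking dots with the centre dot pushed.
rest = 20.0 * np.array([[x, y] for y in range(3) for x in range(3)], dtype=float)
pressed = rest.copy()
pressed[4] += [3.0, -2.0]           # centre dot appears displaced in the image
vectors, depth = deformation_estimates(rest, pressed)
print(depth.round(2))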
96 Figure 7.15: KineticsKit spring mass model: interfaced with Malleable Surface Interface (left); interfaced with MTC Express (right) 7.2.3.3 Relating to Design Space It is a helpful study to try using a different technology to support the capture of several actions that would be difficult or impossible with the M T C Express, with which we have focused most of our efforts thus far. The deformability of the surface provides a different sensation to the end user, as movements with the hands deform the interface, rather than traverse the surface, as with most other interfaces. There are several existing interfaces that make use of deformable surfaces [Sinclair97] [White98], including one that actually recognizes forces of applied input through deformations [Kamiyama04]. The Malleable Surface Interface differs from all of these and can be considered as a multi-point movement sensing interface, which is notably an empty area in our taxonomy of touch interfaces (Figure 4.4). The types of actions that are supported with this interface are technically found in our organization of hand actions—hand-centric press, roll , and twist, as well as finger-centric press, drag, coordinated drag and twist—but these are differently captured due to the deformable surface. In fact, given a surface that can deform arbitrarily, we can theoretically begin to capture actions that can be considered free gestures, no longer strictly surface-constrained actions. We are still in the process of developing and investigating this interface, and we can anticipate that it w i l l bring to light an understanding of the relationship between unconstrained gestures and constrained gestures, therefore better unifying the research on touch interaction in general. 7.3 Summary In this chapter, we presented several applications of surface-constrained touch interactive applications. We began with FlowField, which was developed to explore the capabilities of the M T C Express. Based on this experience we continued to proceed in three distinct directions which have informed the contributions of the thesis: the refinement of interface data to improve suitability for touch applications, the exploration of suitable hand actions that take advantage the M T C Express, and the design and development o f a novel interface which can support different sets of actions. The first direction embodies a software development process that is formalized with a layered 97 architecture which effectively abstracts hardware and software, which is consistent with established software engineering principles. This implementation-level framework complements the design space which specifies the suitability of a particular interface to capture certain hand-actions. The determination of suitable actions is derived from our experience with developing FlowField, and with a formalization of these actions, we can create applications to explore different supported actions, which represents the second major direction of investigation. The third direction explores a location in the touch interface taxonomy that is empty, and specifically explores properties and hand-actions that are different than those of the M T C Express. This study shows the effectiveness of the taxonomy in informing the design of novel interfaces. It also demonstrates that the movement vocabulary is useful for quickly describing the types of hand-actions an interface supports. 
A l l the studies in this chapter have informed the development of concepts relating to whole-hand touch interaction, which is an area with several disparate efforts which can benefit from some organization and integration. We find that during this process, we are continually inspired with new ideas for applications and interfaces alike, and these explorations in turn further the development of the ideas in this thesis. This is clearly an ongoing process, and we wi l l touch on some possible directions in the future works section to follow in the next chapter. 98 Chapter 8: Conclusions and Future Work This thesis represents a major step in the exploration of whole-hand surface interaction. In many past efforts, touch devices have been created and used as alternative user interfaces for conventional mouse-based graphical interaction. Some recent developments have expanded the range of interaction possibilities to multiple finger sensing and pressure sensing, but without widespread application support, they were of limited appeal. Our research focused on organizing the space of touch interfaces, including those suitable for whole-hand touch, in order to assist application developers and interface designers. The organization represents a design space, where correspondences between interface attributes and supported hand actions can be identified. We wi l l summarize these contributions in the following sections. 8.1 Contributions 8.1.1 Taxonomy of T o u c h Interfaces Based on previous work done by MacKinlay et al. [MacKinlay90] with respect to input devices in general, we refined adapted their taxonomy of input devices to better classify touch interfaces. We identified a major partition between point-sensing and higher-order sensing (contour) which has an effect on the capture capabilities of touch interfaces. We further classified sensor resolution as this w i l l affect a device's ability to capture finger-scale input. We retained the classification of property types, including position, movement, and pressure, and added the possibilities of proximity sensing and inferred pressure sensing to classify 99 interfaces with those attributes. Finally we modified the classifying of dimensionality to acknowledge the invariant that touch interfaces capture planar input (combined X7Y dimensions), and do not have mechanisms that physically capture rotary input. This taxonomy presents a view of the range of touch interfaces and their respective capabilities, in terms of physical properties. A s such, we can quickly determine the properties any particular interface can capture, as well as see what similar devices exist. We can also see where there is potential for new interface designs that capture a combination of properties never before attempted. 8.1.2 Enumerat ion of Surface-constrained Hand Act ions There are many classifications of unconstrained hand actions for gesture studies, and there are also many applications that use the hand in a variety of ways to provide input onto a surface. We have drawn on these works to propose a comprehensive list of surface-constrained hand actions, which is the interaction relevant to our research. We have organized these actions between hand-centric and finger-centric actions and further characterized them by transformation properties (in-place, translation, rotation). Even within each action, there are myriad variations on fingers in use, anchor position, and motion characteristics. 
With this set of actions, we can begin to consider the possible combinations of complex gestures within the context of surface interaction. We have also identified several relevant non-action attributes of whole-hand interaction that add further richness and complexity to the space of possible touch interactions. Some of these include bimanual input, handedness, and pressure variation. Our efforts highlight the difficulty in characterizing surface-constrained touch, and serve only as a proposed start point for more research. 8.1.3 Design Space of T o u c h Interfaces Our design space draws on interface attributes identified in our touch device taxonomy, as well as the enumeration of surface-constrained hand actions, to organize knowledge about touch interfaces. Integrating hand actions into the movement vocabulary, we are able to present a visualization of the design space that organizes devices based on their ability to capture certain hand actions, which we believe is more relevant than the measurement properties considered in previous work. The coverage of the design space spans all existing interfaces and reveals three basic classes of touch interfaces: those that capture single-point in-place and translation actions; those that support multi-point in-place, translation, and rotation-about-Z actions; and those that support the full range of actions. A nascent category of 3D pressure sensitivity has also emerged. The design space is used not only to study the properties of existing devices, but also to explore new interfaces by proposing new mappings within the space. These can be evaluated through expressiveness and effectiveness metrics which gives a systematic way of assessing new technology in this field. 100 8.1.4 Appl icat ions and Related Contr ibut ions A s motivations and as demonstrations of applying the major contributions of the thesis, several software applications have been developed and are available for future consideration and study. These included FlowField, an interactive installation which used the M T C Express as its primary interface. This was the initial inspiration for our work, and limitations and difficulties here led to efforts guided by conceptual layered framework for touch interface application development. These further efforts took three directions: first, to refine the data input to attempt to capture more information with the interface; second, to create, using our design space, applications that can better take advantage of the hand actions that the interface can capture readily; and third, based on our taxonomy, experiment with creating an interface with different attributes. The first direction culminated in interpolation algorithms for the raw data from the M T C Express, and an updated version of FlowField which used this interpolated data in a more effective manner. The second was demonstrated by two applications, a 3D model viewer and an interactive labyrinth game, which used the hand-centric roll action. The third resulted in the development of an interface that was created to explore the possibilities of a deformable surface which is captured using video technology. This also represents not only an unexplored area in our touch interface taxonomy, but also raises some interesting issues when considering touch interaction with non-rigid surfaces. This w i l l likely motivate future expansion of our work. 
Additional support software used during the development of these applications included a Linux driver distribution for the M T C Express, which has since been released to prospective developers, as well as a Python wrapper for the driver. 8.1.5 Publ icat ions Chen, T. (2002). Applications for the MTC Express. B.Sc. Thesis, The University of British Columbia. Chen, T., Fels, S. & Schiphorst, T. (2002). FlowField: Investigating the Semantics of Caress. Conference Abstracts and Applications of ACM SIGGRAPH, 185. Chen, T., Fels, S. & M i n , S.S. (2003). FlowField and Beyond: Applying Pressure-sensitive Multi-point Touchpad Interaction. Proceedings of IEEE International Conference on Multimedia and Expo (ICME2003), 49-52. Vogt, F., Chen, T., Hoskinson, R. & Fels, S. (2004). A Malleable Surface Touch Interface. Sketches and Applications of ACM SIGGRAPH. Poster and demonstrations at A S I Exchange 2002, 2003; H C I @ U B C 2003. 8.2 Future Work This research is a comprehensive effort within an even larger area of study, that of gesture and touch interaction. Some possible directions for future research wi l l be proposed below. 101 • Extending this research to cover general touch interaction with surfaces that are not necessarily flat. Such work is an order of magnitude more complex, but w i l l be relevant when interactive touch surfaces are used in new and interesting ways. • Extending research to examine issues and differences associated with bimanual interfaces, multi-user interaction, and handedness. • Extending research to investigate deformable interfaces which begins to enter into the area of 3D touch and haptic feedback. • Empirical evaluations o f touch interfaces described in this research, especially for effectiveness metrics. • Further development of organization of possible hand actions to include bimanual interaction, interaction with non-planar surfaces, and emotional touch; possible correlation with prehensile actions. • Investigation o f surface-constrained gestural language sets for use in application control and symbolic expression; the combination of our movement vocabulary to form syntactic and lexical units. • Develop formalizations for sets of hand actions for use in input/output domains of input device representation. • Further development of interesting applications for the M T C Express. • Improve reconstruction efforts of M T C Express data, possibly using curve-fitting schemes. 8.3 Conclusion This thesis has proposed an organization on the vast subject of touch interaction. We feel that these contributions wi l l guide future efforts in this area, as they have with our own experiences working with touch interfaces. Hopefully this area can be better understood so that we can be confident in creating interfaces that truly take advantage of the magnificent abilities of the human hand. 102 References [Buxton83] Buxton, W . (1983). Lexical and Pragmatic Considerations of Input Structures. Computer Graphics, 17, 31-37. [Buxton85] Buxton, W., H i l l , R. & Rowley, P. (1985). Issues and Techniques in Touch-Sensitive Tablet Input. Proceedings of ACM SIGGRAPH, 215-224. [Bleser91] [Card90] [Card91] [Chen02a] [Chen02b] Bleser, T .W. (1991). An Input Device Model of Interactive Systems Design. Doctoral Dissertation, The George Washington University. Card, S.K., Mackinlay, J .D. & Robertson, G . G . (1990). The Design Space of Input Devices. Proceedings of ACM S1GCHI, 117-124. Card, S.K., Mackinlay, J.D. & Robertson, G . G . (1991). 
A Morphological Analysis of the Design Space of Input Devices. ACM Transactions on Information Systems, 9 (2), 99-122, 1991. Chen, T. (2002). Applications for the MTC Express. B.Sc. Thesis, The University of British Columbia. Chen, T., Fels, S. & Schiphorst, T. (2002). FlowField: Investigating the Semantics of Caress. Conference Abstracts and Applications of ACM SIGGRAPH, 185. [Chen03] Chen, T., Fels, S. & M i n , S.S. (2003). FlowField and Beyond: 103 Applying Pressure-sensitive Multi-point Touchpad Interaction. Proceedings of IEEE International Conference on Multimedia and Expo (ICME2003), 49-52. [DietzOl] Dietz, P. & Leigh, D . (2001). DiamondTouch: A Multi-User Touch Technology. Proceedings of ACM UIST, 219-226. [Foley74] Foley, J.D. & Wallace, V . L . (1974). The Art of Natural Graphic Man-Machine Conversation. Proceedings of the IEEE, 62, 462-471. [Foley82] Foley, J .D. & van Dam, A . (1982). Fundamentals of Interactive Computer Graphics. Boston, M A : Addison-Wesley Longman Publishing Co. , Inc. [Foley84] Foley, J.D. & Wallace, V . L . & Chan, P. (1984). The Human Factors of Computer Graphics Interaction Techniques. IEEE Computer Graphics and Applications, 4(11), 13-48. [Gibson62] Gibson, J.J. (1962). Observations on active touch. Psychological Review, 69(6), 477-490, 1962. [Heller91] Heller, M . & Schiff, W . (Eds.). (1991). The Psychology of Touch. Hillsdale, N J : Lawrence Erlbaum Associates, Inc. [Hinckley99] Hinckley, K . & Sinclair, M . (1999). Touch Sensing Input Devices. Proceedings ofACMSIGCHI, 223-230. [IwataOl] Iwata, H . , Yano, H . , Nakaizumi, F. & Kawamura, R. (2001). Project F E E L E X : Adding Haptic Surface to Graphics. Proceedings of ACM SIGGRAPH, 469-475. [Jacob94] Jacob, R.J . , Sibert, L . E . , McFarlane, D . C . & Mul len Jr., M . P . (1994). Integrality and Separability of Input Devices. ACM Transactions on Computer-Human Interaction, I, 3-26. [Kamiyama04] Kamiyama, K . , Hiroyuki , K . , Kawakami, N . & Tachi, S. (2004). Evaluation of a Vision-based Tactile Sensor. Proceedings of 2004 International Conference on Robotics and Automation, WP-6. [Katz89] Katz, D . (1989). The World of Touch. (Krueger, L . E . , Trans.). Hillsdale, N J : Lawrence Erlbaum Associates, Inc. [Lee85] Lee, S.K., Buxton, W. , & Smith, K . C . (1985). A Multi-touch Three Dimensional Touch-Sensitive Tablet. Proceedings of ACM SIGCHI, 21-25. [Mackinlay90] Mackinlay, J.D., Card, S.K. & Robertson, G . G . (1990) A semantic analysis of the design space of input devices. Human-Computer 104 Interaction, 5 (2-3), 145-190. [MacKenzie94] [McLaughlin02] [McNeill92] [Minsky84] [Mulder96] [Oka02] [OverholtOl] [Rekimoto02] [Schiff82] [Schiphorst02] [Sherr88] [Sinclair97] [Sturman93] [Vogt04] Mackenzie, C . & Iberall, T. (1994). The Grasping Hand. Amsterdam: North-Holland. McLaughl in , M . L . , Hespanha, J.P. & Sukhatme, G.S. (Eds.). (2002). Touch in Virtual Environments. Upper Saddle River, N J : Prentice Hal l PTR. M c N e i l l , D . (1992). Hand and Mind. Chicago: The University of Chicago Press. Minsky, M . (1984). Manipulating Simulated Objects with Real-world Gestures using a Force and Position Sensitive Screen. Proceedings of ACM SIGGRAPH, 195-203. Mulder, A . (1986). Hand Gestures for HCI. Technical Report 96-1, Simon Fraser University. Oka, K . , Sato, Y . & Koike , H . (2002). Real-time Tracking of Multiple Fingertips and Gesture Recognition for Augmented Desk Interface Systems. Proceedings of Fifth IEEE International Conference on Automatic Face and Gesture Recognition, 429-434. 
Overholt, D . (2001). The M A T R I X : A Novel Controller for Musical Expression. New Interfaces for Musical Expression Workshop at ACM SIGCHI. Rekimoto, J. (2002). SmartSkin: A n Infrastructure for Freehand Manipulation on Interactive Surfaces. Proceedings of ACM SIGCHI, 113-120. Schiff, W. & Foulke, E . (Eds.). (1982). Tactual Perception. Cambridge: Cambridge University Press. Schiphorst, T., Lovel l , R. & Jaffe, N . (2002). Using a Gestural Interface Toolkit for Tactile Input to a Dynamic Virtual Space. Conference Extended Abstracts of ACM SIGCHI, 754-755. Sherr, S. (Ed.). (1988). Input devices. Boston, M A : Academic Press. Sinclair, M . (1997). The Haptic Lens. Visual Proceedings of SIGGRAPH, 179. Sturman, D.J . & Zeltzer, D . (1993). A Design Method for "Whole-Hand" Human-Computer Interaction. ACM Transactions on Information Systems, 11, 219-238. Vogt, F., Chen, T., Hoskinson, R. & Fels, S. (2004). A Malleable 105 Surface Touch Interface. Sketches and Applications of ACM SIGGRAPH. [WestermanOl] Westerman, W . & Elias, J .G. (2001). Multi-Touch: A New Tactile 2-D sture Interface for Human Computer Interaction. Proceedings of the Human Factors and Ergonomics Society 45th Annual Meeting, 632-636. [White98] White, T. (1998). Introducing Liquid Haptics in High Bandwidth Human Computer Interaction. Masters Thesis, Massachusetts Institute of Technology. [Wu03] Wu , M . & Balakrishnan, R. (2003). Multi-Finger and Whole Hand Gestural Interaction Techniques for Multi-User Tabletop Displays. Proceedings of ACM UIST, 193-202. 106 Appendix A: Glossary We use several terms that are important to the area of touch interaction. Many of these terms are related, and sometimes interchanged with each other, occasionally erroneously. This glossary serves as a reference for definitions of terms as we use then throughout this thesis. Term Definition Touch To bring a bodily part into contact with especially so as to perceive through the tactile sense [Merriam-Webster] Touch Interaction Touch interaction is defined as the contact between two or more objects via an arbitrary contact area. One or more of the objects involved may have sensory capabilities Touch Input This is the application of a touch interaction for the express purpose of providing information to an input mechanism. It is presumed the surface that is being touched has sensory capabilities. This is an input channel [Hinckley99] from the perspective of human-computer interaction. Tactile Feedback/Output A n output channel, tactile feedback produces sensations that are processed by the cutaneous receptors of human skin. These include vibrations and other smaller-scale movements that are detectable through passive touch sensing due to the sensitivity of skin receptors (cf. haptic feedback). Tactile Input This can refer to the same thing as touch input, particularly with respect to using the hands. It can also indicate the type of information provided by tactile feedback devices onto the human skin, that is, small-scale vibrations and movements. Haptic Feedback A n output channel, haptic feedback refers to the sensations of larger-scale movements which are detected by proprioceptic and kinaesthetic senses more than local cutaneous receptors. Touch Interface A device that supports the sensing of touch/tactile input. Haptic Interface A device that supports the rendering of tactile/haptic input into the human touch sensory system. This may or may not also be a touch interface. 
Whole-Hand Touch Input/Interaction Same as touch input/interaction, but expressly involving the use of arbitrary combinations of contact areas of the hand and fingers, particularly with respect to a sensory input surface.
Whole-Hand Touch Interface A device that supports the sensing of whole-hand touch input, which may or may not be a haptic interface.

Appendix B: FlowField Experience Questionnaire

THE UNIVERSITY OF BRITISH COLUMBIA
Human Communication Technologies Lab, Department of Electrical & Computer Engineering
2356 Main Mall, Room 155A, Vancouver, BC, Canada V6T 1Z4
Phone: 604-822-4583
BREB Ethics Certificate B02-0414

Study
Thank you for participating in the FlowField installation! The purpose of this study is to evaluate qualitatively the effectiveness of the touchpad interface for the particle manipulation task presented in the FlowField installation.

Investigators
Dr. Sidney Fels, Assistant Professor, Dept. of Electrical & Computer Engineering, UBC
Timothy Chen, MASc candidate, Dept. of Electrical & Computer Engineering, UBC

Post-Study Questionnaire
All responses, including those on this questionnaire, will be recorded. Your identity will be confidential and will be known only to the investigators. Do not write your name on this questionnaire. In any publication that arises from this study you will be identified by 3-digit random numbers.

Please answer the following questions:
1. Which of the following interfaces have you used before (check all that apply)? • Laptop touchpads • Wacom drawing tablets • MTC Express • Other touchpads • Please specify
2. Have you been in a CAVE before? • Yes • No

Statements 3-9 are rated on a 5-point scale (strongly agree, mildly agree, neutral, mildly disagree, strongly disagree):
3. I was able to understand how to use my whole hand on the touchpad (vs. a single finger) 1 2 3 4 5
4. I was able to associate my actions on the touchpad to events occurring in the virtual environment 1 2 3 4 5
5. I was able to understand how my hands were directing the particle movement 1 2 3 4 5
6. I felt that this touchpad was necessary for performing the interactive task 1 2 3 4 5
7. I can think of other ways this same task can be performed 1 2 3 4 5
8. I enjoyed interacting with FlowField 1 2 3 4 5
9. The immersive environment contributed to my enjoyment of the installation 1 2 3 4 5

Thank you for your responses. If you have any additional comments or questions, please use the other side.
