Modeling distal pointing on large screens: the influence of target depth. Janzen, Izabelle (2016). UBC Theses and Dissertations.

Modeling Distal Pointing on Large Screens
The Influence of Target Depth

by Izabelle Janzen
B.A.Sc. Honours Computer Science, McMaster University, 2014

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Computer Science)

The University of British Columbia (Vancouver)
August 2016
© Izabelle Janzen, 2016

Abstract

Pointing is a fundamental task within many interactions in current computer applications. It is incorporated into everything from selecting buttons to dragging files or positioning objects in a virtual environment. Thus, understanding, modeling and predicting pointing performance is crucial to the design and evaluation of many computer interfaces. Fitts's Law (1954) is the basis for modeling human pointing performance in the international standard on pointing device evaluation (ISO 9241-400:2007). However, while it is extremely robust for many standard desktop applications, previous work by Shoemaker et al. (2012) suggested that Fitts's Law may not be robust enough to accurately model pointing at more extreme levels of gain, and proposed alternatives to Fitts's Law based on earlier work by Welford (1968). This thesis extends preliminary research by Rajendran (2012) that further examined these alternatives to Fitts's Law for distal pointing, a style of pointing common in virtual and augmented reality interfaces. We first reänalyze results reported by Rajendran using a variety of Welford-style models to explore the relationship between target depth and a parameter k that was first suggested by Kopper et al. (2010) but is inherent in Welford's model. We then present a new experiment that removes the confound of system latency from Rajendran's approach. Our analyses provide evidence that k varies monotonically (possibly linearly) with target depth, which further supports the claim by Shoemaker et al. that Welford-style two-part models are preferable to Fitts-style one-part models in some situations. Our analyses also challenge Kopper et al.'s suggestion that angular measures of task difficulty are superior to linear measures for pointing models. We close with a discussion of how our findings about the variation of k with target depth might be used in calibration procedures for virtual environments.

Preface

Much of this thesis builds upon previous research. This project was pitched to me by my adviser, Kellogg Booth, as a continuation of that research and a chance to investigate some of the questions brought up in its "future work" recommendations. The questions included whether angular measures account for the deficiencies in Fitts's Law, whether latency was causing the variance observed in the k parameter, and whether the variance in k was linear with target depth or something more complex. I was given leeway to decide which of these questions to tackle and how to do that, but many of the project goals came from Kelly.

Most of the code developed as part of my research was for administering pointing experiments or analyzing the data recorded from them. This thesis reports on two main experiments, presented in Chapter 3 and Chapter 5. Chapter 3 provides re-analysis and interpretation of an experiment conducted by Vasanth Rajendran in 2012 for his master's thesis. Garth Shoemaker's experimental software was used as the basis for Vasanth Rajendran's original experiment but was modified by Vasanth to better support VR. However, much of Vasanth's code and analysis was lost when a hard drive was corrupted.
Therefore, starting from the code written by Garth Shoemaker, I independently modified the code base to create an application that would duplicate Vasanth's experimental software as closely as possible. After replicating Vasanth's experimental results in a small pilot study, I continued to modify the code base to conduct a second experiment. Chapter 5 reports on my study investigating whether the trends observed in Vasanth's original experiment are simply an artifact of system latency. I did most of the writing for this chapter. Aside from modifications to the experimental software, I developed computer vision software to process video captured during the experiment in order to obtain pointing performance data without relying on computer logs that are potentially latency-plagued because they rely on real-time sensing. Documentation and a primer about how the computer vision software works are provided in Chapter 4, all of which I wrote.

Data analysis for both experiments was performed with R scripts written independently by me. My adviser, Kellogg Booth, helped me interpret the graphs and results from the experiments. After performing the small pilot study that replicated the general trends in Vasanth's data, I was able to reänalyze Vasanth's original data files using my R scripts (he had used SPSS). Through trial and error, I was able to reverse engineer the analysis that was reported in Vasanth's thesis. During this process, I discovered a number of problems with how the original analysis was done, particularly with the method for filtering outliers, which affected some of the results described in the text of Vasanth's thesis. Because of this, I independently reänalyzed the data using methods that are more consistent with previous research.

My reänalysis of Vasanth's data was reported in the paper "Modeling the Impact of Depth on Pointing Performance" that was presented at the CHI 2016 conference [16]. Much of the text from the paper was re-used in Chapter 3, although it has been edited and reörganized. Kelly and I had roughly similar levels of contribution to the paper writing process, with me focusing more on the results and analysis sections while Kelly focused more on the framing and related work. Vasanth and Kelly designed the initial experiment and Vasanth did the data collection. His initial analysis was used as the basic structure for the paper. The introduction and literature review from that paper were expanded and fleshed out to become the basis for Chapter 1 and Chapter 2 in this thesis.

Ethics approval for the experiments reported in this thesis was provided by the UBC Behavioural Research Ethics Board under certificate number H11-01756, "Interacting With Large Displays V."

Funding for the research was generously provided by NSERC, the Natural Sciences and Engineering Research Council of Canada, under the Discovery Grant Program, and by GRAND, the Graphics, Animation and New Media Network of Centres of Excellence. Facilities and research infrastructure administered by ICICS, the Institute for Computing, Information and Cognitive Systems, purchased with funds from the Canada Foundation for Innovation, were used for the research.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgments
1 Introduction
2 Related Work
  2.1 Fitts and Welford Models of Pointing Performance
  2.2 Effective Width We and Effective Amplitude Ae
  2.3 Throughput and Cross-Study Comparisons
  2.4 The Relationship Between Gain and Depth
  2.5 Distal Pointing Interactions
  2.6 The Kopper Model for Mid-air Pointing
  2.7 Statistically Comparing Nested Pointing Models
  2.8 Impact of Latency on Pointing Performance
  2.9 Virtual Reality (VR)
  2.10 Calibrating Virtual Environments
3 Target Depth and k in Computer-Mediated Pointing
  3.1 Method
    3.1.1 Participants
    3.1.2 Apparatus & Materials
    3.1.3 Procedure
  3.2 Results
    3.2.1 ANOVA for Movement Time
    3.2.2 Regressions for One-Part and Two-Part Models
    3.2.3 The Ratio k as a Function of DV
  3.3 Discussion
    3.3.1 A Closer Look at the Data
    3.3.2 One-Part and Two-Part Models for Pointing
    3.3.3 The k Parameter and Target Depth
    3.3.4 Angular Amplitude and Angular Width
  3.4 Conclusions and Future Work
    3.4.1 Welford Models vs. Fitts Models
    3.4.2 Lack of Improvement Using Angular Models
    3.4.3 Calibrating VR Systems Using k-values
    3.4.4 The Effect of Binocular Depth on Pointing Performance
4 Incorporating Computer Vision for Latency-Free Analysis of Pointing Data
  4.1 Introduction
  4.2 Room Setup and Camera Placement
    4.2.1 Synchronizing Video with Experiment Conditions
  4.3 Designing a Vision System to Record Pointing Data
    4.3.1 Finding Moving Objects – Cursor Position
    4.3.2 Finding Stationary Objects – Target Position
    4.3.3 Moving From Pixels to Real World – Offset
  4.4 Conclusions and Recommendations
5 Real-World Pointing, Target Depth, and k
  5.1 Method
    5.1.1 Participants
    5.1.2 Apparatus & Materials
    5.1.3 Procedure
  5.2 Results
    5.2.1 Removal of Outliers
    5.2.2 ANOVA for Movement Time MT
    5.2.3 Comparing Pointing Models
    5.2.4 Testing Angular Measures of Target Difficulty
    5.2.5 Modeling k as a Function of Target Depth
  5.3 Discussion
    5.3.1 One-Part Versus Two-Part Models of Pointing Performance
    5.3.2 Angular Versus Classic Measures of Target Difficulty
    5.3.3 The Relationship Between k and Target Depth
    5.3.4 Differences in Task and k
  5.4 Future Work
    5.4.1 Isolating Factors That Affect k
    5.4.2 Addressing VR Calibration
  5.5 Conclusion
6 Where Are We? The Path Ahead
  6.1 One-Part and Two-Part Models of Pointing
  6.2 Measuring Target Difficulty
  6.3 Variation of Shoemaker's k Parameter with Target Depth
  6.4 Impact on Virtual Reality Calibration
  6.5 Final Words
Bibliography
A Supporting Materials
  A.1 Study Materials
  A.2 Raw Data

List of Tables

Table 3.1: Significant ANOVA main effects and interactions for movement time (MT). No other interactions were found. All tests were adjusted with Greenhouse-Geisser.

Table 3.2: Modeling movement time (ms) using Fitts and Welford formulations (top) and Shannon-Fitts and Shannon-Welford formulations (middle) for actual width W and effective width We, for each DV = DS condition (no added virtual binocular disparity) and for all DV = DS conditions combined. Fitts and Welford models computed using angular A and W are also reported (bottom). The coefficients and adjusted R² for each model, with the F-ratios, p-values and significance from an F(1,9) test (Eq. 2.9 with p1 = 2, p2 = 3, and n = 9) in the last three columns, compare nested Fitts and Welford (or Shannon-Fitts and Shannon-Welford) models.

Table 3.3: Linear regression indicating how k varies as binocular depth DV (cm) changes, for each of the four models.

Table 5.1: ANOVA results for the impact of depth (D), amplitude (A) and width (W) on movement time (MT). Statistically significant factors are bolded.

Table 5.2: Modeling movement time (ms) using Fitts and Welford formulations (top) and Shannon-Fitts and Shannon-Welford formulations (middle) for actual width W and effective width We, for each D condition and for all D conditions combined. The coefficients and adjusted R² for each model, with the F-ratios, p-values and significance from an F(1,9) test (Eq. 2.9 with p1 = 2, p2 = 3, and n = 9) in the last three columns, compare nested Fitts and Welford (or Shannon-Fitts and Shannon-Welford) models.
Table 5.3: Quality of fit for regression analyses, as determined by the R-squared values, shown for angular and classic linear measures of movement amplitude A and target width W, and for movement amplitude A and effective target width We. Bold cells show which model in a pair of columns has the better R². The left columns are for classic measures, the right for angular.

Table 5.4: Regression modeling of how k varies as target depth D (cm) changes, for each of the four pointing models.

Table 5.5: Regression modeling of how k varies as distance from the screen D (cm) changes, for each of the four pointing models. The 165 condition has been removed as a potential outlier.

Table A.1: Processed raw data from experiment one. Outliers have already been filtered out; we present effective widths, amplitudes, average movement times and 95 percent confidence intervals for all conditions.

Table A.2: Processed movement time data from experiment two. Outliers have already been filtered out; we present effective widths, amplitudes, and average movement times with low and high 95 percent confidence intervals for all conditions.

List of Figures

Figure 2.1: Computing effective width We using the standard deviation σ of the observed pointing or tapping accuracy (after MacKenzie [26]).

Figure 2.2: A simple diagram of the effect of depth on distal pointing. The same physical movement intersects the screen farther from screen center when depth is far, creating an effect similar to gain as target depth varies.

Figure 2.3: Basic distal pointing interaction implemented with a Nintendo Wii Remote to select buttons in a user interface.

Figure 2.4: Shadow Reaching is a proposed distal pointing interaction using the shadow as a metaphor, changing pointing gain as the user moves back and forth through the room.

Figure 2.5: The Oculus Rift and other VR systems have commonly been applied to gaming and other ways of exploring virtual environments. In this example a treadmill is placed under the user to allow them to traverse the environment by running in place.

Figure 2.6: A second example of a VR system in practice allows flying and exploration like a bird. A specially designed apparatus helps aid this effect.

Figure 2.7: An application using dynamic adjustment of 3-D parameters. On the left, a farther object has its stereo parameters made more extreme to highlight the depth effect and improve depth estimation.

Figure 3.1: Apparatus: (a) large-screen stereoscopic display, (b) hand-held pointer, (c) head-tracking gear, and (d) Wii Remote for click events.

Figure 3.2: Placement of the targets and the participant in an experimental trial. The virtual plane of the targets is green; the physical plane of the screen is black.

Figure 3.3: Computing the cursor position from the head and hand-held pointer positions.

Figure 3.4: An illustration of the participant performing the pointing task for DV = DS. Targets are in the plane of the screen.

Figure 3.5: An illustration of the participant performing the pointing task for DS > DV. Targets are perceived in front of the screen.
Figure 3.6: R² values for each target depth using W (top) and We (bottom).

Figure 3.7: k values calculated using the Welford formulation (a) for A and W, (b) for A and We.

Figure 3.8: Scatterplot of MT vs. effective ID. Points are identical in the right and left plots. Points are connected in two different ways to illustrate the separability of A and W: lines on the left connect points representing tasks with the same movement amplitude A; lines on the right connect points representing tasks with the same target width W.

Figure 3.9: The three k-lines for DS = 110 (red / squares), 220 (green / triangles) and 330 (purple / circles), along with the k-line for the non-VR DV = DS conditions (blue / diamonds). To calibrate binocular depth using k values, the desired binocular depth (A) determines a k-value (B) on the blue DV = DS line. That same k-value on the red DS = 110 line (C) determines D′V to be the corresponding binocular depth (D) that the software should use to ensure the desired pointing performance if the screen is 110 cm from the viewer.

Figure 4.1: Example view of what is seen by the camera placed behind the screen. The green bars are the targets used in the pointing experiment. The blue box is used for camera calibration.

Figure 4.2: Starting image for finding the pointer and targets.

Figure 4.3: Example output from thresholding for the pointer. White spots are within the threshold and thus likely within the pointer.

Figure 4.4: Example of a median filter operation. Image taken from http://tinyurl.com/j26e7cf.

Figure 4.5: View of the average thresholded result; edges are noisy, but the rough shape is there.

Figure 4.6: Example result for the target position after thresholding and processing the averaged passing pixels.

Figure 4.7: Example contours of the targets overlaid on top of the original example image. Circles are positioned at the corners.

Figure 4.8: Example of the calibration procedure through grid wave. A regular grid of known size is pictured from various angles; machine learning finds the intrinsic distortions of the camera.

Figure 5.1: Apparatus: pointer used for interaction and camera used for data recording.

Figure 5.2: The projection room behind the screen. The camera is placed on a tripod at a fixed location to record experimental data. Two projectors were used to display the experimental stimulus.

Figure 5.3: Experimental stimuli. The participant stands at a fixed target depth along the orange line; specific conditions are marked with crosses. The experimental software displays two green targets on the screen, and the user smoothly moves the laser pointer between the two targets.

Figure 5.4: Visual comparison of different models for the relationship between k and D. Data presented for regular width models of pointing performance.

Figure 5.5: Visual comparison of different models for the relationship between k and D. Data presented for effective width models of pointing performance.
Figure 5.6: Visual comparison of different models for the relationship between k and D. Data presented for regular width models of pointing performance. The 165 condition has been removed as a potential outlier.

Figure 5.7: Visual comparison of different models for the relationship between k and D. Data presented for effective width models of pointing performance. The 165 condition has been removed as a potential outlier.

Figure 6.1: The three k-lines for DS = 110 (red / squares), 220 (green / triangles) and 330 (purple / circles), along with the k-line for the non-VR DV = DS conditions (blue / diamonds). To calibrate binocular depth using k values, the desired binocular depth (A) determines a k-value (B) on the blue DV = DS line. That same k-value on the red DS = 110 line (C) determines D′V to be the corresponding binocular depth (D) that the software should use to ensure the desired pointing performance if the screen is 110 cm from the viewer.

Figure A.1: Consent forms for experiment one, which investigated the impact of depth on pointing performance. The experiment was performed and the data gathered by Vasanth Rajendran.

Figure A.2: Consent forms for experiment two, which investigated whether the impact of depth on pointing performance is an artifact of system latency.

Figure A.3: Demographics questionnaire used to gather qualitative data in both studies presented in this thesis.

Glossary

A — Movement amplitude: the horizontal distance between consecutive targets in a pointing experiment.

ANOVA — Analysis of variance: a set of statistical techniques to identify sources of variability between groups.

D — Target depth: the distance from the user to the target in physical space.

DS — Screen depth: the distance from the user to the image of the target on the screen in physical space.

DV — Virtual depth: the distance from the user to the target as perceived (or intended to be perceived) in stereoscopic or binocular viewing, independent of the physical screen position.

HCI — Human-computer interaction.

k — Kopper's k: the parameter, identified by Shoemaker et al. in 2012, that captures the relative impact of movement amplitude and target width on pointing performance in a Welford-like two-part model.

MT — Movement time: the time to complete a pointing action.

R² — R-squared: a measure of how much variance in a model is accounted for by the predictors; refers to the percentage of variance that is attributable to between-groups differences.

TP — Throughput: a measure of balanced overall pointing performance in bits per second. Takes both speed and accuracy into account.

VR — Virtual reality: virtual environments enhanced with stereoscopic 3-D viewing to enhance depth.

W — Target width: the width of a target in a pointing experiment.

We — Effective width: a measure of target size adjusted to account for how accurate participants actually were.

Acknowledgments

I'd like to thank my academic adviser, Dr. Kellogg Booth, in particular for all his support and encouragement during my Master's degree. Thanks go to Vasanth Rajendran and Garth Shoemaker for providing me with access to their raw data and their experimental software to start my investigations.

I'd also like to thank my parents, Lorraine and Fred Janzen, for their support and understanding.

Special thanks go to Ron Rensink, who was the second reader for the thesis.
He provided extremely helpful and timely advice and suggestions that have improved the thesis in a number of ways.

Chapter 1: Introduction

Pointing is the act of touching or selecting an object by positioning one's hand, or some intermediary tool, over top of a desired object, or in the general direction of the object if it is not close by. In human-computer interaction, pointing is incorporated as a sub-component of many basic tasks, ranging from selecting buttons, highlighting text and moving files within a desktop to positioning boxes within a virtual world or game. Due to its ubiquity, it is important that we understand and can predict the behavior of the sensorimotor systems involved in pointing when we evaluate new pointing devices and techniques. Even small improvements and refinements to a pointing device or technique could add up to a significant amount of time saved when interacting with the system as a whole.

Pointing devices such as the computer mouse are robust, ubiquitous and highly effective for desktop applications. However, they often break down when used in virtual environments or with very large screens, as occurs in virtual reality systems or classroom interaction. This style of interaction is often called distal pointing: users stand a good distance away from the screen and position the cursor with their hand as if using a laser pointer. While numerous pointing tools have been and are currently being developed for distal pointing, some researchers have noted problems properly evaluating and modeling the tools' performance using standard procedures [43]. This has led to attempts to expand and refine our models of human pointing performance to better evaluate these techniques [23]. This is a fundamental concern that we explore in this thesis: if we do not have an accurate and robust measure of how a device performs, how can we say whether it was effective or not?

We focus our work on investigating the impact of target depth on mid-air pointing techniques (also known as distal pointing), in which a person's hand moves freely, unconstrained by contact with any surface or object. This is a very common task in many training, simulation and entertainment activities where interaction with real or virtual objects is required. It is also a task performed every day in the real world, often with 2-D information displays such as those in classrooms or lecture halls. Some virtual environments employ "virtual hand" techniques to grasp and manipulate targets at a distance, but we restricted our attention to pointing because it mimics the physical world. We sought to learn more about the fundamental act of pointing, both when targets are at a depth in the real world and when targets appear to be at a depth in a virtual world. Target depth is either the physical depth or the virtual depth, depending on the situation.

Fitts's Law [10] is often used to measure pointing performance by modeling movement time as a function of target size and distance moved. Kopper et al. [23] recommended a variant of Fitts's Law that uses angular measures of target size and movement distance instead of the classic linear measures normally used with Fitts's Law. Shoemaker et al. [40] showed that a classic-measures version of Kopper et al.'s model is just a special case of Welford's two-part model of pointing [53]. The constant k that Kopper et al. introduced is actually the ratio of two coefficients in the much earlier Welford model.
Welford's model is similar to Fitts's Law, except that it separates amplitude and width into their own terms with separate impacts. Shoemaker et al. also showed that k varies monotonically, and possibly linearly, with gain.

In two experiments, we looked at how k varies with target depth (the distance to the target), and we reëxamined Kopper et al.'s claim that angular measures are better than linear measures for assessing mid-air pointing performance. Furthermore, we tested a variety of different possible models for pointing and suggested ways to make the most common models more robust across different pointing techniques and gains.

The first experiment had previously been conducted to investigate whether virtual depth and physical depth to the screen have similar impacts. Reänalyzing the data from that experiment, we showed that for normal viewing with real-world physical depth cues, target depth affects pointing performance. In binocular stereo, where perceived or virtual target depth can differ from physical target depth, we found that both virtual and physical target depths may affect pointing performance. We did not find evidence that angular measures are better than linear measures, but we did find that two-part models are better than one-part models in some, but not all, conditions. The Kopper-Shoemaker k factor appeared to vary monotonically with gain, but we could not conclude much about the exact type of relationship (e.g., linear versus logarithmic) from the limited dataset we had.

Our second experiment investigated whether the effects observed for target depth were merely an artifact of latency from the computer mediation in the experimental apparatus. Latency is the delay between physical movement and output being displayed on the screen, and it has been shown to reduce pointing performance [31]. We were concerned that much of the effect of target depth might simply be latency becoming more problematic as a user moves farther from the screen. This experiment provided further evidence that two-part models outperform one-part models in conditions where k deviates from one (which are common in distal pointing). We reinforced our conclusions that angular measures do not really improve our models of pointing performance. We also showed that even after removing computer mediation from the task, k still grew larger as target depth increased. Overall, k was smaller and increased more slowly than in our first experiment, which we argue is likely caused by task differences or sampling biases. Our models of target depth's impact on k were refined from previous studies, reinforcing that a linear approximation is robust and explanatory. We also noted some possible merit to a logarithmic model, though in many conditions it produced less favorable statistics than the linear one. This has implications for the ideas we present for calibrating VR systems using a ground-truth variation of k.

In the chapters that follow we present a review of relevant literature in the field (Chapter 2) and then present the results of the first experiment (Chapter 3). After that we describe a method for performing latency-free pointing evaluation using computer vision techniques (Chapter 4), which can be read as a succinct primer on a number of the basic tools and techniques for incorporating computer vision into experimental designs, a practice that seems unfortunately to be underused. Lastly, we present the methods, results and conclusions of a new experiment that replicates the first experiment without latency artifacts (Chapter 5).
We close with further discussion of our results, their potential impact on the field, and key avenues for future research (Chapter 6).

Chapter 2: Related Work

Modeling pointing performance is an extremely common and classic research topic in HCI that has been studied since the dawn of the field in the 1960s [29]. Experiments to quantify and compare the pointing performance of different input methods were critical to the introduction of the first graphical user interfaces and computer mice [5, 6]. Given this massive breadth of research, it would be entirely impractical to discuss all the work that has been done on the subject. This chapter therefore provides a selective discussion of key work that is particularly relevant to distal pointing and the questions we investigate. We focus mainly on the last decade of research, which follows the pivotal work by Scott MacKenzie's lab in the 1990s and early 2000s that led to the current standard methods for evaluating pointing performance [3, 26–28, 41].

2.1 Fitts and Welford Models of Pointing Performance

The best-known model of pointing performance is Fitts's Law (Eq. 2.1), formulated by Paul Fitts [10]. It was originally used to model "reciprocal tapping," where the time to move between successive taps was related to the distance moved (amplitude) and the size of the area being tapped (target width).

    MT = a + b log(A/W)        (2.1)

    MT = a + b log(A/W + 1)    (2.2)

Movement time (MT) depends only on the ratio of movement amplitude (A) and target width (W), not on their individual values. The logarithmic term is often called the index of difficulty (ID). Soukoreff and MacKenzie [41] argue for an information-theoretic interpretation, the Shannon-Fitts formulation (Eq. 2.2), with an additive constant in the ID term. This often produces better R² values than does the simpler version of Fitts's Law.

Welford later introduced a two-part model of pointing performance (Eq. 2.3) in which the one-part ID term is replaced by a linear combination of two logarithmic terms [53]. Shoemaker et al. [40] introduced a variation on Welford's model, analogous to the Shannon-Fitts formulation, that they called the Shannon-Welford model (Eq. 2.4).

    MT = a + b1 log(A) − b2 log(W)        (2.3)

    MT = a + b1 log(A + W) − b2 log(W)    (2.4)

The Fitts formulations are special cases of the Welford formulations. A parameter k can be defined (Eq. 2.5) that measures how closely the Welford models match the Fitts models:

    k = b2 / b1    (2.5)

Eqs. 2.1 and 2.3 are nested models because setting b = b1 = b2 (k = 1) turns the 3-DOF (degree-of-freedom) Welford model into the 2-DOF Fitts model; similarly, the two Shannon models, Eqs. 2.2 and 2.4, are also nested.

Throughout our discussion, we follow the authoritative and venerable advice of Strunk and White [42] and use Fitts's as the possessive case of the singular proper noun Fitts.
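Because all four formulations are linear regressions in their logarithmic terms, they can be fit directly with ordinary least squares. As an illustrative sketch in R (the Preface notes that R scripts were used for the analyses, but this snippet is not those scripts; the data frame d of per-condition means with columns MT, A and W is hypothetical, and the base-2 logarithm is only a convention since the choice of base rescales b1 and b2 without changing k):

    # Fit the four pointing models of Eqs. 2.1-2.4 to condition means.
    fitts           <- lm(MT ~ log2(A / W),           data = d)
    shannon_fitts   <- lm(MT ~ log2(A / W + 1),       data = d)
    welford         <- lm(MT ~ log2(A) + log2(W),     data = d)
    shannon_welford <- lm(MT ~ log2(A + W) + log2(W), data = d)

    # Eq. 2.5: k = b2/b1. lm() estimates the signed coefficient on log2(W),
    # which corresponds to -b2 in Eq. 2.3, hence the minus sign below.
    b <- coef(welford)
    k <- -b[["log2(W)"]] / b[["log2(A)"]]

When k = 1 the amplitude and width coefficients are equal and opposite, and the Welford fit collapses to the Fitts fit.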
2.2 Effective Width We and Effective Amplitude Ae

Target width W is the physical size of the target being pointed at (or tapped on, in the original studies by Fitts). Crossman and Welford [53] compute the effective width (We) of a target using the observed distribution of tap positions, and then adjust the width of the target in their calculations to reflect what participants actually do rather than what participants are expected to do. This post-hoc adjustment to target width maintains the information-theoretic analogy for Fitts and other models of rapid, aimed movements (such as reciprocal tapping or pointing in 2-D or 3-D). It assumes a nominal and consistent error rate (traditionally 4%). When this condition is not met, a further adjustment to the target width is introduced such that the error rate becomes 4%.

Figure 2.1: Computing effective width We using the standard deviation σ of the observed pointing or tapping accuracy (after MacKenzie [26]).

Assuming a normal distribution of the taps with standard deviation σ, P[Z ≥ 2.066] = 0.02, so a width of 2 × 2.066 in z-units (4.133σ) around the mean has a probability of 96%. The probability of error (tapping outside this area) is then 4%, so We can be calculated using Eq. 2.6:

    We = 4.133σ    (2.6)

Whenever We is used, we also use effective amplitude, Ae, which is simply the mean of the observed amplitudes. Further background on the four pointing models, effective width, and the F-test used to compare nested models is provided by Shoemaker et al. [40], an essential reference for the work discussed in this thesis.
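Concretely, the effective measures are one-liners given the per-condition tap data; a sketch in R with illustrative names (again, not the thesis's actual analysis scripts):

    # Eq. 2.6: effective width from the spread of observed tap endpoints.
    # 'offsets' is a vector of signed tap offsets from the target center (cm).
    effective_width <- function(offsets) 4.133 * sd(offsets)

    # Effective amplitude: the mean of the observed movement distances (cm).
    effective_amplitude <- function(amplitudes) mean(amplitudes)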
2.3 Throughput and Cross-Study Comparisons

One of the great benefits of Fitts's Law is the introduction of the Index of Performance (IP), more commonly called throughput (TP), which is the term we will use. Throughput is operationally defined as follows: "The average rate of information generated by a series of movements is the average information per movement divided by the time per movement" [9]. Mathematically, throughput is the average index of difficulty divided by the average movement time:

    TP = ID / MT    (2.7)

After correcting for participant strategy by using effective width, throughput is often used to compare different pointing devices across studies. It has become the de facto standard metric of pointing performance and has been adopted by many research groups [41]. Because it incorporates estimates of both accuracy and speed into a single value, it is often used to make broad general statements such as "device A performs 30 percent better than device B."

Unfortunately, no standard alternative to throughput exists for two-part models of pointing performance. Perhaps because using a two-part model is itself not standard practice, no concerted effort has been made to define what should be used to compare studies done with them. However, we argue that the research community could quickly adapt if a strong effort were made to switch over to two-part models. One could easily define two separate throughputs from the two ID coefficients in a two-part model: one for ballistic movements and one for precision adjustments. This would allow us to make more nuanced comparisons such as "device A performs 20 percent better for precision tasks than device B, so it should be used for interaction with small screens." Perhaps the sum of these two indexes of difficulty over movement time could be used in place of the general throughput parameter, so that we could still draw conclusions about overall device effectiveness. The results may even be mathematically similar to one-part throughput when amplitude and width have similar impacts to start with.

We discuss these ideas, and our view of what the standard should be, in Chapter 6. While providing this standard is not a key focus of this thesis, we do wish to further motivate the pointing research community to address this issue. This will hopefully lead to more nuanced interpretations of the costs and benefits of input devices and allow the deployment of more effective models of pointing performance.

2.4 The Relationship Between Gain and Depth

Figure 2.2: A simple diagram of the effect of depth on distal pointing. The same physical movement intersects the screen farther from screen center when depth is far, creating an effect similar to gain as target depth varies.

Gain is the reciprocal of the control-to-display ratio: the relative movement of the pointing device compared to the movement of the cursor that is being used as visual feedback for the pointing. Depth is the distance from an observer to an object being viewed. There is an obvious relationship between the distance at which pointing is done and induced gain: the farther one is from the objects being pointed at, the less hand movement is required to point to different objects. In a real-world environment, target depth is the physical distance to a target. In a virtual stereoscopic 3-D environment, target depth is the perceived or virtual distance to a target. Depth perception uses many cues; in a virtual environment an important cue is the binocular disparity between the images seen by the left and right eyes. One of our primary interests was to better understand how target depth affects pointing performance in the real world, but we also looked at virtual environments where binocular depth may not be the same as the physical depth to the left and right images.

2.5 Distal Pointing Interactions

Figure 2.3: Basic distal pointing interaction implemented with a Nintendo Wii Remote to select buttons in a user interface.

Figure 2.4: Shadow Reaching is a proposed distal pointing interaction using the shadow as a metaphor, changing pointing gain as the user moves back and forth through the room.

Distal pointing interaction has become increasingly common in the last decade. In particular, numerous distal pointing interfaces have been employed in recent gaming technology. Various commercial input products, including the Microsoft Kinect, PlayStation Move, Nintendo Wii Remote, and Leap Motion, provide support for distal pointing. Many of these have been picked up by the research community in order to evaluate which are the most effective for specific applications [34, 37, 56]. Many are built by and for gaming applications and for traversing 3-D virtual environments, but this is far from the only application for distal pointing. Refinements such as Shadow Reaching have been suggested to allow variable precision [39], and classroom interaction has been highlighted as a key opportunity for applying distal pointing [11]. Given the glut of new distal pointing devices hitting the market, ensuring that we can consistently and effectively evaluate their strengths and weaknesses is an arguably important and timely research area.

2.6 The Kopper Model for Mid-air Pointing

Kopper et al. [23] investigated ways to revise Fitts's Law specifically for mid-air pointing by incorporating target depth into their model (Eq. 2.8) through an angular index of difficulty based on angular amplitude α and angular width ω:

    MT = a + b [log(α / ω^k + 1)]²    (2.8)

While Kopper et al.'s technique shows some promise, it unnecessarily narrows the scope of the model by fixing the exponent of ω at k = 3, which might prevent its applicability in contexts other than the particular physical setup they used. Moreover, Shoemaker et al. noted that Eq. 2.8 is mathematically similar to Welford's two-part formulation if we ignore the squaring of the logarithmic term and replace k = 3 with b2/b1, and they showed experimentally that Welford's formulation seems to capture the impact of different gains on pointing performance [40]. In a way, target depth for mid-air pointing is very similar to gain: moving farther away from targets increases the effect of small hand movements, and vice versa. Thus we suspect that any improvement in modeling seen by Kopper et al. was likely due to their model being similar to Welford's, not to the incorporation of angular ID measurement.
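For reference, the angular measures are straightforward to derive from the linear ones. A minimal sketch in R, assuming a linear extent centered on the user's line of sight at depth D (off-center targets, such as those used in our experiments, would instead need the difference of the two arctangents to the target edges):

    # Visual angle (degrees) subtended by a linear extent 'x' (cm) centered
    # at depth 'D' (cm); usable for both angular amplitude and angular width.
    visual_angle <- function(x, D) 2 * atan(x / (2 * D)) * 180 / pi

    alpha <- visual_angle(A, D)  # angular amplitude for Eq. 2.8
    omega <- visual_angle(W, D)  # angular width for Eq. 2.8

Note how D appears in both conversions: this is exactly how the Kopper model smuggles target depth into a one-part ID.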
2.7 Statistically Comparing Nested Pointing Models

Soukoreff and MacKenzie [41] and others have observed that two-part models inevitably fit better than their one-part counterparts because of their additional degrees of freedom (DOF). However, there are statistical tests to determine whether nested models such as Fitts's Law and Welford's formulation are actually better at describing the data, beyond the benefit of the extra degrees of freedom. Following Shoemaker et al. [40], we use the F-test in Eq. 2.9 to determine whether a two-part Welford-type model is statistically better than a one-part Fitts-type model, where p2 = 3 is the number of parameters in the greater-DOF Welford model, p1 = 2 is the number of parameters in the smaller-DOF Fitts model, and n = 9 is the number of sample points in our data. A p-value of .05 or less is considered significant.

    F(p2 − p1, n − p2) = [ (RSS1 − RSS2) / (p2 − p1) ] / [ RSS2 / (n − p2) ]    (2.9)
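In R this test is one expression over the residual sums of squares of the two fits. A sketch, assuming fit1 and fit2 are lm() fits of the nested one-part and two-part models on the same n condition means (for nested linear models the built-in anova(fit1, fit2) performs the identical comparison):

    # Eq. 2.9: does the extra Welford parameter buy a significantly better fit?
    nested_f_test <- function(fit1, fit2, n, p1 = 2, p2 = 3) {
      rss1  <- sum(residuals(fit1)^2)  # residual SS of the one-part model
      rss2  <- sum(residuals(fit2)^2)  # residual SS of the two-part model
      fstat <- ((rss1 - rss2) / (p2 - p1)) / (rss2 / (n - p2))
      pf(fstat, df1 = p2 - p1, df2 = n - p2, lower.tail = FALSE)  # p-value
    }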
2.8 Impact of Latency on Pointing Performance

By necessity, no computer can react or display feedback to user input instantaneously. There is always some amount of time between a button being clicked and the electrical signal travelling down a wire to be interpreted by the CPU. Thus, any computer-mediated pointing system must be robust enough to work in the presence of this latency without becoming cumbersome to use. There has been a large amount of work in the literature characterizing the relative impact of this effect on many input techniques, and pointing is particularly well studied.

Generally speaking, most pointing research has found that the effect of latency gets more severe as latency is increased, and that 100 ms of latency degrades performance by around 10-15 percent [18, 31, 32, 49]. Cursor jitter imposed by trade-offs in hardware has an impact as well, but a smaller one than latency [31]. However, others have noted that a similar amount of latency impacts pointing performance much more when targets are moving, as in a virtual environment [16].

It is entirely possible that other factors could cause latency to become more problematic than previous studies suggest. By convention, studies use consistent, reasonable input gains and turn off mouse acceleration, which enables greater internal consistency and easily reproducible experimental conditions. The effect of latency is also usually evaluated by adding latency into the system, rather than by comparing to a "natural" baseline. Shoemaker further raised the concern that these conventional gain settings were unintentionally cherry-picked in ways that maintain the effectiveness of one-part models of pointing [40]. We felt the objections raised by Shoemaker were reasonable, and we were concerned that latency might become more problematic in distal pointing as one moves farther from the screen. This naturally led to our second experiment, which evaluates natural human baseline performance without the presence of system latency.

2.9 Virtual Reality (VR)

Figure 2.5: The Oculus Rift and other VR systems have commonly been applied to gaming and other ways of exploring virtual environments. In this example a treadmill is placed under the user to allow them to traverse the environment by running in place.

Figure 2.6: A second example of a VR system in practice allows flying and exploration like a bird. A specially designed apparatus helps aid this effect.

Ellis [4] defines virtualization as "the process by which a human viewer interprets a patterned sensory impression to be an extended object in an environment other than that in which it physically exists." Virtual reality (VR) often consists of head-coupled stereoscopic displays, with means of interaction and input to generate a coordinated sensory experience. Some systems include auditory and tactile information. VR environments are used in entertainment (gaming, movies, virtual experiences, etc.), training and simulation (military training, driving and flight simulators, etc.), and medical and scientific visualizations.

Recently, popular interest in virtual reality HMDs (head-mounted displays) has been reinvigorated by the announcement and success of the Oculus Rift [30], a consumer-grade VR headset that has been touted as a cheap and easy way to bring VR to the masses [36]. Unsurprisingly, this has helped reinvigorate the HCI community's interest in virtual reality and applications of head-mounted displays. Of particular note is work incorporating haptic sensations into VR experiences, as well as work augmenting and improving VR hardware to improve peripheral vision and reduce motion sickness [1, 21, 22, 54].

Depending on the application, the requirements for a VR system might differ. In training applications, it is often desirable that the VR system evoke responses and task performance nearly identical to those in the real world. Stereoscopic 3-D systems may be unable to completely live up to this ideal of perfect simulation: there are depth cues aside from binocular disparity, and the physical screen distance in a VR system is often fixed, and thus rarely the same as the binocular depth of the objects being viewed. The next section discusses attempts to calibrate VR systems and reduce the impact of these disparities.

2.10 Calibrating Virtual Environments

Figure 2.7: An application using dynamic adjustment of 3-D parameters. On the left, a farther object has its stereo parameters made more extreme to highlight the depth effect and improve depth estimation.

There has been much previous work on VR calibration. Jones, Lee, Holliman and Ezra describe calibrating the camera space against a user space based on user-reported depth values [17]. Yuan, Pan and Daly perform depth tuning of stereoscopic 3-D content based on models of human visual comfort [55]. Iyer, Chari and Kannan describe a method for determining parallax values for stereoscopic viewing that minimize visual artifacts in 3-D television [15]. Since the introduction of the Oculus Rift, this trend has continued. Kulshreshth et al. provide a method to improve depth estimation by dynamic stereo parameter adjustment [24]. Fernandes et al. suggest subtly modifying the field of view to reduce motion sickness [7]. Lastly, Finnegan et al. tackle the problem of calibrating audio feedback timing in such virtual environments [8].

These approaches are all based on stereoscopic perception models derived either from the literature or from user-reported values, and they attempt to reduce distortions associated with binocular viewing. If we want to produce task performance identical to that in real-life tasks, not simply to reduce perceived distortion, we may have to take task performance explicitly into account during calibration. To our knowledge, however, no other work has attempted to validate previous calibration methods against a performance task, or hit on the idea of calibrating a VR system based on performance metrics.
Chapter 3: Target Depth and k in Computer-Mediated Pointing

The analysis I present in this chapter was done on data collected by Vasanth Rajendran. Previous work in our lab by Shoemaker et al. had investigated the variance of k with respect to gain, finding that k varies quite linearly and monotonically with different gain settings and thus could be used as a predictor of relative gain settings [40]. However, we had made one important observation: in distal pointing environments, target depth may play a role similar to gain. With laser-pointer-style interaction, standing far away may make it easier to cross large gaps and harder to make fine adjustments. This effect is similar to increasing the gain settings of the device, and thus might be predicted by k. If we can use pointing performance to tell how far away the screen is, can we use it as a more reliable indicator for VR calibration than self-report data? In this chapter we therefore present an experiment to test whether the k factor also captures target depth, and whether that variation is the same for virtual and physical depths.

3.1 Method

We examined the effect of physical target depth DS (the depth to the display screen) and binocular depth DV (the depth at which objects are intended to be perceived) on pointing performance (movement time) in a VR environment. Our primary analysis is for the conditions in which DV = DS, a real-world physical environment with no virtual binocular disparity (the images for the left and right eyes are identical on the screen), but we also examined the cases where DV ≠ DS, virtual environments that had virtual binocular disparity (the images for the left and right eyes were not identical on the screen). The design was based on an experiment by Shoemaker et al. [40] that evaluated physical 2-D pointing on a large display for varying gain values. In our experiment, we evaluated 3-D pointing for different binocular depths and for different target depths, with system gain held at one.

3.1.1 Participants

We recruited 21 participants at our university through on-campus advertising. As a requirement, all were right-handed with normal or corrected-to-normal vision. One participant's data had to be discarded due to equipment malfunction. Of the remaining 20, 8 were female and 12 male. Ages ranged from 20 to 28 years (mean 23.3). All participants were regular computer users (9+ hours per week). The study was approved by the Behavioural Research Ethics Board at our university. Participants signed a consent form prior to the start of the study. Each was compensated with $10.

3.1.2 Apparatus & Materials

The hardware and software used for this experiment were based on previous experiments investigating the impact of gain on distal pointing by Shoemaker et al. One of the main additions was support for VR and different virtual depths.

Hardware

The wall display was a 5.16 m × 2.85 m (width × height) glass screen rear-projected by a 4 × 3 array of 800 px × 600 px stereo projectors. The images of adjacent projectors overlapped by 160 px, with a blending function to minimize the appearance of discontinuities.
The overall resolution of the display was 2720 px × 1480 px. Viewing was binocular frame-sequential stereo at 60 Hz; users viewed the display through shutter glasses synchronized with the projectors. We calibrated the binocular disparity based on an initial estimate, followed by visual inspection and adjustment to reduce visual distortions. The display was driven by an 8-core Intel processor with 6 GB of RAM and dual NVIDIA GeForce GTX 260 graphics processors, running Windows 7.

Figure 3.1: Apparatus: (a) large-screen stereoscopic display, (b) hand-held pointer, (c) head-tracking gear, and (d) Wii Remote for click events.

Five Vicon cameras (a high-speed infrared-based motion capture system) tracked participants using head gear and a hand-held pointer, each fitted with reflective balls (Figure 3.1), after Vicon's standard wand-wave calibration procedure. We tracked the positions and orientations of participants' heads and of the pointer to compute and project a virtual cursor on the screen. Participants stood at one of three depths DS from the screen, with targets displayed at one of three depths DV (Figure 3.2). The display was not head-coupled; it was "movie theatre stereo" with a fixed viewer position assumed, because we wanted no virtual binocular disparity when the target was in the plane of the physical screen (i.e., when DV = DS the binocular depth would be the same as the physical depth, the images for the left and right eyes would thus be identical, and this would not change with head movement).

"Tapping" on targets was performed using the thumb (A) button on a hand-held Nintendo Wii Remote. Participants held the remote with the left hand (the non-pointing hand) to minimize any disturbance caused by clicking.

Figure 3.2: Placement of the targets and the participant in an experimental trial, for each combination of DS ∈ {110, 220, 330} cm and DV ∈ {110, 220, 330} cm. The virtual plane of the targets is green; the physical plane of the screen is black.

Software

The experimental software was written in C# using the Microsoft XNA Game Studio 4.1 library. The WiimoteLib library [33] was used to communicate with the remote and to detect click events. The Vicon motion tracking system, managed by Vicon IQ software, provided the positions of the user's head and the pointer. Our software recorded all click events and the timing for each trial condition. Raw tracking data was captured 60 times a second.

Interaction-plane pointing was used to compute the cursor position on the screen. An imaginary vertical plane containing the user's eye and hand was extrapolated to intersect with the virtual screen corresponding to the binocular depth DV for a trial condition. The cursor was rendered on the physical screen with the appropriate binocular disparity for this depth (Figure 3.3). The imaginary plane was used only by the software: participants did not sight along the hand but instead watched the cursor on the screen, much as one uses a mouse by watching only the cursor, not the hand. The plane of the cursor was always the plane of the target, in keeping with our desire to mimic physical pointing such as might be done with a laser pointer in a classroom.

Figure 3.3: Computing the cursor position from the head and hand-held pointer positions.
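The extrapolation itself is simple ray geometry. A minimal sketch in R (the actual experimental software was written in C#; the names and coordinate frame here are hypothetical): with x horizontal along the screen and z the distance out from the screen plane, the eye-through-hand direction is extended to the target plane at the condition's binocular depth, and the resulting x gives the horizontal cursor position:

    # Horizontal cursor coordinate from eye and hand positions, each given as
    # c(x = ..., z = ...) in cm. plane_z is the z of the target plane
    # (0 when DV = DS; negative for targets perceived behind the screen).
    cursor_x <- function(eye, hand, plane_z) {
      t <- (plane_z - eye["z"]) / (hand["z"] - eye["z"])  # ray parameter
      unname(eye["x"] + t * (hand["x"] - eye["x"]))       # intersection x
    }

    # Example: viewer at 220 cm, hand 50 cm in front of the eye, target plane
    # on the physical screen (plane_z = 0).
    cursor_x(c(x = 0, z = 220), c(x = 12, z = 170), 0)  # 52.8 cm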
Layout, Task, and Stimuli

The experimental task was a serial 1-D tapping task between pairs of targets, modeled closely after Shoemaker et al. [40] and the original Fitts experiments [10]. ISO 9241-9 [3] defines a 2-D task for pointing performance; we used a 1-D task because we were concerned with the fundamental applicability of Fitts's Law. Although the task was 1-D, it is still an example of 3-D pointing because targets were perceived at different depths in a 3-D stereoscopic environment.

Targets were two identical vertical bars of variable width W and fixed height, spaced apart by a variable amplitude A. The cursor was a thin (0.5 cm) vertical line of the same height as the targets. Lighting in the room was fairly dim to reduce any depth cues other than binocular disparity.

Each trial condition was defined by four independent variables: movement amplitude A, target width W, binocular depth of the targets DV, and physical depth DS of the participant from the screen. If DV = DS, there was no virtual binocular disparity from VR (targets were intended to be perceived in the plane of the physical screen, so the images on the screen for the left and right eyes were identical, which we refer to as having no virtual binocular disparity). If DV > DS, targets were perceived behind the physical screen (positive parallax), but if DV < DS, targets were perceived in front of the physical screen (negative parallax). The DV = DS and DV < DS cases are shown in Figures 3.4 and 3.5, respectively.

Figure 3.4: An illustration of the participant performing the pointing task for DV = DS. Targets are in the plane of the screen.

Figure 3.5: An illustration of the participant performing the pointing task for DS > DV. Targets are perceived in front of the screen.

For each target pair, a participant first tapped the start target and then performed a sequence of eight back-and-forth taps between the two targets. The destination target for a tap was highlighted blue and the starting target was grey. The participant was required to correctly tap the first starting target to initiate a trial. After each tap, the destination target turned grey and the former starting target turned blue and became the new destination target.

Times and locations of all nine taps were recorded. When tapped, the active target briefly flashed green to indicate success, or red to indicate an incorrect tap. Participants were not required to correct errors. One target was always directly in front of the participant; the other was to the right, at a distance defined by the current amplitude (A) condition. This was done to avoid a cross-lateral inhibition effect (CIE) [19, 38], the observation that hand movements that cross the midline of the body are more complex than those that do not.

3.1.3 Procedure

Each participant performed the experiment in a single session of approximately 45 minutes. After filling out a consent form and a pre-questionnaire that gathered demographic information, participants were introduced to the apparatus and the pointing task. They were instructed to complete the task as quickly as possible with a goal of ∼95% accuracy. Each completed a practice session of at least five randomly chosen (DV, DS, A, W) combinations (without duplicates) that included all DV and DS values.
Figure 3.4: An illustration of the participant performing the pointing task for DV = DS. Targets are in the plane of the screen.

Figure 3.5: An illustration of the participant performing the pointing task for DS > DV. Targets are perceived in front of the screen.

For each target pair, a participant first tapped the start target and then performed a sequence of eight back-and-forth taps between the two targets. The destination target for a tap was highlighted blue and the starting target was grey. The participant was required to correctly tap the first starting target to initiate a trial. After each tap the destination target turned grey and the former starting target turned blue and became the new destination target.

Times and locations of all nine taps were recorded. When tapped, the active target briefly flashed green to indicate success, or red to indicate an incorrect tap. Participants were not required to correct errors. One target was always directly in front of the participant; the other was to the right, at a distance defined by the current amplitude (A) condition. This was done to avoid a cross-lateral inhibition effect (CIE) [19, 38], the observation that hand movements that cross the midline of the body are more complex than those that do not.

3.1.3 Procedure

Each participant performed the experiment in a single session of approximately 45 minutes. After filling out a consent form and a pre-questionnaire that gathered demographic information, participants were introduced to the apparatus and the pointing task. They were instructed to complete the task as quickly as possible with a goal of ~95% accuracy. Each completed a practice session of at least five randomly chosen (DV, DS, A, W) combinations (without duplicates) that included all DV and DS values. They were invited to practice until comfortable with the system.

At the beginning of the block of trials for each DV and DS pair, a practice trial with one A and W pair was presented to familiarize participants with the new depth condition. These trials were presented in the regular flow of the experiment; a participant was not informed that these were only practice trials. Between each block of the experiment representing a DS value, participants were required to take a break of at least three minutes. They were encouraged to take more time if desired, but very few did.

After all blocks were completed, participants filled out a post-questionnaire of feedback and comments on the experiment. We do not report on this here.

Measures

Pointing performance was measured as the time taken to execute each individual tap action. A trial began when the participant successfully tapped on the start target, and ended with the eighth tap on a destination target. For each tap, the software recorded the movement time MT and the position of the cursor when the click was made. The position Δtap was an offset from the center of the destination target. There were a total of 12,960 observations. As is common practice in pointing evaluation, mistrials and outliers were removed: any click landing more than half the trial amplitude away from the target (likely to be caused by an accidental click) was ignored, and any tap whose movement time was more than three standard deviations from the mean was removed. In total, 312 taps were removed from the dataset.
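As an illustration of the two filtering rules, the sketch below shows how taps could be screened. It is a simplified reconstruction, not the analysis code we actually ran; the Tap struct and function name are hypothetical.

    #include <cmath>
    #include <vector>

    struct Tap { double mt; double offset; double amplitude; };

    // Keep a tap only if it landed within half the trial amplitude of the
    // target and its movement time is within three standard deviations of
    // the mean (meanMT and sdMT computed over the dataset beforehand).
    std::vector<Tap> filterTaps(const std::vector<Tap>& taps,
                                double meanMT, double sdMT)
    {
        std::vector<Tap> kept;
        for (const Tap& t : taps) {
            bool accidental = std::fabs(t.offset) > t.amplitude / 2.0;
            bool outlierMT  = std::fabs(t.mt - meanMT) > 3.0 * sdMT;
            if (!accidental && !outlierMT)
                kept.push_back(t);
        }
        return kept;
    }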
Experimental Design

The within-participants design had independent variables binocular depth of targets DV (110cm, 220cm, 330cm), screen depth from the participant DS (110cm, 220cm, 330cm), target width W (5cm, 10cm, 20cm), and movement amplitude A (25cm, 50cm, 75cm). All the variables were fully crossed:

20 participants (N) × 3 binocular depths (DV) × 3 screen depths (DS) × 3 movement amplitudes (A) × 3 target widths (W) × 8 taps = 20 participants × 648 taps = 12,960 total taps

For a given DS, all three DV conditions were presented consecutively; for a given DV-DS pair, a participant performed trials for all combinations of A and W before switching to another DV-DS condition. We decided against completely mixing up DS and DV to avoid participants having to move back and forth between different locations in the room. Condition order was partially counterbalanced across participants.

Hypotheses

We had five hypotheses, the fourth of which has both a weak and a strong variant. Four of the hypotheses are similar to those of Shoemaker et al. [40], who conducted studies of mid-air pointing where gain was varied. They found that Welford models were better than Fitts models and that experimentally measured k ratios varied linearly with gain. Our hypotheses replace "gain" with "target depth" but are otherwise the same as those of Shoemaker et al. (The four formulations named in the hypotheses are restated after this list.)

H1 One-part formulations (Fitts's Law and the "Shannon" version of Fitts's Law) will not accurately model pointing performance at all target depths.

H2 Two-part formulations (Welford's formulation and the "Shannon" version of Welford's formulation) will accurately model pointing performance at all target depths.

H3 Physical target depth has no significant effect on pointing performance; only binocular depth matters.

H4 (weak) The ratio k = b2/b1 of coefficients in a Welford formulation varies monotonically with target depth.

H4 (strong) The ratio k = b2/b1 of coefficients in a Welford formulation varies linearly with target depth.

H5 Using an angular index of difficulty will not improve the strength of our models of pointing performance.
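For reference, the four formulations being compared take the following standard forms in the pointing literature; Chapter 2 gives the definitive statements, and the equation numbers below are those used there:

    MT = a + b log2(2A/W)                  (Fitts, Eq. 2.1)
    MT = a + b log2(A/W + 1)               (Shannon-Fitts, Eq. 2.2)
    MT = a + b1 log2(A) − b2 log2(W)       (Welford, Eq. 2.3)
    MT = a + b1 log2(A + W) − b2 log2(W)   (Shannon-Welford, Eq. 2.4)

with k = b2/b1 (Eq. 2.5) measuring how far the two-part coefficients depart from the single rate assumed by the one-part models.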
3.2 Results

Analyzing data from individual trials or from individual participants is rare in the literature on pointing performance. We performed both an ANOVA and a linear regression analysis using data averaged over all participants [10, 41]. For each A and W pair in each DV-DS condition, we first averaged within a trial and then the trial averages were averaged over all trials and all participants, resulting in nine values for the nine A-W pairs in each of the nine DV-DS conditions (81 values in total). A Shapiro-Wilk test for normality was conducted on the participant average movement times, which showed a violation of the normality assumption (p < 0.01); this could be due to outliers that we attempted to remove through filtering. Overall average movement times and 95 percent confidence intervals for each condition are included in Appendix A.

3.2.1 ANOVA for Movement Time

Significant main effects of DV, DS, A and W were found. Significant interactions of DV×DS, DV×W, DV×A, DS×A, DS×W, and A×W, and three-way interactions of DV×DS×W and DV×DS×A, were also found. No other interactions were found. This is summarized in Table 3.1. The fact that movement amplitude A and target width W affect pointing performance is fundamental to any discussion of pointing performance, so the effects of A, W and the A×W interaction were not surprising.

Factor         df_effect   df_error      F       p    partial η²
DV               1.292      24.557    14.830   .000     0.438
DS               1.606      30.516    10.695   .001     0.360
A                1.276      24.248   406.567   .000     0.955
W                1.099      20.887   162.062   .000     0.895
DV × DS          2.727      51.814    39.463   .000     0.675
DV × A           3.164      60.108    10.786   .000     0.362
DV × W           2.131      40.482    16.494   .000     0.465
DS × A           3.357      63.790     4.183   .007     0.180
DS × W           2.311      43.914     8.321   .001     0.305
A × W            3.138      59.617    11.152   .000     0.370
DV × DS × A      4.240      80.563     5.113   .001     0.212
DV × DS × W      3.840      72.959     6.464   .000     0.254

Table 3.1: Significant ANOVA main effects and interactions for movement time (MT). No other interactions were found. All tests were adjusted with Greenhouse-Geisser; the degrees of freedom shown are the adjusted values.

We anticipated an effect of DV because varying DV is similar to changing gain for 2-D pointing, and other researchers [2, 20, 28, 40] report that gain affects pointing. The interactions of DV×W and DV×A were not anticipated, but are perhaps not surprising in retrospect. The role of DV in influencing MT is not entirely understood. If it affects pointing performance in a manner similar to gain, it is easy to imagine that increasing gain might cause target amplitude to have less of an effect than width, because it would be easier to cross large distances but harder to make small corrections. Thus the impact of A and W might depend on DV through a change in perceived gain, which could lead to the interactions we observed.

The effect of DS on MT and its interactions with other factors were not anticipated. In an ideal VR setting, the virtual scene perceived by a user should not be affected by the position of the physical screen. Viewed binocularly, the destination target should be perceived to be on the virtual plane at DV, which is independent of DS. However, there are other depth cues at work that may have had enough influence to affect pointing performance.

Figure 3.6: R² values for each target depth using W (top) and We (bottom).

3.2.2 Regressions for One-Part and Two-Part Models

Regressions for the Fitts and Welford models and for the two Shannon variants were performed only for the nine real-world DV = DS conditions. The regression parameters are shown in Table 3.2. To adjust for bias due to accuracy, we performed the regressions using both target width W and effective target width We [57]. We computed We by the recommended approach, using standard deviations of tap positions as in Eq. 2.6.

Fitts and Welford formulations:
                     Fitts                                   Welford                                      F-test
Width  DV=DS     a        b      R²     RSS        a        b1       b2       k      R²     RSS     F-ratio    p    sig?
W      110    542.10   301.72   0.98   11436    535.76   302.54   301.18   0.996   0.98   11432     0.002   .966   no
W      220    565.91   324.56   0.97   29057    698.49   307.20   335.70   1.093   0.97   27150     0.421   .540   no
W      330    526.50   392.04   0.94   86862   1173.19   307.39   446.42   1.452   0.96   41501     6.558   .043   yes
W      all    544.84   339.44   0.91  321031    802.48   305.71   361.10   1.181   0.91  299432     1.731   .201   no
We     110    332.27   456.23   0.87  105935     −7.83   491.62   416.33   0.847   0.87   90976     0.987   .359   no
We     220    424.70   448.92   0.90   89424   1242.19   384.77   575.75   1.496   0.96   32101    10.714   .017   yes
We     330    423.09   555.58   0.80  277761   1902.41   459.28   804.89   1.753   0.92   94268    11.679   .014   yes
We     all    427.43   467.06   0.76  844546    758.11   439.32   515.15   1.173   0.76  810745     1.001   .327   no

Shannon-Fitts and Shannon-Welford formulations:
                  Shannon-Fitts                          Shannon-Welford                                  F-test
Width  DV=DS     a        b      R²     RSS        a        b1       b2       k      R²     RSS     F-ratio    p    sig?
W      110    244.00   378.57   0.99    3417    204.88   384.26   376.80   0.981   0.99    3288     0.235   .645   no
W      220    241.40   408.75   0.98   11931    341.73   394.17   413.28   1.048   0.99   11083     0.459   .523   no
W      330    134.74   493.65   0.95   86862    811.84   395.25   524.20   1.326   0.98   23785    15.912   .007   yes
W      all    206.72   426.99   0.92  272932    452.82   391.23   438.10   1.120   0.93  257626     1.426   .244   no
We     110   −149.50   594.66   0.88  105935   −525.55   640.22   564.96   0.882   0.88   80741     1.872   .220   no
We     220    −41.84   580.31   0.92   76474    800.28   500.56   681.98   1.362   0.98   25437    12.038   .013   yes
We     330   −157      720.76   0.80  280066   1404.69   595.20   935.75   1.572   0.91  103397    10.252   .019   yes
We     all    −61.31   606.07   0.77  822633    277.49   572.04   643.38   1.125   0.77  792872     0.901   .352   no

Shannon-Fitts and Shannon-Welford formulations with angular A and W:
               Shannon-Fitts (angular)                Shannon-Welford (angular)                           F-test
Width  DV=DS     a        b      R²     RSS        a        b1       b2       k      R²     RSS     F-ratio    p    sig?
W      110    403.49   438.58   0.99   11070    426.04   434.30   438.86   1.010   0.99    3292    14.176   .009   yes
W      220    412.88   478.99   0.99   26666    509.97   456.79   613.00   1.705   0.99    7175    16.298   .007   yes
W      330    341.72   580.28   0.96   40996    793.11   460.31   788.33   1.632   0.98   18396     7.371   .035   yes
W      all    389.25   497.20   0.92  289693    795.79   407.10   506.67   1.245   0.96  127298    30.617   .000   yes

Table 3.2: Modeling movement time (ms) using Fitts and Welford formulations (top) and Shannon-Fitts and Shannon-Welford formulations (middle) for actual width W and effective width We, for each DV = DS condition (no added virtual binocular disparity) and for all DV = DS conditions combined. Shannon-Fitts and Shannon-Welford models computed using angular A and W are also reported (bottom). The coefficients and adjusted R² for each model are shown, with the F-ratios, p-values and significance from an F(1,9)-test (Eq. 2.9 with p1 = 2, p2 = 3, and n = 9) in the last three columns comparing the nested Fitts and Welford (or Shannon-Fitts and Shannon-Welford) models.
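Eq. 2.9 is the standard F-test for comparing nested regression models; we restate its general form here for convenience. With RSS1 and p1 the residual sum of squares and parameter count of the one-part model, RSS2 and p2 those of the two-part model, and n data points,

    F = ((RSS1 − RSS2) / (p2 − p1)) / (RSS2 / (n − p2))

A large F indicates that the extra parameter of the two-part model buys a genuine reduction in residual error rather than merely fitting noise.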
We performed F-tests as in Eq. 2.9 to compare pairs of nested models, to determine whether the two-part models actually explained the data better. Two-part models were found to characterize the data significantly better than one-part models for target depth 330cm using actual width, and for both 220cm and 330cm using effective width. There was no significant difference between one-part and two-part models for the other conditions. The R² values for effective width were generally lower than for actual width, but greater than 0.8. This is somewhat abnormal, but not especially worrisome, as the models are still reasonable; we comment on this in the Discussion. The variation of the R² values for DV = DS is shown in Figure 3.6.

We also performed regressions for the Fitts and Welford models using A and W measured in angular space, in a manner similar to Kopper et al. [23]. Targets were directly in front of and offset to the right of a participant, so angular width was slightly different for the two target positions. To account for this we averaged the two values. Regression coefficients changed in many cases, but the trend of reduced R² values at higher target depths was replicated, and the R² values obtained from the angular models were similar to those for the corresponding classic models. Two-part models were again sometimes significantly better than one-part models. Table 3.2 (bottom) shows regressions for just the Shannon-Fitts and Shannon-Welford models using angular W, where two-part models were always better than one-part models.

3.2.3 The Ratio k as a Function of DV

Model             Width Used   Intercept      Slope      Adjusted R²
Welford           W             0.7728     1.728×10⁻¹       0.901
Shannon-Welford   W             0.7234     2.284×10⁻¹       0.890
Welford           We            0.4596     4.528×10⁻¹       0.941
Shannon-Welford   We            0.5826     3.449×10⁻¹       0.951

Table 3.3: Linear regression coefficients indicating how k varies as binocular depth DV (cm) changes, for each of the four models.

To test the hypothesis that k values varied at least monotonically, and perhaps linearly, with target depth, we performed a linear regression analysis on k computed by the Welford and Shannon-Welford models using both actual target width W and effective target width We. The results are presented in Table 3.3 and plotted in Figure 3.7. The k values vary linearly (R² > 0.85) with target depth DV using width W. Using effective width We, the linear trend in k is somewhat more pronounced (R² > 0.90).

Figure 3.7: k values calculated using the Welford formulation (a) for A and W, (b) for A and We.
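To illustrate how these k values derive from Table 3.2 via Eq. 2.5: for the Welford fit at DV = 330cm using actual width W,

    k = b2 / b1 = 446.42 / 307.39 ≈ 1.45

so at that depth target width influenced movement time roughly 45% more strongly, per bit of index of difficulty, than movement amplitude did.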
3.3 Discussion

We summarize the results according to our hypotheses, and then discuss each in more depth.

H1 One-part formulations (Fitts's Law and the "Shannon" version of Fitts's Law) will not accurately model pointing performance at all target depths. Consistent with data.

H2 Two-part formulations (Welford's formulation and the "Shannon" version of Welford's formulation) will accurately model pointing performance at all target depths. Somewhat consistent with data.

H3 Physical target depth has no effect on performance; only binocular depth matters. Not consistent with data.

H4a The ratio k = b2/b1 between the two linear coefficients in a Welford formulation varies monotonically with target depth. Consistent with data.

H4b The ratio k = b2/b1 between the two linear coefficients in a Welford formulation varies linearly with target depth. Somewhat consistent with data.

H5 Using an angular index of difficulty will not improve the strength of our models of pointing performance. Consistent with data.

3.3.1 A Closer Look at the Data

Movement time data are shown in Figure 3.8. The graphs in the first column present a scatterplot for different binocular depths DV, with lines connecting points of the same movement amplitude A. The graphs in the second column present exactly the same data, but with lines connecting points of the same target width W.

Figure 3.8 reveals a pattern similar to one shown by Welford (Figure 5.8, page 158 [53]). Movement time increases roughly linearly with ID within either a fixed A value or a fixed W value, but not across changes in both. This separable effect of A and W grows with increases in DV. A similar effect was found by Shoemaker et al. [40] in a reänalysis of other researchers' data [2, 12] and of their own data. Both Graham and Shoemaker et al. concluded that Welford's two-part formulation is necessary to account for this pattern and to accurately model movement time when gain varies. The applicability of a two-part formulation is further supported by Graham's analysis of hand velocity and acceleration profiles during pointing, which revealed separable effects of A and W during different temporal segments of target acquisition.

Looking at our data, we see that one-part models perform poorly at modeling the movement time data for DV = 330cm and DV = 220cm, but two-part models do better. The k values from Eq. 2.5 serve to quantify the separability of the contributions of A and W to movement time. From Figure 3.8, we expect k to be close to unity for a binocular depth DV of 110cm and to increase as DV increases, as borne out by Figure 3.7.

3.3.2 One-Part and Two-Part Models for Pointing

There is no universally accepted threshold for a "good" R² value to determine whether a formulation accurately models pointing performance. MacKenzie's suggestion [27] of R² ≥ 0.90 as a guideline when evaluating Fitts's Law results has been used in the literature, and we employ this as our threshold.

Hypothesis H1 was consistent with our data. One-part models (Fitts and Shannon-Fitts) were successful in characterizing movement time using actual target width W. The Fitts formulation produced fits ranging from R² = 0.94 to R² = 0.98 at different target depths. The Shannon-Fitts formulation was slightly better at modeling the data, with fits ranging from R² = 0.95 to R² = 0.99. These R² values beat MacKenzie's 0.90 threshold. However, for both formulations it is clear that R² values decreased as DV increased (Figure 3.6).

Using effective width We, the one-part models were less successful. For the Fitts formulation, the fit ranged from R² = 0.80 to R² = 0.90. Using MacKenzie's 0.90 threshold, the Fitts formulation produced only one barely acceptable model, for one target depth, and was worst for the largest DV. The Shannon-Fitts formulation fared slightly better, with fits ranging from R² = 0.80 to R² = 0.92. Again, Shannon-Fitts produced an acceptable fit for only one target depth, and the DV = 330 condition was the worst.

Effective width is sometimes more relevant because We can be a more accurate representation of the task [41]. One reason for the poorer fit of one-part models using effective width We is apparent: examining the k values from the two-part models (Figure 3.7) shows that k deviated more from a value of one when using We than it did using W. The more k deviates from unity, the worse a one-part model will be at describing the data.
R² values were lower in general using effective width. We argue that this is likely due to noise and outliers in the data that are particularly common with mid-air pointing: hand tremors and input noise happen more frequently than with other pointing techniques (such as a mouse, where the hand rests securely on a desk). We saw a few cases where movement time was normal (so the point was not removed by outlier filtering) but the selection was made very inaccurately. This has no impact on actual-width models, which do not take accuracy into account, but it would degrade effective-width models.

For both W and We the quality of fit decreases with increasing DV. Using We, the one-part models had unacceptable R² values for all but one DV. Both two-part models (Welford and Shannon-Welford) produced a consistently good fit at every target depth DV using actual target width W. Regression fits for Welford ranged from R² = 0.96 to R² = 0.98. For Shannon-Welford they ranged from R² = 0.98 to R² = 0.99. Using effective width We, the regression fits ranged from R² = 0.87 to R² = 0.96 for Welford and from R² = 0.88 to R² = 0.98 for Shannon-Welford.

Hypothesis H2 was somewhat consistent with our data. An F-test determined that two-part models described the data significantly better than the corresponding one-part models for DV = 330cm using both W and We. They also outperformed one-part models for DV = 220cm using We (Table 3.2). For two-part models, only the DV = 110 condition using We did not meet the 0.9 threshold, and then only by a small margin.

3.3.3 The k Parameter and Target Depth

Hypothesis H3 was not consistent with our data. Most of our analysis was done only on the real-world DV = DS conditions. When we analyzed all DV-DS conditions, including those with DV ≠ DS, as shown in Figure 3.9, the variation of k depended both on physical depth and on binocular or perceived depth. It may be that this happens when the VR environment is not well calibrated, or it may be a fundamental flaw in stereoscopic 3-D. Although we did a rather careful calibration to reduce the visual distortion of our system, perhaps it was not good enough; in a perfectly calibrated system target depth might not matter. We explain later how to use the k-lines in Figure 3.9 to possibly improve calibration.
Using k = 1.47 (an average obtained fromour analyses) instead of k = 3 in Eq. 2.8, R2 values were closer to those for ourangular Welford model, but were uniformly not as good. Angular measures in-corporate DV in the model, so Table 3.2 presents models for the “all“ condition(exactly the DV = DS conditions). This best case test shows a minor 0.04 R2 im-provement for two-part models, but none for one-part models. This suggests thatany improvement Kopper et al. saw for their model may be due mostly to its math-ematical similarity to Welford’s formulation. At least for small angles, measuringamplitude and width as angles acts like a simple scaling operation; it keeps thefundamental model the same, but changes some of the constants, which, as Shoe-maker et al. [40] discuss at length, does not help explain the behaviors shown inFigure 3.8. Kopper et al.’s choice of k = 3 seems arbitrary. Shoemaker et al. [40]report values as low as k = 0.3.333.4 Conclusions and Future WorkIn this chapter we examined pointing performance using stereoscopic 3-D whenperceived depth to the target (binocular depth) and actual target depth varied. Fourresearch contributions were reported: (1) Welford-like two-part models outper-formed Fitts-like one-part models by more robustly characterizing pointing perfor-mance across varying target depths; (2) angular measures for movement amplitudeand target width did not improve model strength for our data; (3) pointing perfor-mance was affected by binocular depth and the k-ratio of b2/b1 that characterizesthis appeared to vary linearly with binocular depth but there was a confound withtarget depth that is unexplained; and (4) task-specific calibration for VR environ-ments could be based on Welford-like pointing models.3.4.1 Welford Models vs. Fitts ModelsTraditionally, Fitts’s Law has been the model of choice for pointing performance.Similar to findings by Welford [53], Graham and MacKenzie [13], and Shoemakeret al. [40], our data show that contributions to movement time of movement am-plitude A and target width W are separable and two-part models such as Eq. 2.3and Eq. 2.4 are better than one-part models such as Eq. 2.1 and Eq. 2.2 for sometarget depths and are always at least as good at any target depth. Using F-tests,we found in some cases there were significant differences between the two typesof models. Two-part models cannot be simplified to one-part models by introduc-ing scale factors for A and W (Shoemaker et al. [40] provide a detailed discussionof this). Our data suggest that simply measuring movement amplitude and targetwidth as angles does not improve modeling strength. These findings add to an un-derstanding of how to best model pointing in VR environments and they providefurther support for the use of Welford-like models when Fitts-like models do notprovide adequate power.3.4.2 Lack of Improvement Using Angular ModelsOur opinion is that classic two-part models are superior to Kopper et al.’s one-partangular model, but if such a model is used, k should be variable, which makes ita two-part model. We see no justification for squaring the logarithmic term, espe-34cially because it goes against the information-theoretic interpretation that is fun-damental to all variants of Fitts’s Law. Our experiment used a fairly large screen,such as might be in a classroom, and the depths we chose were typical of where alecturer might stand relative to a screen, so the conditions were fair representationsof at least some types of real-world usage. 
Nevertheless, we cannot rule out that for large movement amplitudes angular models might be better.

3.4.3 Calibrating VR Systems Using k-values

Ellis [4] defines virtualization as "the process by which a human viewer interprets a patterned sensory impression to be an extended object in an environment other than that in which it physically exists." Depending on the application, the requirements for a VR system might differ. In training applications, it is often desirable that the VR system evoke responses and task performance nearly identical to those in the real world.

There are depth cues other than binocular disparity. Physical target depth in a VR system is often fixed, and rarely the same as the binocular depth of objects being viewed. Research by Teather and various co-authors investigated how stereoscopic 3-D impacts pointing performance for a variety of interaction techniques [44-48, 50]. They noted problems modeling pointing performance with accepted Fitts's Law methods, specifically for "ray casting" techniques similar to the mid-air pointing techniques we investigated [43]. Hypothesis H3 explored whether this might affect pointing performance in VR. Our data, though preliminary, suggest it does. Our findings might lead to calibration techniques that could ameliorate this.

Jones, Lee, Holliman and Ezra describe calibrating the camera space against a user space based on user-reported depth values [17]. Yuan, Pan and Daly perform depth tuning of stereoscopic 3-D content based on models of human visual comfort [55]. Iyer, Chari and Kannan describe a method for determining parallax values for stereoscopic viewing that minimize visual artifacts in 3-D television [15]. These approaches are all based on stereoscopic perception models, either derived from the literature or from user-reported values, and they attempt to reduce distortions associated with binocular viewing. If we want to produce task performance identical to that in real-world tasks, not simply reduce perceived distortion, we may have to take task performance explicitly into account during calibration.

Our finding that pointing performance is not independent of physical target depth DS suggests that for a given target depth we might need to adjust binocular depth if we want to induce pointing performance that matches real-world pointing. Figure 3.9 shows measured k-values for different target depths. It also shows the k-line for the conditions where DV = DS, which has a distinctive slope different from two of the others. We can treat the viewing parameters for VR as a "black box" and use measured k-values as a compensation table. If DV is the desired binocular depth, the DV = DS line gives the k-value expected for real-world pointing performance. Using the k-line for the physical depth of the VR environment we can find that k-value, determine the D′V corresponding to it, and then use that binocular depth (instead of the intended binocular depth) to invoke the performance we want. An optimization algorithm could thus "tune" its parameters so that measured pointing performance is close to measured performance in a physical environment. This no doubt requires additional pointing performance data. Our first experiment had only nine points, but with sufficient data, intermediate target depths could be interpolated to achieve the desired corrections at a suitable granularity. This is definitely an area for further investigation.
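If the k-lines are modeled linearly (k = slope × depth + intercept, with separate coefficients for the DV = DS line and for each physical screen depth), the compensation step just described reduces to evaluating one line and inverting the other. The sketch below illustrates this; the KLine struct and the function are hypothetical, and the coefficients would come from fits such as those in Table 3.3, not from values shown here.

    // A k-line modeled as k = slope * depth + intercept.
    struct KLine { double slope, intercept; };

    // Given the binocular depth we want to simulate, return the binocular
    // depth D'V the software should actually render so that pointing
    // performance (as summarized by k) matches real-world performance.
    double compensatedDepth(double desiredDv,
                            const KLine& realWorld,   // fit to DV = DS data
                            const KLine& screenLine)  // fit for the actual DS
    {
        // Step 1: k expected for real-world pointing at the desired depth.
        double k = realWorld.slope * desiredDv + realWorld.intercept;
        // Step 2: invert the screen's k-line to find the depth producing k.
        return (k - screenLine.intercept) / screenLine.slope;
    }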
3.4.4 The Effect of Binocular Depth on Pointing Performance

Our data revealed that the relative magnitudes of the contributions of A and W can be captured by a single parameter k that seems to vary with target depth. This extends earlier findings by Shoemaker et al. [40], who found that k varied linearly with gain for similar mid-air pointing in non-VR environments. We offer a possible explanation for why target depth matters by appealing to the information-theoretic interpretation of Fitts-like pointing models, in which the coefficient of the index of difficulty is the rate at which the sensorimotor system processes information. Two-part models such as Eq. 2.3 and Eq. 2.4 have separate coefficients for the A-term and for the W-term. These might be interpreted as the rates at which different components of the sensorimotor system process information, the first for the ballistic stage of movement and the second for the homing stage. However, they might also simply reflect an information-theoretic measure of uncertainty in the initial and final positions. Our data, and previous findings by Welford [53], Graham and MacKenzie [13], and Shoemaker et al. [40], all show that the two rates differ in some circumstances. For target depth, we might imagine that the perceived depth to the target has a different influence on the homing stage than on the ballistic movement stage. This is definitely a question to be examined in future research.

There is also an important point to be investigated in future work. We have shown that k varies linearly with target depth in a computer-mediated application (where the computer processes input and provides feedback to the user), but can we use this to conclude that this k variation is a true aspect of human sensorimotor performance? One possible explanation for this effect (or at least its extremity) is system latency. In any computer system there is a small lag between when users perform a physical action and when the computer can react and respond to it. This has been shown to reduce pointing performance, and to become more problematic the larger the lag [31]. It seems possible that such latency would make fine adjustments in the homing stage much more difficult while having little impact on rough ballistic motion. Perhaps the variation of k is made more extreme by the presence of latency than we would normally see in natural pointing. We investigate this in Chapter 5.

Figure 3.8: Scatterplot of MT vs. effective ID. Points are identical in the right and left plots. Points are connected in two different ways to illustrate the separability of A and W: lines on the left connect points representing tasks with the same movement amplitude A; lines on the right connect points representing tasks with the same target width W.

Figure 3.9: The three k-lines for DS = 110 (red / squares), 220 (green / triangles) and 330 (purple / circles), along with the k-line for the non-VR DV = DS conditions (blue / diamonds). To calibrate binocular depth using k-values, the desired binocular depth (A) determines a k-value (B) on the blue DV = DS line. That same k-value on the red DS = 110 line (C) determines D′V to be the corresponding binocular depth (D) that the software should use to ensure the desired pointing performance if the screen is 110cm from the viewer.

Chapter 4

Incorporating Computer Vision for Latency-Free Analysis of Pointing Data

4.1 Introduction

One of the potential concerns with much of our work has been the effect of latency on human pointing performance.
By necessity, any computer-based pointing system has a delay between when a physical movement is completed and when the system finishes processing and is able to render a response to the display. A number of papers have investigated the relationship between latency and pointing performance and shown that latency has a performance cost that increases as the latency gets larger [31, 49].

One possible explanation for the variance of k as gain or depth is changed is that the impact of latency becomes increasingly problematic as the user moves further back from the screen or as gain is made higher. We do not agree with this conclusion and hypothesize that k is reflective of the underlying physical sensorimotor processes. We therefore decided to investigate whether this k variation remains clear when there is no significant latency between physical movement and response. Latency-free evaluations of human motor performance are rare in HCI (where a computer is expected to be in the loop), but more common in psychology and kinesiology. In particular, Fitts's original study employed metal plates and conductive pens for pointing on a desktop, but to our knowledge no such test has been extended to distal pointing [9].

We think this is likely because performing such a test in a cost-effective manner has only recently become feasible, through computer vision techniques. The metal plates in Fitts's experiment could be easily connected to a circuit that records when a plate was tapped (a circuit closed through the conductive pen). Distal pointing, however, requires some form of laser pointer and visual analysis. Video of participant interactions can be analyzed by the experimenter afterward to obtain rough movement time results, but this would require demanding frame-by-frame analysis of potentially dozens of hours of video. Furthermore, accurate physical measurements in the same units as the target sizes would be impossible, since all that can be read off the video is final pixel distances, which are always subject to noise and distortion. This naturally leads to the results of such analysis being biased and less trustworthy than intended.

We decided to tackle this problem and developed a computer vision system that processes participant video after the fact and obtains cost-effective, accurate, real-world measurements of pointing performance. The next chapter describes the actual procedure and results of this experiment. However, we believe that the computer vision tools and techniques we employed can be broadly useful to other HCI practitioners for cost-effective supplementary analysis. This chapter is intended both as documentation of our technique and as a primer on employing simple computer vision techniques that we believe can and should be employed in future HCI lab studies. The sub-steps and operations employed are all implemented by the standard computer vision library, OpenCV. OpenCV is licensed for free academic and commercial use, and we highly recommend it for research use: http://opencv.org/

4.2 Room Setup and Camera Placement

One of the key concerns in computer vision is, of course, what can the camera see? The camera aperture can only capture so much of what is in front of it, so it is important that the entirety of the interactable area is within view. Furthermore, participants can move, and it is possible for them to occlude an interaction. No matter how complex a computer vision algorithm is, it cannot accurately recover data that was occluded and therefore never recorded.
Triangulation from multiple viewpoints can be used to overcome occlusion, but that at least triples the data recorded and requires a complex system to combine the results from multiple cameras. These are fundamental limitations of computer vision, and they are still being researched in the field.

It is therefore not likely that any solitary HCI researcher will have the expertise and skills to solve these problems, and they can become major stumbling blocks to computer vision techniques being employed in field work or in-context user studies. However, lab studies are uniquely poised to take advantage of computer vision techniques because they are by design a highly controlled environment. It is common practice for lighting conditions to be tightly controlled, participant positions to be somewhat constrained, and the range of interaction space limited. Furthermore, the experimenter controls the display technology and the room setup used to perform the study.

In our study, we specifically took advantage of the fact that the experiment was conducted on a rear-projected large screen display. An example of the camera placement is shown in Figure 4.1. This allowed us to entirely eliminate the problem of participant occlusion. Simply put, participants stood on one side of the display and pointed the laser pointer at the screen. The laser light scatters when it hits the translucent glass and appears as a bright red dot to cameras situated in the projection room behind the screen. Regardless of where participants move in the experiment space, the camera can always see the positions of the targets and the laser pointer. Targets were also presented at fixed positions with fixed sizes set out beforehand. A stable, high-quality tripod was positioned behind the screen and used to adjust the camera placement until it could see a good area around the largest targets positioned as far away from each other as possible.

Figure 4.1: Example view of what is seen by the camera placed behind the screen. The green bars are the targets used in the pointing experiment. The blue box is used for camera calibration.

4.2.1 Synchronizing Video with Experiment Conditions

Assuming the camera placement problem is solved and a computer vision system records pointing data, that data still needs to be related to the experiment. Video timestamps are usually recorded only with reference to the start of the video, so it is a non-trivial problem to decide which experiment condition a given frame of data belongs to.

Theoretically, the most reliable way would be to position some shape (e.g., a condition identifier) in an out-of-the-way place on the screen. As each frame of data is read in, the vision system finds and identifies that shape and uses it to decide which condition the data is for. However, while this is an easy task for a human, it is computationally complex. It would likely involve implementing some fairly involved shape analysis such as scale-invariant feature analysis (for an introduction to such a technique, see http://docs.opencv.org/3.1.0/da/df5/tutorial_py_sift_intro.html#gsc.tab=0). This would greatly slow down the analysis and may also be subject to noise and mistakes; such algorithms at best produce machine-learning estimates, which is undesirable for a tightly controlled lab study. Furthermore, a visible condition identifier may be something participants can decode to understand which conditions they are performing. This may be undesirable in experiments where participant knowledge could skew results.
This may be undesirable43in experiments where participant knowledge could skew results.We found that the most effective and simple method was to have the exper-iment application and vision systems synchronize. The experiment applicationwould display a text box with a welcome message before the experiment started.When a button was clicked to start the experiment, internal logging registered thetimestamp of that click as the start of the experiment and removes the welcomemessage. The system then recorded a simple text file of the timestamps each in-dividual condition was started relative to that text disappearing. When processingthe video we would manually review the start of the video to find when the wel-come text disappeared. That frame was used as timestep 0 for the analysis and thenthe relative timestamps from the text file were used to decide what condition eachadditional frame of data was for. This method may be subject to some amount ofdrift, but this was not noticeably problematic for a short 30 minute to one hour labstudy.4.3 Designing a Vision System to Record Pointing DataNow we come to actually implementing our computer vision system. At a highlevel, the pointing experiment we want to analyze requires the time each movementtakes and how far off center the selection was. Thus, the data our vision systemneeds to provide us with is as follows:• The position of the cursor in any frame• The shape and central position of the targets• The time between the cursor “selecting” each target• The offset (distance) between the cursor and the center of the target to beselected in centimeters (the scale used to model effective width)The rest of the sections in this chapter will discuss some of the relevant theo-retical reasons for using the approach we used, and provide key samples of codefor implementing similar systems yourself.444.3.1 Finding Moving Objects - Cursor PositionFinding a small moving target like the laser pointer is at once both the simplest op-eration of computer vision and one we found required the most fine tuning. Mostedge detection or object recognition algorithms are designed to minimize the ef-fect of camera noise. Noise can come from many sources but essentially makesthe image look grainy with lots of small bright or dark spots. Even in a high res-olution image the laser dot is by definition a small bright spot a few pixels across.These efforts to remove noise often remove the object you want to find entirely!Furthermore, feature detection algorithms often rely on smooth shading/texturingthat comes from natural light interacting with a real object. Somewhat counterin-tuitively, we found a very simple “thresholding” operation was more applicable forour needs.The high level idea behind thresholding is very straightforward, you decidewhat it is you are looking for (e.g. “find the bright red dot”) and then decide whatminimum and maximum colors would fit that criteria (through inspection with acolor meter). OpenCV will go through each pixel and return whether or not thatpixel meets the criterion. Then some relatively simple measures of central ten-dency tell you the X,Y position of the centre of mass for pixels that pass yourthreshold. There are however a number of complications to be aware of if imple-menting something similar.Images read in from a video file by OpenCV are treated as two dimensional ma-trices of BGR (amount of blue, green and red light) pixel values. This is extremelydifficult to threshold, since end color is a complex combination of the three chan-nels. 
4.3 Designing a Vision System to Record Pointing Data

Now we come to actually implementing our computer vision system. At a high level, the pointing experiment we want to analyze requires the time each movement takes and how far off center each selection was. Thus, the data our vision system needs to provide is as follows:

• The position of the cursor in any frame
• The shape and central position of the targets
• The time between the cursor "selecting" each target
• The offset (distance) between the cursor and the center of the target to be selected, in centimeters (the scale used to model effective width)

The rest of the sections in this chapter discuss some of the relevant theoretical reasons for the approach we used and provide key samples of code for implementing similar systems yourself.

4.3.1 Finding Moving Objects - Cursor Position

Finding a small moving target like the laser pointer is at once both the simplest operation in computer vision and the one we found required the most fine tuning. Most edge detection or object recognition algorithms are designed to minimize the effect of camera noise. Noise can come from many sources but essentially makes the image look grainy, with lots of small bright or dark spots. Even in a high-resolution image, the laser dot is by definition a small bright spot a few pixels across, so these efforts to remove noise often remove the very object you want to find! Furthermore, feature detection algorithms often rely on the smooth shading and texturing that comes from natural light interacting with a real object. Somewhat counterintuitively, we found a very simple "thresholding" operation was more applicable for our needs.

The high-level idea behind thresholding is very straightforward: you decide what you are looking for (e.g., "find the bright red dot") and then decide what minimum and maximum colors would fit that criterion (through inspection with a color meter). OpenCV goes through each pixel and returns whether or not that pixel meets the criterion. Then some relatively simple measures of central tendency give the X,Y position of the centre of mass for pixels that pass the threshold. There are, however, a number of complications to be aware of when implementing something similar.

Images read in from a video file by OpenCV are treated as two-dimensional matrices of BGR (amount of blue, green and red light) pixel values. This format is extremely difficult to threshold, since the final color is a complex combination of the three channels. You might set a threshold looking for something with high red, but a dull red might not pass. Furthermore, the laser pointer color blends with the target color as it passes over the targets, which causes the color to shift and BGR thresholds to fail. Thus it is standard practice to perform thresholding operations after converting the image to HSV format (hue: the color of the pixel; saturation: how much of the color there is; value: how bright that color is). HSV is much more robust to changes in lighting conditions and shading, and easier to formulate thresholds for. The following line of OpenCV code takes the image in the variable frame and puts an HSV version of it into the variable hsvFrame.

    cvtColor(frame, hsvFrame, COLOR_BGR2HSV);

Figure 4.2: Starting image for finding the pointer and targets.

Next we perform the actual thresholding operation. We have predefined the array hsvPointer with the target threshold for this operation: the first and second elements are the minimum and maximum hues (color), the third and fourth are the min/max of saturation, and so on. After the call, thresholdedFrame contains a two-dimensional array of booleans where each cell represents whether the corresponding pixel passes the threshold.

    // Lower bound (minH, minS, minV) and upper bound (maxH, maxS, maxV).
    inRange(hsvFrame,
            Scalar(hsvPointer[0], hsvPointer[2], hsvPointer[4]),
            Scalar(hsvPointer[1], hsvPointer[3], hsvPointer[5]),
            thresholdedFrame);

Figure 4.3: Example result of the output from thresholding for the pointer. White spots are within the threshold and thus likely within the pointer.

We also found that occasionally noise became a problem, and random pixels would pass the threshold and throw off the estimate of position. A median filter of size 3 was used to remove noise. In lay terms, the following code replaces each pixel with the median of its immediate neighbors, so if a noise spike causes one or two isolated pixels to pass, they are completely removed from the result. Median filters are a fantastic way to remove noise, but you have to be careful to use a small filter size (three is the smallest possible size). Bigger filters cause the image to blur; this removes larger passed areas and causes the laser dot to disappear very quickly.

    medianBlur(thresholdedFrame, pointerFrame, 3);

Figure 4.4: Example of a median filter operation. Image taken from http://tinyurl.com/j26e7cf

The last step in finding the laser dot is to find the central tendency. This is implemented by the moments module of OpenCV; the code snippet is shown below. The spatial moments m10 and m01 sum the x and y coordinates of the passing pixels weighted by their values, and m00 is the total mass, so the ratios give the centroid.

    Moments oMoments = moments(pointerFrame);
    double dM01 = oMoments.m01;   // sum of y coordinates (weighted)
    double dM10 = oMoments.m10;   // sum of x coordinates (weighted)
    double dArea = oMoments.m00;  // total mass of passing pixels
    int posX = dM10 / dArea;
    int posY = dM01 / dArea;

posX and posY now contain the position of the cursor.
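Putting the pieces together, a complete cursor-finding routine might look like the following. This is a sketch in the spirit of the snippets above rather than our exact code; the function name and the use of cv::Point2i are our own.

    #include <opencv2/opencv.hpp>

    // Locate the laser dot in a BGR video frame using the HSV threshold
    // bounds in hsv[6] = {minH, maxH, minS, maxS, minV, maxV}.
    // Returns (-1,-1) when no pixel passes the threshold.
    cv::Point2i findLaserDot(const cv::Mat& frame, const int hsv[6])
    {
        cv::Mat hsvFrame, mask, clean;
        cv::cvtColor(frame, hsvFrame, cv::COLOR_BGR2HSV);
        cv::inRange(hsvFrame,
                    cv::Scalar(hsv[0], hsv[2], hsv[4]),
                    cv::Scalar(hsv[1], hsv[3], hsv[5]), mask);
        cv::medianBlur(mask, clean, 3);   // suppress single-pixel noise

        cv::Moments m = cv::moments(clean);
        if (m.m00 == 0) return cv::Point2i(-1, -1);
        return cv::Point2i(static_cast<int>(m.m10 / m.m00),
                           static_cast<int>(m.m01 / m.m00));
    }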
4.3.2 Finding Stationary Objects - Target Position

The next task in our vision system is finding the position and extent of the targets to select. In order to know whether the laser position is inside a target, we need to know where the edges and center of the targets are. At its heart this is accomplished by a thresholding operation similar to the laser pointer detection, but there are a number of complications and some further processing to handle.

The first interesting thing to note is that, unlike the laser pointer, the targets are produced by computer projectors refreshing at 60Hz. There tends to be some flicker and blank space between projector refreshes, so the exact edges of the targets can be a little blurry and noisy. Also, we note that the camera and targets are stationary within a condition, which means we do not have to find the targets in every frame: if we can get a robust estimate of their position in the first few frames, we can save a huge amount of computation for the rest of the condition.

We employ an averaging method to solve both of these problems. Essentially, at the start of every condition we threshold the image through the same process as for the laser pointer and maintain a count of how many times each pixel has passed the threshold. After a sufficient number of frames, we apply an additional threshold to discard pixels that passed only a few times. This lets pixels around the edge that are slightly noisy, but still pass the threshold often, be considered part of the target itself. The edges of these targets are still very jagged, so some image processing is used to smooth them out: a series of erosions (which cut away the jagged pixels on the edges) and dilations (which extend the edges back out to keep object size consistent).

Figure 4.5: View of the averaged thresholded result; edges are noisy, but the rough shape is there.

    // Erode to remove edge noise, dilate twice to restore (and slightly
    // round) the shape, then erode back to the original scale.
    erode(targetFrame, targetFrame, getStructuringElement(MORPH_ELLIPSE, Size(5, 5)));
    dilate(targetFrame, targetFrame, getStructuringElement(MORPH_ELLIPSE, Size(5, 5)));
    dilate(targetFrame, targetFrame, getStructuringElement(MORPH_ELLIPSE, Size(5, 5)));
    erode(targetFrame, targetFrame, getStructuringElement(MORPH_ELLIPSE, Size(5, 5)));

Now that we have a smooth greyscale image of which pixels contain the targets, we need to decide exactly where the edges and centers of the targets are. This is implemented by the OpenCV function findContours. The following snippet puts a list of every contour into the variable contours. Most of the parameters can be left as shown, but note the parameter CV_CHAIN_APPROX_NONE, which makes the system store every point in the contour rather than compressing them.

    findContours(*imageFrame, contours, hierarchy, CV_RETR_LIST,
                 CV_CHAIN_APPROX_NONE, Point(0, 0));

Figure 4.6: Example result for the target position after thresholding and processing the averaged passing pixels.

Figure 4.7: Example contours of the targets overlaid on the original example image. Circles are positioned at the corners.

The last step is to iterate through the points within each contour and find the maximum/minimum X and Y values to get the corners of the targets; the center is simply the average of these. Note that it is important to sort the contours by size in order to discard any noisy small contours.

    for (int i = 0; i < contourInd.size(); i++)
    {
        // contourInd holds contour indices pre-sorted by contour size.
        int ci = contourInd[i];
        // Track the bounding extremes of every point in this contour.
        // (cv::boundingRect(contours[ci]) computes the same rectangle.)
        for (int j = 0; j < contours[ci].size(); j++)
        {
            Point temp = contours[ci][j];
            minX = min(temp.x, minX);
            maxX = max(temp.x, maxX);
            minY = min(temp.y, minY);
            maxY = max(temp.y, maxY);
        }
    }
4.3.3 Moving from Pixels to the Real World - Offset

The last major piece of information the vision system needs to provide is the offset of the cursor from the center of the target. At this point it would be trivial to simply subtract the two pixel positions, but that would only give results in camera pixel coordinates. The target sizes are all recorded in centimeters, so comparing a pixel offset to physical target sizes would be an unfair comparison at best. Pixel coordinates are also subject to perspective and intrinsic camera distortion, and thus cannot be trivially converted to the physical coordinates required for accurate modeling. This is a well-understood problem in computer vision, and this section provides a brief description of our method for solving it.

The first step in this process is commonly called "camera calibration". Cameras, especially cheap ones, are subject to mechanical inconsistencies, lens distortions and perspective projection. We can mathematically characterize these with two matrices, which can be inverted to correct the errors [58]. The "intrinsic" parameters of the camera refer to internal properties that are consistent for the same camera regardless of where it is positioned or what it is looking at; for example, radial distortion from a fish-eye lens. The "extrinsic" parameters refer to the camera's positional properties relative to objects in the scene (e.g., rotation and translation). Put together, these two matrices form a camera "homography", which allows a pixel location to be converted into a physical position relative to an object of known size and orientation [58].

Finding the intrinsics and extrinsics of a camera is a common problem, usually solved by waving a grid of known size in front of the camera. We refer readers to the standard OpenCV tutorial for this procedure; an explanation and full code listing for calibrating a camera can be found at http://docs.opencv.org/2.4/doc/tutorials/calib3d/camera_calibration/camera_calibration.html

Figure 4.8: Example of the calibration procedure through a grid wave. A regular grid of known size is pictured from various angles; an optimization procedure then estimates the intrinsic distortions of the camera.

We used this as the basis for our camera calibration procedure, but made some additions and changes. For one, this process is extremely computationally expensive (it solves a minimization over dozens of images) and would be infeasible to run on every frame of our video. We therefore performed a robust camera calibration ahead of time (Figure 4.8 shows a picture from the calibration procedure) to find good estimates of the camera's intrinsics. These are consistent for the camera regardless of position, so they can be reused over and over without re-computing them. The following line of code corrects the intrinsic errors in the image; cameraMat and distortionCoeff come directly from the output file of the OpenCV camera calibration.

    undistort(rawFrame, frame, cameraMat, distortionCoeff);

Once we have corrected for intrinsic distortion, we need to correct for perspective/extrinsic properties and compute physical distances in the plane of the screen. To do this, we need a reference object of known physical size in the plane that we want to measure. In Figure 4.7, showing the back of the screen, you may have noticed the blue box beside the green targets. This is the reference object we used; it is of a clearly different and easy-to-find color from the rest of the scene. Before running the experiment we measured its physical size, which was 29.3cm high and 15cm wide. We used the same averaging and thresholding process for finding this calibration object as for the green targets.
The following line ofcode takes the pixel X,Y values and compares them to the real world distances incentimeters to get the perspective matrix that converts between them.img2World = getPerspect iveTransform ( cornerPixelXY , cornerInCM ) ;Then it is a relatively trival operation to take the pixel location of any givenobject in the plane of the calibrator and multiply it through a matrix that invertsthe projection to get X,Y relative to the bottom left corner of the calibrator. Thefollowing line of code tells openCV to populate worldPts with the physical positionof the input imgPts.perspect iveTransform ( imgPts , worldPts , img2World ) ;4.4 Conclusions and RecommendationsMost experiments record data using internal software logging in their experimen-tal software. While this is quick to implement and useful, Software logging canonly capture the specific moments of interaction with the technology and entirelyloses data about the preparatory phase. Video recordings of the interaction contextare often employed for post-hoc data analysis and identification of key usabilityissues. However, manual video analysis is extremely time consuming and oftenreliant on inaccurate relative positions to landmarks (e.g. the person chose to moveaway from the group to check an incoming text message). We would argue thatthe vision techniques presented here could reduce analysis workload through au-tomation while also providing more accurate real world measurements of distance.These can enrich our understanding of video data and can provide additional in-sight to standard machine logging. Future work should investigate creating a moregeneral package for vision analysis of experiment context syncronized to softwarelogs.53We are still a good distance away from having a generalizable tool that couldbe immediately applied to other research areas. By having a few main objects ofclearly distinct primary colors we were able to easily identify them from the back-ground and process their shape with just simple thresholding operations. Whilethis works well after some parameter adjustment (while being extremely efficient),more complex visual scenes and experiments are going to require more than justsimple thresholding operations, particularly when you are looking to find an objectwithin a cluttered desktop interface.For broader implementation that could be applied to more complex interac-tions we direct the user to the SIFT algorithm http://docs.opencv.org/3.1.0/da/df5/tutorial py sift intro.html#gsc.tab=0. SIFT techniques (scale invariant feature trans-formations) are often used in combination with machine learning techniques inorder to detect and identify known groups of objects of varying orientations andsizes. These techniques have been shown to be robust and helpful for identifyingkey features within a image that can be used to uniquely identify objects. Ideally, astandard set of these objects could be identified (cursors, buttons, mice, keyboards,subjects) and then combined into a software package that can identify relative realworld positions over time. Some work has already been done in similar prob-lems, but we know of no current tools that easily package this up for application inresearch yet. We consider this a strong project area for future research and appli-cation of vision technology.54Chapter 5Real-World Pointing, TargetDepth, and kIn this chapter we describe an experiment, similar to the one described earlier inChapter 3, in which we further explore the variation of k in Welford models. 
Thenew experiment examines “real-world” pointing rather than computer-mediatedpointing. The primary difference is that the new experiment involves no computerintervention: there is no feedback loop involving computer control. A secondarydifference is that experiment 2 examines a larger set of target depths and it does notuse binocular stereo to display the targets at virtual target depths: all target depthsare physical (in the terminology of Chapter 3, DV =DS), which is why we considerthe pointing in this experiment to be “real world” compared to the more syntheticpointing in the earlier experiment.A computer was used in the both experiments to display targets, but the com-puter in the new experiment does this entirely independently of a human partic-ipant’s movements and thus it is not “in the loop” with the human. A computeralso recorded a participant’s movements in both experiments, but in the new ex-periment the software described in Chapter 4 was used to capture movement datawith a camera for subsequent analysis off-line. The computer’s role in the newexperiment is thus entirely passive: it presents stimuli independent of the humanparticipant who is performing pointing actions, and it collects data about the hu-man’s actions but only for later analysis without providing any accuracy feedback55during the experiment. This constrasts with the earlier experiment where the com-puter actively tracked the human participant in order to determine where the humanwas pointing. This allowed it to flash the targets different colors to provide feed-back about selection accuracy. Limiting the computer to a passive role removes anylatencies or other artifacts that might have been introduced into the earlier exper-iment through a feedback loop between the human participant doing the pointingand the computer system mediating the experiment.Experiment 1 investigated the impact of target depth on pointing performanceand showed that Kopper’s k-factor (the ratio of the impact of target width andmovement amplitude on pointing performance) varies with target depth (the dis-tance from a user to the targets). Evidence was found that increasing target depthincreases k, which means that the width of targets has a larger effect on movementtime relative to the distance between the targets (movement amplitude) as targetdepth increases. This is similar to the effect reported by Shoemaker et al. [40] thatincreasing gain also changes the relative importance of target width compared tomovement amplitude for predicting movement time. For high gain, a pointer ismore responsive to hand motion and thus can cross large gaps between targets eas-ily, but it may have difficulty making sensitive fine adjustments once it gets reachesa target. As the depth of targets increases, the apparent speed of the pointer in theplane of the target increases relative to target planes that are closer. So even thoughthere is no actual change in gain in terms of C:D ratio, there could be an apparentchange in gain.We focus our attention on three likely models for the impact of depth on k. Alinear model has been suggested by our previous work and is consistent with theinformation theoretic interpretation proposed by Mackenzie et al. A logarithmicmodel may also be appropriate to capture a slow tail off, as increasing depth mayhave diminishing returns on k. A polynomial model adds another degree of free-dom to the linear model and may also capture a trail off over depth. 
We focus our attention on three likely models for the impact of depth on k. A linear model has been suggested by our previous work and is consistent with the information theoretic interpretation proposed by Mackenzie et al. A logarithmic model may also be appropriate to capture a slow tail-off, as increasing depth may have diminishing returns on k. A polynomial model adds another degree of freedom to the linear model and may also capture a tail-off over depth. We feel that any more complex models will likely not be worth investigating, as more degrees of freedom will come closer and closer to purely interpolating our six data points rather than predictively modeling them.

It is possible that the observed increased impact of width on pointing performance as target depth increases is an artifact of system latency rather than being purely due to target depth. Latency is the delay between physical movement and the resulting feedback being displayed on the screen. Latency has been shown to have a performance cost of around 10-15 percent per 100ms of latency [31, 49]. It seems reasonable that a delay between input and visual feedback in the form of a cursor displayed on the screen (which is how the experiment in Chapter 3 was conducted) might be more problematic when trying to make fine adjustments for small targets than during a rough ballistic trajectory.

Testing whether the k variation is an artifact of the computer mediation or is instead a true sensorimotor pattern was one of the goals of experiment 2. Examining a larger range of target depths is another goal. The experiment in Chapter 3 used a limited set of target depths and saw a trend of increasing k with target depth. We would like to determine if it is a steady linear increase, or if it has some other pattern. The design of the experiment presented in this chapter uses a larger set of target depths in order to better tease apart the relationship between target depth and k for a more nuanced understanding that might lead to practical calibration procedures for virtual environments based on measuring k at various depths (this will be explored further in Chapter 6).

5.1 Method

We investigate the effect of physical target depth D on pointing performance (movement time MT) in a distal pointing task. We examine the same physical target depths as did the experiment previously described in Chapter 3, plus additional depths intermediate between those previously used and also depths further away to extend the range of our earlier results. A more important difference is that this experiment did not have active computer processing within the feedback loop. Instead, stimuli were presented on a fixed timeline, independent of pointing activity by a participant. The only visual or other feedback is that produced by a hand-held laser pointer that is not connected in any way to the computer that displays the stimuli. This approach was adopted to remove the possibility of feedback latency or other temporal artifacts being introduced by the hardware and software used to control the experiment. Taking these two differences into account (the larger number of target depths and the elimination of computer mediation that might introduce latency artifacts), the experiment is otherwise very similar to experiment 1. In particular, the experimental task was again a Fitts-style one degree-of-freedom reciprocal tapping task between two target pairs, modeled closely after Shoemaker et al. [40] and the original experiments by Fitts [10]. ISO 9241-9 [3] defines a two degrees-of-freedom task for pointing performance; we used a one degree-of-freedom task because we were concerned with the fundamental applicability of Fitts's Law. Although the task was one degree-of-freedom, it is still an example of three-dimensional pointing because the interaction setup is identical to distal pointing, which is common in stereoscopic 3D.
5.1.1 Participants

We recruited 20 participants at our university through on-campus advertising. As a requirement of participation, all were right-handed with normal or corrected-to-normal vision. Two participants' data had to be discarded due to equipment malfunction (the batteries on a laser pointer ran out). Of the remaining 18 participants, 14 were female, 3 male, and one indicated "other" for gender when completing a pre-experiment questionnaire. Age ranged from 19 to 29 years (mean 23.9 years). All participants were regular computer users (9+ hours per week). The experiment was approved by the Behavioural Research Ethics Board at our university (certificate H11-01756). Participants signed a consent form prior to the start of the experiment. Each was compensated with $10.

5.1.2 Apparatus & Materials

The hardware and software used for the experiment were similar to those used in experiment 1, except that a laser pointer was used instead of a software-based pointer, and a camera subsystem captured continuous video of the screen that was subsequently analyzed to provide movement data instead of computing the movement data in real time during the experiment using Vicon motion capture sensors.

Hardware

A 5.16m × 2.85m (width × height) wall display with a glass screen was rear-projected by two side-by-side 1024×768px projectors, giving a total resolution of 2048×768px.

Figure 5.1: Apparatus: Pointer used for interaction and camera used for data recording.

The display was driven by an 8-core Intel processor with 6GB of RAM and dual NVIDIA GeForce GTX 260 graphics processors, running Windows 7. The projectors were carefully aligned to reduce visual discontinuities between projectors. Unlike the experiment in Chapter 3, no stereo projection was used and thus the two projectors were driven directly from the computer rather than through a video processor. A standard frame rate of 60Hz was employed.

Figure 5.2: Image of the projection room behind the screen. Camera is placed on a tripod at a fixed location to record experimental data. Two projectors were used to display the experimental stimulus.

A GoPro Hero 4 Black Edition camera was mounted on a stable tripod behind the screen of the wall display and positioned so the entire interaction area on the screen was within view. The camera was chosen because it could serve as a relatively low-cost, high-frame-rate data recorder. Video was recorded during trials at 240 frames per second with a resolution of 1280×720px. The narrow lens setting on the camera was used to reduce possible distortion due to the normal fish-eye effect of the GoPro lens (distortion was further corrected in post-processing software through camera calibration).
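For concreteness, the kind of checkerboard calibration that OpenCV supports looks roughly like the following sketch (the board size and file names are our assumptions; the actual post-processing pipeline is described in Chapter 4):

import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner corners of a printed checkerboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_pts, img_pts = [], []
for name in glob.glob("calib_*.png"):
    gray = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)

ret, mtx, dist, _, _ = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)

# Each recorded frame can then be undistorted before the
# pixel-to-world perspective transform of Chapter 4 is applied.
frame = cv2.imread("frame_0001.png")
undistorted = cv2.undistort(frame, mtx, dist)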
Software

There were two programs used for the experiment. One program, the experimental software, was a modification of the experimental software used for experiment 1. It was run during the experiment to display stimuli for the trials and to capture time-stamped data for subsequent analysis. Data comprised log files describing the stimuli and video files of the screen on the wall display that captured both the stimuli and the image of the laser pointer that was being used for pointing. The second program was post-processing software that used low-level computer vision techniques to produce time-stamped coordinate data on the position of the center of the image of the laser pointer relative to the targets in the stimuli being displayed by the software.

The experimental software displayed to a participant the stimuli for the various conditions in the experiment. It was written in C# using the Microsoft XNA Game Studio 4.1 library. The stimuli were two green bars representing a pair of equal-width targets in a one degree-of-freedom Fitts-style reciprocal tapping task. Target width depended on the trial. The left target was always directly in front of a participant. The distance to the right target was determined by the movement amplitude for the trial. A 29.3cm by 15cm blue rectangle was displayed at a constant position just to the left of the left target to give the vision system a calibration object. Experiment 1 had rendered similar stimuli using non-headtracked stereo VR: the virtual camera would render the targets at different sizes (according to projection) depending on how far the participant was from the screen. Since we were concerned with the fundamental sensorimotor processes, the targets presented by the software in this task were always rendered from a consistent viewpoint and thus at a consistent size and position regardless of where the participant stood. Any apparent change in size of targets was thus due solely to the participant standing further away from the display. The software recorded timestamps for the start of each trial (the end of a trial was the start of the next trial).

Pointing data was produced by post-processing the video images acquired during the experiment by the experimental software. The timestamps for each trial were fed into the post-processing computer vision software to synchronize the visual images captured of the laser pointer and the target stimuli with the movement amplitude, target width, and target depth parameters for the trial. The post-processing software calculated the movement time and amplitude (horizontal location) of all tapping events (defined as the peak distance before a direction change in the movement) and recorded them in a log file. The computer vision techniques used in the post-processing software are described in detail in Chapter 4.
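The reversal-detection step can be sketched in a few lines of Python (this is our illustration, not the actual post-processing code; real tracking data would first be smoothed to suppress jitter):

import numpy as np

def find_taps(t, x):
    """t: timestamps (s); x: horizontal pointer position (cm).
    Returns (time, position) of each direction reversal."""
    dx = np.diff(x)
    # Indices where horizontal velocity changes sign mark reversals.
    reversals = np.where(np.sign(dx[:-1]) != np.sign(dx[1:]))[0] + 1
    return [(t[i], x[i]) for i in reversals]

taps = find_taps(np.array([0.0, 0.1, 0.2, 0.3, 0.4]),
                 np.array([0.0, 20.0, 40.0, 30.0, 5.0]))
print(taps)  # one reversal: tap at t = 0.2s, x = 40.0cm

Movement time for each tap is then simply the interval between successive reversals, and the x position at a reversal gives the selection offset from the target center.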
Physical Layout, Task, and Stimuli

Figure 5.3: Experimental stimuli. Participant stands at a fixed target depth along the orange line. Specific conditions are marked with crosses. The experimental software displays two green targets on the screen. The user smoothly moves the laser pointer between the two targets.

Participants stood at one of six target depths D (110cm, 165cm, 220cm, 275cm, 330cm, and 365cm) from the wall display screen for a block of trials. 365cm was chosen instead of 385cm (which would have followed the 55cm increase pattern) because we were running out of space in the room. All participant locations were along a straight line perpendicular to the plane of the wall display and were marked on the floor with tape beforehand. Every trial within a block had a pair of identical vertical bars (targets) displayed on the screen, separated by one of three movement amplitudes A (25cm, 50cm, 75cm), both targets having the same width chosen from one of three target widths W (5cm, 10cm, and 20cm). Targets were 90cm high, roughly centered at a typical participant's elbow level. The left target was always directly ahead of the participant and the right target was to the participant's right. This was done to avoid a participant having to move an arm across the body midline while pointing, because the cross-lateral inhibition effect (CIE) [19, 38] predicts that hand movements that cross the midline of the body are more complex than those that do not. Participants held down the button on a standard laser pointer to point at the targets. The distance between the centers of the two targets was the movement amplitude for that trial. Lighting in the room was kept consistent across trials by making sure that all sources of natural light (windows, doors) were blocked off, and that the same artificial lights were always turned on.

5.1.3 Procedure

Each participant performed the experiment in a single session of approximately 30 minutes. After filling out a consent form and a pre-questionnaire that gathered demographic information, a participant was introduced to the apparatus and the pointing task. Participants were instructed to complete the task as quickly as possible while being as accurate as possible. At the start of each block of trials, a participant was instructed to stand at the location marked on the floor that corresponded to the target depth D that was used for all trials in the block. Participants were then invited to complete a practice trial by tapping between the first two targets before data recording began. They were allowed to practice until comfortable with the system. Most participants quickly indicated they were ready to begin the experiment trials. Data collection was begun by the experimenter once the participant had completed the practice trial.

Each experimental trial was defined by three independent variables: movement amplitude A, target width W, and target depth D, the distance at which the participant had been placed relative to the screen for a block of trials. For the first target pair in a block, a participant tapped the left target that was directly ahead and then performed a sequence of reciprocal back-and-forth taps between the right and left targets. Participants were instructed to keep the laser pointer smoothly moving between targets and to continually tap between them. Subsequent trials were run continuously, without any break between trials. Every ten seconds a new trial began and the next target condition (A and W) was displayed, regardless of how many taps had been performed in the previous trial. The first tap in a trial was discarded because it was assumed that participants would have to adjust to the new condition while in the middle of a movement, and thus that tap might not be representative of the participant's sensorimotor performance as is assumed for a Fitts-style reciprocal tapping task.

Between each block of the experiment participants were required to take a break of at least three minutes. They were encouraged to take more time if desired, but very few did. The participant was then repositioned to the target depth D for the upcoming block, and a practice trial followed by the experimental trials for the block commenced.

Timestamps and locations on the screen for all taps were recorded for subsequent post-processing. No accuracy feedback was provided during the experiment other than the image that the laser pointer made on the screen. Participants were not required to correct errors. They were told to keep smoothly moving between targets.

Measures

Pointing performance was measured by the time taken to execute each individual tap action. Participants tapped as many times as possible within the 10 seconds given for each trial. For each tap, the post-processing software recorded movement time MT and the offset of the cursor from the center of the target when the tap (reversal of direction) was made.
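The effective width used throughout this chapter is the standard post-hoc adjustment computed from the scatter of these recorded offsets; a minimal sketch, assuming the usual formulation in which We is 4.133 times the standard deviation of the tap offsets:

import numpy as np

def effective_width(offsets_cm):
    """offsets_cm: signed distances of each tap from the target center."""
    return 4.133 * np.std(offsets_cm, ddof=1)

offsets = np.array([-1.2, 0.4, 0.9, -0.3, 1.5, -0.8])  # made-up taps
print(f"We = {effective_width(offsets):.2f} cm")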
Experimental Design

The within-participants design had independent variables of target depth (distance of the screen from the participant) D (110cm, 165cm, 220cm, 275cm, 330cm, 365cm), target width W (5cm, 10cm, 20cm), and movement amplitude A (25cm, 50cm, 75cm). All three variables were fully crossed:

18 participants (N) × 6 target depths (D) × 3 movement amplitudes (A) × 3 target widths (W)

For a given D a participant performed all trials for every combination of A and W before switching to another D condition. We decided against intermixing the D conditions to avoid participants having to repeatedly move back and forth between different locations in the room. The presentation of target depths D was counterbalanced across participants according to a balanced Latin square. The presentation of movement amplitude and target width pairs (A-W conditions) was randomized within every block corresponding to a fixed target depth D.

Hypotheses

The hypotheses for this experiment are based on previous studies. Most directly, they are drawn from the previous experiment that was described in Chapter 3.

H1a (weak) The parameter k will increase monotonically with greater target depth.

H1b (strong) The parameter k will increase linearly with greater target depth.

H2 One-part models of pointing performance will not accurately model all target depths.

H3 Two-part models of pointing performance will perform better than one-part models in conditions where k diverges from 1.

H4 Angular measures of target difficulty will not improve our models of pointing performance.

The first hypothesis is again split into weak and strong variants, as it was for experiment 1. We expected to find further support that k varies monotonically with target depth, but also felt that there might be evidence that it varies linearly, because the additional target depths could provide a better picture of how k behaves as target depth varies. By removing latency as a confound we hoped to support both variants of the hypothesis better than in the previous experiment.

The second and third hypotheses are based on our conclusions in the previous experiment about one-part and two-part models of pointing performance, which were that a Welford-style two-part formulation is more robust and will, at least for some conditions, better predict movement time than will a Fitts-style one-part model.

The fourth hypothesis arose from the post-hoc analysis in the previous experiment, which indicated that using angular measures instead of classic linear measures for A and W does not improve modeling strength.

5.2 Results

Our analysis proceeded in four steps after removing outliers from the data. We conducted an ANOVA on movement time for the three independent variables. We then compared two Fitts-style and two Welford-style models using a regression analysis for each of the four model types at each target depth D, using both target width W and effective target width We. An F-test was conducted on four sets of pairs of these models (each a Fitts-style model and the corresponding Welford-style model) to determine whether the additional degree of freedom in the Welford-style models was justified by our data. The model comparison was repeated using the angular measures corresponding to A and W to determine whether the classic linear measures or the angular measures advocated by Kopper et al. [23] are most appropriate for analyzing pointing performance. We then examined the behavior of the parameter k in the Welford-style models as a function of target depth D. A Shapiro-Wilk test for normality conducted on the participant average movement times showed that the data was unlikely to be normally distributed (p < 0.001), which may have been caused by outliers and may degrade the effectiveness of our models. We thus attempted to remove outliers before beginning our analysis. Overall average movement times and 95 percent confidence intervals for each condition are included in Appendix A.
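As a concrete sketch of the regression step (with fabricated movement times, purely for illustration), fitting the Welford two-part model MT = a + b1 log2(A) − b2 log2(W) by least squares and reading off k = b2/b1 looks like this in Python:

import numpy as np

A = np.array([25, 25, 25, 50, 50, 50, 75, 75, 75], float)  # amplitudes (cm)
W = np.array([5, 10, 20, 5, 10, 20, 5, 10, 20], float)     # widths (cm)
MT = np.array([610, 540, 470, 700, 620, 560, 760, 680, 600], float)  # fake means (ms)

# Design matrix for MT = a + b1*log2(A) - b2*log2(W).
X = np.column_stack([np.ones_like(A), np.log2(A), -np.log2(W)])
(a, b1, b2), *_ = np.linalg.lstsq(X, MT, rcond=None)
k = b2 / b1
print(f"a = {a:.1f}  b1 = {b1:.1f}  b2 = {b2:.1f}  k = {k:.2f}")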
5.2.1 Removal of Outliers

There were a total of 38,307 observations across the 18 participants whose data was analyzed. As is common practice in pointing evaluation, mistrials and outliers were removed from the data before analysis. Any tap more than half the trial amplitude away from the target (likely to be caused by a premature reversal of movement direction) was ignored, and any tap whose movement time was more than three standard deviations from the mean for the participant's performance in that A and W condition was also ignored.

It was discovered that for two conditions (movement amplitude A being 75cm and target width W being either 10cm or 20cm) the recording of taps for the left target was less accurate than for taps in other conditions. Consequently, all of these trials were removed to avoid skewing the results. Taps on the right target in these two conditions were recorded accurately, so they were included in the analysis.

In total, 3881 taps (10.1%) were removed from the data. The remaining taps were used in all of the analyses, except when calculating the models discussed in Section 5.2.5, where additional points were excluded for reasons that are discussed in that section.

5.2.2 ANOVA for Movement Time MT

The results of an ANOVA for movement time MT with independent variables movement amplitude A, target width W, and target depth D are presented in Table 5.1. We used the R library's ez package [25], which utilizes Type-II sum of squares error. This sometimes differs slightly from SPSS and other packages that use Type-III sum of squares error for ANOVA.

As expected, movement amplitude (F(2,34) = 114.729, p < 0.001) and target width (F(2,34) = 116.62, p < 0.001) had strong effects. That target size and position affect pointing performance is fundamental to any study of pointing and thus is not surprising. There was also an interaction between amplitude and width (F(4,68) = 7.443, p < 0.001). The non-linear relationship between these variables and movement time in both Fitts-style and Welford-style models predicts this.

Table 5.1: ANOVA results for the impact of depth (D), amplitude (A), and width (W) on movement time (MT). Statistically significant factors are bolded.

Factor   df_effect  df_error  F        p     Partial η²
A        2          34        114.729  .000  0.871
W        2          34        116.62   .000  0.873
D        5          85        1.553    .182  0.084
A×W      4          68        7.443    .000  0.305
A×D      10         170       0.688    .734  0.039
W×D      10         170       2.009    .035  0.105
A×W×D    20         340       1.011    .448  0.056

What is surprising is that target depth D did not have an effect (F(5,85) = 1.553, p > 0.05). This conflicts with the findings of experiment 1, which showed that target depth had an effect, which was the basis for identifying a trend of k increasing with target depth. There was, however, an interaction between target width and target depth (F(10,170) = 2.009, p < 0.05). The impact of target width thus depends in part on target depth.
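The ANOVA above was run with R's ez package; for reference, a roughly equivalent repeated-measures ANOVA in Python's statsmodels looks like the sketch below (the CSV layout and column names are our assumptions, and statsmodels computes its own sums of squares, so results may differ slightly from ez):

import pandas as pd
from statsmodels.stats.anova import AnovaRM

# One row per tap after outlier removal.
df = pd.read_csv("movement_times.csv")  # columns: participant, A, W, D, MT

# aggregate_func="mean" collapses repeated taps into one mean per cell.
result = AnovaRM(df, depvar="MT", subject="participant",
                 within=["A", "W", "D"], aggregate_func="mean").fit()
print(result)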
5.2.3 Comparing Pointing Models

To assess the relative ability of one-part and two-part models to predict pointing performance, we performed many of the same model comparisons described in Chapter 3. We conducted regression tests for a set of pointing models at each of the six target depths (110cm, 165cm, 220cm, 275cm, 330cm, 365cm) and one global model that aggregated data for all target depths. The results of the regression tests are summarized in Table 5.2.

Table 5.2: Modeling movement time (ms) using Fitts and Welford formulations (top) and Shannon-Fitts and Shannon-Welford formulations (middle) for actual width W and effective width We, for each D condition and for all D conditions combined. The coefficients and adjusted R² for each model are shown, with the F-ratios, p-values, and significance from an F(1,6)-test (Eq. 2.9 with p1 = 2, p2 = 3, and n = 9) in the last three columns comparing nested Fitts and Welford (or Shannon-Fitts and Shannon-Welford) models.

Fitts and Welford:

                 Fitts                               Welford                                            F-test
Width  D         a        b       R²    RSS          a        b1      b2      k     R²    RSS          F-ratio  p       sig?
W      110       58.39    147.33  0.94  12282.57     -132.81  172.35  131.25  0.76  0.95  8317.43      2.86     0.12    no
W      165       126.96   128.97  0.92  11903.79     -143.19  164.33  106.25  0.65  0.97  3987.97      11.91    0.007   yes
W      220       131.89   137.09  0.94  10884.38     -16.06   156.46  124.65  0.80  0.94  8509.91      1.67     0.227   no
W      275       115.65   130.75  0.94  8836.39      1.85     145.65  121.18  0.83  0.94  7431.56      1.13     0.315   no
W      330       79.90    138.22  0.97  5508.86      -40.12   153.93  128.12  0.83  0.97  3946.38      2.37     0.157   no
W      365       73.63    148.51  0.97  5767.75      -53.18   165.11  137.85  0.83  0.98  4023.48      2.60     0.141   no
W      all       97.74    138.48  0.94  74522.95     -63.92   159.64  124.88  0.78  0.95  57515.69     15.08    0.0002  yes
We     110       -3.87    184.22  0.87  24966.37     -168.02  201.67  164.32  0.81  0.87  22097.87     0.78     0.400   no
We     165       59.97    168.89  0.90  15223.05     -177.38  193.71  139.60  0.72  0.93  9068.54      4.07     0.07    no
We     220       56.00    179.12  0.90  17110.21     22.29    182.38  174.51  0.96  0.88  16994.53     0.041    0.844   no
We     275       54.33    169.41  0.91  13466.63     -39.47   179.33  158.10  0.88  0.91  12486.13     0.47     0.509   no
We     330       7.94     186.03  0.96  7364.14      30.58    183.93  189.19  1.03  0.95  7311.93      0.042    0.841   no
We     365       20.77    182.70  0.94  10762.11     52.79    179.45  186.79  1.04  0.94  10662.13     0.056    0.818   no
We     all       20.77    182.70  0.94  10762.11     52.79    179.45  186.79  1.04  0.94  10662.13     2.22     0.141   no

Shannon-Fitts and Shannon-Welford:

                 Shannon-Fitts                       Shannon-Welford                                    F-test
Width  D         a        b       R²    RSS          a        b1      b2      k     R²    RSS          F-ratio  p       sig?
W      110       -88.55   185.40  0.95  9090.23      -328.18  220.23  174.59  0.79  0.97  4253.28      6.82     0.028   yes
W      165       4.60     159.82  0.91  14501.23     -295.39  203.41  146.28  0.72  0.95  6920.78      6.57     0.031   yes
W      220       -2.52    171.61  0.94  10117.21     -181.32  197.59  163.54  0.83  0.95  7424.31      2.17     0.174   no
W      275       -12.88   163.80  0.95  7859.08      -153.73  184.27  157.45  0.85  0.95  6187.94      1.62     0.234   no
W      330       -55.53   172.98  0.97  4798.70      -202.20  194.29  166.36  0.86  0.98  2986.68      3.64     0.088   no
W      365       -69.76   185.02  0.96  6916.30      -215.75  206.24  178.44  0.87  0.97  5121.14      2.10     0.181   no
W      all       -37.44   173.10  0.94  72886.91     -229.43  201.01  164.44  0.82  0.95  54258.17     17.5     0.0001  yes
We     110       -207.44  242.07  0.91  18094.92     -401.61  265.95  226.15  0.85  0.91  14812.89     1.32     0.278   no
We     165       -113.14  216.60  0.90  15365.83     -367.71  247.32  194.83  0.79  0.93  9535.44      3.66     0.088   no
We     220       -135.69  232.99  0.93  12708.29     -169.39  236.76  229.58  0.97  0.91  12610.51     0.046    0.834   no
We     275       -117.94  216.73  0.93  11385.81     -229.69  230.33  207.46  0.90  0.92  10249.86     0.66     0.435   no
We     330       -188.53  241.26  0.97  4329.47      -169.09  239.15  243.28  1.02  0.97  4297.30      0.044    0.837   no
We     365       -157.28  230.16  0.94  10912.82     -121.72  226.03  233.45  1.03  0.94  10810.64     0.057    0.817   no
We     all       -157.28  230.16  0.94  10912.82     -121.72  226.03  233.45  1.03  0.94  10810.64     2.81     0.100   no

Overall, our models of pointing performance seem quite strong. All but one of the pointing models reached or exceeded Mackenzie's 0.9 R-squared threshold for a good model. The only case that failed was at 110cm from the screen with regular width. Most of the time, one-part and two-part models performed comparably, but for regular width at target depth 165cm a Welford two-part formulation was statistically better than a Fitts one-part model, and at both target depth 110cm and target depth 165cm a Shannon-Welford two-part model was statistically better than a Shannon-Fitts one-part model. No differences were found using effective width, although there was a potential trend (p < 0.1) at target depth 165cm. The global model that aggregated all target depths showed significant improvements over a Fitts-style model for both Welford and Shannon-Welford using width, but no differences were found for effective width.
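The F-ratios in Table 5.2 come from the standard nested-model test comparing the one-part (reduced) and two-part (full) fits over the same nine A-W condition means; a small Python sketch, using the RSS values from the 165cm regular-width row:

from scipy import stats

def nested_f_ratio(rss_one_part, rss_two_part, p1=2, p2=3, n=9):
    """Eq. 2.9: is the extra Welford parameter justified by the fit?"""
    return ((rss_one_part - rss_two_part) / (p2 - p1)) / (rss_two_part / (n - p2))

f = nested_f_ratio(11903.79, 3987.97)  # Fitts vs. Welford, D = 165cm, width W
p = stats.f.sf(f, 1, 6)                # survival function gives the p-value
print(f"F = {f:.2f}")                  # F ≈ 11.91, as reported in Table 5.2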
5.2.4 Testing Angular Measures of Target Difficulty

We examined whether one-part pointing models are improved if A and W are determined by angular measurements. Kopper et al. [23] suggest that this is a more accurate way to model pointing. We recalculated the regression coefficients for all of the models after replacing A and W with their angular equivalents, α and ω. Table 5.3 shows this comparison for each target depth and for the "global" model that does not treat target depth as a parameter but instead aggregates the data for all target depths.

Table 5.3: Quality of fit for regression analyses as determined by the R-squared values is shown for angular and classic linear measures of movement amplitude A and target width W, and for movement amplitude A and effective target width We. Bold cells show which model in a pair of columns has a better R². The left column of each pair is for classic measures and the right for angular.

DS    Width  Classic  Angular  Classic  Angular  Classic  Angular  Classic     Angular
             Fitts    Fitts    Welford  Welford  Shannon  Shannon  Shannon-W.  Shannon-W.
110   W      0.938    0.943    0.951    0.951    0.954    0.962    0.975       0.9788
165   W      0.923    0.928    0.969    0.969    0.905    0.895    0.947       0.923
220   W      0.936    0.939    0.942    0.942    0.941    0.939    0.949       0.943
275   W      0.943    0.944    0.944    0.944    0.949    0.948    0.953       0.950
330   W      0.967    0.968    0.972    0.972    0.971    0.967    0.979       0.972
365   W      0.970    0.971    0.975    0.975    0.964    0.954    0.969       0.955
all   W      0.937    0.938    0.951    0.941    0.938    0.935    0.953       0.93
110   We     0.874    0.879    0.870    0.869    0.908    0.929    0.913       0.930
165   We     0.901    0.908    0.931    0.931    0.900    0.901    0.928       0.917
220   We     0.900    0.901    0.884    0.885    0.926    0.937    0.914       0.926
275   We     0.913    0.915    0.906    0.907    0.927    0.931    0.923       0.926
330   We     0.956    0.956    0.949    0.950    0.974    0.980    0.970       0.977
365   We     0.945    0.945    0.936    0.937    0.944    0.939    0.935       0.929
all   We     0.913    0.913    0.915    0.912    0.926    0.929    0.929       0.928

We see mixed results in terms of improvements using angular measures for individual target depth models using width W. One-part Fitts models were slightly improved by angular measures, whereas the Shannon-Welford and one-part Shannon models performed better with classic measures. The Welford models had almost identical R-squared values. The patterns are less clear for effective width We, where some models were improved by angular measures and others were made worse, even for the same type of model. If we look more carefully at the magnitude of these differences for various target depths, we see they are fairly minor. The biggest improvement was for the one-part Shannon at target depth 110cm with effective width. Classic linear measures produced an R-squared of 0.908 but angular measures produced an R-squared of 0.929. The angular model accounted for two percent more of the variation in the data, which is arguably a relatively minor difference given that both models were above Mackenzie's 0.9 threshold for a "good" model.
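For reference, a common way to compute the angular equivalents is as the visual angles subtended at the participant's position at target depth D; a sketch of this simplified formulation (our assumption; hand-centered variants appear in the literature):

import numpy as np

def visual_angle(size_cm, depth_cm):
    """Angle (degrees) subtended by a screen extent at depth D."""
    return np.degrees(2 * np.arctan(size_cm / (2 * depth_cm)))

alpha = visual_angle(50, 220)  # movement amplitude of 50cm at D = 220cm
omega = visual_angle(10, 220)  # target width of 10cm at D = 220cm
print(f"alpha = {alpha:.2f} deg, omega = {omega:.2f} deg")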
The best case scenario for angular measures is the global model shown in the final "all" rows in Table 5.3, for both the width-based models (upper rows in the table) and the effective width models (lower rows in the table). In this case we see very little difference for one-part models, but for two-part models angular measures are worse in this condition. Regular width Shannon-Welford had an R-squared of 0.953 with classic linear measures compared to an R-squared of 0.938 for angular measures. This loss was similar in magnitude to the best case angular improvement achieved in the individual target-depth models (0.024 for target depth 165 versus 0.020 for the aggregated depth).

5.2.5 Modeling k as a Function of Target Depth

A fundamental goal of this experiment was developing a robust interpretation of how k varies with target depth. We examined three potential models of k as a function of target depth. The first is a simple linear function of target depth, which we expect to be a reasonable estimation that may not fully capture the trend. We think it likely there might be some point at which increasing target depth no longer increases k, or at least not as much as a linear model would predict. Thus we also looked at logarithmic and quadratic polynomial models as alternatives that might better capture the relationship between k and target depth. We felt these were the most reasonable models to consider because our experiment only has six data points from which to verify a model. Higher order polynomials will quickly approach pure interpolation of the data rather than predictively modeling it. We looked at models of k for both regular and effective width. These results are shown in Table 5.4. For a visual comparison of these models refer to Figure 5.4 for regular width and Figure 5.5 for effective width.

Table 5.4: Regression modeling of how k varies as target depth D (cm) changes for each of the four pointing models.

Pointing Model   Width Type  k Model      k Equation                              R²
Welford          W           Linear       k = 0.0005D + 0.6543                    0.503
Shannon-Welford  W           Linear       k = 0.0004D + 0.7115                    0.600
Welford          We          Linear       k = 0.0011D + 0.6428                    0.710
Shannon-Welford  We          Linear       k = 0.0008D + 0.7220                    0.711
Welford          W           Logarithmic  k = 0.1080 ln(D) + 0.1988               0.448
Shannon-Welford  W           Logarithmic  k = 0.0906 ln(D) + 0.3285               0.544
Welford          We          Logarithmic  k = 0.2223 ln(D) − 0.2973               0.643
Shannon-Welford  We          Logarithmic  k = 0.1725 ln(D) − 0.0081               0.650
Welford          W           Polynomial   k = 0.0000006D² + 0.0002381D + 0.6844   0.506
Shannon-Welford  W           Polynomial   k = 0.0000004D² + 0.0002501D + 0.7311   0.602
Welford          We          Polynomial   k = 0.0000022D² + 0.0000002D + 0.7540   0.725
Shannon-Welford  We          Polynomial   k = 0.0000016D² + 0.0000553D + 0.8023   0.725
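The fits in Table 5.4 can be reproduced, up to rounding of the k inputs, with ordinary least squares; a Python sketch using the Welford regular-width k values from Table 5.2:

import numpy as np

D = np.array([110, 165, 220, 275, 330, 365], float)
k = np.array([0.76, 0.65, 0.80, 0.83, 0.83, 0.83])  # Welford, width W (Table 5.2)

lin = np.polyfit(D, k, 1)           # k = c1*D + c0
quad = np.polyfit(D, k, 2)          # k = c2*D^2 + c1*D + c0
logf = np.polyfit(np.log(D), k, 1)  # k = c1*ln(D) + c0

def r2(y, yhat):
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)

print("linear R²:     ", r2(k, np.polyval(lin, D)))
print("quadratic R²:  ", r2(k, np.polyval(quad, D)))
print("logarithmic R²:", r2(k, np.polyval(logf, np.log(D))))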
Overall the results of our modeling are mixed. The values obtained for k are lower than in our previous studies. There is also more variance in the data. Using regular width, a simple linear model is more effective than a logarithmic model, with an R² around 0.503 compared to 0.448. The polynomial model is slightly better than linear at 0.506. Looking at the equation (k = 0.0000006D² + 0.0002D + 0.6844), the polynomial model assigns much less weight to the quadratic term and thus is actually approaching linearity. All of these are worse fits than in the previous experiment, where a linear model using width had an R² upwards of 0.9. Using effective width is often considered a truer reflection of user performance, and our models of k also seem to improve using effective width. A polynomial trend is the best model, but still only reaches an R² of 0.725. Overall, these models seem much less effective than in our previous experiment, which showed k varying linearly with target depth with an R² between 0.89 and 0.95.

Figure 5.4: Visual comparison of different models for the relationship between k and D. Data presented for regular width models of pointing performance.

Figure 5.5: Visual comparison of different models for the relationship between k and D. Data presented for effective width models of pointing performance.

Taking a closer look at the specific points being modeled, we see that the 165 target depth condition has a lower k than the 110 target depth condition (0.64 compared to 0.76), whereas the rest of the points produce roughly similar or higher k values as target depth increases. This runs counter to the trend in the rest of our data, and is likely pulling the model away from the trend of the other points. For the sake of argument, we highlighted this as a potential outlier and attempted the same models on the rest of the points. Table 5.5 shows the modeling results on the set of points without the 165cm outlier, while Figure 5.6 and Figure 5.7 show this trend visually.

After dropping this one point, the modeling results are uniformly improved in all conditions. This is not surprising. The overall variance should decrease when an outlier is removed. However, especially with regular width we see dramatic improvements in the modeling. Welford's regular width was very effectively modeled by a linear estimation of k (R² = 0.91) but was further improved using both logarithmic (R² = 0.94) and polynomial (R² = 0.97) models. These models account for over ninety percent of the variation in k, compared to only fifty to sixty percent before the one outlier is removed, and they align nicely with the behavior of k in our previous experiment. The existence of such dramatic outliers does cause concern. How confident are we of the monotonic (or even linear) relationship? We do not know how often outliers will become a problem because we do not know what caused this one. We discuss the implications of this and other aspects of our results in the next section.

Table 5.5: Regression modeling of how k varies as distance from the screen D (cm) changes for each of the four pointing models. The 165 condition has been removed as a potential outlier.

Pointing Model   Width Type  k Model      k Equation                               R²
Welford          W           Linear       k = 0.0003D + 0.7319                     0.915
Shannon-Welford  W           Linear       k = 0.0003D + 0.7645                     0.955
Welford          We          Linear       k = 0.0008D + 0.7248                     0.771
Shannon-Welford  We          Linear       k = 0.0007D + 0.7809                     0.744
Welford          W           Logarithmic  k = 0.0653 ln(D) + 0.4536                0.943
Shannon-Welford  W           Logarithmic  k = 0.0610 ln(D) + 0.5049                0.973
Welford          We          Logarithmic  k = 0.1741 ln(D) − 0.0096                0.741
Shannon-Welford  We          Logarithmic  k = 0.1376 ln(D) + 0.2001                0.717
Welford          W           Polynomial   k = −0.0000008D² + 0.0007D + 0.6940      0.952
Shannon-Welford  W           Polynomial   k = −0.0000006D² + 0.0006D + 0.7376      0.977
Welford          We          Polynomial   k = 0.0000008D² + 0.0005D + 0.8093       0.777
Shannon-Welford  We          Polynomial   k = 0.0000006D² + 0.0004D + 0.8093       0.746
5.3 Discussion

We begin by summarizing the results according to our hypotheses, after which we discuss some of the more nuanced aspects of our results. For each hypothesis we indicate how consistent it was with the data we found.

H1a (weak) The parameter k will increase monotonically with greater target depth. Somewhat consistent with data.

H1b (strong) The parameter k will increase linearly with greater target depth. Somewhat consistent with data.

H2 One-part models of pointing performance will not accurately model all target depths. Not consistent with data.

H3 Two-part models of pointing performance will perform better than one-part models in conditions where k diverges from 1. Consistent with data.

H4 Angular measures of target difficulty will not improve our models of pointing performance. Consistent with data.

All of the pointing models produced fairly good results. Both one- and two-part models successfully modeled the data in all but one condition. One-part models did not pass Mackenzie's threshold in one condition, but were still fairly close even then. Therefore, we argue that hypothesis H2 (that one-part models will not accurately model all depths) was not consistent with our data. Two-part models were statistically more accurate in two conditions using regular width, and had a potential trend of improvement using effective width. These conditions had the smallest k values, which were the k values that diverged the most from unity; therefore hypothesis H3 (two-part models will outperform one-part models when k diverges from 1.0) was consistent with our data.

Figure 5.6: Visual comparison of different models for the relationship between k and D. Data presented for regular width models of pointing performance. The 165 condition has been removed as a potential outlier.

Figure 5.7: Visual comparison of different models for the relationship between k and D. Data presented for effective width models of pointing performance. The 165 condition has been removed as a potential outlier.

Angular measures improved some models but degraded others to a similar extent. Absolute differences between models based on angular and classic linear measures of A and W (or We) were relatively minor and could be caused by variation of outliers pulling or pushing the model slightly away from the optimal value. Furthermore, two-part models in the global condition got worse when using angular measures of target difficulty. The global condition, which tries to model pointing performance independent of target depth, is precisely the case that advocates of an angular model propose as being its strength. We therefore argue that hypothesis H4 (that angular measures of difficulty will not improve our models) was consistent with our data. We found that, subject to outliers, as target depth got larger k got bigger, and at least roughly correlated with a linear model in all conditions. Logarithmic models were somewhat better at modeling the data when outliers were removed. Due to the problems with outliers we consider hypotheses H1a and H1b to be somewhat consistent with our data.

5.3.1 One-Part Versus Two-Part Models of Pointing Performance

We found that one-part models were sufficient to model our data in almost all conditions, but they were statistically worse than two-part models in some conditions close to the screen. These results run somewhat counter to our previous experiment, where we noticed that two-part models outperformed one-part models as you got further from the screen.
Instead, in this experiment we see a difference in modeling when you are very close to the screen.

One needs to understand these trends in the presence of the k variable, which captures the relative impact of A and W on pointing performance. If k is exactly equal to one then one- and two-part models are mathematically identical [40]. In our previous experiment, k values started around 1.0 and got progressively larger as target depth increased, up to a maximum of about 1.7. It was therefore unsurprising that the models started out similar but two-part models became necessary as k diverged from one.

In this experiment, k values start around 0.7 and get larger, but rarely surpassed 1.0 by any large margin. Therefore, in this new experiment, where k was much closer to one all the time, one would expect fewer differences between the models and for one-part models to fail less often. Moreover, when two-part models were statistically better, k diverged more from one than in other conditions, but this was still less divergence than in the previous experiment: the previous experiment saw k values 1.7 times unity, whereas this experiment had a ratio of only about 1.4, and the absolute difference was also smaller (0.7 in the first experiment but only 0.3 in the second). This is consistent with our claim that the improvement is only seen when k is not approximately one. We therefore argue that this reinforces and supports our previous conclusions about modeling distal pointing performance.

Two-part models are not inherently necessary to describe all pointing tasks. Many common tasks coincidentally give amplitude and width similar impact. However, specific experimental conditions and different input techniques may cause this to change somewhat unpredictably and produce scenarios in which two-part models become necessary. Devices that could be useful but that produce non-standard k values (ones that are not close to unity) quite possibly have been rejected by the research community because they failed to be accurately described by a one-part Fitts model. Two-part models seem more robust to changes in gain or target depth, or any other factor that might make k diverge from unity. We therefore advocate that they be the default models, at least until it has been verified that the range of k values is never far from 1.0 for the task.

5.3.2 Angular Versus Classic Measures of Target Difficulty

Angular measures were intended to help the model be robust to changes in depth, but had not been tested in isolation before these two experiments. Studies in the literature had compared one-part models using classic measures of difficulty to their new proposed model. While their proposed model used angular measures of difficulty, it was also very similar to a two-part Welford's model. Our previous experiment provided data suggesting that the improvements of this model may come more from the Welford similarity than from swapping to angles.

As we noted in our results section, the differences between angular and classic measures in this experiment were fairly minor and inconsistent. Some models were slightly improved (usually within 5 percent R²) while others actually got worse when swapping to angular measures. Even the best case scenario of the global multiple-depth condition only saw improvements in some conditions, and got worse in others. It seems reasonable to argue that the differences between angular and classic measures are relatively minor and perhaps a form of random variation after changing units under the effect of perspective projection.
While perspective will cause the units to be scaled differently at different depths, we were not able to show in either experiment that this provided any real improvement in modeling.

5.3.3 The Relationship Between k and Target Depth

One thing we were very surprised by in this experiment was the low value of k and its shallow growth with increasing target depth. For example, using regular width the lowest k was 0.67 at 165cm, which then rose to 0.832 by 275cm. However, between 275cm and 365cm we see shallow growth in k, between 0.832 and 0.835. This corresponds to less than a percent of absolute difference across half of the target depths tested. Effective width saw k grow more sharply in this range, where it varied from 0.88 to 1.04. However, both variations are still smaller than the variation we saw in the original experiment, which went from 1.4 to 1.7 in a similar range. It is plausible that there might be some sort of diminishing returns as you get further and further from the screen. Similar increases in depth may have less and less of an impact on k as target depth becomes more extreme.

We noted in our results that target depth had an interaction with target width in this experiment, rather than an omnibus effect. This is key: rather than target depth making the task inherently more difficult, it may simply exacerbate the problems with small widths, while not hampering more reasonable targets much. This may be compounded with the changes in absolute target size due to not using stereo projection in this experiment. Perhaps using a slightly different set of absolute widths gave us a different k?

This would make intuitive sense, as visual depth discriminations are less precise at farther target depths, where binocular disparity plays less of a role [51]. Similar changes in absolute target depth will also produce smaller relative changes in total target depth. It could also be caused by hitting the limit of sensorimotor processing. As k gets larger, width has more and more of an impact on accuracy because the task needs to be more and more precise. This will make hand jitter increasingly problematic for selecting smaller targets. It may well be that we are hitting some fundamental sensorimotor limit on fine motor precision, and participants can only be so accurate on the targets we are asking them to select.

Given these observations and the modeling results we presented earlier, we would consider the two most likely models for our k trend to be logarithmic and linear. Linear appears to be a somewhat naive but explanatory approximation of the variation in k. By definition it assumes variation is the same at all points. Theoretically a linear trend would imply that the same increase in target depth will always have the same impact on k. Using a linear approximation will throw out the pattern of smaller differences as you get further away, but it does capture the positive correlation in our data. Furthermore, once outliers were removed the regression software was able to find a reasonable linear compromise between the variation and produced an R² higher than 0.9.

We note that second order polynomial models gave us the highest R² values in all conditions, but also introduced additional degrees of freedom that would naturally improve fit. Looking at the absolute values of the parameters, the quadratic term was given much smaller coefficients than the linear (0.0000006 compared to 0.0002). This implies that the solved solution was really much closer to linear, but slightly improved by a small amount of curve.
This is reinforced by a nested model ANOVA that we carried out to compare it to a linear model. No statistically significant differences were found for regular width (F(1,6) = 0.355, p > 0.05) or effective width (F(1,6) = 0.282, p > 0.05). Furthermore, the actual curve of the regular width polynomial models had increasing slope as target depth increased. This runs completely counter to the visual trend in the data, and may be a product of outliers. For a model to be useful it must consistently describe the trend in the data. While we might be able to see a slightly higher R² using a polynomial model, it does not actually improve our model visually or statistically, and we reject it as an accurate model of k.

A logarithmic model may provide an explanation for the sloping-off trend we are seeing in the regular width data: large increases early on, but smaller increases as target depth gets larger. However, it was statistically the worst model before removing the outlier points (R² of 0.448 compared to 0.503 and 0.506). The specific outlier was much lower than the surrounding points. The logarithmic model has to pull up higher than linear above this point and diverges from it dramatically. We think the logarithmic regression was in a way rejecting this point as an outlier. It maintained the visual trend in the data but jettisoned consideration for the absolute best R². Once the outlier was removed, logarithmic models surpassed linear in pure R² while more closely reflecting the visual trend in the data than polynomial models. It is important to note, however, that this sloping-off trend was much less apparent when using effective width, which is a truer representation of performance. Furthermore, even after removing the outlier, a logarithmic model of effective width k never surpassed linear in R². While it has its appeal, it seems hard to argue conclusively that a logarithmic model provides a consistently better explanation of our data than a simple linear estimation.

It is important to note that while having a high R² is important for mathematical consistency, the point of such models is to reflect the relationship of the variables and help us understand their effects. Having a high R² is nice, but it is no substitute for a robust explanation of the phenomena involved. Furthermore, there is no universal threshold for what constitutes a "good" model of k. While the results were much lower before removing the outlier point, one might reasonably argue that accounting for 45 to 50 percent of all variation in the data is actually reasonably predictive, especially for a variable like k that might be impacted by dozens of other potential nuisance factors (e.g. gender balance, coffee consumed, participant strategy variation, etc.). It is not quite the direct massive correlation of target difficulty with movement time, but that doesn't mean it isn't explanatory or helpful for understanding the relationship. Further investigation would be useful to nail down what might be causing these outliers, but the rough trend of sloping-off increase appears to be there. Given all this, we would argue that k is probably best modeled with a simple linear estimation.

5.3.4 Differences in Task and k

We noted in our results that the absolute values of k were much lower in this experiment than the last. In fact, the largest k values in this experiment were around 1.07, while the smallest in the last experiment were only slightly smaller than 1 (e.g. 0.87 for effective width Welford).
We would consider the most likely explanation for this to be differences in the experimental task and feedback. Targets were not displayed with dynamic stereo projection, so they would appear at different real-world sizes than in experiment 1, which could have affected the results. In this task participants also had no feedback of selection accuracy (laser pointer position is a form of feedback) and were instructed to keep the laser smoothly moving between the two targets at all times. For disambiguation we considered all selections to occur at the peak movement achieved before changing directions. Simply put, participants couldn't really do post-selection adjustment. If participants overshot their target they would not take the time to carefully line it back up, as wherever they stopped originally is where the selection was placed. Thus any precise adjustment due to the impact of width was done in the deceleration motion.

After cutting out the correction phase one would expect participants to spend less overall effort on being precisely accurate. This would cause the width coefficient (which captures precision adjustments) to have a smaller impact on movement time and lead to smaller k values. However, the relative impact of amplitude probably wouldn't be very different between the two experiments. The ballistic phase of crossing a large distance should theoretically be pretty similar even when not trying to be precise. These would combine to make k smaller overall in our experiment. We note directions for nailing down these complications in our future work section.

Another possible explanation is the task differences between the two experiments, especially those related to explicit visual feedback about accuracy. Barry Po notes in his PhD dissertation [35] that there are two human visual systems (or streams): the ventral system, which has a more cognitive response when some types of feedback are present, and the dorsal system, which has more of a pure sensorimotor response when little feedback is present. The two visual systems often have different performance and different error patterns. Participants may be engaging a different visual system in each experiment. It is not clear that the observed trend in k should be similar in both visual systems. More research needs to be done to discern to what extent each experiment engaged each system, perhaps by examining the exact coefficient values as opposed to just relative k values.

5.4 Future Work

In this section we consider two main avenues for future work: further investigating and reinforcing our understanding of the k parameter, and attempting to utilize this new understanding to implement a VR calibration system.

5.4.1 Isolating Factors That Affect k

We have already noted that the different experimental task may have resulted in a lower k than previous studies. However, we had to remove one outlier point from the data in order to model k as well as in the last experiment. This new experiment was intended to provide a fundamental human baseline for expected pointing performance without computer mediation. Given that participant instructions were the same for all conditions and they were partially counterbalanced, what could cause such a dramatic outlier? Resolving this issue will be important before going forward with any calibration systems relying on an accurate model of k.

One plausible avenue to investigate is our sample being biased.
Our average participant age was similar to the previous experiment, but through random sampling we had a good deal more women than men (14 compared to 3). Our previous experiment was biased in favor of more men than women (12 compared to 8). Gender variation in motor skills is highly studied and produces significant differences in a variety of tasks. Women tend to perform slightly better at precision motor tasks, while men perform better at spatial reasoning [14, 52]. Perhaps gender variation might account for some of why k was lower in the second experiment, as women might be able to make precise adjustments more quickly. This could lower the impact of width on pointing performance and lead to a lower k, such as in our outlier point. It therefore seems plausible that the different biasing of the sample caused k to be lower in our experiment than in the last. However, this would not explain why specific depths were impacted so much more than others. Theoretically this should have been accounted for by our counterbalanced within-participants design. We leave post-hoc testing for this to future work, as the sample sizes in this experiment (3 men vs 14 women) are not appropriate for such a between-subjects comparison.

Perhaps more likely to cause individual variations in conditions are changes in participant strategy. Of course there is a tradeoff between speed and accuracy in motor tasks like pointing. Depending on the instructions given, participants might emphasize selecting quickly or precisely, which creates a tradeoff between movement time and success rate. Thus it is common practice in pointing studies to use effective width to model pointing performance. This adjusts the predicted size of the targets post hoc according to what participants actually did. Effectively, this makes our analysis consider modeled targets larger if participants sacrificed accuracy. However, an open question remains: does the speed-accuracy tradeoff also affect k? Since k captures the relative impact of amplitude and width on pointing performance, could participants choosing to sacrifice accuracy for a condition cause the impact of width to lessen and k to fall? If so, does modeling k with effective width also account for this tradeoff? Do we need to develop an adjustment similar to effective width for the k parameter itself? Through random chance, could participants have chosen to change their strategy in one condition?

Future studies should look to isolate some of these factors potentially impacting k so we are more sure of what causes the outliers. The same experiment could be run with computer mediation to validate our assertion that task differences caused the lower k values in this experiment. More careful sampling could address gender balance issues, or test whether this affected our results through a between-participants design. More careful analysis should be done to tease apart the relationship between strategy and k values. Perhaps a future experiment should try a between-participants design where some participants are instructed to be accurate and some are instructed to be fast. We think it is reasonable that at least a few of these studies could be run to hopefully discover how we can make sure such outliers do not happen in the future.

5.4.2 Addressing VR Calibration

Let's assume for now that we have investigated the factors causing k to vary at the different target depths and can come up with a model that is both effective and robust to noise and outliers.
Assuming that this model is still linear, what can we practically accomplish? We had suggested in our previous experiment an idea whereby the virtual target depth at which we render the targets could be adjusted to reflect the human baseline k curve determined experimentally. This idea has promise but has a few problems. Does adjusting the actual target depth of the object (and thus its size) change people's perceived target depths, and is that a valid way of adjusting performance in a simulation? Given the multitude of potential factors affecting the absolute values of k, is it plausible to come up with a single human baseline and say what k "should" be at a given target depth? If individual strategy plays a role, how much can we trust the k values from our participants? Furthermore, the naive idea proposed in our last experiment had some edge cases where the corrected target depth wouldn't make sense (e.g. producing a negative target depth when extremely close). There seems to be a fair amount of work to be done to integrate this knowledge into a working VR calibration system.

We propose that for any VR calibration system to work it is going to be dependent upon a model from a small set of participants that crossed real and virtual depth. Furthermore, such a system must ignore the absolute values of k and simply strive to make the VR and real k curves more similar through relative differences. We think it is impossible to make the curves exactly match up. However, one could run a quick pointing task crossing physical and virtual depth. If one then observed that k variation is steeper with virtual target depth than real, one could change stereo parameters to play down the depth effect at far target depths and make the target seem closer while keeping size consistent. Such dynamic stereo parameter adjustment has already been applied in the literature to enhance depth estimation. We argue that it would be valuable to connect this technique to actual task performance and validate the results it gives from a simulation perspective. Perhaps an experiment could be designed to test multiple methods of calibrating a VR system and arrive at a unified system that provides more accurate and comfortable VR experiences.

5.5 Conclusion

This experiment has investigated the variation of Shoemaker's k-factor with depth while removing the confounding factors of virtual reality and computer-mediated pointing. We provided further evidence that two-part models outperform one-part models in conditions where k deviates from one. We reinforced our conclusions that angular measures do not really improve our models of pointing performance. We also showed that even after removing computer mediation from the task, k still got bigger as target depth increased. Overall, k was smaller and increased more slowly than in our previous experiment, which we argue is likely caused by task differences or sampling biases. Our models of target depth's impact on k were refined from previous studies to show that a linear approximation still provides the most reliable and effective modeling. Future work should investigate a number of factors, such as the speed-accuracy tradeoff, that could explain the noise we noticed in the k-factor. We also argued that dynamic stereo parameter adjustment may be employed as a means of making the virtual and real k curves line up more closely. Future work will investigate these factors, attempt to create a unified model of k, and apply it to VR calibration.
Chapter 6

Where Are We? The Path Ahead

This thesis has investigated the research community's methods for evaluating pointing performance in distal pointing tasks. In particular, we tested a variety of common models of pointing, together with specific improvements and modifications that have been suggested in the literature. These included Fitts's one-part models versus Welford-style formulations, as well as angular versus classic measures of target difficulty. We also investigated the natural human baseline of distal pointing performance without computer mediation.

6.1 One-Part and Two-Part Models of Pointing

Through repeated experiments we have shown that amplitude and width have separable impacts on pointing performance. In many common devices and interactions their impacts happen to be similar by happenstance. In such situations standard one-part models of pointing, such as Fitts's Law and the Shannon formulation, were shown to be as effective as more complex two-part models. However, we have demonstrated that the relative impact of amplitude and width, as characterized by Shoemaker's k parameter, varies with target depth in distal pointing. When k implies equal impact, Fitts-style and Welford-style models are mathematically identical and produce similar results. However, when k implies divergence, Welford's formulation outperforms Fitts's Law in a statistically significant way. This makes one-part models less robust for modeling distal pointing interactions, which are particularly common in virtual reality and a popular research area at the moment. Thus we recommend the two-part Welford formulation, or its Shannon equivalent, as a valid and effective alternative to Fitts's Law.

We note that one-part Fitts's Law allows use of the throughput parameter to compare aggregate pointing performance across studies. Mathematically defined as the average index of difficulty divided by average movement time, it provides a rough metric of how effective a device is. Two-part models of pointing performance separate the index of difficulty into two terms, and thus it is not obvious how they should be incorporated into the throughput metric. Lacking this standard metric of device quality has arguably been one of the main stumbling blocks to using two-part models in practice. Given our data showing that two-part models will become more necessary for modeling the distal interactions that are common in VR, we feel this is a shared problem that needs to be addressed. However, the solution is perhaps much simpler than imagined. In fact, why does throughput even need to change at all?

Fitts's Law is formalized using the logarithm of the ratio of amplitude to width, with throughput dividing that logarithm by average movement time:

    MT = a + b log(A_e / W_e)    (2.1)

    TP = log(A_e / W_e) / MT    (2.7)

A key observation is that the standard definition of throughput removes any consideration of the magnitude of the b coefficient. Instead, it simply compares the magnitude of the ID logarithm (adjusted for consistent accuracy by effective width) to the resulting movement time. Welford's formulation splits the index of difficulty into amplitude and width terms that are subtracted. This allows a linear regression to apply separate coefficients to each ID term, reflecting their separable impacts:

    MT = a + b_1 log(A_e) - b_2 log(W_e)    (2.3)
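To illustrate how the two coefficients, and hence Shoemaker's k = b_2 / b_1, fall out of an ordinary least-squares fit of Equation 2.3, the Python sketch below fits the DS = DV = 110 cm rows of Table A.1 (Appendix A). Our actual analyses used standard statistical software, so this is an illustrative sketch of the computation rather than the code behind the reported results.

    import numpy as np

    # Effective amplitudes and widths (cm) and mean movement times (ms)
    # for the DS = DV = 110 cm conditions in Table A.1 of Appendix A.
    Ae = np.array([25.45, 25.46, 25.92, 50.14, 50.35, 51.00,
                   75.62, 75.42, 75.07])
    We = np.array([6.70, 8.65, 16.43, 7.67, 12.10, 17.31,
                   11.31, 12.86, 29.25])
    MT = np.array([1229.85, 911.54, 688.41, 1551.79, 1205.60, 965.46,
                   1798.58, 1385.95, 1099.31])

    # Welford's two-part model (Equation 2.3):
    #     MT = a + b1*log(Ae) - b2*log(We)
    # The base of the logarithm only rescales b1 and b2 and cancels in k.
    X = np.column_stack([np.ones_like(MT), np.log2(Ae), -np.log2(We)])
    a, b1, b2 = np.linalg.lstsq(X, MT, rcond=None)[0]

    k = b2 / b1  # relative impact of width versus amplitude
    print(f"a = {a:.0f} ms, b1 = {b1:.0f}, b2 = {b2:.0f}, k = {k:.2f}")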
One could, if so inclined, follow the same logic as Fitts throughput and create separate throughput terms for the two logarithms in the Welford model. This leads naturally to Equation 6.1 and Equation 6.2, which represent the throughput impacts of A and W. These could be added together to form the final throughput of a Welford model; however, larger widths make the task easier, so a final throughput would arguably subtract the two, as in Equation 6.3:

    TP_a = log(A_e) / MT = ID_a / MT    (6.1)

    TP_b = log(W_e) / MT = ID_b / MT    (6.2)

    TP_2 = TP_a - TP_b    (6.3)

Alternatively, one could treat the whole of Welford's formulation as the index of difficulty and define the new throughput quite simply as Equation 6.4. As with one-part models, we drop consideration of the actual model coefficients and just compare effective target difficulty to movement time:

    TP_w = (log(A_e) - log(W_e)) / MT = ID_w / MT    (6.4)

This yields a metric of throughput for the Welford formulation that is philosophically analogous to the standard Fitts's Law throughput equation. Mathematically, however, all of these formulations of throughput produce identical results. By the logarithm quotient rule (Equation 6.5) we can immediately simplify one-part throughput to TP_w. Applying the basic fraction rule (A - B)/W = A/W - B/W further simplifies TP_w to TP_2. This parallels our earlier observation that when the coefficients of a Welford model are the same (or, as here, not considered) Fitts's Law and Welford's formulation produce the same model of pointing.

    log(x/y) = log(x) - log(y)    (6.5)

Therefore, because the classical definition of throughput does not consider model coefficients, there is no need to change the mathematical definition of throughput. It is simply a matter of notation; mathematically the formulations produce the same value. However, since we have split throughput into two terms, we could assign weights to each term in Welford's throughput (Equation 6.3). This would let us specify whether throughput cares more about accuracy or speed, but it is not clear what benefit we would gain by doing so. Throughput is intended as a rough metric of overall quality, considering both accuracy and speed, that can be compared consistently across studies. Deciding whether we care more about ballistic speed or precision accuracy is a more nuanced analysis than throughput was intended for. Rather than discarding or radically changing this useful metric, we argue that throughput should be kept in its original form, or merely changed in notation to reflect the use of two-part models. In addition, future studies should also report the value of the k parameter their models produce. This would give us the general throughput classifier ("x is ultimately more efficient than y") as well as a more nuanced understanding of a device's strengths ("because k is positive this device is better at rough ballistic movements"). This may be helpful in tailoring our pointing devices to specific interactions or display configurations.
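The equivalence of Equations 2.7, 6.3, and 6.4 is easy to confirm numerically. The following sketch uses a single made-up trial; the specific numbers are arbitrary.

    import math

    Ae, We, MT = 50.0, 10.0, 1.2  # amplitude (cm), width (cm), time (s)

    tp_onepart = math.log2(Ae / We) / MT            # Equation 2.7
    tp_w = (math.log2(Ae) - math.log2(We)) / MT     # Equation 6.4
    tp_2 = math.log2(Ae) / MT - math.log2(We) / MT  # Equation 6.3

    assert math.isclose(tp_onepart, tp_w) and math.isclose(tp_w, tp_2)
    print(f"all three formulations give {tp_onepart:.3f} bits/s")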
6.2 Measuring Target Difficulty

There has long been a question in pointing research of how we should measure and quantify how difficult a target is to select. More specifically, what sort of units should we use, and where should they be measured? This was partly the motivation for the one-part formulation of Fitts's Law: by dividing amplitude by width, the units cancel out and we are left with a unitless metric of target difficulty [41]. Most studies simply measure sizes in centimeters on the screen, but other options have been suggested. In particular, studies of distal interaction have attempted to measure difficulty in "hand space", expressing it in terms of the angles through which the arm has to move. In repeated experiments we have demonstrated that angular measures of target difficulty do not provide any consistent, regular improvement in modeling strength over classic measures in centimeters. In some cases we saw minor improvements, but in other cases minor deterioration. We consider this an indication of natural random variation due to scale change under perspective projection, as opposed to any coherent improvement. On a more global level, we have argued that the units with which one measures pointing performance largely act as a scaling factor. While using consistent units is important to keep throughput mathematically comparable, the choice of units does not seem to truly change the underlying model.

6.3 Variation of Shoemaker's k Parameter with Target Depth

Shoemaker's k factor is defined as the ratio of the amplitude and width coefficients in a Welford model of pointing. It captures the relative impact of amplitude and width on pointing performance, and previous work had already shown that it varies linearly with gain [40]. In this thesis we extended this line of enquiry and demonstrated through repeated experiments that target depth has a similar impact on distal pointing. The relationship is at least roughly monotonic. While there is some appeal to a logarithmic model, we argue that it is well approximated by a linear trend: a very simple and explanatory model that produces consistently high R2, while more complex models do not add any statistical power. Furthermore, we have shown that while latency may exacerbate the magnitude of k variation, the relationship still occurs when latency is removed. Thus we argue that k variation with target depth reflects a true sensorimotor trend and is not unique to computer-mediated interaction. This has implications for the calibration of virtual environments, discussed in the next section.

While we have shown evidence to suggest that k may vary logarithmically with target depth, there is still much work to be done in making our understanding of the trend more nuanced. There remains an issue with outliers, where specific points seem to vary out of step with the rest. We feel tighter experimental controls should be developed to more closely understand and minimize this variation. We therefore suggest three follow-up investigations.

• A post-hoc analysis or between-subjects comparison to determine whether the skewed gender balance in our second experiment could have affected our trend in k.

• A computer-mediated experiment performing the same task as our latency-free experiment, to confirm how much of the reduction in k is attributable to the task versus latency.

• A post-hoc analysis or between-subjects comparison to determine how much of an impact user strategy has upon k.

We feel these are the most likely explanations for why the outliers may have occurred in our second experiment. By investigating these impacts we could hopefully devise some metric for the deviation of k that either accounts for these nuisance factors or corrects for them post hoc. Creating a robust human baseline for how k varies with target depth would be important and interesting for calibrating VR pointing tasks for lifelike performance.
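To make the recommended modeling concrete, the sketch below fits a linear model k(D) = mD + c across the six target depths used in our second experiment. The k values in the array are hypothetical placeholders for illustration; they are not our measured data.

    import numpy as np

    depth = np.array([110., 165., 220., 275., 330., 365.])  # target depth (cm)
    k = np.array([0.62, 0.68, 0.71, 0.77, 0.80, 0.85])      # hypothetical k values

    slope, intercept = np.polyfit(depth, k, 1)  # linear model k(D) = m*D + c
    pred = slope * depth + intercept
    r2 = 1 - np.sum((k - pred) ** 2) / np.sum((k - np.mean(k)) ** 2)
    print(f"k(D) = {slope:.4f}*D + {intercept:.2f}, R^2 = {r2:.3f}")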
6.4 Impact on Virtual Reality Calibration

Given the recent spate of new ideas and work in virtual reality, an important and pertinent research topic is ensuring that these devices simulate reality effectively. In this work we have provided some preliminary ideation towards implementing a performance-based method for VR calibration. This would ensure that participants are not just self-reporting that objects look correct, but are actually interacting with them in a manner similar to real life. After our initial experiment, we suggested simply adjusting the target depth at which virtual objects are rendered to match the k curve of the ground-truth condition. Theoretically this would line up each VR condition with the ground-truth condition of equivalent performance. A diagram for this is presented in Figure 3.9 (see also Figure 6.1 below).

Figure 6.1: The three k-lines for DS = 110 (red / squares), 220 (green / triangles) and 330 (purple / circles), along with the k-line for the non-VR DV = DS conditions (blue / diamonds). To calibrate binocular depth using k values, the desired binocular depth (A) determines a k-value (B) on the blue DV = DS line. That same k-value on the red DS = 110 line (C) determines D′V to be the corresponding binocular depth (D) that the software should use to ensure the desired pointing performance if the screen is 110 cm from the viewer.

While the simplicity of this idea is useful and intuitive, after seeing the potential for outliers in our second experiment we have decided to move away from it. There is simply too great a chance of an outlier causing the curves to diverge from the truth and giving us nonsensical depths. This is particularly problematic towards the ends of the interaction zone, where the corrected depth may not even make sense. Instead, we now believe that dynamic stereo parameter adjustment may provide a more promising avenue. It has already been shown in the literature that changing the stereo parameters can make depth estimation more accurate [24]. We hypothesize that this method can be augmented with an understanding of the real and virtual curves of k. One could note where k in virtual space exceeds the intended physical result and subtly adjust the stereo parameters to play down the depth effect. The exact amount of shift would need to be decided, and the results of this calibration would need to be validated for simulation effectiveness. Does it appear reasonable, and how does it affect simulation sickness? While still very much preliminary, this seems a promising extension to accepted and relevant work in the field.
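A sketch of the comparison step we have in mind follows, using hypothetical slopes and intercepts for the real and virtual k-lines. How a detected mismatch should translate into a concrete stereo-parameter shift is exactly the open question noted above, so the sketch stops at flagging the depths that need adjustment.

    import numpy as np

    def k_line(depth, slope, intercept):
        """Linear k(D) model; the parameters used below are hypothetical."""
        return slope * depth + intercept

    depths = np.linspace(110, 330, 5)          # target depths to probe (cm)
    k_real = k_line(depths, 0.0010, 0.55)      # physical-pointing baseline
    k_virtual = k_line(depths, 0.0016, 0.40)   # one stereo display condition

    # Flag depths where the virtual k curve overshoots the real one; there
    # the stereo depth effect should be played down.
    for d, m in zip(depths, k_virtual - k_real):
        if m > 0:
            print(f"at {d:.0f} cm virtual k exceeds real k by {m:.2f}")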
6.5 Final Words

Distal interaction is a fundamental sub-component of exploring and working with many virtual environments. This work has gone back to the fundamentals of the human sensorimotor system and re-evaluated how we choose our models of pointing performance. We have demonstrated situations where current experimental practice fails unpredictably, which may cause devices and tasks with non-standard gain or depth parameters to be rejected by the pointing community. Welford's two-part model of pointing performance has been shown to be a more robust and effective model of pointing. We have further explained how two-part models can be extended with a metric of throughput equivalent to that of one-part models; in our opinion, this removes the last major objection to using two-part models over one-part models. Enhancements such as angular measures of target difficulty did not consistently improve our models. We have also argued that there is a logarithmic relationship between k (the relative impact of amplitude and width on pointing performance) and target depth, although this relationship can be reasonably approximated by a rougher linear trend. This work has impact upon, and provides new avenues for, work in VR calibration while pushing forward our methods for pointing evaluation. Hopefully, it provides a strong theoretical basis for how we evaluate new pointing techniques in VR interaction.

Bibliography

[1] M. Azmandian, M. Hancock, H. Benko, E. Ofek, and A. D. Wilson. Haptic retargeting: Dynamic repurposing of passive haptics for enhanced virtual reality experiences. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 1968–1979. ACM, 2016. → pages 14

[2] G. Casiez, D. Vogel, R. Balakrishnan, and A. Cockburn. The impact of control-display gain on user performance in pointing tasks. Human–Computer Interaction, 23(3):215–250, 2008. doi:10.1080/07370020802278163. → pages 25, 30

[3] S. A. Douglas, A. E. Kirkpatrick, and I. S. MacKenzie. Testing pointing device performance and user assessment with the ISO 9241, part 9 standard. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '99, pages 215–222, New York, NY, USA, 1999. ACM. doi:10.1145/302979.303042. → pages 5, 20, 58

[4] S. R. Ellis. Nature and origins of virtual environments: A bibliographical essay. Computing Systems in Engineering, 2(4):321–347, May 1991. → pages 13, 35

[5] D. C. Engelbart and W. K. English. A research center for augmenting human intellect. In Proceedings of the December 9–11, 1968, Fall Joint Computer Conference, Part I, pages 395–410. ACM, 1968. → pages 5

[6] W. K. English, D. C. Engelbart, and M. L. Berman. Display-selection techniques for text manipulation. IEEE Transactions on Human Factors in Electronics, HFE-8(1):5–15, March 1967. doi:10.1109/THFE.1967.232994. → pages 5

[7] A. S. Fernandes and S. K. Feiner. Combating VR sickness through subtle dynamic field-of-view modification. In 2016 IEEE Symposium on 3D User Interfaces (3DUI), pages 201–210. IEEE, 2016. → pages 15

[8] D. Finnegan, E. O'Neill, and M. Proulx. Compensating for distance compression in audiovisual virtual environments using incongruence. In SIGCHI Conference on Human Factors in Computing Systems 2016. University of Bath, 2016. → pages 15

[9] P. M. Fitts. The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47(6):381, 1954. → pages 8, 41

[10] P. M. Fitts. The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47(6):381–391, 1954. → pages 2, 5, 20, 24, 58

[11] F. E. Gonzalez. A camera-based approach to remote pointing interactions in the classroom. PhD thesis, University of British Columbia, 2015. → pages 11

[12] E. D. Graham. Pointing on a computer display. PhD thesis, Simon Fraser University, 1996. → pages 30

[13] E. D. Graham and C. L. MacKenzie. Physical versus virtual pointing. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '96, pages 292–299, New York, NY, USA, 1996. ACM. doi:10.1145/238386.238532. → pages 34, 37

[14] E. Hampson. Estrogen-related variations in human spatial and articulatory-motor skills. Psychoneuroendocrinology, 15(2):97–111, 1990. → pages 83

[15] K. Iyer, M. Chari, and H. Kannan. A novel approach to depth image based rendering based on non-uniform scaling of depth values. In Future Generation Communication and Networking Symposia, 2008. FGCNS '08. Second International Conference on, volume 3, pages 31–34, Dec 2008. doi:10.1109/FGCNS.2008.46. → pages 14, 35
[16] B. F. Janzen and R. J. Teather. Is 60 fps better than 30?: The impact of frame rate and latency on moving target selection. In CHI '14 Extended Abstracts on Human Factors in Computing Systems, pages 1477–1482. ACM, 2014. → pages iv, 12

[17] G. R. Jones, D. Lee, N. S. Holliman, and D. Ezra. Controlling perceived depth in stereoscopic images. Proc. SPIE Stereoscopic Displays and Virtual Reality Systems VIII, 4297:42–53, June 2001. → pages 14, 35

[18] R. Jota, A. Ng, P. Dietz, and D. Wigdor. How fast is fast enough?: A study of the effects of latency in direct-touch pointing tasks. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 2291–2300. ACM, 2013. → pages 12

[19] N. C. Kephart. The slow learner in the classroom. Columbus, OH: Merrill, 2nd edition, 1971. → pages 21, 62

[20] H. H. Koester, E. LoPresti, and R. C. Simpson. Toward Goldilocks' pointing device: Determining a "just right" gain setting for users with physical impairments. In Proceedings of the 7th International ACM SIGACCESS Conference on Computers and Accessibility, Assets '05, pages 84–89, New York, NY, USA, 2005. ACM. doi:10.1145/1090785.1090802. → pages 25

[21] J. Kohn and S. Rank. Evaluating physical movement as trigger for transitioning between environments in virtual reality. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems, pages 1973–1979. ACM, 2016. → pages 14

[22] R. Konrad, E. A. Cooper, and G. Wetzstein. Novel optical configurations for virtual reality: Evaluating user preference and performance with focus-tunable and monovision near-eye displays. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 1211–1220. ACM, 2016. → pages 14

[23] R. Kopper, D. A. Bowman, M. G. Silva, and R. P. McMahan. A human motor behavior model for distal pointing tasks. Int. J. Hum.-Comput. Stud., 68(10):603–615, Oct. 2010. doi:10.1016/j.ijhcs.2010.05.001. → pages 1, 2, 11, 28, 65, 69

[24] A. Kulshreshth and J. J. LaViola Jr. Dynamic stereoscopic 3D parameter adjustments for enhanced depth discrimination. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 177–187. ACM, 2016. → pages 14, 91

[25] M. A. Lawrence. ez: Easy analysis and visualization of factorial experiments (version 4.3). https://cran.r-project.org/web/packages/ez/, November 2015. Accessed: 2016-08-26. → pages 66

[26] I. S. MacKenzie. Fitts' law as a performance model in human-computer interaction. PhD thesis, University of Toronto, Toronto, ON, Canada, 1991. → pages x, 5, 7

[27] I. S. MacKenzie. Fitts' law as a research and design tool in human-computer interaction. Hum.-Comput. Interact., 7(1):91–139, Mar. 1992. doi:10.1207/s15327051hci0701_3. → pages 31

[28] I. S. MacKenzie and S. Riddersma. Effects of output display and control-display gain on human performance in interactive systems. Behaviour and Information Technology, 13:328–337, 1994. → pages 5, 25

[29] B. A. Myers. A brief history of human-computer interaction technology. interactions, 5(2):44–54, Mar. 1998. doi:10.1145/274430.274436. → pages 5

[30] V. Oculus. Oculus Rift: Virtual reality headset for 3D gaming. URL: http://www.oculusvr.com, 2012. → pages 14

[31] A. Pavlovych and W. Stuerzlinger. The tradeoff between spatial jitter and latency in pointing tasks. In Proceedings of the 1st ACM SIGCHI Symposium on Engineering Interactive Computing Systems, pages 187–196. ACM, 2009. → pages 3, 12, 37, 40, 57
[32] A. Pavlovych and W. Stuerzlinger. Target following performance in the presence of latency, jitter, and signal dropouts. In Proceedings of Graphics Interface 2011, pages 33–40. Canadian Human-Computer Communications Society, 2011. → pages 12

[33] B. Peek. WiimoteLib – .NET managed library for the Nintendo Wii remote, 2012. URL brianpeek.com/page/wiimotelib. Accessed: 2016-08-26. → pages 19

[34] A. Pino, E. Tzemis, N. Ioannou, and G. Kouroupetroglou. Using Kinect for 2D and 3D pointing tasks: Performance evaluation. In Human-Computer Interaction. Interaction Modalities and Techniques, pages 358–367. Springer, 2013. → pages 10

[35] B. A. Po. Open Loop Pointing in Virtual Environments. PhD thesis, University of British Columbia, 2002. → pages 82

[36] P. Rubin. The inside story of Oculus Rift and how virtual reality became reality. Wired (May 20), 2014. Accessed January 24, 2015. → pages 14

[37] L. Sambrooks and B. Wilkinson. Comparison of gestural, touch, and mouse interaction with Fitts' law. In Proceedings of the 25th Australian Computer-Human Interaction Conference: Augmentation, Application, Innovation, Collaboration, pages 119–122. ACM, 2013. → pages 10

[38] W. N. Schofield. Do children find movements which cross the body midline difficult? Quarterly Journal of Experimental Psychology, 28(4):571–582, 1976. doi:10.1080/14640747608400584. → pages 21, 62

[39] G. Shoemaker. Body-centric and shadow-based interaction for large wall displays. PhD thesis, University of British Columbia, 2010. → pages 11

[40] G. Shoemaker, T. Tsukitani, Y. Kitamura, and K. S. Booth. Two-part models capture the impact of gain on pointing performance. ACM Transactions on Computer-Human Interaction (TOCHI), 19(4):28, 2012. → pages 2, 6, 7, 11, 12, 16, 17, 20, 23, 25, 30, 33, 34, 36, 37, 56, 58, 77, 90

[41] R. W. Soukoreff and I. S. MacKenzie. Towards a standard for pointing device evaluation, perspectives on 27 years of Fitts' law research in HCI. Int. J. Hum.-Comput. Stud., 61(6):751–789, Dec. 2004. doi:10.1016/j.ijhcs.2004.09.001. → pages 5, 6, 8, 11, 24, 31, 89

[42] W. Strunk, Jr. and E. B. White. Elements of Style. New York, NY: Macmillan, 1959. → pages 6

[43] R. J. Teather and W. Stuerzlinger. Pointing at 3D targets in a stereo head-tracked virtual environment. In 3D User Interfaces (3DUI), 2011 IEEE Symposium on, pages 87–94. IEEE, 2011. → pages 1, 35

[44] R. J. Teather and W. Stuerzlinger. A system for evaluating 3D pointing techniques. In Proceedings of the 18th ACM Symposium on Virtual Reality Software and Technology, VRST '12, pages 209–210, New York, NY, USA, 2012. ACM. doi:10.1145/2407336.2407383. → pages 35

[45] R. J. Teather and W. Stuerzlinger. Pointing at 3D target projections with one-eyed and stereo cursors. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13, pages 159–168, New York, NY, USA, 2013. ACM. doi:10.1145/2470654.2470677. → pages

[46] R. J. Teather and W. Stuerzlinger. Visual aids in 3D point selection experiments. In Proceedings of the 2nd ACM Symposium on Spatial User Interaction, SUI '14, pages 127–136, New York, NY, USA, 2014. ACM. doi:10.1145/2659766.2659770. → pages
[47] R. J. Teather and W. Stuerzlinger. Depth cues and mouse-based 3D target selection. In Proceedings of the 2nd ACM Symposium on Spatial User Interaction, SUI '14, pages 156–156, New York, NY, USA, 2014. ACM. doi:10.1145/2659766.2661221. → pages

[48] R. J. Teather and W. Stuerzlinger. Factors affecting mouse-based 3D selection in desktop VR systems. In Proceedings of the 3rd ACM Symposium on Spatial User Interaction, SUI '15, pages 10–19, New York, NY, USA, 2015. ACM. doi:10.1145/2788940.2788946. → pages 35

[49] R. J. Teather, A. Pavlovych, W. Stuerzlinger, and S. I. MacKenzie. Effects of tracking technology, latency, and spatial jitter on object movement. In 3D User Interfaces, 2009. 3DUI 2009. IEEE Symposium on, pages 43–50. IEEE, 2009. → pages 12, 40, 57

[50] R. J. Teather, W. Stuerzlinger, and A. Pavlovych. Fishtank Fitts: A desktop VR testbed for evaluating 3D pointing techniques. In CHI '14 Extended Abstracts on Human Factors in Computing Systems, CHI EA '14, pages 519–522, New York, NY, USA, 2014. ACM. doi:10.1145/2559206.2574810. → pages 35

[51] W. H. Teichner, J. L. Kobrick, and R. F. Wehrkamp. The effects of terrain and observation distance on relative depth discrimination. The American Journal of Psychology, 68(2):193–208, 1955. → pages 79

[52] D. Voyer, S. Voyer, and M. P. Bryden. Magnitude of sex differences in spatial abilities: A meta-analysis and consideration of critical variables. Psychological Bulletin, 117(2):250, 1995. → pages 83

[53] A. T. Welford. Fundamentals of skill. Methuen's Manuals of Modern Psychology. London: Methuen, 1968. → pages 2, 6, 30, 34, 37

[54] R. Xiao and H. Benko. Augmenting the field-of-view of head-mounted displays with sparse peripheral displays. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 1221–1232. ACM, 2016. → pages 14

[55] C. Yuan, H. Pan, and S. Daly. Stereoscopic 3D content depth tuning guided by human visual models. SID Symposium Digest of Technical Papers, 42(1):916–919, 2011. → pages 14, 35

[56] A. Zaranek, B. Ramoul, H. F. Yu, Y. Yao, and R. J. Teather. Performance of modern gaming input devices in first-person shooter target acquisition. In CHI '14 Extended Abstracts on Human Factors in Computing Systems, pages 1495–1500. ACM, 2014. → pages 10

[57] S. Zhai, J. Kong, and X. Ren. Speed-accuracy tradeoff in Fitts' law tasks: On the equivalency of actual and nominal pointing precision. Int. J. Hum.-Comput. Stud., 61(6):823–856, Dec. 2004. doi:10.1016/j.ijhcs.2004.09.007. → pages 27

[58] Z. Zhang. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1330–1334, 2000. → pages 51

Appendix A

Supporting Materials

This appendix primarily contains the anonymized raw data from our different experiments.
As the total amount of data would be unmanageable to report in full, we provide aggregate statistics, including average movement time and effective width. Study instruments, including our demographics questionnaire, are also reproduced.

A.1 Study Materials

The study instruments (consent forms and the demographics questionnaire) are reproduced in Figures A.1 through A.3.

A.2 Raw Data

DS DV A W WE AE MT LowCI HighCI
110 110 25 5 6.70 25.45 1229.85 1138.89 1320.82
110 110 25 10 8.65 25.46 911.54 820.57 1002.50
110 110 25 20 16.43 25.92 688.41 597.44 779.37
110 110 50 5 7.67 50.14 1551.79 1460.82 1642.75
110 110 50 10 12.10 50.35 1205.60 1114.64 1296.57
110 110 50 20 17.31 51.00 965.46 874.49 1056.42
110 110 75 5 11.31 75.62 1798.58 1707.62 1889.55
110 110 75 10 12.86 75.42 1385.95 1294.99 1476.92
110 110 75 20 29.25 75.07 1099.31 1008.35 1190.28
110 220 25 5 6.63 25.29 1887.37 1796.41 1978.34
110 220 25 10 6.67 25.40 1192.47 1101.50 1283.43
110 220 25 20 10.46 25.35 852.15 761.19 943.12
110 220 50 5 7.43 50.28 1972.32 1881.36 2063.29
110 220 50 10 9.06 50.24 1385.73 1294.77 1476.69
110 220 50 20 11.47 49.73 1109.63 1018.66 1200.59
110 220 75 5 6.33 75.44 2317.52 2226.56 2408.49
110 220 75 10 14.78 75.33 1656.73 1565.77 1747.69
110 220 75 20 17.61 74.55 1221.71 1130.75 1312.68
110 330 25 5 9.44 25.02 2174.89 2083.93 2265.86
110 330 25 10 8.70 25.39 1713.43 1622.46 1804.39
110 330 25 20 9.95 25.18 963.01 872.04 1053.97
110 330 50 5 10.71 50.37 2492.41 2401.44 2583.37
110 330 50 10 9.48 49.78 1948.03 1857.07 2039.00
110 330 50 20 12.06 49.84 1268.31 1177.35 1359.27
110 330 75 5 9.22 75.51 2949.67 2858.70 3040.63
110 330 75 10 8.31 75.11 2069.95 1978.99 2160.92
110 330 75 20 13.79 75.11 1495.17 1404.21 1586.13
220 110 25 5 11.18 25.54 1351.85 1260.89 1442.82
220 110 25 10 23.86 25.93 1003.88 912.92 1094.85
220 110 25 20 25.19 26.92 828.98 738.01 919.94
220 110 50 5 11.90 50.10 1623.27 1532.31 1714.23
220 110 50 10 16.81 50.49 1321.52 1230.56 1412.49
220 110 50 20 30.67 50.15 1038.77 947.80 1129.73
220 110 75 5 15.85 76.29 1991.38 1900.41 2082.34
220 110 75 10 26.39 76.53 1569.39 1478.43 1660.36
220 110 75 20 31.66 77.61 1320.33 1229.36 1411.29
220 220 25 5 8.08 25.79 1332.75 1241.79 1423.72
220 220 25 10 12.61 25.61 973.90 882.94 1064.87
220 220 25 20 16.22 26.20 776.36 685.40 867.33
220 220 50 5 7.46 50.57 1698.42 1607.46 1789.39
220 220 50 10 12.36 50.91 1228.03 1137.07 1319.00
220 220 50 20 19.22 51.29 949.56 858.60 1040.53
220 220 75 5 9.36 75.51 1906.07 1815.10 1997.03
220 220 75 10 13.72 75.62 1515.02 1424.06 1605.99
220 220 75 20 19.89 76.24 1159.64 1068.67 1250.60
220 330 25 5 6.79 25.38 1663.40 1572.44 1754.37
220 330 25 10 10.16 25.26 1234.16 1143.19 1325.12
220 330 25 20 12.37 25.88 817.61 726.65 908.58
220 330 50 5 9.39 50.51 1998.51 1907.55 2089.48
220 330 50 10 10.57 50.65 1447.49 1356.53 1538.45
220 330 50 20 14.70 50.57 1087.13 996.17 1178.10
220 330 75 5 19.13 74.84 2142.74 2051.77 2233.70
220 330 75 10 20.99 74.92 1608.96 1518.00 1699.93
220 330 75 20 23.59 74.81 1192.52 1101.56 1283.49
330 110 25 5 29.92 24.68 1392.45 1301.49 1483.42
330 110 25 10 38.96 22.56 1075.40 984.44 1166.37
330 110 25 20 43.36 22.71 819.65 728.69 910.62
330 110 50 5 16.94 50.58 1971.15 1880.19 2062.12
330 110 50 10 36.52 48.10 1537.05 1446.09 1628.02
330 110 50 20 56.41 45.86 1137.21 1046.25 1228.18
330 110 75 5 78.31 74.66 2392.42 2301.45 2483.38
330 110 75 10 29.19 76.91 1936.15 1845.18 2027.11
330 110 75 20 85.21 74.40 1546.01 1455.04 1636.97
330 220 25 5 10.93 25.88 1348.55 1257.58 1439.51
330 220 25 10 19.95 25.80 1011.31 920.35 1102.28
330 220 25 20 21.92 26.42 767.65 676.69 858.62
330 220 50 5 11.80 50.46 1586.63 1495.67 1677.60
330 220 50 10 15.39 50.51 1296.33 1205.36 1387.29
330 220 50 20 35.53 50.91 947.00 856.03 1037.96
330 220 75 5 11.06 75.46 1908.90 1817.94 1999.87
330 220 75 10 37.28 75.07 1421.08 1330.11 1512.04
330 220 75 20 26.73 75.22 1139.59 1048.63 1230.56
330 330 25 5 8.81 25.73 1585.54 1494.57 1676.50
330 330 25 10 11.80 25.35 1053.63 962.66 1144.59
330 330 25 20 16.84 25.12 772.66 681.70 863.63
330 330 50 5 11.71 50.52 1851.61 1760.65 1942.58
330 330 50 10 15.25 50.85 1306.28 1215.32 1397.25
330 330 50 20 21.30 50.27 1012.67 921.71 1103.64
330 330 75 5 9.73 75.71 2161.65 2070.68 2252.61
330 330 75 10 12.93 75.48 1629.48 1538.52 1720.45
330 330 75 20 23.46 75.05 1118.22 1027.26 1209.19

Table A.1: Processed raw data from experiment one. Outliers have already been filtered out; we present effective widths and amplitudes, average movement times, and 95 percent confidence intervals for all conditions (distances in cm, times in ms).

DS A W WE AE MT LowCI HighCI
110 25 5 6.12 25.13 386.57 345.63 427.51
110 25 10 8.66 25.18 319.44 278.50 360.38
110 25 20 18.98 25.27 264.70 223.76 305.64
110 50 5 6.09 50.15 580.21 539.27 621.15
110 50 10 9.60 50.19 427.55 386.61 468.49
110 50 20 17.89 50.40 353.82 312.88 394.76
110 75 5 8.46 75.35 685.88 644.94 726.82
110 75 10 11.71 74.08 595.95 555.01 636.89
110 75 20 17.96 73.28 394.91 353.97 435.85
165 25 5 6.36 25.10 419.56 378.62 460.50
165 25 10 8.20 25.07 337.50 296.56 378.44
165 25 20 18.33 25.19 256.94 216.00 297.88
165 50 5 7.15 50.15 579.63 538.69 620.57
165 50 10 9.38 50.18 445.14 404.20 486.08
165 50 20 17.38 50.28 362.62 321.68 403.56
165 75 5 7.49 75.26 695.37 654.43 736.31
165 75 10 12.62 74.23 610.42 569.48 651.36
165 75 20 19.83 72.85 409.95 369.01 450.89
220 25 5 6.16 25.21 451.97 411.03 492.91
220 25 10 8.54 25.10 332.64 291.70 373.58
220 25 20 18.02 25.16 265.62 224.69 306.56
220 50 5 7.86 50.23 606.37 565.43 647.31
220 50 10 9.40 50.17 454.98 414.04 495.92
220 50 20 16.92 50.40 378.36 337.42 419.30
220 75 5 8.31 75.18 735.65 694.71 776.59
220 75 10 9.46 74.61 647.69 606.75 688.62
220 75 20 19.00 73.95 423.50 382.56 464.44
275 25 5 6.47 25.27 463.19 422.25 504.13
275 25 10 8.87 25.12 325.69 284.75 366.63
275 25 20 17.59 25.04 241.32 200.38 282.26
275 50 5 7.22 50.30 569.21 528.27 610.15
275 50 10 9.51 50.15 471.30 430.36 512.24
275 50 20 16.35 50.18 351.50 310.57 392.44
275 75 5 7.16 75.26 707.29 666.35 748.23
275 75 10 12.88 74.57 548.03 507.09 588.97
275 75 20 22.79 74.38 393.52 352.58 434.46
330 25 5 7.11 25.14 468.17 427.23 509.11
330 25 10 9.37 25.17 328.94 288.00 369.87
330 25 20 17.66 25.08 239.47 198.53 280.41
330 50 5 7.15 50.19 650.81 609.87 691.75
330 50 10 9.84 50.02 472.34 431.40 513.28
330 50 20 17.85 50.21 346.88 305.94 387.81
330 75 5 8.55 75.18 746.30 705.36 787.24
330 75 10 11.82 75.27 579.51 538.57 620.45
330 75 20 20.45 74.98 422.92 381.98 463.86
365 25 5 7.57 25.21 515.28 474.34 556.22
365 25 10 9.01 25.20 340.05 299.11 380.99
365 25 20 17.31 25.11 267.13 226.19 308.07
365 50 5 7.32 50.25 656.25 615.31 697.19
365 50 10 9.51 50.16 524.42 483.48 565.36
365 50 20 17.34 50.22 361.11 320.17 402.05
365 75 5 7.09 75.15 736.34 695.40 777.28
365 75 10 9.76 75.38 616.20 575.26 657.14
365 75 20 22.36 75.47 432.06 391.12 473.00

Table A.2: Processed movement time data from experiment two. Outliers have already been filtered out; we present effective widths and amplitudes, and average movement times with low and high 95 percent confidence interval bounds, for all conditions (distances in cm, times in ms).
Interacting With Large Displays II
UBC Department of Computer Science, ICICS/CS Building, 201-2366 Main Mall, Vancouver, B.C., V6T 1Z4
Consent Form (Version 1.2, 2011-August-28, page 1/2)

Principal Investigator: Kellogg S. Booth, Professor, Department of Computer Science, (604) 822-8193

Co-Investigators: Vasanth Rajendran, M.Sc. Student, Department of Computer Science, (778) 991-2616; Garth Shoemaker, Postdoctoral Fellow, Department of Computer Science, (604) 827-3993; Tao Su, M.Sc. Student, Department of Computer Science, (778) 318-8499; Julia Rose Freedman, B.Sc. Student, Department of Computer Science, (604) 603-9200

Project Purpose and Procedures: The purpose of this study is to evaluate different methods for interacting with different sized electronic displays. Because of the properties of some displays, and how they are used, standard devices such as mice and keyboards are ill suited. You will be asked to point at targets on the display, and press a button to select the targets.

Confidentiality: Your identity will remain anonymous and will be kept confidential. A computer will record performance and motion data as you perform the tasks, but no identifying information (such as your name) will be stored with this data, nor will it be associated with the data after it has been analyzed. The results will be made public through publications; however, no identifying information will be included in any published disclosure of the research. No audio recordings or photographs will be made of your participation.

Risks/Remuneration/Compensation: There are no anticipated risks to you participating in this research. Use of stereoscopic glasses may cause slight discomfort or fatigue in some subjects. You are free to take a break or withdraw from the study. You will receive an honorarium of $10 for your participation. You will be eligible for the honorarium even if you withdraw from the study.

Figure A.1: Consent form for experiment one, which investigated the impact of depth on pointing performance. Data gathering was performed by Vasanth Rajendran.

Interacting With Large Displays V
UBC Department of Computer Science, ICICS/CS Building, 201-2366 Main Mall, Vancouver, B.C., V6T 1Z4
Consent Form (Version 1.8, 2015-Dec-1, page 1/2)

Principal Investigator: Kellogg S. Booth, Professor, Department of Computer Science, 604-822-8193

Co-Investigator: Izabelle F. Janzen, M.Sc. Student, Department of Computer Science, 604-345-4263

Project Purpose and Procedures: The purpose of this study is to evaluate different methods for interacting with different sized electronic displays. Because of the properties of some displays, and how they are used, standard devices such as mice and keyboards are ill suited. You will be asked to point at targets on the display, and select the targets.

Confidentiality: Your identity will remain anonymous and will be kept confidential. A computer will record performance and motion data as you perform the tasks, but no identifying information (such as your name) will be stored with this data, nor will it be associated with the data after it has been analyzed. The results will be made public through publications; however, no identifying information will be included in any published disclosure of the research. No audio recordings or photographs will be made of your participation.
Risks/Remuneration/Compensation: There are no anticipated risks to you participating in this research. You are free to take a break or withdraw from the study. You will receive an honorarium of $10 for your participation. You will be eligible for the honorarium even if you withdraw from the study.

Figure A.2: Consent form for experiment two, which investigated whether the impact of depth on pointing performance is an artifact of system latency.

Interaction With Large Displays III
Phase 1 Pre-Experiment Questionnaire (Version 1.3, 2012-December-27, page 1/1)

Participant # ___

1. How old are you? ___ years

2. What is your gender? (tick one)
○ Male  ○ Female  ○ Other

3. How much time do you spend per week using a computer? (tick one)
○ Less than 1 hour  ○ 1 to 3 hours  ○ 4 to 8 hours  ○ More than 8 hours

4. Do you normally wear glasses or contact lenses? (tick one)
○ Yes — If yes, what is your prescription? __________  ○ I don't know
○ No

[Some questions not relevant to the current experiment have been deleted from earlier versions of the questionnaire]

Figure A.3: Demographics questionnaire used to gather qualitative data in both studies presented in this thesis.
