UBC Theses and Dissertations


Building believable robots : an exploration of how to make simple robots look, move, and feel right Bucci, Paul 2017

Full Text

Building Believable Robots
An exploration of how to make simple robots look, move, and feel right

by

Paul Bucci

B.A. Visual Arts, The University of British Columbia, 2012
B.Sc. Computer Science, The University of British Columbia, 2015

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
Master of Science
in
THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Computer Science)

The University of British Columbia
(Vancouver)

August 2017

© Paul Bucci, 2017

Abstract

Humans have an amazing ability to see a ‘spark of life’ in almost anything that moves. There is a natural urge to imbue objects with agency. It’s easy to imagine a child pretending that a toy is alive. Adults do it, too, even when presented with evidence to the contrary. Leveraging this instinct is key to building believable robots, i.e., robots that act, look, and feel like they are social agents with personalities, motives, and emotions. Although it is relatively easy to initiate a feeling of agency, it is difficult to control, consistently produce, and maintain an emotional connection with a robot. Designing a believable interaction requires balancing form, function and context: you have to get the story right.

In this thesis, we discuss (1) strategies for designing the bodies and behaviours of simple robot pets; (2) how these robots can communicate emotion; and (3) how people develop narratives that imbue the robots with agency. For (1), we developed a series of four robot design systems to create and rapidly iterate on robot form factors, as well as tools for improvising and refining expressive robot behaviours. For (2), we ran three studies wherein participants rated robot behaviours in terms of arousal and valence under different display conditions.
For (3), we ran a study wherein expert performers improvised emotional ‘stories’ with the robots; one of the studies in (2) also solicited narratives for the robot and its behaviours.

Lay Summary

Humans have an amazing ability to see a ‘spark of life’ in almost anything that moves. It’s easy to imagine a child pretending that a toy is alive. Adults do it, too. Leveraging this instinct is key to building believable robots, i.e., robots that act, look, and feel like they are social beings with personalities, motives, and emotions. Although it is relatively easy to make a robot seem alive, it is difficult to control, consistently produce, and maintain an emotional connection with a robot. Designing a believable interaction requires balancing form, function and context: you have to get the story right. For this thesis, we explored how robots can communicate emotions through simple body movements. We found that people will perceive a wide range of emotions from even simple robots, create complex stories about the robot’s inner life, and adapt their behaviour to match the robot.

Preface

This thesis is organized around two major published papers, with some current and unpublished work. For all work, I collaborated closely with members of the SPIN Lab, especially Laura Cang, Oliver Schneider, David Marino and my supervisor, Karon MacLean, and two visiting researchers, Jussi Rantala and Merel Jung. In conjunction with Laura and Oliver, I supervised a number of undergraduate researchers, all of whom contributed time and effort to the projects, including Sazi Valair, Lucia Tseng, Sophia Chen and Lotus Zhang. I am extremely grateful to each of them; however, I would attribute most of the intellectual contribution to myself, Laura, Oliver, and David. In CuddleBits [16], I was the principal designer, architect, and builder of the CuddleBit form factors and design systems. Although other people helped with evaluation and assembly, this is my work.
Further CuddleBit designs included a twisted string actuator introduced and developed by Soheil Kianzad.

The chapter on behaviour generation (4) outlines a number of behaviour generation ideas and systems. MacaronBit is an extension of work by Oliver that was initially developed by myself, Oliver, Jussi, Merel, and Laura, and carried into maturity by myself and David. Voodle [51] is a system initially conceived of by Oliver and prototyped by Oliver and David. I architected and implemented Voodle’s final versions that were used in our studies, with significant help from David in writing the final code. The Twiddler and Hand-Sketching prototypes were developed on my own. The work in Complexity stems from ideas developed by Laura and myself in CuddleBits, but was carried out to maturity by myself, with significant help from Lotus in taking care of the very important minutiae.

The chapter on behaviour evaluation (5) outlines a number of studies in which we evaluated the CuddleBits and Voodle, through both displayed and interactive behaviours. In CuddleBits, the initial study design was developed collaboratively between Laura, Oliver, Jussi, Merel, and myself, then brought to maturity by Laura and myself. In Voodle, I largely architected the study design, with help from David. Voodle analysis was performed collaboratively between myself, David, and Oliver.

As first author on CuddleBits, I principally drafted, organized, edited and wrote the paper. Laura and Karon contributed significantly to editing, with help from Jussi, Merel and Sazi.

On Voodle, I contributed heavily to initial drafting, framing, analysis and writing. David, Oliver, and Karon carried the paper through the first draft, then I edited and wrote heavily for the second draft. The final draft was largely an effort by Karon and David.

The chapter on current and future work (6) includes ideas that are currently in development by myself, Laura, and Mario Cimet in an equal intellectual partnership.
It also references unpublished work developed by Laura, Jussi, Karon, and myself, with significant inspiration from Jessica Tracy.

The chapter on related work (2) is largely taken and amended from Voodle and CuddleBits. Sections on believability and complexity are my own.

Research was conducted under the following ethics certificates: H15-02611; H13-01620; H09-02860.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgments
Dedication
1 Introduction
  1.1 What Makes a Social Robot Believable?
  1.2 Affective Touch and Narrative Context
  1.3 CuddleBots and Bits
  1.4 Approach
  1.5 Contributions and Limitations
    1.5.1 CuddleBits: Simple Robots for Studying Affective Touch
    1.5.2 Voodle: Vocal Doodling for Improvising Robot Behaviours
    1.5.3 Complexity: Characterizing Pleasant/Unpleasant Robot Behaviours
    1.5.4 Limitations
  1.6 Thesis Organization
2 Related Work
  2.1 Companion Robots, from Complex to Simple
  2.2 Rendering Emotion Through Breathing
  2.3 Emotion Models
  2.4 Complexity and Physiological Behaviours
    2.4.1 Sample Entropy
    2.4.2 Multi-Scale Entropy
  2.5 Coupled Physical and Behaviour Design
    2.5.1 Sketching Physical Designs
    2.5.2 Creating Expressive Movement
  2.6 Voice Interaction
    2.6.1 Alignment
    2.6.2 Iconicity
  2.7 Believability
  2.8 Gestural and Emotional Touch Sensing
3 Sketching Robot Bodies
  3.1 Paper Prototyping for Robotics
  3.2 Design Systems
    3.2.1 Considerations
    3.2.2 CuddleBit Visual Design Systems
  3.3 The ComboBit
  3.4 Moving from 1-DOF to 2-DOF
  3.5 Conclusions
4 Generating Robot Behaviours
  4.1 Early tool attempts
  4.2 MacaronBit: A Keyframe Editor for Robot Motion
  4.3 Voodle: Vocal Doodling for Affective Robot Motion
    4.3.1 Pilot Study: Gathering Requirements
    4.3.2 Voodle Implementation
  4.4 Evaluating Voodle and MacaronBit
    4.4.1 Methods
    4.4.2 Results
  4.5 Conclusions
5 Evaluating Robot Behaviours
  5.1 CuddleBits Study 1: Robot Form Factor and Behaviour Display
    5.1.1 CuddleBits Study 1: Methods
    5.1.2 CuddleBits Study 1: Results
    5.1.3 CuddleBits Study 1: Takeaways
  5.2 CuddleBits Study 2: Evaluating Behaviours
    5.2.1 CuddleBits Study 2: Methods
    5.2.2 CuddleBits Study 2: Data Preprocessing
    5.2.3 CuddleBits Study 2: Data Verification
    5.2.4 CuddleBits Study 2: Analysis and Results
    5.2.5 CuddleBits Study 2: Takeaways
  5.3 Voodle: Co-design Study
    5.3.1 Voodle: Methods
    5.3.2 Voodle: Participants
    5.3.3 Voodle: Analysis
    5.3.4 Voodle: Discussion and Takeaways
    5.3.5 Voodle: Insights into Believability and Interactivity
  5.4 Complexity: Valence and Narrative Frame
    5.4.1 Complexity: Methods
    5.4.2 Complexity: Results
    5.4.3 Complexity: Discussion, Takeaways, and Future Work
  5.5 Conclusions for Complexity, Voodle, and CuddleBits Behaviour Evaluation
6 Conclusions and Ongoing Work
  6.1 Interactivity
  6.2 Developing a Therapeutic Robot
  6.3 Can We Display Valenced Behaviours?
  6.4 Lessons for Designing Emotional Robots
Bibliography
A Supporting Materials

List of Tables

Table 4.1 Pilot Study: Linguistic features that participants felt corresponded best with robot position in the imitation task. “+” and “-” indicate feature presence or absence. The comparative Study 1 went on to use pitch as a primary design element.
Table 4.2 Affect Grid quadrants of PANAS emotion words. † represents words used in CuddleBits Behaviour Generation Study; ‡ represents words used in CuddleBits Study 1 (see Evaluating); * represents words used in CuddleBits Study 2 (see Evaluating).
Table 5.1 Summary of Voodle Co-Design Themes. We refer to each theme by abbreviation and session number (e.g., “PL1”).
Table 5.2 The correlation between summary statistics of valence ratings (columns) and complexity measures (rows). For example, the top left cell is the correlation of the mean valence rating per behaviour and the variance of the behaviour signal.

List of Figures

Figure 1.1 Examples of social robots. The Nao (left) is a humanoid robot, Google’s self-driving car (centre) acts in human social spaces, and Universal Robots’ UR-10 (right) is a collaborative pick-and-place robot.
Figure 2.1 Two dimensional models: left, Russell’s circumplex with associated PANAS words; right, the discretized affect grid.
Figure 3.1 Top: the evolution of the RibBit and FlexiBit from low-fidelity paper (and plastic) prototypes to study-ready medium-fidelity prototypes. Bottom: a diagram of the actuation mechanisms of the RibBit and FlexiBit.
The RibBit looks like a rib cage, if you imagine that there are hinges along its spine. The robot expands by the ribs moving outwards as figured above. The FlexiBit looks like an orange carefully peeled by slicing into equal sections, if the orange was then taken out and the slices were reattached at the bottom and top.
Figure 3.2 Screenshot from Sagmeister and Walsh’s website. Notice how a wide range of products can be generated by creating a design system, i.e., an aesthetic, colour palette, and canonical shapes.
Figure 3.3 FlexiBit design system. Using two canonical shapes, the slice and the base, the FlexiBit’s shape and size can be quickly and easily varied.
Figure 3.4 RibBit design system. Using Adobe Illustrator, it is easy to modify the texture, shape, size, and number of the ribs, making iteration quick and easy.
Figure 3.5 SpineBit design system. The whole robot is built off of configurations of a single slice (shown), which is defined by parameterized curves. By defining key slices and interpolating between them, a new robot shape can be produced.
Figure 3.6 ComboBit design system. The robot is built off of configurations of two slices (two configurations of the same ‘rib’ slice shown left; one configuration of the ‘spine’ slice shown right), which are defined by parameterized curves. By defining key slices and interpolating between them, a new robot shape can be produced.
Figure 3.7 Example of two slices created by design tables. Notice that the defined curves are the same for both: the same number of circles and curves in the same relative placements, with the same relations (i.e., some point X is coincident with some circle K), but with different values for the same dimensions.
The values of the dimensions are set in an Excel table, which is used by Solidworks to produce different configurations of the same shapes and relations.
Figure 4.1 For behaviour generation, users had two design tools to help create behaviours: Voodle and MacaronBit. With Voodle, users could optionally sketch a robot motion using their voice. Their vocal input was imported into MacaronBit as raw position keyframes (shown as ‘pos’ above). Users could modify the waveform by manipulating other parameters, specifically randomness (shown as ‘ran’), max and min position. MacaronBit includes standard keyframe editing functions.
Figure 4.2 Voodle (vocal doodling) uses vocal performance to create believable, affective robot behaviour. Features like tone and rhythm are translated to influence a robot’s movement.
Figure 4.3 The 1-DOF CuddleBit robots used in the Voodle co-design study: (a) RibBit: A CuddleBit that looks like a set of ribs; (b) FlexiBit: A Bit whose stomach “breathes” via a servo; (c) FlappyBit: A Bit with an appendage that flaps up and down via a servo; (d) VibroBit: A Bit that vibrates via an eccentric mass motor; (e) BrightBit: A Bit whose eyes light up via an LED.
Figure 4.4 The Voodle system implementation, as it evolved during our studies. Additions for each stage are highlighted in yellow. In our final system, incoming vocal input is analyzed for amplitude and fundamental frequency. These signals are normalized between 0 and 1, then averaged, weighted by a “pitch/amp” bias parameter. Randomness is then inserted into the system, which we found increased a sense of agency. Output is smoothed either with a low-pass filter or PD control. Final output can be reversed to accommodate user narratives (i.e., robot is crunching with louder voice vs. robot is expanding) for several different CuddleBits.
Figure 5.1 Waveforms of Study 1 behaviours as designed by researchers. Each quadrant is represented by a PANAS affect word corresponding to the extremes along (valence, arousal) axes, i.e., Excited is high-arousal, positive-valence.
Figure 5.2 Mean behaviour ratings (+2 for Match; -2 for Not Match) for FlexiBit grouped by the researcher-designed behaviours (horizontal) and the emotion word against which participants rated behaviours (vertical). Researcher-designed behaviours correspond with (1) to (8) in Figure 5.1. RibBit scores were similar and omitted for space.
Figure 5.3 For each behaviour and viewing condition, a single vector was calculated by adding the vectors of the top three words that participants chose, weighted by confidence levels. Word vectors were determined at the beginning of the session, when participants rated each word in terms of arousal and valence.
Figure 5.4 Each plot shows a single behaviour’s arousal (-1,1) and valence (-1,1) ratings. Live viewing condition is in red, video in blue. Green ellipses show confidence intervals at 5% and 97.5%. Green cross is mean, purple cross is median. Each plot corresponds to a single PANAS word; each row corresponds to an affect grid quadrant. Rows, ordered from the top: Depressed, Relaxed, Excited, Stressed.
Figure 5.5 Correlation results from behaviours that were designed for an emotion label but unrated by participants (marked unrated above) were calculated on all 72 designs from CuddleBits: Participant Generated Behaviours (see Generating); correlation results from participant-ratings were calculated on the 16 behaviours from CuddleBits: Study 2 (marked by viewing condition). A strong positive correlation is shown for the position total variance in all arousal columns (unrated_a, video_a, live_a) – the higher the total variance, the higher the arousal.
Figure 5.6 Reported affect grids by participant and session. After being instructed about dimensions of arousal and valence, participants drew the robot’s expressive range directly on affect grids. Participants indicated increased expressivity from sessions 1 to 2, differences between voice and Wheel control, and that each robot had a different range.
Figure 5.7 Vision for “Embedded Voodle”: Voodle could be a natural low-cost method to emotionally color a motion path in more complex robots.
Figure A.1 RibBit assembly instructions, page 1.
Figure A.2 RibBit assembly instructions, page 2.
Figure A.3 RibBit assembly instructions, page 3.
Figure A.4 RibBit assembly instructions, page 4.
Figure A.5 RibBit assembly instructions, page 5.
Figure A.6 RibBit assembly instructions, page 6.
Figure A.7 RibBit assembly instructions, page 7.
Figure A.8 RibBit assembly instructions, page 8.
Figure A.9 RibBit assembly instructions, page 9.
Figure A.10 RibBit design system explainer, page 1.
Figure A.11 RibBit design system explainer, page 2.
Figure A.12 RibBit design system explainer, page 3.
Figure A.13 Lasercutting files for the RibBit.
Figure A.14 FlexiBit assembly instructions, page 1.
Figure A.15 FlexiBit assembly instructions, page 2.
Figure A.16 FlexiBit assembly instructions, page 3.
Figure A.17 FlexiBit assembly instructions, page 4.
Figure A.18 FlexiBit design system explainer, page 1.
Figure A.19 FlexiBit design system explainer, page 2.
Figure A.20 FlexiBit design system explainer, page 3.
Figure A.21 FlexiBit design system explainer, page 4.
Figure A.22 FlexiBit design system explainer, page 5.
Figure A.23 FlexiBit design files to be cut out.

Acknowledgments

No good work is done alone. I certainly couldn’t have done anything interesting without support from my colleagues, friends, and loved ones. It is rare that we allow ourselves the opportunities to be effusive, so I will do my best to make the most of it.

To each of my colleagues at the SPIN Lab and MUX group, I thank you for your critique, enthusiasm, and support. I should mention a few by name:

Laura, for over two and a half years of productive collaboration and friendship. It’s a rare thing to find someone who is able and willing to take one’s ideas seriously. You’ve done that and much more. I am very grateful to have worked with someone so enthusiastic, intelligent, and thoughtful.

Oliver, for consistent mentorship and enthusiastic help whenever I needed it. I learned a lot from you about research, and I’d like to say I taught you a thing or two as well. I’ve enjoyed getting to know you, and I hope we can work together again.

David, for listening to me talk for hours, encouraging me, working with me, writing with me, tolerating me, and believing in my silly ideas. It’s been a privilege and a joy to work with you and be your friend.

Jussi and Merel, for a very creative half a year and excellent intercontinental collaboration for the next couple of years.

Karon, for believing in me many times over, pushing me forwards, looking carefully and deeply at my work, and saying ‘yes’ more than you needed to. It’s been a privilege to work with you and learn from you.

This work was also supported by contributions from many people in other departments.
Jon Nakane (EngPhys), Nick Scott, Graham Entwistle and Blair Satterfield (SALA) deserve special mention.

I’m personally grateful to Tamara Munzner, Christine D’Onofrio, and Jessica Dawson, all of whom have given time and thought to helping me advance in my academic career. And to Eric Vatikiotis-Bateson, who believed in me and helped shape my perspective on science. RIP.

And, of course, I’m deeply grateful to my friends and family who have supported me along the way.

Ashley, who has been willing to listen to hours of my fool ideas for years now.

All of the Oldbyssey and Abby crews who have politely engaged me in conversation about this.

Micki, for love, and for putting me up while I finished this thing.

My mother, Dorothy, who has spent many car rides, late nights, and early mornings listening to me talk, ever since I was able to talk.

Dedication

To Micki, who has shown me more kindness, support and love than I have yet known.

Chapter 1
Introduction

The most sophisticated people I know – inside they are all children.
– Jim Henson

This work is pretty silly. I want you, the reader, to know that up front, and to know that I know that this is silly. But I have taken this silly stuff pretty seriously, and have found that keeping things playful has helped me and my colleagues tackle some big and interesting problems in affective computing and social robotics. So, dear reader, prepare yourself for a monograph that looks seriously at crafts, toys, puppetry, and funny noises, and hopefully you’ll agree that being a little silly can produce some interesting results.

A note on pronouns: “I” will refer to myself, i.e., the author of this document, Paul Bucci. “We” will refer to myself and my collaborators in the Sensory Perception and Interaction Research Group, a.k.a. SPIN Lab.

1.1 What Makes a Social Robot Believable?

Deciding what constitutes a robot can be difficult. If you go broad, any interactive system can seem like a robot. Do you include wearable electronics?
How about an interactive light installation? Does the robot need to have a body? Can you call a character in a cartoon a robot? Does the robot need to have senses, i.e., vision, hearing, touch?

Here, I’ll be discussing social robots, which are robots that need to act in a human social space. You can imagine robots that greet you at the door like the Nao¹, or that you collaborate with like many pick-and-place robots², or that drive you around like Google’s self-driving car³ (see Figure 1.1).

Figure 1.1: Examples of social robots. The Nao (left) is a humanoid robot, Google’s self-driving car (centre) acts in human social spaces, and Universal Robots’ UR-10 (right) is a collaborative pick-and-place robot.

For this work, we’ll assume that, yes, a robot does need to have a body. That leaves on-screen cartoon characters out of the robot category. We’ll further assume that a robot needs to be perceived as an autonomous agent, or an independent moving thing with a mind of its own. Interactive light installations, automatic doors, and wearable electronics are all out, too. Finally, we’ll assume a robot needs to be able to sense and react to the world, so stuffed animals, cuckoo-clocks and wind-up toys are out.

I’ll confess upfront to cheating a little. Although the robots described herein can and have been given both sensing and reactive capabilities, some of the studies we ran had robots that could not sense and could only act. But I would still be confident in calling them believable social robots. Why?

As an exercise for the reader, try ordering the examples below in terms of robot-ness, then social-ness⁴. You can decide whether you agree with my ordering.
The point is not to determine some universal order, but rather to try to tease apart what makes something feel like a social agent.

¹ https://www.ald.softbankrobotics.com/en/cool-robots/nao
² Example pictured from https://www.universal-robots.com/
³ https://waymo.com
⁴ I am intentionally using the -ness suffix over robotic and sociality to avoid connotations of repetitive, automatic movements and ability to socialize, respectively.

Try it.

• on-screen cartoon characters
• interactive light installations
• automatic doors
• wearable electronics
• stuffed animals
• cuckoo-clocks
• wind-up toys

Done? Both social-ness and robot-ness? My turn. I’ll just do three. I would order stuffed animals < automatic doors < wind-up toys in terms of robot-ness. But if I were to order in terms of social-ness, I would order automatic doors < wind-up toys < stuffed animals.

Automatic doors, although they live in a human social space, don’t interact emotionally with the humans around them. Although humans very often get emotional at automatic doors (especially if they don’t open when expected), the doors aren’t seen as malicious, just poorly designed. In other words, we don’t develop a theory of mind by attributing desires, emotions, or intents to the doors; we think of them just as machines.

Wind-up toys seem to have a little more agency. They move on their own, seem to have a purpose (even if a little single-minded), and they can evoke emotions. Automatic doors are rarely funny, but wind-up toys are often funny, surprising, and even cute. As toys, they are designed to be played with, which puts you in a frame of mind where you’re willing to pretend that the interaction is real in some way, despite your knowing that the toy is a toy. This idea of suspending your disbelief comes from theatre: it’s your ability to become immersed in the emotions and narrative on the stage despite the fact that (a) you know it’s a play; (b) the actors aren’t acting like real people; (c) the set doesn’t look like the real world.
We can become fully immersed in a pretend-world even when there is a single actor by themselves on a featureless black stage, as long as we are willing to suspend our disbelief about how unreal the play-acted scene is. In fact, sometimes adding more features to a set becomes a distraction. Adding some thing to a scene filters the possibilities for imagination.

This paradox of presence⁵ is what ranks a stuffed animal above a wind-up toy in terms of social-ness. Stuffed animals don’t move on their own, but they are also not constrained to a single kind of motion through a fixed mechanism. The deterministic aspect of the wind-up toy takes away from all of the emotions you can imagine it having. Stuffed animals, on the other hand, are free to be imbued with whatever story you want to give them. If you’re willing to play pretend and to suspend your disbelief, a stuffed animal can feel very real indeed, even if you know rationally it’s not. Watch a child play with their stuffed toys one day: it’s obvious that their pretend scenes with their toys are real enough to them to impact their emotions.

As a final thought experiment, imagine what it would take to make an automatic door seem like it had agency. It would have to occasionally act against the purpose the designer intended for it⁶. An automatic door that opens courteously is just a door that works well. But an automatic door that played games with you (tried to close on you halfway through, started to open and then slammed shut, teased you by remaining open when you were far away but closed as soon as you tried to enter it) would seem malicious. You would imbue it with (an unfortunate) theory of mind because you would be unable to predict its actions well.

The above examples illustrate three key elements for treating an object like a social robot with agency:

1. The object’s ability to behave autonomously (can it move on its own?).
2. How deterministic the behaviour is (can you predict what it will do?).
3. Your willingness to suspend your disbelief, which is dependent on the social situation of the object as well as its abilities and behaviours (how much do you want to play along?).

⁵ A soft paradox: adding things to a set seems like it shouldn’t take away from an experience; it should add to the experience. But in this case, adding some thing to the set takes away from what you can imagine. This is the basis of the more is less aphorism.
⁶ If you can think of any examples where the door both works well and seems to have agency without dipping into teasing or some other faux-malicious act, please let me know.

A believable social robot, therefore, is something you have to act with as if it were a social agent, regardless of how much you have to suspend your disbelief. This puts the burden on you, the interactor⁷, as it is your frame of mind that produces the robot as believably social more than the robot itself. The magic moment that elevates a robot from machine to agent happens in concert with the robot, but ultimately it is up to you to decide the robot has a mind of its own. As the rest of this thesis will argue, that sense of agency seems surprisingly easy to produce, but very difficult to control.

1.2 Affective Touch and Narrative Context

Humans communicate through a number of channels: speech, touch, posture, writing, drawings, to name a few. Intuitively, we could organize the way meaning is conveyed into explicit and implicit semantic paradigms. For explicit semantics, meaning is conveyed through signifiers such as words or pictures, and there is some formal logical content to the statement. For example, if I told you “your dog ran away,” you would be able to derive some verifiable truth value regarding what happened to your dog. However, implicit semantics convey meaning through context and inference.
For example, if I told you “your dog ran away” with a dejected tone of voice, you might infer that the situation was dire, that I didn’t know where your dog might be, and you just might not be getting your dog back. But if I told you the same thing with a neutral or hopeful tone of voice, you might infer that I might be following up with “...and we found it,” or “...but I have a pretty good idea of where it is,” and there might be some hope for recovery.

Touch interaction affords both implicit and explicit communication. Consider teaching someone how to dance, specifically trying to correct their posture. Imagine trying to describe in words how each body part should be oriented or which muscles should be activated. Anybody who has worked as a dance instructor or a personal trainer will tell you how difficult that task is. It is nearly impossible to give a student a series of instructions that describe how they should amend their exact posture from moment to moment. However, by placing your hands on them and pushing their body into place, you can fully express how you want their posture to be with very little effort and no words. This would be explicit instruction via touch.

⁷ Here I use interactor rather than user to emphasize the bidirectionality of the interaction and the activity of the interaction, i.e., inter-actor. This also attempts to understand the person commonly referred to as a user as a more complex and interesting human being.

Contrast this with how a lead communicates with a follow during a partner dance. Usually, there is a slight touch that might suggest the move that the lead expects the follow to make next, but the follow must use their knowledge of the dance, the part of the song that they’re in, and their knowledge of the lead’s abilities to infer the lead’s intended next move. This constructing of knowledge through inference motivated by a touch would be an implicit instruction via touch.

There is also a deep emotional dynamic communicable through touch.
You can convey affect (i.e., emotion) powerfully through touch: imagine the difference between saying to someone “I love you,” and giving them a full-bodied hug. A certain diversity, power and depth of meaning can be expressed through touch. Imagine the scenario where your partner is about to do something rash. You could explicitly tell them “don’t do that,” with a tense tone of voice, which tells them that you want them to stop. Or you could touch them softly but firmly on the arm, which conveys the same thing, but also closeness, care, and trust. Which do you think would work better?

The field of affective touch studies this interesting space of implicit communication. Although there can be explicit meaning in a touch gesture, the context of the interactors greatly impacts the content of the meaning. In linguistics, context-dependent meaning creation is studied in the subfield of pragmatics. Although I won’t be going deep into pragmatics in this work, it is important to keep in mind this interplay of context and meaning. Specifically, I consider the relationship between the robot and the interactor, how the interactor sees the robot, and the story that the interactor builds about the robot. Together, I refer to this as the narrative context, which frames the interaction with the robot as storytelling. This borrows from a long tradition in the social sciences of considering identity creation and interaction as a narrative act. The term is meant to invoke the literary connotations of a narrative, i.e., characters, themes, story arc, point of view. As will be argued later, it is important that an interactor develops a narrative context for the robot by deciding which metaphors represent the robot (Is this robot a cat? Dog? A mix?), the interactor’s role (Am I an owner? A friend? Neither?), a history of their relationship (Did we always get along?), and a theory of mind for the robot (What does the robot want or believe?).

1.3 CuddleBots and Bits

A number of years ago, a lap-sized furry robot called the Haptic Creature was developed in the SPIN Lab to study affective touch. Resembling a sort of mix between a cat and a guinea pig, the Haptic Creature was imbued with the ability to display emotionally-expressive movements through inflatable ears, an expanding ribcage to simulate breathing, and a vibration motor to simulate purring, and to detect touches through a matrix of force-sensitive resistors (FSRs). After SPIN researchers showed the Haptic Creature to have a measurable physiological and emotional impact on study participants by calming them down through breathing behaviours, the lab developed a new version of the robot called the CuddleBot. With this robot came the challenge of impacting interactors’ emotions in other directions, such as making them more excited, stressed, or depressed. The dream is to make interactions with the robot span a full spectrum of emotion, and, hopefully, make it possible to create emotional scenes with the interactors that could enable entertainment, education, or therapy.

However, as the robot was developed and used, it became clear that a traditional engineering-focused robot design process would not work for emotional expression. As such, we developed small, 1-degree-of-freedom (1-DOF) robots, called the CuddleBits (as in the diminutive of multi-DOF CuddleBots), that each explored a single expressive channel at a time. Critically, these robots enabled rapid iteration on their form and abilities, a process we refer to as physical sketching [57]. Further, we developed tools for designing the robot’s behaviours, including an improvisation tool that enabled behaviour sketching and an editing tool that enabled behaviour refining.
By taking this improvisational sketching approach, we were able to quickly iterate on the robots’ bodies and behaviours, developing both aspects simultaneously for a cohesive exploration of the design space.

1.4 Approach

Targeting believable emotional interactions as the long-term goal of the CuddleBot project, in this thesis we focus on the development of the CuddleBits, their coupled body and behaviour design, and their evaluation in terms of emotional expressivity. Using qualitative and quantitative approaches, we submit that the CuddleBits (a) are capable of consistently expressing a wide range of emotions; (b) allow interactors to develop a sense of alignment and agency under certain conditions; and (c) illustrate a viable design approach for developing robots capable of emotion expression.

For (a), we ran three studies where participants were asked to rate displayed robot behaviours in terms of pleasantness (valence) and activation (arousal). For (b), we ran a three-part co-design study over six weeks with expert performers. For (c), I designed a series of robots, wherein I developed and used a sketch-based rapid prototyping approach for emotionally-expressive robots.

1.5 Contributions and Limitations

This thesis comprises three major works, referred to as CuddleBits (Simple Robots for Studying Affective Touch), Voodle (Vocal Doodling for Improvising Robot Behaviours), and Complexity (Characterizing Pleasant/Unpleasant Robot Behaviours). The contributions of each are outlined here; since all share similar domains, limitations are summarized for all works at the end.

1.5.1 CuddleBits: Simple Robots for Studying Affective Touch

In CuddleBits, we devised a rapid prototyping approach that allowed us to design and quickly iterate on a small family of 1-DOF robot bodies (Figure 3.1). The robots were designed with sketching/modification as a primary requirement, thus allowing for iteration on the order of under a minute to less than a day.
Simultaneously, we explored design tools for robot behaviours, extending a keyframe-based editor for vibrotactile sensations (Macaron [79]) and developing a new tool for improvisational haptic sketching with vocal input (Voodle).

We developed, tested and analyzed a set of emotion behaviours. Study participants rated the emotional perception of our behaviour set for agreement with validated emotion words. We analyzed the behaviour set for design parameters and characteristic signal features (mean, median, max, min, variance, total variance, and area under the curve) to determine which aspects of robot motion correlate with arousal and valence. We further examined robot form factor and participant perception of emotion under two sets of two conditions: surface {soft and furry vs. hard and exposed}, and viewing condition {live vs. video} in a mixed experiment design.

We contribute:

• DIY designs for a 1-DOF robot with a validated range of emotional expression;
• Identification of relationships between robot behaviour control parameters and robot-expressed emotion (as consistently rated by participants);
• Demonstration of the capability and potential of a sketch-and-refine design approach whose novelty lies in facilitating joint consideration of form and behaviour for 1-DOF affective robots, and which can be extended to more complex displays.

1.5.2 Voodle: Vocal Doodling for Improvising Robot Behaviours

In Voodle, our goal was to support creation of believable robot behaviour through vocalizations. We developed voodling in stages: (i) a pilot exploration of vocal interaction with a linguistic analysis, to test the concept and gather requirements; (ii) a comparative study with 10 naïve users to situate Voodle in relation to traditional animation tools (also reported in CuddleBits); and (iii) a co-design study with three expert performers in a 6-week intensive relationship while the Voodle system was iteratively revised.
We describe Voodle’s design and implementation, then detail and reflect upon our two studies.

We contribute:

• A working Voodle system that is customizable in real-time, and extensible for further development and applications.
• Key factors underlying effective voodling in affective interaction, relating to level of user control, form, and achievable alignment and believability.

1.5.3 Complexity: Characterizing Pleasant/Unpleasant Robot Behaviours

With Complexity, we explored the relationship between the complexity of a behaviour-producing waveform and the perceived valence. Building on our previous observations of how interactors create ‘stories’ for the robot, we also inspect how narrative framing can invert how the same behaviour is seen by different people.

We contribute:

• An analysis of complexity measures and valence that shows how valence and particular complexity measures correlate.
• An analysis of narrative framing in relation to valence perception, with an emphasis on understanding how the robot is seen to have an ‘inner life’ and agency.

1.5.4 Limitations

The major strength of this work is also one of the biggest limitations: these robots are 1-DOF by choice, which limits how much they are able to express certain emotions. All claims made herein, therefore, should be understood to be relative to these very simple robots. Any extrapolations beyond 1-DOF are purely speculative.

This includes the exploration of design systems. It is not yet known how well our design systems will extend into two or more DOFs. As such, the contributions in design systems should be seen as enabling faster creative expression only within the prescribed limitations of a particular design system.
In that sense, this is more of a call to action: we espouse the design approach of treating social robot design as a primarily artistic enterprise, and apply known practices and lessons of art and design production to robot design.

Our claims made in reference to the robot’s emotion expressivity are similarly made only relative to the emotion models we used, which are simplistic and have known limitations (discussed in Related Work). Critically, the models do not account for narrative point of view, the possible non-linearity of emotional space, and how measurement tools are culturally situated. As such, it is important to think of the experiments and quantitative results discussed herein as rhetorical devices used to provide insight into social phenomena.

1.6 Thesis Organization

This thesis is organized around two major papers and one unpublished work. The content of the two papers has been reorganized to reinforce the central theme of building believable robots. Related Work outlines concepts relevant to all works discussed here, Sketching presents the physical design of the robots, Generating presents our work on producing robot behaviours, and Evaluating presents our work on evaluating robot behaviours. Current and Future Work presents work that is on-going in the SPIN Lab to create a model of interactors’ emotions, critiques current approaches, and outlines future directions for research. This work has been developed collaboratively, as noted in the preface.

Chapter 2
Related Work

To enable emotional interaction, machines must be able to both sense and saliently display emotions [31]. Here, we focus on the display of emotion through breathing-like behaviours.
We root our exploration of affective robot motion in the literature, borrowing insight from companion robots’ emotional breathing, and considering available emotion models.

We take inspiration from other places where physical and behavioural design is coupled, discuss the role of improvisation in the creative process, outline the inspiration for and implementation of our custom animation methods and tools that underlie our motion rendering process, and end with a discussion on what makes robot interactions believable.

2.1 Companion Robots, from Complex to Simple

Companion robots that once existed only in science fiction are quickly becoming part of our present reality. Paro, a cute actuated harp seal with soft fur, has been used as a therapy robot in elder care homes to help manage dementia and encourage socialization [88, 89]; its study provides evidence that even simple robot behaviour can produce therapeutic benefits.

The Haptic Creature is a furry mammal-like robot with a multi-DOF haptic emotion display [95], shown to display a variety of emotions using its multiple DOFs [96]. The physiological effect of one level of sinusoidal robot breathing was studied in a controlled trial, where users held the robot in “breathing” and “non-breathing” conditions. In the first, they experienced a statistically significant decrease in their own heart and breathing rates, along with self-reported relaxation [81].

Here we examine the larger expressive space of breathing-like behaviours alone.

2.2 Rendering Emotion Through Breathing

Breathing is a natural expression of emotion [11, 12, 64] that can enhance the recognition of certain affective states, e.g. as displayed on a 3D avatar of a human upper body (including the face) [28]. It is as yet unclear how to operationalize design and/or display parameters to distinguish between emotional states.
Boiten et al. identify two relevant categories of respiratory parameters: (1) volume and timing parameters (i.e., respiration rate and respiration volume, frequency and amplitude respectively), and (2) measures regarding the morphology of the breathing curve (i.e., changes of the breathing curve over time; irregularity) [12]. Our keyframe editor, MacaronBit, attempts to capture both sets of parameters; our improvisation tool, Voodle, manipulates these parameters implicitly.

Keyframes come from animation, where the key frame is the animation frame where some motion changes. The animator will set the value of some variable at two points in time (say, the height of a bouncing ball to 1 meter at 0:10 seconds and 0 meters at 0:12 seconds) and the computer will fill in the animation frames between the two keyframes.

For volume and timing parameters, imagine a person breathing regularly. If they breathe deeper and faster, you would assume they are more excited. For morphology of the breathing curve, imagine you draw a regular breath as a graph: this would look regular and periodic, something like a sine wave. Now, imagine drawing someone crying. The graph would look a lot more erratic, with a lot of jumps and stops randomly distributed on the curve.

In previous work on similar 1-DOF robots, people petting the small furry bodies reported that their simple periodic motions seemed analogous to breathing, suggesting aliveness [15, 18]. Their behaviours have been likened to other repetitive biological behaviours such as wing motion or heartbeats.

2.3 Emotion Models

Modelling human emotions presents many difficulties, and emotion model usage is fraught with misunderstanding and misuse [94]. This is because emotions are hard to measure. Emotions are felt privately by the person who is experiencing them, and there exists no truly objective measure without being able to completely model how someone is thinking.
Physiological measures such as skin conductance or heart rate can give you an objective measure of the effects of some set of emotions, but you still have the problem of matching the subjective emotions to the objective physical effects. As such, emotion researchers make up methods to measure emotion by using scales based on emotion words, such as “happy” or “sad,” then ask study participants to rate their feelings in terms of those words (“how happy do you feel from one to five?”). But there is a mismatch between the theories of how emotions are actually produced in the brain and these kinds of word-based models (called “categorical” models). Some theories of emotion attempt to organize all possible emotions onto abstract dimensions such as “pleasantness” or “activation” (called “dimensional” models) and then try to place emotion words along those dimensions.

Categorical emotion models attempt to arrange the full spectrum of human emotion into discrete categories, often represented by one or more emotion words (such as “excited” or “stressed”). Structurally, categorical emotion models do not account for conflicting or multiple emotion states (you can feel both “happy” and “sad” at the same time), and are not granular enough to account for the subtlety of felt experience (there are many kinds of emotions we might group under “happy”). These models are based on being able to consistently describe an emotion with words, which introduces many difficulties with individual and cultural differences.

Dimensional models attempt to present emotion experience as continuous along supposedly orthogonal dimensions such as “pleasantness,” “activation,” or “dominance.” These models have the same shortfall of being unable to represent multiple simultaneous emotion states¹, but are by definition highly granular.
However, two points can be very close dimensionally, but far semantically: in a 2-dimensional model that uses “pleasantness” and “activation”, both angry and nervous are spatially close (both are highly activated and unpleasant), but are conceptually very different². To be useful to researchers, dimensional models are often discretized by either directly partitioning the space (such as with an affect grid, see Figure 2.1) or using categorical analogues (such as with the Positive and Negative Affect Schedule, or PANAS, where words are mapped to grid points). This reintroduces the problems with categorical scales, but with an attempt to sidestep the dependence on language. The argument is, roughly, if the core dimensional model can be shown to be generalizable, a discretized version of that model might represent a more valid construct.

Figure 2.1: Two dimensional models: left, Russell’s circumplex with associated PANAS words; right, the discretized affect grid.

¹ Unless a measurement is treated as probabilistic: for example, you could imagine using a dimensional model as a probability density function, where you distribute weight to every point on the emotion space. However, this is not how they are typically used.

Each person’s subjective frame is variable enough that, without heavy calibration, qualification, or verification, an objective measure can over-represent the referent subjective state as true. That is not to say that any attempt at quantifying subjective states is doomed to failure, but that the use of a quantified scale needs to be understood as fundamentally rhetorical and relationally constructed.

We use a conventional two-dimensional emotion model of valence (pleasantness) and arousal (activation) [71].
This model is often discretized by dividing the 2D plane into a grid [72]; the Positive and Negative Affect Schedule (PANAS) [92] is a set of words with validated mappings to divisions of the resulting affect grid [30]. We use a subset of PANAS words to refer to the grid’s extreme corners: Excited (positive valence, high arousal), Relaxed (positive valence, low arousal), Depressed (negative valence, low arousal), and Stressed (negative valence, high arousal), as well as others (a total of 20 words) that fall into each quadrant. This model choice is motivated by dimensionality, with future behaviour generation on continuums of valence and arousal in mind. However, we also take care to calibrate word usage, verify the representation of emotion states by checking in with participants, and account for multiple possibly-conflicting emotion states through using multiple and complex ratings.

² 3-dimensional models that add “dominance” claim that this apparent closeness is an artifact of projecting into 2-dimensional space, but there are counterexamples even in a 3-dimensional space.

2.4 Complexity and Physiological Behaviours

In biological, computational, and artistic signal processing, there is a need to differentiate complexity from variability³ [44, 50]. To develop an intuition for this, consider how randomness is perceived. Imagine a T.V. screen displaying pure static, i.e., white noise. If you were to split that screen up into a 10 by 10 grid of boxes, you could easily switch two boxes and perceive no noticeable difference in the picture. Static looks the same everywhere. White noise is uniformly random, which means that at every point on the screen, it is equally likely that you will see any of the possible values a point could have.

By contrast, imagine a black and white splatter painting such as the ones that Jackson Pollock painted. Again, split the splatter painting into a 10 by 10 grid of boxes.
Now, if you switch two boxes, it will be really obvious that you’ve made a switch, even standing very far away and cutting/pasting the switched boxes perfectly. Certain paint splotches will look discontinuous, and you could very easily see the sharp lines along which you made the cut.

Both are random, in some sense of the word. But the static is more random in the mathematical sense, and the splotchy painting is more random in the perceptual sense. In a way, the uniformly distributed white noise has no structure to it. You can’t see any patterns except total randomness. Therefore, we would call the splotchy painting more perceptually complex than the screen full of static [1].

If you still need convincing, think about mirroring the image of the static vs. mirroring the image of the splotchy painting. Imagine you are shown a picture of the original static screen and splotchy painting, then told to close your eyes while the pictures are randomly swapped out with either the same image or a mirror image. If then asked whether the static screen was mirrored, could you tell? Probably not. But you could easily tell whether the splotchy painting was mirrored.

³ No relation to computational complexity.

The “switch two boxes from a grid” approach is called self-similarity. The “guess if the new image is a mirror image” approach is called (time-)irreversibility, i.e., to what extent you can reverse a signal and have it still look the same. Both are measures of complexity: we would say that a more complex signal is less self-similar and less time-reversible [26]. Often, signals that feel random are not uniformly random (static, white noise, etc.), they are more complex (paint splotches, Perlin noise, Brownian motion, etc.).

Now, consider some different 1-D time series: a sine wave, white noise, and 1-D Brownian motion. By many measures of variability, the white noise is the most variable, i.e., it has a uniform spectral power distribution, highest Shannon entropy, and infinite variance⁴.
However, white noise is also highly self-similar, time-reversible and uniformly random. Intuitively, you should be able to take a slice of white noise and replace it with another slice without perceiving a difference in the signal. In contrast, mixing slices of 1-D Brownian motion would create very different-seeming signals.

A measure that accounts for self-similarity at different time scales is multi-scale entropy (MSE, explained here intuitively, below algorithmically) [25]. It is used in the biological sciences as a complexity measure, e.g. to determine how irregular a heart beat is. MSE is defined by a process of taking the sample entropy [68] of a signal at different time scales. Sample entropy is a measure of how self-similar a signal is at a single time scale. It is calculated by splitting the signal up into vectors of length k, counting the number of vector pairs that are within some distance r, repeating the count for vectors of length k+1, then taking the negative log of the ratio of the second count to the first.

Here, we develop a working hypothesis that valence and complexity are negatively correlated for breathing behaviours, i.e., the more complex a breathing behaviour is, the lower valence it is. The intuition comes from our own observations, as well as Bloch [11] and De Melo [28], where breathing behaviours described as irregular were also deemed to be lower valence.

⁴ Over an infinite signal.

2.4.1 Sample Entropy

Sample Entropy is a measure of self-similarity for some signal, S. The algorithm is as follows:

1. Break S into a list of k-sized vectors, X.
2. Given any distance function, d, and some distance threshold, r, count the number of pairs of distinct vectors such that d(x_i, x_j) < r. Call this count N.
3. Now split S into a list of (k+1)-sized vectors, Y, and repeat (2) for Y. Call this count M.
4. Calculate −log(M/N). Since longer vectors match no more often than shorter ones (M ≤ N), this value is non-negative.

2.4.2 Multi-Scale Entropy

Multi-scale Entropy improves on Sample Entropy by looking at different coarse-grained time scales. The algorithm is as follows:

1. Split a signal S into k parts.
2. Average each part to produce a new vector, A, where A will be of length k.
3. Determine the Sample Entropy of the new signal, A.
4. Repeat for all lengths k from 1 to n, where n is the length of the original signal, S.

2.5 Coupled Physical and Behaviour Design

The way that a robot is shaped changes the way that it can move. Emotional behaviours are intrinsically bound to the bodies on which they are expressed. Being able to flexibly design and realize robot bodies is imperative to quickly and creatively explore the possible design space for a new robot. As such, affective robot design benefits strongly from the flexible, rapid-turnaround physical prototyping methods being popularized by DIY and maker culture.

Our sketch-and-refine process draws on low-fidelity haptic sketching, traditional sculpture and paper craft, rapid prototyping, and animation methods. Our innovation is a process that integrates existing sketching methods by preparing them in a form amenable to quick combination and refinement.

2.5.1 Sketching Physical Designs

As a design approach, sketching means quickly generating many versions of an imagined final product. For some engineering problems, detailed specifications can be generated without sketching. In contrast, establishing requirements through fast iteration (design-by-prototyping) [48, 52] is typical for those who aim to generate emotional reactions, e.g., industrial designers, set designers and sculptors, who often develop a larger design by making and getting reactions to multiple simultaneous maquettes (small physical models).

Our physical design methods are inspired by Moussette’s haptic sketching [55, 56], with complex haptic expressions mocked up with low-cost physical media, and by Hoffman and Ju [40], where robots were designed through drawn sketches, prototyping actuated skeletons, and 3D modeling.

We take “robot sketching” further.
Concerned with hapticness as well as visual emotiveness, we prioritize rapid access to user responses to tangible, physical media. Therefore, skipping 3D modelling, we directly implement our robots in low-cost sketch media (wood, plastic) so they are both easy to iterate on and immediately study-ready. We thus design both for and through sketching: by making on-the-fly modification a primary design requirement, haptic (touchable) robot prototypes can be iterated on within hours or even minutes (Figure 3.1).

In keeping with Maker ethos, we also target open hardware design, inspired by projects like WoodenHaptics [32].

2.5.2 Creating Expressive Movement

Designing expressive behaviours can be challenging, requiring animation, behaviour and robot expertise as well as diverse tools [36]. Conventional robot movement is produced by an algorithm that acts on a model to define an exact path towards a goal, optimizing efficiency or safety [21].

Affective robot control differs from typical robot motion planning, with a goal of communicative rather than functional movement. Affective robot behaviour design consequently draws heavily on puppeting (3D) and animation [67]. Our minimal sketch-and-refine approach stands in contrast to those for higher-DOF affective robots such as Probo [75], where focus on facial expression for emotional display necessitates extensive 3D simulation software to coordinate actuation.

Animators tout the benefits of sketching to develop believable motion parameters where a sketch’s subordinated detail presents opportunities to zoom in, problem-solve, then back out, a methodology not lost on haptic designers [58]. Recently developed haptic design tools include Macaron [79], which borrows the animation tenet of keyframes, directly manipulating the vibrotactile sensation to match notable events (or frames) in an analogous animation.

Alternatively, an animator can define a model’s movement, e.g., with keyframing.
Both techniques can impart expressive or biological-appearing qualities, algorithmically (perhaps with limited quality), or manually (laboriously and with skill). The robot can be triggered to follow the path (pre-computed or generated on-the-fly) by a pre-defined command with a deterministic outcome.

Expressive motion can derive from other sources. “Programming by demonstration” records manually actuated robot motion [21]; actor input can be employed in this way. Hoffman and Ju suggest an iterative approach that integrates robot physical design with 3D modelling for performative robots [39]; Croft and Moon mimic human hesitation behaviours on a robot [54]. Takayama applied traditional animation techniques (such as easing in/out) and tested user perceptions of robot behaviour in a video-based simulated environment [87]. Some generative techniques for affect exist as well: adding Perlin noise to robot poses can increase user recognition rate of displayed emotion [8].

In Voodle, we responded to individuals’ natural vocal expressions by providing a novel, direct input mechanism, effectively producing commands that modify a pre-determined motion.

2.6 Voice Interaction

2.6.1 Alignment

Alignment happens when behaviours become synchronized in some way. Think about when you are walking with someone, and your strides start to match up. Or when you accidentally start to mimic someone with an accent. This can often be embarrassing, but don’t worry: matching speaking patterns is a natural way that humans build trust and ease communication.

This fundamental component of natural human communication is called alignment, which occurs when people mimic one another’s communicative patterns [33]. Phonetic convergence refers specifically to alignment of speakers’ phonetic patterning [59], and other studies show that people similarly coordinate speech rhythm, body language and breathing pattern [46, 53, 90]. Similarly, mimicry positively impacts affiliation and likability [46].
Alignment extends to human-computer conversations: people adjust language to their expectations of how a system works [60].

A believable human-robot conversation must likewise see the robot align its communication style, at some level, to the human partner’s. Previous work with virtual avatars exploited such linguistic and physical alignment behaviour for more naturalistic virtual conversation agents [5, 35, 49]; Hoffman has explored human-robot alignment by utilizing computer vision techniques within performative contexts [38]. Here, we use iconic features of the speech signal to achieve the illusion of alignment, making for more believable interaction.

2.6.2 Iconicity

Speech meaning comes from the semantics of words and phrases, utterance context, the sounds used to construct the words, prosody (tone and rhythm), and accompanying gestures. In the Saussurean tradition, linguistic meaning is an arbitrary relationship between the signifier (a sound pattern) and the signified (a concept) [29]. In this interpretation (symbolic speech), signifier form has little relation to its meaning. For example, the English word “cat” and its Japanese equivalent (“neko”) sound very different, suggesting that the mapping between ‘cat’ the sound and cat the concept is arbitrary.

The notion of iconicity in language is when the relationship between the form of a word and its meaning is non-arbitrary [61–63]: the word sounds like the thing it represents. For example, the English word for a cat meowing (“meow”) sounds very similar to the Mandarin word for a cat meowing (“miāo”). Iconic vocalizations are also commonly used to express psychological states (“ugh”), or physical phenomena like motion (“zoom”) [62].

Iconic vocalizations carry emotional content.
Banse and Scherer found that iconic voicing excels in communicating psychological phenomena such as emotional states [6]; Rummer et al. demonstrated a relationship between positive emotions and /i/ (the ‘ee’ sound in ‘coffee’), and between negative emotions and /o/ (approximately an ‘uhh’ sound) [70].

Iconic vocalizations are effective for describing physical phenomena and motion. With physical tools including haptic interfaces, users often opt to use iconic vocalizations to describe tactile sensations [77, 91], and to ground and communicate design intention [4, 14]. Individuals link vocalization features to motion patterns with some consistency. Shintel et al. saw speakers using high- and low-pitched vocalizations to describe up and down motion respectively. Syllable rate is also a major indicator of visual speed [84]. Voodle uses a similar cross-modal mapping between iconic speech and motion, with upward pitch mapping to upward motion, and time-varying vocal amplitude as a proxy for syllable rate.

Iconicity is an alternative or complementary input mechanism to speech recognition. Previous efforts use sound input to control an interface [34, 42], to enhance accessibility of computer systems [10], or as intuitive input for artistic expression [22, 41]. Voice Augmented Manipulation augments users’ touch input with voice, e.g., as a modal modifier key [74]. Iconic vocalizations have been explicitly modeled for robots: Breazeal and Aryananda used prosodic speech features to recognize affective intent, e.g., praise, prohibition, and soothing [13].

In Voodle, we convert vocal features to affective motion rather than categorizing speech. By utilizing speech form as a basis for controlling robot movement, Voodle can display emotional behaviour without explicit symbolic representation of emotional states. This approach is computationally inexpensive.

2.7 Believability

Believability is a core concept in human-robot interaction design; it is often used, variably defined, and poorly understood.
There is a sense in which everyone ‘knows’ what it means to be believable, but cannot satisfactorily define it. Rose et al. [69] attempt to create a quantifiable framework wherein the different senses of believable are expressed and interrelated in predicate logic. Roughly, the senses they propose are:

sense 1: A person believes that a robot is capable of an action in a given environment.
sense 2: A robot invokes an involuntary or pre-cognitive reaction in a person similar to that which a non-robot might invoke.
sense 3: A person recognizes the action that the robot performs.
sense 4: A person ascribes mental states to the robot.

Sense 4 is the strongest sense of believability (wherein a person creates a theory of mind of the robot) and Rose et al. argue that it logically entails the other senses of the word. This partially works: you necessarily have to believe a robot is capable of an action to recognize that action (sense 3 → sense 1) and you have to recognize an action to ascribe a mental state to the robot (sense 4 → sense 3). But (sense 4 → sense 2) is much weaker: you may not need to be involuntarily affected by a robot to ascribe it a mental state.⁵

⁵ Rose et al. recognize this weakness and challenge the reader to find a counterexample. However, it seems that this relationship cannot be directly proved, as sense 2 needs to be determined empirically and by analogy, i.e., you would have to test that there is some involuntary reaction produced by the robot that is not significantly different from the reaction produced by some other non-robot entity. Further, the connection to sense 4 would have to be empirically shown, which, at best, would be correlational. This is not to say that sense 2 has no use, but it shows that this framework is less logical than heuristic.

This framework is more useful as a set of heuristics than the hoped-for baseline for quantifying believability.
Sense 4 is the closest to the way in which we use believability here.

The notion of believability we use is much more in line with Bates’ definition in The Role of Emotion in Believable Agents [7]:

“[A believable character] does not mean an honest or reliable character, but one that provides the illusion of life and thus permits the audience’s suspension of disbelief.” (Bates 1994)

Key to our understanding of believability is the connection to narrative frame and alignment. Our understanding of narrative is situated within the constructivist epistemological tradition, which posits that the way in which we construct knowledge about the world is both subjective and negotiated with the external world. Critically, this is an active and relational understanding of knowledge; we extend this to the notion of believability by saying that an interactor must actively produce the sense of believability in conjunction with the robot. By suspending their disbelief, an interactor creates a narrative for and with the robot, despite their full awareness that the robot is a machine. Paradoxically, this notion of believability has very little to do with an actual belief that the robot is alive; instead, to evaluate how believable the robot is, we would ask whether the narrative is maintained through continued action by both the robot and the interactor that is consistent with that narrative. The question shifts from “do you believe that this robot is alive?” to “are you willing to act as if this robot is alive?”

Connecting this to the concept of alignment and ascribing mental states to the robot, we could say that an interactor has developed a theory of mind for the robot if they are aligned with the robot and if they behave as if the robot has internal thoughts, feelings, or motivations.
Again, this can be parallel to their belief that the robot is a machine; rather than being a state of cognitive dissonance, when aligned, the interactor acts with the robot to produce a shared narrative context.

In this thesis, we explore how this sense of believability is created, maintained, and lost. We further explore how a robot can support believable interactions using computational techniques such as machine learning, signal processing, and custom software design tools. We analyze the features of the signals that produce robot behaviours and correlate with participant-evaluated emotional interpretations of those behaviours. We posit that believable and emotional robot behaviours can be better understood in terms of complexity measures such as MSE.

2.8 Gestural and Emotional Touch Sensing

A critical component of touch interaction is sensing as well as display. Although this thesis is focused on the display of emotional robot behaviours, I have been heavily involved in efforts to sense human emotion through touch gestures by (1) designing, training, and analyzing machine learning models; (2) designing, constructing, and testing custom fabric touch sensors; (3) developing experimental paradigms that support valid emotional interaction; (4) prototyping a simple interactive system to demonstrate mapping from recognized gestures to designed behaviours. This work is reported on here because the machine learning approach to signal processing heavily informed our work on behaviour display, including both generation and analysis.

A first assumption in detecting human emotion through touch interaction is that being able to classify social touch gestures would be beneficial. The intuition is that if you can detect a pat, a happy pat might be easier to differentiate from an angry pat. Using a custom-built touch sensor, in Cang et al. [19]⁶, we explored the detection of social touch gestures using machine learning techniques.
Participants touched the CuddleBot as prompted by gesture words (pat, stroke, etc.) while touch data was recorded (pressure and location). Using Weka (a common machine learning suite⁷), gestures were detected by breaking the touch data stream into windows and calculating statistical features on the pressure and location data (such as maximum pressure, variance of movement in the X-direction, etc.). We found that we were able to classify touch gestures with over 80 per cent accuracy, depending on the condition. Gesture classification was performed with a random forest using 20-fold cross-validation.

⁶ On which I am a second author, having contributed to study execution, writing, and analysis.
⁷ http://www.cs.waikato.ac.nz/ml/weka/
⁸ On which I am also a second author, having contributed to study design, execution, analysis, and writing.

Given our success in touch gesture recognition, we decided to apply the same machine learning techniques to emotional conditions (manuscript in preparation⁸, also reported on in Cang 2016 [17]). Participants were asked to tell an emotional story while touching the robot. We collected touch, biometric, and eye-tracking data. Here, our success was varied. Given system knowledge of an individual (i.e., using 20-fold cross-validation for model validation), we were able to classify emotional conditions with accuracies between 70 per cent (touch only) and 99 per cent (integrating touch, biometric, and eye-tracking data). However, given no system knowledge of an individual (i.e., leave-one-out), we were unable to perform better than chance.
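The windowing step and the contrast between the two validation regimes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Weka pipeline used in the studies: the window length, the `(pressure, x, y)` sample layout, and the function names are hypothetical, and "leave-one-out" is read here as holding out all data from one participant at a time.

```python
from statistics import pvariance


def window_features(samples, window_size):
    """Cut a touch stream into non-overlapping windows and summarize each.

    `samples` is a list of (pressure, x, y) tuples (an assumed layout).
    A trailing partial window is dropped.
    """
    features = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        pressures = [p for p, _, _ in window]
        xs = [x for _, x, _ in window]
        features.append({
            "max_pressure": max(pressures),
            "mean_pressure": sum(pressures) / len(pressures),
            "x_variance": pvariance(xs),  # spread of movement along X
        })
    return features


def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_indices, test_indices) per participant.

    Unlike k-fold splits over pooled data, the model never sees any
    window from the held-out person during training.
    """
    for held_out in sorted(set(participant_ids)):
        train = [i for i, p in enumerate(participant_ids) if p != held_out]
        test = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train, test
```

With pooled k-fold validation, windows from the same person usually land in both the training and test folds, which is one plausible reading of "system knowledge of an individual"; the participant-wise split removes that overlap.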
This led us to the conclusion that, if it were possible to detect emotional touches, we would have to create an individualized experimental paradigm and machine learning model.

Ongoing work in the SPIN Lab is focused on refining our experimental paradigm and machine learning/signal processing approach to emotion classification.

Chapter 3

Sketching Robot Bodies

The CuddleBits stem from an attempt to answer a design question about simplicity: how expressive can you be with only one degree-of-freedom (DOF)? Given the complexity of rendering emotion on the CuddleBot, the CuddleBits allowed an approach wherein we could decompose the multi-DOF robot into many 1-DOF robots. By studying each expressive DOF independently, we can gain a deeper understanding of how emotional expression works in that DOF, then eventually recompose the DOFs into new multi-DOF robots. For this work, we look primarily at 1-DOF breathing behaviours, explore 1-DOF spine motion, and give an example of how 1-DOF spine and breathing can be recomposed into a 2-DOF robot. To enable rapid iteration, we developed (1) a low-fidelity prototyping method (similar to the design methods one would use to create a puppet); and (2) a design method using design systems to enable higher-fidelity iteration. The four CuddleBit design systems that we developed are outlined below. The two major design systems that were studied here are the RibBit and the FlexiBit (see Figure 3.1). The RibBit looks like a rib cage, if you imagine that there are hinges along its spine. The robot expands by the ribs moving outwards. The FlexiBit looks like an orange carefully peeled by slicing into equal sections, if the orange was then taken out and the slices were reattached at the top and bottom. The FlexiBit contracts by pulling the top and bottom together, as if you were squishing the orange peel.

Figure 3.1: Top: the evolution of the RibBit and FlexiBit from low-fidelity paper (and plastic) prototypes to study-ready medium-fidelity prototypes.
Bottom: a diagram of the actuation mechanisms of the RibBit and FlexiBit. The RibBit looks like a rib cage, if you imagine that there are hinges along its spine. The robot expands by the ribs moving outwards as shown above. The FlexiBit looks like an orange carefully peeled by slicing into equal sections, if the orange was then taken out and the slices were reattached at the bottom and top.

3.1 Paper Prototyping for Robotics

Paper prototyping is used widely in HCI, where low-fidelity prototypes are made out of simple, often physical media such as paper to rapidly explore and evaluate the design space of an interface. This is similar to a sculptor or designer’s use of maquettes, where fast, small models of a larger or more complex project are made as studies to begin to sharpen the design intuition about a piece. For example, in theatrical set design, after the initial sketches are made of a potential set, a small scale model is often created to work out design problems on cheap, easily-modifiable media. With a physical model, members of the theatrical production team are able to evaluate the design with more confidence than if the designs had been just on paper, since you can use your spatial intuition to reason about the placement of objects, lighting, etc.

Figure 3.2: Screenshot from Sagmeister and Walsh’s website. Notice how a wide number of products can be generated by creating a design system, i.e., an aesthetic, colour palette, and canonical shapes.

Each CuddleBit design series began with many small, rapidly-produced low-fidelity prototypes, often implemented in paper and glue.
This allowed for a rapid exploration of actuation mechanisms and form factors with very little cost in terms of time or materials: each prototype cost cents, and could be designed, cut out, and tested in less than a few hours.

The approach of making many incremental designs in parallel is known as design by prototyping, and confers the major advantage of having finished prototypes at the end of each design cycle. For the CuddleBits, this meant that they could be used in research very early on in the design process compared to if they had been developed with a traditional engineering design approach, where requirements are meticulously defined ahead of time. In this case, the primary requirement was emotional expressivity; all other requirements were secondary.

3.2 Design Systems

Once the low-fidelity prototyping phase starts to produce converging designs, a higher-level design approach naturally emerges. This happens when commonalities can be identified between designs, including canonical shapes, parts, and actuation methods. Similar to cross-cutting approaches in programming, faster and more powerful designs are enabled when parts can be abstracted and reused. This is the idea behind a design system, where a single design effort can produce many finished products.

A design system is best explained with an analogy to brand identities. We’ll take the famous New York design studio Sagmeister and Walsh as an example. They were recently commissioned to create a brand identity for the Portuguese energy company EDP. Sagmeister and Walsh developed a design system based around a common font, colour, core group of shapes, and visual aesthetic. Rather than designing a single one-off logo, business card, or website, the design system could produce many new designs in a variety of visual media with very little effort per new design.
By doing the design work upfront, a designer tasked with creating a new product would not have to waste time choosing colors, fonts, or even shapes again, but could build off of the many examples and templates already produced for them.

The CuddleBit design systems take this same approach to robot design. Once the fundamental actuation principles, body parts, and assembly methods were worked out, new designs that varied in size and shape could be created with relatively little effort. New explorations of expressive capacity could be developed by creating new prototypes in less than a day. Putting the design effort up front allows for fast, dynamic explorations of the design space with low cost.¹

¹ Full explanation of the RibBit and FlexiBit design systems (including assembly instructions) available in the Appendix.

3.2.1 Considerations

Robust compliance and handling affordances: Our prototypes had to withstand the pressure of a human hand, but we found that the structure’s basic pliability and apparent fragility also directed its handler’s approach: pliability afforded rough play (squishing, hitting, throwing) while rigidity incited gentleness (holding, cupping, stroking).

Touch: Prototypes needed to convey some kind of biological behaviour; thus, the materials needed to afford playfulness and liveliness.

Believability: We drew on caricature and internal consistency for believability; we took cues from the natural world but situated the CuddleBits in their own genre rather than mimicking a real animal. RibBit’s inner mechanics are evident, allowing natural material affordances but cueing user expectations; FlexiBit’s soft fur elicited stroking and its limbless form suggested no locomotor ability.

Backend: The Bits are powered and controlled by an Arduino Uno, from a NodeJS server (nodejs.org/en) using the Johnny-Five JS robotics framework (johnny-five.io). JavaScript construction facilitates connection to front-end applications and widely-available web frameworks.
Using a single language for front- and back-end facilitates seamless development; using web technologies allows for transparency in widget design (i.e., using the browser ‘inspect’ tool makes for faster debugging than with Java UI packages). As such, iterating on control mechanisms was also fast.

Extendable and Open design: The CuddleBits were designed to be easily extended with little effort or expertise. Full CuddleBit source documents are shared: pattern files, code and a manual for assembly and extension.²

² www.cs.ubc.ca/labs/spin/cuddlebits: pattern files, code and a DIY manual for assembly and extension.

3.2.2 CuddleBit Visual Design Systems

Modifiable design systems make varying robot shape quick, from under 2 hours up to 2 days. For each family, we produced many models (Figure 3.1) and chose the most visually and haptically salient to evaluate. Each pattern includes both the atomic units of construction and the narrative sense conveyed by the aesthetic and material presentation, detailed below.

The FlexiBit: Like a simple sewn sphere, Flexi’s ribs are plastic slices fixed to a base like petals of a flower. These are generated by adjusting a stencil, printing and cutting the pattern from plastic sheets with a knife, and joining them with machine screws. Slices are scalable for smaller or larger Bits; to adjust shape, only some slices and/or the base are varied. Plastic flexibility, volume, and curvature provide passive compliance and natural feel under a faux fur cover. It is often compared to a Tribble (a fuzzy alien species from the Star Trek universe): the plastic frame evokes a compliant torso.

Figure 3.3: FlexiBit design system. Using two canonical shapes, the slice and the base, the FlexiBit’s shape and size can be quickly and easily varied.

The RibBit: A wooden ribcage on a stand, its rigid actuation gets compliance from internal springs. In counterpoint to Flexi, mechanics are fully exposed, with no attempt at material realism. It comes alive only with suggestive movement. Each rib is laser cut from an easily modifiable digital pattern, and assembled by wood-gluing parts together, with BBQ skewers as pins and rods. Further versions of the RibBit include fur and ridges that form an inflexible ‘spine’; these explorations were prototyped in low cost materials (like paper) first.

RibBit has a naturalistic skeletal aesthetic due to its wooden construction. Fitting comfortably in the contours of a hand, the structural rigidity provides notable haptic feedback even when covered with fur.

Figure 3.4: RibBit design system. Using Adobe Illustrator, it is easy to modify the texture, shape, size, and number of the ribs, making iteration quick and easy.

The SpineBit: Made from many ‘slices’ of wood strung together by elastic bands and separated by small spacers, the SpineBit passively curls around objects and body parts to give the effect of ‘hugging’. Similar to the RibBit, the SpineBit has a hard and skeletal aesthetic, but the use of elastics gives the shape pliability.

Figure 3.5: SpineBit design system. The whole robot is built off of configurations of a single slice (shown) which is defined by parameterized curves. By defining key slices and interpolating between them, a new robot shape can be produced.

3.3 The ComboBit

The ComboBit combines the spine actuation of the SpineBit with the flexible breathing concept of the FlexiBit. This robot is the first attempt at extending our prototyping process from one- to two-DOF, and is a work in progress.

Before settling on the FlexiBit as the principal ’Bit to integrate with the SpineBit, a number of RibBit-and-SpineBit configurations were attempted. The limiting factor was the mechanical dependence of the spine and the ribs. To support bending, different ribs need to be actuated at different rates, making the coupled RibBit rib cage design insufficient. An attempt was made to create individually-actuated ribs. However, the mechanical complexity created high friction and high chance for mechanical failure. Instead, a FlexiBit-inspired design, where the elasticity of the body material allows for both compression and expansion, affords actuation in multiple directions.

Figure 3.6: ComboBit design system. The robot is built off of configurations of two slices (two configurations of the same ‘rib’ slice shown left; one configuration of the ‘spine’ slice shown right) which are defined by parameterized curves. By defining key slices and interpolating between them, a new robot shape can be produced.

Although the ComboBit is a blend of previously-designed robots, we produced low-fidelity prototypes at all points during the design process. Hand-modifiable materials such as paper allowed designs to be tested quickly and cheaply on the fly. Rather than being a step backwards, returning to low-fidelity prototypes after having created higher-fidelity prototypes focused design efforts on new design ideas.

Both the ComboBit and the SpineBit are designed using a SolidWorks design table, which allows for designing with equations, relations, and variables. Since SolidWorks is a constraint-based parametric design suite, the slices that make up the ’Bits are designed with a canonical set of shapes and relationships where dimensions are variable and controlled via the design table (see Figure 3.7). By determining the dimensions of ‘key’ slices, dimensions to vary, and interpolation type (linear, polynomial, spline, etc.), it is possible to fill in configurations between the key slices to create a smooth, continuous body. Effectively, this allows for rapid iteration in the shape and size of the robot as long as the actuation mechanisms are kept unchanged.

Figure 3.7: Example of two slices created by design tables. Notice that the defined curves are the same for both: the same number of circles and curves in the same relative placements, with the same relations (i.e., some point X is coincident with some circle K) with different values for the same dimensions. The values of the dimensions are set in an Excel table, which is used by SolidWorks to produce different configurations of the same shapes and relations.

3.4 Moving from 1-DOF to 2-DOF

Work is ongoing to fully integrate spine and breathing motion into a 2-DOF robot with the ComboBit. Lessons learned from the design process with the ComboBit are instructive for determining the extent to which our chosen deconstructive design process will be successful. Here summarized are insights and questions from the early attempts at the ComboBit.

Starting from scratch: Since the designer has already spent a lot of time becoming familiar with the materials, actuation principles, and shapes of each design system during previous iterations, they do not start from ‘scratch’ each time. Even early low-fidelity prototypes start sophisticated. This comes at the cost of fully exploring the design space. Since the designer has design solutions in mind already, it is less likely that they will diversify their approach. To overcome this, it is necessary to take alternative approaches to determine viability. The extent to which the designer takes the project of multiple designs seriously is what determines the extent to which they will have to start again from nothing.

Tacked-on vs. fully integrated: Certain DOFs are more difficult to ‘tack on’ than others. For example, one can imagine fairly easily adding a rumble motor to any CuddleBit design with relatively little design work. Similarly, with the ComboBit, one could imagine a design solution where the parts of the robot that bend (spine) and breathe are not directly connected. However, the ComboBit attempts a ‘full’ integration where the breathing and spine part are mechanically dependent.
This greatly increases the design time needed, but also increases the extent to which the robot ‘feels real’, as the mechanically dependent parts move together (e.g., like a spine and rib cage).

Dependence on prototyping materials and processes: Since the robots are being designed iteratively with the goal of having working prototypes finished at the end of each stage, the designs are highly dependent on the prototyping material and processes used. For example, since lasercutting and 3D printing machines are readily available, the robots have been designed to leverage those technologies and use materials that they support. This has the advantage of being highly accessible, i.e., anyone with access to rapid prototyping machines can easily create their own CuddleBits. However, as the robots become more complex, this design approach limits the integration of more sophisticated materials and mechanisms. Neither is better; the success of the project is dependent on how the robots are intended to be used.

Committing to the ease of future iteration: With more moving parts, designing for future iteration is more difficult with 2-DOF relative to 1-DOF. However, it is still possible with some high-level planning of the dimensions you wish to control. With a SolidWorks design table, you are able to choose the dimensions over which you vary: once the base shapes and relations are defined, any variation that directly scales any of the dimensions is just a matter of inputting new numbers. Even with a design table, this ability to vary takes design time, since relations and shapes need to be tweaked to keep the model well defined.

Iterations are relative to the design system: The new robot designs that are enabled by the design systems are necessarily constrained by the design systems as well. The space of iteration is somewhat predefined: the dimensions along which iteration is to be performed have to be explicitly designed for.
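To make "designing for a dimension" concrete, the sketch below fills in slice dimensions between a few key slices by linear interpolation, one of the interpolation types mentioned for the design tables. It is only an illustration of the idea and not the SolidWorks workflow itself: the function name `interpolate_slices` and the dimension names `width` and `height` are hypothetical.

```python
def interpolate_slices(key_slices, n_slices):
    """Linearly interpolate slice dimensions between key slices.

    `key_slices` maps a slice index to a dict of named dimensions,
    e.g. {0: {"width": 10.0}, 4: {"width": 20.0}}. The first and last
    slice must be key slices so that every slice lies between two keys.
    """
    keys = sorted(key_slices.items())
    if len(keys) < 2:
        raise ValueError("need at least two key slices")
    if keys[0][0] != 0 or keys[-1][0] != n_slices - 1:
        raise ValueError("key slices must pin the first and last slice")

    body = []
    for i in range(n_slices):
        # Find the pair of neighbouring key slices that brackets slice i.
        for (lo_i, lo), (hi_i, hi) in zip(keys, keys[1:]):
            if lo_i <= i <= hi_i:
                t = (i - lo_i) / (hi_i - lo_i)
                body.append({
                    name: lo[name] + t * (hi[name] - lo[name])
                    for name in lo
                })
                break
    return body
```

Only the dimensions listed in the key slices can vary; a change the table does not name (for example, a new curve) would require redefining the base shape, which is the constraint described above.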
Therefore, some changes are easy (e.g., changing scale, adding/removing slices/ribs, textural changes), but some changes require defining a whole new design system, in which case, the designer is doing more work than simply making a one-off design. The design system approach is only appropriate when many smaller variations of a robot are required.

3.5 Conclusions

The 1-DOF CuddleBits are easy to build, modify, and iterate on. They show good potential for integrating into 2-DOF robots. The CuddleBits were built using a sketch-and-refine paradigm, where they were first built with hand-modifiable low-fidelity materials such as paper and light plastic, and a wide variety of form factors and actuation principles were explored. Keeping the principle of design for redesign in mind, the CuddleBits were then further developed as design systems that allowed for many configurations of a robot to be built using the same base design. This approach allowed for rapid exploration of the design space where expressivity was the key metric for valuation.

Chapter 4

Generating Robot Behaviours

The simplicity of the CuddleBits cannot be overstated. One reviewer, in a positive review, called them “painfully simple,” and I heartily agree. That is the fundamental power of the work. We were able to display a surprisingly wide range of emotions using very simple machines. The breathing-type CuddleBits we study here (the RibBit and the FlexiBit) are only able to expand and contract a portion of their bodies.¹ We used internally developed haptic design tools to quickly sketch and refine behaviours: (1) MacaronBit, a keyframe-based vibrotactile effect editor and (2) Voodle, a robot behaviour sketching tool.

When designing physical objects, a designer will typically start by roughly sketching the object, then iteratively refining their design. Sketching is improvisational and rough, and allows a designer to test many different ideas by quickly creating and evaluating their designs on the fly.
When refining, the design is much more static, and changes are incremental. After attempting a number of sketching and refining tools for robot behaviour design, our group converged on Voodle/MacaronBit. Here, we discuss the development and evaluation of our design tools wherein we (a) conducted a pilot study to determine which aspects of vocal input were most salient for robot motion; (b) conducted a behaviour design study where participants were asked to create emotional designs using Voodle and MacaronBit.

¹ They basically just wiggle.

4.1 Early tool attempts

While extending Macaron for robot behaviour design seems natural in retrospect, it was not obvious at the time that a vibrotactile design tool could be repurposed for 1-DOF robot motions. A keyframe editor balances the concerns of precision and refinement with the easy sketching abilities afforded through direct manipulation; Macaron provides focused control over a short window (on the order of seconds).

Our group made many attempts at behaviour design/puppeting tools and techniques before MacaronBit and Voodle, including a free-hand vector drawing tool based on paperjs.org; a browser-based timeline editor [2]; and, for direct position control, a force-feedback knob [83]. The timeline editor was unintuitive; the drawing tool and force-feedback knob required too much fine control for large movements, and did not provide enough control for small movements (i.e., an entire behaviour would have to be re-drawn per sketch; human hands cannot move fast enough to express fine motions such as fluttering). MacaronBit allowed for precise control of both large and fine motions, and Voodle allowed for quick sketching and iteration since it gave immediate continuous feedback.

In later work, we attempted to use genetic algorithms with complexity measures as utility functions to generate behaviours at a variety of valence levels. However, even for relatively short behaviours, the process of generation was extremely long.
As such, we returned to using Voodle and MacaronBit to sketch and refine behaviours.

4.2 MacaronBit: A Keyframe Editor for Robot Motion

Macaron is an open-source web-based keyframe editor for designing vibrotactile sensations, using amplitude and frequency of a vibration [79]. As our first pass at designing a robot behaviour editor, we extended Macaron to robot position control, calling the result MacaronBit (Figure 4.1).

In developing MacaronBit, we started with a pure sine-wave and adjusted its parameters: frequency, amplitude, bias, and amplitude/frequency variability. Its support of immediate playback, key-framing (parameter interpolation between key points), waveform generation, and click-and-drag editing sped up iteration. For participant-designed behaviours, we switched to direct position control, where design parameters were position, randomness, and max and min position (Figure 4.1).

Figure 4.1: For behaviour generation, users had two design tools to help create behaviours: Voodle and MacaronBit. With Voodle, users could optionally sketch a robot motion using their voice. Their vocal input was imported into MacaronBit as raw position keyframes (shown as ‘pos’ above). Users could modify the waveform by manipulating other parameters, specifically randomness (shown as ‘ran’), max and min position. MacaronBit includes standard keyframe editing functions.

4.3 Voodle: Vocal Doodling for Affective Robot Motion

Interactive agents are more compelling when they are believable: giving the illusion of life and facilitating suspension of disbelief [7]. When users believe that an agent has a ‘spark of life’, they can be more immersed, emotionally invested, and aligned with the agent system.

However, creating believable agents is hard. Animators and roboticists are highly trained, use cutting-edge modeling tools, and have to strike a balance between making their animations too real (and becoming uncanny) and not real enough (and thereby not being understood).
Yet actors and performers improvise believable characters. While this may require skill, effort and a specific state of mind, the effort is applied directly, not through a computer keyboard or by writing code. Can performance be leveraged to improvise believable robot motion?

We focus here on voice, for two crucial qualities. First, voice naturally expresses emotion; meaning is conveyed through the form of speech (e.g., prosody) as well as utterance semantics. A dog can thus take direction from its owner’s tone, timing and loudness as well as her words. Secondly, iconic sounds, found in onomatopoeic words like “boom”, “woof”, or “ding,” can capture hard-to-express ideas like emotion (ugh) or movement (zoom).

Figure 4.2: Voodle (vocal doodling) uses vocal performance to create believable, affective robot behaviour. Features like tone and rhythm are translated to influence a robot’s movement.

We posit that an iconic vocabulary could be the basis of a rich, naturalistic, and improvisational platform to interactively design behaviors for physical, affective systems.

To assess this proposition, we built the Voodle system (‘vocal doodling’), which derives believable motion from iconic vocalizations². Specifically, we used computationally low-cost methods such as real-time amplitude and pitch analysis of vocal performances to immediately generate motion on a 1-degree-of-freedom (DOF) robot, allowing the performer to evolve and experiment as he seeks a particular behavior. Voodle was developed as a design tool. That is, we intended that roboticists would perform vocalizations that move a robot as a way to design its behaviour. However, along the way we also discovered its promise as an interaction technique: an expressive input that end-users can employ to elicit lifelike motion as they interact with the robot.

A 1-DOF robot can be expressive yet relatively easy to implement and control, and offers insight into motion for more complex robots.
Our final Voodle system mapped increased pitch and amplitude to CuddleBit height. We first describe a pilot study to gather requirements for a working system, then report implementation details.

4.3.1 Pilot Study: Gathering Requirements

We conducted a pilot to inform an initial Voodle implementation based on the RibBit (Figure 4.3). Like most of the Bits, the RibBit moves its “ribs” in and out with a breathing-like motion. To identify and prioritize features, we captured vocalizations people use to describe robot behaviours, characterized how people mapped sounds to robot movements, and identified key vocal and system features for implementation.

We recruited five participants (aged 20-26, 2 female) from a university population, reimbursed $10 for a 1-hour session. All were fluent in English (four native speakers, one native Russian speaker; four multilingual) with varied artistic and performance experience, e.g., acting, illustrating, music.

² We use “Voodle” to refer to our implemented system, “voodling” to the act of using iconic vocalizations as an input modality with an interactive system, and “voodles” for specific vocalizations.

Figure 4.3: The 1-DOF CuddleBit robots used in the Voodle co-design study: (a) RibBit: A CuddleBit that looks like a set of ribs; (b) FlexiBit: A Bit whose stomach “breathes” via a servo; (c) FlappyBit: A Bit with an appendage that flaps up and down via a servo; (d) VibroBit: A Bit that vibrates via an eccentric mass motor; (e) BrightBit: A Bit whose eyes light up via an LED.

Methods

After an icebreaker activity (tongue-twisters and improv game), participants completed a vocal imitation task, then a vocal improvisation task.

Imitation task: Participants observed and optionally used their hands to feel each of 18 movements through the robot, then imitated the behaviour using iconic vocalizations.

Of the 18 robot motions, ten were developed using vibrotactile signals from an existing
library that categorizes vibrations based on perceived dimensions such as energy, duration, rhythm, roughness, pleasantness, and urgency [82]. These had been previously chosen for the purpose of expressive vibrotactile display, by two researchers independently selecting exemplary vibrations from the library’s dimensional extremes then iteratively merging their choices [80].

We produced eight more motions by systematically varying sine parameters: fast/slow, large/small, and rough/smooth.

[Figure 4.4 diagram: block diagrams of the voice-to-motion pipeline at four stages (Pilot; Study 1 (comparative) and Study 2 (co-design) Session 1; Study 2 Session 2; Study 2 Session 3), showing voice input, amplitude and AMDF pitch analysis, the pitch/amp weight (β), scale, reverse, randomness, PD control, physically adjustable parameters, and servo, motor and LED outputs.]

Figure 4.4: The Voodle system implementation, as it evolved during our studies. Additions for each stage are highlighted in yellow. In our final system, incoming vocal input is analyzed for amplitude and fundamental frequency. These signals are normalized between 0 and 1, then averaged, weighted by a “pitch/amp” bias parameter. Randomness is then inserted into the system, which we found increased a sense of agency. Output is smoothed either with a low-pass filter or PD control. Final output can be reversed to accommodate user narratives (i.e., robot is crunching with louder voice vs.
robot is expanding) for several different CuddleBits.

Motion durations ranged from 1-13 seconds and were looped.

Improvisation task: Participants manually puppeted the unpowered robot while spontaneously vocalizing their puppetry, while audio and video were recorded.

Analysis: We transcribed vocalizations into the International Phonetic Alphabet (IPA) from the imitation task to capture and prioritize input sounds, observed and reported how people mapped sounds to robot movements, and observed phonological similarity within and between participants.

Table 4.1: Pilot Study: Linguistic features that participants felt corresponded best with robot position in the imitation task. “+” and “-” indicate feature presence or absence. The comparative Study 1 went on to use pitch as a primary design element.

Feature | Feature Description | Example Tokens | Dominant Participant-Produced Behaviours
--- | --- | --- | ---
Pitch | Perceived fundamental frequency of the vocalization over time. | “dum DUM” [↓dum↑dum]; “We eEH” [↓we ↑e]; “mMm” [↑m:↓m] | Upward movements associated with higher pitches, and downward movements associated with lower pitch; sometimes reversed.
+/- Continuant | Whether or not airflow is fully obstructed in the vocal tract during speech, e.g., the “f” in “father” vs the “t” in “butter”. | “waywayway” [weiweiwei] (+continuant); “dum dum” [dʌm dʌm] (-continuant) | Continuants are associated with behaviours that begin with gradual and smooth motion, while non-continuants are associated with behaviours with abrupt and jerky motion.
+Strident | When there is a large degree of turbulence and high energy noise caused by an obstruction in vocal tract. Example: the “sh” in “shush”. | “tchuh-tchuh” [t͡ʃʌ.t͡ʃʌ]; “tcheen” [tʃin] | Rapid movements, e.g., the Bit moves very quickly between different positions.
+/- Voiced consonants | A consonant is voiced if it’s produced while the vocal folds are vibrating. | “ga” [ga] (voiced); “ka” [ka] (unvoiced) | Voiced consonants were associated with smooth motion, while unvoiced consonants were associated with less smooth motion.

Results

Phonetic Features: Table 4.1 reports typical phonetic features that we observed in the pilot study’s imitation task. We transcribed vocalizations into the International Phonetic Alphabet (IPA), then organized them by distinctive phonological features [20]. The most compelling features, based on discriminability of motion and feasibility of implementation, were pitch, continuants, stridents, and voiced consonants.

Metaphors for Sound-to-Behaviour Mappings: Participants instituted a relationship between pitch, amplitude and height: the higher the robot’s ribs, the higher the pitch and amplitude.

There were exceptions to this pattern; for example, one participant saw the robot’s downward movement as ‘flexing,’ and therefore used increased vocal pitch and amplitude to represent its downward movement. Table 4.1 reports contrasting relationships that we observed, with examples.

We saw occasional reversals in participants’ mappings between the imitation task and the improvisational task.

One possible cause is the Bit’s actuation methods: i.e., computer-control in imitation, and participant-actuated in improvisation. The only direction to manually actuate the robot is downwards: its default state is an extended position, and the ribs are normally pulled inwards by a servo. Hence, increased physical effort translates to downward movement. So the relationship between pitch and amplitude may be based on how the participant conceptualizes the “direction” yielded by the work.

Individualized language: Each participant seemed to have idiosyncratic sound patterning.
For example, some participants used many voiced stops (e.g., “badum badum”) in their utterances. Some participants consistently used multiple syllables with many consonants (“tschugga tschugga”); others consistently produced simple monosyllabic utterances (“mmmm”).

4.3.2 Voodle Implementation

Based on piloting guidance, we created a full Voodle system, seen in Figures 4.4 (system design) and 4.2 (system in use).

We found that fundamental frequency and overall amplitude (easily detected in realtime) could capture a variety of relevant vocalizations, including pitch and +continuant features. To accommodate variety in metaphors (e.g., breathing vs. flexing) and individualized language, we included user-adjustable parameters: motion smoothing, gain, pitch and amplitude weight (where the weight between amplitude and pitch is a linear combination: output = amp × amp_weight + pitch × pitch_weight), and direction reversal. Priorities for future phonetic features include distinguishing the additional features reported in Table 4.1.

Voodle was implemented in JavaScript: a NodeJS server connected with the RibBit using Johnny-Five and ReactJS [43, 66].

Input audio was analyzed in 1 s windows. Amplitude was determined by the maximum value in the window, deemed to be sufficient through piloting. The fundamental frequency was calculated using the AMDF algorithm [85], the best performer in informal piloting. Figure 4.4 shows algorithm evolution. Voodle is open-source, available at https://github.com/ubcspin/Voodle.

4.4 Evaluating Voodle and MacaronBit

To develop a set of behaviours to evaluate for emotion content, we ran a study wherein participants were asked to work with an expert animator to create behaviours given an emotion word.

4.4.1 Methods

We recruited ten participants to design five robot behaviours, each based on an emotion word from the PANAS scale [93].
Three self-identified as singers or actors.

To define the design tasks, participants were assigned one word per affect grid quadrant, chosen randomly without replacement from the five PANAS words for that quadrant; participants selected a fifth word. The words were presented in random order.

Unpleasant Activated | Pleasant Deactivated | Pleasant Activated | Unpleasant Deactivated
--- | --- | --- | ---
stressed†* | relaxed†‡* | excited†‡* | depressed†*
upset‡* | calm‡* | attentive‡* | drowsy‡*
scared‡* | at rest‡* | determined‡* | bored‡*
guilty‡ | serene‡ | proud‡ | dull‡
hostile‡ | at ease‡ | enthusiastic‡ | sluggish‡
nervous‡ | | | droopy‡

Table 4.2: Affect Grid quadrants of PANAS emotion words. † represents words used in CuddleBits Behaviour Generation Study; ‡ represents words used in CuddleBits Study 1 (see Evaluating); * represents words used in CuddleBits Study 2 (see Evaluating).

For each word, the participant was given the option to express the behaviour with Voodle, design it using a traditional keyframe editor, or switch between these as needed.

The keyframe editor, Macaron [78], allows users to specify Bit height (periodic movement amplitude) over time, as well as remix and transform their original animations through copy/pasting, scaling keyframes, and inversion and other functions.
Participants could export their voodles as keyframe data for later refinement in Macaron.

During the study, an expert animator (a co-author) was a design assistant, introducing participants to the robot and two tools.

The animator assisted participants in creating compelling designs, offering technical support and guidance as needed, but did not create animations for them. Meanwhile, another researcher acted as an observer, taking notes on tool use and conducting a brief informal exit interview.

Participants could create as many designs for each emotion word as they wanted using any tool at any time until they were satisfied with the result; for example, they might make three designs for “excited” and choose their favourite.

4.4.2 Results

Voodle was used beforehand to sketch behaviours in most cases. When a participant had a clear idea of what the behaviour should look like, both sketching and refining were performed in MacaronBit.

A library of 72 behaviours labelled by emotion word was generated (participants designed multiple behaviours per word); analysis of these behaviours is presented with the CuddleBits Study 2 results (Chapter 5).

Participants agreed that the robots came to life: “it shocked me how alive it felt,” “it tries to behave like a living thing would.”

Voodling was used by participants to express emotions: “the things [Voodle]’s listening for is different from the things Siri listens for...it’s usually emotional meaning or mental state that’s conveyed by [pitch, volume and quality]”. While 7/10 participants used Voodle, those with performance experience used Voodle more. This may be individual preference: voodling is performance, and tended to be preferred by those comfortable with performing.

Participants generally chose to use Voodle to augment their keyframe-editor work, rather than as a stand-alone tool.
Only two (both performers) ever designed with Voodle alone, and only did so for one behaviour design task each.

Voodle was most appropriate for exploring and sketching ideas, not fine-tuned control. When users knew their goal, they moved straight to the keyframe editor: “it always seemed easier to go to [the keyframe] editor to do what I had in my head than trying to vocalize and create that through voice.”

We found participants had trouble expressing static emotional states (e.g., distressed); these became clearer when contrasted with an opposing emotion. In our next study with Voodle, we changed the task to transitions between emotional states.

Supplementing these observations, we note that a concurrent study (whose focus was on developing and assessing these robots’ expressive capacity, and not on input tools) also used these Voodle-generated animations along with others, and confirmed that they covered a large emotional space [16]. Specifically, independent judges consistently assessed Bit animations as well-distributed across the arousal dimension, and somewhat along valence.

We concluded that Voodle had value for sketching expressive robot behaviours, but needed further development.

4.5 Conclusions

In this section, we discussed the development and assessment of our two robot behaviour design tools, Voodle and MacaronBit. The former is used for sketching and the latter for refining robot behaviour designs. Both tools have been used so far for designing 1-DOF robot behaviours; there is evidence that extensions to multiple DOFs are possible, but the transition to multiple DOFs would not be simple. Like many 3D keyframe animation tools, MacaronBit may need a more complex timeline-based approach where multiple DOFs and complex movements are controlled through composing simpler movements (i.e., by joining, nesting, etc.). A vision for Voodle may be to set two keyframe multi-DOF robot positions, then to use voice to interpolate between them.
For example, imagine a humanoid robot that is going from crouching to standing.

Chapter 5

Evaluating Robot Behaviours

To determine the ability of the CuddleBits to display a wide range of emotions, we ran two studies where participants rated breathing behaviours in terms of arousal and valence (CuddleBits studies 1 and 2). We then ran a six-week co-design study with expert performers to further develop our improvisation tool and study how behaviour design could work when conveying changing emotional behaviours, i.e., from stressed to relaxed (Voodle). Last, we ran a study wherein we explored the relationship between complexity, valence, and narrative context, wherein participants rated robot behaviours for valence and created short stories that explained the robot behaviours (Complexity).

5.1 CuddleBits Study 1: Robot Form Factor and Behaviour Display

We evaluated the emotional expression capabilities of our two CuddleBit forms (FlexiBit and RibBit) on eight behaviours representing four emotional states. Specifically, we asked:

RQ 1. Can 1-DOF robot movements be perceived as communicating different valence and arousal states?
Hypothesis: Different levels of arousal will be interpreted more accurately than different levels of valence.

[Figure 5.1 plot: eight amplitude-over-time waveforms, numbered 1 to 8, arranged on a valence (negative to positive) by arousal (low to high) grid with quadrants Stressed, Excited, Relaxed and Depressed.]

Figure 5.1: Waveforms of Study 1 behaviours as designed by researchers. Each quadrant is represented by a PANAS affect word corresponding to the extremes along (valence, arousal) axes, i.e., Excited is high-arousal, positive-valence.

RQ 2.
How is interpretation of emotional content influenced by robot materiality, e.g., a soft furry texture?
Hypothesis: FlexiBit’s behaviour will be perceived as conveying more positive valence than RibBit’s.

5.1.1 CuddleBits Study 1: Methods

Behaviour design: Team members created and agreed upon two breathing behaviours for each quadrant of the affective grid [72]: Depressed, Excited, Relaxed, or Stressed, for a total of 8 behaviours (represented as motion waveforms in Figure 5.1). Each emotion word typifies the extreme of its emotion quadrant (i.e., Stressed is high-arousal, negative-valence).

Participants: 20 participants, aged 20–40 with cultural backgrounds from North America, Europe, Southeast Asia, Middle East and Africa, were compensated $5 for 30-minute sessions.

Procedure: Participants were given the task of rating each behaviour on a 5-point semantic differential (−2 Mismatch to +2 Match) for two different robots displaying four emotions: Depressed, Excited, Relaxed, or Stressed. For instance, for “FlexiBit feels stressed”, a participant would play each behaviour and rate how well it matched the robot portraying stress. During playback and rating, participants kept one hand on the robot, and moused with the other; motion was experienced largely haptically. Noise-cancelling headphones played pink noise to mask mechanical noises; instructions were communicated by microphone.

Ratings for each robot were performed separately. Robot block order was counterbalanced, with an enforced 2-minute rest. For each block, all four emotions were presented on the same screen so participants could compare globally. Behaviours (15 s clips) could be played at will during the block. Order of behaviours and emotion was randomised by participant. To reduce cognitive load, participants saw the same behaviour/emotion order for the second block. In total, each participant performed 64 ratings (8 behaviours × 4 emotions × 2 robots).
Afterwards, a semi-structured interview was conducted.

5.1.2 CuddleBits Study 1: Results

We compared ratings of each pair of behaviours designed for the same emotion word with pairwise Wilcoxon signed-rank tests with Bonferroni correction (Figure 5.2). Ratings of the two designed behaviours for the same emotion quadrant were not significantly different (α = .050/8 = .006; all p’s ≥ .059). Thus, we averaged ratings into four pairs by emotion target (e.g., (1) & (2) in Figure 5.1).

Effect of emotion quadrant on behaviour ratings (significant). Friedman’s test on behaviour ratings showed significant differences between behaviours per emotion for both robots (all p’s < .001). Post hoc analyses using Wilcoxon signed-rank tests were conducted with a Bonferroni correction (α = .050/6 = .008) to further analyse the effect of emotion condition on researcher-designed behaviours:

– Stressed, Excited, or Relaxed: There were significant differences between high and low arousal behaviours (Stressed-Depressed, Stressed-Relaxed, Excited-Depressed and Excited-Relaxed, all p’s ≤ .002); but none between behaviours with the same arousal level but different valence content.

Effect of robot on behaviour ratings (not significant). Wilcoxon signed-rank tests with Bonferroni correction showed no statistically significant differences between ratings of emotions displayed on the two distinct robot forms (α = .050/16 = .003; all p’s ≥ .026).

Duration (not significant). A two-way (2 robots × 4 emotions) repeated measures ANOVA showed no significant differences in the time spent on rating behaviours (all p’s ≥ .079), suggesting each emotion rating was undertaken with similar care.

5.1.3 CuddleBits Study 1: Takeaways

Hypothesis 1: Different levels of arousal are easier to interpret than different levels of valence. – Supported.

In general, participants were able to perceive differences in behaviours designed to convey high or low arousal.
Speed or frequency was most mentioned for arousal variation: low arousal from low frequency and high arousal from high frequency. Participants found interpreting valence more difficult. Thus, behaviours on this 1-DOF display corroborate earlier findings with regard to both dimensions [27, 65, 97].

We posit that the difficulties in determining valence may be due in part to the restrictive range of behaviours. All designs were based on the perception and imagination of three computer science researchers, which may not be broadly generalizable as effective emotional displays.

Improvement: Behaviours may have more range or discernible valence when sourced from a more diverse group of designers. To increase emotional variance in Study 2, we recruited participants (N=10), the majority of whom were employed in creative roles, to create the behaviours with an expert designer. Participants were encouraged to puppet robot movements, act out desired movements, and interact with the robot until they were satisfied with the emotional displays.

Hypothesis 2: FlexiBit’s behaviour will be perceived as conveying more positive valence than RibBit’s. – Not supported.

In post-study interviews, participants reported the movement expressed by the two robot forms as sensorially but not necessarily emotionally different. FlexiBit felt nicer to touch, but its motion was less precise. RibBit’s movements were interpreted as breathing or a heartbeat despite the exposed inner workings emphasizing the ‘machine-ness’ of the robot.

Unexpectedly, while participants specified preferences for FlexiBit’s fur and RibBit’s motor precision, pairwise comparisons of the same emotions revealed no significant difference between robots.
Movement rather than materiality dominated how participants interpreted emotional expression; although visual access to form was restricted during movement, tactility might have modulated perception of, e.g., life-likeness.

Improvement: Whereas robot form factor had little to no influence on emotion recognition results, it did influence how participants perceived the robot. We selected characteristics to emphasize for a second round of robot prototyping, producing a new robot for Study 2. We focused on characteristics that participants referenced as salient or pleasing in interviews, such as fur, texture, and body firmness.

Starting from paper prototypes, we iterated on the RibBit form factor to increase haptic salience and to incorporate positive FlexiBit features. After exploring bumps on the ribs, spine configuration, fur textures, and rib count, we converged on a form that had fewer ribs, dense fur, and a prominent spine. This combined the favourite features of the RibBit (crisp motion and haptic feedback) with the FlexiBit’s cuddliness. With rapid prototyping methods, each paper/lo-fi sketch could be explored in less than an hour; full new robot prototypes took about two hours to modify design files, half an hour to laser cut, and about two hours to assemble.

5.2 CuddleBits Study 2: Evaluating Behaviours

In a second study, we validated our participant-created behaviour designs and explored the effect of presence on emotion evaluation. Here, we ask how consistently behaviours are rated in terms of valence and arousal under two viewing conditions: (1) the robot is present; (2) the robot is displayed via video.

Of the 72-item behaviour set generated by participants (see Chapter 4), CuddleBits Study 2 used a subset of 16: five researchers selected the most representative designs, converging on the top four per quadrant.
Under two viewing conditions {live, video}, participants chose three words that best represented the displayed behaviours and rated their confidence in each chosen word, as well as one or more words that least represented the behaviour. Participants rated words ahead of time in terms of arousal and valence. Ratings per participant and per viewing condition were combined into a single (valence, arousal) point (described below). Through this, we explored the following:

Figure 5.2: Mean behaviour ratings (+2 for Match; -2 for Not Match) for FlexiBit grouped by the researcher-designed behaviours (horizontal) and the emotion word against which participants rated behaviours (vertical). Researcher-designed behaviours correspond with (1) to (8) in Figure 5.1. RibBit scores were similar and omitted for space.

RQ 1. Is there a difference in viewing conditions?
Hypothesis: Participants will rate behaviours similarly regardless of viewing condition.

RQ 2. Are behaviours consistently distinguishable?
Hypothesis: Each behaviour will be distinguishable.

RQ 3. Which behaviour design and waveform features correlate with rated dimensions of arousal and valence?
Conjecture: Features that are characteristic of variability will correlate with valence, while features that are characteristic of speed will correlate with arousal.

5.2.1 CuddleBits Study 2: Methods

Participants: We recruited 14 naïve participants (4 male), aged 22–35. 12 participants were fully proficient in English; the remaining 2 had advanced working knowledge. Out of 14 participants, 13 reported having at least some interaction with pets; 6 rarely interacted with robots, and 8 never interacted with robots. All were compensated $15 per session.

Procedure: Participants were seated, introduced to a fur-covered RibBit, and asked to touch the robot to reduce novelty effects.
To calibrate emotion words, participants rated the valence and arousal of 12 words on a 9-point scale (Table 4.2). Participants then viewed the 16 robot behaviours in two counterbalanced viewing condition blocks {live, video}.

In the live condition, participants could physically interact with the CuddleBit while playing each robot behaviour via MacaronBit. Noise-cancelling headphones played pink noise to mask robot noise. In the video condition, participants watched silent videos of the CuddleBit performing the same behaviours (side view, 640x360 px, 30 fps). In both conditions, behaviour order was randomized for each participant.

In each viewing condition, participants were asked to choose 3 emotion words that best represented the behaviour from a list of the 12 emotion words they calibrated previously, indicating their confidence level of each word on a 5-point Likert scale. They watched 16 behaviours and answered qualitative follow-up questions. After an optional 5-minute break, this process was repeated, with condition block counterbalanced. Including a semi-structured interview, the session took ∼60 minutes.

5.2.2 CuddleBits Study 2: Data Preprocessing

Before each session, participants calibrated the emotion words that they would be using by rating each in terms of arousal and valence. Using the calibrated list of emotion words, we constructed vectors of (v = valence, a = arousal) for each word, where 1 ≤ v, a ≤ 9. For each behaviour and viewing condition, the best three words were weighted by their confidence values, added and normalized. This produced a single vector of (v, a) for each behaviour and viewing condition (Figure 5.3).

[Figure 5.3 diagram: participants rate each word (e.g., Determined) on 1 to 9 arousal and valence scales, producing a vector for each word, e.g., Determined (8,6), Excited (9,9), Nervous (1,7). For each behaviour and viewing condition, the top three word vectors w1, w2, w3 are weighted by confidence values c1, c2, c3, with ratings {1..5} mapped to {0.0, 0.25, 0.5, 0.75, 1.0}: b_{1,live} = (c1 w1 + c2 w2 + c3 w3) / (|w1| + |w2| + |w3|).]

Figure 5.3: For each behaviour and viewing condition, a single vector was calculated by adding the vectors of the top three words that participants chose, weighted by confidence levels. Word vectors were determined at the beginning of the session, when participants rated each word in terms of arousal and valence.

5.2.3 CuddleBits Study 2: Data Verification

Before the following analysis, we ran a series of data verifications to ensure consistency in each participant’s responses.

Due to the high subjectivity of the kinds of emotions people will associate with different words, the participant-calibrated emotion words were checked for consistency with the expected PANAS quadrants. For all participants, no more than two words disagreed with the PANAS quadrants; as such, we took the participant-rated words to be reasonably calibrated.

Similarly, for each behaviour per view condition, the best three rated words were checked both against themselves, and against the selected least representative word(s). Roughly 50 per cent agreed within a reasonable margin of error across either valence or arousal; 30 per cent agreed across both valence and arousal; 20 per cent either did not agree or were inconclusive.

To determine whether our confidence value weighting scheme was valid, we inspected both the visual distribution of words and confusion matrices with design labels. With no weighting scheme, data was heavily biased towards (positive valence, high arousal) ratings, which did not agree with our qualitative results or a reasonable reading of our quantitative results. As such, a linear weighting scheme was determined to be the least biased, such that confidence ratings of {1,2,3,4,5} were mapped to {0.0,0.25,0.5,0.75,1.0}.

5.2.4 CuddleBits Study 2: Analysis and Results

We summarize our findings from CuddleBits: Study 2.
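The preprocessing described in Section 5.2.2, together with the linear confidence mapping from Section 5.2.3, can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, |w| is taken to be the Euclidean norm of a word vector, and the caller is assumed to supply word vectors in whatever (valence, arousal) scale is used for analysis.

```python
import math
from typing import Sequence, Tuple

Vector = Tuple[float, float]  # (valence, arousal)


def confidence_weight(rating: int) -> float:
    """Map a 1-5 Likert confidence rating linearly onto [0.0, 1.0]."""
    if not 1 <= rating <= 5:
        raise ValueError("confidence rating must be between 1 and 5")
    return (rating - 1) / 4


def behaviour_vector(words: Sequence[Vector], ratings: Sequence[int]) -> Vector:
    """Combine the chosen word vectors as sum(c_i * w_i) / sum(|w_i|).

    Assumption: |w_i| is the Euclidean norm of each word vector.
    """
    if len(words) != len(ratings):
        raise ValueError("need exactly one confidence rating per word")
    weights = [confidence_weight(r) for r in ratings]
    valence = sum(c * w[0] for c, w in zip(weights, words))
    arousal = sum(c * w[1] for c, w in zip(weights, words))
    norm = sum(math.hypot(w[0], w[1]) for w in words)
    if norm == 0:
        # All word vectors are zero, so there is nothing to normalize.
        return (0.0, 0.0)
    return (valence / norm, arousal / norm)


if __name__ == "__main__":
    # Example word vectors taken from Figure 5.3; the confidence ratings
    # here are made up for illustration.
    chosen = [(8, 6), (9, 9), (1, 7)]
    print(behaviour_vector(chosen, [5, 3, 1]))
```

Note that under this mapping a word chosen with the lowest confidence (1) contributes nothing to the numerator but still contributes to the normalizing term.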
All significant results are reported at p < .05 level of significance.

RQ1: Is there a difference in viewing conditions?
Hypothesis: Participants will rate behaviours similarly regardless of viewing condition. – Not supported.

Behaviour label × Viewing condition: We found a significant effect for viewing condition (Pillai = 0.563, F(2,415) = 6.87) and behaviour label (0.563, F(30,832) = 10.86). We did not find an interaction effect (p = .33). Although there is evidence to suggest that participants do rate behaviours differently, since they also rate viewing conditions differently, we should be careful in using video as a proxy for live robot behaviour display.

Behaviour label quadrant × Viewing condition: We found a significant effect for viewing condition (Pillai = 0.441, F(6,880) = 41.43) and by collecting designs by quadrant (e.g., Hostile and Upset are both high-arousal, negative-valence emotions), (Pillai = 0.030, F(2,439) = 8.705), and the interaction effect.

Duration: Through 2-way ANOVAs, we found significance in duration between viewing conditions wherein participants took longer to rate live (µ = 72.49s, σ = 40.69s) than via video (µ = 64.13s, σ = 29.28s) per behaviour, corroborated in that live behaviours (µ = 2.36, σ = 1.51) were played more times than the corresponding video (µ = 1.96, σ = 1.18). The more time spent on live behaviours could be due to more information conveyed or more interest as participants interpret the motion and/or haptic expression.

RQ2: Are behaviours consistently distinguishable?
Hypothesis: Each behaviour will be consistently distinguishable.
– Partially supported.

Behaviour label × Participant: As behaviours (Pillai = 0.917, F(30,448) = 12.649), participants’ ratings (Pillai = 0.671, F(26,448) = 8.705), and the interaction are all significant, we determine that the behaviours are distinguishable by participant.

Through Figure 5.4, we examine rating consistency by behaviour and quadrant. Negative-valence, low-arousal (Depressed) behaviours have the largest dispersion in rating for both dimensions, suggesting that they are the most difficult for participants to classify. Low-arousal, positive-valence (Relaxed) behaviours are more consistently concentrated towards the relaxed quadrants.

Both high-arousal, negative-valence (Stressed) and high-arousal, positive-valence (Excited) behaviours are concentrated in the high-arousal half, yet highly dispersed across valence, suggesting valence is difficult to determine for certain high-arousal behaviours.

Overall, behaviours designed for a representative quadrant may not necessarily be interpreted as such. Determined, for example, was interpreted as negative-valence with high-arousal, in contrast to the intended positive-valence, high-arousal.

Finally, live behaviours (red in Figure 5.4) are more dispersed than video behaviours (blue). This illustrates a higher variation in how participants rated live than video behaviours.

RQ3: Which behaviour design and waveform features correlate with rated dimensions of arousal and valence?

Conjecture: Features that are characteristic of variability will correlate with valence; features that are characteristic of speed will correlate with arousal. – Partially supported.

Analysis using machine learning techniques was performed as a preliminary step to understand which features might be most relevant.
Using the full set of designed behaviours from participants (see Generating) and their associated design labels, we trained a Random Forest classifier on statistical features calculated from design and output waveform attributes. Since each behaviour was output as a waveform, we could decompose the waveform using MacaronBit design parameters, and describe them using keyframe count and standard statistical features (min, max, mean, median, variance, total variance, area under the curve) on keyframe values. The same statistical features were calculated for the output waveform.

Figure 5.4: Each plot shows a single behaviour’s arousal (-1,1) and valence (-1,1) ratings. Live viewing condition is in red, video in blue. Green ellipses show confidence intervals at 5% and 97.5%. Green cross is mean, purple cross is median. Each plot corresponds to a single PANAS word, each row corresponds to an affect grid quadrant. Rows order from the top: Depressed, Relaxed, Excited, Stressed. [Panels by row: Sluggish, Droopy, Bored, Drowsy; Calm, Relaxed, Serene, At Ease; Attentive, Proud, Enthusiastic, Determined; Guilty, Hostile, Nervous, Scared. Axes: valence and arousal, each from -1 to 1.]

Each behaviour label was mapped to the original PANAS quadrant (called here design quadrant). When running 20-fold cross-validation classifying on design quadrant, the Random Forest classifier achieved between 66% and 72% accuracy: 66% with the full feature set, and 72% with an optimal subset (chance = 25%). Top performing features were position: keyframe count, range, total variance; random: max, min.

Note that the selected features are related to waveform complexity. If the random parameter was set high, then the waveform would have a high amount of variation.
Similarly, if there were a high number of position keyframes, the waveform would have a lot of variation.

Feature Selection

A correlation matrix was constructed between arousal and valence for 16 participant-rated behaviours (per viewing condition), and for the 72 participant- and researcher-generated unrated behaviours.

As seen in Figure 5.5, arousal has stronger correlation within the feature vector than valence. Features with strong positive correlation to arousal are those that also correspond with the widest, fastest, and most erratic motions, such as position keyframe count, position range, and random maximum.

Valence has much weaker correlation overall, and particularly low absolute correlation values in the participant-rated analysis. However, within the unrated behaviours, the top correlated features are also indicators of waveform complexity and are negatively correlated with valence, i.e., the more complex a behaviour is, the less it is deemed to be a pleasant behaviour.

Figure 5.5: Correlation results from behaviours that were designed for an emotion label but unrated by participants (marked unrated above) were calculated on all 72 designs from CuddleBits: Participant Generated Behaviours (see Generating); correlation results from participant ratings were calculated on the 16 behaviours from CuddleBits: Study 2 (marked by viewing condition). A strong positive correlation is shown between the position total variance and all arousal columns (unrated_a, video_a, live_a) – the higher the total variance, the higher the arousal. [Correlation matrix: feature rows numkf, med, max, min, var, totvar, range, mean, auc for each of Position, Random, and Waveform; columns unrated_v, unrated_a, video_v, video_a, live_v, live_a. Cell values are not recoverable from this copy.]

Participant Experience

Interviews with participants were audio-recorded, transcribed, and coded for themes and keywords by a single researcher using an affinity diagram and constant comparison. Open-ended written responses from participants for both live and video viewing conditions were analyzed with the same techniques.

In contrast to video, participants emphasized the importance of haptic feedback (7/14), the ability to view the robot from multiple angles (4/14), and the increased engagement and accessibility that resulted from the live interaction (5/14); 9/14 touched the robot while playing behaviours. Of these, three reported wanting to touch the robot when they were having difficulty interpreting the behaviour.

“Feeling the movements rather than just watching them helped me get a better sense of an interpretation of what the emotion was.” –P05 (corroborated by P09, P11)

Participants reported that some of the given emotion words were ambiguous (particularly Depressed, Attentive, Excited, and Stressed), due to a lack of context and visual cues.

“There were some emotions where it was pretty ambiguous. [There are] different connotations depending on exactly what the context is.” –P03 (P02, P10)

Several participants (4/14) interpreted a combination of emotions while observing the robot behaviours, with emotions happening either simultaneously or sequentially.

“I think [the emotions] are happening in order. So sometimes [it’s] excited, and then after the stimulus is gone, it becomes bored again.” –P06 (P04, P07, P09)

Participants’ descriptions of their process for labelling robot behaviours included relying on their experience with animal behaviours (4/14), their experience with human emotions (2/14), and interpretation of the “heartbeat” or “breathing” of the robot’s movement (7/14). 5/14 participants mentioned that it was difficult to interpret emotions from the robot behaviours.

5.2.5 CuddleBits Study 2: Takeaways

Viewing conditions: Experiencing the robot live seems to have a different effect on behaviour interpretation than via video. Since the ratings in viewing conditions were significantly different, it is inadvisable to use video as a proxy for robot behaviours.
This difference is likely due to the ability to experience the robot haptically and visually from multiple angles. Further, the laboratory context may have diminished the emotional interpretation in both cases, as the context of the behaviours was intentionally unspecified and therefore ambiguous.

Distinguishability and consistency: Behaviours were distinguishable; some robot behaviours were consistently rated within the same affective quadrants, some across a single dimension (usually arousal), and some not at all. In line with an intuitive understanding of emotional behaviours, this suggests that it is possible to create behaviours that are seen as meaningfully different (especially in terms of arousal), but that their interpretation is subject to some variance. If we take the distribution of behaviour ratings at face value, the ambiguity of interpretation may not be measurement noise, but a true representation of the behaviour’s emotional space. That is to say, the behaviour labeled as Sluggish in Figure 5.4 may be well-defined as a probability space with a nearly-neutral mean.

Waveform features: The waveform features that define arousal seem clear: by increasing the amplitude and frequency of a behaviour, it should be interpreted as higher arousal. Valence is less clear, since low correlation values make a definitive claim difficult. However, we conjecture that the waveform features that create a more complex and varied behaviour (i.e., randomness, number of keyframes needed to make a behaviour) create a lower-valence behaviour. More work is needed to look into valence behaviours (see Complexity below).

It is possible to create distinguishable emotional behaviours with the CuddleBits.
Future work should (1) seek to establish the factors that produce consistent valence interpretations; and (2) establish how those behaviours change within the context of an interactive ‘scene’.

5.3 Voodle: Co-design Study

To understand how Voodle would work in a behaviour design process, we performed a co-design study: performer-user input guided iteration on factors underlying Voodle’s expressive capacity.

Because using iconic input to generate affective robot motion is an unexplored domain, we focused on rich qualitative data. Methods borrowed from grounded theory [86] allowed us to shed light on key phenomena surrounding this interaction style, and to define the problem space through key thematic events as a basis for further quantitative study.

5.3.1 Voodle: Methods

Ideal Voodle users are performance-inclined designers. We recruited three expert performers to help us improve and understand Voodle.

Over a six-week period, each performer met us individually for three one-hour-long sessions, for a total of nine sessions conducted.

After completing Session 1 with all three participants, we iterated on the system for Session 2; we repeated this process between Sessions 2 and 3.

In each session, participants were guided through a series of emotion tasks, followed by an in-depth interview. Each emotion task was treated as a voice-acting scene, where the participant played the role of actor, and two researchers played the roles of director (here, an assistant as for comparative Study 1) and observer. As before, the director/assistant offered technical support and suggestions as needed, but did not actively design behaviours. An observer took notes.

In each task, participants used Voodle to act out transitions between opposing PANAS emotional states, e.g., distressed → relaxed, for (high-arousal, low-valence) → (low-arousal, high-valence).
The full set of emotion tasks (a) crossed the diagonals of the affect grid; and (b) crossed each axis: Distressed - Relaxed, Depressed - Excited, Relaxed - Depressed, Excited - Distressed, Relaxed - Excited, Depressed - Distressed.

Participants performed as many as they could in the time allotted per session. Each session lasted an hour: 30 minutes dedicated to the main emotion task, 20 minutes for an interview, and 10 minutes for setup and debriefing.

An in-depth interview was framed with three think-aloud tasks, to motivate discussion and draw out user thoughts on the experience of voodling. Participants were asked to (1) rate and discuss Likert-style questions of 5 characteristics: perceived alignment, fidelity and quality of designed behaviours, and perceived degree of precision and nuance; (2) sketch out a region on an affect grid to represent the expressive range of the robot (Figure 5.6); and (3) pile-sort [9] pictures of objects, including pets, the CuddleBit, and tools, to expose how they defined terms like ‘social agent,’ and how the Bit fit within that spectrum.

5.3.2 Voodle: Participants

Participants were professional artists with performance experience, recruited through the researchers’ professional networks.

P1 is a visual artist focused on performance and digital art. He was born in Mexico and lived in Brazil for 4 years and Canada for 7 years. P1 is a native speaker of Spanish and English, with working knowledge of Portuguese and French.

P2 is an audio recording engineer, undergraduate student (economics and statistics), and musician: he provides vocals in a band, and plays bass and piano. P2 is a native English speaker born in Canada; he is learning German and Spanish.

P3 is an illustrator, vocalist, and freelance voice-over artist. She has a degree in interactive art and technology, and has taken classes on physical prototyping and design. She is a native speaker of Mandarin and English, with working knowledge of Japanese.
P3 was born in Taiwan and immigrated to Canada when she was 8 years old.

5.3.3 Voodle: Analysis

We conducted thematic analysis [73] informed by grounded theory methods [24] on observations, video, and interview data. We found four themes (Table 5.1): participants developed a personal language, voodling requires a narrative frame and brings users into alignment with the robot, and parametric controls complement the voice for input. Each session helped to further develop and enrich each theme, adding to the overall story. We refer to each theme by an abbreviation and session number: “PL1” is Personal Language, Session 1.

Table 5.1: Summary of Voodle Co-Design Themes. We refer to each theme by abbreviation and session number (e.g., “PL1”).

Theme: Personal Language (PL)
  Definition: The individualized words and utterances a participant developed with the robot.
  Session 1: Participants took a varying amount of time to “get” Voodle; each vocalized in different ways, arriving at a local maximum.
  Session 2: Participants build upon their constructed language, starting from their Session 1 language, but exploring more ideas.
  Session 3: Robots influenced choice of voice or MIDI input, but not vocalization language.

Theme: Narrative Frame (NF)
  Definition: The story the user is telling themselves about who or what the robot is.
  Session 1: Participants needed to situate the robot by constructing a character to effectively interact with the robot by utilizing metaphors, concepts, and feelings that do not need to be explicitly described in words.
  Session 2: Participants used narrative frame in different ways. Fur did not affect their ability to construct a narrative frame.
  Session 3: Robot form factor, orientation adjusted the stories that participants told.

Theme: Immersion (I)
  Definition: The extent to which a participant could suspend their disbelief.
  Session 1: Participants adjusted their language depending on the robot’s behaviour. By conversing with the robot, they found the behaviour was more believable than the observers did.
  Session 2: Experience helped people be more in-tune (“aligned”) with the robot; as did voodling in comparison to using direct MIDI controls.
  Session 3: Too much or too little control reduces emotional connection; physically actuated displays connect more with users.

Theme: Controls (C)
  Definition: How the control of the system influenced how participant saw the interaction.
  Session 1: Laptop controls were difficult to use. A low-pass smoothing algorithm was not effective. Randomness contributed to lifelike behaviour.
  Session 2: Physical MIDI controls were easy to use when voodling, but lacked feedback. The robot needed an adjustable “zero” to maintain lifelike behaviour without input.
  Session 3: Suggestions include: steady-state sine wave breathing and setting 0 position as 50% of max servo.

Voodle: Session 1

We introduced naïve participants to the initial Voodle prototype and allowed them to explore its capabilities and limitations by completing as many of the emotion tasks as time permitted (∼30 mins). We closed with a semi-structured interview; participant feedback informed the next iteration of Voodle.

Theme PL1: From “Eureka” to local maximum: Participants were initially instructed to use iconic vocalizations with an example such as ‘wubba-wubba’. Despite this, all participants chose to use symbolic speech early in Session 1.

For example, when asked to perform the emotion task relaxed → depressed, P2 started by saying “I’m having a nice relaxing day”, with little visible success in getting the Bit to do what he wanted. Each participant transitioned into understanding how to use Voodle at different times. P3 quickly understood that symbolic speech wouldn’t afford her sufficient expressivity, and transitioned to iconic input, while P2 kept reverting to symbolic speech as an expressive crutch.

It was not until the fifth emotion task (of seven) that P2 had a breakthrough: “I kinda made it behave how I imagined my dog would behave”.
Using that metaphor, subsequent vocalizations attained better control. Unlike the other two participants, P1 switched to iconic vocalizations gradually.

Each participant eventually converged on his or her own idiosyncratic collection of sounds that they felt was most effective. This differs from what might be a globally optimal set of sounds to use: participants stayed in some local maximum.

For example, P1 started using “tss” sounds and breathing into the microphone; while initially successful for percussive movements, they later proved limiting. P2 used nasal sounds peppered with breathiness (“hmmm”). P3 eventually focused on manipulating pitch with vowels (“ooOOOO”), as well as employing nasals like P2 (“mmm”), and some ingressive (breathing in) vocalizations (“gasp!”).

Theme NF1: Developing a story: Once the participant finds the robot’s ‘story’, emotional design tasks get easier. For example, P2’s shift came with his story of the robot being a dog. P2 refused to explicitly tell a story: “it wasn’t much of a concrete story”. P1 said he created less a full story, “more a grand view of feeling some emotions and from there on you could build a story, we were getting more the traces of a story through the emotions”. This narrative potential was enabled by the conceptual metaphor of Voodle as a dog [47].

Theme I1: Mirroring the robot suspends disbelief: Participants formed a feedback loop with the robot: their vocalizations influenced the robot’s behaviour, which in turn encouraged participants to change their vocalizations. P2’s dog-like “hmmm” vocalizations caused the robot to jitter, surprising P2 and prompting a switch to “ooo ooo ooo” sounds.

When actively interacting with the robot, participants reported stronger emotional responses than the experimenters observed in the robot; as actors in the scene, participants were more connected than the director and observer.
This could be due to their close alignment with the robot while acting – an experimenter might see a twitch as a quirk of the system, but the participant might see it as evocative of emotional effort: “I did an ‘aaa’ and at the end of the syllable it did a flutter...it was just really nice, there were just things that I didn’t expect that expressed my emotion better than I thought it would” (P3).

Theme C1: Screen distracted, algorithm was unresponsive: Using a laptop to control algorithm parameters distracted participants from looking at the robot.

All participants had some trouble modifying Voodle parameters; the director needed to take over parameter control (with the participant’s direction) as they vocalized. Parameter manipulation was especially difficult in emotional transition tasks where multiple parameters needed to be adjusted over time.

In addition, participants reported that the smoothing algorithm, a simple low-pass filter, was unresponsive: “feels like there’s a compressor [audio filter restricting signal range]...limiting the amount of movement” (P2).

Changes for Session 2 – We implemented four changes for Session 2: replaced the web interface with a physical MIDI keyboard for parameter control; replaced the low-pass filter with a PD (proportion/damping) controller to improve responsiveness, with parameters named “speed” (P) and “springiness” (D); introduced a new “randomness” parameter to simulate the noise from the removed low-pass filter; and added a new mode to aid in making comparisons, “Wheel”: users could press a button on the MIDI keyboard to disable voice input and directly control the position of the robot using wheel control.

Voodle: Session 2 Format and Results

In the second session we juxtaposed voodling against a manual MIDI-wheel controller, based on a participant’s suggestion. Participants first did as many emotion tasks as possible in ∼15 mins, using voice for robot position control; then repeated these in Wheel mode.
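To make the Session 2 controller change concrete, the following is a minimal sketch of a PD-style position update with an added randomness term. It is an illustration under stated assumptions rather than Voodle’s actual implementation: the function names, the time step, the gain values, and the choice of uniform noise are all hypothetical, and “springiness” is treated here as the damping gain, following the (D) label in the text.

```python
import random


def pd_step(position, velocity, target, speed, springiness,
            randomness=0.0, dt=0.02, rng=random):
    """Advance the robot position by one time step.

    speed       -- proportional gain (P): pull towards the target
    springiness -- damping gain (D): resistance proportional to velocity
    randomness  -- scale of uniform noise added to the acceleration
    """
    error = target - position
    accel = speed * error - springiness * velocity
    accel += randomness * rng.uniform(-1.0, 1.0)
    velocity = velocity + accel * dt
    position = position + velocity * dt
    return position, velocity


def simulate(target, steps, speed=40.0, springiness=12.0, randomness=0.0):
    """Run the controller from rest at position 0 towards a fixed target."""
    position, velocity = 0.0, 0.0
    for _ in range(steps):
        position, velocity = pd_step(position, velocity, target,
                                     speed, springiness, randomness)
    return position, velocity
```

In this sketch the target would come from the voice features (or from the MIDI wheel in Wheel mode); with randomness at zero the position settles on the target, and raising it keeps the output moving even when the input is steady.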
Included in this session were observations of how a user’s relationship with the Bit matured as they became more familiar with both the robot and Voodle.

PL2: Participants learn, differ in skill: Unprompted, participants began with the same language they used in Session 1, then developed their language with experimentation. P3 continued to use primarily pitch control, as she did in the first session. P1 continued his “tss” sounds and blowing directly into the microphone, essentially a binary rate control: the robot was either expanding quickly or contracting quickly.

After some experimentation, P1 incorporated more pitch control, which afforded better control. P2 and P3 indicated increased expressivity on their affect grids in Session 2 (Figure 5.6), suggesting improvement of either ability or system.

The participants began to diverge in their ability to create nuanced behaviours, suggesting talent or training influenced their capabilities.

P1, with his breathing sounds, simply didn’t succeed in controlling the robot. P3 seemed to understand how to work with Voodle, creating subtle and expressive designs; she preferred vocal input, but also was adept with Wheel. P2 was between the other two, making extensive use of Wheel control, and playing it like a piano.

NF2: Agency from motion: Randomness and lack of precise control imbued the robot with agency. P1 claimed that, on the whole, randomness made the Bit feel more alive because it implies self-agency. When turning up the randomness, P3 exclaimed, “oh hey hi, I woke it up”. She explained: “The randomness meter...was always the first thing I moved I think...because it added another layer of emotion to it.” This lack of control connected to the sense of life within the robot: “[the Bit] was modeled to look like a living creature and that makes me feel like it should probably not completely obey what I want it to do.
There should be something unexpected” (P3).

Figure 5.6: Reported affect grids by participant and session. After being instructed about dimensions of arousal and valence, participants drew the robot’s expressive range directly on affect grids. Participants indicated increased expressivity from sessions 1 to 2, differences between voice and Wheel control, and that each robot had a different range. [Grid rows: P1, P2, P3; columns: Session 1, Session 2, Session 3. Axes: Unpleasant to Pleasant, Low energy to High energy. Annotations mark Voice and Wheel regions, “Wheel more precise here”, and “P3 intentionally sized circles to be bigger than Session 1”; Session 3 regions are labelled RibBit, FlappyBit, FlexiBit, BrightBit, VibroBit.]

Continuous motion can contribute to agency. All participants felt the robot should not be motionless in its ‘off’ state; it needed a default, like breathing. P2 further suggested that the robot’s ‘zero’ point be the middle of its range, to accommodate both contraction and expansion metaphors.

I2: Voice converses, MIDI instructs: Participants were more aligned with the robot when vocalizing.

For example, P3 expressed that manual wheel control allowed her to instruct the robot, whereas voice control allowed her to converse with the robot: “Voice feels like it’s more conversing than by wheel, I think it’s because by wheel I have a better idea of what’s going to happen...which makes me experiment a little less” (P3). Non-voice MIDI control gave a stronger sense of controlling the robot, diminishing agency: “[The wheel] felt more like playing an instrument” (P2). P2, the audio engineer, preferred using the MIDI wheel, while P3 preferred voice.
Both P1 and P3 indicated that the Wheel had more expressive capabilities with low-arousal, negative emotions (Figure 5.6).

C2: Visual parameter state: MIDI parameter control allowed participants to focus attention on the robot.

All participants continuously modified the Voodle parameters with the MIDI controller, compared to minimal modification with Session 1’s HTML controller. P2 suggested that sliders may be more effective than knobs, as they provide immediate visual feedback for range and current value. P3 also requested more visual feedback for parameter status, e.g., bar graphs.

Changes for Session 3 – We displayed parameter status on the laptop screen, and added 4 new robot forms to explore how form and actuation modality influence voodling (Figure 4.3).

Voodle: Session 3 Format and Results

The final session explored the effect of form factor on control style. Each participant ran through a subset of our emotion tasks with each of the new robots (Figure 4.3), given the option to use either voice or wheel control. They were also allotted free time to play with the new robot forms. We administered a closing questionnaire to capture their overall experience of the final version of Voodle.

PL3: Consistent language across robots: Despite wide variation in each robot’s expressive capability (Figure 5.6), participants continued to use their developed languages across robots. Examples include “tssss”, “ooo”, “aaa” (P1), “mmmm” (P2), and “oooh”, “ahhh” (P3). While language remained consistent across robots, preferred control mechanism did not.

P2 preferred vocal input only for FlappyBit as he engaged emotionally with it: he saw the flapper as a head. However, P2 used wheel control for the remaining Bit forms. P1 always started vocalizing as an experimentation technique with new Bit forms and then consistently moved to wheel input for fine-grained control.
P3 preferred voice for most robots, although she did indicate the RibBit responded more consistently to wheel input (unlike the other robots).

NF3: Shape, orientation create lasting stories: Robots did not just have varying expressive capability; they also inspired different stories. Participants reacted differently to each. For example, P3 saw VibroBit as a multi-dimensional, highly-controllable, lovable pet; P1 and P2 saw it as a unidimensional, completely uncontrollable, unlovable object. Different robot features changed the narrative context. While P2 thought FlappyBit’s flapper was a head, giving it expressivity, P3 thought the flapper was a cat’s tail. When FlappyBit was flipped over such that its flapper curled downwards, both P2 and P3 felt that it became only capable of expressing low-valence emotions. However, form factor did not completely change the story: in all sessions, P2 felt the robot was a dog, no matter which robot he was interacting with.

I3: Sweet spot of control; motion matters: P1 reported high control over BrightBit and low control over VibroBit, but rated both with a smaller expressive range than the other robots (Figure 5.6). This suggests a “sweet spot” of control when connecting emotionally with the robot: some control over behavior is good, but not too much.

P1 felt more connected to FlappyBit or FlexiBit. That said, all participants expressed a lack of emotional connection with BrightBit.
P3 thought that the lack of movement was the cause, while P2 did not feel like he conversed with BrightBit: “I kept visualizing it talking to me instead of me talking to it” (P2).

Changes for Robot Iterations – Session 3 resulted in several implications for future iterations on each robot: VibroBit had a limited expressive range; FlappyBit’s flapper looked like a head, which was easy to connect with, but metaphors would vary depending on orientation; FlexiBit had an ambiguous shape; BrightBit seemed unemotional.

Voodle: Likert and Pile-Sort Results

The Likert scale and pile sort tasks were primarily used as an elicitation device to stimulate discussion. Participant responses were consistent with other observations; we highlight a few examples.

The questionnaire measured how well Bit movements matched participants’ vocalizations/manual control; precision, nuance and fidelity of voice control; and alignment of Bit behaviour to the emotions participants felt as they performed.

Emotional connection with the RibBit increased by session. RibBit and CurlyBit performed much better than other CuddleBit forms on all metrics. Wheel and voice control offered similar degrees of quality on average. P1 and P2 reported that they felt more in control with the wheel, though P3 said that it made the Bit appear as less of a creature.

Participant perceptions of the CuddleBit as a social agent changed through repeated sessions, albeit in different ways. In the pile sort, P2 first placed RibBit between cat and robot, but post-Session 2, moved it to between human and cat. In contrast, P1 first sorted the RibBit between a category containing anthropomorphic elements and home companionship possessions, but later agreed it could fit in all of his categories (except one for food) if it was wearing fur.

5.3.4 Voodle: Discussion and Takeaways

Our initial goal was to create a dedicated design tool for affective robots. We observed something intangible and exciting about live vocal interaction.
We derived a more nuanced understanding of Voodle use, in that it seems to exist somewhere between robot puppetry and a conversation with a social agent.

In the following, we discuss insights into interaction and believability, and how Voodle can function as an interactive behaviour design tool within a performance context. We conclude with future directions, including insight into how Voodle might be embedded as a component of a larger behaviour control system.

5.3.5 Voodle: Insights into Believability and Interactivity

Through the co-design study, we found that believability was mediated by participants’ conception of the robot’s narrative context, their level of control, and their personal ways of using it.

Creating a context: Behaviour designs and alignment improved dramatically once participants found a metaphor or story. Context was determined by the confluence of form factor, robot ability and participant-robot relationship. For example, P1 could neither decide what VibroBit represented nor control it well, hence saw it as a failure; while P3 thought that it was cute and felt skillful when interacting with it.

Balancing control with a “spark of life”: Voodling created lifelike behaviours with a simple algorithm: deliberate randomness and noise produced a user-reactive system that still seemed to act of its own accord.

Varying randomness and user control made Voodle more like a conversation, or like a design tool.

Control increased alignment, like people sharing mannerisms in a conversation [33]; but with too much or too little, the system becomes mundane or frustrating, the magic gone. Voodle was a more emotionally immersive design experience than traditional editors.

Personalization: Users developed unique ways to use Voodle. Algorithm parameters could be varied to facilitate a metaphor, output device, or simply preference. Users modulated their vocal performance with these parameter settings much as guitarists use pedals to adjust tone, before or as they play.
Importantly, we observed that users tended to use similar “personal language” with varied robots, suggesting an individual stability across context.

Voodle: Vision for Behaviour Design Process

It is likely that producing affective robots will soon be like producing an animated film or video game. Indeed, steps towards this have begun (e.g., Cozmo [3, 36]). Here, it seemed that enabling artists and performers to directly interact with robots during design did facilitate the believability of the resulting behaviors, in that the designers who became aligned with their robot model seemed to be more satisfied with their behaviors than more detached observers.

Behavior design team: As reflected in the structure of the co-design study, a behaviour design session may involve a scripted scenario, a director, a designer, an actor, and the robot itself. Working together to bring out the best performance on the robot, an actor and director would read through a script as the designer takes notes on how to modify the robot’s body. Through an iterative design process [16, 40], both behaviours and robot form factors could be refined together (Theme NF3).

The actor could also leverage Voodle’s support to improve alignment with the robot. Like a puppeteer, the actor would be simultaneously controlling and acting with the robot. Although the interactive space in which the actor works will likely have to be multimodal (i.e., including a physical controller such as the MIDI keyboard), alignment through voice enables a deeper emotional connection with the robot itself (Theme I2).

Physically adjustable parameters: Voodle took a different approach from previous non-speech interfaces (e.g., the Vocal Joystick [10, 37]), which had a defined, learned control space. As we discovered in our pilot, voodling relied on a narrative context: a metaphor for how vocalizations should produce motion.
This could change from moment to moment: amplitude might be associated with the robot expanding, but if the robot was conceptualized as “flexing”, amplitude corresponded to downwards movement. When adding parameters, we found physically manipulable controls were easier to use when voodling, but they require visual indicators of their range and status.

One could imagine a kind of recording engineer in a behaviour design session who adapts motion control parameters on the fly (Theme C2).

How to Extend

This work produced initial requirements for a Voodle system, which is open-source and online. It also produced implications for future iconic speech interfaces.

Extending the sound-symbolic lexicon: Here we considered proportionally-mixed pitch and amplitude. Our pilots (Table 4.1) have already revealed other promising vocal features, such as -continuants (“dum dum”), +stridents (“shh” or “ch”), and distinguishing voiced consonants (“b” is voiced, “p” is not). A detailed phonetic analysis will highlight additional features and inform ways to adjust parameters automatically for specific vocal features. Some parameter ranges should be individually calibrated, e.g., pitch. While we identified examples of our performers’ languages, many more iconic mappings (features to robot position) are possible. These features could further be dynamically mapped to multiple degrees of freedom.

Design techniques: While Voodle was built as a design tool, in context, we found it was rarely used alone. Instead, Voodle could be part of an animation suite, letting users easily sketch naturalistic motion without a motion capture system. Input could be imported into an editing tool for refinement.
This might be especially viable in mobile contexts, to sketch an animation on the go, e.g., in a chat program. Iconic vocalizations have also been used to describe tactile sensations [77, 91], so Voodle may also be useful for end-user design of tactile feedback, to augment communication apps: a haptic version of SnapChat or Skype, with voice for haptic expression. We expect such uses will need to recognize additional linguistic features (like “sss” vs “rrr”); and Voodle must be more accessible to end-users who are not performers.

Vision for “Embedded Voodle”: Voodle has the potential to add life-like responsiveness to deployed interactive systems. Adding randomness to an ambient display increases perceived agency [8], but voodling could increase a sense that it is attending to the user, especially with directed speech (I2).

As a reactive system, voodling could be added to conventionally planned motion of virtual agents or robots, from a robot pet that reacts to ambient speech, to body language of an assistive robot arm (Figure 5.7). When a user explicitly tells the robot arm to “come here”, she might modulate its movement with a soft “whoa” (slow down) or urgent “WHOA” (stop).

[Figure 5.7 diagram labels: Voodle, No Voodle, Trajectory, Reactive Trajectory]
Figure 5.7: Vision for “Embedded Voodle”: Voodle could be a natural low-cost method to emotionally color a motion path in more complex robots.

5.4 Complexity: Valence and Narrative Frame

During CuddleBits Study 2, we found that certain measures of complexity seemed to be negatively correlated with valence. During all of Voodle, CuddleBits, and other investigations we found that the ‘story’ people told about the robot, e.g.
“the robot is like my cat” or “the robot is a combination of a dog and a squirrel”, heavily influenced their perception of the valence of the robot’s behaviours.

This study proposes to generate robot behaviours that vary in complexity and test the participants’ perception of displayed valence, with consideration for the story that people tell about each behaviour.

Note: in previous work, we have noticed that arousal and valence are not perfectly orthogonal. This means that arousal and valence are possibly dependent, such that an increase in arousal may imply an increase in valence. We have also seen that it is relatively easy to control the perceived arousal of the robot by increasing, for example, the range of the robot’s breathing. As such, we are targeting valenced behaviours only in this study, using a simple five-point scale from negatively valenced to positively valenced.

RQ 1. How does valence correlate with complexity? Hypothesis: valence will be negatively correlated with complexity, i.e., the more complex a behaviour is, the more negatively valenced it will be perceived.

RQ 2. How do participants rationalize the change in the robot’s behaviours (i.e., how does the ‘story’ change the perception of the robot)?

5.4.1 Complexity: Methods

Participants: We recruited 10 naïve participants, all compensated $15 per session.

Procedure: Participants were seated, introduced to a fur-covered FlexiBit, and asked to touch the robot to reduce novelty effects. They were then shown a sample behaviour and walked through the rating task and the story task. They were then asked to complete the rating tasks while being directed by a pre-recorded voice while an experimenter took notes.
Each task took roughly 2 minutes: the robot would breathe neutrally, a test behaviour would be displayed, the robot would return to neutral breathing, a pre-recorded voice would ask them to rate the behaviour (a replay option was available), and a pre-recorded voice would ask them to describe the robot’s behaviour out loud in a few short sentences. There were twenty trials in total; the first two trials were always the same and were thrown out to reduce novelty effects, and the next 18 behaviours were presented in random order. A short follow-up interview ended the session; each session took roughly 45 minutes.

Fifty-four behaviours were designed by hand, then scored and ranked for complexity using three complexity measures: variance, peak count of a spectrogram as generated by a fast Fourier transform (FFT), and MSE. Variance was calculated by the standard statistical method, i.e., E[(X − µ)²], and was included to compare against results from CuddleBits. FFT peak count was calculated by counting all maxima on a spectrogram within 2.6 standard deviations of the global maximum, and gives an estimate of the distribution of power across the possible frequencies of the signal. MSE was calculated as outlined in Related Work (2) and in Costa 2008 [26]; behaviours were then ranked by slope and shape of the produced MSE graph. Six behaviours were chosen from each complexity measure to represent an even spread across rankings, i.e., one from each of the 0−16, 17−32, 33−48, 49−64, 65−80, 81−100 percentiles for each of {Var, FFT, MSE}. This ensured a wide range of possible behaviours were shown to participants.

5.4.2 Complexity: Results

Here, we outline qualitative and quantitative results.
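Returning briefly to the variance and FFT peak count measures defined in the Methods, the sketch below shows one way they could be computed. It is an illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, a direct DFT stands in for an FFT library to keep the example dependency-free, and "within 2.6 standard deviations of the global maximum" is read here as "local maxima whose magnitude is no more than 2.6 standard deviations (of the spectrum magnitudes) below the largest magnitude".

```python
import cmath
import math
from statistics import pstdev


def variance(signal):
    """Population variance, E[(X - mu)^2]."""
    mu = sum(signal) / len(signal)
    return sum((x - mu) ** 2 for x in signal) / len(signal)


def magnitude_spectrum(signal):
    """Magnitudes of the non-negative-frequency DFT bins of the mean-removed signal.

    A direct O(n^2) DFT is used for self-containment; an FFT routine
    would produce the same magnitudes more quickly.
    """
    n = len(signal)
    mu = sum(signal) / n
    centred = [x - mu for x in signal]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(centred)))
        for k in range(n // 2 + 1)
    ]


def fft_peak_count(signal, n_std=2.6):
    """Count spectral local maxima near the global maximum (assumed reading)."""
    spectrum = magnitude_spectrum(signal)
    threshold = max(spectrum) - n_std * pstdev(spectrum)
    count = 0
    for i, value in enumerate(spectrum):
        left = spectrum[i - 1] if i > 0 else -math.inf
        right = spectrum[i + 1] if i < len(spectrum) - 1 else -math.inf
        # A peak is higher than its left neighbour, not lower than its right
        # neighbour, and at or above the threshold.
        if value > left and value >= right and value >= threshold:
            count += 1
    return count
```

Under this reading, a single sinusoid yields one peak and a sum of two equal-amplitude sinusoids at different frequencies yields two, so the count grows as power is spread over more distinct frequencies.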
Quantitative results outline the consistency of behaviour ratings in terms of valence; qualitative results discuss the impact of narrative frame on behaviour interpretation.

Complexity: Quantitative Consistency

Inter-rater reliability: This measures the extent to which different raters (participants) rated all behaviours similarly. For example, if all participants rated each behaviour the same, there would be perfect agreement. If half the participants rated all behaviours one way, and half the other way, there would be systematic disagreement. Since we are using ordinal scales, a reliability measure that accounts for the order of the scale items was used. For example, we would want to assign a closer agreement between a behaviour rated at both “-2” and “-1” than if it were rated at both “-2” and “+2”. Krippendorff’s alpha was therefore used to determine the inter-rater reliability, producing α = 0.07. This is typically interpreted as slight agreement, where α < 0 is systematic disagreement, α = 1 is perfect agreement, and α = 0 is no agreement. However, the observed agreement (p_a) is 0.80, and the agreement due to chance (p_e) is 0.79, so α may be suppressed in this case (α = (p_a − p_e)/(1 − p_e)).

Per-behaviour consistency: This measures the extent to which each behaviour was rated similarly by all raters (participants). For example, if every participant rated a behaviour differently, we would say it was inconsistent. Weighting ratings for an ordinal scale [45], observed agreement per behaviour (p_i) ranged from 0.64 to 0.92, with 4 behaviours <0.70, 6 behaviours between 0.70 and 0.85, and 8 behaviours above 0.85. As intuition, the variance of ratings and p_i are highly correlated; the lower the variance in rating per behaviour, the higher the p_i.

Correlation: Our goal was to explore how measures of complexity correlate with valence ratings of breathing behaviours (i.e., to see whether more complexity produced lower valence).
Since our per-behaviour ratings were reasonably consistent, we consider mean, median, and mode of the valence ratings per behaviour and correlate with our complexity measures. For the signal variance and peak count, we simply calculated the correlation between each summary statistic of valence ratings and the complexity measure of each signal; because MSE outputs a series of values per signal (i.e., one per time scale over which MSE performs coarse-graining), we correlate each summary statistic of valence ratings with the mean MSE rating, area under the curve of MSE ratings, and slope of the regression line of MSE ratings. Results are reported in Table 5.2.

Table 5.2: The correlation between summary statistics of valence ratings (columns) and complexity measures (rows). For example, the top left cell is the correlation of the mean valence rating per behaviour and the variance of the behaviour signal.

Valence/Complexity    mean     median   mode
Variance              -0.03    -0.15    -0.08
FFT peak count        -0.01    -0.12    -0.14
MSE rank              -0.45    -0.47    -0.37
MSE mean              -0.30    -0.08    -0.30
MSE AUC               -0.30    -0.27    -0.32
MSE slope             -0.30    -0.31    -0.43

Complexity: Narrative framing

At the end of each trial, participants were asked to describe the robot’s behaviour in a few short sentences. Individual interpretation of the robot’s character, motivations, and subjective state were highly varied, even within the same behaviour. Here, we present a series of themes developed by (1) comparing and contrasting responses per behaviour; (2) comparing and contrasting responses across behaviours; (3) drawing insights from individual responses.

The qualitative results paint a much more complex picture than the quantitative results. Participant understanding of the robot and its behaviours was highly individualized.
Although there were many behaviours for which at least some participants converged on the interpretation of the behaviour, the majority of the responses revealed that the final reported valence measurement was fragile¹ to a complex emotional reasoning scheme. Part of that emotional reasoning scheme I refer to as the narrative frame, i.e., the set of assumptions about the robot’s character, state of mind, backstory, and relationship to the environment through which a participant will interpret a behaviour. Here I outline three themes that attempt to characterize the complexity of the reasoning behind the participants’ emotional interpretations:

¹ Here, I use fragile to mean that the final outcome could vary widely across the range of possible responses depending on a few small changes to the emotional reasoning that was used. In contrast to uncertainty, fragility means that the participant could be very certain in what the final valence rating should be, but that conviction could change very easily given a small change in one of their assumptions about the robot.

Participants ascribe complex mental states to the robot:

The robot was understood to have thoughts and feelings that extend far beyond it feeling “good” or “bad”. Generally, participants gave a nuanced explanation for each behaviour, where they attributed both human- and animal-like motivations and emotions to the robot. For example,

“It feels like it was trying really hard to do something but it doesn’t seem to be able to do it.” (P7, B11)

“It feels calm, but also like he wants to start moving more, like wants to play or something.” (P5, B10)

Both quotes explain the current behaviour in terms of the robot wanting something that it is not currently able to do, which suggests that the robot not only has a current state of mind, but the ability to reason about the future. Similarly, participants describe the robot having hidden feelings that oppose the current action:

“I feel like the robot is angry and wants to punch something like that, but it’s not like he’s lost his mind, he’s still under control.” (P9, B4)

“...it kind of felt like it was sort of aggressive response, like someone trying to hold in their anger, or whatever it might be. But perhaps aggressive is not the best word because it didn’t seem like it was some sort of...it seemed more like frustration, that’s a better word for it, frustration.” (P3, B5)

If the robot is frustrated and trying to hold it in, it would have to have a sense of how its actions impact the people with whom it is interacting, i.e., it must have some kind of social understanding. This implies that participants not only act as if the robot has an internal mental state, but that the robot has its own internal models of their emotional states. This is not naively done; critically, participants are fully aware of the robot’s status as a robot but act as if it has these complex mental states regardless.

Narrative frame heavily influenced valence ratings:

Assumptions that participants made about the character, motivation, and situation of the robot heavily influenced whether they saw the behaviour as positively or negatively valenced. For example, both of these quotes describe the same behaviour:

“The shaking so fast like the robot is talking to his friend like so happy, so makes me feel like it’s a positive emotion.” (P9, B5)

“Breathing rate’s really fast, seems like it’s really anxious, or it’s running away.” (P2, B5)

In both cases, the story of the behaviour (whether it was running away, or talking to a friend) flips the participant’s understanding of the behaviour’s valence. Interestingly, both frame the behaviour as if the robot was not reacting to the narrative of the interaction within the lab, but some other, unseen narrative.

Even when all participants gave similar valence ratings, the emotional content of the rating differed greatly.
For example, participants who rated B7 as negative framed the interaction variously as “trouble breathing” (P2), “sobbing” (P3), “alarmed” (P4), “didn’t want to be kept here” (P5), “worried” (P6), “agitated”. Positive ratings included “restless, but in a good way” (P8), “working [a job]” (P7), “a cat playing with straw” (P1)².

² P9 rated the behaviour positively, but could not decide what the robot was doing.

Participants often framed the current behaviours relative to previous behaviours, often with short references to “this time” or “last time”. One participant even saw a throughline between multiple behaviours:

“OK, I believe that the last two trials and this trials is a whole story. Like, the first one should be like the robot gets sick at the beginning phase, and the second one is ‘the robot is so sick’, and this one, the robot is dying, but he’s still struggling so I feel like a little breaths in-between but almost like, in the most of time, he just doesn’t breathe at all. So maybe he’s dying. So that’s my guess.” (P9, B2)

This was part of building up the robot’s story over time. One participant made explicit reference to the robot’s back story, i.e., where it came from:

“It seemed like it was burrowing, burrowing into my arms I guess, but I imagine that that’s an instinctive behaviour it got from the wild.” (P4, B10)

Valence was difficult to determine in incongruous or neutral states:

The majority of responses indicated that it was easy for participants to find a story for the behaviour. However, if they could not determine a story, or if the actions seemed incongruous with the story they had already imagined, then it was difficult for them to rate the valence of the behaviour:

“It didn’t seem like an animal behaviour, breathing was a bit weird. I don’t know, it doesn’t seem like an actual animal behaviour that I recognize.
The breathing was just...yeah, it didn’t seem very real to me at all, so I wasn’t able to place a behaviour for that.” (P10, B14, valence = 0)

“I don’t know, I couldn’t tell what was going on. It seemed pretty random to me, and I couldn’t tell if it was an emotion.” (P1, B16, valence = 0)

In both of the above cases, other participants were able to produce a strong narrative for the behaviour:

“The behaviour reminded me of a dog gnawing his teeth in his sleep, so this, like, grinding his teeth, and doing, yeah, just, weird things in his sleep. So I gave it neutral.” (P8, B14, valence = 0)

“Compared to all of the other behaviours, this one, to me, definitely had more of a truly utter joy and satisfaction to it, I don’t know why, I can’t exactly put my finger on why I think that, just my natural gut instinct is that, just kind of an air of warmth and happiness and satisfaction just the way that the pulse was.” (P3, B16, valence = 2)

Often, even with a strong narrative, a neutral state made it difficult for participants to rate in terms of valence:

“I feel like the robot is arguing with somebody or something and his emotion gets more and more...like, I don’t know how to describe it, but he gets more excited while he’s doing this, so I think maybe he’s trying to convince somebody of something, but I don’t think it’s either positive or negative, so I choose neutral.” (P1, B11, valence = 0)

This quote illustrates a strong (if not well-articulated) sense of what the robot is doing, but not a strong sense of whether the robot is in a positive or negative emotional state. The participant describes the robot as “arguing” and “excited” but cannot determine a valence rating.

5.4.3 Complexity: Discussion, Takeaways, and Future Work

Complexity measures: Higher correlations between MSE measures and valence ratings relative to variance and FFT peak count suggest that MSE is a promising approach for quantifying valence.
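For readers unfamiliar with MSE, the following sketch shows the general shape of the calculation: the signal is coarse-grained at several time scales, sample entropy is computed at each scale, and the resulting curve is summarized by its mean, area under the curve, and regression slope, as in Table 5.2. This follows the commonly described form of multiscale entropy and is not the code used in the study; the function names are hypothetical, and the parameters (m = 2, tolerance of 0.15 times the standard deviation of the original signal, five scales) are assumed defaults rather than the settings used here.

```python
import math
from statistics import pstdev


def coarse_grain(series, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    return [sum(series[i:i + scale]) / scale
            for i in range(0, len(series) - scale + 1, scale)]


def sample_entropy(series, m, r):
    """SampEn = -ln(A / B).

    B counts pairs of length-m templates that match within tolerance r
    (Chebyshev distance); A counts the same for length m + 1.
    """
    n = len(series)

    def matches(length):
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(series[i + k] - series[j + k]) <= r
                       for k in range(length)):
                    count += 1
        return count

    a, b = matches(m + 1), matches(m)
    if a == 0 or b == 0:
        return math.inf  # too few matches for a finite estimate
    return -math.log(a / b)


def multiscale_entropy(series, max_scale=5, m=2, r_factor=0.15):
    """One sample-entropy value per scale; r is fixed from the original series."""
    r = r_factor * pstdev(series)
    return [sample_entropy(coarse_grain(series, s), m, r)
            for s in range(1, max_scale + 1)]


def summarize(mse_values):
    """Mean, trapezoidal area under the curve, and least-squares slope over scale."""
    n = len(mse_values)
    scales = range(1, n + 1)
    mean = sum(mse_values) / n
    auc = sum((mse_values[i] + mse_values[i + 1]) / 2 for i in range(n - 1))
    s_mean = sum(scales) / n
    slope = (sum((s - s_mean) * (v - mean) for s, v in zip(scales, mse_values))
             / sum((s - s_mean) ** 2 for s in scales))
    return mean, auc, slope
```

A perfectly regular signal gives a sample entropy of zero at the first scale, whereas irregular signals give larger values; short signals can leave too few template matches at coarse scales, which this sketch reports as infinity rather than hiding.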
Future work should include an MSE analysis on the behaviours from CuddleBits combined with other complexity measures such as time irreversibility.

Generating behaviours: A generative approach where MSE is used as a utility function might still be possible, but computationally efficient stochastic processes need to be explored to seed behaviour generation. Our attempt at generating via genetic algorithms using MSE as a utility function was short-lived due to the length of time generation was taking with purely random recombination, even with hand-generated behaviours as seeds. One could imagine using a stochastic process such as Brownian motion or Perlin noise to generate a complex seed behaviour (or portion thereof), then using known complexity-reducing functions to vary the complexity towards some MSE setpoint. However, further tests are still needed to determine how appropriate MSE measures would be for all low-valence behaviours.

Generating on the fly: An interactive robot needs to generate behaviours on the fly. It may be possible to mix behaviours to mix valence states, e.g., through dynamic time warping [23]. More work is needed to know whether a fully-interactive context would significantly change the interpretation of the behaviours.

Qualitative Analysis

Qualitative results present a complex and nuanced picture of robot behaviour interpretation. Although there were some behaviours that were unambiguously positively or negatively valenced, the diversity of stories the participants told about the robots belies a fragile system of interacting narrative elements that could radically shift the understanding of a behaviour.

A natural urge for a scientist is to argue that we need higher experimental control to properly determine valence rating; however, there are a number of problems with this premise.

First, the dimensions of control are ambiguous, since results indicate that we would want to control the participant’s narrative frame.
It is not clear whether we could, or would want to, provide a story for the robot, as the story is produced in conjunction with the participant’s own subjective experience. The laboratory setting provides its own set of expectations and narrative possibilities; the ambivalence of some interpretations might have been in response to a rating task that had low ecological validity. However, increasing the ecological validity by, e.g., moving out of a laboratory, would likely mean giving up control in many other dimensions.

Second, getting a clear rating of valence may take developing a relationship between the robot and the participant. This would necessitate creating an interactive environment where the robot’s actions could vary directly in response to the participant’s actions. Although such a situation would make procuring ratings difficult (when and how would participants rate a behaviour?), we may find that some consistent narrative frame emerges naturally out of continued interaction.

Third, it may be that a single-sample, single-point rating task is too reductive. Asking a participant to choose a single point on a dimensional model that represents an entire behaviour ignores any changes in emotion that the participant may have perceived during the behaviour. For example, imagine interacting with the robot for a five-minute period: there may be a lot of different emotions felt in that time, all with different intensities. Further, a single point on any dimensional model may not well represent an emotion state: it seems reasonable to be both “happy” and “sad” at the same time. For example, an emotion state may be better represented by a probability distribution across a set of dimensions, as that would account for simultaneous conflicting emotions. It may be that an emotion state should not converge to a single point until a decision is forced.

Future work should include tasks that account for these difficulties.
Accounting for the complexity of subjective measurements, the tasks should (a) attempt to produce a narrative frame through interaction; (b) have some ecological validity through creating a believable setting; (c) take place over a long enough period of time such that a relationship can develop between the robot and the participant; (d) use a rating scheme and/or emotion model that allows for simultaneous conflicting states.

5.5 Conclusions for Complexity, Voodle, and CuddleBits Behaviour Evaluation

Despite their simplicity, the CuddleBits are capable of evocative emotion display and interaction. In the above studies, we have demonstrated that participants can distinguish emotional behaviours along axes of valence and arousal³, create complex emotional stories for the robots³,⁴, and conceive of the robot as a social agent⁴,⁵. We have shown that it is possible to create emotional stories with the robot, and that interactors truly inter-act by becoming aligned with the robot⁴. We have further shown that the arousal and valence of a breathing-like behaviour can be determined according to some statistical measures of the behaviour, and have identified signal features that correlate with arousal and valence (i.e., complexity measures such as MSE for valence, and frequency/amplitude measures for arousal)³,⁵. However, we have also presented evidence that simple dimensional measures for determining an emotion state are not sufficient to capture the diversity and depth of participants’ subjective experience while interacting with the robot⁴. We draw the conclusion that, although arousal is relatively consistent, valence measures are especially fragile to differences in narrative frame.

³ CuddleBits.
⁴ Voodle.
⁵ Complexity.

Chapter 6
Conclusions and Ongoing Work

This thesis has established the potential for simple, 1-DOF robots to engage in complex emotional interactions.
If the work were to be boiled down to a single contribution, it would be that CuddleBits can be designed to be believable social agents, despite their ‘painful simplicity’. The strong caveat is that, although this sense of agency is easy to produce, it’s difficult to control, due to the complexity of the human subjective experience. There are some opportunities for generality, i.e., arousal display is correlated with frequency and amplitude of the behaviour waveform; valence display with complexity measures. However, valence display needs to account for the interactor creating a narrative frame with the robot, the production of which is tied to both the interactor’s experiences and the relationship they build while interacting with the robot. To reliably display valenced breathing behaviours, a designer would need to account for the impact of the narrative frame by creating some interactive context through which the robot’s behaviours could be expressed and reliably interpreted.

This chapter focuses on identifying current and future work that may support valenced interactions. Work discussed here is in various stages of development in close collaboration with other SPIN researchers¹.

¹ Especially Laura Cang and Mario Cimet.

6.1 Interactivity

In Voodle, participants continuously interacted with the robot to create emotional scenes. The robot seemed to have a much stronger ability to display emotions for those participating directly in the scene (the actor and director) than those who were emotionally removed from the scene (the observer). One interpretation of this finding could be that participants in the scene are actively suspending their disbelief, whereas the observer was not. Key to achieving this engaged emotion state is continuous interaction with the robot.

To date, the approach in the SPIN Lab has been to treat the robot’s emotion display (presented here) and the robot’s emotion recognition (presented in Cang 2016 [17]) as separable, even with cross-pollination between team members.
A holistic approach may be necessary to properly inspect emotion recognition for both the robot and its interactor.

Such an approach would contravene current attempts at single-sample emotion measurement, i.e., when we ask participants to summarize an emotional interaction with a single rating. Emotions may only be well-defined within the context of a continuous appraisal and reappraisal of the interaction at hand with regards to a person’s subjective frame (called appraisal theory [76]). A relational definition of emotions would require a different measurement schema than is currently used.

One approach could be to avoid any direct estimation of emotion, and instead determine the success of an emotional interaction through a goal-based approach. Imagine a study in which the goal was to find out how the robot likes to be touched. You might imagine the robot expressing dissatisfaction if it doesn’t like to be tickled, but satisfaction if it likes to be stroked. The participant would have to reason about the robot’s emotional state and determine arousal and valence continuously with direct feedback. This kind of an approach sidesteps the problems with emotion modeling by placing successful interactions as the valuation metric.

A similar idea would be to create robots with differentiable ‘personalities’. Given a long enough interaction period, one could determine, e.g., whether a robot prefers to be left alone, or prefers a particular proximity. A personality, in this case, would be equivalent to a mapping between some set of inputs and behaviours. If an interactor were able to compare and contrast the robots by their behaviours, it would imply that some valence interpretation was consistent and successful.

6.2 Developing a Therapeutic Robot

Current work in the SPIN Lab² is on designing an interactive talk therapy-like session centered on the robot. The premise is to tell the robot an emotional story while touching it.
The goal is to determine the emotional state of the user through their touch interactions, and, if possible, intervene through robot motion. For example, you might imagine that the robot becomes sympathetically agitated while a therapy client tells an emotionally-charged story; however, the act of calming the robot down helps them to calm themselves down.

We assume that what is necessary to this vision are (1) a system to recognize the interactor’s emotion; (2) a validated set of behaviours that can be manipulated to become gradually more agitated; and (3) a mapping between the recognized emotion and the output behaviour. However, the results from CuddleBits, Complexity, and Voodle suggest that an emotion recognition system may not have to determine valence as much as arousal, since the interactor may interpret the robot’s behaviour according to their own subjective frame. This could take the form of a simple mapping from some interactor’s input to some robot motion, such as in Voodle where vocal input was mapped to robot position. For example, the motion of an accelerometer could be mapped to displayed arousal. This approach leaves the valence interpretation as emergent from the context of the interaction.

6.3 Can We Display Valenced Behaviours?

It would be tempting to conclude from this thesis that we cannot consistently display valenced behaviours on 1-DOF robots: that they’re too dependent on the interactor’s narrative frame, which is too hard to control. I would consider this viewpoint too pessimistic. We had success in displaying consistently valenced behaviours, and found correlation between signal features and valence ratings. This suggests that there are aspects of a signal that can inspire a particular interpretation of valence, but that they are easily overpowered and/or accentuated by narrative framing.
Since some consistency was found (assuming no false positive), a better emotion model may also improve recognition and display relative to some specified narrative frame.

² See Cang 2016 [17].

Building a believable robot must then take narrative into account. A goal of an autonomous robotic system means that the social context and the robot’s situation therein must be determinable through interaction. A vision for such a system is one where the interactor can continuously evaluate their actions in relation to the robot’s (re)actions and build up a narrative frame over time. Results from Voodle and Complexity suggest that this is possible, since participants naturally try to create a homogeneous narrative for the robot. Once a behaviour is embedded in a consistent narrative context, it is likely that it will be interpreted more consistently in terms of valence.

To motivate, consider how one learns how to interact with animals. There are some shared behaviours that map to human expressions, such as the shape and movement of a dog’s eyebrows, or the height at which an animal holds its head. Many animal expressions do not have human analogues, such as a dog’s wagging tail, or a cat’s ears. Yet, we are able to tell what these movements mean. For some movements, we can determine valence by correlation. One can learn that a dog’s wagging tail means ‘happy’ by observing the excitement in their eyes, noting how eagerly they approach you, or by noticing that the wagging tail accompanies a friendly pat. By contrast, we learn that a cat’s wagging tail means ‘angry’ by watching their fur rise, their menacing growls, and by noticing that the wagging tail accompanies an unfriendly scratch. For other movements, we can determine valence by inference. If a cat leans into a stroke, or facilitates it by making itself easier to stroke, one can be sure that they like it.
In all cases, the context (a history with animals, a knowledge of what constitutes facilitation, repeated interactions) is what determines the valence of a particular action.

6.4 Lessons for Designing Emotional Robots

In this thesis, we have looked at the design process for simple, emotionally-expressive robots, and the generation and evaluation of their behaviours. The scope of behaviour evaluation was limited to breathing behaviours as a design challenge, and due to their emotional salience. However, we expect that many of the lessons learned here should apply to emotional machines of many configurations and capabilities.

Our findings about narrative context reinforce the understanding that humans interact emotionally with machines. Creating contexts in which machines can express emotion requires that designers facilitate the interactor seeing the machine as a social agent and creating a narrative context. Although this may not be desirable for every object (maybe we don’t want a social garbage can), the interactive objects that already act in our lives socially may be enhanced by emotional programming.

For example, our robot vacuum cleaners already work within the social space of our homes, and may be much more interesting and understandable if imbued with emotionally expressive behaviours. Imagine a robot vacuum cleaner that could seem happy about its work, or frustrated at being unable to reach a spot in the room, or scared about its battery running out. Our work suggests that it may be possible to convey some of this emotional complexity with only 1-DOF motion, as long as the narrative context was accounted for in the behaviour design. Using some of our findings, we might expect a happy robot to drive smoothly, a frustrated robot to drive erratically, and a scared robot to shake erratically.

Why would we want an emotional robot vacuum cleaner? I would argue that we already do have robot vacuum cleaners that work in our human emotional space, but they’re just not very good.
Well-thought-out emotional behaviours for a robot vacuum cleaner would be more than a cute feature; they would be critical to a good design. Humans act emotionally with their world; machines that communicate emotionally make transparent their purpose, motivations, and status. For a robot vacuum cleaner, this might mean that the robot’s owner can understand the robot at a deeper level and find it less frustrating to be around; for a collaborative robot working in industry, such emotional insight might save someone’s life. The work presented here suggests that that kind of emotional interaction is possible with just simple motion, if we can get the ‘story’ right.

Bibliography

[1] PhysioNet complexity and variability analysis. https://physionet.org/tutorials/cv/. Accessed: 2017-08-22. → pages 16
[2] J. Allen, L. Cang, M. Phan-Ba, A. Strang, and K. MacLean. Introducing the cuddlebot: A robot that responds to touch gestures. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction Extended Abstracts, pages 295–295. ACM, 2015. → pages 39
[3] Anki, 2017. https://anki.com/en-us/cozmo. → pages 76
[4] F. Arab, S. Paneels, M. Anastassova, S. Coeugnet, F. Le Morellec, A. Dommes, and A. Chevalier. Haptic patterns and older adults: To repeat or not to repeat? In 2015 IEEE World Haptics Conference (WHC), pages 248–253. IEEE, jun 2015. ISBN 978-1-4799-6624-0. doi:10.1109/WHC.2015.7177721. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=7177721. → pages 22
[5] J. N. Bailenson and N. Yee. Digital chameleons: automatic assimilation of nonverbal gestures in immersive virtual environments. Psychological science, 16(10):814–819, 2005. → pages 21
[6] R. Banse and K. R. Scherer. Acoustic profiles in vocal emotion expression. Journal of personality and social psychology, 70(3):614, 1996. → pages 22
[7] J. Bates et al. The role of emotion in believable agents. Communications of the ACM, 37(7):122–125, 1994. → pages 24, 40
[8] A. Beck, A. Hiolle, and L.
Canamero. Using perlin noise to generate emotional expressions in a robot. In Proceedings of annual meeting of the cognitive science society (Cog Sci 2013), pages 1845–1850, 2013. → pages 20, 78
[9] H. R. Bernard. Research methods in anthropology: Qualitative and quantitative approaches. Rowman Altamira, 2011. → pages 66
[10] J. A. Bilmes, X. Li, J. Malkin, K. Kilanski, R. Wright, K. Kirchhoff, A. Subramanya, S. Harada, J. A. Landay, P. Dowden, and H. Chizeck. The Vocal Joystick: A Voice-based Human-computer Interface for Individuals with Motor Impairments. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT ’05, pages 995–1002, Stroudsburg, PA, USA, 2005. Association for Computational Linguistics. doi:10.3115/1220575.1220700. URL http://dx.doi.org/10.3115/1220575.1220700. → pages 22, 77
[11] S. Bloch, M. Lemeignan, and N. Aguilera-T. Specific respiratory patterns distinguish among human basic emotions. Intl Journal of Psychophysiology, 11(2):141–154, 1991. → pages 13, 18
[12] F. A. Boiten, N. H. Frijda, and C. J. Wientjes. Emotions and respiratory patterns: review and critical analysis. International Journal of Psychophysiology, 17(2):103–128, 1994. → pages 13
[13] C. Breazeal and L. Aryananda. Recognition of Affective Communicative Intent in Robot-Directed Speech. Autonomous Robots, 12(1):83–104, 2002. ISSN 1573-7527. doi:10.1023/A:1013215010749. URL http://dx.doi.org/10.1023/A:1013215010749. → pages 22
[14] L. Brunet, C. Megard, S. Paneels, G. Changeon, J. Lozada, M. P. Daniel, and F. Darses. “Invitation to the voyage”: The design of tactile metaphors to fulfill occasional travelers’ needs in transportation networks. In 2013 World Haptics Conference (WHC), pages 259–264. IEEE, apr 2013. ISBN 978-1-4799-0088-6. doi:10.1109/WHC.2013.6548418. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6548418. → pages 22
[15] P. Bucci, X. L. Cang, M. Chun, D. Marino, O. Schneider, H. Seifi, and K. MacLean.
CuddleBits: an iterative prototyping platform for complex haptic display. In EuroHaptics ’16 Demos, 2016. → pages 13
[16] P. Bucci, L. Cang, S. Valair, D. Marino, L. Tseng, M. Jung, J. Rantala, O. Schneider, and K. MacLean. Sketching cuddlebits: Coupled prototyping of body and behaviour for an affective robot pet. To appear in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI) 2017, 2017. → pages iv, 49, 77
[17] L. Cang. Towards an emotionally communicative robot: feature analysis for multimodal support of affective touch recognition. Master’s thesis, University of British Columbia, 2016. → pages 26, 91, 92
[18] L. Cang, P. Bucci, and K. E. MacLean. Cuddlebits: Friendly, low-cost furballs that respond to touch. In Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, pages 365–366. ACM, 2015. doi:10.1145/2818346.2823293. → pages 13
[19] X. L. Cang, P. Bucci, A. Strang, J. Allen, K. MacLean, and H. Liu. Different strokes and different folks: Economical dynamic surface sensing and affect-related touch recognition. In Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, pages 147–154. ACM, 2015. → pages 25
[20] N. Chomsky and M. Halle. The sound pattern of English. New York: Harper and Row, 1968. → pages 46
[21] H. Choset, K. M. Lynch, S. Hutchinson, G. Kantor, W. Burgard, L. E. Kavraki, and S. Thrun. Principles of Robot Motion: Theory, Algorithms, and Implementations. The MIT Press, 2005. → pages 20
[22] M. Chung, E. Rombokas, Q. An, Y. Matsuoka, and J. Bilmes. Continuous vocalization control of a full-scale assistive robot. In 2012 4th IEEE RAS EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob), pages 1464–1469, jun 2012. doi:10.1109/BioRob.2012.6290664. → pages 22
[23] B. Clark, O. S. Schneider, K. E. MacLean, and H. Z. Tan. Predictable and distinguishable morphing of vibrotactile rhythm. → pages 87
[24] J. Corbin and A. Strauss.
Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. Sage Publications, Inc., 3 edition, 2008. → pages 67
[25] M. Costa, A. L. Goldberger, and C.-K. Peng. Multiscale entropy analysis of complex physiologic time series. Physical review letters, 89(6):068102, 2002. → pages 17
[26] M. D. Costa, C.-K. Peng, and A. L. Goldberger. Multiscale analysis of heart rate dynamics: entropy and time irreversibility measures. Cardiovascular Engineering, 8(2):88–93, 2008. → pages 17, 80
[27] J. Q. Dawson, O. S. Schneider, J. Ferstay, D. Toker, J. Link, S. Haddad, and K. MacLean. It’s alive!: exploring the design space of a gesturing phone. In Proc of Graphics Interface 2013, pages 205–212. Canadian Information Processing Society, 2013. → pages 54
[28] C. M. de Melo, P. Kenny, and J. Gratch. Real-time expression of affect through respiration. Computer Animation and Virtual Worlds, 21(3-4):225–234, 2010. → pages 13, 18
[29] F. De Saussure, W. Baskin, and P. Meisel. Course in general linguistics. Columbia University Press, 2011. → pages 21
[30] L. Feldman Barrett and J. A. Russell. Independence and bipolarity in the structure of current affect. Journal of personality and social psychology, 74(4):967, 1998. → pages 15
[31] T. Fong, I. Nourbakhsh, and K. Dautenhahn. A survey of socially interactive robots. Robotics and Autonomous Systems, 42(3):143–166, 2003. → pages 12
[32] J. Forsslund, M. Yip, and E.-L. Sallnäs. Woodenhaptics: A starting kit for crafting force-reflecting spatial haptic devices. In Proceedings of the Ninth International Conference on Tangible, Embedded, and Embodied Interaction, pages 133–140. ACM, 2015. → pages 19
[33] S. Garrod and M. J. Pickering. Why is conversation so easy? Trends in cognitive sciences, 8(1):8–11, 2004. → pages 21, 76
[34] M. Goto, K. Kitayama, K. Itou, and T. Kobayashi. Speech Spotter: On-demand Speech Recognition in Human-Human Conversation on the Telephone or in Face-to-Face Situations. In Proc. ICSLP’04, pages 1533–1536, 2004.
→ pages 22
[35] J. Gratch, A. Okhmatovskaia, F. Lamothe, S. Marsella, M. Morales, R. J. van der Werf, and L.-P. Morency. Virtual rapport. In International Workshop on Intelligent Virtual Agents, pages 14–27. Springer, 2006. → pages 21
[36] J. Gray, G. Hoffman, S. O. Adalgeirsson, M. Berlin, and C. Breazeal. Expressive, interactive robots: Tools, techniques, and insights based on collaborations. In Human Robot Interaction (HRI) 2010 Workshop: What do collaborations with the arts have to say about HRI, 2010. → pages 20, 76
[37] S. Harada, J. A. Landay, J. Malkin, X. Li, and J. A. Bilmes. The Vocal Joystick: Evaluation of Voice-based Cursor Control Techniques. In Proceedings of the 8th International ACM SIGACCESS Conference on Computers and Accessibility, Assets ’06, pages 197–204, New York, NY, USA, 2006. ACM. ISBN 1-59593-290-9. doi:10.1145/1168987.1169021. URL http://doi.acm.org/10.1145/1168987.1169021. → pages 77
[38] G. Hoffman. On stage: robots as performers. In RSS 2011 Workshop on Human-Robot Interaction: Perspectives and Contributions to Robotics from the Human Sciences. Los Angeles, CA, volume 1, 2011. → pages 21
[39] G. Hoffman and W. Ju. Designing robots with movement in mind. Journal of Human-Robot Interaction, 3(1):89–122, 2014. → pages 20
[40] G. Hoffman and W. Ju. Designing robots with movement in mind. Journal of Human-Robot Interaction, 3(1):89–122, 2014. → pages 19, 77
[41] B. House, J. Malkin, and J. Bilmes. The VoiceBot: A Voice Controlled Robot Arm. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’09, pages 183–192, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-246-7. doi:10.1145/1518701.1518731. URL http://doi.acm.org/10.1145/1518701.1518731. → pages 22
[42] T. Igarashi and J. F. Hughes. Voice As Sound: Using Non-verbal Voice Input for Interactive Control. In Proceedings of the 14th Annual ACM Symposium on User Interface Software and Technology, UIST ’01, pages 155–156, New York, NY, USA, 2001. ACM.
ISBN 1-58113-438-X. doi:10.1145/502348.502372. URL http://doi.acm.org/10.1145/502348.502372. → pages 22
[43] Johnny-Five, 2017. http://johnny-five.io. → pages 47
[44] K. J. Kitto. Modelling and generating complex emergent behaviour. Flinders University, School of Chemistry, Physics and Earth Sciences., 2006. → pages 16
[45] K. Krippendorff. Computing Krippendorff’s alpha reliability. Departmental papers (ASC), page 43, 2007. → pages 81
[46] J. L. Lakin, V. E. Jefferis, C. M. Cheng, and T. L. Chartrand. The chameleon effect as social glue: Evidence for the evolutionary significance of nonconscious mimicry. Journal of nonverbal behavior, 27(3):145–162, 2003. → pages 21
[47] G. Lakoff and M. Johnson. Metaphors we live by. 2003. → pages 70
[48] Y.-K. Lim, E. Stolterman, and J. Tenenberg. The anatomy of prototypes: Prototypes as filters, prototypes as manifestations of design ideas. ACM TOCHI, 15(2):7, 2008. → pages 19
[49] R. Maatman, J. Gratch, and S. Marsella. Natural behavior of a listening agent. In International Workshop on Intelligent Virtual Agents, pages 25–36. Springer, 2005. → pages 21
[50] M. M. Marin and H. Leder. Examining complexity across domains: relating subjective and objective measures of affective environmental scenes, paintings and music. PloS One, 8(8):e72412, 2013. → pages 16
[51] D. Marino, P. Bucci, O. S. Schneider, and K. E. MacLean. Voodle: Vocal doodling to sketch affective robot motion. In Proceedings of the 2017 Conference on Designing Interactive Systems, pages 753–765. ACM, 2017. → pages iv
[52] L. Mathiassen, T. Seewaldt, and J. Stage. Prototyping and specifying: principles and practices of a mixed approach. Scandinavian Journal of Information Systems, 7(1):4, 1995. → pages 19
[53] D. H. McFarland. Respiratory markers of conversational interaction. Journal of Speech, Language, and Hearing Research, 44(1):128–143, 2001. → pages 21
[54] A. Moon, C. A. Parker, E. A. Croft, and H. M. Van der Loos.
Did you see it hesitate?–empirically grounded design of hesitation trajectories for collaborative robots. In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1994–1999. IEEE, 2011. → pages 20
[55] C. Moussette. Sketching in Hardware and Building Interaction Design: tools, toolkits and an attitude for Interaction Designers. In Processing, 2010. URL https://public.me.com/intuitive. → pages 19
[56] C. Moussette. Simple haptics: Sketching perspectives for the design of haptic interactions. Umeå Universitet, 2012. → pages 19
[57] C. Moussette, S. Kuenen, and A. Israr. Designing haptics. In Proceedings of the Sixth International Conference on Tangible, Embedded and Embodied Interaction - TEI ’12, page 351, New York, New York, USA, feb 2012. ACM Press. ISBN 9781450311748. doi:10.1145/2148131.2148215. URL http://dl.acm.org/citation.cfm?id=2148131.2148215. → pages 7
[58] S. Oliver and M. Karon. Haptic jazz: Collaborative touch with the haptic instrument. In IEEE Haptics Symposium, 2014. → pages 20
[59] J. S. Pardo. On phonetic convergence during conversational interaction. The Journal of the Acoustical Society of America, 119(4):2382–2393, 2006. → pages 21
[60] J. Pearson, J. Hu, H. P. Branigan, M. J. Pickering, and C. I. Nass. Adaptive language behavior in HCI: how expectations and beliefs about a system affect users’ word choice. In Proceedings of the SIGCHI conference on Human Factors in computing systems, pages 1177–1180. ACM, 2006. → pages 21
[61] C. Peirce. The philosophical writings of Peirce, pages 98–119. New York: Dover, 1955. → pages 22
[62] M. Perlman and A. A. Cain. Iconicity in vocalization, comparisons with gesture, and implications for theories on the evolution of language. Gesture, 14(3):320–350, 2014. → pages 22
[63] P. Perniss and G. Vigliocco. The bridge of iconicity: from a world of experience to the experience of language. Phil. Trans. R. Soc. B, 369(1651):20130300, 2014. → pages 22
[64] P. Rainville, A. Bechara, N. Naqvi, and A. R. Damasio.
Basic emotions are associated with distinct patterns of cardiorespiratory activity. International journal of psychophysiology, 61(1):5–18, 2006. → pages 13
[65] J. Rantala, K. Salminen, R. Raisamo, and V. Surakka. Touch gestures in communicating emotional intention via vibrotactile stimulation. Intl J Human-Computer Studies, 71(6):679–690, 2013. → pages 54
[66] React, 2017. https://facebook.github.io/react. → pages 47
[67] T. Ribeiro and A. Paiva. The illusion of robotic life: principles and practices of animation for robots. In Proceedings of the seventh annual ACM/IEEE international conference on Human-Robot Interaction, pages 383–390. ACM, 2012. → pages 20
[68] J. S. Richman and J. R. Moorman. Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6):H2039–H2049, 2000. → pages 17
[69] R. Rose, M. Scheutz, and P. Schermerhorn. Towards a conceptual and methodological framework for determining robot believability. Interaction Studies, 11(2):314–335, 2010. → pages 23
[70] R. Rummer, J. Schweppe, R. Schlegelmilch, and M. Grice. Mood is linked to vowel type: The role of articulatory movements. Emotion, 14(2):246, 2014. → pages 22
[71] J. A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, 39(6):1161, 1980. → pages 15
[72] J. A. Russell, A. Weiss, and G. A. Mendelsohn. Affect grid: a single-item scale of pleasure and arousal. Journal of personality and social psychology, 57(3):493, 1989. → pages 15, 52
[73] G. W. Ryan and H. R. Bernard. Techniques to Identify Themes. Field Methods, 15(1):85–109, feb 2003. ISSN 00000000. doi:10.1177/1525822X02239569. URL http://fmx.sagepub.com/cgi/doi/10.1177/1525822X02239569. → pages 67
[74] D. Sakamoto, T. Komatsu, and T. Igarashi. Voice Augmented Manipulation: Using Paralinguistic Information to Manipulate Mobile Devices.
In Proceedings of the 15th International Conference on Human-computer Interaction with Mobile Devices and Services, MobileHCI ’13, pages 69–78, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2273-7. doi:10.1145/2493190.2493244. URL http://doi.acm.org/10.1145/2493190.2493244. → pages 22
[75] J. Saldien, K. Goris, S. Yilmazyildiz, W. Verhelst, and D. Lefeber. On the design of the huggable robot probo. Journal of Physical Agents, 2(2):3–12, 2008. → pages 20
[76] K. R. Scherer. Appraisal theory. 1999. → pages 91
[77] O. S. Schneider and K. E. MacLean. Improvising design with a haptic instrument. In 2014 IEEE Haptics Symposium (HAPTICS), pages 327–332. IEEE, 2014. → pages 22, 78
[78] O. S. Schneider and K. E. MacLean. Studying Design Process and Example Use with Macaron, a Web-based Vibrotactile Effect Editor. In HAPTICS ’16: Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, 2016. → pages 48
[79] O. S. Schneider and K. E. MacLean. Studying design process and example use with macaron, a web-based vibrotactile effect editor. In 2016 IEEE Haptics Symposium (HAPTICS), pages 52–58. IEEE, 2016. → pages 8, 20, 39
[80] O. S. Schneider, H. Seifi, S. Kashani, M. Chun, and K. E. MacLean. HapTurk: Crowdsourcing Affective Ratings for Vibrotactile Icons. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI) ’16, pages 3248–3260, New York, New York, USA, may 2016. ACM Press. ISBN 9781450333627. doi:10.1145/2858036.2858279. URL http://dl.acm.org/citation.cfm?id=2858036.2858279. → pages 43
[81] Y. S. Sefidgar, K. E. MacLean, S. Yohanan, H. M. Van der Loos, E. A. Croft, and E. J. Garland. Design and evaluation of a touch-centered calming interaction with a social robot. IEEE Transactions on Affective Computing, 7(2):108–121, 2016. → pages 13
[82] H. Seifi, K. Zhang, and K. E. MacLean. Vibviz: Organizing, visualizing and navigating vibration libraries. In World Haptics Conference (WHC), 2015 IEEE, pages 254–259. IEEE, 2015. → pages 43
[83] M. Shaver and K.
Maclean. The twiddler: A haptic teaching tool for low-cost communication and mechanical design. Master’s thesis, 2003. → pages 39
[84] H. Shintel, H. C. Nusbaum, and A. Okrent. Analog acoustic expression in speech communication. Journal of Memory and Language, 55(2):167–177, 2006. → pages 22
[85] J. Six, O. Cornelis, and M. Leman. TarsosDSP, a Real-Time Audio Processing Framework in Java. In Proceedings of the 53rd AES Conference (AES 53rd), 2014. → pages 47
[86] A. Strauss and J. Corbin. Basics of qualitative research: Techniques and procedures for developing grounded theory. Sage Publications, Inc, 1998. → pages 65
[87] L. Takayama, D. Dooley, and W. Ju. Expressing thought: improving robot readability with animation principles. In Proceedings of the 6th international conference on Human-robot interaction, pages 69–76. ACM, 2011. → pages 20
[88] K. Wada and T. Shibata. Living with seal robots - its sociopsychological and physiological influences on the elderly at a care house. IEEE Transactions on Robotics, 23(5):972–980, 2007. → pages 12
[89] K. Wada, T. Shibata, T. Asada, and T. Musha. Robot therapy for prevention of dementia at home. Journal of Robotics and Mechatronics, 19(6):691, 2007. → pages 12
[90] R. M. Warner. Coordinated cycles in behavior and physiology during face-to-face social interactions. Sage Publications, Inc, 1996. → pages 21
[91] J. Watanabe and M. Sakamoto. Comparison between onomatopoeias and adjectives for evaluating tactile sensations. In Proceedings of the 6th International Conference of Soft Computing and Intelligent Systems and the 13th International Symposium on Advanced Intelligent Systems (SCIS-ISIS 2012), pages 2346–2348, 2012. → pages 22, 78
[92] D. Watson, L. A. Clark, and A. Tellegen. Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54(6):1063–1070, 1988. doi:10.1037/0022-3514.54.6.1063. URL http://doi.apa.org/getdoi.cfm?doi=10.1037/0022-3514.54.6.1063. → pages 15
[93] D.
Watson, L. A. Clark, and A. Tellegen. Development and validation of brief measures of positive and negative affect: the PANAS scales. Journal of personality and social psychology, 54(6):1063, 1988. → pages 47
[94] A. C. Weidman, C. M. Steckler, and J. L. Tracy. The jingle and jangle of emotion assessment: Imprecise measurement, casual scale usage, and conceptual fuzziness in emotion research. 2016. → pages 14
[95] S. Yohanan and K. MacLean. A tool to study affective touch. In CHI ’09 Extended Abstracts, pages 4153–4158. ACM, 2009. → pages 12
[96] S. Yohanan and K. E. MacLean. Design and assessment of the haptic creature’s affect display. In Proceedings of the 6th international conference on Human-robot interaction, pages 473–480. ACM, 2011. → pages 12
[97] S. Yohanan and K. E. MacLean. The role of affective touch in human-robot interaction: Human intent and expectations in touching the haptic creature. Intl J Social Robotics, 4(2):163–180, 2012. → pages 54

Appendix A
Supporting Materials

Figure A.1: RibBit assembly instructions, page 1.
Figure A.2: RibBit assembly instructions, page 2.
Figure A.3: RibBit assembly instructions, page 3.
Figure A.4: RibBit assembly instructions, page 2.
Figure A.5: RibBit assembly instructions, page 5.
Figure A.6: RibBit assembly instructions, page 6.
Figure A.7: RibBit assembly instructions, page 7.
Figure A.8: RibBit assembly instructions, page 8.
Figure A.9: RibBit assembly instructions, page 9.
Figure A.10: RibBit design system explainer, page 1.
Figure A.11: RibBit design system explainer, page 2.
Figure A.12: RibBit design system explainer, page 3.
Figure A.13: Lasercutting files for the RibBit.
Figure A.14: FlexiBit assembly instructions, page 1.
Figure A.15: FlexiBit assembly instructions, page 2.
Figure A.16: FlexiBit assembly instructions, page 3.
Figure A.17: FlexiBit assembly instructions, page 4.
Figure A.18: FlexiBit design system explainer, page 1.
Figure A.19: FlexiBit design system explainer,
page 2.
Figure A.20: FlexiBit design system explainer, page 3.
Figure A.21: FlexiBit design system explainer, page 4.
Figure A.22: FlexiBit design system explainer, page 5.
Figure A.23: FlexiBit design files to be cut out.

