UBC Theses and Dissertations

Design of a casual video authoring interface based on navigation behaviour Fong, Matthew 2014

Design of a Casual Video Authoring Interface based on Navigation Behaviour

by

Matthew Fong

B.A.Sc., University of British Columbia, 2011

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF APPLIED SCIENCE

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Electrical and Computer Engineering)

The University Of British Columbia
(Vancouver)

September 2014

© Matthew Fong, 2014

Abstract

We propose the use of a personal video navigation history, which records a user’s viewing behaviour, as a basis for casual video editing and sharing. Our novel interaction supports users’ navigation of previously-viewed intervals to construct new videos via simple playlists. The intervals in the history can be individually previewed and searched, filtered to identify frequently-viewed sections, and added to a playlist from which they can be refined and re-ordered to create new videos. Interval selection and playlist creation using a history-based interaction is compared to a more conventional filmstrip-based technique. We performed several user studies to evaluate the usability and performance of this method and found significant results indicating improvement in video interval search and selection.

Preface

All the implementation and experiments henceforth were conducted by myself. Concepts and design decisions were discussed among myself, Abir Al Hajri, Gregor Miller and Sidney Fels.

A version of Chapter 4 was published as Fong, M, Al-Hajri, A, Miller, G, Fels, S (2014) at Graphics Interface 2014.

The screenshots from Figures 1.1, 2.1, 3.1, 3.2, 3.4, 3.5, 3.6, 3.7, 3.8 are © copyright 2008, Blender Foundation.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
  1.1 Research Question
  1.2 Contributions
  1.3 Publications
2 Related Work
  2.1 Video Browsing
  2.2 Using Video Watching History
  2.3 Video Authoring
  2.4 User Action History
  2.5 Summary and Influence on Design
3 General Interface Design for History Based Editing
  3.1 Overview
  3.2 Design Language
  3.3 Thumbnail
  3.4 Filmstrip
  3.5 Navigation History View
  3.6 Playlist
4 Comparison of History and Filmstrip Selection
  4.1 Experiment
    4.1.1 Hypothesis
    4.1.2 Apparatus
    4.1.3 Participants
    4.1.4 Design and Procedure
  4.2 Results and Discussion
  4.3 Summary
5 Comparison of Implicit History and Explicit Favourites
  5.1 Experiment
    5.1.1 Hypothesis
    5.1.2 Apparatus
    5.1.3 Participants
    5.1.4 Design and Procedure
    5.1.5 Videos Content
  5.2 Results and Discussion
  5.3 Summary
6 Discussion and Conclusions
  6.1 Discussion
  6.2 Future Work
Bibliography

List of Tables

Table 4.1 Questionnaire results for History vs Filmstrip Clip Selection
Table 5.1 Questionnaire results for History vs Favourites Clip Selection
Table 5.2 Results for use of Favourites
Table 5.3 Results for use of History and Favourites when given both methods

List of Figures

Figure 1.1 Context menu from a YouTube video.
Figure 2.1 YouTube’s video player
Figure 2.2 Dynamic thumbnail, by Girgensohn et al.
Figure 2.3 ZoomSlider, by Hürst et al.
Figure 2.4 ColorBrowser by Barbieri et al.
Figure 2.5 Trailblazer by Kimber et al.
Figure 2.6 User footprints bar by Mertens et al.
Figure 2.7 Edit While Watching by Weda et al.
Figure 2.8 History found in Adobe Photoshop
Figure 2.9 Chimera by Kurlander and Feiner.
Figure 2.10 Chronicle, by Grossman et al.
Figure 2.11 Chronicle Timeline, by Grossman et al.
Figure 2.12 Annotated history, by Nakamura and Igarashi
Figure 2.13 WebVCR by Anupam et al.
Figure 2.14 History Views of ActionShot by Li et al.
Figure 2.15 Facebook’s vertical event timeline
Figure 2.16 Twitter’s vertical event timeline
Figure 3.1 Overview of the interface
Figure 3.2 Video Interval Thumbnail
Figure 3.3 Overview of filmstrip
Figure 3.4 Overview of history
Figure 3.5 Re-watch filter in history.
Figure 3.6 History Filtering
Figure 3.7 Overview of the playlist
Figure 3.8 Preview thumbnail in the Playlist
Figure 4.1 Experimental Design: History and Filmstrip
Figure 4.2 Descriptive Statistics for History vs Filmstrip Interval selection

Acknowledgments

I’d like to thank my supervisor Sidney Fels for overseeing this work, as well as Gregor Miller and Abir Al Hajri, who worked with me on much of this work.

Of course, I also thank my loving family: my mom, Rosamaria, my dad, Ken, and my sister, Heather, who encouraged me throughout my schooling years.

And of course, my loving girlfriend, Crystal Lim, always positive, and always encouraging.

Chapter 1
Introduction

Video, as we know it today, is a widespread and popular medium for news, entertainment, as well as instructional content. It has become popular as a form of entertainment, especially on the Internet. In casual videos, such as the amateur videos found on YouTube™, users watch over 6 billion hours of video every day, and these hours are not limited to desktop computer users either, as mobile devices account for almost 40% of YouTube’s traffic¹. Casual video sharing and watching video on YouTube and other online video repositories has become commonplace, and watching user-generated content has become popular. As user-generated content is generally unfiltered, there is a lack of quality control, resulting in longer videos with uninteresting segments. It has been shown, in studies like Vihavainen et al. [40], that content creators are unlikely to edit videos unless given significant incentive. Thus, there is a need for a method by which video watchers can create video to share with others.

The average duration of a video on YouTube is 4.4 minutes². The majority of viewers on YouTube spend less than five minutes watching each video, and the amount of time spent on this activity can vary greatly. The level of commitment to this activity is low, so it can be defined as a casual activity.
A less casual form of video viewing would be watching feature-length movies, where the commitment to watching the entire movie is much higher. The idea of casual video viewing can be extended to video creation. Casual video authoring is analogous to searching for video to share with others after having viewed a short video. The current methods for creating and sharing video clips are insufficient for the casual use case.

¹ http://www.youtube.com/yt/press/statistics.html
² http://www.comscore.com/Insights/Press-Releases/2014/2/comScore-Releases-January-2014-US-Online-Video-Rankings

Video sharing in its current state requires users to search and remember the locations of interesting parts of video. To create video, users need to first find the appropriate content. In the context of sharing already-seen video, the video viewing history is the logical place to begin. The use of history and footprints, specifically in Internet browsing sessions, has aided users in finding previously seen information. Online video repositories such as YouTube allow users to view their previously seen videos, allowing users to narrow down their search for a specific video. This history mechanism can be extended to support within-video navigation, allowing users to search for video clips. Searching for video clips within videos can be a difficult task, and becomes more difficult with lengthier videos. The video watching history that we propose allows users to mark and save clips of video simply by re-watching them. These video clips can be filtered so that users can find which video clips they have watched more than once, and can be favourited as a more complete form of organization. Sharing these video clips is then accomplished by dragging clips across to the playlist.
This method for authoring is aimed at casual video watchers, and the process for creating video is easier than with past methods for video authoring.

The processes involved in video production have evolved throughout the years as technologies incrementally improved. The first consumer-grade video capture devices were large and difficult to carry around. Today, video cameras are small and ubiquitous enough to be integrated within other hand-held electronic devices, namely mobile phones. The inclusion of video capture in mobile phones has made it extremely convenient for users to record their own video. The basic technique involved in video capture, however, has not changed, whereas editing video has changed significantly. Since the move from analogue to digital formats, manipulating video has become simple enough for amateur video enthusiasts to create their own videos. Now, instead of working with physical filmstrips or video cassettes, editors maneuver virtual video clips on screen. This has made video editing far more appealing, inviting more advanced computer users to attempt to edit video with software such as Apple iMovie, Microsoft Windows Movie Maker and Adobe Premiere. However, as seen in [40], the process is not easy enough to convince users to edit their video. The distribution end of video has also shifted. Without the Internet, users were restricted to handing out physical copies. With the Internet, video watchers can share video with each other simply by email or instant message.

Sharing video has become an activity that extends beyond video producers. With the rise of social media, computer users share video that they have watched with others. Unfortunately, many existing video players are oriented toward linear viewing of content and do not provide easy methods to seek around the video. This is an issue when a user wants to search for a specific part of a video to share.
On YouTube, for example, the act of sharing a video is accomplished by sharing a URL. The player, however, does not aid in video search, and only provides a linear view on the timeline. Furthermore, as mentioned before, much of the video found on YouTube can be uninteresting, so it is more useful to share only parts of the video. To do so, users must uncover a hidden context menu to “Copy video URL at current time”, as in Figure 1.1. Let us take a look at a typical usage scenario.

Susan is browsing stand up comedy routines on YouTube, and comes across a very funny joke that made her laugh. Susan saves the video into a list of favourites. She does this three times in each of ten videos. A week later, Susan wants to revisit them and share them with friends. Realizing that her friends may not have the time to watch each entire comedy routine, Susan makes sure to tell them where they should start watching and when they should stop watching. For each video in her favourites list, Susan searches through the video for the three jokes by skipping through the video, finds the beginning and end of each segment and sends it to her friends via email.

This scenario illustrates the idea behind saving and sharing video on YouTube.

1. Watch videos on YouTube
2. Add videos to favourites playlist
3. Memorize temporal locations of interesting parts of video
4. Find favourites playlist, after a week, on YouTube
5. Search for temporal locations previously memorized
6. Right mouse click on the video and “Copy video URL at current time”
7. Paste link and send to friend, noting the amount of time to watch
8. Repeat previous 3 steps until finished

Figure 1.1: Context menu from a YouTube video.

YouTube’s playlist function is a useful way to save videos. However, sharing becomes difficult if the user wants to share multiple parts of different videos. Susan needed to search for the specific video intervals in each separate video, as well as note down the length of each interval.
This requires Susan to rely on her memory both for remembering specifically which video it was in, and for when the interval happened in that video.

Endel Tulving defines semantic memory as information that has been passed through as knowledge, and episodic memory as information that is defined through personal experiences [39]. We will define our own terms in the video space. Semantic video memory defines the temporal location of a specific space in video, such as its timestamp (Where in time was that joke?). Episodic video memory defines the context of the video (Did this joke happen after that joke?). This search relies on semantic video memory if Susan is willing to commit to memory the time stamps for those videos. Or, if Susan is seeking around the video looking for the interval, the search relies upon episodic video memory, which means that Susan is relying on context. Of course memory is unreliable, and in our scenario the week between watching the videos and sharing them makes recall difficult. Given that remembering sequences of events is easier than remembering time stamps, we would like to build on usage of episodic memory to aid in video search and retrieval for video interval selection and sharing. Introducing a video navigation history would allow users to search for previously seen video intervals. The user will then ask the question “When did I watch that part of the video?”, which makes searching for video a much more personal experience. Much like a web history, the video navigation history is displayed temporally, allowing users to see the order in which they watched certain video intervals and to retrace their steps. This works even if users viewed the video out of order.

Creation of the history can be accomplished in two ways. We allow users to implicitly create a video history by recording when a user rewinds to re-watch an affective [37] interval of video.
This takes advantage of human nature and its innate tendency for repetition, as seen in Zipf’s Law [44] and Pareto’s Principle [26], which we will cover in Chapter 2. We believe that due to this behaviour, allowing users to save video by re-visiting it will encourage such behaviour if it is not already present. For other situations, we allow users to manually select when to record video. This is similar to personal video recorders (PVRs).

Using this history, users can then save these intervals to a customized playlist for future manipulation, or leave them in the history to find later. The main idea is that video interval selection becomes much easier, only requiring users to watch the video. If we revisit the scenario, the steps can be reduced significantly.

Susan is browsing stand up comedy routines, and comes across a very funny joke that made her laugh. Susan rewatches the joke, automatically saving it into her history. She does this three times in each of ten videos. A week later, Susan wants to revisit them and share them with friends. She goes into her history and searches for the 30 history items (three for each of the ten videos), all next to each other, and adds them all into a playlist.

1. Watch video
2. Re-watch interesting intervals
3. Add interesting intervals from history into a playlist
4. Find playlist, after a week
5. Share playlist

We believe that allowing users to save specific intervals in addition to specific videos provides an easier solution to playlist authoring and sharing. It reduces the search space significantly and allows for more flexible manipulation of video over an abstraction of text. The use of history should be simple and easy to understand, as well as useful in providing users with information that can aid in searching and sharing of video intervals.
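The re-watch workflow above can be illustrated with a small data-structure sketch. This is only a minimal illustration under stated assumptions, not the implementation used in this work: the names `Interval`, `NavigationHistory`, `record` and `rewatched` are hypothetical, and it assumes the player logs one entry for each continuous stretch of playback. Intervals that were played at least twice are then found with a simple sweep over interval endpoints, and the result can be copied into a playlist.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Interval:
    """One continuous stretch of playback (hypothetical structure)."""
    video_id: str
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video


@dataclass
class NavigationHistory:
    """Hypothetical per-user log of watched intervals, kept in viewing order."""
    items: List[Interval] = field(default_factory=list)

    def record(self, video_id: str, start: float, end: float) -> None:
        """Log a stretch of playback, e.g. when the user seeks or pauses."""
        if end > start:
            self.items.append(Interval(video_id, start, end))

    def rewatched(self, video_id: str, min_views: int = 2) -> List[Tuple[float, float]]:
        """Return the spans of a video that were played at least `min_views` times."""
        events = []
        for item in self.items:
            if item.video_id == video_id:
                events.append((item.start, 1))   # playback enters this span
                events.append((item.end, -1))    # playback leaves this span
        # At equal times, -1 sorts before +1, so intervals that merely
        # touch are not counted as overlapping.
        events.sort()

        spans: List[Tuple[float, float]] = []
        depth = 0          # number of logged intervals covering the current time
        span_start = None
        for time, delta in events:
            previous = depth
            depth += delta
            if previous < min_views <= depth:
                span_start = time
            elif depth < min_views <= previous:
                if time > span_start:
                    spans.append((span_start, time))
                span_start = None
        return spans


# Example: the user watches the first minute, then rewinds to re-watch a joke.
history = NavigationHistory()
history.record("routine-1", 0, 60)
history.record("routine-1", 40, 55)

# A playlist here is simply an ordered list of intervals taken from the history.
playlist = [Interval("routine-1", s, e) for s, e in history.rewatched("routine-1")]
```

Keeping the log as raw intervals in viewing order, and deriving the re-watched spans on demand, is one possible design choice; it preserves the temporal order that the history view relies on while still supporting a view-count filter.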
The research is thus concerned with making sure that authoring using a video navigation history is feasible.

1.1 Research Question

The purpose of this work is to apply the history to a video editing context, while keeping it simple enough that casual users can utilize it in their everyday video viewing and sharing. To direct this work, we propose the question:

Does integrating video history, based on tracking what a person is watching, into a video player provide an effective means for authoring for sharing online video content?

We investigate methods for creating video clips from larger videos, how to integrate the history into the video viewer in a way that does not interfere with other timelines, as well as a method to present it to the user. As the history grows, there also need to be methods for managing it for search, and we looked at filtering methods. Lastly, we look at a method for collecting video clips in a playlist format as a form of authoring for sharing.

1.2 Contributions

We focus our efforts on answering the research question on the re-visitation behaviour mentioned earlier. We developed a new video viewing and navigation interface designed to integrate both the semantic and episodic memory mechanisms for creating and searching intervals for authoring. We ran two pilot studies and two full studies to test the design and verify its effectiveness. Thus, the main contribution this work makes is to demonstrate that video history based on tracking what and how often a person watches intervals is an effective means for creating and sharing online video content.

Our first contribution was creating and validating the video navigation history to facilitate playlist authoring. Through a user study, we demonstrated the effectiveness of history-based navigation to save video intervals for authoring. Results showed that with history, finding previously viewed video clips is faster than using the traditional filmstrip technique to accomplish the same task.
Qualitative results revealed that such a tool is also desirable and useful. This design and user study are covered in Chapter 4.

Our second contribution was investigating user preference and watching behaviour with a personal-video-recording (PVR) like design to create video intervals in conjunction with the personal navigation history. Through user studies, we determined that this design was preferred for situations where predictable events are in the watched video, compared to the history design, which was preferred for situations where the viewer doesn’t know what is coming up in the video. We illustrate how the two designs work seamlessly together to provide a homogeneous history mechanism. The design and user study are covered in Chapter 5.

1.3 Publications

We have been able to publish portions of this work in several conferences. A paper we published at the Interact Conference on Human-Computer Interaction (2013) [2] provides an overview and evaluation of the usage of video viewing history for video navigation and video interval search. These tasks, particularly in long videos, can be very daunting for users, and allowing users to search a subset of the longer video helps immensely. Searching the history becomes more difficult as the number of videos being watched becomes larger. A paper we published at the ACM CHI Conference on Human Factors in Computing Systems (2014) [3] provides two different visualizations to allow users to organize and sort their history. We compared the two visualizations for finding video intervals against the conventional YouTube style navigation, and found that they offered improved performance.

At the Graphics Interface Conference (2014), we published two papers. One of the papers [1] developed a variation of traditional video timelines (introduced as the ‘Filmstrip’ in subsequent chapters) called the View Count Record (VCR). This allowed users to quickly find the most viewed parts of a video. This included self views as well as crowd-sourced views.
The other paper [14] is the work covered in Chapter 4.

Chapter 2
Related Work

In this chapter, we first introduce some of the work that has already been completed in the realm of video. Our approach to using video navigation behaviour for casual video authoring and sharing relies on video browsing, video authoring, and user action history. The overview of video browsing looks at methods for navigating video, for example, the timeline, or the VCR controls found in conventional television players. We take design cues from these works to integrate into our own methods for video navigation. Video authoring encompasses the use of traditional video editors found in commercial applications, as well as several works by other researchers for creating video. Lastly, user action history shows us how we can use users’ navigation behaviour as a technique for creating interesting video.

2.1 Video Browsing

A popular video player can be found in YouTube’s embedded video player, pictured in Figure 2.1. In this player, there are two basic parts: the main video view and the navigation timeline. The navigation timeline allows the user to skip through the video non-linearly. In order to provide users with a preview of the video, YouTube employs work by Girgensohn et al. [15]. As the cursor moves over the timeline, the thumbnail changes its contents corresponding to the position of the mouse. This preview thumbnail provides the user with a good indication of where they can seek without actually having to seek. In our interface, we applied a flexible thumbnail visualization of each history segment: initially, each segment is visualized using a thumbnail that indicates the beginning of that segment, which then changes while the user navigates this segment.

Figure 2.1: YouTube’s video player shows the main video, and a timeline that includes a preview for users to know where they are seeking.

One of the primary uses of a video player is to allow users to navigate the content.
Navigating a video space, specifically for searching, can be mentally demanding and time consuming, especially in longer videos. There have been many interaction techniques proposed in the literature to alleviate this problem in order to quickly navigate and search video content. The standard navigation tools used in most video systems are the VCR-like controls. These controls provide the user with the ability to move through video time forwards or backwards, which in turn allows them to search faster than by just watching at regular speed. As this method only allows users to go sequentially through the video, it is unsuitable for the quick video search needed.

An improvement on the VCR controls is the chapter-based system found in interactive DVDs. This menu-like system allows users to skip ahead to time points predetermined by the video creator. In most cases, chapters are represented by thumbnails representing the most important scene of the chapter. These thumbnails, or previews, provide the users with a better understanding of the contents of the video, allowing users to preview the video. This system is used in conjunction with the VCR controls, and helps reduce the amount of time that a user has to search linearly. Due to its reliance on the VCR controls, however, it still suffers the same problems. Furthermore, the chapter locations are placed at the discretion of the video creator, which may or may not be useful for the user.

Figure 2.2: Dynamic thumbnail, by Girgensohn et al.

An improvement to the chaptering system was developed by Li et al. [31]. They allow users to create their own annotations in the video, thus creating a customized table of contents. This allows users to bookmark interesting parts of the video and create a chapter system that makes sense for themselves. Self-created bookmarks inherently make more sense to the user and the temporal location of such bookmarks is easier to remember, making non-linear search easier.
We make use of manual annotations in video in our interface for this reason. The mechanism proposed in this work, however, intrudes on the video viewing experience and requires users to label and write down descriptions for each of the times. Furthermore, there is no visual representation of each of the bookmarks, which requires the user to seek to each bookmark to figure out the actual contents, should they be mislabeled. An improvement to this would be to give thumbnail previews for each bookmark, like in the DVD chapter system described above.

Since video is a visual medium, providing visual previews is a very important part of enhancing video navigation. In the video systems developed by Christel et al. [9] and Drucker et al. [13], thumbnail previews are heavily used to aid the user in search tasks. The system by Christel et al. uses thumbnails to display a storyboard of the video to aid in a search task. These thumbnails are keyframes extracted from the video automatically using text metadata already present in the video to separate clips from one another. This system relies heavily upon metadata already being embedded within the video, which makes it an unlikely tool for searching in newly captured or home-made video. Drucker et al., in their SmartSkip video player, allow users to skip along regular 30-second intervals, which are previewed to the user via thumbnails along the timeline. This extends the functionality of the skip-ahead button, and allows users to look at where they are seeking. We adopt this interface element design, allowing users to preview the sections of the video without needing to interact with it.

Traditional timelines, such as the one shown in Figure 2.2, have inherent problems as the length of video becomes larger. Representation of each particular time-step becomes too small to see and use. Work by Hürst et al.
[24, 25] introduces the ZoomSlider, which shows only part of the timeline, and allows users to shift the slider across the screen to seek across the video. The seeker bar is zoomable, and the granularity of seeking is dependent on the vertical mouse position, allowing for higher accuracy in seeking for longer videos. Commercially, this multi-level seeking functionality can be found in the Apple iOS default video player. The zooming functionality, however, is undesirable as it hides portions of the timeline, impeding access to the entire video. The ZoomSlider can be seen in Figure 2.3.

Figure 2.3: ZoomSlider, by Hürst et al.

On the idea of showing more information about the video, Barbieri et al. [5] introduced the ColorBrowser, which extracted the dominant colour in each frame and presented it to the user in the form of a timeline. This can be seen in Figure 2.4. This is useful for showing scene changes in the video, and may be useful to the user in terms of search if the user remembers a scene to be predominantly a specific colour. In the context of a short video, however, this is not as useful.

Figure 2.4: ColorBrowser by Barbieri et al.

Another interesting method of manipulating the timeline was introduced by Kimber et al. [28]. They developed a system that allowed users to use direct mouse manipulation (i.e., dragging) to manipulate objects directly within the scene along their natural flow. They achieved this by preprocessing video using computer vision techniques to enable object tracking. Using background/foreground segmentation they were able to extract objects, and allow them to be draggable. For example, users would be able to drag a moving car across the screen to control the forward and backward flow of time in the video. In an example application, they showed objects with motion trails to indicate the direction of movement and to allow for dragging. Dragicevic et al. [12] and Karrer et al. [27] used optical flow to accomplish the same task.
This can be seen in Figure 2.5. Goldman et al. [17] set out to improve upon the systems shown in the previous works by providing support for partial occlusion in objects, motion grouping, and long-range accuracy of object tracking.

Figure 2.5: Trailblazer by Kimber et al.

Divakaran et al. [11] proposed a method for quick video browsing by dynamically adjusting the playback rate for video. They used the compression information found in the video codec to evaluate motion. In places with low motion, playback rate was increased. In places with high motion, playback rate was decreased. Further work by Peker and Divakaran [35] used the same adaptive playback rate and added spatial-temporal complexity to the evaluation of scenes. High complexity resulted in lower playback rates, and lower complexity resulted in higher playback rates. Cheng et al. [8] extended the work by Peker and Divakaran, introducing SmartPlayer. They expand the previous works to include scene complexity, predefined scenes of interest, as well as the user’s preferences with respect to playback speeds. In addition, SmartPlayer learns the user’s preferred event types and the preferred playback speeds specific to each event type through manual intervention from the user.

2.2 Using Video Watching History

The relatively high interest in the social web, sharing and the use of videos online has motivated researchers to investigate the use of video navigation history. Users leave footprints during the video-browsing process, and this can add value to the content for both analytical purposes and future viewers. Syeda et al. [38] use data analysis to model video browsing behaviour. By collecting navigational behaviour, they can then deduce which parts of a video are more interesting. For example, they can look at how far a user watches a video before they stop watching, or whether a user skims parts of a video first to find interesting parts. These portions of video are referred to as ‘touched’ video.
They then use this data to assemble short previews by selecting clips based on video visit count. Using the model of video browsing behaviour, the system can create previews of the video tailored to each particular user, whether they are “searching for something”, have “found something”, or are “curious”, etc. This authoring of videos based on user navigation history as an indicator of user interest is a concept we adopted in allowing users to find video intervals that interest them.

Yu et al. [43] expanded on the work by Syeda et al. and introduced the notion of “ShotRank”, a measure of the subjective interestingness and importance of a video shot. The ShotRank of a video’s shot, or scene, is the probability that a viewer would visit that shot during browsing. ShotRank is computed from link analysis between scenes in a video, as well as some low-level feature detection in different scenes. The link analysis is generated using an “Interest-guided Walk”, which uses a viewer’s browsing behaviour to create a map linking different scenes of a video together. Furthermore, using the feature detection, ShotRank can deduce links from scene to scene based on a variety of factors: they look similar, they sound similar, they are temporally sequential, etc. Using ShotRank, Yu et al. created a system that allowed users to skim through a video via a “chapter”-based system built from the scenes that ShotRank determined to be interesting. Unfortunately, ShotRank requires a priming phase in which multiple users go through the video to create the data necessary to link scenes together. Mertens et al. [33] created a similar system for web lectures. It kept track of user footprints across the entire lecture and displayed them to users to show areas of interest. They further allow users to create bookmarks, which are also displayed to users, as shown in Figure 2.6.

Video navigation history can play an important role in user-based information retrieval from videos on the web.
Shamma et al. [37] and Yew et al. [42] have proposed a shift from content-based techniques to user-based analysis because it provides a more promising basis for indexing media content in ways that satisfy user needs. Leftheriotis et al. [30] developed a system called VideoSkip that records user video browsing actions (play, pause, skip). Using this data, they generate thumbnails to represent each video based on the frame that has been viewed most by users. Gkonela et al. [16] also indicate that this simple user heuristic while navigating videos can detect video events as effectively as content-based techniques. Providing users with explicit access to their video navigation history gives them the flexibility to see what interests them, making it easier to select video intervals in short authoring tasks.

Figure 2.6: User footprints bar by Mertens et al.

2.3 Video Authoring

In order to have video to watch, someone needs to create the video and make it suitable for general consumption. Video authoring is a difficult task and requires users to perform many search tasks to look for video intervals they want to share. They then have to ensure that each interval is correctly cut and arranged in the proper order. The difficulty of video authoring can be characterized by the amount of interaction required to create the video. The most popular video authoring methods require users to perform every task in the process, such as importing videos, cutting them into usable clips, arranging the clips in a timeline, and selecting video transitions. Performing these tasks professionally requires skill and insight that the average consumer may not have.
There are several tools for the job: software like Adobe Premiere¹ is difficult to learn but allows for professional, full-featured video and audio mixing, whereas Apple’s iMovie² and Microsoft’s Windows Movie Maker³ provide a simpler interface but have limited functionality. As these tools are offered to users free of charge, they are an attractive solution to those seeking a video authoring tool. However, these types of tools may be excessive and require too much effort for a novice computer user who mainly just wants to share short video clips with their friends, as seen in studies by Vihavainen et al. [40], where users are not inclined to launch a video editor.

¹ http://www.adobe.com/products/premiere.html
² http://www.apple.com/ilife/imovie
³ http://windows.microsoft.com/en-ca/windows-live/movie-maker

On the opposite end of the spectrum is the fully-automatic approach. These types of systems take only video as input from the user, making video creation extremely easy for users. There are commercial applications that implement this approach, such as Muvee autoProducer⁴ and Magisto⁵. The manual tool Pinnacle Studio⁶ also includes a “smart movie” function to generate video summaries from existing videos. Christel et al. [10] created a system that analyzes video streams to produce a professional-quality summary video. To do this, they first identify “important” audio and video information. For video, they first decompose the video into different shots and then, going through each shot, attempt to identify important objects, such as faces and text, and to identify movement. For the audio portion, they run speech recognition and align the result to the video’s transcript. They then analyze the video transcript and score the phrases based on frequency. Using all this information, they can extract the appropriate clips and assemble them into a video summary.
This process requires a lot of computation to produce the video, and much of the required information is generally unavailable in regular video.

Pfeiffer et al. [36] also developed a similar system. They define scene cuts through the use of video and audio analysis. To then extract interesting scenes, they evaluate each scene based on contrast, motion, colour composition, and dialog. Ranking by each of these criteria allows the system to choose a number of scenes, specified by the user, to be included in the resulting “trailer” video; the length of the scenes is also specified by the user. Using these systems to create video removes a lot of the freedom in creating video, and may not be suitable for quick and easy sharing of video. Furthermore, they make it very difficult to quickly create shareable video clips. Automatic tools, while not yet satisfactory, provide techniques that can be integrated alongside the history approach to provide further insight on the use of filters to help users find video intervals they find affective so that they can author and share videos.

⁴ http://www.muvee.com/
⁵ http://www.magisto.com/
⁶ http://www.pinnaclesys.com/publicsite/us/home/

As automatic solutions do not suit the criteria, we take a look at semi-automatic video creation tools. Weda et al. [41] produced a system that features “Edit While Watching”, providing users with an easy home video editing solution in the living room. Using a remote control to interact with their television, the user can perform several video editing tasks, such as adding music, deleting shots, adding scenes, adding effects, and altering suggested scenes.

Figure 2.7: Edit While Watching by Weda et al.

They extract low-level features from raw video data, such as camera motion, contrast and luminosity, and use this data to split the video into shots of about 1 to 10 seconds. These segments are then assigned scores based on visual quality from the features.
The interface for these options can be seen in Figure 2.7. The purpose of this system, then, is to give users some control over the video editing process while hiding the details and reducing the skill required by full video editing programs.

Another content-aware system, LazyMedia, by Hua et al. [22], uses media content analysis and composition templates to allow users to create videos. These composition templates guide the user as to what kind of shot needs to go into which spot to create a visually pleasing video. The media content analysis features many video content filters, such as motion, colour, face detection, scene grouping, and attention detection for video. For audio, they only analyze music, for which onset, beat and tempo detection are supported.

Cattelan et al. [7] developed a Watch and Comment (WaC) system that allows novice users to create interactive video just by watching video. They do this by associating user navigation commands with video editing commands and tagging users’ comments with the video. As a user is watching the video, they can “edit” it by inserting one of three commands that the resulting interactive video will perform: seek, skip, or loop. The interactive video then follows a script and behaves almost like an edited video.

These works provide us with techniques with which we can help users author video intervals that can be used for sharing. These semi-automatic systems show that automatic analysis of video can provide satisfactory aid. With that in mind, highlighting affective video intervals using video navigation history is a possibility that has yet to be explored.

2.4 User Action History

Humans are repetitive in nature and many of the tasks that we perform are likely to be performed again in the future.
As seen in work by Greenberg et al. [18], people frequently reuse commands in command line interfaces, and they developed a system to facilitate repeated usage. Further examples of repetition in human nature can be seen in natural language, as shown by Zipf et al. [44], as well as in economics, in Pareto’s Principle [26]. Zipf’s law states that in natural language, the frequency of any word is inversely proportional to its rank in a frequency table. Naturally, some words occur more frequently than others, illustrating repetition in natural language. Pareto’s Principle states that 80% of a business’s revenues come from 20% of the business’s clients. In other words, businesses thrive on repeat customers. We investigate the use of user histories and footprints for inspiration on how to use implicit tagging in video viewing for video authoring purposes.

We can start by looking into the undo mechanism found in almost every computer program. Many implementations allow users to explore the historical changes made to a document. A good example of this can be found in Adobe Photoshop’s history, which can be seen in Figure 2.8. It is arranged in chronological order and allows users to view changes to their document every step of the way. Unfortunately, the visualization of the undoable actions does not provide much useful information about how the document has changed. It does, however, provide users with cues to aid in episodic memory retrieval, allowing users to look back and recognize the workflow in case they need to leave the task for a period of time.

Figure 2.8: History found in Adobe Photoshop. The most recent action is on the bottom.

Looking further back, Kurlander and Feiner [29] developed a graphical representation for user interaction history and macro creation. Users can use history to easily create macros.
By giving users a graphical representation, it also allows users to easily see the macro created and deduce its procedure. The system they created, Chimera, allows users to manipulate each stage of the macro, as well as pick and choose specific operations to create new macros. In Figure 2.9, we see a macro being created to draw one of the diagrams included in the original paper. They use a comic strip metaphor to indicate sequence.

Figure 2.9: Chimera by Kurlander and Feiner. Shown is a macro to draw a diagram.

To see better visualizations of history, we can look toward Grossman et al. [19], who introduced Chronicle, a tool to support graphical document workflow exploration and playback. It allows users to explore document revisions, called chronicles, graphically, through a timeline as well as a video representation. Chronicle employs screen video capture to show the user the evolution of the document, interaction recording, and some application-specific logic to detect changes to the document. The main interface can be seen in Figure 2.10. The main Chronicle window (Figure 2.10a) shows the step-by-step changes made to the document. These changes include visual changes, as well as layer information in the image that may not be visible. The timeline (Figure 2.11) shows a detailed view of the actions taken by the user throughout the document’s history. Finally, the playback window (Figure 2.10c) shows the entire process as captured, in video form. Chronicle also includes data probes to allow users to filter out specific actions taken by the user. Nakamura and Igarashi [34] created a similar system that aims to be application independent. It monitors GUI events and uses screen snapshots, instead of long videos, to provide a visualization to the user. They use a comic-strip metaphor and, using annotations and still screen snapshots, are able to convey user action history. An example of this is shown in Figure 2.12.
As we can see, the annotations on top of the GUI make the actions fairly clear and easy to understand.

Figure 2.10: Chronicle, by Grossman et al. a) main Chronicle window, b) timeline, c) application/playback window

Figure 2.11: Chronicle Timeline, by Grossman et al.

Figure 2.12: Annotated history, by Nakamura and Igarashi

Researchers interested in studying the history of users may find work by Heer et al. [21] interesting. Their work brings forth new visualizations for user action history. They created a database visualization system, Tableau, which allows for recording and visualization of interaction histories, and supports data analysis as well as mechanisms for presenting, managing and exporting histories. Guimbretiere et al. [20] developed ExperiScope, an analytical tool for visualizing user actions in an experimental setting. It allows interaction designers to identify key patterns in subject interaction, and allows easy analysis of these interactions.

Another example of history would be the one found in every web browser. Navigating back to previously viewed web pages is a useful feature that allows users not only to view their progression, but also to re-find information far more easily than by retracing their steps manually. In fact, web browser history is a popular domain for research, and many researchers have created tools related to keeping, recording, and reusing history. Anupam et al. [4] developed WebVCR, a system allowing users to set a recording mode, record their interactions with a web page, and play them back for later use. The interface itself is simple and includes the conventional VCR playback controls mentioned earlier, as seen in Figure 2.13. The purpose was to allow users to navigate through webpages using macro-like behaviour where conventional bookmarks failed, such as session-sensitive pages.
Hupp and Miller [23] extended the idea and developed smart bookmarks, an extension to WebVCR that can be shared with other people as well as edited after creation. Furthermore, unlike WebVCR, the system allows users to retroactively create macros without explicitly pressing record. As a passive macro creation tool, this allows users to concentrate on the task at hand. Li et al. [32] continue to extend this idea with ActionShot. In contrast to a traditional web browser history, ActionShot records history at a more detailed level, down to the specific interactions made on a web page, such as textual entry, ticking a checkbox, or clicking a button. This then allows users to create macros like those found in Adobe Photoshop or Microsoft Word. A screenshot of the history view can be seen in Figure 2.14. The system also allowed users to share their executables with other people.

Figure 2.13: WebVCR by Anupam et al.

Figure 2.14: History Views of ActionShot by Li et al.

In systems such as Facebook⁷ (Figure 2.15) and Twitter⁸ (Figure 2.16), events are shown in a vertical fashion. This is in stark contrast to the layout found for timelines. These conventions provide semantic cues to the user to indicate the difference between a user action or event history and the video timelines. We take these cues into our own work, placing our history views in a vertical orientation and our video timelines in a horizontal orientation.

⁷ http://www.facebook.com
⁸ http://www.twitter.com

Figure 2.15: Facebook’s vertical event timeline

History can also be used to create tutorials. Like the previous methods, this employs recording to play back actions that tutorial creators have performed. Bergman et al. [6] developed such a system. Their solution, “follow-me documentation wizards”, continually shows users the current position in the procedure, highlights the relevant application controls, and, like the previous systems described, can automate portions of the procedure.
Their recording system turned the user’s actions into a script language that they can then edit retroactively to change actions or add annotations.

All of the systems mentioned above are based upon workflow and tutorial generation. They all use some method of capturing user interaction history, and all use that information for repeating tedious tasks or for teaching, and so do not directly apply to video.

2.5 Summary and Influence on Design

In this chapter, we covered video browsing, video authoring, and user action history. In the video browsing section, we take inspiration from video preview thumbnails, as well as timelines introduced to allow for non-linear search through a video. Under video authoring, we gain insight into the different types of interfaces seen in popular video editors, and the use of automatic methods for generating video intervals. Further, we investigated the use of user action history in tutorial creation and sharing, which has direct implications for video authoring and sharing.

Figure 2.16: Twitter’s vertical event timeline

Chapter 3
General Interface Design for History Based Editing

Based on the functionality found in professional video editors, we developed an interface that supplies users with four basic functions: watching video, searching video, selecting video clips, and authoring new video. The important components in our design are a video player, a user-history timeline, and a video clip list to collect video clips. We describe the complete video interface that integrates our components with features from typical commercial editors for our test. We rely on some of the design decisions found in Apple’s iMovie, as it is a current video editor focused on beginning video editors and provides a basis for comparison in our user tests.

3.1 Overview

A screenshot of the interface can be seen in Figure 3.1. In the middle, the large picture is the main video view, similar to most video players and video editing software.
Clicking on the video view allows the user to toggle playing and pausing of the video. As a large target, this makes it easy for the user to stop the video. This same functionality can also be found in YouTube’s player as well as various video viewing applications found in the mobile space.

The rest of the interface, such as the history (yellow, Section 3.5), the filmstrip (green, Section 3.4), and the playlist (not shown, Section 3.6), will be described in detail below. We first describe the general interface design decisions.

Figure 3.1: Overview of the interface. Here, we see the history (yellow), the filmstrip (green), and a bookmarking button (blue)

3.2 Design Language

There are three design decisions that we made to ensure consistency throughout the entire interface.

1. The design of the interface revolves around the concept of video intervals, each of which is represented by a small thumbnail preview, similar to those found in [15]. Manipulation of video can be achieved by dragging thumbnails from one widget to another. For example, dragging a thumbnail over to the main video view will cause the viewer to start playing at the beginning of the interval represented by the thumbnail, and to pause when the interval has ended. Dragging a thumbnail to a playlist will add it to the playlist.

2. The width of the thumbnail generally represents the entire length of the video interval that it is showing. The actual temporal location of the represented interval is communicated to the user by a small, red, horizontal bar along the bottom of each thumbnail acting as a miniature timeline. The spatial representation of video length versus width is maintained throughout the entire interface and can be seen with the filmstrip and the width of the main player, shown in Figure 3.1.

3. The arrangement of the two different timelines is also taken into account. We wanted to separate video timelines and user-history timelines so that users would not confuse the two.
Since video timelines are conventionally horizontal (and this guideline is also carried out in the miniature timeline in all thumbnails), we decided to lay out user histories in a vertical fashion, as mentioned earlier in Section 2.4. This provides a mental separation of the two concepts and prevents confusion.

These guidelines provide the application with consistency so that the interface can be more intuitive and easy to use. In the next sections, we describe the various interface components.

3.3 Thumbnail

Thumbnails are used to represent video intervals. By placing the cursor over the thumbnail, a miniature timeline appears. This can be seen in Figure 3.2. The bottom grey and red line represents the timeline of the entire video, with the red section showing the thumbnail’s video interval. The popup timeline, which is also in red, is a zoomed-in version of the bottom timeline. Using this, the user can then utilize the entire width of the thumbnail to scan across the video interval. It only partially obscures the thumbnail and is semi-transparent, allowing the user to continually see the preview while scanning.

Figure 3.2: Video Interval Thumbnail. The red bar on the bottom represents the video interval. The timeline pops up allowing the user to search through the interval.

Interacting with these thumbnails can be achieved by dragging and dropping them around various elements of the interface. Clicking on the play button will play the video interval in the main player and highlight the corresponding video interval in the filmstrip in blue. The same action can also be performed by clicking and dragging the thumbnail over to the main player. The playlist button allows the user to insert the video interval into the playlist. To indicate successful insertion, a ghost thumbnail is animated to fly over from the history to the playlist. The favourite button allows users to manually mark an interval as a favourite.
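The two timelines on a thumbnail described in this section amount to two linear mappings: one from the interval's position in the whole video to the red bar, and one from the cursor position to a time inside the interval. The following is a minimal sketch of those mappings, assuming times in seconds and pixel coordinates measured from the thumbnail's left edge. The function and parameter names are hypothetical and are not taken from the implementation.

```python
def bar_geometry(start, end, video_length, thumb_width):
    """Left offset and width, in pixels, of the red bar on the bottom timeline.

    The bottom timeline spans the whole video, so the interval is scaled
    by thumb_width / video_length.
    """
    left = start / video_length * thumb_width
    width = (end - start) / video_length * thumb_width
    return left, width


def scrub_time(cursor_x, start, end, thumb_width):
    """Video time previewed when the cursor is at cursor_x over the thumbnail.

    The popup timeline is a zoomed-in view, so the full thumbnail width
    maps onto the interval [start, end] only.
    """
    # Clamp so that a cursor slightly outside the thumbnail stays in range.
    fraction = min(max(cursor_x / thumb_width, 0.0), 1.0)
    return start + fraction * (end - start)
```

For a 160-pixel-wide thumbnail showing the interval from 30 s to 45 s of a 120 s video, the red bar would start 40 pixels from the left and be 20 pixels wide, while moving the cursor to the middle of the thumbnail would preview the frame at 37.5 s.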
By making users move thumbnails and giving video intervals a concrete representation that users can see, we can ensure that users are kept informed about everything happening on the screen.

3.4 Filmstrip

The filmstrip is the interface element outlined in green in Figure 3.1. The filmstrip allows users to seek to any part of the video. Unlike timelines that only feature a playhead and a timestamp, the filmstrip provides users with a preview to allow better accuracy in seeking, rather than relying only on timestamps to judge the location of the potential seek. In the first version of this interface, both the filmstrip and the traditional (non-previewable) timeline are given to the user.

The filmstrip is a set of the thumbnails described in Section 3.3, placed along a horizontal axis, in keeping with the design guidelines. There are six thumbnails and each thumbnail represents 1/6 of the entire video. We decided to use six in this version, as the resulting height was large enough to allow users to see the preview while not interfering with the main viewer. The blue highlight indicates a currently playing interval invoked by the user. The red highlight indicates a video interval being selected by the user.

Figure 3.3: Overview of filmstrip. It is used for seeking and searching through a video.

There are two methods to seek the video. The first is the traditional method and involves clicking on the timeline slightly below the main player. The second method involves the filmstrip.

3.5 Navigation History View

The history view is the interface element outlined in yellow in Figure 3.1. The history is a mechanism that records a user’s video watching behaviour and represents it in a way that a user would be able to understand and navigate.
Its purpose is to allow users to see which parts of a video they have watched most, go back to review them, and create video clips by rewatching the video.

The history, shown in Figure 3.4, allows users to revisit video intervals that were previously viewed. These intervals are represented by thumbnails found in a scrollable field. The thumbnail size was chosen to be roughly the same as that of the thumbnails in the filmstrip. The intervals can also be previewed within the thumbnails by clicking on them. Like the filmstrip, each thumbnail can be dragged to the main player to be previewed. As noted in our design language in Section 3.2, a vertical layout is used here to help the user differentiate between the video timeline and user-action time. The thumbnails are ordered with the most recently watched history item at the bottom. The reverse order was found to be confusing in our own tests, as moving all the thumbnails down caused too much motion and distracted from the video viewing experience.

The history itself starts off with one interval, and this interval extends itself to the main player’s time until a navigational action, such as seeking, is performed with the filmstrip or the timeline. Upon such an action, the current interval ends and a new one begins. Visually, a new thumbnail is added to the history.

Figure 3.4: Overview of history. History items are created as the user watches and navigates around the video. Each history item has a start and end point that can be explored by placing the cursor over it.

In initial pilot studies, users found it difficult to create history items in exactly the way they wanted. They were confused about how to stop the current history item from being extended, and seeking was not sufficient to help with that. In an effort to provide the user with a better solution, we came up with the system shown in Figure 3.5.
Across the top of the history, there is a drop-down selection box that allows users to filter their history for segments of video that they have seen multiple times. Upon selecting the filter for clips that they have seen twice or more, for example, the history will find the intersections of all history items for which the view count is higher than one. The envisioned method for using this to create clips is:

1. Watch the video clip
2. Come across an interesting moment
3. Finish watching the interesting moment
4. Seek back to where the interesting moment starts
5. Finish watching video clip
6. Find intersection of history items with a view count > 1

As such, setting the start time is achieved in step 4, and setting the end time is achieved in step 3. This removes the need to predict whether upcoming clips are going to be of interest.

Figure 3.5: Re-watch filter in history.

Manual creation of history items was also included, and we decided to add another filter to allow users to search for these specially annotated history items. These intervals can be created by toggling the “bookmarks” button seen in Figure 3.1 (blue). Clicking on the button will cause the history to start recording while the video is playing, and clicking it again will stop the recording, much like the interaction found in modern personal video recorders. Combined with the re-watched filter, these allow users to see which clips they know to be of interest, as well as clips they may not realize they found interesting. In Figure 3.6a, there are three clips that the user watched. Clips 1 and 3 are favourited. If we turn on the re-watched filter, as in Figure 3.6b, we can see the intersections of all the clips, created as new video clips 4 and 5.

Figure 3.6: History Filtering. (a) No filters (b) Re-watched filter (c) All filters
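One way to picture the re-watched filter is as a sweep over the start and end points of the history items, keeping the stretches of video that are covered by at least two items. The following is a minimal sketch under that assumption, with each history item reduced to a (start, end) pair in seconds. The function name `rewatched` and the parameter `min_views` are hypothetical, and the sketch is an illustration of the idea rather than the implementation used in the interface.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) of a history item, in seconds


def rewatched(history: List[Interval], min_views: int = 2) -> List[Interval]:
    """Return the stretches of video covered by at least min_views history items."""
    events = []
    for start, end in history:
        if end > start:  # ignore empty intervals
            events.append((start, +1))  # an item begins: view count goes up
            events.append((end, -1))    # an item ends: view count goes down
    # Sort by time. At equal times, endings (-1) come before beginnings (+1),
    # so items that merely touch are not treated as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))

    result = []
    views = 0
    segment_start = None
    for time, delta in events:
        views += delta
        if segment_start is None and views >= min_views:
            segment_start = time
        elif segment_start is not None and views < min_views:
            result.append((segment_start, time))
            segment_start = None
    return result


# Steps 1 to 6 above on a 120 s video: watch from 0 s to 45 s (the interesting
# moment ends at 45 s), seek back to 30 s, then watch through to the end.
history = [(0, 45), (30, 120)]
clips = rewatched(history)
```

In this example `clips` contains the single interval from 30 s to 45 s: its start comes from the seek in step 4 and its end from the point where the first history item was cut off in step 3.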
If we turn on both the re-watch filter and the favourites filter, we see that the favourites, clips 1 and 3, reappear, as in Figure 3.6c.

3.6 Playlist

We can bring out a playlist, shown in Figure 3.7, by clicking on the arrow button on the far right in Figure 3.1 (red). The playlist functions as a location for users to store video clips. Using this, users can create their own videos filled with clips that were of interest. It functions as a simple video editor, allowing users to play back, lengthen, and shorten clips.

The playlist, shown in Figure 3.7, contains several thumbnails. Each playlist item is represented by a grey box with a thumbnail showing the clip’s contents. Like the previous thumbnails, this thumbnail seeks to the position corresponding to the location of the cursor along the length of the playlist item. On the left and right edges of each playlist item is a draggable region, allowing users to set the clip’s starting and ending points from the video. This larger region makes modifying the clip easier than in the last version. Upon dragging either edge, the thumbnail within the grey box will change to reflect the edge that the user has dragged, providing a preview of the change.

Figure 3.7: Overview of the playlist

With this interaction, the temporal length of the video clip correlates with the width of the playlist item on the screen. Unfortunately, due to the variable width, sometimes the preview thumbnail is not fully visible, which makes previewing the clip difficult. Our solution was to have a popup thumbnail preview, as shown in Figure 3.8. This thumbnail pops up whenever the cursor is over a playlist item whose width is less than that of the original preview thumbnail. It also appears every time one of the edges is dragged, to provide the user with a larger, easier-to-see thumbnail.

There are also some buttons to the right of the playlist.
In order from top to bottom, these were: a close playlist button, a play button, an open existing playlist button, and an export playlist button. Button 1 hides the playlist back into its original position. Button 2 plays the contents of the playlist in the video viewer. For the third and fourth buttons, we implemented an option to save and open playlists. Playlists are saved as simple XML files that are portable and easy to read. The last button is used for the evaluation, allowing the user to indicate that they had finished performing the task.

The interface should be able to support users in watching, saving and sharing video intervals, and do it in a more seamless manner than what is provided by separate video viewing, editing, and sharing software suites. We carry out preliminary evaluations to assess the effectiveness of not only the interface but, more importantly, the interaction of re-watching video to create video intervals for further use. Throughout the next three chapters, we describe the testing and the iterative evolution of the interface.

Figure 3.8: Preview thumbnail in the Playlist

Chapter 4
Comparison of History and Filmstrip Selection

In this chapter, we discuss the first experiment to evaluate our interface. This evaluation was performed following pilot studies to polish the interface and receive initial feedback from users about the use of the History for a video authoring task. Here, the pilot studies helped us make the decision for horizontal and vertical timelines. In previous iterations, our playlist was oriented vertically. Placing the playlist in a horizontal layout also made altering video intervals more accessible.
In previous iterations of the History, there was no method for filtering or otherwise managing the History and pilot studies showed that there needed to be a better method for creating video intervals, and so we introduced the re-watched filters.

4.1 Experiment

A user study was carried out to evaluate the design and performance of our interface and the usefulness of using personal video navigation histories for creating playlists. We developed an evaluation protocol to ensure users have sufficient viewing history while keeping the experiment short and maintaining a bias-free evaluation. Likewise, for comparison to current practice, we needed to ensure that our interface mimicked currently adopted approaches as well as logical extensions to them to provide a fair comparison to using viewing history. Using our protocol, we investigated whether visualizing and using a video navigation history would make creating and inserting clips to playlists more efficient. We conducted a comparative user study, comparing the performance of participants using a personal navigation history against the state-of-the-art navigation method (Filmstrip) to find clips to be added to a playlist.

4.1.1 Hypothesis

We hypothesize that participants save video intervals to the playlist in less time using the History method than the Filmstrip method, and prefer the History method.

4.1.2 Apparatus

The experimental interface was implemented using Adobe Flash CS6 and ActionScript 3. The study was performed on a 15-inch MacBook Pro, with a screen resolution of 1680x1050 pixels. Participants used a regular Microsoft mouse to manipulate the contents of the screen. The application window was set to a size of 1280x720 pixels.

4.1.3 Participants

Eighteen volunteers, 13 male and 5 female, participated in the experiment. They were monetarily compensated for their time. Participants ranged in age from 19 to 50. Each participant worked on the task individually.
All participants were experienced computer users and watched videos at least 3-5 times a week. Eleven rarely created videos, and seven had never created videos.

4.1.4 Design and Procedure

Each subject was tested with two methods and six videos. Since participants select video intervals relying on their memory, we ensured that they watched each video only once to reduce the learning effect on the video. We separated participants into two groups. One group used the first set of videos with the History and the second with the Filmstrip, while the other group reversed the sets. This allowed participants to use both methods and only view each video once; the design is shown in Figure 4.1. Each set of videos consists of three two-minute videos. The content of these videos varied from sports to news segments and comedy shorts.

Figure 4.1: Experimental Design: History and Filmstrip

Each participant was asked to watch each video and to re-watch parts of the video that follow two themes specified by the experimenter. When the participant finished viewing the video, they were asked to find 2 to 3 specified clips, which belong to one theme, from within the viewed video and add them to the playlist. Using History, participants were asked to locate the clips in the history using the view count filter and add them to the playlist. To restrict the participants to use only History, adding items from the Filmstrip was disabled. Using the Filmstrip, participants were asked to use the selection tool by clicking and dragging along the Filmstrip to select a portion of video to add to the playlist. To restrict the participants to use only Filmstrip, History was hidden from the interface as soon as the viewing phase had ended. Incorrect selections would be indicated to the participant and they would be asked to add the correct one.

The participants were asked to complete the task as quickly as possible. For each task, the time taken to perform the clip selection (i.e. 
adding them to the playlist) was recorded. The timing began when the user clicked the start button, and stopped when the stop button was clicked. We checked whether the correct clips were chosen to ensure a proper evaluation among all participants; if not, these clips were counted as errors.

The experiment proceeded as follows:

1. The evaluation started with a familiarity phase for each technique. Participants were shown how to use each of the interface elements and, most importantly, how to add clips to the playlist from the history and the Filmstrip. They were also shown how to create a useful history through video review, and how to access created clips with the view count filter.
2. After the training phase, a trial started by asking the participant to watch a video and re-watch clips that fit the two themes provided by the experimenter (e.g. shots of fire fighters and police officers).
3. The participants were asked to add clips to the playlist using the available component for the tested method.
4. Once completed, participants advanced to the next video, using the alternate method. Upon completion of the six videos, participants were given time to experiment with both techniques at the same time on a blooper reel. This allowed us to explore which method participants prefer when they are given the option of using both.
5. They were then given a questionnaire to fill out asking about their reactions to the interface.

The total time taken for the experiment was approximately one hour.

4.2 Results and Discussion

We would first like to note that, as explained in the design, this was a within-participant study conducted with between-participant elements for the method and videos. Using this design, we were able to get more informed data from the participants, who would be able to give a more accurate opinion of both techniques.
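The counterbalanced design of Section 4.1.4 can be sketched in a few lines of Python. This is an illustrative sketch only, not the tooling used in the study: the set labels, the video numbering within each set, and the rule that participants alternate between groups by index are all assumptions. The second function computes the degrees of freedom of a two-way (method by video) ANOVA with one timed trial per participant per video; with 18 participants and 6 videos the error term has 96 degrees of freedom, which agrees with the F statistics reported in this section.

```python
METHODS = ("History", "Filmstrip")
# Hypothetical labels: the thesis does not say which videos form which set.
VIDEO_SETS = {"A": (1, 2, 3), "B": (4, 5, 6)}


def assign(participant_index: int) -> dict:
    """Map each video number to the method one participant uses on it.

    Assumption: participants alternate between the two groups by index.
    """
    group = participant_index % 2
    set_a_method = METHODS[group]
    set_b_method = METHODS[1 - group]
    plan = {video: set_a_method for video in VIDEO_SETS["A"]}
    plan.update({video: set_b_method for video in VIDEO_SETS["B"]})
    return plan


def anova_degrees_of_freedom(n_participants: int, n_videos: int, n_methods: int) -> dict:
    """Degrees of freedom for a two-way ANOVA with factors method and video."""
    observations = n_participants * n_videos  # one timed trial per participant per video
    cells = n_methods * n_videos              # method-by-video combinations
    return {
        "method": n_methods - 1,
        "video": n_videos - 1,
        "interaction": (n_methods - 1) * (n_videos - 1),
        "error": observations - cells,
    }


plans = [assign(i) for i in range(18)]
df = anova_degrees_of_freedom(n_participants=18, n_videos=6, n_methods=2)
```

Under the alternating-group assumption, every participant sees each video exactly once, uses each method on three videos, and each method-by-video cell receives nine trials.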
The analysis was done between participants so that each participant was exposed to each video only once with a single method.

A two-way ANOVA was carried out to explore the effect of method and video on the time needed to create a playlist. The analysis showed significant main effects of method (F(1, 96) = 66.69, p < 0.0001) and of video (F(5, 96) = 5.86, p < 0.0001). However, these main effects are qualified by a significant interaction effect (F(5, 96) = 2.37, p = 0.045). Simple main effects analysis shows that, for each video, finding clips and inserting them into the playlist was significantly faster using History than using the Filmstrip. Participants took at most two-thirds of the average time using History compared with using the Filmstrip, as shown in Figure 4.2.

The evaluation required participants to create video intervals containing the interval of interest. As the target audience is casual video watchers and authors, the accuracy of the selection was not important. A trial qualified as correct if either a portion of or the entire interval of interest was inserted into the playlist. As a result, the error rate in this experiment was zero.

Of the six videos, we found that History worked better for some videos than others, and the same effect can be observed in the Filmstrip. These traits can generally be attributed to the content of the videos, which is why the analysis is separated for each video.

1. The first video consisted of a short hockey video of a power-play. The themes for the video were two goals, and two segments where the defensive team had possession of the hockey puck. The participants were asked to find the two goals. In this video, it was difficult to see what was going on within the thumbnails. This was particularly difficult with the Filmstrip because many participants forgot when the first goal happened, and were forced to look through the entire video in the Filmstrip, whereas History reduced the search space required, down to four short clips.
2. 
The second video consisted of six hockey players, three wearing blue jerseys and three wearing white jerseys, skating around a rink and competing for the fastest time. The theme in this video is the hockey players turning the last corner. The participants were asked to find the white jerseys to insert into the playlist. Again, like the previous video, participants had to search through the entire Filmstrip to find the correct clips, while History only required participants to look through six different clips, all of which were easily distinguishable by the screenshot within the thumbnail. Furthermore, because one of the players skates in the opposite direction of the other five, using the Filmstrip became slightly more difficult because the participant had to distinguish the different turns within the small thumbnail.
3. The third video was a short comedy segment featuring the cast from the sitcom Modern Family, where a little girl runs around, makes smart remarks, and pulls pranks on various people. The themes to this video were the reactions to her remarks, as well as the result of the two pranks she pulled. The participants were asked to find the clips with the two pranks. This video had a slight advantage given to the Filmstrip because the scene of one of the pranks was visible in the Filmstrip, and the second prank was right at the end of the video, making both mental and physical retrieval relatively easy. The data reflects this condition as this was the fastest video for Filmstrip.
4. The fourth video was a comedic instructional video of someone taking apart a camera. Along the way, the instructor would make mistakes that were very obvious, and he would disconnect things within the camera. The chosen theme for this video was the three mistakes made throughout the video. Like the second video, the clips were fairly spread out along the video; however, they were somewhat hard to see within the Filmstrip, which made it slightly more difficult using that method. 
Additionally, some participants forgot the location of the clips and were forced to look through the entire movie, like the first video.
5. The fifth video was a news segment on an Olympic hopeful wanting to enter the snowshoeing competition. The theme of the video was the six talking heads within the video. Of those, the participant was asked to find two specific ones. This ended up being very easy for History, as it was not visually taxing to find clips of a specific person. Again, however, one of the clips appeared in the Filmstrip, and the second clip was right after.
6. The last video was of the 2011 Stanley Cup riot in Vancouver, and the themes of the video were shots of fire fighters and police officers. The clips to be selected were the fire fighters. These clips were mostly within the middle 30 seconds of the video. For the Filmstrip, some participants started searching from the beginning of the video, as they did not remember when the first clip appeared. They were, however, very close together within the video and allowed for quick selection once they found the first clip. They were visually distinctive from the police clips, and were also easy to find within History.

These videos were chosen to represent real world video, and sometimes the video interval of interest occurs near the beginning or end of the video. Finding these intervals using the Filmstrip is much easier in these cases, which explains the variation in the times taken by Filmstrip selection. Choosing events that were more spread throughout the video would have minimized this effect. In Figure 4.2, we see that the variation in History is minimal, and History provides a consistent time expected to search for video intervals.

Watching the participants use each of the methods allowed us insight into the weaknesses of each technique. The Filmstrip provided a linear view of the entire video, and provided no clue as to the location of specific video clips that needed to be found.
As such, participants would often forget where the clips were and needed to search through the entire video. The time taken was further aggravated when the video clips were spread evenly throughout the video. It should be noted that our implementation of the Filmstrip did not feature a zooming function. Zooming would, however, likely result in longer times taken for the clip selection task because of the need to scroll horizontally to find clips, and would also hide clips from the user, making retrieval slightly more difficult. But if we were to extend this interface to work for many more videos, and especially those longer than a couple of minutes, zooming becomes a necessity to maintain accuracy in clip selection.

Figure 4.2: Descriptive Statistics for History vs Filmstrip Interval selection. Mean times are shown for each video and method. Numbers are in seconds.

The questionnaire results shown in Table 4.1 demonstrate the positive reaction of participants to our history-based authoring interface. The overall scores for the History method were all above 5 (on a Likert scale of 1 to 7). The Filmstrip method scored well but participants did not find it as easy to find video clips; this is supported by the quantitative data.

Observation of users in the free play section of the study showed a strongly positive reaction to using History. Eleven users created a proper history and used the view count filter as we intended. This agrees with the questionnaire, which indicates not only that the history is useful but also that a usable history can be easily created. Three participants created a history but did not use the filter to add clips, and instead created a playlist with whatever happened to be in the history. It is likely that these participants forgot to use the filter and did not realize it. Two of the participants used a combination of the two techniques.
For example, they used History as they were taught and, after using History, used the Filmstrip to find additional clips that they did not think to re-watch at the time. Finally, two participants were unreceptive to creating and using the history, and only used the Filmstrip to create the playlist. They stated that they did not think that the viewing pattern required by the interface fit with their own viewing behaviour, and that doing so would disrupt their video viewing experience.

Question                                   Mean   σ
Overall usefulness                         5.77   1.06
Overall ease of use                        5.61   1.29
Overall reaction                           5.47   1.19
Learning to use the interface              5.50   1.42
History is useful                          6.11   1.23
Creating usable history is easy            5.56   1.29
History filtering is useful                5.89   1.07
History filtering is intuitive             5.28   1.56
Finding history video clips is easy        5.44   1.89
Finding Filmstrip video clips is easy      4.50   1.89
Inserting history video clips is easy      6.44   0.70
Inserting Filmstrip video clips is easy    5.61   1.50

Table 4.1: The aggregated results of our questionnaire, with the mean score (Likert Scale, 1 to 7) and standard deviation (σ). Participants found the interface with the history to be useful for the creation of new videos. Their overall reaction to our novel video authoring mechanism was highly positive.

While these participants did not use History, it still recorded the video they watched. In this case, the history contained a single item for the entire video. If we extend the system to multiple videos, it would be easy to see which videos they liked, if the video was viewed more than once, and when they watched them. While only two participants used both techniques to create their own video in the free play, it is worth noting that History works well in conjunction with the Filmstrip.
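To make the idea of a view count filter over a navigation history concrete, the following Python sketch derives how many times each part of a video was watched from a list of playback intervals. It is a hypothetical illustration and not the thesis implementation (which was written in ActionScript 3): the function name, the representation of the history as (start, end) pairs in seconds, and the default threshold of two views are all assumptions.

```python
def view_count_segments(watched, min_views=2):
    """Return (start, end, views) segments seen at least `min_views` times.

    `watched` is a list of (start, end) playback intervals in seconds,
    one per uninterrupted stretch of playback (hypothetical format).
    """
    events = []
    for start, end in watched:
        if end > start:
            events.append((start, 1))   # playback enters this interval
            events.append((end, -1))    # playback leaves this interval
    # At equal times, -1 sorts before +1, so intervals that merely touch
    # are not counted as overlapping.
    events.sort()

    segments = []
    views = 0
    prev_time = None
    for time, delta in events:
        if prev_time is not None and time > prev_time and views >= min_views:
            last = segments[-1] if segments else None
            if last and last[1] == prev_time and last[2] == views:
                # Extend the previous segment when the view count is unchanged.
                segments[-1] = (last[0], time, views)
            else:
                segments.append((prev_time, time, views))
        views += delta
        prev_time = time
    return segments


# A viewer watches 0-50 s, seeks back to 30 s, re-watches 30-50 s, then continues.
rewatched = view_count_segments([(0, 50), (30, 50), (50, 120)])
# A viewer plays the two-minute video straight through without re-watching.
straight_through = view_count_segments([(0, 120)], min_views=1)
```

In this sketch, a viewer who never seeks back produces a single segment covering the whole video with a view count of one, which matches the behaviour described above for participants who did not re-watch; raising `min_views` leaves only the re-watched portions.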
Further integration between the two, such as placing markers in the Filmstrip indicating history intervals, may improve performance even more.

The experiment showed that users were able to make use of History and its filtering mechanism fairly effectively, and well enough that it was significantly faster than using the Filmstrip. Furthermore, it revealed that the type of video used also affects the performance of either method. This is illustrated by the interaction effect found between video and selection method. Allowing users both methods of selection showed that the two methods can also work in conjunction with each other.

4.3 Summary

This chapter introduced methods for browsing through history, which helped users make accurate interval selections. The filtering method allowed users to more easily set end times for the selected intervals, reducing the need to adjust the intervals in the playlist. We performed an evaluation of the interface, and found that users were able to use History effectively, and were able to select video intervals significantly faster than with the traditional Filmstrip method.

Chapter 5
Comparison of Implicit History and Explicit Favourites

In the last evaluation, it became clear that using the history to find specific video clips was much easier and quicker than using the filmstrip. It was noted by the users, however, that having to re-watch the video clips to input them into the history was slow, and sometimes unnecessary. They would like to manually create video clips without needing to re-watch them. This approach is similar to work by Li et al. [31]. In this work, we categorize the events in our videos into two categories: predictable and unpredictable.
Predictable events have a lead-up and are easy for users to foresee, whereas unpredictable events have no lead-up.

5.1 Experiment

A user study was carried out to evaluate the design and performance of our interface and the usefulness of using personal video histories and Favourites for video interval creation and retrieval. We allowed users to freely watch videos and create a history of video intervals that they would share with others. We conducted a qualitative user study to evaluate the use of History and Favourites in video interval search. Following our scenarios, we gave our participants videos to watch, categorized by the presence of predictable or unpredictable events.

5.1.1 Hypothesis

We hypothesize that participants use and prefer the History method for unpredictable events, and use and prefer the Favourites method for predictable events.

5.1.2 Apparatus

The experimental interface was implemented using Adobe Flash CS6 and ActionScript 3. The study was performed on a 15” MacBook Pro, with a screen resolution of 1680x1050 pixels. Participants used a regular Microsoft mouse to manipulate the contents of the screen. The application window was set to a size of 1280x720 pixels.

5.1.3 Participants

Eleven volunteers, 7 male and 4 female, participated in the experiment. They were monetarily compensated for their time. Participants ranged in age from 19 to 50. Each participant worked on the task individually. All participants were experienced computer users and watched videos at least 3-5 times a week.

5.1.4 Design and Procedure

Each subject was tested with two methods and eight videos. Each participant was exposed to each video only once. Two methods for video clip creation were tested: the History and Favourites. We performed a study where each subject used each method on separate videos. Each video was 2-3 minutes long. The experiment proceeded as follows:

1. The participants were given a training session. 
Participants were shown both methods for creating video intervals for saving, and they were given time to try them for themselves.
2. The participants were given two mock trials (but were not told that they were mock) to further familiarize themselves with the tools. For the first mock trial, they were given a predictable video and the re-watching feature to use. The option to view and create Favourites was disabled for this trial. They were given 1.5 times the length of the video to watch it and create a history by re-watching segments. During this time, they were asked to create at least three different history items.
3. After viewing, the participants were asked to locate two of the created intervals that the experimenter named.
4. The participants were then asked to fill out part of a questionnaire, which asked about the effectiveness of the method applied for that particular video.
5. The participants would repeat the task with Favourites enabled and the re-watching feature disabled with an unpredictable video.
6. The participants then repeated the task, twice with re-watching with both a predictable and unpredictable video, and then twice with Favourites, again with a predictable and unpredictable video. The opposite feature is always disabled.
7. After each trial, they were again given the questionnaire to fill out detailing their thoughts on using the method for creating video intervals.
8. They repeated the task with both methods enabled. They were given free rein as to which method they would like to use, and observations were taken to review how participants employed each method and when they used them.
9. 
They were then given a questionnaire detailing their preferences for each technique, and for which videos they thought each technique was effective for producing retrievable video clips.

The entire experiment took approximately one hour.

5.1.5 Videos Content

In total, the participants were given nine videos to watch, including one training video. There were two types of videos, which were categorized as having events that were predictable and unpredictable. Predictable videos can be described as videos that build up to a particular moment as the climax of the particular segment. Unpredictable videos have minimal build-up (at most 1 second) and are easily missed upon the first pass of the video. Of the eight trial videos users were given to watch, four were predictable, and four were unpredictable. The contents of the four predictable videos are as follows:

1. A hockey shootout video. Clips were shown with players preparing to shoot, skating up to the goaltender and shooting the puck. Multiple replays from different camera angles were shown after each shot.
2. Another hockey shootout video.
3. Three clips of guests getting scared by a man on the afternoon talk show hosted by Ellen DeGeneres. A man in a costume is shown sneaking up behind the guest and ultimately screaming at them while they are being distracted by Ellen.
4. Three clips of scientific experiments. All the clips include an explanation as lead-up to the climax. The first clip involves dropping rubidium in a bathtub, resulting in an explosion. The second clip involves a man walking on a non-Newtonian fluid. The third clip shows a rocket being fired along a track towards a car, ultimately destroying it.

The contents of the four unpredictable videos are as follows:

1. Three different clips from separate videos. The first clip shows two skateboarders attempting tricks. A horn sounds and the camera pans towards an elderly lady walking across the street. A car revs its engine, the lady hits the car, and the airbag pops open. 
The second clip is from a dash camera on a car driving along a road. From the left appears a spinning car, which passes the camera car, straightens out and returns safely to its lane. The third clip is of a Ferrari Enzo driving and crashing into a cement block.
2. Three different clips from separate videos. The first clip shows an excerpt from a documentary about whales. The video shows closeups of smaller fish swimming in groups and a giant whale coming out and filling the screen, shown opening its mouth and eating all the fish. The second clip shows a dance competition, where a teenager comes out and performs some seemingly underwhelming dance moves. He then breaks out and performs spins that show an enormous amount of skill. The third clip shows a flooded street. From the left, a car with an open trunk drives through the flood, and behind it is a man on a surfboard being towed through the water.
3. Three different clips from separate videos. The first clip shows a stream of water. A hand then appears and turns on a switch that produces a 24 Hz sine wave tone, augmenting the water flow in an interesting way. The second clip shows a stereo system playing a song on top of a desk. Soon after, the song produces multiple loud bass notes, shaking the desk uncontrollably. The third clip shows a washing machine in a yard running normally. Soon after, a person throws a brick into the washing machine, and it shakes uncontrollably until the drum is disconnected from the rest of the machine.
4. Three different clips from separate videos. The first clip is another dash-cam video. This time the car is following a large truck. It proceeds to turn left, and from behind the truck appears a smaller truck, causing a crash. The driver of the second truck pops out the front and is seen walking away from the accident unharmed. The second clip shows people fishing off a dock. As they try to use the net to pick up a fish, a shark appears and eats the fish. 
The third clip shows a girl gloating to the camera about selling a large amount of cookies. In the middle of her gloating, she slips and hits her head on the table in front of her.

5.2 Results and Discussion

The results in this experiment were almost entirely derived from the questionnaire. The questionnaire consisted of eight sections, one for each video. Each section consisted of three questions:

1. It was easy to predict when something was going to happen
2. I could easily create the video clips I wanted
3. I could easily find the video clips I wanted

The aim of Question 1 was to ensure that our own assessment of the predictability of each video was in line with reality. For what we selected as predictable videos, participants rated the videos on average from 5.8 to 6.5 on a scale from 1 to 7, where 1 was strongly disagree and 7 was strongly agree. For what we selected as unpredictable videos, participants rated them from 3.8 to 4.5 (neutral score). Using these metrics, we then asked users which method they preferred with respect to the two different types of videos, as shown in Table 5.1. The results agree with our hypothesis that History is preferred for unpredictable videos and Favourites for predictable videos. The table shows that three people preferred History for both. One reason was that they did not mind having to re-watch segments of video, because it was an easier method of saving clips as we only tested for video clips under 10 seconds. The other reason was related to the ease of use and number of clicks required to create video intervals. For the participant who said Favourites is better for unpredictable, the reason was that creating video intervals was imprecise using the History method.
Observation of the method this participant employed revealed that they would watch an event, seek back, and start Favourites precisely where they wanted.

Question                                  No.
re-watching is better for predictable     3
re-watching is better for unpredictable   10
bookmarking is better for predictable     8
bookmarking is better for unpredictable   1

Table 5.1: Questionnaire results for History vs Favourites Clip Selection. When questioned, most users stated that the use of re-watching was more suited towards videos that were unpredictable, and bookmarking was better for videos that were more predictable.

These results lead us to the observable usage patterns of Favourites creation, shown in Table 5.2. The results here show that at least half of the participants rewind and re-watch parts of video they liked in order to create video intervals, regardless of the video’s predictability. For unpredictable videos, this means that most of the participants (9 of 11) manually produced the same result as the re-watching history. The two participants who used Favourites without rewinding in the unpredictable videos were quick enough on the button that they could create video intervals during the event. It should be noted that the intervals created this way would be of lower quality (they would give no context, and be very difficult to understand). For predictable videos, the results are split between the two types of users. About half of the users (6 of 11) created what is essentially an identical re-watching history, and the rest (5 of 11) created no re-watching history at all. While these videos were predictable, and it was obvious something was going to happen, some participants did not create Favourites as they were watching the video and preferred to give each event a first pass before going back to Favourite them.

                             Type of video
Use of Favourites            Unpredictable   Predictable
rewind and then Favourite    9               6
Favourite straight through   2               5

Table 5.2: Results for use of the Favourites feature. 
When given the ability to use Favourites and a predictable video, the participants were split between re-watching interesting intervals to Favourite them and Favouriting while the video was playing. For unpredictable video, participants generally missed the interesting intervals and were forced to rewind and Favourite the interval on the second pass.

In the last two trials in the experiment, participants were given both methods to use freely when creating video intervals for both predictable and unpredictable videos. Results of the observation of usage patterns are shown in Table 5.3. Given the results, we can see that just over half of the participants (6 of 11) used the History exclusively and did not bother with the Favourites technique at all. When asked about this, the participants stated that the re-watching method was simple enough that taking a little more time to create video intervals was not bothersome. It would be interesting to investigate if this usage pattern holds true for videos with longer events. Another participant stated that using the Favourites feature required too many clicks to create intervals. Thus, contrary to our hypothesis, we have found that participants would rather use the re-watching method for creating short video intervals.

Method use when given both            No.
used re-watching history exclusively  6
used Favourites exclusively           3
used both where appropriate           2

Table 5.3: Results for use of History and Favourites when given both methods. When given both methods, participants were more likely to use the History exclusively for both videos regardless of the type of video they were watching. Others used Favourites exclusively, and only two participants used the intended technique (History for unpredictable, Favourites for predictable) for the videos.

When asked about possible improvements to the interface, one participant suggested the ability to seek while holding down the Favourites button. This would extend the ability for users to create a Favourite even when the event had already passed, without having to use the filmstrip to seek back. Another participant pointed out that the creation of the re-watched intervals was slightly confusing. With re-watching, it was clear where the start point of the video interval is, but not clear when the video clip would end. Additionally, one participant stated that the notification that Favourites was active (the star button glows) was not easy enough to see. It was suggested that an outline be drawn around the edge of the main player to more visually signify Favourites.

Given the feedback about the interface, it seems that the automatic creation of video intervals through re-watching is a preferred method of interaction. However, it lacks the precision of the Favourites feature. It is worth further investigation to improve it in this respect. Possible fixes include Favouriting re-watched clips and allowing editing of Favourited clips, or being able to collect clips into a playlist for further editing.

The experiment showed that participants agreed with our hypothesis in using Favourites for predictable events, and History for unpredictable events. However, we found that in practice, participants more often used History for their video interval creation. This reaffirmed the utility of the re-watching history, and shows that Favourites are not suitable for replacing the History, but instead complement the History and filmstrip selection methods.

5.3 Summary

This chapter introduced an alternative method for creating history by allowing users to manually create a history without having to re-watch video intervals. We predicted that this would allow users to preemptively save video intervals that they foresaw to be interesting.
We performed an evaluation to verify this, and found that while users stated that they agreed with our hypothesis, they behaved differently. Instead, users found it more convenient to re-watch intervals.

Chapter 6
Discussion and Conclusions

In this chapter, we present a discussion of the results of our experiments to evaluate a novel interface for video authoring using both a passive video viewing history and an active video viewing history. We also discuss the significance and contributions of the thesis, as well as a proposal for future research.

6.1 Discussion

In this thesis, we presented an answer to our research question:

Does integrating video history, based on tracking what a person is watching, into a video player provide an effective means for authoring for sharing online video content?

We have attempted to create an interface that is tailored to the way the human memory system works. Current video players and authoring tools provide some aid in helping users find video in the form of video previews and annotation, but none allow users to search for video intervals based on watching behaviour. In our interface, we adopt video previews, as well as annotation, but we present them in a form that users can easily interact with to search and insert into playlists for authoring and sharing.

The results from our experiments indicate that users are able to effectively and quickly author videos based on what they watched. In our first evaluation, we made a comparison between the use of history and the state-of-the-art selection technique, filmstrip, for video clip selection and playlist authoring. For all the videos tested, we found that the use of history was significantly faster in aiding users to search for specific video clips in previously seen video. This can be explained by looking at the amount of video the user has to search through. In the filmstrip, the user must manually reduce the search space by remembering specifically where the clips were in time in relation to the video.
With the history, we can dramatically reduce the search space and allow users to more easily find the clips they want.

In this evaluation, we found that creating a history can be inconvenient because of the need to re-watch video. For clips that do not need re-watching, this can feel cumbersome and slow. The evaluation itself was completed in a controlled environment and users were coached in advance on how to use the interface. To make the results more generalizable, a longitudinal study would be preferable. The evaluation also showed that the performance of the history differed with the type of video.

In our second evaluation, we looked to remedy this inconvenience and introduced a method to create history items without having to re-watch any video. The introduction of a favourites button allowed users to manually create history items. In our experiment, we looked qualitatively at using both the passively created history and the actively created history. In the videos we chose, we specifically looked for events that differed in their predictability of appearance. This experiment attempted to improve on the experimental design from the first evaluation, providing users with more videos that had affective intervals that were more evenly distributed through the video. We hypothesized that predictable events would be easier to capture using the new favourites, and that unpredictable events would be easier to capture by re-watching. These hypotheses held in principle (participants indicated that this is how they would use the system), but in practice, participants were more likely to use the history alone for their interval authoring tasks.
The favourites system is thus more complementary to the history than it is a replacement.

Supplying users with their video navigation history, along with manual methods to create it and proper methods for filtering and history management, helps users in video search for authoring tasks. These methods also work well with more traditional point-and-drag methods of video interval selection such as the filmstrip, and are ideally used in conjunction with each other to create a simple and pleasurable experience for video authoring and sharing.

6.2 Future Work

In this thesis, we developed an interface for authoring video using history, which was tested with a controlled set of videos and a limited number of participants. The next step in evaluating this interface is to expand on that and build a full-fledged application with access to a larger library of videos, for example those on YouTube, and to gather more data from a wider audience. To do this, we have started developing an application with greater functionality for the Apple iPad. This will allow us to more easily deploy the application through the Apple App Store, allowing us to reach more users.

Throughout the work, we have come up with some design guidelines for video browsing and history tracking applications, outlined in Chapter 3.2. Future work in this area would be to formalize these guidelines and investigate the implications of each of these design decisions with respect to memory, as well as repetition in human behaviour. This can then be extended into studies on different methods of using the history for different types of videos.

The contributions in this work have shown that giving users access to their video navigation history is beneficial for simple video authoring. Providing proper methods of filtering and managing this history is an essential part of video interval selection using this concept.

