Open Collections

UBC Theses and Dissertations

System architecture of WebSmart : a web-oriented synchronized multimedia authoring system Xu, Yue 1998

System Architecture of WebSmart - A Web-Oriented Synchronized Multimedia Authoring System

by

Yue Xu

B.S., FuDan University, 1991

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Master of Science

in

THE FACULTY OF GRADUATE STUDIES
(Department of Computer Science)

We accept this thesis as conforming to the required standard

The University of British Columbia
April 1998
© Yue Xu, 1998

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Computer Science Department
The University of British Columbia
2366 Main Mall
Vancouver, BC
Canada V6T 1Z4

Abstract

Embedding multimedia content such as audio and video into a Web page promotes wide applications on the Internet. However, presenting web-oriented interactive multimedia raises extra requirements on network bandwidth, synchronization mechanisms, presentation engines, and authoring environments. Moreover, the heterogeneity of the Internet environment calls for open applications to fulfill these requirements seamlessly on various platforms.

This paper proposes an approach to create Web-oriented synchronized multimedia documents by integrating the latest developments in Java, Java Media Framework, JavaScript and HTML. As a result, the WebSmart system is produced. It includes a presentation engine as well as an authoring environment, and it also realizes two levels of inter-object synchronization.
Designed using an Object-Oriented methodology and implemented in Java, it aims to be a platform-neutral application. This paper introduces the system architecture and the technology framework of the WebSmart system. Performance analysis and future work are also presented at the end of the paper.

Contents

Abstract
Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
  1.1 Problem domain
  1.2 Contributions of WebSmart
  1.3 Thesis outline
2 WebSmart Architecture
  2.1 Functional model
  2.2 Scenario of creating a presentation
  2.3 WebSmart synchronization statements
  2.4 Object model
    2.4.1 Entity objects and interface objects
    2.4.2 Class diagram
3 Technology Framework
  3.1 Java
  3.2 Java Media Framework (JMF)
  3.3 HTML
  3.4 JavaScript
  3.5 An example of integration
4 Performance
  4.1 Multi-platform implementation
  4.2 HTTP server
  4.3 An application - a web-based JMF course
  4.4 An Audio-only option
  4.5 Other issues
    4.5.1 Compression algorithms
    4.5.2 General purpose web server vs. multimedia specialized web server
    4.5.3 Network connection
5 Related Work
  5.1 Authorware
  5.2 RealSystem
  5.3 NetShow
  5.4 SMIL
  5.5 Summary
6 Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work
    6.2.1 Extensible Markup Language (XML)
    6.2.2 Cascading Style Sheet (CSS)
    6.2.3 Document Object Model (DOM)
Bibliography
Appendix A
Appendix B
Appendix C

List of Tables

TABLE 1. JMF implementations

List of Figures

FIGURE 1. Functional Model
FIGURE 2. wsEditor
FIGURE 3. Section window
FIGURE 4. Object Properties window
FIGURE 5. Timing window
FIGURE 6. VideoFrame window
FIGURE 7. The wsEditor main window after an object Notes is added
FIGURE 8. PageStyle window
FIGURE 9. Objects Layout window
FIGURE 10. CaptionMaker window
FIGURE 11. QuizDesigner window
FIGURE 12. Question window
FIGURE 13. Notation of entity objects and interface objects
FIGURE 14. An example of interface objects and entity objects
FIGURE 15. Object Model Notation
FIGURE 16. Object Model
FIGURE 17. The output HTML files of WebSmart
FIGURE 18. Control flow in WebSmart

Acknowledgments

I owe a great deal of gratitude to my thesis supervisor, Dr. Gerald Neufeld. It was he who led me into this interesting area, multimedia on the Internet, the area which I might take as my future professional career. He has provided generous support as well as useful suggestions on the WebSmart project. I'm also grateful for his patience in reviewing my thesis.

My gratitude also goes to Dr. Norm Hutchinson, who kindly took the time to be the second reader of this thesis, and gave many helpful comments.

Thanks to Lan Li, my partner in this project, for the effective cooperation between us. This one year of busy but enjoyable work is a precious experience to me, and I learned a lot from her diligent research attitude.

Finally, I wish to thank my parents and my husband for the understanding and love they are always ready to give me. Every piece of my progress belongs to them.

Yue Xu
The University of British Columbia
April 1998

Chapter 1 Introduction

The advent of the Internet has greatly changed the way in which people get information and knowledge. Today more and more people are retrieving and sharing information from the World Wide Web. With the development in digital audio/video media, a new type of web page which presents multimedia content such as audio and video is emerging.
Such multimedia presentations will extend or even revolutionize the educational system, as education is no longer limited to the "classroom" format in which all the students and instructors are required to gather in one location for a specific period. A remote student can attend a class at his own convenience and pace. Moreover, the class will remain effective as the number of students increases. In addition to this distant learning, multimedia web presentations also have other applications such as on-line publication/advertisement, corporate training and product demonstration.

1.1 Problem domain

Multimedia, as defined in [1], has three characteristics:

• Multiple media (for example, audio, video, data, text, and graphics), which center on including such types as audio or video for which time is crucial;
• Interactivity, rather than simple one-way broadcasting;
• Networked offerings, as contrasted with those that stand alone.

In order for multimedia presentations to be effective, the following challenges need to be addressed:

• Bandwidth
The current Internet is marked by its limited bandwidth and unstable transmission. The situation will worsen when it transports voluminous video files. Most present public access to the Internet is via either 10-Mbps LAN/Ethernet or, typically, a 28.8-Kbps home-use modem. On the other hand, video formats require a high bandwidth. For example, the International Organization for Standardization (ISO)'s Moving Picture Experts Group (MPEG) [2] requires 1.2 to 40 Mbps, Intel's Digital Video Interactive (DVI) requires 1.2 to 1.8 Mbps, and the International Telecommunication Union (ITU)'s H.261 [3] requires 0.064 to 2 Mbps. Even in 10-Mbps LANs, which allow for several parallel compressed video streams, the performance is nondeterministic due to mechanisms such as Carrier Sense Multiple Access with Collision Detection (CSMA/CD).
This will be briefly discussed in Section 4.5 (Performance analysis).

To solve this bandwidth problem, Internet-oriented compression algorithms are being designed by companies and international standards organizations. Such algorithms include RealNetworks' RealSystem codec, ISO's MPEG-4, and ITU's H.323/H.324 standards.

• Synchronization
Unlike a document which contains only text and images, a multimedia document which contains audio/video defines a new temporal axis along with the traditional spatial axis. Audio and video are both time-sensitive media formats. As time goes on, various components within a single document might drift in time away from each other, therefore creating an inconsistent effect for the document viewer.

Blakowski [6] defined two levels of multimedia synchronization, intra-object and inter-object synchronization. The former is concerned with the time relations within one media object, and the latter is concerned with time relations between two or more media objects.

Before the temporal relationships can be specified within a multimedia application, a model has to be set up as the formal notation to describe the synchronization activities. For this purpose several models have been created, such as a timeline model in [7], a Petri-net model in [8], and a Media Relation Graph in [9].

• Presentation
Another issue for a multimedia web presentation is the mechanism to play back the video and audio clips within a browser such as Netscape's Navigator or Microsoft's Internet Explorer. The display of text and images is static, and these elements are embedded in HTML files by special tags. However, for continuous media such as audio/video, a VCR-like interface providing controls such as Start/Pause, Stop, FastForward, and Rewind is preferable for ordinary users.

Plug-ins are a widely used approach to presenting continuous media content within a web browser.
A plug-in [11] is a separate code module that behaves as though it is part of a browser. Being a code library whose source conforms to standard C/C++ syntax, a plug-in's file type depends on the platform, for example, a .DLL (Dynamic Link Library) on Windows, or a .SO or .DSO on Unix. Plug-ins can be embedded in HTML files. When the web page viewer goes to a page that contains embedded data of a media type that invokes a plug-in, the browser usually loads the plug-in code into memory, then creates and initializes a new instance of the plug-in. When the viewer leaves the page or closes the window, the instance is discarded. When the last instance of a plug-in is deleted, the plug-in code is unloaded from memory. Different vendors provide plug-ins according to different APIs; some widely used multimedia plug-ins include RealNetworks' RealAudio and RealVideo [22], and Apple's QuickTime [12].

In this thesis, however, we'll introduce another multimedia presentation approach which leverages the recent progress in Java APIs and which provides a standard interface to presentation engine designers.

• Authoring environment
The Hypertext Markup Language (HTML) [10] defines the layout and spatial relationship of components within a text-based web document. When it comes to multimedia documents, where a temporal relationship exists as well as the spatial relationship, more functionality is needed in assembling and synchronizing the multiple components of a page. In addition, as the WWW is used by more non-professional people, the authoring platform is required to be easy to use as well as versatile.

Authoring Systems, as defined in [13], are "software tools that enable users to create interactive instruction without programming, thereby allowing those who lack either access to programmers, time to program, or interest in learning programming to engage in courseware development."
Unlike programming or authoring languages such as C or HTML, which require coding, authoring systems provide on-screen tools to help users enter the content of the document and organize the layout.

Besides the basic authoring functions described above, a multimedia authoring environment should also provide a mechanism to set up the synchronization events and define interactions between a web page and its viewers. Currently two complementary methods are being used, scripting languages and declarative languages. Examples of scripting languages are JavaScript [17], VisualBasic [18] and Macromedia's Lingo [19], while examples of declarative languages are HTML, QuickTime and SMIL [24].

From the above discussion, we conclude that supporting integrated multimedia documentation on the web requires more than a suitable network with sufficient bandwidth. Synchronization models, presentation engines, and authoring environments are all among the new techniques to be developed.

1.2 Contributions of WebSmart

WebSmart, as its name implies, is a Web-oriented synchronized multimedia authoring system. It was designed to provide the following functions:

• A presentation engine to play back the video content within a web browser without extra plug-ins. WebSmart uses a light-weight Java Applet to achieve the following functions:
  • Play back the continuous media from the network.
  • Control the presentation of the documents.
  • Synchronize the components.
  • Tune the performance according to the available bandwidth.
• An authoring environment to create the multimedia web documents. It provides the following functions:
  • The document created is composed of several components that are accessible via a URL, e.g. files stored on a Web server.
  • The components have different media types, such as audio, video, image or text.
  • The synchronization among components can be generated in a visual way in the authoring environment.
  • Interactions between the document and its viewer can be pre-set.
  • The layout of the document page is easy to handle. Users can set the position and size of each component.

The major contributions of WebSmart are:

• It creates a simple document model for web-oriented multimedia presentations. This model uses a distant learning scenario, and embeds components such as Table of Contents, Media and accompanying HTML pages, which are distinct in meaning and easy to apply to other applications.
• It provides a series of authoring tools (wsEditor, CaptionMaker, and QuizDesigner), and multimedia presentations can be created in a WYSIWYG manner using these tools.
• Besides the authoring tools, it also offers alternatives for experienced web authors. A set of synchronization statements is designed for creating more flexible presentations, and these statements use existing technologies (such as JavaScript) which are well understood and do not require extra learning time.
• WebSmart provides a platform-neutral approach to create and present multimedia Web documents by integrating Java [14], Java Media Framework [16], JavaScript and HTML. The authoring system runs seamlessly on various platforms including Unix and Windows95/NT, and the presentations it creates are accessible from ordinary browsers without installing special plug-ins.
• With its synchronization model and algorithms, the system realizes two levels of inter-object synchronization: flipping and lip sync. Moreover, the synchronization events can be created easily within the WebSmart authoring environment without extra coding in either system programming languages or scripting languages.
• The system can be easily extended due to its object-oriented design and 100% Java implementation.

1.3 Thesis outline

This thesis focuses on the WebSmart architecture and its authoring environment, with a brief description of the synchronization mechanism. For detailed synchronization models and algorithms, see [20].
The rest of the thesis is organized as follows: Chapter 2 gives a detailed description of the system architecture, Chapter 3 introduces the technology framework of this project, Chapter 4 analyzes the performance of the system, Chapter 5 reviews the related work, and finally Chapter 6 draws the conclusions and proposes future work.

Chapter 2 WebSmart Architecture

To manage the difficulty in the development process and extend the flexibility of the system, WebSmart adopted an object-oriented design methodology. Object-oriented design divides the complexity of the system into smaller modules with interfaces defined clearly among them. This helps different members of a development team to coordinate and communicate with each other. Furthermore, multimedia authoring is a new and quickly developing area, which requires the architecture to support easy extension. Object-oriented design models software systems as collections of cooperating objects which communicate with each other by passing messages. Compared with other design methods such as top-down structured design and data-driven design, object-oriented design adapts more readily to extensions, which are made by simply adding new collections of objects.

This chapter discusses the architecture of the system in a top-down sequence. Section 2.1 analyzes the high-level functionality of the system with regard to the requirements for multimedia authoring systems (refer to Chapter 1). Section 2.2 complements Section 2.1 with a scenario of creating a WebSmart multimedia presentation. Section 2.3 introduces the WebSmart-defined synchronization event API, which provides an alternative to the scenario described in Section 2.2. Section 2.4 addresses the low-level implementation of the system using an object model, and refines each functional module defined in Section 2.1.

2.1 Functional model

WebSmart was primarily designed for distant learning applications.
However, it can also be applied to other applications such as corporate training and on-line publication/advertisement. Each presentation generated by WebSmart consists of three types of components:

• Table of Contents (TOC) outlines the major topics of a presentation. Just as a speech or a lecture is composed of several parts, a WebSmart presentation is composed of several sections, where each section is composed of one or more video clips. After creating the outline of a presentation, authors can choose to hide or show the TOC on the generated web page.
• Media (Audio/Video) presents the audio/video content. It cannot be hidden in the web page created, because synchronization events taking place in other components of the page are all driven by this Media object. It is feasible to include more than one Media object within a WebSmart presentation and present audio/video streams in these Media objects in a synchronized way. In the current distant learning application we built (refer to Section 4.3), there is only one speaker (the instructor) in the class, therefore the web page includes only one Media object.
• Accompanying HTML pages include all the other objects which simulate the overheads and notes in a real classroom course. Located in neighboring frames in a web page along with the TOC and Media, these HTML pages can be flipped as the presentation video plays, therefore creating a virtual classroom atmosphere.

The whole system (Figure 1) is made up of an authoring environment and a presentation engine. The authoring environment creates a set of HTML files for a presentation. The HTML files are transmitted to the Web browser at the client site. The media content embedded within the HTML files is presented by a Java Applet (Display Applet) which is also transferred to the client site. The authoring environment consists of four parts: synchronization mechanism, layout module, QuizDesigner and wsEditor.
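
The three component types above can be pictured as a small data structure. The sketch below is only an illustration: the field names (toc, sections, clips, media, pages), the second section, and the helper clipsInOrder are assumptions made for this example, not WebSmart's actual classes or file format.

```javascript
// Hypothetical model of a WebSmart presentation: a Table of Contents made of
// sections, each holding one or more clips, a Media object, and the
// accompanying HTML pages. All names here are illustrative assumptions.
const presentation = {
  toc: {
    visible: true, // authors may choose to hide or show the TOC
    sections: [
      { name: "Introduction", clips: ["movies/sec1.mpg"] },
      // The second section and its clip names are invented for the example.
      { name: "Second section", clips: ["movies/sec2.1.mpg", "movies/sec2.2.mpg"] },
    ],
  },
  media: { visible: true }, // the Media object drives all events and stays visible
  pages: ["slide1.html", "slide2.html", "slide3.html"], // accompanying HTML pages
};

// Flatten the sections into the order in which their clips would be played.
function clipsInOrder(p) {
  return p.toc.sections.flatMap((section) => section.clips);
}

console.log(clipsInOrder(presentation));
```

Keeping the clips inside their sections mirrors the description that a section is composed of one or more video clips, while the Media object stays separate because it is the component that drives the others.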
WebSmart supports two levels of inter-object synchronization, namely flipping pages and lip synchronization. The former is a form of interaction described as "15 seconds into video-clip 1, show slide_1 in the Notes Frame and title_1 in the title Frame; ... 37 seconds into video-clip 2, show slide_5 ...". The latter realizes a finer-grained form of synchronization. The Media frame in a WebSmart page includes an optional text field which displays the closed caption of the audio/video content. As the media clip is being presented, the closed caption field changes its text at a pace corresponding to the movement of the narrator's lips. This closed caption feature enables deaf and hard of hearing people to watch a multimedia-embedded web page, improves people's reading scores and helps them understand/learn foreign languages. Closed captions are signals that arrive with video/audio. They require a special decoder to make them visible, and can be turned on or off by the user.

The layout module provides a visual tool for authors to arrange the layout of exported web pages. Authors can add/delete multimedia components on the page, change the sizes and positions of these components, and hide/restore components on the page. All these editing actions are done visually in the wsEditor. When authors export the multi-component setting, a multi-framed web page will be created in a set of HTML files.

The QuizDesigner creates a kind of HTML file with more intensive interactions than ordinary pages. Besides the flipping pages scenario described above, the QuizDesigner realizes scenarios like "20 seconds into video-clip 3, pause the video till the viewer answers the question in slide_10; ...", and like "if the viewer succeeded in more than 5 of the 10 questions contained in slides 7 to 16, display video clip 5 in the Media Frame; otherwise display video clip 6".

The wsEditor is the central part of the authoring environment. It collects synchronization events generated by the Sync module and layout information obtained from the Layout module. The QuizDesigner creates one HTML file for each of the questions in a quiz; these HTML file names will be referenced in the wsEditor when a quiz section is created. After a WebSmart document is exported, the wsEditor creates a set of HTML files to be transmitted to the web browser.

Besides the exported HTML files mentioned above, the authoring environment also creates two other formats of output. The QuizDesigner creates a .quiz file for each quiz. This .quiz file is kept for future modifications of the quiz. The wsEditor also creates a .ws file for all the sections, events and layout information within a WebSmart presentation.

FIGURE 1. Functional Model (authoring environment: Sync mechanism, Layout, QuizDesigner, wsEditor; presentation engine: HTML files, media files, Web browser, Display Applet), where 1 represents "synchronization events", 2 represents "layout information", 3 represents "HTML files for quizzes", 4 represents ".ws files", and 5 represents ".quiz files".

2.2 Scenario of creating a presentation

WebSmart is designed to be an authoring environment which adapts to different levels of web page designers. An author without any programming experience can create his presentation exclusively within the system.
The complete process includes entering sections and their video clips, inserting accompanying HTML frames and their synchronization events, designing the page layout, entering caption text and setting the caption pace, and finally designing the questions, if there are any. An author with HTML experience can build his own multi-framed web page for a more dynamic layout. An author with JavaScript experience can create his own synchronization mechanisms using a simple form of event API defined by WebSmart. The HTML pages displayed in accompanying frames can be created in most HTML editors such as Netscape's Communicator.

This section illustrates the steps of creating a multimedia presentation. The next section will describe the event API provided by WebSmart.

1. Start the Editor
The wsEditor opens a window as depicted in Figure 2. The two default frames in the initial window are Table of Contents (left) and Media (right).

FIGURE 2. wsEditor

2. Add in Sections
To add video/audio clips into the document, authors must set up section(s) in the "Table of Contents" to include these clips. However, the "Table of Contents" panel can be hidden later from Object Layout. The information of a section is added in the Section window as depicted in Figure 3.

FIGURE 3. Section window (fields: Section Number, Section Type, Section Name, Video Files, Audio Files, Image Files, Caption Files)

If you prefer using audio-only files rather than video files for the presentation, you can put the audio-only files in the "Video Files" field, and leave the "Audio Files" field empty. Audio-only and image files can act as alternatives to the video files when the network bandwidth is insufficient (refer to Chapter 4, Performance and Future Work). Caption files can be created by the CaptionMaker (refer to step 8).
If you have more than one video/audio clip for a section, use ";" to separate the clips. For example, you can have the following entries:

Video Files: movies/sec1.1.mpg;movies/sec1.2.mpg;movies/sec1.3.mpg
Audio Files: movies/sec1.1.mpa;;movies/sec1.3.mpa
Image Files: movies/sec1.gif;;movies/sec1.3.gif

In this case, three video clips (sec1.1.mpg, sec1.2.mpg, and sec1.3.mpg) will be played sequentially in this section. Note that the second video file does not have a corresponding audio-only file.

3. Add in Objects
Objects are accompanying components that go along with a Media component. When exported, these objects become frames in a web page. In a distant learning context, these objects can act as course notes and images which will be flipped as the video is presented in the Media frame; therefore the whole page simulates a live classroom environment.

By selecting the "Add Object" item from the Object menu in the wsEditor (Figure 2), authors can enter object information in the "Object Properties" window (Figure 4). There are three types of objects that can be added, namely, Video/Audio (for creating the Media frame), HTML (for creating accompanying components for the video/audio), and Table of Contents. The Layout item allows authors to choose the exported web page style, that is, whether the multiple components of a web page are put into frames or layers. The Frame is the layout style used widely in most current web pages, and the Layer is a new technique defined in Dynamic HTML (DHTML) which is not yet supported in most commercial browsers. DHTML will be discussed in detail in Chapter 4 (Performance and Future Work).
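
The semicolon-separated entries used for sections with several clips can be read positionally, with an empty entry meaning that a clip has no alternative file of that kind. The following is a minimal sketch of that reading; the function names parseClipList and buildClips and the shape of the resulting objects are assumptions for illustration, and WebSmart's own parsing is not described in the text.

```javascript
// Split a semicolon-separated field into one entry per clip.
// An empty entry is kept as null so that positions stay aligned across
// the Video Files, Audio Files and Image Files fields.
function parseClipList(field) {
  return field.split(";").map((entry) => {
    const name = entry.trim();
    return name === "" ? null : name;
  });
}

// Combine the three fields so each video clip carries its optional
// audio-only and image alternatives (null when none was entered).
function buildClips(videoField, audioField, imageField) {
  const audios = parseClipList(audioField);
  const images = parseClipList(imageField);
  return parseClipList(videoField).map((video, i) => ({
    video,
    audio: audios[i] || null,
    image: images[i] || null,
  }));
}

// The example entries from the Section window description.
const clips = buildClips(
  "movies/sec1.1.mpg;movies/sec1.2.mpg;movies/sec1.3.mpg",
  "movies/sec1.1.mpa;;movies/sec1.3.mpa",
  "movies/sec1.gif;;movies/sec1.3.gif"
);
console.log(clips);
```

Keeping empty positions as null rather than dropping them is what lets the second clip end up with no audio-only or image alternative, as in the example entries.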
FIGURE 4. Object Properties window (fields: Object Name, Type, Layout, Height, Width, Number of Pages, Timing)

The "Timing" window (Figure 5) pops up after the "Timing" button in the "Object Properties" window is clicked. The Timing window requires information describing the synchronization events taking place in the current object. The three entries in the first row of the "Timing" window in Figure 5 set up a flipping page event for the Notes object defined in Figure 4. The event is to display slide1.html in the Notes frame when movies/sec1.mpg reaches the 5th second of the video stream.

Page#   Source (URL)   Start at (Sec)   Video Clip
1       slide1.html    5                movies/sec1.mpg
2       slide2.html    10               movies/sec1.mpg
3       slide3.html    15               movies/sec1.mpg

FIGURE 5. Timing window

4. Synchronization between components
The synchronization information can be entered in the Timing window. It can also be generated automatically in the "VideoFrame" window. When you click the "Play Clip" button in the Video/Audio Panel (Figure 2), or play the video files of a specific section by selecting the section in the Table of Contents list and clicking the "Play Section" button, a VideoFrame window (Figure 6) is opened.

First, click the "Play" button to start the video playback, or move the slider to the desired position in the clip. When the video stream is at the point where a synchronization event is desired, click on a source URL in the corresponding object list within the wsEditor window (the source URL is entered in the Timing window), then click the "Timing" button below this object. The captured synchronization event time is then updated in the object list.

FIGURE 6. VideoFrame window

Figure 7 displays the look of the wsEditor main window after an object Notes is added and the synchronization information is generated from the VideoFrame window.

FIGURE 7. The wsEditor main window after an object Notes is added

5. Set the HTML page style
Users can choose their favorite HTML page styles. The objects (including Table of Contents, Video/Audio Panel, and accompanying HTML objects) are arranged by framesets in the exported HTML file. Users can select whether the framesets are separated by rows or columns, and they can also decide the size of each frameset, as in Figure 8. The layout set in this figure separates the two framesets of the web page by rows, and each frameset takes half the height of the entire page.

FIGURE 8. PageStyle window

6. Set the object layout
Besides page style, object layout can also be set by authors as in Figure 9. Users can swap the positions of several objects by exchanging their position IDs in the Objects Layout window, and can also set the size of each object. If authors want to hide a specific object (including Table of Contents), they only need to set its height or width to zero. The hidden objects can be recovered by resetting their sizes.

FIGURE 9. Objects Layout window (columns: Object, Position, Height, Width; objects shown: syllabus, media_app, Notes)

7. Save and Load a .ws file
WebSmart defines a specific file format (.ws file) to save the intermediate results of the user's input. Authors can save the current document and resume the editing in the future.
The final web-accessible HTML files are created by exporting the .ws file from the wsEditor's main menu (Figure 2).

8. CaptionMaker
The CaptionMaker is a visual tool (Figure 10) that makes closed captioning an easy task. It synchronizes caption texts with video/audio, and stores the encoded texts in .cap files. The .cap files are attached to their video/audio files in the WebSmart Editor to add closed captions to the multimedia presentations.

FIGURE 10. CaptionMaker window, where:
Caption_field, the single-line text field under the picture frame, is for displaying and editing a caption line for the current clip segment.
File_view, the text area beside the picture frame, displays the entire caption file with the current caption line highlighted.

Audio/video clips are played back in either step or continuous mode. In step mode, the playback pauses at the pace set by the user, waiting for the user to enter the text corresponding to this short segment of the media content, before continuing to the next segment of the media.
The continuous mode, on the other hand, allows the user to review the caption text entered and make modifications.

9. QuizDesigner
The QuizDesigner maintains a list of questions (Figure 11), and each question is edited in a Question window (Figure 12). In the Question window, users enter the name of the HTML file in which the question is saved, the question body, the question type (an input question or a multiple-choice question), the expected answer, the time point at which a video clip should be paused, and the name of that video clip. For multiple-choice questions, users also have to enter the content of the choices. The quiz and its questions can be saved in a .quiz file; the HTML files of the questions are generated by exporting the .quiz file.

[Screenshot: Quiz Design window with the Question List and the Add Question, Edit Question and Delete Question buttons.]
FIGURE 11. QuizDesigner window

[Screenshot: Question window with fields for the question file name, question body, type (Input or Choice), choices A to D, expected answer, pause time, and video clip.]
FIGURE 12. Question window

2.3 WebSmart synchronization statements
WebSmart adapts to users of different levels by offering different levels of interface. Section 2.2 illustrated how to create a web presentation exclusively with the system's visual tools. However, authors with programming experience can customize both the page layout and the synchronization setup of their documents. To make more complicated page layouts, authors can refer to HTML manuals. To customize the synchronization setup, WebSmart provides an event API for maintaining event tables and keeping statistics on page viewers' interactions.
Instead of defining a complicated scripting language as most commercial products do, WebSmart's synchronization event API consists of only three basic statements, namely clearAllEvents(), addEvent() and endEvent(), and five extended statements for the QuizDesigner, namely quizReset(), one_more_quiz(), one_more_correct_answer(), getQuestionsTaken() and getCorrectAnswers(). These statements are embedded in JavaScript functions within HTML files. The syntax of these statements is as follows:

    clearAllEvents( );
    addEvent(JS_function, URL_flip, VideoClip, EventTime);
    endEvent( );
    quizReset( );
    one_more_quiz( );
    one_more_correct_answer( );
    getQuestionsTaken( );
    getCorrectAnswers( )

The semantics are as follows:
• clearAllEvents() informs the system to build new synchronization event tables for the current section of the presentation. The event table is discussed briefly under the WS class in Section 2.4.2 (Class diagram). For implementation details of the synchronization event table, please refer to [20].
• addEvent() adds a synchronization event to the event table. The scenario of the event is described as "asking the browser to execute the JS_function when the VideoClip has been displayed up to the EventTime". Usually the JS_function asks the browser to flip the content of an HTML frame within a web page, and URL_flip indicates the URL of the new content of the frame.
• endEvent() informs the system to close the event-adding process and trigger the initialization processes.
• quizReset() starts a new quiz, which consists of a set of questions.
• one_more_question() informs the system that one more question has been taken. This statement, as well as the one_more_correct_answer() described next, is used to report quiz information and create statistics for the quiz summary page.
• one_more_correct_answer() informs the system that the page viewer has given one more correct answer.
• getQuestionsTaken() returns the number of questions taken.
This statement, as well as the getCorrectAnswers() described next, is used in creating the quiz summary page.
• getCorrectAnswers() returns the number of correct answers the page viewer has given.

A typical HTML file generated by the WebSmart authoring environment is section1.html in Appendix A. Authors can also create or modify this kind of HTML file in a plain text editor. Note that in the sample file, "changetitle()" and "changeNotes()" are two JavaScript functions, and "title" and "Notes" are names taken from two WebSmart objects in the presentation page. Chapter 3 discusses in detail how WebSmart integrates Java, JavaScript, Java Media Framework and HTML.

2.4 Object model
Whereas the functional model (Section 2.1) defines the functionality of the system, the object model describes the static data structure of objects, classes, and their relationships to one another. It is a refinement towards the implementation phase in the project life cycle. In this section we'll first distinguish the roles of entity objects and interface objects, then give the class diagram of the system, focusing on entity objects.

2.4.1 Entity objects and interface objects
According to Jacobson [25], objects within a software system can be divided into three classes based on their functionality: control objects, entity objects and interface objects. Interface objects describe the bidirectional communication between the system and its users. Entity objects are used to model the information that the system will handle over a longer period of time, and also the behavior that naturally belongs to this information. In some complex cases, there often remains behavior that is not naturally placed in either interface objects or entity objects. Such behavior is placed in control objects. This thesis adopts the following notation in describing the relationship between these objects (Figure 13).

[Diagram legend: the symbols for Entity Object and Interface Object.]
FIGURE 13.
Notation of entity objects and interface objects

In WebSmart, we use only entity objects and interface objects. For each information structure, we define an entity object (named unit, where unit is the name of the information structure, such as Section or WSObj) to maintain its information, and an interface object (named unit_GUI) to handle the communication between the system and its users. The entity object and its corresponding interface object exchange information via their public methods.

Figure 14 depicts an entity object (WSObj) and its corresponding interface object (WSObj_GUI), as well as their relationship. The WSObj_GUI pops up a window for the user and collects the user's input, such as text and checkbox selections. The property information of the WSObj collected in the WSObj_GUI is then reported to the WSObj. The WSObj retrieves the property information, operates on it, and communicates further with other entity objects in the system. The WSObj also feeds information back to the WSObj_GUI: when a WSObj_GUI opens an existing WSObj, it gets the object name, height, and width from that WSObj.

The purpose of encapsulating the user interaction separately from the entity information is to protect the entity data from unintentional modification from outside the system, and also to establish a clearer interface between different developers within a team.

[Diagram: the user enters text and checkbox input into the WSObj_GUI, which exchanges name, height and width with the WSObj; the WSObj communicates with other entity objects.]
FIGURE 14. An example of interface objects and entity objects

2.4.2 Class diagram
Since the only role interface objects play is to pass users' input to entity objects, and the major functions of the system are realized in entity objects, we'll focus on entity objects in the object model. We adopt Rumbaugh's Object Model Notation [26] in our object modelling. Figure 15 depicts the basic notation we use.
An object class describes a group of objects with similar properties (attributes), common behavior (operations), common relationships to other objects, and common semantics. An association describes a group of links with common structure and common semantics; it often appears as a verb in a problem statement, such as "class A uses methods defined in class B". An aggregation is the "part-whole" or "a-part-of" relationship in which objects representing the components of something are associated with an object representing the entire assembly. Generalization and inheritance are powerful abstractions for sharing similarities among classes while preserving their differences. Generalization is the relationship between a class and one or more refined versions of it. The class being refined is called the superclass and each refined version is called a subclass. Each subclass is said to inherit the features of its superclass. The solid balls are multiplicity symbols (zero or more). The "1+" sign means "one or more" multiplicity.

[Diagram: notation for a class (class name, attribute: data_type, operation(arg_list): return_type), generalization (a superclass with Subclass-1 and Subclass-2), association (Class-1 and Class-2 joined by an association name), and aggregation (an assembly class with Part-1-Class, Part-2-Class and Part-3-Class, one part marked 1+).]
FIGURE 15. Object Model Notation

As introduced in the functional model, the WebSmart system is made up of a presentation engine and an authoring environment. The authoring environment in turn is implemented in five blocks: wsEditor, Synchronization mechanism, GUI and Layout, QuizDesigner, and Utilities. The first four blocks correspond to modules in the functional model defined in Section 2.1. The Utilities block provides services, such as printing debug information, to the other blocks. There are also other associations among the blocks. Figure 16 depicts the object model.
• Presentation Engine
The presentation engine is an independent block in the object model. It is realized by three classes: Display, EventGroup and EventThread.

The Display is the applet embedded in HTML files. Its major functions fall into three categories:
  • Interfacing with the HTML file. The applet reads the parameters listed in the HTML file and supplies these values to the media player. These parameters include the video file name, the audio-only file name, and the caption file name.
  • Controlling the media player. The media player which presents the media content is created by the Display applet. This player realizes all the controlling functions provided to the end user. These functions include start, stop, suspend, resume, fast-forward, and rewind. In addition, an "audio-only" option (refer to Chapter 4, Performance) is supported in the Display applet.
  • Interfacing with JavaScript. The interaction between the web page and its viewers is realized by the communication between Java and JavaScript (details will be given in Chapter 3, Technology Framework). The JavaScript-to-Java communication is realized in the public functions provided by the Display applet.

The EventGroup is used by the Display to control the multiple events taking place during a presentation. As there are multiple frames within a web page, and each frame can define synchronization events associated with the media presentation, the EventGroup provides consistent management of these events. Its major function is to pass the controlling operations to each thread member within the group.

An EventThread object takes care of the synchronization events taking place within a single frame. It can be a member of the EventGroup. The major function of an EventThread is to wait for the time set by an event and execute the event by calling JavaScript functions.

• wsEditor Block
wsEditor assembles a presentation's major parts, including WS, WS_Panel, ContentPanel, Section, VideoPanel, and WSObj.
A WS object represents a WebSmart presentation, and it is usually made up of multiple WS_Panels. The major role of the WS object is to keep track of all the panels and objects on the page, and finally to export them to HTML files. The WS object also maintains a hashtable for synchronization events. This event table is indexed by video clip name and, for each video clip, saves a vector of the WSObj objects which have synchronization events associated with that clip. When a section HTML file is written, a Section examines its video clips, extracts the synchronization events for each WSObj associated with each video clip, and inserts the events into the HTML file using the synchronization statements defined in Section 2.3.

A WS_Panel object is physically displayed as a layout frame, and functionally embeds multimedia content. According to the type of multimedia content it embeds, a WS_Panel can be realized as a ContentPanel, a VideoPanel, or a WSObj. WS_Panels are defined recursively: a WS_Panel can contain other WS_Panels, so a hierarchical structure is formed. The WS_Panels at the top level of the hierarchy are exported to the framesets in the final presentation page, and the WS_Panels at the lower levels are exported to frames.

A ContentPanel corresponds to the Table of Contents in our distance learning context. It contains at least one Section. Its major function is to maintain all the sections of a presentation. Besides the addition and deletion of sections, the ContentPanel also maintains the hierarchy of the sections within a presentation. The numbering of a section (e.g. "1.", "1.1" or "1.1.1") and its indentation are created automatically by the system.

A Section simulates a part of a speech in our distance learning context. It acts as a means to bring audio/video content into a web page. The set of sections within a ContentPanel makes up the complete scenario of a presentation.
According to the interactivity involved, sections are divided into two types: "instruction" and "quiz". Interactions in an instruction section take place only between the Media component of the web page and the page viewers. Interactions in a quiz section also extend to those between the accompanying frames of the web page and the page viewers.

The VideoPanel corresponds to the Media component. It plays back audio/video content in an exported WebSmart presentation. In the authoring environment, the VideoPanel uses the PlayerFrame to play back the video clips so that users can capture the timing points of synchronization events.

A WSObj instance is a WS_Panel object which is neither a ContentPanel nor a VideoPanel. It corresponds to an accompanying frame within the exported web page. The content of an accompanying frame can be flipped at pre-defined media points as the audio/video content is being presented in the Media component. This kind of flipping activity is called an event. The WSObj class provides functions in three categories:
  • communication with the WSObj_GUI object on the object's property information.
  • communication with the WS object on the synchronization event hashtable.
  • communication with the PlayerFrame object to create synchronization events.

• Synchronization mechanism
The synchronization mechanism block is composed of VideoPanel, WSObj, PlayerFrame and its subclass CaptionMaker, and CaptionPace. The two levels of synchronization defined in Section 2.1 (Functional Model), namely flipping page and lip synchronization, are set up in this block.

A flipping page event is generated by the communication among the VideoPanel, the WSObj and the PlayerFrame. The VideoPanel responds to users' requests to play video clips by opening the PlayerFrame. The main functions of the PlayerFrame are to play the video clips sequentially and to report the media time at the WSObj's request.
The WSObj responds to users' requests to insert an event, gets the current media time from the PlayerFrame, then updates the event timing information of the WSObj. Later this event timing information will be used in the event hashtable of the WS object.

The lip synchronization information is created by the CaptionMaker and the CaptionPace. The CaptionMaker inherits the continuous media playback operation from the PlayerFrame, and adds a "STEP" playback operation. In STEP mode, the media playback pauses automatically after a certain length of the media stream has been played, waiting for the user to enter caption text. After the text for the whole video clip has been entered, the user can switch back to CONTINUOUS mode to review the generated caption file. The CaptionPace is used to set the length of time for each periodic play in STEP mode.

• GUI and Layout
The GUI and Layout block is composed of GUI classes and Ratio_Layout. The GUI classes implement interface objects in the way described in Section 2.4.1 (Entity objects and interface objects). Their basic function is to take users' input and pass the information to the corresponding entity objects.

The Ratio_Layout is an extended layout manager developed by Parr [28]. A layout manager is an object that positions and resizes the components in an AWT container according to some algorithm. The traditional layout managers defined in the Java language (e.g. BorderLayout, FlowLayout, CardLayout, GridLayout, and GridBagLayout) divide the space of a web page into grids at different levels of granularity. Each component on the page occupies certain blocks of the grid. Although the granularity is becoming finer and finer, the page author still lacks the flexibility to place a component at an arbitrary position and with an arbitrary size. The Ratio_Layout, however, meets this demand effectively.
Every component is specified by a ratio vector (x, y; width, height), where (x, y) is the position of the upper left corner of the component, and (width, height) decides the size of the component. Every parameter within the vector is a ratio to the size of the entire page. For example, a component specified by (0, 0; 0.45, 0.75) is a rectangle starting at the upper left corner of the page, and taking 45% of the whole page's width and 75% of the whole page's height.

The behavior of a layout manager is determined by its implementation of the LayoutManager interface behind the scenes. By implementing the following methods defined in LayoutManager, users can define their own specialized layout manager.
  • addLayoutComponent(String r, Component c) defines how a component is added to the associated Container.
  • removeLayoutComponent(Component comp) defines how a component is removed from the Container.
  • preferredLayoutSize(Container target) defines the preferred size of this container after laying out all the objects.
  • minimumLayoutSize(Container target) defines the minimum size of this container after laying out all the objects.
  • layoutContainer(Container target) sets the (x, y) coordinate and the size (width, height) of each component contained in the target Container.

The Ratio_Layout implements the above methods of the LayoutManager interface with a special modifier to indicate the characteristics of each component in the container. The modifier is in the form of "xRatio,yRatio;widthRatio,heightRatio".
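The ratio arithmetic can be illustrated with a minimal Java sketch. It is not the Ratio_Layout source from [28]: the class name RatioBounds, the method resolve, and the rounding rule are assumptions made for illustration only. The sketch parses a modifier string of the form given above and scales it to a container size in pixels.

```java
/** Sketch only: resolves a ratio modifier to pixel bounds (all names are hypothetical). */
final class RatioBounds {

    /**
     * Parses "xRatio,yRatio;widthRatio,heightRatio" and returns
     * {x, y, width, height} in pixels for a container of the given size.
     */
    static int[] resolve(String modifier, int containerWidth, int containerHeight) {
        String[] halves = modifier.split(";");
        if (halves.length != 2) {
            throw new IllegalArgumentException("expected 'x,y;width,height': " + modifier);
        }
        String[] position = halves[0].split(",");
        String[] size = halves[1].split(",");
        if (position.length != 2 || size.length != 2) {
            throw new IllegalArgumentException("expected 'x,y;width,height': " + modifier);
        }
        // Horizontal ratios scale with the container width, vertical ones with its height.
        int x = scale(position[0], containerWidth);
        int y = scale(position[1], containerHeight);
        int width = scale(size[0], containerWidth);
        int height = scale(size[1], containerHeight);
        return new int[] {x, y, width, height};
    }

    /** Converts one ratio (0.0 to 1.0) to pixels; rounding to the nearest pixel is an assumption. */
    private static int scale(String ratio, int extent) {
        double value = Double.parseDouble(ratio.trim());
        if (value < 0.0 || value > 1.0) {
            throw new IllegalArgumentException("ratio out of range: " + ratio);
        }
        return (int) Math.round(value * extent);
    }

    public static void main(String[] args) {
        int[] b = resolve("0,0;0.45,0.75", 800, 600);
        System.out.println(b[0] + "," + b[1] + " " + b[2] + "x" + b[3]);
    }
}
```

For an 800 by 600 page, the example vector (0, 0; 0.45, 0.75) resolves to a 360 by 450 pixel rectangle at the origin. A real layout manager would perform this computation for every component inside layoutContainer().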
By keeping track of the modifiers for all the components, and calling reshape() to draw these components on the screen, the Ratio_Layout sets the components at arbitrary positions and with arbitrary sizes within a page.

• QuizDesigner
The QuizDesigner block consists of Quiz and Question. It creates a special type of section - quiz. A Quiz object has multiple Question parts. The main functions of the Quiz class are maintaining the question list, updating a .quiz file, and generating a quiz summary page upon users' request. Each question is exported to an HTML file which realizes the interaction scenario described in Section 2.1 (Functional Model).

• Utilities
The Utilities block provides services to the other blocks of the system. The WS_Util uses InfoDialog to create a message window which displays multiple lines of information (via the MultiLineLabel class). The WS_Util also generates a dialog window to ask users for confirmation (via the YesNoDialog class), and a dialog window to allow users to open or save a file (via the FileViewer class).

[Diagram: the object model grouped into blocks. GUI and Layout: Ratio_Layout, WS_GUI and other GUI classes. wsEditor: WS, WS_Panel, ContentPanel, Section (1+), VideoPanel, WSObj. Sync Mechanism: PlayerFrame, CaptionMaker, CaptionPace. Utilities: WS_Util, InfoDialog, YesNoDialog, FileViewer, MultiLineLabel. QuizDesigner: Quiz, Question. Presentation Engine: Display, EventGroup, EventThread.]
FIGURE 16. Object Model

Chapter 3 Technology Framework

Java, Java Media Framework (JMF), JavaScript and HTML constitute the technology framework of the WebSmart system. This chapter discusses the advantages and disadvantages of each component, and gives an example of how WebSmart is built by integrating these components.

3.1 Java
Designed by Sun as "A Simple, object-oriented, network-savvy, interpreted, robust, secure, architecture neutral, portable, high-performance, multithreaded, dynamic language", Java [14] is used as the system programming language of WebSmart.
This section examines some of Java's major features and discusses both the advantages and the disadvantages of Java compared with other languages.

• Object-Oriented
To function within increasingly complex, network-based environments such as the Internet, programming systems must adopt object-oriented concepts. The reason is that the encapsulated, message-passing paradigm of the object-oriented methodology coincides with the needs of distributed, client-server based systems. Java was designed to be object-oriented, and it [15] supports the primary object-oriented characteristics such as encapsulation, polymorphism, inheritance and dynamic binding. The mapping between the object model of WebSmart and its implementation is straightforward: all the classes, associations, inheritance and aggregations have direct correspondences in the language. One disadvantage of object-oriented languages is the overhead caused by inter-object communication, which makes them less suitable than process-oriented languages (such as C) for real-time applications.

• Architecture neutral and portable
In contrast to C++, Java doesn't generate binary machine code which depends completely on the native hardware instructions - rather, it generates bytecodes: a high-level, machine-independent code for a hypothetical machine. The bytecodes generated by a Java compiler are executed on specific hardware and software platforms by the Java interpreter and run-time system. This architecture-neutral and portable language platform is known as the Java Virtual Machine. The benefit of this virtual machine mechanism is obvious: in a heterogeneous network environment such as the Internet, applications must be capable of executing on a variety of hardware and software architectures. The disadvantage of Java's interpreted nature is that interpreting the bytecode slows down execution.
• Multi-threaded
Multimedia applications are characterized by many concurrent activities, such as playing an audio/video clip in one frame of the web page, displaying images in another frame, and at the same time waiting for users' input in yet another frame. Such an interactive scenario requires multithreading support from the platform on which it runs. Java, fortunately, has support for threads built in at the language level. A Thread [29] class is defined, with a list of methods including run(), start(), stop(), sleep(), setName(), getName(), isAlive(), etc. Users can assign a specific task to a thread, and make multiple threads run asynchronously within an application.

Moreover, Java defines another class called ThreadGroup, which, as the name implies, is a class that handles a group of threads. By organizing a set of related threads into a group, programmers can manipulate many threads by calling a single method such as suspend(), resume(), stop(), or destroy(). Conventional languages such as C and C++, by contrast, have no thread support built into the language itself. Implemented in Java, WebSmart has deployed Threads and ThreadGroups efficiently in its synchronization tasks.

• Extensible
Ever since its debut in late 1995, Java has expanded into various areas of computer science. Besides its core language features, Java has already extended its APIs to such areas as database access (JDBC), remote execution (RMI), interactive media (JMF), sound (JavaSound), graphics and imaging (Java 2D and Java 3D), and even telephony (Java Telephony). One of the considerations in the early design phase of WebSmart was to take advantage of the rich set of APIs that Java vendors provide, and to build object-oriented modules which can be easily expanded as the Java platform expands. The Java Media Framework discussed below is one of the recently published Java APIs on which WebSmart is built.
Currently only the JMF Player is defined and implemented. Once live capturing and video conferencing are realized in JMF, these functions will be merged into WebSmart without much difficulty in design and implementation.

3.2 Java Media Framework (JMF)
As "an application programming interface (API) for incorporating media data types into Java applications and applets", JMF [16] is utilized by the WebSmart project in several critical modules such as the presentation engine and the synchronization mechanism. JMF is developed jointly by Sun Microsystems Inc., Silicon Graphics Inc., and Intel Corporation. Its first official version was released in September 1997, and it now has the following implementations on different platforms:

vendor                                    platform
Sun                                       Solaris, Windows 95/NT
SGI                                       Irix 6.2/6.3 (JMF spec ver 0.96 only)
Intel                                     Windows 95/NT
Brian Griffith, Archipelago Productions   Macintosh, Windows 95/NT
TABLE 1. JMF implementations

JMF is being released in three stages: the Player, Capture, and Conference APIs. At the time of this writing, only the Java Media Player has been released. It supports the synchronization, control, processing and presentation of compressed streaming and stored time-based media, including video, audio and MIDI, across all Java-enabled platforms.

The central structure within the Java Media Player is the player. A player is defined in [16] as "an object that processes a stream of data as time passes, reading data from a DataSource and rendering it at a precise time". The operations on a player during its lifecycle can be described as follows:

1. Creating an instance of the player
A player can be created directly from a URL by calling createPlayer(). The Manager uses this media URL to create an appropriate type of Player according to the content of the media resource. An identical interface exists for embedding media from both a web-oriented document and a local file stored on CD-ROM.
When created, a player also needs to register a ControllerListener so that the latter can observe the media events posted by the player.

2. Controlling the player
Before any media content can be presented by a player, a realize() operation has to be performed to allocate system resources (e.g. the visual component and the control component) exclusively to the player. While the presentation resources are being allocated, the asynchronous operation prefetch() can be applied to a player to get its media content from either the network connection or local storage. Once the visual component and the control component are ready, a player can start displaying media content as long as there is content prefetched into the buffer. Document viewers interact with the player through the control component (a set of buttons on the web page) to start() or stop() the presentation.

3. Responding to media events
As the player changes its status among unrealized, realizing, realized, prefetching, prefetched and started, or runs into error situations such as resource unavailable or connection error, it posts event messages to the registered ControllerListener, so that the latter can react correspondingly.

4. Deallocating the player
When the player is no longer in use, it should be processed in the following three steps:
  • first, stop() the player, so that it is no longer active.
  • second, deallocate() the player to release any exclusive-use resources and minimize its use of non-exclusive resources, so that other players can start as soon as possible.
  • finally, close() the player to release all of the resources that the player was using and cause the player to cease all activity.

WebSmart adopts JMF as the major component of its multimedia presentation framework for the following reasons:

• Open architecture.
Unlike proprietary players wrapped as plug-ins, the Java Media Player provides application developers with a unified interface for embedding audio and video in their applications and applets. Users do not need to convert their media files into a special format, because the current Java Media Player supports most of the widely used audio and video formats/codecs and transport protocols, including DVI, MIDI, MPEG-1, H.261, H.263, AVI, QuickTime, File, FTP, HTTP, RTP, etc. In addition, code written to the JMF specification runs on all JMF implementation platforms without modification.

• Streaming media support
Streaming is a buffering technique which begins playback of a video clip before the whole video file has been downloaded. By giving the video file a few seconds to load before starting the image, a reserve of video is made available in the memory of the client's computer. Because the display of the buffered content and the transmission of the following sections take place concurrently, the delay problem in media presentation is addressed and real-time video access via the Internet is provided. Streaming-related standards include the Internet Engineering Task Force (IETF)'s Real Time Streaming Protocol (RTSP) [4] and Microsoft's Advanced Streaming Format (ASF) [5].

The JMF specifies support for streaming video. When an HTTP connection is used to download media files in certain "streamable" formats (MPEG and QuickTime), the Player can begin playback before all of the media data is received. If the download rate is not enough to keep up with the rate at which the Player is using data, an under-run condition occurs. At this time the Player continues to download data, waiting until enough data has been received to begin playback again (prefetching).
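The under-run behaviour can be made concrete with a toy model. The sketch below does not use JMF at all; it is a discrete-time simulation in which the class name, the one-second tick, and the restart threshold are all assumptions chosen for illustration. Playback starts once a threshold of media has been buffered, stalls when less than one second of media is left in the buffer, and restarts when the threshold is reached again.

```java
/** Sketch only: a toy model of streaming playback with under-runs (not JMF code). */
final class StreamingBufferModel {

    /**
     * Simulates one-second ticks of download and playback and counts the stalls.
     *
     * @param clipSeconds    length of the clip in media seconds
     * @param downloadRate   media seconds received per wall-clock second
     * @param startThreshold buffered media seconds required before playback (re)starts
     * @return the number of under-runs that occurred before the clip finished
     */
    static int countUnderRuns(int clipSeconds, double downloadRate, double startThreshold) {
        if (clipSeconds <= 0 || downloadRate <= 0.0) {
            throw new IllegalArgumentException("clip length and download rate must be positive");
        }
        double downloaded = 0.0;  // media seconds received so far
        double played = 0.0;      // media seconds presented so far
        boolean playing = false;
        int underRuns = 0;

        while (played < clipSeconds) {
            // The download continues whether or not the player is presenting media.
            downloaded = Math.min(clipSeconds, downloaded + downloadRate);
            double buffered = downloaded - played;
            boolean fullyDownloaded = downloaded >= clipSeconds;

            if (!playing && (buffered >= startThreshold || fullyDownloaded)) {
                playing = true;   // enough data has been prefetched
            }
            if (playing) {
                if (buffered >= 1.0 || fullyDownloaded) {
                    played = Math.min(clipSeconds, played + 1.0);
                } else {
                    playing = false;  // under-run: wait for the buffer to refill
                    underRuns++;
                }
            }
        }
        return underRuns;
    }

    public static void main(String[] args) {
        System.out.println(countUnderRuns(60, 2.0, 3.0));
        System.out.println(countUnderRuns(10, 0.5, 2.0));
    }
}
```

Under these assumptions, a download rate of twice real time never stalls, while a 10-second clip downloaded at half of real time with a 2-second threshold stalls three times.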
• Time-keeping and synchronization support
As stated in Chapter 1 (Introduction), a time management mechanism is required to deal with time-based media such as audio and video in multimedia documentation. The JMF TimeBase and Clock interfaces define the mechanism for managing the timing and synchronization of media playback.

A TimeBase represents the flow of time. "A Java Media player uses its TimeBase to keep time in the same way that a quartz watch uses a crystal that vibrates at a known frequency to keep time" [16]. A Java Media Player can use its TimeBase to keep time but can never transform or reset it. On the other hand, a "media time" is defined for a player to represent a point in time within the stream that the player is presenting. "The media time can be started, stopped, and reset much like a stopwatch" [16]. The media time of a stream can be mapped to a system TimeBase by a Clock object. WebSmart controls the time-keeping within a single media stream by adjusting its MediaTime, and synchronizes multiple media streams by associating their players with the same TimeBase.

3.3 HTML
The HyperText Markup Language (HTML) [10] is the lingua franca for publishing hypertext on the World Wide Web. As illustrated in Section 2.1 (Functional Model), the major output of the WebSmart authoring environment is a group of HTML files which are fed into the web browser. This group of HTML files forms a logically hierarchical structure in presenting the content of a WebSmart presentation:

[Diagram: application.html at the top; below it one or more sectionX.html files, divided into instruction sections and quiz sections; below the quiz sections, question.html.]
FIGURE 17. The output HTML files of WebSmart

In contrast to the above logical hierarchy, the actual HTML files are stored on the server in a flat structure (e.g. they are files under the same directory), with no explicit relationship among them.
The application.html file uses the FRAMESET and FRAME tags to organize the layout of multiple components within a page, and the sectionX.html files use the PARAM tag to define multimedia contents such as VIDEO-FILE, AUDIO-FILE, IMAGE-FILE, and CAPTION-FILE. The page layout is static; parsing and operating on the media contents are done by the presentation engine, which is implemented as a Java applet.

Despite its wide use on the Web, the original HTML has limitations in creating dynamic web pages. Section 6.2 (Future work) will discuss some of the new languages and interfaces being developed to produce dynamic web pages.

3.4 JavaScript

JavaScript is used by WebSmart as a scripting language to link the various components (e.g. Java, Java Media Framework, and HTML) within the system. As Ousterhout analyzed in [35], system programming languages (C, C++, Java) are suited for building components where the complexity is in the data structures and algorithms, and scripting languages (Perl, Tcl, Visual Basic, JavaScript) are well suited for combining applications where the complexity is in the connections. Compared with system programming languages, which are mostly strongly typed and compiled, scripting languages are generally typeless and interpreted. A typeless language makes it much easier to link components together. Interpretation provides rapid development by eliminating compile times. Moreover, because scripting languages are at a much higher level than system programming languages, casual programmers (rather than professional programmers) can master them in a short period of time and focus on their main job of building applications. So far scripting languages are widely applied in three categories of applications: graphical user interfaces (Visual Basic, HyperCard, and Tcl/Tk), the Internet (Perl, JavaScript), and component frameworks (Visual Basic).
JavaScript [17] is "Netscape's cross-platform, object-based scripting language for client and server applications". The language can be divided into a client side and a server side. Both sides share Core JavaScript, which consists of features such as variables, functions, and LiveConnect. The client side adds extras such as predefined objects relevant only to running JavaScript in a browser, and the server side adds extras such as predefined objects and functions relevant only to running JavaScript on a server. So far, WebSmart uses only client-side JavaScript, of which LiveConnect is the most attractive feature for the project.

LiveConnect enables communication between JavaScript and Java applets in a page, and between JavaScript and plug-ins loaded on a page. WebSmart uses it to realize the mutual communication between a Java applet and the JavaScript functions embedded in an HTML file. Control flows among the different parts of the framework as follows. Users control the progress of a presentation by interacting with an HTML page, for example by pressing a button or entering text in a text field. On receiving these actions, the HTML page calls the JavaScript functions in the page and passes the users' input to them. The JavaScript functions in turn call methods defined in the Java applet which is embedded in the same HTML file. The Java applet methods execute tasks such as starting or stopping the presentation of a video stream.

The control flow can also be reversed, for example when flipping a frame within the HTML page. In this case, a method in the Java applet calls a function in the JavaScript code. Since JavaScript has access to components of a page such as windows and frames, it flips a frame by resetting the content of that frame. In the end, users see the changes in the web page in real time.
FIGURE 18. Control flow in WebSmart

The communication between Java and JavaScript is realized as follows:

• JavaScript to Java

When LiveConnect is enabled, JavaScript can make direct calls to Java methods. A Java method is referred to in the following format:

    [Packages.]packageName.className.methodName

For example, with the following tag in the HTML file:

    <APPLET Name="clock" Code=Display.class Width=320 Height=340 MAYSCRIPT>

where Display.class is the name of the compiled bytecode for the Java applet, a JavaScript function can call a method of the applet, as in

    document.applets["clock"].appletPlay()

to start the media player.

• Java to JavaScript

To access JavaScript methods, properties, and data structures from within a Java applet, the applet's Java source code should import the Netscape javascript package. The author of an HTML page must permit an applet to access JavaScript by specifying the MAYSCRIPT attribute of the <APPLET> tag. Access to a JavaScript function from a Java applet proceeds in two steps:

1. Getting a handle for the JavaScript window

Before you can access JavaScript, you must get a handle for the Navigator window. The statement below shows how to get a window handle within a Java applet:

    JSObject win = JSObject.getWindow(this);

where JSObject is a class in the netscape.javascript package.

2. Calling JavaScript methods

Use the call() method of JSObject, which has the following syntax:

    call(methodName, argArray)

where methodName is the name of the JavaScript function to call and argArray is an array of Java objects used to pass arguments. For example, when a pre-defined timing point for a synchronization event is due, the Java applet calls the JavaScript function js_func with

    win.call(js_func, arg);

where win is the window handle obtained in step 1, and js_func could be a flipping action defined in an HTML page as follows:

    function changeNotes(n) { top.frames["Notes"].location=n; }

where Notes is the name of a frame in the page, and n is the URL of the new content of the frame.

3.5 An example of integration

Now that we have discussed the major components of the WebSmart technology framework, we will see in detail how these components work together in a WebSmart exported HTML file. We use the following scenario to demonstrate the integration of Java, Java Media Framework, JavaScript and HTML in this section:

    As the user clicks on the Quiz section in the TOC frame, a video clip which narrates questions will start presenting in the media frame. At the 2nd second of the narration, the title frame will flip to a new title page, showing the Quiz logo. At the 11th second of the narration the Notes frame will display the content of question 1. At the 13th second of the narration (the time point when the narration of the first question is over), the presentation of the video will be paused until the user enters the answer into the text field in the Notes frame and clicks the "Done" button. Upon receiving the "Done" button click, the presentation of the video resumes and starts the narration of the next question.

The above scenario is accomplished by two HTML files (section3.html [Appendix B] and q1.html [Appendix C]) and three Java programs (including EventGroup.java) in the following steps.

1. Enabling LiveConnect

• Import the JavaScript package into the Java applet. Each of the Java source files should include the following statement at the beginning of the source code:

    import netscape.javascript.*;

• Specify the MAYSCRIPT attribute in the <APPLET> tag to permit the Java applet to access JavaScript functions embedded in section3.html.
    <applet name="clock" code=Display.class width=320 height=350 MAYSCRIPT>

• Get the handle of the Navigator window in the Java applet. This window handle will be used when the applet calls a JavaScript function. The following function in the applet gets the window handle when the applet is invoked.

    public void start() {
        window = JSObject.getWindow(this);
        ...
    }

2. Embedding media streams in HTML files

The following declarations have to be included in section3.html for the Java applet Display.class to get the media streams of the presentation.

    <PARAM name="VIDEO-FILE" value="movies/sec11_1.mpg">
    <PARAM name="AUDIO-FILE" value="movies/sec11_1.mpa">
    <PARAM name="IMAGE-FILE" value="movies/sec11_1.gif">
    <PARAM name="CAPTION-FILE" value="movies/sec11_1.cap">

3. Embedding JavaScript functions in HTML files

JavaScript functions accomplish the following objectives:

• Adding synchronization events

The loadEvents() functions in section3.html and q1.html call the addEvent() function of the Java applet to add the events of flipping the title page in the title frame, flipping the question content into the Notes frame, and pausing the video presentation in the media frame. Following is the loadEvents() function in q1.html.

    function loadEvents() {
        var APP = parent.frames["media_app"].document.applets["clock"];
        APP.addEvent("pauseApp", " ", "movies/sec11_1.mpg", "13");
    }

• Flipping actions

The following two functions implement the flipping of the title frame and the Notes frame within the browser window. JavaScript can refer to the whole window, or to framesets and frames within the window, and update their properties. In the sample functions, parent refers to the frameset window that contains the frame from which the JavaScript function is called, the location property is used in JavaScript to denote the contents of a frame, and the parameter n is the URL of the new content of the frame.
    function changeTitle(n) {
        parent.frames["title"].location = n;
    }
    function changeNotes(n) {
        parent.frames["Notes"].location = n;
    }

• Controlling the Java applet

JavaScript controls the Java applet by calling the applet's methods (e.g. appletPause() in the example). In the next step (Writing Java public methods) we will see how appletPause() is implemented.

    function pauseApp() {
        var APP = document.applets["clock"];
        APP.appletPause();
    }

4. Writing Java public methods

To be accessible from JavaScript functions, a Java method has to be declared public. According to their functionality, the public Java methods for communication with JavaScript in WebSmart can be divided into three categories:

• Handling event tables

As introduced briefly with the WS class in Section 2.4 (Object Model), an event hashtable is maintained for synchronization events. In the Display Java applet, methods are defined to associate these events with the corresponding threads. Details on this event hashtable can be found in [20].

    public void addEvent(String jsfn, String layer, String mfile, String t);
    public void endEvent();

• Controlling the Java Media Player

This is where the JMF is used in WebSmart. The Display Java applet defines a Java Media Player which realizes the playback of the media streams in the media frame. By applying the start() and stop() operations on the Player, the applet controls the playback of the media stream.

    public void appletPause() {
        ...
        player.stop();
    }
    public void appletContinue() {
        ...
        player.start();
    }

• Calling JavaScript methods

When the time for a synchronization event (such as flipping the content of a frame) is due, the Java applet will call the corresponding JavaScript function (as defined above in Embedding JavaScript functions in HTML files) to execute the synchronization event.
The following function executeEvent() is defined in the applet:

    public void executeEvent(String layer_id) {
        String arg[] = {layer_id};
        win.call(..., arg);
    }

where win is the handle of the Navigator window which was obtained in the Enabling LiveConnect step.

User Interface

At last we come to the GUI end, where users interact with the system via the HTML page in the browser. In q1.html we use a "text" type of input for users to enter text, and a "button" type of input for confirmation.

    <P><FORM NAME="quiz"><INPUT TYPE="text" NAME="answer" VALUE="" SIZE=50></P>
    <P><INPUT TYPE="button" VALUE="Done" NAME="submit" onClick="check_answer()"></P>

Chapter 4 Performance

WebSmart is a multi-platform application which works on both UNIX and Windows platforms. To test the performance of accessing multimedia documents from the Internet, an HTTP server and a distance learning application have been built. This chapter analyzes the experimental results.

4.1 Multi-platform implementation

The WebSmart system [36] runs on multiple platforms. Currently the authoring environment works on Irix 6.2, Solaris, and Windows 95/NT, with some small modifications to the source code for Irix 6.2 because SGI's JMF implementation only supports version 0.96 of the JMF specification. The presentation engine (the Display applet) can be embedded in both Netscape Navigator (version 3.x or later) and Microsoft Internet Explorer (version 3.x or later), and it provides applets compatible with both version 1.0 and version 0.96 of the specification. To run the system, a Java Development Kit (version 1.0 or later) and a Java Media Framework implementation have to be installed on the local machine.
4.2 HTTP server

To test the performance of WebSmart broadcasting on the Internet, a Netscape Enterprise Server version 3.5.1 (later an Apache HTTP server version 1.2.5) was installed on an SGI workstation which runs Irix 6.2 at the Computer Science Department of UBC. The URL of the server is http://rainforest.cs.ubc.ca/.

4.3 An application - a web-based JMF course

Besides the HTTP server, an application was set up on it to test the performance of accessing WebSmart documents via the Internet. The application is at http://rainforest.cs.ubc.ca/mmas/appl_10/appl.html for the version compatible with the JMF 1.0 specification, or http://rainforest.cs.ubc.ca/mmas/appl_96/appl.html for the version compatible with the JMF 0.96 specification.

The application is a web-based course on the Java Media Framework, and includes two instruction sections as well as a quiz section. The web page of the course is composed of four frames, namely a Table of Content, a media playback frame, a title frame and a Notes frame. The narration of the course is divided into short video clips, each lasting an average of 50 seconds. After compression into MPEG-1 format, these clips average 8.7MB in size.

The application is accessible via both LAN and 28.8Kbps modem. When accessed from a PC connected to the HTTP server via a LAN, the video presentation proceeds in a "stop-and-go" manner. When the bandwidth becomes temporarily insufficient, a "data starved" message is displayed, and the download bar above the video component within the media frame displays the buffering progress. As soon as enough media content has been fetched into the buffer, the video presentation resumes once the play button is clicked.

When viewing the same JMF web course over the 28.8Kbps modem, the performance is unacceptable.
For example, it takes about five minutes to prefetch enough content into the buffer before the video presentation starts.

4.4 An Audio-only option

To improve the unacceptable performance of the original presentation engine, an option was added to allow document viewers to switch to an audio-only presentation mode when the bandwidth of the network connection becomes scarce. The option is provided as a button placed above the visual component of the media player. When viewers open the web document, a video clip is displayed by default, and the button shows the text "Audio Only". Whenever viewers click the button, the video presentation is stopped and an audio-only clip which narrates the same presentation starts. At the same time, the button changes its text to "Audio-Video" so that viewers can switch back to video presentation mode.

The audio-only clips divide the whole course narration into segments of the same lengths as their corresponding video clips. For the same 50 seconds of narration, the audio clips take an average of 605KB, which is about 6.9% of the volume of the video clips. The reduction in the amount of media transferred improved the performance of the presentation, but at the same time lost the distinctive visual effect brought by the video clips.

4.5 Other issues

4.5.1 Compression algorithms

The performance enhancement from the audio-only option was obtained by sacrificing the distinctive visual effect of video playback, therefore it cannot be used as a long-term approach. Besides, even in the audio-only mode, the 28.8Kbps modem access still stalls frequently. The reason is that MPEG-1, which is used for compression, is not an ideal solution for Internet-based multimedia applications.
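The clip sizes reported above can be turned into average bit rates to see why a 28.8Kbps modem struggles even in audio-only mode. The calculation below is only a back-of-the-envelope sketch: it assumes 1 MB = 10^6 bytes and 1 KB = 10^3 bytes, treats the modem's nominal 28.8 Kbps as fully usable, and ignores protocol overhead; the class and method names are illustrative.

```java
final class ClipBandwidthSketch {

    /** Average bit rate (bits per second) of a clip of the given size and duration. */
    static double averageBitsPerSecond(double sizeBytes, double durationSeconds) {
        return sizeBytes * 8.0 / durationSeconds;
    }

    /** Seconds needed to move sizeBytes over a link of the given nominal rate. */
    static double transferSeconds(double sizeBytes, double linkBitsPerSecond) {
        return sizeBytes * 8.0 / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        double videoBytes = 8.7e6;   // 8.7MB average video clip (assuming 1 MB = 10^6 bytes)
        double audioBytes = 605e3;   // 605KB average audio clip (assuming 1 KB = 10^3 bytes)
        double clipSeconds = 50.0;   // average narration length per clip
        double modemBps = 28_800.0;  // nominal modem rate, no overhead considered

        System.out.printf("video: %.0f bps, %.1f s to download%n",
                averageBitsPerSecond(videoBytes, clipSeconds),
                transferSeconds(videoBytes, modemBps));
        System.out.printf("audio: %.0f bps, %.1f s to download%n",
                averageBitsPerSecond(audioBytes, clipSeconds),
                transferSeconds(audioBytes, modemBps));
    }
}
```

Under these assumptions the video clips average about 1.39 Mbps and the audio clips about 96.8 Kbps, so 50 seconds of audio would need roughly 168 seconds to arrive over the modem, more than three times its playing time.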
To be suitable for the current Internet, a compression algorithm has to possess the following features: bandwidth scalability; resolution, frame-rate, and frame-quality scalability; fast compression/decompression; the ability to cope with network losses; and small encoding and decoding latency. MPEG-1 was designed for playback from CD-ROM, targeted at a 1-1.5Mbps bandwidth, which is impractical for modem access to the Internet.

Several new algorithms specifically targeted at Internet video are being developed. The first is the RealSystem codec, which compresses video content for scalable bandwidths ranging from 10Kbps to 500Kbps. Recently RealNetworks has been working with Sun to integrate RealSystem video streams into JMF implementations. The second is the H.323/H.324 standards defined by the ITU. The core idea here is that the video codec is a modular entity that can be plugged in and out. It is assumed that certain base video codecs (e.g. H.261/H.263, which are for low bit-rate applications) are present in all implementations. By using the H.245 protocol, an H.323 application can negotiate the use of any other audio/video codec. This allows new innovations to be easily incorporated into existing applications and adapts to applications which impose scalable bandwidth requirements. The last new standard is MPEG-4 [37], which is being developed by the MPEG group. Compared with its predecessors (i.e., MPEG-1 and MPEG-2), MPEG-4 is designed for applications including Internet multimedia, interactive video games, videoconferencing, videophone, multimedia mailing, and wireless multimedia. MPEG-4 video is optimized for bandwidths ranging from low (<64Kbps) through intermediate (64-384Kbps) to high bit rates (384Kbps - 4Mbps).
Furthermore, MPEG-4 supports scene description (spatio-temporal synchronization and behavior) for presenting audiovisual objects, and is therefore suitable for WebSmart presentations.

As these new Internet-oriented compression algorithms will provide new solutions for dealing with scalable network bandwidth, the task of multimedia authoring systems is to build a flexible framework so that these new codec innovations can be accommodated easily once they become available. This was exactly one of the design strategies of the WebSmart system.

4.5.2 General purpose web server vs. multimedia specialized web server

Currently WebSmart uses a general purpose web server to store multimedia data and transmit it to the client site. Such a server-less video streaming approach has advantages from an economic perspective; however, it also inherits many disadvantages from its simplicity. Once the client starts displaying the video content after a few seconds of buffering, smooth presentation of the rest of the file depends on the assumption that the rest of the file arrives over the network at a rate greater than or equal to the rate at which it is being played by the client (i.e., the bit rate at which it was encoded). If this assumption does not hold, the video becomes "stop-and-go", with the client waiting until the buffer builds up again. This is obviously annoying and undesirable.

A better solution is to set up a multimedia specialized web server which is separate from the traditional HTTP-based web server and specialized in the video/multimedia streaming task. Its major distinguishing function is called "admission control". Once a video is requested by a client, the server first decides whether to accept or deny the request based on the server's CPU usage, the network load, the identity of the client, or any other factors.
If the request is accepted, a connection is established and the video is streamed to the client. The data packets are properly spaced in time on the network to match the fraction of link bandwidth being used by the client. The benefits of server-based streaming video include better network throughput, better video quality for the end user, support for advanced features like admission control, cost-effective scalability to a large number of users, and protection of content copyright. As described in Chapter 5 (Related Work), both RealSystem and NetShow use a specialized streaming video server.

4.5.3 Network connection

As we analyzed the performance over the LAN, we also noticed some non-deterministic behavior. In low-load situations the video presentation is smooth, but as soon as more traffic is injected into the subnet, the video display becomes jerky. The reason is rooted in Carrier Sense Multiple Access with Collision Detection (CSMA/CD), which is used in Ethernet-based LANs. In high-load situations it exercises no control over access delay or available bandwidth per application. In addition, it does not provide any access-priority mechanism and thus cannot give preferred treatment to real-time traffic over conventional data.

Asynchronous Transfer Mode (ATM) seems to be a good solution to the above problem. ATM is a cell-based multiplexing and switching technique. In the physical layer, it is targeted at 155Mbps optical transmission. In the ATM layer, the 53-octet ATM cells are multiplexed and switched in the network. In the ATM adaptation layer, different adaptation layers (from AAL 1 to AAL 5) provide different classes of service to the upper application level. These services include constant bit rate and variable bit rate. Unlike Ethernet, an ATM connection specifies quality of service (QoS) at connection establishment, and therefore avoids the non-deterministic behavior that occurs in Ethernet.
Although ATM seems to be a promising network for delivering real-time multimedia, the high cost of establishing such a network has prevented its wide adoption. How ordinary web users can benefit from it remains to be seen.

Besides ATM, another kind of high-speed network is also in development. It is called high-speed LAN, and includes 100BASE-T Fast Ethernet and 1Gbps Gigabit Ethernet [38]. These high-speed networks inherit the existing frame format from lower-speed Ethernet, and are therefore easier to upgrade to from current network devices, with less cost in training, maintenance and troubleshooting than ATM. Once these low-cost high-speed networks are combined with QoS standards such as the Resource Reservation Protocol (RSVP) [39], they can provide a practical and reliable solution for Internet multimedia presentations.

Chapter 5 Related Work

Multimedia authoring systems are a new area in computer science, yet the area has grown rapidly within a year. Several commercial products have been released, and an international standard is in progress.

5.1 Authorware

The Authorware Interactive Studio [21] is a series of products developed by Macromedia for making multimedia presentations.

Authorware (currently version 4) provides an icon-based, drag-and-drop interface that allows users to create pages that include hyperlinks to text, digital movies, graphics and sounds. It also provides built-in editors for creating custom interactions such as graphic buttons, check boxes, and radio boxes.

The video embedded in an Authorware page is transmitted and played back by a plug-in called Shockwave, which supports streaming video. The content of a Shockwave movie can be prepared with Director (currently version 6), another component of the Authorware Interactive Studio series. Director creates a media format called Director movies, which can import 2D and 3D graphics, text, animation, sounds, and digital video.
To add interactions to Director movies, page authors have to write scripts in the Lingo language. Shockwave movies can be embedded in HTML code, and downloaded and played back in Internet browsers.

The major strength of Authorware is its rich functionality for generating and manipulating graphics and animation. It can be used by WebSmart users to create the HTML files that accompany the audio/video clips. One problem with Authorware is that when creating Shockwave movie files, authors have no control over the bandwidth of the exported movie files. When the movies are transported over an unstable network such as the Internet, the performance of the video playback is unpredictable. Another problem is the compatibility of an HTML page which contains Director or Shockwave movies. A tag is used to insert a movie into an HTML file. However, Netscape's Navigator and Microsoft's Internet Explorer use different tags, and this brings extra work for page authors.

5.2 RealSystem

The RealSystem [22] is RealNetworks's streaming media solution for the Internet and corporate intranets. It includes RealPlayer, RealServer, RealMedia Tools, and RealEncoder/RealPublisher, among others. RealPublisher creates RealAudio and RealVideo clips, which are delivered by the RealServer over the Internet and presented at the client site by the RealPlayer. The most outstanding features of RealPublisher are its multi-template encoding, bandwidth negotiation, and synchronization mechanism.

When users encode audio and video clips, they are allowed to encode their files with different compression rates based on their bandwidth capability. Multiple templates generate multiple output files. The disadvantage is that the web site must have a
separate hypertext link and web page for each format or template. However, using the bandwidth negotiation option enables the RealServer to automatically stream the best file for the user's stated bandwidth regardless of the number of alternative files. This bandwidth negotiation option can be turned on whenever users create a document with the RealPublisher.

To create a synchronized multimedia presentation for RealSystem 5.0, users have to prepare a plain text file called the Input Events File. This Input Events File has to be merged with a non-synchronized media file to generate the final synchronized presentation file. This is done with the rmmerge.exe tool. In the Input Events File, each synchronization event should indicate its starttime, endtime and URL address.

The RealPublisher supports automatic HTML page creation through its HTML Generation Wizard. The wizard helps create the HTML files and allows users to choose whether the file will be displayed in a Pop-up RealPlayer or an Embedded RealPlayer.

The strength of the RealSystem lies in its capability to create high-quality content at scalable bandwidths ranging from 10 Kbps to over 500 Kbps, and to deliver streaming content reliably under real-world conditions. As RealNetworks is working with Sun to merge the RealSystem codec into JMF implementations, the performance of WebSmart presentations will be improved by compressing media streams using the RealSystem codec.

5.3 NetShow

NetShow [23] is Microsoft's end-to-end solution, comprising server, client and tools, that allows users to stream multimedia over intranets and the Internet. The key idea here is a form of content called Active Streaming Format (.asf) files. These files sit on a NetShow server and are streamed to the clients that request them. They are created with Microsoft's content creation tools and embedded within HTML files. On the client side, these files can be viewed in a browser and presented by a NetShow Player.
ASF Editor is one of the .asf file creation tools. It assembles and synchronizes various types of content, and compresses the files to fit the source material into different bandwidth restrictions. The NetShow installation includes several audio, image and video codecs, such as MPEG Layer-3, L&H, Voxware MetaSound, and Voxware RT24/RT29 for audio, and H.263, Microsoft MPEG4, VDOnet VDOWave, Duck TrueMotion RT, and Intel H.263 for video. Users can select a codec when they create .asf files.

The ASF Editor combines and synchronizes the images, audio, video and script commands on a graphical timeline representation. Several options are provided for synchronization:

• Markers: Markers are like bookmarks in the .asf file. NetShow Player uses markers to jump around in the .asf file. This is particularly useful when viewers are watching a large video file, since markers help them jump from section to section.
• Script commands: Visual Basic Script and JavaScript scripts are used to interleave events into the streaming .asf file. As the NetShow Player receives the .asf file, the script commands are passed to a client-side application via the ScriptCommand event, which handles the commands and executes the actions. Script commands are sent in a table in the .asf file header.
• URL flips: An .asf file can be embedded into a framed HTML page. ASF Editor provides the interface to arrange for URLs to appear in accompanying frames in a synchronized manner while the media frame is presenting the audio/video clip.
• Captions: A caption is another form of additional information that can be embedded in the .asf file. Users can specify a time for the display of a caption, and enter the text of the caption in the ASF Editor.

In summary, NetShow shows strengths in organizing the synchronization information and fitting the multimedia material into restricted bandwidths. The synchronization options are similar to those in WebSmart.
However, the A S F Editor still needs more flexibility in page layout designing. 5.4 S M I L The World Wide Web Consortium has been working on providing a common solution for integrating synchronized multimedia presentations into the Web. In November 1997, it released the first public working draft of the Synchronized Multimedia Integra-tion Language (SMIL) [24]. Instead of using scripting languages as in most commercial products, it adopts a declarative format for expressing media synchronization on the Web. S M I L is a markup language just as H T M L , which can be edited by an ordinary text editor. Each S M I L document has a head and a body. The head part contains a layout section that determines the placement of each component on the web page, its position, size and scaling of a visual media object within a rendering window. In the body part, each component uses a schedule element to specify its temporal behavior. The synchro-71 nization between elements can be in either parallel or sequential mode. In both modes, users define the following attributes of the schedule element: its media type (audio, vid-eo, text or image), source address ( U R L address), begin time, and end time of presen-tation etc. A rich set of options is available for defining the synchronization behavior among elements. These options include whether children of a parallel group keep their independent clock or share a common clock, and whether the display of an element ends when its duration has ended or the display of it freezes on the screen, etc. To im-prove the flexibility in performing at various transmission speeds, S M I L provides au-thors with a switch mechanism to specify a set of alternative elements from which to choose. The attributes which can be switched are bit rate, language, screen size, and screen depth etc. S M I L is a comprehensive and flexible solution to the multimedia documenta-tion. 
The open question is whether the SMIL specification will be adopted as a real standard, and how it will influence the current market, where different multimedia formats coexist. At the time of this writing, there is only one experimental implementation [27], which realizes half of the SMIL features.

5.5 Summary

From the above discussion, we conclude that present multimedia authoring systems provide practical mechanisms to assemble and synchronize multimedia content within a web page, and have also made progress in playing back video streams under Internet bandwidth limitations. Yet the incompatibility among media formats hinders the distribution of documents within the heterogeneous Internet environment. SMIL is a good candidate for a unified multimedia document format, but whether and when it will be widely applied is still uncertain.

Chapter 6
Conclusion and Future Work

6.1 Conclusion

The emergence of interactive multimedia presentations on the World Wide Web has brought about both wide applications and big challenges. To enable video over the Internet, solutions must be provided for problems in bandwidth, synchronization, presentation and authoring environments. Many companies and organizations are working on new standards and products. Yet the heterogeneity of the Internet calls for open applications that work seamlessly on various platforms.

WebSmart is a research project targeted at creating web-oriented synchronized multimedia presentations. It consists of a presentation engine and an authoring environment. The presentation engine allows playback of video content within a web browser without additional plug-ins, and the authoring environment supports users of different levels in creating multimedia-embedded documents with built-in synchronization mechanisms.

WebSmart uses a simple document model to create multimedia web presentations.
Designed primarily for distance learning applications, it is able to generate three kinds of components for a presentation: Table of Content, Media, and Accompanying HTML pages. These components correspond intuitively to the elements of a real classroom course, such as the syllabus, the instructor's narration and the overhead slides, and are therefore easy for presentation authors to understand. In addition, these components can be used in constructing other applications such as corporate training and on-line publication or advertisement.

WebSmart is designed to adapt to different levels of users. For inexperienced web page authors, presentations can be generated exclusively with WebSmart's authoring tools. These tools include a wsEditor for assembling all components in a presentation, a CaptionMaker for generating closed captions, and a QuizDesigner for creating quiz sections, which embed more intensive interactions than ordinary instruction sections. These tools allow authors to design their documents in a WYSIWYG manner. For experienced authors, WebSmart defines a set of statements for creating customized synchronization mechanisms. These simple statements are embedded in JavaScript functions, so authors do not need to learn an extra scripting language.

To address the timing management problem in multimedia presentations, WebSmart realizes two levels of inter-object synchronization: page flipping and lip sync. Page flipping arranges for URLs to appear in the Accompanying HTML pages at pre-defined timing points while the Media frame presents the audio/video clip. Lip sync realizes a finer-grained form of synchronization. While the Media frame is presenting the audio/video clip, a text field below that frame displays the content of the narration at a pace corresponding to the movement of the narrator's lips.
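Page flipping can be pictured as a time-ordered event table that is polled against the media clock. The sketch below is only an illustration under that assumption and is not WebSmart's own algorithm. The method names `addEvent` and `endEvent` echo the calls made in the appendix pages, but the `EventTable` object, its `tick` method and all of the logic inside are hypothetical.

```javascript
// Hypothetical event table; each entry mirrors the (handler, argument, mediaFile, time)
// tuples that the appendix pages pass to addEvent().
function EventTable() {
  this.events = [];
  this.next = 0; // index of the first event that has not fired yet
}

EventTable.prototype.addEvent = function (handler, arg, mediaFile, seconds) {
  this.events.push({ handler: handler, arg: arg, mediaFile: mediaFile, time: Number(seconds) });
};

// Assumed meaning of endEvent(): registration is complete, so sort by time.
EventTable.prototype.endEvent = function () {
  this.events.sort(function (a, b) { return a.time - b.time; });
  this.next = 0;
};

// Called periodically with the current media time in seconds.
// Fires every event that has become due and returns their arguments.
EventTable.prototype.tick = function (mediaTime, handlers) {
  var fired = [];
  while (this.next < this.events.length && this.events[this.next].time <= mediaTime) {
    var e = this.events[this.next++];
    if (typeof handlers[e.handler] === "function") {
      handlers[e.handler](e.arg);
    }
    fired.push(e.arg);
  }
  return fired;
};

// Illustrative use: flip the title frame at 0 s and the notes frame at 34 s and 46 s.
var table = new EventTable();
table.addEvent("changetitle", "sec_title0.html", "movies/sec0.mpg", "0");
table.addEvent("changeNotes", "slide0.html", "movies/sec0.mpg", "34");
table.addEvent("changeNotes", "slide1.html", "movies/sec0.mpg", "46");
table.endEvent();

var handlers = {
  changetitle: function (url) { console.log("title frame ->", url); },
  changeNotes: function (url) { console.log("notes frame ->", url); }
};
table.tick(35, handlers);
```

Driving the table from the media time rather than from a separate wall-clock timer is one way to keep the frames from drifting away from the clip when playback stalls, which is the concern the two synchronization levels address.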
With WebSmart's original algorithm, these two levels of synchronization prevent the multiple components within a web page from drifting apart in time, and present the time-sensitive media (audio/video) in a consistent manner. Both kinds of synchronization events can be generated visually in the WebSmart authoring tools, and they are handled by the presentation engine at presentation time.

The WebSmart system is built with an object-oriented methodology for easy maintenance and extension. By leveraging the latest progress in Java, Java Media Framework, HTML, and JavaScript, WebSmart tries to provide an open solution that can easily incorporate future developments. The authoring system runs seamlessly on both Unix and Windows 95/NT platforms, and the presentations it creates are accessible from within ordinary web browsers. An HTTP server and a distance learning case have been built to test the performance of the system.

Although enhancements have been made to reduce the traffic put onto the network and to improve the performance of accessing multimedia documents via the Internet, more work is expected on scalable compression algorithms, specialized multimedia servers and the network. Future work includes improving the functionality of WebSmart using Dynamic HTML, and accommodating live capturing and conferencing in the authoring environment.

6.2 Future Work

JMF is working on live capturing and conferencing specifications. Once these specifications are finalized, WebSmart can accommodate them in the current authoring environment.

Another potential extension to WebSmart is to improve the functionality of the system using the latest developments in Dynamic HTML. Despite its wide application on the web, current HTML is described by Bosak [33] as having the following limitations:

• Extensibility. HTML does not allow users to specify their own tags or attributes in order to parameterize or otherwise semantically qualify their data.
• Structure. HTML does not support the specification of deep structures needed to represent database schemas or object-oriented hierarchies.
• Validation. HTML does not support the kind of language specification that allows consuming applications to check data for structural validity on importation.

The World Wide Web Consortium is working on creating new languages and interfaces to address the above problems. Each of these new elements works on a slightly different set of HTML problems: Extensible Markup Language (XML) on helping to organize and find data, Cascading Style Sheets (CSS) on Web page inheritance and presentation, and Document Object Model (DOM) on manipulating the content. CSS, combined with HTML 4.0, scripts, and DOM, is termed Dynamic HTML (DHTML) by some vendors and is used to create dynamic web pages.

6.2.1 Extensible Markup Language (XML)

As defined in W3C's XML 1.0 Recommendation [31], XML "describes a class of data objects called XML documents and partially describes the behavior of computer programs which process them". It is designed to make it easy and straightforward to use SGML on the Web. Standard Generalized Markup Language (SGML) (ISO 8879) [34] is the international standard for defining descriptions of the structure and content of different types of electronic documents. SGML specifies grammars for document markup languages, and offers functions not available in HTML, such as the three listed above (Extensibility, Structure and Validation). HTML itself is based on SGML, with only a limited set of hard-wired tags implemented. The major problems of SGML are its large, cumbersome specification and the difficulty of learning and implementing the language syntax.

XML is designed to bridge the gap between SGML's richness and HTML's ease of use in Web applications.
The style and structure of an XML document look similar to those of an HTML document, except that an XML file can use a Document Type Definition (DTD) to specify the logical structure of the document. When a web server with XML content prepares data for transmission, it generates a context wrapper pointing to the associated DTD. Web clients, when unpacking the XML document, parse the content according to the DTD. One example of what a WebSmart presentation would look like in XML is shown below.

    <WS_PRESENTATION>
      <NAME>A JMF Web Course</NAME>
      <SECTION>
        <NUMBER>1</NUMBER>
        <TYPE>Instruction</TYPE>
        <NAME>Introduction</NAME>
        <VIDEO_FILE>movies/sec11_1.mpg</VIDEO_FILE>
        <AUDIO_FILE>movies/sec11_1.mpa</AUDIO_FILE>
      </SECTION>
      <SECTION>
        <NUMBER>2</NUMBER>
        <TYPE>Quiz</TYPE>
        <NAME>A sample quiz</NAME>
        <VIDEO_FILE>movies/sec11_2.mpg</VIDEO_FILE>
        <AUDIO_FILE>movies/sec11_2.mpa</AUDIO_FILE>
        <QUESTION>
          <NUMBER>1</NUMBER>
          <Q_BODY>What is JMF?</Q_BODY>
          <ANSWER>Java Media Framework</ANSWER>
        </QUESTION>
      </SECTION>
    </WS_PRESENTATION>

In this example, the presentation is named "A JMF Web Course" and has two sections. Each section has a section number, a type (Instruction or Quiz), a name, a video clip and an audio clip. A Quiz type section also includes one or more questions. Each question has a question body and an answer. Notice that the tags in the XML file correspond intuitively to the logical structure of the presentation itself, rather than to the FRAMESETs and FRAMEs of the original HTML files. How these tags are interpreted is up to the DTD, and the appearance of the page within browsers can be specified using CSS.

6.2.2 Cascading Style Sheets (CSS)

While XML helps organize the data, a style sheet describes how documents are presented in a Web browser, for instance their color, font, position, size and format.
Some advantages of CSS [30] that can be applied to WebSmart are as follows:

• Positioning. HTML uses FRAMESET and FRAME to lay out components with sizes given as percentages of the width and height of the entire page. WebSmart has to use a special layout manager (Ratio_Layout) to provide flexibility in adjusting the position and size of a component. CSS, however, places each component in an invisible rectangle whose width, height and distance from the top and left edges of the browser window can be specified with pixel precision. Page authors can therefore place components at precise coordinates. Moreover, as the term cascading style sheets implies, more than one style sheet can be used on the same document, with different levels of importance.
• Font. HTML does not support font specification in a document, so users must find workarounds to change fonts for the presentation, as WebSmart does by defining special Java functions. CSS, however, provides explicit specification of page fonts, which frees page authors from dealing with fonts outside the HTML file.

6.2.3 Document Object Model (DOM)

The Document Object Model (DOM) [32] is a "platform- and language-neutral program interface that will allow programs and scripts to access and update the content, structure, and style of documents in a standard way". Currently several vendors offer software for embedding scripts in documents to access and update content, but each does so through a non-standard interface, which prevents authors from using these documents in an interoperable manner. The Document Object Model provides a standard set of objects for representing HTML and XML documents, a standard model of how these objects can be combined, and a standard interface for accessing and manipulating them.
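As an illustration of what a standard interface makes possible, the sketch below walks a document tree using only the DOM node properties `nodeType`, `nodeName`, `childNodes` and `textContent`. The function `collectText` and the helper `el` are hypothetical names. The helper builds plain objects that stand in for DOM elements so that the example also runs outside a browser, and the sample tree borrows the tag names of the WS_PRESENTATION example in Section 6.2.1.

```javascript
// Collects the text of every element named `tagName` at or beneath `node`.
// Only standard DOM node properties are used, so any conforming tree will do.
function collectText(node, tagName, out) {
  out = out || [];
  if (node.nodeType === 1 && node.nodeName === tagName) { // 1 is ELEMENT_NODE
    out.push(node.textContent);
  }
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    collectText(children[i], tagName, out);
  }
  return out;
}

// Hypothetical stand-in for a DOM element, used only to make the sketch self-contained.
function el(name, children, text) {
  return { nodeType: 1, nodeName: name, childNodes: children || [], textContent: text || "" };
}

var root = el("WS_PRESENTATION", [
  el("NAME", [], "A JMF Web Course"),
  el("SECTION", [el("NAME", [], "Introduction")]),
  el("SECTION", [el("NAME", [], "A sample quiz")])
]);

console.log(collectText(root, "NAME"));
```

In a browser, the same function could presumably be handed the root element of a parsed XML document instead of the stand-in tree; the calling code would not need to know which vendor's parser produced it.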
Vendors can support the DOM as an interface to their proprietary data structures and APIs, and content authors can write to the standard DOM interfaces rather than product-specific APIs. As a consequence, interoperability on the Web is increased. With regard to WebSmart, once its XML presentation conforms to the standards specified by the DOM, it can interoperate with other web objects as long as the latter conform to the DOM as well.

Bibliography

[1] Multimedia - From Vision to Reality, Sandra L. Teger, AT&T Technical Journal, Volume 74, Number 5, September/October 1995
[2] MPEG home page,
[3] H.261, International Telecommunication Union Recommendation, http://, March, 1993
[4] Real Time Streaming Protocol (RTSP), Internet Engineering Task Force, http://search.ietf.org/internet-drafts/draft-ietf-mmusic-rtsp-09.txt
[5] Advanced Streaming Format (ASF) White Paper, Microsoft Corporation, http://, December, 1997
[6] A Media Synchronization Survey: Reference Model, Specification, and Case Studies. G. Blakowski and R. Steinmetz. IEEE Journal on Selected Areas in Communications, Vol. 14, No. 1, January 1996
[7] A Temporal Model for Interactive Multimedia Scenarios. Nael Hirzalla, et al. IEEE Multimedia, Vol. 2, No. 3, Fall 1995
[8] Synchronizing the presentation of multimedia objects. Petra Hoepner. Computer Communications, Vol. 15, No. 9, 1992
[9] A Simple, Intuitive Hypermedia Synchronization Model and its Realization in the Browser/Java Environment. Jin Yu. DEC Technical Note, October 1997
[10] HTML 4.0 Specification, W3C Recommendation 18-Dec-1997. http://
[11] Netscape Plug-in Guide, communicator/plugin/index.htm
[12] QuickTime, Apple Computer, Inc.
[13] Authoring Systems, Craig Locatis, et al., LISTER HILL Monograph: LHNCBC 92-1, January 1992.
[14] The Java Language: An Overview, Sun, java/java-overview-1.html
[15] The Java Language Environment - A White Paper.
James Gosling, Henry McGilton, May 1996.
[16] Java Media Player, Sun, SGI, and Intel,
[17] JavaScript Guide, Netscape,
[18] Mastering Visual Basic 5, Evangelos Petroutsos, Sybex Incorporation, 1997
[19] Direct 6 and Lingo Interactive, Macromedia Inc., Macromedia Press, 1997
[20] Synchronization in WebSmart Multimedia Authoring System, Lan Li, 1998
[21] Authorware, Macromedia,
[22] RealSystem, RealNetworks,
[23] NetShow, Microsoft, ccag-f.htm
[24] Synchronized Multimedia Integration Language, Philipp Hoschka, et al. http://
[25] Object-Oriented Software Engineering, Ivar Jacobson, et al. Addison-Wesley, 1992
[26] Object-Oriented Modeling and Design. James Rumbaugh, et al. Prentice Hall, 1991
[27] Hypermedia Presentation and Authoring System. DEC System Research Center,
[28] How to Build a Layout Manager, Terence Parr, developer/javaInDepth/layout-mgrs-10-96.html
[29] Java Threads. Scott Oaks, Henry Wong. O'Reilly & Associates, Inc. 1997
[30] Cascading Style Sheets, World Wide Web Consortium, Style/css/
[31] Extensible Markup Language (XML) 1.0, W3C Recommendation 10-February-1998
[32] DOM: Document Object Model Specification, W3C Working Draft, 18 March, 1998.
[33] XML, Java, and the future of the Web. Jon Bosak, sun-info/standards/xml/why/xmlapps.htm, 1997
[34] Standard Generalized Markup Language (SGML), W3C, cate/dl6387.html
[35] Scripting: Higher-Level Programming for the 21st Century, John K. Ousterhout, IEEE Computer Vol. 31, No. 3, March 1998
[36] WebSmart home page,
[37] MPEG-4, Moving Picture Experts Group, mpeg/public/w2196.htm
[38] Gigabit Ethernet - accelerating the standard for speed, white paper. Gigabit Ethernet Alliance, 1997.
[39] RSVP: A New Resource ReSerVation Protocol. L. Zhang, et al. IEEE Network, September 1993

Appendix A

Appendix A lists the source code for section1.html.

<HTML>
<HEAD>
<TITLE>Course Overview</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#990000" VLINK="#333399" ALINK="#333399" OnLoad="loadEvents()">
<H2> 1 Course Overview </H2>
<applet name="clock" code=Display.class width=320 height=350 MAYSCRIPT>
<blockquote>
<hr>
<em>Your browser doesn't understand the APPLET tag. Here's a screenshot of the clock applet that you would see running here if it did:</em>
<P>
<hr>
</blockquote>
<PARAM name="VIDEO-FILE" value="movies/sec0.mpg">
<PARAM name="AUDIO-FILE" value="movies/sec0.mpa">
<PARAM name="IMAGE-FILE" value="movies/sec0.gif">
<PARAM name="CAPTION-FILE" value="movies/sec0.cap">
<PARAM name="CAPTION-PACE" value="5">
</applet>
<SCRIPT>
// Load a page into the title frame.
function changetitle(n) {
  parent.frames["title"].location=n;
}
// Load a page into the Notes frame.
function changeNotes(n) {
  parent.frames["Notes"].location=n;
}
// Register the page flipping events with the applet when the page loads.
function loadEvents() {
  var APP = document.applets["clock"];
  APP.clearAllEvents();
  APP.addEvent("changetitle", "sec_title0.html", "movies/sec0.mpg", "0");
  APP.addEvent("changeNotes", "slide0.html", "movies/sec0.mpg", "34");
  APP.addEvent("changeNotes", "slide1.html", "movies/sec0.mpg", "46");
  APP.endEvent();
}
</SCRIPT>
</BODY>
</HTML>

Appendix B

Appendix B lists the source code for section3.html. Section 3 is a quiz section.

<HTML>
<HEAD>
<TITLE>Quiz</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#990000" VLINK="#333399" ALINK="#333399" OnLoad="loadEvents()">
<H2> 3 Quiz </H2>
<applet name="clock" code=Display.class width=320 height=350 MAYSCRIPT>
<blockquote>
<hr>
<em>Your browser doesn't understand the APPLET tag.
Here's a screenshot of the clock applet that you would see running here if it did:</em>
<P>
<hr>
</blockquote>
<PARAM name="VIDEO-FILE" value="movies/sec11_1.mpg">
<PARAM name="AUDIO-FILE" value="movies/sec11_1.mpa">
<PARAM name="IMAGE-FILE" value="movies/sec11_1.gif">
<PARAM name="CAPTION-FILE" value="movies/sec11_1.cap">
</applet>
<SCRIPT>
function changetitle(n) {
  parent.frames["title"].location=n;
}
function changeNotes(n) {
  parent.frames["Notes"].location=n;
}
// Pause the applet; the question pages register this as a timed event.
function pauseApp() {
  var APP = document.applets["clock"];
  APP.appletPause();
}
function loadEvents() {
  var APP = document.applets["clock"];
  APP.clearAllEvents();
  APP.quizReset();
  APP.addEvent("changetitle", "sec_title4.html", "movies/sec11_1.mpg", "2");
  APP.addEvent("changeNotes", "q1.html", "movies/sec11_1.mpg", "11");
  APP.addEvent("changeNotes", "q2.html", "movies/sec11_1.mpg", "16");
  APP.addEvent("changeNotes", "q3.html", "movies/sec11_1.mpg", "32");
  APP.endEvent();
}
</SCRIPT>
</BODY>
</HTML>

Appendix C

Appendix C lists the source code for q1.html.

<HTML>
<HEAD>
<SCRIPT>
// Ask the applet in the media frame to pause at 13 seconds so the viewer can answer.
function loadEvents() {
  var APP=parent.frames["media_app"].document.applets["clock"];
  APP.addEvent("pauseApp"," ","movies/sec11_1.mpg","13");
}
// Compare the typed answer with the expected one, update the score, and resume playback.
function check_answer() {
  var correct_answer="Java Media Framework";
  var your_answer=document.quiz.answer.value;
  var APP=parent.frames["media_app"].document.applets["clock"];
  APP.one_more_quiz();
  if (your_answer == correct_answer) {
    APP.one_more_correct_answer();
    document.quiz.result.value="correct!";
  }
  else document.quiz.result.value="wrong!";
  APP.appletContinue();
}
</SCRIPT>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#990000" VLINK="#333399" ALINK="#333399" OnLoad="loadEvents()">
<P>&nbsp;&nbsp;&nbsp;</P>
<P>What does JMF stand for ?</P>
<P><FORM NAME="quiz"><INPUT TYPE="text" NAME="answer" VALUE="" SIZE=50></P>
<P><INPUT TYPE="button" VALUE="Done" NAME="submit" OnClick="check_answer()"><HR></P>
<DIR>
<DIR>
<DIR>
<P>Your score is <INPUT TYPE="text" NAME="result" SIZE=10 readonly></P>
</DIR>
</DIR>
</DIR>
<P></FORM></P>
</BODY>
</HTML>

