UBC Theses and Dissertations

MPEG-4 delivery : DMIF based unicast and multicast systems Asrar Haghighi, Kambiz 2001

MPEG-4 DELIVERY: DMIF BASED UNICAST AND MULTICAST SYSTEMS
by
Kambiz Asrar Haghighi
B.A.Sc., The University of British Columbia, 1999
A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
Department of Electrical and Computer Engineering
We accept this thesis as conforming to the required standard
THE UNIVERSITY OF BRITISH COLUMBIA
September 2001
© Kambiz Asrar Haghighi, 2001
In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.
Department of Electrical and Computer Engineering
The University of British Columbia
Vancouver, Canada
Date: October 2001
ABSTRACT
In recent years, there has been tremendous development in the field of multimedia and Internet technologies. However, not until recently did the two areas begin to converge. One standard that attempted to do just this is MPEG-4. This is the first multimedia standard designed with the goal of Internet streaming in mind. The novelty of this concept required a proof of concept as well as a feasible design that would take advantage of these features. This thesis discusses developments in the field of unicast as well as multicast streaming for MPEG-4 traffic based on the Delivery Multimedia Integration Framework (DMIF) standard.
The design and implementation required conformance to the systems layer specifications within the standard and interoperability with the IM1 reference software. The unicast system required the design and implementation of a multi-client streaming server that would support high complexity multimedia data and deliver transparent data to the receiver. This design includes issues such as rate control, multithreaded development for multi-client operation and other novel features. The development of this unicast system led to some efficiency issues for which a multicast system was proposed and designed based on the DMIF standard with Internet Group Management Protocol (IGMP) additions. This proposed system is simulated and the performance improvements and design enhancements are discussed. The finished software for the unicast system is part of the verification model for the MPEG-4 standard conformance software. This system is completely standard compliant and provides some media monitoring capabilities and rudimentary interactivity. We expect this work to provide a reference architecture for a wide range of standards compliant MPEG-4 server designs in academic and industrial institutions.
TABLE OF CONTENTS
Abstract
Table of Contents
List of Tables
List of Figures
Dedication
Acknowledgements
CHAPTER 1. Introduction
1.1 OBJECTIVES
1.2 OVERVIEW
CHAPTER 2. MPEG-4 Background
2.1 MPEG-4
2.1.1 Compression layer
2.1.2 Systems layer
2.1.2.1 The Sync Layer
2.1.2.2 The FlexMux Tool
2.1.3 Delivery layer
2.2 THE DMIF COMMUNICATION MODEL
2.2.1 Design Architecture
2.2.2 The DMIF Application Interface
2.2.3 The DMIF Signaling Protocol
CHAPTER 3. Unicast System
3.1 UNICAST STREAMING
3.2 REMOTE INSTANCE SYSTEM ARCHITECTURE
3.3 MPEG-4 CLIENT (REMOTE RETRIEVAL INSTANCE)
3.3.1 IM1 Remote Instance Design Architecture
3.3.2 DPI Implementation: Remote Instance
3.3.3 DNI Framework
3.3.3.1 DNI Client Side Design: DN Package
3.3.3.2 DN Package Implementation Details (UDP/TCP)
3.3.3.2.1 NetSessionList: list of network sessions
3.3.3.2.2 DArray and GenericMsgLoop
3.3.3.3 DDSP Messages
3.3.4 Data Plane Thread
3.4 MPEG-4 STREAMING SERVER
3.4.1 Application Layer
3.4.2 DMIF Service Layer
3.4.3 DN Package
3.4.4 Data Plane Thread
3.4.4.1 Scheduling
3.4.4.2 Rate Control & Fast Start
3.5 SIMULATION & RESULTS
3.5.1 Single Client Operation
3.5.2 Rate Control
3.5.3 Fast Start Mechanism
3.5.4 Multi-Client Behaviour
3.6 POSSIBLE SYSTEM IMPROVEMENTS
CHAPTER 4. Multicast Background
4.1 IP MULTICAST
4.2 MULTICAST GROUPS
4.3 INTERNET GROUP MANAGEMENT PROTOCOL (VER. 2)
4.3.1 IGMP Message Format
4.3.2 Protocol Description
CHAPTER 5. Multicast System
5.1 MULTICAST SERVER
5.1.1 Fast Start Algorithm
5.1.2 Client Random Access
5.2 MULTICAST CLIENT
5.2.1 DMIF with IGMP
5.2.2 Multicast Messaging
5.3 SIMULATION RESULTS
5.3.1 Network Congestion
5.3.1.1 Backbone Congestion
5.3.1.2 Access Network Congestion
5.3.2 Media Scalability
5.4 ADVANTAGES OF MULTICAST SYSTEMS
CHAPTER 6. Conclusion
6.1 CONTRIBUTIONS
6.2 FUTURE RESEARCH
Abbreviations
Bibliography
Appendix I
Appendix II
II.1 SESSION SETUP [CALLBACK]
II.2 SESSION RELEASE [CALLBACK]
II.3 SERVICE ATTACH [CALLBACK]
II.4 SERVICE DETACH [CALLBACK]
II.5 TRANSMUX SETUP [CALLBACK]
II.6 TRANSMUX CONFIG [CALLBACK]
II.7 TRANSMUX RELEASE [CALLBACK]
II.8 CHANNEL ADD [CALLBACK]
II.9 CHANNEL ADDED [CALLBACK]
II.10 CHANNEL DELETE [CALLBACK]
II.11 USER COMMAND [CALLBACK]
II.12 USER COMMAND ACK [CALLBACK]
Appendix III
LIST OF TABLES
Table 4-1 IGMP Message Types
Table 5-1 Link Utilization
Table 5-2 Bandwidth Improvement based on Number of Receivers
Table I-1 Remote Instance DLL Files
Table I-2 Server Files
LIST OF FIGURES
Figure 2-1 ISO/IEC 14496 Terminal Architecture
Figure 2-2 MPEG-4 Presentation
Figure 2-3 Object Descriptor Referencing (a) one audio ES, (b) two audio ESs
Figure 2-4 Referencing the same ES in Different Contexts
Figure 2-5 FlexMux Packet Definition
Figure 2-6 DMIF Addressed Technologies
Figure 2-7 DMIF Communication Model
Figure 2-8 The Data Plane in an MPEG-4 Terminal
Figure 3-1 System Architecture
Figure 3-2 DMIF Control Plane Communication
Figure 3-3 DMIF Remote Retrieval Instance Architecture
Figure 3-4 IM1 DMIF Structure Implementation
Figure 3-5 Service Record Object Structure
Figure 3-6 A Network Session shared by multiple Service Sessions
Figure 3-7 MPEG-4 Streaming Server Architecture
Figure 3-8 User "Play" Command
Figure 3-9 "Apadana" Server Application
Figure 3-10 Video Delivery Comparison
Figure 3-11 Video Arrival Lead over Play-out Time
Figure 3-12 Audio Arrival Lead over Play-out Time
Figure 3-13 Rate-Control Mechanism
Figure 3-14 Fast Start Mechanism
Figure 3-15 Fast Start Mechanism (a) Server Bitrate (b) Average Server Bitrate
Figure 3-16 Multi-Client Operation
Figure 3-17 Client Media Lead Time
Figure 4-1 Multicast Tree
Figure 4-2 IGMP Message Format
Figure 5-1 Random Access Model
Figure 5-2 IGMP Host State Diagram
Figure 5-3 Multicast Session Initiation
Figure 5-4 Multicast Session Termination
Figure 5-5 OPNET Simulation Network
Figure 5-6 Unicast vs. Multicast Backbone Utilization
Figure 5-7 Unicast vs. Multicast Queuing Delay
Figure 5-8 Queuing Delay for two Receivers within Different Subnets
Figure 5-9 Unicast vs. Multicast Received Traffic
Figure 5-10 Backbone Utilization
Figure 5-11 Unicast vs. Multicast Sent Traffic
Figure 5-12 Multicast Networks within the Internet
Figure I.1 IM1 Client Program
Figure II.1 UML Description of Session Setup
Figure II.2 Initiation of a Service in a Remote Interactive DMIF
Figure III.1 Client Media Session Buffering Scheme
To Maman and Baba
ACKNOWLEDGEMENTS
The quest for knowledge, I believe, is what drives the human race. It is, however, not a simple task, and in the end the credit belongs to a group of people rather than an individual. In my quest to contribute a modicum of knowledge that is meaningful to the world, I have had a great deal of support. I would like to show my appreciation and gratitude to my family, friends and colleagues who have helped me in my lifelong endeavours, especially throughout this undertaking. I know that I have missed many important people here, and would like to acknowledge them also.
To my immediate family, my grandmother, aunts and uncles (wherever you may be), your support has always been felt and been a driving factor for me throughout my life and education. For the endless encouragement from my parents over the past quarter century, I am both grateful and always indebted. I am indeed fortunate to have spent most of my adult life living with Kasra. He has not only been a great brother but an even more valuable friend, whose presence over the past couple of years has made my life more exciting and easy going.
Most importantly I would like to thank Dr. Alnuweiri, my supervisor, who has advised, directed and helped me over the last two years to grow as a person in the process of producing this work. The friendly discussions regarding school, life, work, and finances have enriched my life greatly and given me a different perspective on the world.
To the fellow group members of the Multimedia Communications and Networking Lab as well as the Lab for Advanced Networks, I would like to extend my deepest gratitude for all the wonderful discussions and meetings. You are the ones that have had the most impact on making my experience a more enjoyable one. I have found a great friend in Yaser, and I believe that this experience has resulted in a lifetime friendship. Haitham has not only given me lots of help through wonderful discussions, but has also given me the motivation to practice my Arabic. Ayman, Amr, Safwan, Tamer, Anwar, and Carol have always been there to lend a listening ear or a helping hand, and it has been a pleasure being in the same group as all of you. I'd also like to thank Dr. Soudack, with whom the conversation-filled lunches are always a pleasure.
To the executive members of the ECEGSA, I have to say that you have made the ECE department a more interesting, social and educational environment to be in. I hope that future students can benefit from your efforts as I have.
To my friends, I'd like to say that you have made this experience unforgettable and one that I'd gladly repeat. Ashley has constantly helped me keep a balance between work, play, and more play. Mohsen has given me many useful and informative conversations and the opportunity to remain in some sort of shape through our wonderful and heated badminton games. Nima has always been eager to help take my mind off the research when it was becoming too much. Taraneh has, through the past couple of years, become much more than a cousin, and our friendship has grown tremendously. To Ali, who has given me great new insight and whose presence is always a treat. To Maureen, thanks for your listening ear, wonderful conversations, and enduring my numerous proofreading requests. And finally, to all the people that have played alongside me in the various football games, thanks for all the great times.
CHAPTER 1. INTRODUCTION
This chapter presents an introduction to the material contained in this thesis. In the first section, the motivation and objectives of the research are presented. The second and final section contains an overview of the thesis, and a brief description of each of the components.
1.1 OBJECTIVES
Throughout human development, people have been trying to find means of communicating with richer content. One of the most noteworthy breakthroughs in this area occurred in the 15th century with the invention of the printing press by Johannes Gutenberg. Within 100 years of Gutenberg's invention, the Americas were discovered, the authority of the Catholic Church was questioned, and scientists challenged many long-held dogmas. It is arguable that none of this would have happened without the easy exchange of ideas made possible by Gutenberg's printing press [Enca99]. Another breakthrough in the area of communication is the kinetophonograph.
Invented by William Dickson in 1889, it produced a brief flickering image with poorly synchronized sound. This, however, did not transform into cinema for another 40 years, and we are still experiencing the repercussions of media delivery to the masses. However, both these techniques represent a unidirectional flow of information, from the author to the passive audience. As media delivery technology has developed, so has the desire for bi-directional, feedback-based media communication techniques. The exchange of ideas has become an integral part of the drive for developing rich content (media) communication methods.
Today, with the universal availability of "connected" computers, media communications is largely based on the efficiency of the media coding techniques used. One of the newest standards that attempts to close the gap to the ubiquitous availability of media is MPEG-4. Along with this need for content rich media, the Internet community has sought ways to make this media available to a group at large, analogous to radio and television broadcasts. This has been most successful through the efforts of the IP multicast standard.
The MPEG-4 standard [ISO14496-1], developed over the past five years by the Moving Picture Experts Group (MPEG) of the Geneva-based International Organization for Standardization (ISO), explores every possibility of the digital environment. Recorded images and sounds co-exist with their computer-generated counterparts; a new language for sound promises compact-disk quality at extremely low data rates; and the multimedia content could even adjust itself to suit the transmission rate and quality [Fran00]. One of the major advances made by MPEG-4 that changes the traditional nature of media communications is that viewers and listeners need no longer be passive.
The pinnacle of "interactivity" in audio-visual systems today is the user's ability to merely stop, rewind, forward, or start a video in progress. MPEG-4 is completely different in that it allows the user to interact with objects within the scene. Authors of content can give users the power to modify scenes by deleting, adding, or repositioning objects, or to alter the behavior of the objects [Batt99].
MPEG-4 has been designed with the capability to support a wide range of access speeds over the Internet. To enable this feature, MPEG-4 supports scalable content, that is, it allows content to be encoded once and automatically played out at different rates with acceptable quality for the communication environment at hand. This scalability on top of a multicast network can bring the traditional television broadcast experience to computers everywhere. Although MPEG-4 provides diverse functionality targeted at remote streaming transmission scenarios, not until recently had there been a standard for this media delivery. Taking into account the benefits that arise from using the MPEG-4 standard in a remote retrieval scenario, it is logical to develop a mechanism for delivering this rich media information over networks.
This thesis discusses the work done in the area of MPEG-4 content-rich media delivery over unicast (one-to-one) and multicast (one-to-many) networks. The major motivation for this research has been to develop an open standard architecture for the remote delivery of MPEG-4 media over a unicast link using a client-server architecture. This implementation is further developed and analyzed to propose a technique to build the underlying capability for multicast session support. This is especially important in view of MPEG-4's scalability and object-based functionalities, and the need to deliver media to the masses in broadcast environments.
1.2 OVERVIEW
The developed architecture for the remote retrieval scenario is different from the multicast scenario but does provide a great deal of insight into the development of a multicast server and the issues involved. Although the two issues of media delivery and multicast networks overlap, the background required is diverse, which is why the chapters have been organized to present the introductory material separately.
In Chapter 2, background material on the MPEG-4 standard is presented. In the first section, an overview of the layering within the standard architecture is given with an introduction of the functionality associated with each of the layers. The second section contains an in-depth overview of the MPEG-4 delivery standard. Any relevant background that is not covered in this chapter will be addressed in the following chapters based on the associated discussions.
In Chapter 3, a design for the Remote Instance implementation is proposed. The first section contains the issues that have to be considered in a unicast streaming system as well as some of the benefits and disadvantages. The second, third and fourth sections describe the design and implementation of the Remote Instance system, including an in-depth view of the properties of the client and server. The fifth section includes some results as to the benefits of such a system, obtained through simulation and experimentation with the Remote system. The final section contains some possible improvements that can be made to the proposed unicast system.
In Chapter 4, background for the IP multicast standard is presented. The first section contains background information, the standard goals and its characteristics. The second and third sections explain the notion of groups in the one-to-many scenario and the techniques and protocols used for group communications.
In Chapter 5, a proposed design for a multicast system based on the Remote Retrieval Instance scenario is given and this design is evaluated to determine its advantages and shortcomings. In the first section, the modifications that are required for the server are presented. The second section discusses the additions to the delivery layer of the client and its impact on the system. The final two sections evaluate this design through simulation and provide an objective determination of its pros and cons.
Chapter 6 contains the conclusions that have resulted from this work and the future directions that can be taken. Some of the viable research areas will be a direct enhancement to either of the two systems discussed, whereas some will use the systems as a platform to build on.
CHAPTER 2. MPEG-4 BACKGROUND
This chapter provides some background information on the MPEG-4 standard and its components. Since the focus of this thesis is timely delivery (whether it be local or streamed), the background on MPEG-4 delivery is much more extensive than the other sections. Section 1 has a brief overview of the MPEG-4 standard, and section 2 describes the delivery mechanism associated with it.
2.1 MPEG-4
With the recent advances in multimedia technology, it has become inevitable for new standards to arise that change the way media is stored, depicted, and delivered. MPEG-4 is the first standard that views multimedia content as a set of audio-visual objects that are presented, manipulated and transported individually. The audio-visual objects that together make up a presentation can vary from traditional text to 3-dimensional animations. This standard specifies the entire spectrum of tools required for encoding objects within a scene, composing presentations, and accessing them through a variety of delivery technologies.
As was mentioned in the previous chapter, there are various aspects of MPEG-4 that make it appealing to use with streaming technology. To support a wide range of access speeds over the Internet, MPEG-4 supports scalable content. There are a number of ways a presentation can be scalable in MPEG-4. Firstly, the media can be compressed with scalable layers, as is the case with H.263+. The most important information can be kept in the base layer, whereas additional presentation detail can be encoded within secondary extended layers. Thus, if there is a bandwidth limitation on the connection available, the extended layers can always be dropped and the client would still receive acceptable presentation quality.
Another form of scalability provided by MPEG-4 is due to the object-oriented presentation of media. Within a presentation, the important object streams could be grouped as the base layer, whereas the additional scene enhancing objects could be grouped as the secondary layer. Similarly to the previous case, the extended layer would be discarded when faced with bandwidth restrictions.
The latest scalability mechanism developed for MPEG-4 media is Fine Granular Scalability (FGS). It has been introduced to allow maximum adaptability to the unpredictable variation in bandwidth over the Internet [Scha00]. FGS consists of a base layer and an enhancement layer coded in a progressive manner. The base layer is coded to a minimally acceptable quality of video so that it always requires less bandwidth than the time-varying available network bandwidth. The enhancement layer improves upon the base-layer video, fully utilizing the available bandwidth at transmission time. The layering in MPEG-4 provides added error resilience for network transmission of media.
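The layer-dropping behaviour described above can be illustrated with a minimal sketch. The `Layer` class, the `select_layers` function and the bitrate figures are hypothetical names and values chosen for illustration; the sketch also assumes that the base layer is always sent and that each enhancement layer builds on the one before it, which is one plausible reading of the text rather than a rule taken from the standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str
    bitrate_kbps: int  # illustrative figure, not taken from the thesis


def select_layers(layers: List[Layer], available_kbps: int) -> List[Layer]:
    """Keep the base layer and add enhancement layers, in order, while they fit."""
    base, *enhancements = layers
    chosen = [base]  # assumption: the base layer is always transmitted
    budget = available_kbps - base.bitrate_kbps
    for layer in enhancements:
        if layer.bitrate_kbps > budget:
            break  # later layers depend on this one, so stop at the first miss
        chosen.append(layer)
        budget -= layer.bitrate_kbps
    return chosen
```

A receiver on a constrained link would obtain only the base layer from such a selection, while a receiver with ample bandwidth would obtain every layer.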
Error resilience is achieved by adding additional error protection bits to the base layer streams while keeping the protection on the extended layer constant. Thus, in high bit error rate networks, the base layers will arrive at the receiver safely, and the high cost (in bits) of adding such error protection codes to the entire presentation is avoided.
An MPEG-4 terminal contains three layers, the compression layer, the synchronization layer, and the delivery layer, as depicted in Figure 2-1. The layering adopted by the standard adds all the benefits of a modular design and ensures that each layer is only aware of the information that is meaningful to it. True to MPEG tradition, MPEG-4 focuses on media coding (the "Compression Layer"). However, there are a number of other aspects that this standard also addresses: the "Systems" level relationship between the various audio-video objects and the abstraction of the "Delivery" technology.
Figure 2-1 ISO/IEC 14496 Terminal Architecture (Compression Layer: media aware, delivery unaware, ISO/IEC 14496-2 Visual and ISO/IEC 14496-3 Audio; Sync Layer: media unaware, delivery unaware, ISO/IEC 14496-1 Systems; Delivery Layer: media unaware, delivery aware, ISO/IEC 14496-6 DMIF; the layers are separated by the Elementary Stream Interface (ESI) and the DMIF Application Interface (DAI))
The concept of "Systems" in MPEG-4 not only refers to the overall architecture, multiplexing, and synchronization, but also encompasses scene description, interactivity, content description, and programmability. The following sections give an in-depth description of the three layers in an MPEG-4 terminal.
2.1.1 COMPRESSION LAYER
Within the Compression Layer of the MPEG-4 standard, media elements (e.g., audio, background, actors, etc.) can be encoded as distinct objects.
These objects are organized in a hierarchical manner to compose an MPEG-4 audio-visual scene. In addition to these media elements, MPEG-4 also standardizes different primitives to represent both natural and synthetic content types, which can be either 2- or 3-dimensional [Koen99].
A media object in its coded form contains descriptive elements. These descriptive elements allow for the handling of the object in an audio-visual scene as well as providing the necessary information for streaming the data, if the data has been prepared for remote access. This implies that coded media objects can be represented independent of their surroundings or background. The coded representations of media objects are encapsulated in the form of access units (AUs), the smallest elements that can be attributed individual time stamps (e.g., a frame of video or audio data). In addition to these media objects, a scene description is also required to describe the relationship between the various objects. The MPEG-4 scene description language, also referred to as Binary Information for Scenes (BIFS), provides a spatio-temporal composition of scenes (i.e., in essence it builds the hierarchical structure of the MPEG-4 presentation).
The advantage of encoding objects separately is that compression can be performed on an individual bitstream, based on its characteristics, to enable more efficient compression and to provide specialized functionality. Examples of such functionalities are error robustness, easy extraction, availability in a scalable form, and the ability to use Quality of Service (QoS) metrics on a per object basis. An example of a simple MPEG-4 presentation can be seen in Figure 2-2; the details of the diagram will become clear through the discussions in the following section.
Figure 2-2 MPEG-4 Presentation
2.1.2 SYSTEMS LAYER
The systems layer consists primarily of two units, the sync layer and the FlexMux tool. The functionality of the systems layer and its role within the MPEG-4 terminal will be explained through the following two sections.
2.1.2.1 The Sync Layer
The Sync Layer (SL) coordinates the play-out of the multiple MPEG-4 objects that comprise a presentation [Koen00]. All information in MPEG-4 is conveyed in a streaming manner, from the Compression Layer as AUs to the Delivery Layer as packetized streams. The Sync Layer provides packetization for AUs within elementary streams (ESs). The AU may be larger than the SL packet, in which case it will be fragmented across multiple SL packets.
Some objects, such as a sound track or video sequence, will have a single stream. Other objects may have two or more. For instance, a scalable object would have an ES for base layer information plus one or more enhancement layers, each of which would have its own ES for improved quality. Also, with multiple ESs, a receiving terminal can choose to receive only the streams that are crucial to the presentation and disregard the others (e.g., additional background detail or background music, etc.).
The higher-level description of a scene (i.e., BIFS) is conveyed in a separate ES. The most important advantage of this technique is that it becomes easier to reuse media objects to create different presentations. For example, parts of a scene may be used only under certain conditions (e.g., when it is determined that sufficient bandwidth is available); multiple scene description ESs for different circumstances may be used to describe the same scene.
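The fragmentation of an access unit across several SL packets, as described in the Sync Layer discussion above, can be sketched as follows. This is a simplified illustration only: the dictionary keys `au_start` and `au_end` and the `max_payload` parameter are hypothetical names, and the real SL packet header carries further fields (such as time stamps) that are omitted here.

```python
from typing import Dict, List


def fragment_access_unit(au: bytes, max_payload: int) -> List[Dict]:
    """Split one access unit into SL-style packets of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    packets = []
    for offset in range(0, len(au), max_payload):
        packets.append({
            "au_start": offset == 0,                    # first fragment of the AU
            "au_end": offset + max_payload >= len(au),  # last fragment of the AU
            "payload": au[offset:offset + max_payload],
        })
    return packets
```

The start and end markers let a receiver reassemble the access unit by concatenating payloads until a packet flagged as the end arrives.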
The information regarding which ESs belong to which object is conveyed in object descriptors (ODs). This includes the scene description, audio-visual objects, as well as object descriptor streams themselves. ODs in turn contain elementary stream descriptors (ESDs), which describe individual streams, including the information needed to tell the system what decoders are required to decode a stream, as well as information regarding the format of the data (e.g., visual face animation stream). This information is conveyed in special elementary streams that may be sent at any time to dynamically add or remove media elements from a scene.
In most simple cases, an OD will contain one ESD that identifies the stream it is referencing (e.g., an audio stream). This can be seen in Figure 2-3(a). On the other hand, it is possible for one OD to contain multiple ESDs, for example, one identifying a low bit-rate audio stream and another one identifying a higher bit-rate stream with the same content, as seen in Figure 2-3(b). In this case the user (receiving terminal) would have the choice between audio qualities [Herp00]. It is possible to provide all kinds of different resolution or different bitrate streams representing the same audio or visual content for a single object descriptor in order to offer a choice of quality.
Figure 2-3 Object Descriptor Referencing (a) one audio ES, (b) two audio ESs
An OD may reference multiple ESDs to represent a scalable, or hierarchical, encoding of the data that represent an audio-visual object. In this technique, the streams would be dependent on each other and would build on top of each other, thus the descriptor should also contain information on the interdependencies of these streams. Hierarchical dependencies are only allowed to exist between the set of ESDs that are included in a single object descriptor.
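The relationship between object descriptors and elementary stream descriptors can be sketched with two small data classes. All class, field and method names below (`ESDescriptor`, `ObjectDescriptor`, `depends_on`, and so on) are hypothetical and chosen for illustration; they are not the syntax defined by the standard. The sketch shows a terminal choosing among alternative streams of one OD and checks the rule that dependencies may only point to streams within the same OD.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ESDescriptor:
    es_id: int
    bitrate_kbps: int
    depends_on: Optional[int] = None  # es_id of the stream this one enhances


@dataclass
class ObjectDescriptor:
    od_id: int
    streams: List[ESDescriptor] = field(default_factory=list)

    def dependencies_are_local(self) -> bool:
        """True if every dependency refers to an ESD inside this same OD."""
        ids = {esd.es_id for esd in self.streams}
        return all(e.depends_on is None or e.depends_on in ids
                   for e in self.streams)

    def best_independent_stream(self, available_kbps: int) -> Optional[ESDescriptor]:
        """Pick the highest bit-rate stand-alone stream that fits the link."""
        fitting = [e for e in self.streams
                   if e.depends_on is None and e.bitrate_kbps <= available_kbps]
        return max(fitting, key=lambda e: e.bitrate_kbps, default=None)
```

An OD whose ESD depended on a stream listed in a different OD would fail the locality check, which is the restriction stated above.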
Thus, if there are multiple scenes, or different scene description nodes, that use the same streams in different contexts, say (a) as a single-quality stream and (b) as the base layer of an audio-visual object encoded in a scalable fashion, then there have to be two different ODs. This type of scenario can be seen in Figure 2-4.
Figure 2-4 Referencing the same ES in Different Contexts
Another crucial element in an MPEG-4 presentation is the Initial Object Descriptor (IOD). The IOD is required to identify the elementary streams that contain the scene description and the associated object descriptors. Only in the presence of the IOD is the receiving terminal able to receive an MPEG-4 presentation. Figure 2-2 shows the interdependency between the IOD and the other streams.
2.1.2.2 The FlexMux Tool
A simple multiplexing tool called FlexMux has been defined within the MPEG-4 standard for adapting MPEG-4 streams for various transmission environments. The FlexMux tool is a multiplexer with simple packet syntax, created for low delay and low bitrate streams. This tool is useful in cases where the management cost, in terms of delay or load, of setting up and using transport channels for each individual elementary stream is too high. In cases where an MPEG-4 presentation contains dozens of audio-visual objects with a similar number of corresponding elementary streams, the load of maintaining separate transport channels for each of the ESs would be too high.
Figure 2-5 contains the two variations of the FlexMux packet structure. In simple mode the header consists of an 'index' that corresponds to the FlexMux channel (FMC), or stream number, and the packet length in bytes. MuxCode Mode can be used to multiplex multiple streams into a single FlexMux packet. The scheme used for multiplexing the various streams is based on the 16 templates available to the sender terminal.
In MuxCode mode there is a slight initial overhead while the sender transmits the template to the receiver during transport channel setup, but this process reduces the overall multiplex overhead.

Figure 2-5 FlexMux Packet Definition (Simple Mode: index, length, SL-PDU payload; MuxCode Mode: index, length, version, followed by a header and payload for each of FMC a, FMC b, ..., FMC n)

Typically, if there exists a better than best effort network, media streams with similar QoS requirements would be multiplexed into a single channel.

2.1.3 DELIVERY LAYER

The Delivery Layer provides a means of retrieving MPEG-4 elementary streams. This layer provides an abstraction between the core MPEG-4 systems components and the retrieval method [Asra01]. This separation between system and delivery means that the MPEG-4 Systems [ISO 14496-1] specification does not enter into the details of the various delivery technologies as MPEG-1 and MPEG-2 do. The separation is achieved through the Delivery Multimedia Integration Framework (DMIF) [ISO 14496-6]. This framework addresses the issues of local file access, broadcast media access, and peer-to-peer media access, as seen in Figure 2-6. This is done through a common interface that hides the operational scenario from the application. As a result, the obvious differences between operational scenarios have no impact on the interface or on the way the application manages the streamed content, but they do impact the authoring process. For example, an MPEG-4 presentation could contain elements from an IP multicast session as well as a local pre-downloaded sequence. Both these elements could be integrated into a single, harmonized presentation since the application is unaware that they come from diverse scenarios.
Moreover, the same application could perform quite differently in a QoS enabled Intranet versus a best effort Internet. DMIF also enables the synchronization and simultaneous presentation of MPEG-4 content carried through different delivery technologies.

The MPEG-4 standard does not enforce any delivery specifications for media transport across the Internet. There are a number of possible delivery platforms that could be tied into this technology for transport across the Internet. Some of these platforms are based on DMIF, while others do not acknowledge this framework and ignore this part of the standard [Civa01, Kiku01]. Although DMIF was not specified solely to support MPEG-4 Systems, it has been through the effort of the MPEG-4 committee that this flexible delivery framework has been standardized and has become an integrated part of MPEG-4.

Figure 2-6 DMIF Addressed Technologies

DMIF has numerous advantages in that it abstracts the media from the delivery technology and enables easy utilization of various media access techniques. Other advantages of DMIF include the QoS provisions and the client/server capability exchange provisions [Fran00].

The Delivery Multimedia Integration Framework (DMIF) has been developed for multimedia standards through the efforts of the MPEG-4 committee to provide seamless retrieval of media without knowledge of its location. In the MPEG-4 standard the unifying factor is the common synchronization layer defined in the model.

2.2 THE DMIF COMMUNICATION MODEL

The DMIF standard consists of some key elements that characterize its benefits. These elements are the reference architecture, the DMIF Application Interface (DAI), and the DMIF Signaling Protocol (DDSP), all presented in the following subsections.
2.2.1 DESIGN ARCHITECTURE

The design architecture for DMIF models deals with different operational scenarios in the same manner. This can be seen in Figure 2-7. The four basic blocks in this modeling scheme are: originating application, originating DMIF, target DMIF, and target application. Although they are modeled similarly, the implementations of the different blocks are not always substantial. For example, in the local retrieval and broadcast scenarios, the target DMIF and target application contain very little functionality and reside, typically, in the same process as the originating application. The originating application is the actual application in the terminal, e.g., the MPEG-4 presentation viewer or multimedia conferencing application. Through the signaling between the DMIF peers, an originating application retrieves data from the target application. Only in the remote scenario will the target DMIF and the target application reside on a separate host from the originating application and DMIF. In this case, the DMIF peers use DDSP for communications.

The data that is carried by DMIF is considered opaque, and only understood by the end applications. This is true regardless of the operational scenario. Thus, DMIF is not limited to MPEG-4-based applications, since the data that it carries is inconsequential to its operation.

Another significant element in the DMIF architecture is the DMIF filter. This module represents a container for the DMIF operational scenarios (instances) that are available in the terminal. When the originating application makes a request for specific media, it is the DMIF filter that determines how to retrieve this data by selecting the appropriate DMIF instance. This decision is based on the DMIF URL requested by the application.
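The instance selection performed by the DMIF filter can be pictured with a small C++ sketch. It assumes that the filter keys its decision on the scheme portion of the requested URL; the scheme strings, the Scenario enumeration, and the DmifFilter class shown here are hypothetical and are not the IM1 implementation.

```cpp
#include <map>
#include <string>

// The operational scenarios a terminal may support.
enum class Scenario { LocalFile, Broadcast, Remote, Unknown };

// Extract the scheme ("xyz" from "xyz://...") of a URL; empty if absent.
std::string urlScheme(const std::string& url) {
    const std::string::size_type pos = url.find("://");
    return pos == std::string::npos ? std::string() : url.substr(0, pos);
}

// Hypothetical filter: maps URL schemes to the registered DMIF instances.
class DmifFilter {
public:
    void registerInstance(const std::string& scheme, Scenario s) {
        instances_[scheme] = s;
    }

    // Select the instance that should serve this URL, or Unknown if no
    // registered instance claims the scheme.
    Scenario select(const std::string& url) const {
        const auto it = instances_.find(urlScheme(url));
        return it == instances_.end() ? Scenario::Unknown : it->second;
    }

private:
    std::map<std::string, Scenario> instances_;
};
```

Because the application only ever hands a URL to the filter, adding a new delivery technology amounts to registering one more instance; the application code is unchanged.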
Figure 2-7 DMIF Communication Model (originating DMIF instances for broadcast, local files, and remote services, each paired with a target DMIF and target application; flows between independent systems are normative, while flows internal to a single system are either informative or out of DMIF scope)

An element that is only present in the remote instance is the signaling module (Sig Map). Through the use of the DMIF Network Interface (DNI), two DMIF peers would use the Sig Map to convey the multimedia presentation. The DNI represents the border between the generic and specific tasks of a DMIF instance for the remote interaction scenario. The specific tasks are based on the underlying transport technology being used. In essence, the remote instance could contain numerous specialized networking protocols and select among them based on the application-requested DMIF URL. This is a further level below the instance selection.

2.2.2 THE DMIF APPLICATION INTERFACE

In the MPEG-4 context, the DAI is the demarcation line between the DMIF and Systems layers, but it can also perform the same role in other applicable contexts. In DMIF there are separate walkthroughs for each operational scenario, all showing the same behaviour at the DAI, as seen in Figure 2-7.

During remote media access, two communication planes are required: a Data Plane for the transport of media data (e.g., video stream) and application control data, and a Control Plane used for media session management. The terms Data Plane and User Plane will be used interchangeably throughout the remainder of this thesis. The DMIF specification adopts out-of-band signaling, and therefore the Control and Data Planes can use different transport protocols.
To ensure the reliability of Control Plane messaging in error-prone environments, a reliable transport scheme should be employed. These planes are accessed by the originating application through a set of primitives that comprise the DAI.

The functionality that the DAI provides is apparent through a group of primitives that have been defined, as seen below. Due to the abstraction of the information passed across the DAI, some of the parameters conveyed within these primitives appear as opaque data. The three sets of primitives that comprise the DAI are:

• Service primitives, which deal with the Control Plane, and allow the management of service sessions
• Channel primitives, which deal with the Control Plane, and allow the management of channels
• Data primitives, which deal with the Data Plane, and serve the purpose of transferring data, whether media data or application control data, through channels between the target and the originating application

The MPEG committee deliberately did not specify a detailed definition of the DMIF syntax, since the syntax has no impact on the concept and model. Providing only semantics for the aforementioned primitives was thus a conscious decision to encourage diverse adoptions of the standard.

As mentioned previously, a likely protocol for transmitting Control Plane messaging within an IP based network is the transmission control protocol (TCP), due to its reliable delivery. However, there are numerous alternatives for delivering Data Plane information. The Data Plane in an MPEG-4 terminal [ISO 14496-6] is illustrated in Figure 2-8.
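Returning to the three DAI primitive groups listed above, one possible C++ rendering is sketched here. Since the standard fixes only the semantics, every signature below is an assumption made for illustration, as are the names Dai, InMemoryDai, and dataSend; the toy in-memory implementation exists only so that the interface can be exercised without a network.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical interface grouping the three primitive sets.
class Dai {
public:
    virtual ~Dai() = default;
    // Service primitives (Control Plane).
    virtual int serviceAttach(const std::string& url) = 0;  // returns service id
    virtual bool serviceDetach(int serviceId) = 0;
    // Channel primitives (Control Plane).
    virtual int channelAdd(int serviceId) = 0;  // returns channel id, -1 on error
    virtual bool channelDelete(int channelId) = 0;
    // Data primitive (Data Plane); the payload is opaque to the interface.
    virtual bool dataSend(int channelId, const Bytes& data) = 0;
};

// Toy implementation that only tracks identifiers and counts bytes.
class InMemoryDai : public Dai {
public:
    int serviceAttach(const std::string& url) override {
        services_[nextService_] = url;
        return nextService_++;
    }
    bool serviceDetach(int serviceId) override {
        // Channels cannot outlive the service that owns them.
        for (auto it = channels_.begin(); it != channels_.end();) {
            if (it->second == serviceId) it = channels_.erase(it);
            else ++it;
        }
        return services_.erase(serviceId) == 1;
    }
    int channelAdd(int serviceId) override {
        if (services_.count(serviceId) == 0) return -1;
        channels_[nextChannel_] = serviceId;
        return nextChannel_++;
    }
    bool channelDelete(int channelId) override {
        return channels_.erase(channelId) == 1;
    }
    bool dataSend(int channelId, const Bytes& data) override {
        if (channels_.count(channelId) == 0) return false;
        bytesSent_ += data.size();
        return true;
    }
    std::size_t bytesSent() const { return bytesSent_; }

private:
    std::map<int, std::string> services_;  // service id -> URL
    std::map<int, int> channels_;          // channel id -> owning service id
    int nextService_ = 1;
    int nextChannel_ = 1;
    std::size_t bytesSent_ = 0;
};
```

An application written against the abstract class would behave identically whether the object behind it served local files, a broadcast source, or a remote server, which is the property the DAI is meant to guarantee.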
Figure 2-8 The Data Plane in an MPEG-4 Terminal (elementary streams enter the Sync Layer, cross the DMIF-Application Interface as SL-packetized streams, optionally pass through the FlexMux tool, and leave the Delivery Layer as TransMux streams over stacks such as TCP/IP, UDP/IP, MPEG-2 TS (PES), AAL5/ATM, H.223/GSTN, and DAB mux)

In the MPEG-4 terminal Data (User) Plane, ESs cross the DAI in individual channels and are possibly multiplexed in the Delivery Layer, generating FlexMux streams. These streams are then carried into TransMux channels that are in turn multiplexed based on the characteristics of the protocol stack for delivery. The Delivery Layer is responsible for the configuration of the transport protocol stacks. It is also in charge of keeping track of the associations of channels and transport resources. It is evident from Figure 2-8 that the elementary streams can be transported either individually or as a group of multiplexed streams over a transport protocol. This is the distinction between using the simple mode of the FlexMux tool and using its MuxCode mode.

2.2.3 THE DMIF SIGNALING PROTOCOL

The DDSP is a generic session level protocol designed to fulfill the requirements for multimedia data streaming. This signaling applies only to the remote instance scenario. The development of this protocol has taken into account possible future evolutions of networking technologies, and it supports features that are not readily available with current techniques. Such features include QoS provisions, resource management, and, in later versions, support for heterogeneous networks.

Being a session level protocol, DDSP is analogous to FTP. In both cases the first step consists of opening a session with a server entity.
Once the session has been established, a number of streams may be selected and requested, as is the case with FTP. However, with DDSP, instead of retrieving files, a channel is created for each requested stream, or possibly for multiple streams multiplexed by means of the FlexMux tool. DDSP does not download the stream; it merely sets up the channels and configures the protocol stack. It is then up to the application (i.e., the Data Plane) to control the streaming. DDSP supports the exchange of generic QoS information, and can therefore be exploited over QoS enabled networks.

A potential competitor to DDSP in the Internet environment is the Real Time Streaming Protocol (RTSP) defined by the Internet Engineering Task Force (IETF). RTSP is not parallel to DMIF; however, it can be used instead of DMIF or in conjunction with it [Alnu01]. Using RTSP in conjunction with DMIF provides functionalities that are not explicitly defined in the MPEG-4 standard. For example, the application specific information carried through the DMIF "User Command" can be mapped to various RTSP methods.

There are two major reasons why the integration of RTSP in the DMIF framework is difficult. Firstly, one of the characteristics of MPEG-4 media is that it could potentially be composed of a large number of streams. DMIF has been specifically designed to handle these situations, whereas other streaming control protocols, including RTSP, would have to be adapted and greatly extended for such scenarios. Secondly, RTSP mixes the roles of session and connection set-up with the role of stream control, whereas in MPEG-4 Systems and especially DMIF these roles have been kept separate.

CHAPTER 3. UNICAST SYSTEM

The scenario of the unicast server is the reality that all servers have to cope with due to the current nature of the Internet.
Thus, it is important to evaluate such an implementation, identify its shortcomings, and exploit its advantages. This chapter gives a brief outline of the advantages that unicast streaming provides and some of its shortcomings. The following sections discuss the design and implementation of a DMIF based remote retrieval server and the client for the Implementation 1 (IM1-2D) software. IM1-2D is the reference software implemented by the MPEG committee to provide a proof of concept for systems and audio/video coding of the MPEG-4 standard. Finally, the last section evaluates the performance of this server through simulation and testing.

3.1 UNICAST STREAMING

The term unicast is composed of two roots, from Latin "unus" meaning one, and from Old Norse "kasta" meaning throwing forcefully. In the field of media communications, it is used to mean one-to-one transmission of media from the server to the client. There are various advantages and disadvantages associated with unicast streaming.

The most prominent advantage of unicast streaming is the ability to provide video-on-demand to clients. A client does not have to wait for other users or a preset time to receive the media but can receive it upon request. Another prominent advantage is the customization and interactivity that is inherent to one-to-one data. The user can pause, play, or fast forward at any time without consequence to other users. With the best effort Internet that is available today, each user can request certain quality constraints or reserve desired bandwidth. This type of media delivery is highly customizable to the client requests and capabilities.

On the other hand, there are some disadvantages that are associated with unicast streaming.
Since servers have to service each client individually, there is an upper bound on the number of clients that can be serviced with adequate quality. This is not only due to the server load but also the bandwidth of the link going from the server to the edge router. This is the most severe drawback of unicast traffic. Another major disadvantage of a unicast server is that there may be multiple duplicate packets on the network going to different destinations (clients) and this wastes a great deal of bandwidth.

3.2 REMOTE INSTANCE SYSTEM ARCHITECTURE

The remote instance implementation signifies a unicast streaming system based on the DMIF standard, not restricted to MPEG-4, but designed and implemented for proof of concept of its functionality. The architecture used for realizing DMIF corresponds to the recommendations made in part 6 of the MPEG-4 standard as is depicted in Figure 2-7 [ISO 14496-6]. An overview of the major components of the client/server system and the messaging that takes place between distributed peers is also seen in the figure.

The remote instance system architecture is comprised of a client module and a server application. These will be discussed in greater detail in the following sections. Due to the adherence to DMIF recommendations, the system architecture includes a Data Plane and a Control Plane for out-of-band signaling. The separation between the Control and Data planes is not simply logical, but the implementation consists of separate processes for both planes on the client and server. The system architecture is depicted in Figure 3-1.

Figure 3-1 System Architecture (MPEG-4 Client and MPEG-4 Server)

The system differs from traditional video on demand (VOD) systems in the characteristics of the presentations delivered. Video on demand has mainly been about delivering frame based (e.g., MPEG-2) audio and video [Chan97].
In the case of object-based presentations, the media data and the media composition data are transmitted to a client as separate streams in the same session.

In order to deliver usable media to the client, the server contains various components. The MPEG-4 server consists of an elementary stream provider, a packetizer, and a rate and flow-control unit. The server delivers the SL packets that have been produced through the aforementioned components to DMIF for delivery to the client.

The client (player) is the application that initiates the media request to the server and displays the media data that is sent from the server. The components that are required at the player are a remote retrieval DMIF instance, multiple elementary stream decoders, and a compositor. The DMIF instance Data Plane receives the media information from multiple elementary streams and forwards it to the appropriate decoders. A player typically contains several decoders, each handling a specific elementary stream. The decoders that the IM1 core can presently support are: AAC, G.723, H.263, and JPEG. The decoded objects are then passed to the compositor for display within a scene.

The Control Plane contains provisions for session initiation, session management, and session termination. The Data (User) Plane, on the other hand, is the transport medium through which the media data and application data are transferred between the client and server. As seen in Figure 3-1, requests for control services traverse the following modules from the client application to arrive at the server application layer for processing: DMIF layer, underlying transport layer (e.g., TCP), server listening thread, and the server DMIF layer.

Through the use of the Control Plane, the client application can request media transmission from the server.
This request will be made through the Control Plane of the DMIF layer and sent to the server using a transport channel (e.g., TCP, UDP). During session initiation, the initial object descriptor (IOD) will be sent to the client. Two channels are then created to carry the scene descriptor (SD), which describes the relationship between various objects in the presentation, and the OD, which identifies the elementary streams for the media objects. Thus, each presentation contains at least two elementary streams: an SD stream and an OD stream. Based on the OD and the SD, the client determines the number of Data Plane channels (Transmux channels) required to receive the media streams and creates these channels through Control Plane messaging. This is in essence the Data Plane through which the media will arrive at the client.

Once the session initiation messaging is completed, the client application will request to 'Play' the presentation. When the 'Play' request arrives at the server, the server will request the media from the elementary stream provider and send this data to the client through the Data Plane. The flow of media data through the Data Plane from the server to the client can be seen in Figure 3-1 and is as follows: elementary stream provider, packetizer, underlying transmission layer (e.g., RTP, UDP), client input Transmux channels, depacketizer (Sync Layer), and decoders at the client application for playback.

The Control Plane signaling between an originating application and a target application occurs irrespective of the transport layer technology due to the abstraction provided by DMIF.
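The session initiation sequence described above (attach and receive the IOD, open channels for at least the SD and OD streams, then request 'Play') can be sketched as a small state machine. The following C++ fragment is a simplified illustration under the assumption that the three steps must occur in that order; the ClientSession class and its method names are hypothetical and do not correspond to functions in the IM1 software.

```cpp
// States a client session passes through before media can flow.
enum class State { Idle, Attached, ChannelsReady, Playing };

// Hypothetical client-side view of the Control Plane sequence.
class ClientSession {
public:
    // Session initiation: on success the server has returned the IOD.
    bool attach() {
        if (state_ != State::Idle) return false;
        state_ = State::Attached;
        return true;
    }

    // Open one Data Plane channel. The presentation is only usable once
    // the two mandatory streams (SD and OD) have channels.
    bool addChannel() {
        if (state_ != State::Attached && state_ != State::ChannelsReady) {
            return false;
        }
        ++channels_;
        if (channels_ >= 2) state_ = State::ChannelsReady;
        return true;
    }

    // 'Play' is rejected until session initiation has completed.
    bool play() {
        if (state_ != State::ChannelsReady) return false;
        state_ = State::Playing;
        return true;
    }

    State state() const { return state_; }
    int channels() const { return channels_; }

private:
    State state_ = State::Idle;
    int channels_ = 0;
};
```

In a real client each transition would be driven by the confirmation of the corresponding Control Plane message rather than by a direct method call.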
The signaling interface on the client side was implemented through a Remote Instance DLL that plugs into the IM1 software, whereas the server application was designed and developed as a complete application using a modular interface similar to that of the MPEG-4 standard. The gray coloured blocks in Figure 3-2 represent the modules and processes that were implemented. The process on the client represents the DMIF Instance for Remote Retrieval as described in 2.2.1. The DMIF filter routes client requests for remote services to the DMIF remote instance. This instance stands in for the DMIF filter, providing the DAI functions through the DMIF Plug-in Interface (DPI). To the application this will be transparent and the user will not notice a change in interface, but the requests will now be serviced through the particular DMIF Instance. The originating and target applications communicate through the DNI.

Figure 3-2 DMIF Control Plane Communication (originating application: IM1-2D player with Compression Layer, Sync Layer, DMIF Filter, and Remote Instance DLL; target application: server application with DAI callback functions, DN callback functions, and DN daemon)

The system uses a DMIF instance for IP networks and has been tested over Ethernet. The media streams are transported using user datagram protocol (UDP) while signaling can use either transmission control protocol (TCP) or UDP. This code has already been contributed to the MPEG committee and is in the process of being evaluated to become the reference software for MPEG-4 remote retrieval.

3.3 MPEG-4 CLIENT (REMOTE RETRIEVAL INSTANCE)

This section contains the design and implementation details of the remote retrieval instance at the client end.
The details that are relevant to both the client and server implementations are also discussed in this section.

The client implementation is based on the architecture depicted in Figure 3-3, which represents the instance involved in handling remote media access. The resulting software module, which supports remote access of MPEG-4 content, implements the recommended DAI. This module interacts with the application through the DMIF client filter. The DMIF standard only describes the semantics of the DAI; defining the syntax is left to the system developers. In the implementation, the DMIF instance for remote retrieval (also referred to as the Remote Instance) is a DLL, however it could be linked in any other practical form such as a static library.

The Remote Instance Control Plane implementation is composed of two main layers as shown in Figure 3-3. The upper layer, DMIF Service layer (DS), interacts with the DMIF filter and provides the services requested by the application. The lower layer, DMIF Network Access layer (DNA), handles the network control messaging between peers and implements DDSP. The DS layer accesses the DNA layer through the DNI [Pour01].

Media data is transported across native network transport channels that are referred to as Transmux channels. Creating Transmux channels and managing network sessions between DMIF peers is done using the functionality provided by Network Session objects in the DNA layer. The implemented DNA layer presents its functionality through the DNI primitives regardless of the protocol used for the transportation of the DDSP messages.
Figure 3-3 DMIF Remote Retrieval Instance Architecture (DMIF Filter; DPI (DAI); DMIF Service Layer Manager with DMIF Service Objects; DNI; DMIF Network Access Layer Manager with Network Session Objects)

The main functionality expected from the DMIF Service layer is to create and manage DMIF services and hide the technology used to transport the control messages and elementary streams. This is done using DMIF Service objects created in this layer. The separation of the DS and DNA layers facilitates employing a variety of transport technologies.

In the implementation of the Control Plane, messages are transported by either TCP or UDP. Since DDSP does not provide error recovery facilities, lost UDP datagrams can halt the system. In order to prevent this problem while UDP is used, a simple error recovery scheme can be applied to the DNA layer. However, UDP is adequate for testing in local intranets, where Ethernet is used as the underlying transport technology, since there is no connection set-up overhead and little probability of loss.

When using TCP as the transport protocol for the DDSP messages, no error recovery is required. However, TCP requires a connection set-up phase prior to sending the first DDSP message. The reliability that is inherently provided by TCP outweighs the undesirable initial delay. For each network session, a TCP connection is established at session-setup time. Further DMIF messages are transported through this dedicated connection; therefore only the session-setup message suffers from the TCP initial delay.

The structure used in the implementation, according to the standard, implies that the Data Plane can use any viable delivery technology regardless of the delivery scheme used by the Control Plane. In the current implementation, UDP is used for the delivery of MPEG-4 content.
TCP is not a good candidate for the transport of time-critical data due to its preference for reliable transmission over timely delivery. The Real-time Transport Protocol (RTP) would provide additional benefits and is being considered for future development.

The Data and Control Planes are defined as separate processes (i.e., threads) in this implementation. Data Plane channels are created in the Transmux channels that have been established between the client and the server. For each Transmux channel a client listening thread is dispatched to receive the packetized elementary stream. At the server, several data channels are multiplexed into one or more Transmux channels; therefore the packets containing multiplexed elementary streams need to be demultiplexed in the client DMIF instance before being delivered to the Sync Layer. Following the presentation, the client will close the data and Transmux channels before terminating its session with the server.

These high level descriptions of the operations of the Remote Instance are elaborated in the following sections, which provide a higher degree of detail on design and implementation issues.

3.3.1 IM1 REMOTE INSTANCE DESIGN ARCHITECTURE

In the IM1 software, DMIF is implemented as a main block called DMIFClientFilter and a number of objects that abstract the lower layer delivery technology. This architecture is depicted in Figure 3-4. The similarity between the IM1 implementation and the DMIF architecture as described in [ISO 14496-6] is evident. The file structure of the Remote Instance DLL can be seen in Appendix I.
Figure 3-4 IM1 DMIF Structure Implementation (DMIF Client Filter holding a list of service records, each pointing to a DPI object (service) in a File Retrieval Instance (DLL), a Broadcast Instance (DLL), or a remote instance reached through the DNI)

Each object is implemented as a DLL and contains a class of DMIF services. These objects provide the DMIF client filter with a uniform interface. This single DAI interface hides the delivery technology details from the application. The application calls targeted to the DAI are addressed through the DPI once the appropriate instance has been selected and an appropriate object created.

In IM1, each DAI primitive is implemented as two pairs of functions: req/cnf and ind/rsp. The latter pair is used to realize the call-back functions that are invoked by the remote peer, while the req/cnf functions address a DMIF request that is invoked by the local user. An example of the duality can be seen below:

    DAI_ServiceAttach: local DMIF user requests
        DAI_ServiceAttach_req (DMIFClientFilter.cpp)
        DAI_ServiceAttach_cnf (Executiv.cpp)
    DAI_ServiceAttach_callback: remote application requests
        DAI_ServiceAttach_ind (ServerApp.cpp)
        DAI_ServiceAttach_rsp (Server_DA_Package.cpp)

Even though all four types of DAI functions belong to the DMIF client filter, they are implemented in different parts of the IM1 software. DAI_cnf and DAI_ind functions are implemented in a part of the IM1 software that is named "core." DAI_req and DAI_rsp functions are implemented in the DMIF client filter. It must be noted that callback functions are only implemented in the DMIF module at the server and are not required for the local or broadcast instance. In the IM1 DMIF client filter, ind/rsp functions, though present, are not used.
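The req/cnf and ind/rsp pairing can be illustrated with a compact C++ sketch in which the network between the peers is replaced by a direct function call. All type and function names here (AttachResult, TargetPeer, OriginatingPeer, serviceAttachReq) are hypothetical and only mirror the roles of the four function types; they are not the IM1 signatures.

```cpp
#include <functional>
#include <string>
#include <utility>

// Outcome carried back by a response (rsp) and delivered by a confirm (cnf).
struct AttachResult {
    bool ok;
    int serviceId;
};

// Target side: the ind handler runs when a request arrives from the remote
// peer; its return value plays the role of rsp.
class TargetPeer {
public:
    using IndHandler = std::function<AttachResult(const std::string&)>;
    explicit TargetPeer(IndHandler h) : onAttachInd_(std::move(h)) {}
    AttachResult deliver(const std::string& url) const {
        return onAttachInd_(url);
    }

private:
    IndHandler onAttachInd_;
};

// Originating side: req is called by the local user; cnf reports the outcome.
class OriginatingPeer {
public:
    using CnfHandler = std::function<void(const AttachResult&)>;
    OriginatingPeer(const TargetPeer& target, CnfHandler cnf)
        : target_(target), onAttachCnf_(std::move(cnf)) {}

    void serviceAttachReq(const std::string& url) {
        // A real system would send a signaling message here and invoke the
        // confirm only when the peer's response arrives.
        onAttachCnf_(target_.deliver(url));
    }

private:
    const TargetPeer& target_;  // must outlive this object
    CnfHandler onAttachCnf_;
};
```

The sketch also suggests why a pure client has no use for ind/rsp: it never receives requests, so only the req/cnf half of each primitive is exercised.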
A user request, which is made by calling a DAI_req function, is redirected to the appropriate service provider object in a DPI instance (DLL). This object is created in the first call to the DPI instance. These objects are maintained in the DMIF client filter in the form of a linked list. An object is created every time the user makes a service attach request to the DMIF instance. This object contains some private members that will be required for future communication with the server.

There is a one-to-one relation between DPI and DAI primitives. DPI_req and DPI_rsp functions are realized in the DPI instance, while DPI_ind and DPI_cnf functions are part of the DMIF client filter. Similar to the DAI functions, ind/rsp functions are not required at the client side.

Since each DPI function that is called by the DMIF client filter belongs to a specific service and a separate DPI object is created for that service, some of the parameters that are passed to the DAI are not explicitly passed to the DPI. Instead, these parameters are retrieved using the service reference (object pointer). Using this service reference, a DPI object creates a record for itself. This record is kept for later use by DPI functions. DPI functions may extract additional parameters from the existing record when the DAI does not pass all the necessary fields. Each DPI instance (DLL) contains a list of the parameter records that are in use. This linked list is public to all DPI objects.

The network specific functionality is provided by the DNI package. In each DPI instance, there is only one DNI package that serves all the active services (i.e., DPI objects).

3.3.2 DPI IMPLEMENTATION: REMOTE INSTANCE

The relationship between the Remote Instance DMIF service and the DMIF Networking layer is not one-to-one.
Unlike the local access scenario, calls made to this service have to be sent to the DN layer with sufficient information to make communications possible with the server. Within a DMIF service (i.e. Remote Instance object), there may be numerous remote requests, referred to as services. These services are kept in a ServiceRecordClassList list structure. Each service object keeps track of its local variables and through this list, requests made to this service by the DMIF client filter can be dealt with correctly. The class structure of the list is shown below.

Within a service, a network session is the logical association between a client and a server as seen in Figure 3-5. This will be discussed in greater detail in the following section on DNI Implementation. Each service, depending on the number of locations that the presentation has to be retrieved from, will have the appropriate number of Transmux channels. These Transmux channels are identified using IDs referred to as Transmux Association Tags (TATs). The TAT is a unique, network-session-wide identifier assigned by the originating DMIF entity. Each TAT can contain multiple logical channels between the client and server. These channels, identified using Channel Association Tags (CATs), are the channel identifiers for each elementary stream.

    class ServiceRecordClass {
    public:
    private:
        int32 servicePtr;
        long double networkSessionId;
        LPSTR absURL;
        CAT_TYPE* pCAT;
        TAT_TYPE* pTAT;
        int16 serviceId;
    };

As was alluded to in the previous section, when an initial request is received from the application layer for a ServiceAttach, the Remote Instance object will save its session information in the ServiceRecordClassList.
This session information is required for the lower DN layer and is only provided during session initiation; thus the DPI will retain the information and relay it to the DN layer for subsequent requests for the existing service. On subsequent requests the DPI service object will retrieve the relevant network session information for the requested action (e.g. networkSessionId, serviceId) and call the DN layer counterpart. When the service is no longer required, the call to DPI_ServiceDetach_req will delete the service from the ServiceRecordClassList.

[Figure 3-5: Service Record Object Structure. Recoverable labels: Originating DMIF; Target DMIF; Network Session (networkSessionId); Service 1; TAT 1; CAT 1; CAT 2; Service 3; TAT 2.]

The DPI is responsible for session initiation and data channel allocation for the remote retrieval instance. When a ServiceAttach request is received from the DMIF client filter, it is passed on to the DN client if a session has already been set up. Otherwise, the DPI has to perform a 'session setup' as well as a 'service attach' using the DN. The complementary operations also hold: when the client needs to detach from a service, the DPI determines whether there are other services that still require the network session before releasing the session. Another scenario in which the DPI issues multiple commands to the DN for a single command from the DMIF client filter is during ChannelAdd and ChannelDelete. If there are no Transmux channels available between client and server, the DPI will create a Transmux channel to allow for media data transmission. The complement is also true. The DPI calls on the client are blocking functions: until the response to the operation returns from the server, the DAI will not give control back to the originating application from the DAI_req function.
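The attach and detach behaviour described above (set up the network session on the first attach, release it on the last detach) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: DnStub, RemoteInstanceSketch and their member names are hypothetical, the DN layer is replaced by a stub that only logs calls, and one network session per server name is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the DN layer: it only records the calls it receives.
struct DnStub {
    std::vector<std::string> log;
    void sessionSetup()   { log.push_back("SessionSetup"); }
    void serviceAttach()  { log.push_back("ServiceAttach"); }
    void serviceDetach()  { log.push_back("ServiceDetach"); }
    void sessionRelease() { log.push_back("SessionRelease"); }
};

// Hypothetical remote-instance object. Assumption: one network session per
// server, shared by every service attached to that server.
class RemoteInstanceSketch {
public:
    explicit RemoteInstanceSketch(DnStub& dn) : dn_(dn) {}

    // Attaches a service; the session is set up only if none exists yet.
    std::int16_t attach(const std::string& server) {
        int& count = servicesPerServer_[server];  // 0 for a new server
        if (count == 0) dn_.sessionSetup();
        dn_.serviceAttach();
        ++count;
        std::int16_t id = nextId_++;
        serverOfService_[id] = server;
        return id;
    }

    // Detaches a service; the session is released when no service needs it.
    void detach(std::int16_t serviceId) {
        auto it = serverOfService_.find(serviceId);
        if (it == serverOfService_.end()) return;  // unknown service
        dn_.serviceDetach();
        auto counter = servicesPerServer_.find(it->second);
        assert(counter != servicesPerServer_.end() && counter->second > 0);
        if (--counter->second == 0) {
            dn_.sessionRelease();
            servicesPerServer_.erase(counter);
        }
        serverOfService_.erase(it);
    }

private:
    DnStub& dn_;
    std::map<std::string, int> servicesPerServer_;
    std::map<std::int16_t, std::string> serverOfService_;
    std::int16_t nextId_ = 1;
};
```

In this sketch, attaching two services to the same server produces a single SessionSetup call, and SessionRelease is issued only after the second detach.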
In addition to the request functions in the Remote Instance object, the DNI_DataReceive_Ind function is also available. This client function is called from the DN package to indicate that there is user media data from the server available to be passed up to the DMIF client filter. The DNI_DataReceive_Ind will then depacketize the media data and pass it to the upper layer. The function does not return to the DN package.

3.3.3 DNI FRAMEWORK

The architecture used in realizing the DNI functions and implementing the DDSP is depicted in Figure 3-3. This architecture adheres to the DMIF standard. The DMIF filter routes client requests for remote services to the DMIF remote instance. This instance replaces the DMIF filter, providing the DAI functions through the DMIF Plug-in Interface (DPI). This instance realizes the DNI functions as well. The following two sections provide an overview of the design and operation of the DNI functionality and a more in-depth view of the implementation details.

3.3.3.1 DNI Client Side Design: DN Package

The DMIF Network protocol stack (DN), whose functionality is realized through DNI primitives, has been implemented as an integrated unit called the DN package. This integration facilitates the manageability of network resources and eases their control. The DN package consists of the functions that comprise the DDSP client side functionality. Similar to the DPI instance, a database object (called NetSessionList) stores the binding information between physical entities and DMIF defined parameters. The networkSessionId is the most important parameter in the DN client. This parameter uniquely identifies each network session established between two DMIF peers. Each individual network session employs a unique transport protocol (e.g. UDP or TCP) port. This port is exclusively bound to a socket.
Therefore, records in the database object must contain these three parameters. There is further detail on this issue in the following section. Since each network session may contain several server connections from different sources, the information related to those User Planes is kept in separate records inside a network session record. There may also be different services from the same server; these use the same network session, since the creation of a new session would be redundant. Figure 3-6 shows how a network session may be used to control more than one Data Plane.

[Figure 3-6: A Network Session shared by multiple Service Sessions. Recoverable labels: Originating peer; Target peer; Control plane connectivity; User plane connectivity.]

Each client DN function, called by the DAI (through DPI) functions, will create a message that must be sent to the target DMIF application. The message format has been defined in [ISO 14496-6]. The message fields are mostly predefined or sent by the DAI layer, but the network related fields must be dynamically assigned in the DN package. One such field is transactionId. This behaves similarly to a sequence number, except that transactionIds are only meaningful within a service session. Most DN messages contain a parameter that can carry one or more items of user application data or data associated with the DPI. However, since the DMIF layer is application and media data unaware, the size and content of this parameter are unknown to the DA or DN. In the implementation, this Loop is realized using a data structure that is called GenericMsgLoop. The GenericMsgLoop itself contains a kind of dynamic multi-dimensional array of multi-type data; this structure (class) is called DArray. These data structures, implemented as data classes, are further explained in the following section that deals with the DN package implementation details.
Once the message has been sent, the calling function will make a blocking call to receive the server response from the same socket that it created as a signaling channel. The received message from the server will be checked to ensure that the transactionId corresponds to the value of transactionId in the message that was sent. This is to verify that the response pertains to the calling function. The received message will then be passed to higher layers to determine whether the operation was successful or not.

3.3.3.2 DN Package Implementation Details (UDP/TCP)

As mentioned in the previous section, there are a number of structures that contain required information in the DN package. These structures are the NetSessionList, GenericMsgLoop, and DArray. The usage of these structures is described in the following subsections.

3.3.3.2.1 NetSessionList: list of network sessions

A network session is the key element in the DN package. This structure uniquely identifies the DN object, so that each request made by the upper DPI layer deals with either a pre-established network session or establishes a new one. A network session is recognized between each pair of DMIF peers and is locally associated with a Windows socket, sigSocket. The structure that is used for each network session object is seen below:

    struct NetSessionRecordStruct {
    public:
        long double networkSessionId;
        SOCKET sigSocket;
        SOCKADDR_IN destination_sin;
        NetSessionRecordStruct* nextElement;
    private:
        UserPlaneRecord_Type* UserPlaneList;
    };

This structure is saved as an association table in the form of a linked list. It associates the relationships between networkSessionId, sigSocket, and destination_sin (destination address). This is due to the networkSessionId being the only binding variable between the DN layer and the upper layer, thus a translation between networkSessionId and sigSocket is necessary.
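The translation from networkSessionId to signaling socket and destination address can be sketched as a small lookup table. The sketch below is illustrative only and makes several assumptions that are not in the thesis: the names NetSessionTable, NetSessionRecord, SocketHandle and Endpoint are hypothetical, portable stand-ins replace the Winsock SOCKET and SOCKADDR_IN types, a 64-bit integer stands in for the session identifier, and std::list replaces the hand-written linked list.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <string>

// Portable stand-ins for the Winsock types used in the thesis (assumptions).
using SocketHandle = int;
struct Endpoint {
    std::string ip;
    std::uint16_t port;
};

// One record per network session: identifier, signaling socket, destination.
struct NetSessionRecord {
    std::uint64_t networkSessionId;
    SocketHandle sigSocket;
    Endpoint destination;
};

// Hypothetical association table keyed by networkSessionId.
class NetSessionTable {
public:
    // Registers a new session; identifiers are assumed to be unique.
    void add(const NetSessionRecord& record) {
        assert(find(record.networkSessionId) == nullptr);
        records_.push_back(record);
    }

    // Returns the record for the given session, or nullptr if none exists.
    const NetSessionRecord* find(std::uint64_t networkSessionId) const {
        for (const NetSessionRecord& r : records_) {
            if (r.networkSessionId == networkSessionId) return &r;
        }
        return nullptr;
    }

    // Removes a session; returns false if the identifier was not present.
    bool remove(std::uint64_t networkSessionId) {
        for (auto it = records_.begin(); it != records_.end(); ++it) {
            if (it->networkSessionId == networkSessionId) {
                records_.erase(it);
                return true;
            }
        }
        return false;
    }

private:
    std::list<NetSessionRecord> records_;
};
```

A DN function that receives only a networkSessionId from the upper layer would call find() to obtain the socket and destination to use for its signaling message.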
This association table is implemented as a C++ class containing the aforementioned elements. The networkSessionId is a unique identifier for the network session, or its associated socket. DMIF signaling messages are delivered through this socket. This socket is bound to a local port that can be assigned dynamically. DNI request messages are sent to the target DMIF peer through this port. The target port is a well-known DDSP port. The client then listens (blocking call) on sigSocket (local port) for a response from the server. Once the session has been established, the following DPI requests will be called with the networkSessionId parameter. The DN functions then determine the outgoing socket and destination address to use through the association table class. The socket assigned for DMIF signaling will be used for all signaling messages; therefore, it must be shared amongst all DN functions parameterized by the same networkSessionId.

Information about User Planes that are controlled by each network session is kept in each network session record as well. Since the number of User Planes served by each network session is not known in advance, the object containing User Plane information is implemented as a linked list inside the network session record. A TAT exclusively identifies each User Plane record. The structure for the User Plane that is referred to by the network session object can be seen below:

    struct UserPlaneRecord_Type {
        int16 TAT;
        SOCKET TransmuxSocket;
        int8 direction;  // Transmux channels are unidirectional, hence need a direction field
        struct {
            DWORD IPAddress;
            int16 Port;
        } dest_Transmux_tuple;
        UserPlaneRecord_Type* nextUPRecord;
    };

3.3.3.2.2 DArray and GenericMsgLoop

This data structure is used by the application to exchange data between the client and server. Some information within the DMIF peers is also exchanged using this structure.
The DMIF document defines these parameters (semantically) as "Loop." Sometimes these Loops are nested and their internal variables are in turn other Loops. To realize this data structure I defined a primary data structure called DArray. DArray is a dynamic-size array of pointers to other data types. According to the standard, these secondary data types should support any type of user data. In the implementation, they are dynamic-size arrays of fixed-size data types. This provides a two-dimensional array of fixed-size data types. Functions defined in the DArray class help in managing the data object and facilitate the storage, retrieval and deletion of data elements.

GenericMsgLoop in turn contains a number of data types that include DArrays as one of their records. In the GenericMsgLoop these data records are initially defined with DArrays of zero size. When the user application requests to send data, the DArray will be resized to accommodate the available data. Some examples of DArray usage are user application data (e.g., ddData), channel descriptor information from the DN layer, and QoS descriptor information from the DA layer based on the requirements of the media.

3.3.3.3 DDSP Messages

The DMIF standard enforces the syntax required for use with DDSP messages. Each DN request is mapped to two DMIF messages that are called DS_x_Request and DS_x_Confirm. The former is constructed from a DN layer request and the latter includes parameters that are output to the DN primitive. In the implementation all of these messages are defined as C data structures. The structures may contain one of the aforementioned data types (e.g. GenericMsgLoop) or standard C data types. To translate back and forth from the structures to the bitstreams, the following two functions are used: getStream() and createObjfromStream().
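The translation between message structures and bitstreams can be illustrated with a small round-trip sketch. This is not the DDSP wire format and not the thesis code: DemoMessage, its field layout, and the function names toStream and fromStream are hypothetical, chosen only to show a structure being flattened to bytes in network byte order and rebuilt on the receiving side, including a variable-length field in the spirit of a Loop parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical message: two fixed fields plus a variable-length payload.
struct DemoMessage {
    std::uint16_t messageId;
    std::uint16_t transactionId;
    std::vector<std::uint8_t> payload;
};

// Appends a 16-bit value in big-endian (network) byte order.
static void putU16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
}

// Reads a big-endian 16-bit value; the caller guarantees pos + 1 is in range.
static std::uint16_t getU16(const std::vector<std::uint8_t>& in, std::size_t pos) {
    return static_cast<std::uint16_t>((in[pos] << 8) | in[pos + 1]);
}

// Structure to byte stream: header fields, payload length, payload bytes.
std::vector<std::uint8_t> toStream(const DemoMessage& m) {
    assert(m.payload.size() <= 0xFFFF);  // length must fit in 16 bits
    std::vector<std::uint8_t> out;
    putU16(out, m.messageId);
    putU16(out, m.transactionId);
    putU16(out, static_cast<std::uint16_t>(m.payload.size()));
    out.insert(out.end(), m.payload.begin(), m.payload.end());
    return out;
}

// Byte stream to structure; returns false if the stream is malformed.
bool fromStream(const std::vector<std::uint8_t>& in, DemoMessage& m) {
    const std::size_t headerSize = 6;
    if (in.size() < headerSize) return false;
    const std::size_t length = getU16(in, 4);
    if (in.size() != headerSize + length) return false;
    m.messageId = getU16(in, 0);
    m.transactionId = getU16(in, 2);
    m.payload.assign(in.begin() + headerSize, in.end());
    return true;
}
```

The length check in fromStream reflects the point made above: because the DMIF layer does not know the content of the carried data, the receiver can rely only on the declared size to delimit it.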
3.3.4 DATA PLANE THREAD

This Data Plane thread along with its equivalent on the server side are the components that make up the Data Plane. The Data Plane thread is responsible for receiving incoming media data from the server and buffering this data before passing it to the upper layers. This thread is created upon successful completion of a ChannelAdd command between the client and the server. Once the client requests the addition of one or more channels from the server, the server will determine the number of Transmux channels that are required for these logical elementary stream channels. When the channels have been created on the server, the client will receive the appropriate control instruction ChannelAdded and one or more Transmux channels will be created for the incoming media data. Once the client issues a media specific command using the UserCommandAck primitive, the server will use the newly created Transmux channel to transmit the appropriate media information.

The Data Plane at the client needs to demultiplex the packetized elementary streams and route the packets to their associated data channels. The Sync Layer reassembles the media packets to their primary form of access units. Packetization and reassembly of SL packets are out of the scope of DMIF and are performed in the Synchronization Layer. There is no processing of the data at the lower levels. The media data is passed up to the client application for display through the DMIF object. Upon timely arrival of the data at the client, the application will use the elementary stream time stamps to synchronize various streams and play them at the required rate. The client also contains a buffer for incoming data to support high bit-rate media pumps.
The buffer is also used for diminishing network jitter, since the client should ideally have a few seconds of the unpresented media in its buffer at all times. The buffer is also required for low delay initiation of the media. Presently there is no control information that is exchanged between the client and the server during transmission of the media. This can be added through DDSP messages by using DN_TransMuxConfig primitives or through RTCP messages once RTP has been implemented for media transmission. Such information would be used to change the channel behaviour, thus creating more or less bandwidth for the presentation. This may force the server to dynamically alter the bitrate of the incoming data to accommodate the available resources.

3.4 MPEG-4 STREAMING SERVER

This section contains the design and implementation details of the MPEG-4 Streaming Server. The Streaming Server implementation consists of two layers, an application layer (service provider layer) and a DMIF layer. The DMIF layer is further divided into two parts, as is the case with the client DMIF layer. There are various advantages to this server, the most important being the multi-client service capability. The server architecture can be seen in Figure 3-7. Flexibility in exploiting different transport technologies and the use of different media sources are other advantages this architecture offers.

[Figure 3-7 labels recovered from the diagram: DAI; DNI; SPI; Application Service Manager; Service Session Objects; Data Plane (Channel); DMIF Service Manager; DMIF Service Objects; Network Session Manager; Network Session Objects; DN Daemon: listening + secondary threads; ES provider: MP4 file reader.]
[Figure 3-7: MPEG-4 Streaming Server Architecture]

The Server Application consists of two distinct layers. One layer is in essence the Service Provider layer for the client requests. The other layer consists of one or more ES Providers. The Service Provider layer includes the realization of the MPEG-4 Sync Layer. The elementary streams are supplied to this layer by the ES Provider, which can be a real-time MPEG-4 encoder, an MP4 file reader, etc. These elementary streams are packetized in this layer, taking into consideration the transport protocol used by the Data Plane. These stream providers can be added to the implementation as new DLLs and this addition will not oblige any changes in the Server Application layer. To facilitate this media resource abstraction, a Server Provider Interface (SPI) has been defined. Currently only the MP4 file reader instance has been exploited as an ES provider.

The DMIF layer on the server side consists of the same constructs as the client DMIF, as well as a DMIF-Network Daemon. The major responsibility of the DN Daemon is to listen on a specific signaling port (DMIF_PORT) and relay client requests to the proper service provider in the upper layer. The major difference between the Server DMIF layer and its client-side counterpart is that every client request forces the creation of an individual and independent thread. This thread traverses all the DMIF layers as well as the application layer using the appropriate servicing objects. During normal operations, the server only has threads processing the media data for its clients and does not have a thread dedicated to that session awaiting client requests.
With a single thread created per request, the thread is terminated once the request has been addressed; the server remains lightly loaded while the servicing objects stay available for future requests. For example, in a scenario where there are 100 streams being sent to 50 clients, the server would need an average of slightly over 100 threads to serve the clients, whereas in the case where each session requires a managing thread, that number would be 150. This structure can be seen in Figure 3-8. This, however, is only true for the case where UDP is the underlying transport protocol. For connection-oriented TCP, a listening thread is created to monitor the receiving port. In this scenario the abovementioned saving does not come into play and a scheduling technique would greatly improve the performance. There is substantial improvement in using a scheduling technique for the UDP scenario as well; this is further discussed in section 3.6.

[Figure 3-8: User "Play" Command. Recoverable labels: Client; DN Daemon; Secondary Thread; Data Plane thread; UserCommandRequest; UserCommandConfirm; UserCommandCallback; Begin thread; blocked.]

Each layer in the server architecture consists of a Manager object and a number of Service objects as seen in Figure 3-7. The DNA layer objects, Network Session objects, handle the DDSP signaling between two peers. All the requests associated with a specific DMIF service are handled by its Service object. Each tuple of Network Session and DMIF Service objects is linked to a Service Session object in the Application (Sync) layer. Thus, each request to the server, as it traverses through the layers, is served by a Network Session object, a DMIF Service object, and a Service Session object.
The thread that has been created to service the request is a member of the first servicing object, the Network Session. As the thread traverses the upper layers, it invokes member functions of its associated service objects. This makes the implementation very scalable and enforces separation between various clients since their object handlers are disparate.

The MPEG-4 streaming server was designed and implemented as an executable program and a number of DLLs. The Server executable contains the Service provider layer, which controls the sessions between the servers and clients. The additional components, including the ES providers and DMIF layer, are implemented as DLLs.

3.4.1 APPLICATION LAYER

The Server Application Layer includes a number of objects of the Application Service class (AppServiceClass), which are responsible for serving the requests made by their corresponding client service. These objects can also be considered as the realization of a service session. Application service objects are created upon the ServiceAttach request made by clients. These objects implement the DAI_Ind and DAI_cnf functions. This layer is the main layer responsible for elementary stream management; it includes the realization of the MPEG-4 Sync Layer. Sync Layer packets that are read by the MP4 file reader are transferred to the application layer by calling its DAI_DataReceive_Ind function. These packets are then sent to the client using the server DMIF layer.

The MP4 file reader used in this implementation is a variation of the original MP4 File instance provided by IM1. In addition to rate control provisions, various modifications are required for remote transmission of the media. These modifications deal with the transfer of SL packets to the application layer.
3.4.2 DMIF SERVICE LAYER

The server DMIF service layer is the layer that binds the Application Layer to the DN Layer. This layer contains a number of DMIF Service objects (DMIFServiceClass) that are responsible for serving the requests made by their corresponding network session objects. DMIF service objects are created upon the ServiceAttach requests made by clients. These objects implement the DN_Ind & DAI_Rsp functionality. This layer abstracts the communication layer (DN layer) below from the Application Layer above. The services on the server DMIF layer side correspond to services on the client side and represent an active session.

3.4.3 DN PACKAGE

On the server, the DN layer is composed of two main modules, a DN daemon and a set of DN callback functions as seen in Figure 3-7. The DN daemon is implemented as a thread that is responsible for listening to clients. It is also responsible for managing the secondary threads that handle the client requests.

The DN daemon listens to the DMIF_PORT (UDP port 5000, TCP port 5001) for DDSP messages. This local port is associated with a Windows socket and shared amongst several network sessions, and thus it is mapped to several networkSessionIds, as is defined in section 12.3.5 of [ISO 14496-6]. When a message arrives at the server, its source IP and port addresses are extracted and reused for sending the response messages. The extracted address is of the SOCKADDR_IN format and is shared between the DN Daemon and the secondary dispatch threads. When a message is received from the client, the DN daemon creates a secondary thread to service this request and passes to it a structure containing the relevant information. The information by which the thread is created is based on the messageId field.
Before dispatch the DN daemon also verifies the transactionId to ensure that duplicate messages are discarded. When the callback function is determined by the DN daemon, it will be dispatched as a new thread (secondary thread) to handle the client's request. The DN daemon will pass the message to this thread for further processing and resume listening to the DMIF_PORT. The format of the structure that is passed to the secondary thread can be seen below. It consists of the incoming client message and its network address in the form of SOCKADDR_IN. The first component of this structure, the client message, is received as a stream of bytes from the incoming socket; thus the secondary thread has to first translate this information into a DS message structure before passing it on to higher layers. The secondary thread uses the network address field to send a reply to the client.

    typedef struct {
        char buf[MAX_MESSAGE_LENGTH];
        SOCKADDR_IN client_sin;
    } message_type;

The newly spawned thread parses the message and calls its corresponding DNI callback function. The callback function transfers the interpreted client message to the upper application layer and awaits a response. The secondary thread (dispatched function) then assembles a response message and sends it back to the client. This thread can be terminated upon completion of this task. This behaviour can be seen in Figure II.1 in Appendix II.

3.4.4 DATA PLANE THREAD

The Data Plane, as seen in Figure 3-1, is the transportation medium for multimedia data. This multimedia data can be delivered using any underlying transport protocol; however, UDP and RTP are the primary candidates. Since the MPEG-4 Sync Layer provides some transport related functionalities such as sequence numbering and timing, UDP seems to be the most straightforward option for the delivery of MPEG-4 data over IP.
The availability of a multiplexing tool such as FlexMux further encourages the use of UDP, a simple transport protocol. Despite the simplicity, there are some problems associated with this approach. Inter-media synchronization cannot easily be achieved between MPEG-4 streams coming from different sources. Moreover, other multimedia data streams cannot be synchronized with MPEG-4 data delivered over UDP. Because UDP provides no feedback, an MPEG-4 back-channel may have to be created to carry quality feedback.

Regardless of the underlying transport scheme that may be used, the following holds true for the Data Plane. During channel setup, a number of Transmux channels are created to transport the application data from the server to the client. The Transmux channel has a one-to-one association with a communication channel (e.g., an RTP session or a UDP port). When a client request to initiate the media sequence, i.e. "Play," arrives at the server, a dedicated thread is dispatched per data channel to send the packetized elementary streams to the DMIF Network Access layer. The DNA layer multiplexes and transmits the packetized elementary streams across the Transmux channel. The Data Plane in the implementation uses the simple mode of the FlexMux recommendations to multiplex the packetized elementary streams for transmission.

3.4.4.1 Scheduling

Scheduling relates to media preparation for transmittal to the client(s) in a real-time fashion. In an ideal scheduler, the decision to send media that is due at the client would be made instantaneously, and the propagation time and transmittal time would be negligible. However, as the number of serviced clients increases, in a realistic system, the server time is divided into smaller time units dedicated to each client.
This can cause the server to miss media deadlines if it does not context switch back to a client with pending media. There are two assumptions that are made during the design of the server scheduler. Firstly, the server machine is assumed to be a dedicated machine where all processor time is dedicated to the server processes. Secondly, the service provided to all clients is of equal priority.

The server scheduler is designed as a pseudo-round robin scheduler. As described earlier, every client data plane contains a dedicated normal priority thread. This implies that the operating system will allocate equal time to each of the threads. Thus, among the client servicing threads and the server DN listening Daemon thread, a round robin scheduler comes into being. Within each data plane thread, the decision to transmit the next packet of data is based on the decomposition timestamp of the media. This information is contained within the SL packets that are sent from the server application layer. If the number of clients were to increase to the thousands, this scheme would not work, since as mentioned previously the idle times between a client receiving processor time would mean that the media would miss its deadlines. This deals more with the issue of rate control and is discussed further in the following section.

3.4.4.2 Rate Control & Fast Start

Rate and flow control is performed at the server and the data is sent to the client based on media timestamps. This is media dependent and varies based on content and the number of elementary streams that have to be sent to the client. The ES providers, seen in Figure 3-7, supply the service provider (application) layer with media access units. The Sync Layer, which is part of the service provider layer, receives and packetizes the access units. Using the information that is provided to the Sync layer, the SL headers are generated.
SL packets are then multiplexed and passed to the DMIF layer to be transported over Transmux channels.

The timing information that is available to the Sync layer helps in adjusting the rate of transmission. This scheme prevents data underflow or overflow at the client. However, it does not address network bandwidth issues. In order to allow for bandwidth fluctuations, an adaptive source-coding algorithm [Dang96, Leeh00], an intelligent buffering system [Walk01], or a layered encoding [Ghan89] must be adopted. However, the server rate-control mechanism takes into account the client media buffer size. The server also offers a user defined packet drop rate. This allows for a certain percentage of the media to be sent to the client when there is a restriction in bandwidth availability. The lower bound on the amount of media data that is sent to the client is 50% of the total media bitrate. When the server is servicing multiple clients, this value will be an aggregate reduction in bitrate, since the same fair scheme will be used to thin each client's outgoing bitstream. This mechanism can be further improved to base the adaptation of media on real-time readings from the network as opposed to user intervention. This is discussed further in the final chapter.

The transmission rate of the server is higher at the beginning of a session, in order to fill the client media buffer. This is referred to as the Fast Start mechanism. Based on the client buffer size, the transmission rate is then gradually slowed to the real rate of the media data. Because of this initial burst of data, the media data arrives at the client before its deadline has passed. Once the rate is lowered to the media rate, the data delivered at the client will lead the play-out time of the media and ensure uninterrupted multimedia with minimal jitter.
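The two rate adjustments described above can be sketched numerically. The functions below are a hedged illustration only: the thesis does not give the shape of the Fast Start ramp, so a linear decay from a hypothetical startFactor down to the media rate over a hypothetical rampSeconds is assumed (in the thesis the slow-down is derived from the client buffer size), and the function names are invented for this example. The 50% lower bound on the sent fraction is taken from the text.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical Fast Start profile: begin at startFactor times the media rate
// and decay linearly to the media rate over rampSeconds (assumed shape).
double fastStartRateKbps(double mediaRateKbps, double startFactor,
                         double rampSeconds, double elapsedSeconds) {
    assert(rampSeconds > 0.0 && startFactor >= 1.0);
    // Fraction of the ramp already completed, clamped to [0, 1].
    double progress = std::min(std::max(elapsedSeconds / rampSeconds, 0.0), 1.0);
    double factor = startFactor - (startFactor - 1.0) * progress;
    return mediaRateKbps * factor;
}

// Thinned bitrate for a user-selected percentage of the full rate.
// The text states a lower bound of 50% of the total media bitrate.
double sentRateKbps(double mediaRateKbps, double percentOfFullRate) {
    double clamped = std::min(std::max(percentOfFullRate, 50.0), 100.0);
    return mediaRateKbps * clamped / 100.0;
}
```

For example, with a 56 Kbps stream, a start factor of 2 and a 10 second ramp, this sketch sends at 112 Kbps initially, 84 Kbps after 5 seconds, and 56 Kbps from 10 seconds onward; a requested 30% of full rate is raised to the 50% floor, giving 28 Kbps.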
3.5 SIMULATION & RESULTS

The above implementation required a great deal of architectural and conceptual design. The server application interface can be seen in Figure 3-9. Appendix I contains a picture of the client application that includes the Remote Instance Implementation.

The test bed used for the experiments in this section consisted of two personal computers (PCs) connected through Ethernet on the same local area network (LAN). Experiments were also performed across the Shaw@Home network to ensure correct operation over realistic Internet conditions. The simulations presented here were all done using the OPNET network simulation software. The media used during these simulations is an exponentially distributed signal at an average bitrate of 52 Kbps. The following sections contain the measured results compared to their simulation counterparts.

3.5.1 SINGLE CLIENT OPERATION

Figure 3-9 shows two back-to-back sessions from one client. In the first session, vh64.mp4, the I-frame interval can be seen to be 0.5 seconds and the media bitrate to be 56 Kbps on average. The second session, videoandaudio.mp4, has a single I-frame at the start of the media, and the average bitrate is 25 Kbps. Towards the end of this media there is considerable movement, so the P-frames contain a great deal of intra-coded macroblocks, which is the reason for the higher bitrate there. The results shown in the following section use the aforementioned media as well as the 125 Kbps version of vh64.mp4, Linda.mp4, which contains higher quality video as well as audio.
Figure 3-9 "Apadana" Server Application

The traffic shown in the figure is the total traffic sent from the server. This includes the Control Plane messaging as well as the Data Plane media data. The Fast Start mechanism is visible at the start of each of the two sessions in the figure. The start, however, also includes all the Control Plane traffic that is exchanged between the server and the clients. The rate that is sent from the server matches the rate of the encoded media. Since the media is pre-encoded, the bitrate cannot be controlled to provide a stable bitrate. There are numerous rate control schemes of that kind, but they only apply during the encoding process.

The functionality of the unicast remote server can be verified by comparing its behaviour with that of the local retrieval instance. In the local instance, there are no bandwidth restrictions and no possibility of loss. In the remote server scenario, on the other hand, the rate of the media has to be regulated to ensure that the bitrate does not exceed the available bandwidth. The Linda.mp4 file is used to compare the two scenarios because it contains multiple objects. The following two figures show a comparison of the video and audio arrival times at the client program. For the local instance, the arrival occurs through a media reader thread that is bound by disk seek time and CPU processing power. For the remote instance, the arrival occurs from the delivery plane, DMIF, and consequently from the server, which introduces many different factors to the scenario. These scenarios were carried out without any packet drop, and the media that arrived at the client was identical.
In Figure 3-10, the 'Composition Time Stamp' and 'Composition Time Stamp Remote' lines represent the media data sequence from the local and remote instances respectively. Their overlap signifies that the entire media data has been received in both scenarios. They also indicate the play-out time according to the media timestamps (i.e., in order for the arriving data to be played in real time, it has to arrive prior to the play-out time).

Figure 3-10 Video Delivery Comparison (x-axis: Time (msec); series: Video Local Clock, Composition Time Stamp, Video Remote Clock, Composition Time Stamp Remote)

The 'Video Remote Clock' signifies the arrival time of the media data sequence at the client in the remote retrieval scenario. The 'Video Local Clock' signifies the arrival time of the media data sequence at the client from the local media reader thread. The 'Video Remote Clock' appears to overlap the reference clocks, but this is only because the much greater lead of the 'Video Local Clock' dominates the scale of the graph. These graphs show that the fast-start algorithm implemented in our software achieves true streaming and smooth playback at the client. This argument becomes clearer in Figure 3-11. The steadiness of the media arrival in the remote case indicates the correct operation of the rate control at the server. The local case, however, does not have to deal with bandwidth availability issues, which is why there is a large burst of video data at the beginning of the presentation. The lead of the local instance is due to the implementation and is proportional to the bitrate of the media. This is further explained in Appendix III.

Figure 3-11 shows the lead that both of these media sequences have over the play-out time; thus both deliver real-time media presentations.
In this figure, the media play-out time is considered to be time zero, and the positive values of the two scenarios represent their lead over the media play-out time.

Figure 3-11 Video Arrival Lead over Play-out Time (x-axis: Time (msec); series: Video Remote Lead, Video Local Lead)

A similar situation can be seen when we compare the audio streams from the two scenarios. In the local instance, the IM1 implementation does not burst the audio as it does the video, since the audio bitrate is much lower and buffered data is less useful. Figure 3-12 shows the media arrival lead for both scenarios. The rate-control characteristic at the server is also seen in this figure through the steadiness of the media arrival rate.

Figure 3-12 Audio Arrival Lead over Play-out Time (x-axis: interval; series: Audio Local Lead, Audio Remote Lead)

The server, as explained in previous sections, provides a number of features to ensure that the rate of the media corresponds to the encoded rate and that the video arrives prior to the media play-out time. These can be seen in more detail in the following sections.

3.5.2 RATE CONTROL

As was seen from the graphs in the previous section, the rate control mechanism at the server sends the media based on its bitrate, except at the beginning of the sequence. This is seen in Figure 3-11, where the 'Video Remote Lead' runs parallel to the x-axis. The packet drop mechanism that has been implemented attempts to compensate for the high bitrate of the media during high-motion scenes by reducing the overall transmitted bitrate. Figure 3-13 shows the comparison of the vh64.mp4 media stream sent with three different packet drop conditions. The bitrate of the media has been averaged to facilitate reading the graph.
The theoretical 50% and 85% bitrates based on the 'Full Rate' media are also shown.

Figure 3-13 Rate-Control Mechanism (x-axis: Time (sec); series: Full Rate, 85% Rate, 50% Rate, Theoretical 85% Rate, Theoretical 50% Rate)

3.5.3 FAST START MECHANISM

The fast start mechanism is required at the server to reduce the initial delay at the start of the media sequence. As explained previously, buffered data reduces network-induced jitter and allows some flexibility in the arrival time of data. The initial time for filling the buffer, however, is reduced by using a fast start mechanism at the start of the media, during which the client builds a lead on the actual media play-out time. This behaviour can be seen in Figure 3-14.

Figure 3-14 Fast Start Mechanism (x-axis: Time (msec))

The test sequence used is suzie.mp4. Three scenarios are considered in this figure. In the first scenario a 3-second client buffer is assumed, and its effect is seen in the 1.6-second lead of the arriving media over the play-out time. The discrepancy is due to a number of factors. The 3-second client buffer simply signifies the maximum amount of data that can be buffered at the client. Being too close to this value would be dangerous, since there is a possibility of data loss through buffer overflow. The second graphed line shows a 400 msec client buffer. In this case there is only a 241 msec lead over the media. Although this is enough for high-bandwidth Ethernet LAN connections, it is insufficient for the best-effort Internet. The last graphed line shows a 30 msec client buffer; this buffer is not deep enough to build a sufficient lead on the media to present it in real time. Thus, there is a 50 msec lag between the media arrival time and the decoding timestamp value.
If this lag were due to delay within the network, it would be possible to increase the delivery speed of the media to catch up to the presentation play-out time. But this would require either a skip in the media or a temporary speed-up in playback. Figure 3-15(a) shows the actual bitrate for the vh64.mp4 sequence sent from the server. It can be seen that the fast start sequence includes a much higher burst at the beginning of the media. Also, if the average bitrates are plotted, as is the case in Figure 3-15(b), the sequence with fast start rises more quickly. The reference media sequence in this graph is the averaged graph for an instantaneous constant-bitrate media at the same bitrate as suzie.mp4.

Figure 3-15 Fast Start Mechanism (a) Server Bitrate (b) Average Server Bitrate

Without this mechanism, there would be an initial delay while the media buffer was being filled before the presentation could be started at the client. This is, however, dependent on the network; without the availability of the additional bandwidth, the fast start is not possible.

3.5.4 MULTI-CLIENT BEHAVIOUR

In the case where multiple clients are being serviced, the behaviour should resemble a scaled version of the single-client scenario. The vh64.mp4 media sequence was used for these experiments due to its length. Individual clients were connected at 5 sec intervals. This was mainly due to the test set-up, which had only two PCs; all the clients were therefore run on the same machine. This is also the reason for picking a media sequence that did not have any audio component, since there would otherwise be a conflict over the resources that the IM1 client programs would be able to obtain. Figure 3-16 shows the overall traffic sent from the server as well as a comparison with the single-client traffic. In this scenario there are five clients, all receiving the same media.
Due to the delay in start times between the first and fifth client, there is 25 sec worth of traffic in the multi-client instance. The aggregate fast start and traffic behaviour can be seen to closely match the single-client scenario.

Figure 3-16 Multi-Client Operation (x-axis: Time (sec); series: 5 Clients, 1 Client)

The client behaviour in this scenario is also similar to the single-client case. Since the client buffer depth was set to 3 seconds for all incoming requests at the server throughout the five sessions, the media characteristics at all the clients should be similar to each other as well as to the single-client scenario. The average media lead time for each of the clients is depicted in Figure 3-17. As expected, the lead time on the media at each client is very similar to the single-client scenario with a client buffer of 3 seconds. It is difficult to show in the graph that 'client 1' had five packets dropped out of a total of 3375 packets, but this was due to load on the client machine as opposed to network congestion. This was verified through a repeat test that did not result in any packet loss. This occurrence is, however, similar to a short burst of packet drops caused by congestion, and it can be seen that, due to the media encoding (I-frame interval), the client is able to recover from these errors. Another improvement that could be made to the system is for the client to request that lost packets be retransmitted. The lead on the play-out time allows for such functionality, but currently the UDP media delivery mechanism does not take any recovery steps.

Figure 3-17 Client Media Lead Time (x-axis: Time (msec))

3.6 POSSIBLE SYSTEM IMPROVEMENTS

The discussions in this section arise mostly from the shortcomings and stumbling blocks that were faced during the design and implementation of the streaming MPEG-4 unicast system.
There are numerous implementation improvements that could be introduced; however, code efficiency is not the goal of this section, and thus they will not be discussed. A few of the system improvements involve changes to the server only, whereas most of them pertain to components of both the client and the server.

The area with the highest potential for improvement is scheduling. This is, in fairness, a research area in itself, and some of the ideas mentioned here require considerable study. However, in modern servers, the maximum efficiency and support for clients is derived from improving scheduling techniques and reducing server load. As was mentioned previously, the scheduling technique used in the implementation is adequate for the current version, but it will not work as the number of clients increases. This is due to the linear increase in the number of threads that have to be managed by the operating system (OS) as the number of clients increases. If the scheduling were done entirely within one thread, the load would be much lower. A rudimentary Round-Robin (RR) scheduler could be used to achieve the same results as the existing OS scheduling. As before, however, the time scale on which the scheduling is done would have to be increased. Each session would then receive a fixed amount of data (based on the time scale), so that when its next opportunity to send data arrives, it has not missed any deadlines. This approach also provides additional benefits. With complete control over the various sessions, it becomes possible to prioritize them and ensure that some experience more stringent constraints. One such scheduling technique is Weighted Fair Queuing (WFQ). Such a technique would provide an efficient, fair model for the outgoing traffic.
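
The single-thread round-robin idea can be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: the clock is simulated, each session is a list of `(dts_ms, payload)` pairs sorted by timestamp, and the names `round_robin` and `quantum_ms` are hypothetical. In every round, each session is handed all packets whose timestamps fall within the coming time quantum, so no packet is sent after its timestamp.

```python
from collections import deque


def round_robin(sessions, quantum_ms):
    """Schedule all sessions from a single loop.

    sessions   -- dict mapping a session id to a list of (dts_ms, payload)
                  tuples sorted by dts_ms
    quantum_ms -- scheduling time scale: one round covers this much media time

    Returns a list of (send_time_ms, session_id, dts_ms) in sending order.
    """
    queues = {sid: deque(packets) for sid, packets in sessions.items()}
    now = 0
    sent = []
    while any(queues.values()):          # an empty deque is falsy
        horizon = now + quantum_ms
        for sid, queue in queues.items():
            # Give this session everything due before the next round starts.
            while queue and queue[0][0] < horizon:
                dts, _payload = queue.popleft()
                sent.append((now, sid, dts))
        now = horizon                    # simulated clock: next round
    return sent


if __name__ == "__main__":
    demo = {
        "a": [(0, b"x"), (40, b"x"), (120, b"x")],
        "b": [(10, b"y"), (110, b"y")],
    }
    for entry in round_robin(demo, quantum_ms=100):
        print(entry)
```

A weighted variant, in the spirit of WFQ, would give sessions different shares of each round; that is left out here because it needs a policy the text does not specify.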
The areas that would require changes to both client and server are mainly focused on improving media quality and network negotiations. Due to the immaturity of quality provisioning within the Internet, one of the areas that has not been developed within DMIF is QoS provisioning. This functionality would ensure superior media delivery, in that delivery is kept within tolerable network fluctuations and, once a client receives service, it is assured service for the entire presentation. To implement such a system, functionality similar to that provided through RTP would be required. Through control plane messaging within RTP, network status can be monitored and information can be derived to moderate outgoing traffic. With these changes and efficient coding, this system could easily be made into a commercially viable solution for Internet media streaming.

CHAPTER 4. MULTICAST BACKGROUND

There are various forms of multicast, but the most appropriate in this case is IP multicast, because IP is the predominant protocol on the Internet. Thus, when the word multicast is used in this work, it refers to IP multicast within the Internet. This chapter gives a brief outline of IP multicast and its associated protocols, along with their advantages and problems. The following sections describe the IP multicast standard and the significance of groups in IP multicast. Finally, the last section gives an overview of the primary group management protocol.

4.1 IP MULTICAST

Normal IP communication consists of one sender and one receiver (unicast). It is, however, beneficial for some applications for one sender to send to a large number of receivers (multicast) [Tane96].
Examples of such applications are near video on demand (NVOD) media presentations, videoconferences, distributed databases, transmission of stock quotes to multiple brokers, and updates to replicated data. This multicast advantage led to the development of the IP multicast standard.

The IP multicast standard classifies the multiple receivers into a group, and hosts can choose to join or leave this group at will. Furthermore, a host may belong to more than one group at a time. The notion of a group is an essential concept of IP multicast. Multicast groups have an ID called the multicast group ID. Whenever a multicast message is sent out, a multicast group ID specifies the destination group. These group IDs are essentially a set of IP addresses called "Class D" addresses. Therefore, if a host (a process in a host) wants to receive multicast messages sent to a particular group, it needs to listen to all messages sent to that group.

If the source and destinations of a multicast packet share a common bus (e.g., an Ethernet bus), each host only needs to know which groups have members among the processes of that host. However, if the source and destinations are not on the same LAN, forwarding the multicast messages to the destinations becomes more complicated. An example of such a scenario is illustrated in Figure 4-1, where source S wants to transmit to destinations in multicast group G1. Although the source can send a copy of the packets to each of the destinations separately, it is more efficient to minimize the number of copies in the network. With multicast routing, router 1 receives only one copy of the packets, as opposed to 4 copies in the unicast case. It then forwards the packets to routers 2 and 4 simultaneously. Upon receipt of these packets, router 2 copies the packets to its local network and router 4 duplicates the packets onto routers 5 and 6.
The packets are forwarded in this manner until all the group members have a copy of the packets [Garc00].

Figure 4-1 Multicast Tree

To solve the problem of Internet-wide routing of multicast messages, hosts need to join a group by informing the multicast router on their subnetwork. The Internet Group Management Protocol (IGMP) is used for this purpose. Leaving a group is also done through IGMP. This way, multicast routers know about the members of multicast groups on their network and can decide whether to forward a multicast message to the hosts on their network or simply disregard the message. Whenever a multicast router receives a multicast packet, it checks the message group ID and forwards the packet only if there is a member of that group in the networks connected to it.

IGMP provides the information required in the last stage of forwarding a multicast message to its destinations. However, to deliver a multicast packet from the source to destination nodes on other networks, multicast routers have to exchange the group membership information they have gathered from their downstream paths. Based on the routing information obtained, whenever a multicast packet is sent out to a multicast group, multicast routers decide whether or not to forward that packet to their network(s). Finally, the leaf router uses the IGMP information to check whether there is any member of that particular group on its physically attached networks and decides whether or not to forward the packet.

With multicast routing, each packet is transmitted only once per link. The bandwidth saving with multicast routing becomes more substantial as the number of group members increases. There are numerous protocols associated with generating multicast trees and obtaining routing information, but they are beyond the scope of this thesis. Some of these protocols are part of the multicast backbone (MBONE).
This is basically an overlay packet network on the Internet that supports routing of IP multicast packets. Unfortunately, although IP multicast has been around for many years, the expenses involved in superimposing the multicast functionality on the existing Internet backbone are exorbitant. This is why we still do not have ubiquitous access to this technology.

IP multicast inherently provides an unreliable datagram multicast service. This service does not offer any guarantee that a given packet will reach every intended recipient in the multicast group. This is not an issue when the application is concerned with performance rather than reliability. It is analogous to the UDP protocol for unicast systems. However, numerous reliable multicast protocols have also been developed to mimic the behaviour of TCP in a unicast system. These protocols include Single Connection Emulation (SCE) [Talp95] and Reliable Multicast Protocol (RMP) [Whet95]. They provide applications with guarantees of atomicity and reliability, which ensure that all group members receive each message exactly once. Atomic, reliable multicast protocols are useful for developing applications such as distributed databases. Such applications need to be certain that all members of a multicast group agree on which packets have been received.

Prior to the start of a multicast session, there has to be a means by which clients become aware of the ongoing sessions and their contents. This is usually handled using other protocols such as the Session Announcement Protocol (SAP) [RFC2543] and the Session Description Protocol (SDP) [RFC2327], or simply by advertising the session on a web site [Moha01]. The issues of session announcement and description are not within the scope of this thesis.

4.2 MULTICAST GROUPS

There are three types of IPv4 addresses: unicast, broadcast, and multicast.
Unicast addresses are used for transmitting a message to a single destination node. Broadcast addresses are used when a message is to be transmitted to all nodes in a subnetwork. Multicast addresses are used for delivering a message to a group of destination nodes that are not necessarily in the same subnetwork. While Class A, B, and C IP addresses are used for unicast messages, Class D addresses (224.0.0.0 to 239.255.255.255) are employed by multicast messages.

The sender creates a multicast session with a particular group address, and all interested hosts are then able to join the group. The sender is not aware of the hosts that are members of the multicast session; the following section gives more detail on how this transparency is achieved through IGMP.

4.3 INTERNET GROUP MANAGEMENT PROTOCOL (VER. 2)

The Internet Group Management Protocol was designed to allow hosts to signal group membership to their attached routers. IGMP is a host-router protocol, where routers can also be considered hosts when they are signaling an upstream router. IGMP runs directly on IP, with an IP protocol number of 2. Hosts willing to receive multicast packets need to inform their neighbouring routers that they are interested in receiving multicast messages sent to certain multicast groups. This way, each node can become a member of one or more multicast groups and receive the multicast packets sent to those groups. Routers also use IGMP to periodically check whether known group members are still active. If there is more than one multicast router on a given subnetwork (LAN), one of the routers is elected as the "querier" and assumes the responsibility of keeping track of the membership state of the multicast groups that have active members on its subnetwork.
Based on the information obtained from IGMP, the router can decide whether or not to forward the multicast messages it receives to its subnetwork(s). After receiving a multicast packet sent to a certain multicast group, the router checks for at least one member of that particular group on its subnetwork. If there is one, the router forwards the message to that subnetwork. Otherwise, it discards the packet.

4.3.1 IGMP MESSAGE FORMAT

IGMP is an integral part of IP. The messages in IGMP all have a fixed size with no optional data. All IGMP messages of concern to hosts have the format seen in Figure 4-2 [RFC2236].

The 8-bit Type field is the first field in the IGMP message. There are three types of messages for host-router interaction. These types can be seen in Table 4-1. Multicast routers simply ignore any messages with an incorrect type.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |      Type     | Max Resp Time |           Checksum            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Group Address                         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4-2 IGMP Message Format

Table 4-1 IGMP Message Types

    Type   Description
    0x11   Membership Query, in two forms:
           General Query, used to learn which groups have members on an attached network.
           Group-Specific Query, used to learn if a particular group has any members on an attached network.
    0x16   Version 2 Membership Report
    0x17   Leave Group
    0x12   Version 1 Membership Report, for backwards compatibility with IGMPv1

The Max Response Time field is only used by routers when sending a Membership Query message. It specifies the maximum time allowed before hosts send a responding report. In all other messages, it is set to zero by the sender and ignored by the receivers.
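
The fixed eight-byte layout of Figure 4-2 can be illustrated with a short sketch that packs an IGMPv2 message. The checksum used is the standard 16-bit one's-complement Internet checksum, computed with the checksum field zeroed. The function names `internet_checksum` and `build_igmpv2` are hypothetical, and the sketch only builds the bytes; actually sending IGMP requires a raw socket and is normally done by the operating system's IP stack.

```python
import socket
import struct

# Message types from Table 4-1.
MEMBERSHIP_QUERY = 0x11
V2_MEMBERSHIP_REPORT = 0x16
LEAVE_GROUP = 0x17


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmpv2(msg_type: int, max_resp_time: int, group: str) -> bytes:
    """Pack Type, Max Resp Time, Checksum and Group Address (8 bytes)."""
    address = socket.inet_aton(group)        # 4-byte network-order address
    # First pass with the checksum field set to zero.
    unchecked = struct.pack("!BBH4s", msg_type, max_resp_time, 0, address)
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH4s", msg_type, max_resp_time, checksum, address)


if __name__ == "__main__":
    report = build_igmpv2(V2_MEMBERSHIP_REPORT, 0, "224.1.2.3")
    print(report.hex())
```

A receiver can validate a message by running the same checksum over all eight bytes; a correct message yields zero.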
The Checksum is the 16-bit one's complement of the one's complement sum of the whole IGMP message (the entire IP payload). During the checksum calculation, the checksum field is set to zero so that it does not affect the outcome of the calculation.

The Group Address field in the message takes on a different value depending on the context in which it is used. In a Membership Query message, this field is set to zero when sending a General Query, and set to the group address being queried when sending a Group-Specific Query. Queriers periodically send General Queries to determine which multicast groups have members on their subnetwork. In a Membership Report or Leave Group message, the Group Address field holds the IP multicast group address of the group being reported or left, respectively.

4.3.2 PROTOCOL DESCRIPTION

A multicast router keeps a list of multicast group memberships for each attached network, and a timer for each membership. The list of group memberships records the presence of at least one member of a multicast group, not every member within the group. However, the behaviour of the multicast router does not affect the behaviour of the multicast hosts and the target application, so it will not be discussed in any detail.

Once a host has determined a multicast session that it wishes to join, it sends a join message to its neighbouring router and becomes part of the multicast group. The following discussion assumes that a host has become a member of a multicast group. When a host joins a multicast group, it should immediately transmit an unsolicited Membership Report for that group. This is required in case the host is the first member of that group on the network, so that the subnetwork router is informed.
Following this event, when a host receives a General Query message, it responds with a Membership Report to the querying router within the maximum response time allowed. If the host receives another host's Report for the same group before sending its own, it does not send a Report, in order to avoid duplicates.

To leave a multicast group, a host that was the last to send a Membership Report for that group should send a Leave Group message to the all-routers multicast group (224.0.0.2). If a host has sufficient memory to keep track of whether it was the last to report, a host that was not the last can forgo sending a Leave Group message. The Leave Group message is addressed to the all-routers group because other group members have no need to know that a host has left the group.

Many of the details of IGMP have been left out for conciseness. The relevant details, however, are introduced in the following chapter along with their usage in a multicast-enabled DMIF protocol.

CHAPTER 5. MULTICAST SYSTEM

When we consider a multicast system, we have to consider the changes that are required at the server and client ends as well as in the network itself. As mentioned previously, the network modifications involve the placement of a multicast network (i.e., MBONE routers) to enable the IGMP communications and to provide the routing protocols required for building the multicast trees. This, however, is not within the scope of this thesis and will not be discussed any further. The first section of this chapter describes the changes required on the server to enable multicast delivery of media. The second section proposes an adaptation of the DMIF protocol that supports multicast sessions and the group management additions required at the client end. The next section contains simulated results of using such a system for media delivery.
In the final section, the advantages of such an MPEG-4 multicast system are presented. The discussion here represents a design for a multicast-enabled DMIF layer for an MPEG-4 system. The simulation results show the advantages that arise from developing such a system. These are shown in the fourth section.

5.1 MULTICAST SERVER

The multicast server is somewhat less involved than the multicast client. The reason is that the server does not have to join any groups or create any sessions; it simply becomes a multicast sender (by sending to a group address). Although this does not impact the communication layer very much, it does slightly change the behaviour of the server and its DMIF layer.

As mentioned previously, the sender in a multicast session needs to inform the other hosts of the event, and this can be done either through a web page or through SAP and SDP. This is the additional work that has to be performed on behalf of the server to inform other hosts of the session. A simple solution would be to place the details on a designated web site, which means that there would be no changes to the server application.

Once other hosts have been informed of the session and the scheduled start time of the multicast has arrived, the server starts streaming the media. The fundamental difference between this and the unicast scenario is that no signaling occurs between the receivers and the sender. As was seen in Figure 2-7, the target DMIF in this scenario is not very substantial, due to the lack of signaling required between the client and server. Unlike the unicast scenario, the session is not a receiver-initiated service; instead, the server sends all media data to the multicast group address.
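
As an illustration of how little a sender needs at the socket level, the following is a minimal sketch of a UDP multicast sender. It is an assumption-laden example rather than the server's DMIF code: the helper names, the default TTL of 1 and the group and port in the usage note are all hypothetical. Note that the sender never joins the group; it only addresses datagrams to a Class D address.

```python
import ipaddress
import socket


def is_multicast_group(address: str) -> bool:
    """True for IPv4 Class D addresses (224.0.0.0 to 239.255.255.255)."""
    return ipaddress.ip_address(address).is_multicast


def open_multicast_sender(ttl: int = 1) -> socket.socket:
    """Create a UDP socket whose multicast datagrams carry the given TTL.

    A TTL of 1 keeps the traffic on the local subnetwork; larger values
    let multicast routers forward it further.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


def send_to_group(sock: socket.socket, group: str, port: int, payload: bytes) -> int:
    """Send one datagram to the group address; no signaling with receivers."""
    if not is_multicast_group(group):
        raise ValueError(f"{group} is not a multicast group address")
    return sock.sendto(payload, (group, port))
```

A hypothetical use would be `send_to_group(open_multicast_sender(), "239.1.2.3", 5004, packet)`, called once per SL packet at the times chosen by the rate control.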
The server would transmit the OD and SD, as was the case with the unicast scenario, but instead of the server receiving a ChannelAdd request from the client, it would simply determine the number of channels that are required for the media and create those channels. Hosts that have already joined this multicast group would have also received the OD and SD and will be able to create the same media channels and receive the media once it is sent. The difference occurs when clients are late in joining the multicast session, which is why the server has to retransmit the IOD and SD periodically throughout the session to allow clients that join the session at a later time to determine the required channels and receive the media. Due to the small amount of data that is contained within the OD and SD, it does not add a great deal of overhead to buffer this data at the server. This is discussed in section 5.1.2.

5.1.1 FAST START ALGORITHM

The fast start mechanism cannot be used in the multicast instance. Although its use would resemble the unicast case at the start of the session, clients joining the session after it has started would be out of synch with the original clients. This is due to the fact that once the session has started the server will transmit the media based on its bitrate through rate control. To avoid this problem, the server in the multicast scenario will transmit at the media bitrate, thus the client applications will have comparable lags (propagation delays) on the media play-out time and will be synchronized. This does not pose a problem, since in a video delivery multicast scenario a lag of 0.5 seconds is not significant.

5.1.2 CLIENT RANDOM ACCESS

Since multicast sessions are not client initiated, there may be clients that join an MPEG-4 multicast after it has started.
As described previously, the IOD and SD information transmitted at the start of the session is required for any client to determine the number of streams that are present in the presentation [Herp00, Moha01]. In order to accommodate this need, the server has to buffer and periodically send the session descriptive streams (i.e., IOD, SD, and stream OD's). These descriptors will contain all the information required for creating the data channels at the client. Once the client has this information, it can receive the SL packets from all the elementary streams within the multicast session. The random repetition of the information is depicted in Figure 5-1. The period of repetition is based on the number of available elementary streams within the session.

Figure 5-1 Random Access Model

This is the extent of the changes that have to be made to the server. It does not require many changes to the delivery of the media, except to remove the signaling from the client. The server listening thread is also no longer required. On the other hand, the client modifications are more extensive, as seen in the following section.

5.2 MULTICAST CLIENT

A multicast client is similar to a unicast client except that the host requires provisions for group membership and communications. Another important difference is that the client and server do not have any means of direct communication. The client simply exchanges messages with the multicast network (i.e., the neighboring router it is attached to). These two issues combined are going to affect the DMIF client in two ways.
Firstly, the unicast client will have to adopt the IGMP protocol in order to negotiate group status with the neighbouring router. Secondly, the DMIF signaling does not have to be sent to the server. In the following section, the design modifications to the DMIF layer are discussed. The first section contains the IGMP extensions to DMIF, and the following section contains the changes in signaling.

5.2.1 DMIF WITH IGMP

There are some major additions that have to be considered for the client DMIF layer to support IGMP version 2. The most important modification is the addition of a listening thread to the Control Plane. This listening thread is required to listen for incoming IGMP Membership Query messages, whether they be general or group-specific. The new listening thread, an integral part of the DMIF Instance, will pass the queries to the DS layer for processing. Thus the application will in no way have to be changed, as all IGMP signaling will terminate and commence within DMIF. A new signaling function will have to be added to the DMIF instance to support the generation of Membership Report messages, but the existing DMIF functions will only have to be modified slightly to support joining and leaving multicast groups. The DMIF Multicast Instance requires a number of timers to meet the requirements of the IGMP messaging. When the host DS layer receives a General Query, it sets a delay timer for each group of which it is a member. When the group's timer expires, the host multicasts a Membership Report to the group. If the host receives another host's Report while it has a timer running, it stops its timer for the specified group and does not send a Report, in order to suppress duplicate Reports. Another timer that is required by the host is an UnsolicitedReportInterval timer.
This timer is required upon joining a group. When a host joins a multicast group, it should immediately transmit an unsolicited Membership Report for that group. After one or two short delays of UnsolicitedReportInterval, the host resends the Report to cover the possibility of the initial Report being lost or damaged. An optimization would be to avoid the use of this timer and, once the host joins the group, to set the query delay timer. Thus, when the delay timer runs out, the host will send the Report. A variable is required to keep track of which of the three possible IGMP states the host is in. The host state diagram can be seen in Figure 5-2.

Figure 5-2 IGMP Host State Diagram (states: Non-Member, Delaying Member, Idle Member; transitions: join group (send report, start timer), leave group (stop timer), query received (start timer), report received (stop timer), timer expired (send report))

In the non-member state, even though the host receives general Membership Query messages, the DS layer does not generate any response to these messages. A delaying member is one that has a delay timer set to respond to a query with a Membership Report message. It is evident that the enhancements to the DMIF layer are not very extensive but, as will be seen later, the advantages are numerous.

5.2.2 MULTICAST MESSAGING

In the case of the Multicast scenario, the model is simpler than that of the Remote Instance scenario but no internal interface has been identified. Conceptually, such a DMIF Instance includes the features of a Target DMIF peer, as well as those of the Target Application. All the control messages that are exchanged between peer applications in an interactive scenario are in this case terminated in the DMIF Instance.
Stream control commands such as PLAY, PAUSE, and MUTE are examples of such messages. The multicast messaging is very similar to that of the broadcast messaging described in Appendix B of [ISO 14496-6], except for the IGMP messages added for group functionality. Since there is no direct client/server communication, the functionality required by the Multicast DMIF is rudimentary and only requires session initiation and session termination. Due to the multicast nature, there is little possibility for interactive media. In Figure 5-3, the interaction between the host and the multicast network can be seen. Once the client (host) determines the available multicast services, the user may opt to join one of the scheduled or ongoing multicast groups. The host application can then request to tune into the media from a certain multicast group through the DAI. Once the DMIF layer joins the multicast group and informs the application of the required channels for the media presentation, a request to the DAI can set up the required channels and forward the incoming media data to the application from the multicast network (i.e., the neighboring multicast router). The design has kept this structure to adhere to the same design as the Remote Instance scenario. However, it is evident that the process of adding the channels and creating the Data Plane thread can be joined into one event.
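The group join performed by the DMIF layer relies on the host behaviour of Section 5.2.1 and Figure 5-2, which can be sketched as a small state machine. This is a minimal sketch under stated assumptions rather than the thesis code: the class and method names are hypothetical, timers are modelled as plain numbers that the caller must expire by calling `timer_expired()`, and the `last_reporter` flag (used to decide whether a Leave Group message is needed) is borrowed from the IGMPv2 specification rather than from the text above. Setting the timer to a fixed UnsolicitedReportInterval on join is also a simplification.

```python
import random
from enum import Enum


class State(Enum):
    NON_MEMBER = "Non-Member"
    DELAYING = "Delaying Member"
    IDLE = "Idle Member"


class IgmpHostGroup:
    """Per-group IGMPv2 host state, following the three states of Figure 5-2."""

    def __init__(self, group, unsolicited_report_interval=10.0):
        self.group = group
        self.state = State.NON_MEMBER
        self.timer = None            # seconds until a pending Report; None when stopped
        self.last_reporter = False   # assumption: tracks whether we sent the latest Report
        self.sent = []               # messages this host would transmit, in order
        self.unsolicited_report_interval = unsolicited_report_interval

    def join(self):
        """join group: send an unsolicited Report and start the timer."""
        if self.state is State.NON_MEMBER:
            self.sent.append("report")
            self.last_reporter = True
            self.timer = self.unsolicited_report_interval
            self.state = State.DELAYING

    def leave(self):
        """leave group: stop the timer; send Leave only if we reported last."""
        if self.state is State.NON_MEMBER:
            return
        if self.last_reporter:
            self.sent.append("leave")
        self.timer = None
        self.last_reporter = False
        self.state = State.NON_MEMBER

    def query_received(self, max_response_time):
        """query received: non-members ignore it; members schedule a Report."""
        if self.state is State.NON_MEMBER:
            return
        if self.timer is None or max_response_time < self.timer:
            # Random delay spreads Reports so that one member can suppress the rest.
            self.timer = random.uniform(0.0, max_response_time)
        self.state = State.DELAYING

    def report_received(self):
        """report received: another member answered, so suppress our own Report."""
        if self.state is State.DELAYING:
            self.timer = None
            self.last_reporter = False
            self.state = State.IDLE

    def timer_expired(self):
        """timer expired: send the pending Report and become idle."""
        if self.state is State.DELAYING:
            self.sent.append("report")
            self.last_reporter = True
            self.timer = None
            self.state = State.IDLE
```

Keeping the messages in a list instead of writing them to a socket makes the transitions easy to inspect; a real DMIF Instance would hand them to the network and drive `timer_expired()` from its own timer thread.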
Figure 5-3 Multicast Session Initiation (message sequence between the Application, the DAI, the Originating DMIF, and the Multicast Network: DAI_ServiceAttach with response, periodic retrieval of MPEG-4 specific information (i.e., IOD, OD, ES_ID-CAT mapping), DAI_ChannelAdd with response, DAI_UserCommandAck "Play", IGMP Join message and IGMP Membership Report, multicast media data routed to the host interface)

Figure 5-4 shows the termination process for a DMIF multicast scenario. As with the previous case, the session termination calls to the DAI by the application can be combined into a single call, but have been kept separate for conformity. If the host that is leaving the multicast group is the last host on the local network, then the neighboring router will cease to forward the media data onto the network. Otherwise, the current host will simply ignore all media data pertaining to the terminated session.

Figure 5-4 Multicast Session Termination (message sequence between the Application, the DAI, the Originating DMIF, and the Multicast Network: DAI_ChannelDelete with response, DAI_ServiceDetach with response, IGMP Leave message)

5.3 SIMULATION RESULTS

The simulations performed in this section were done using OPNET Modeler. The network that was used is shown in Figure 5-5. The network contains five subnets: one sender subnet and four receiver subnets.
The sender (Admin_Sender) transmits an exponentially distributed video feed to each of the receivers in the network. There are five receivers in the network at all times, all receiving the same media: ECE_Receiver_1, ECE_Receiver_2, Phy_Receiver, Chem_Receiver, and Math_Receiver.

Figure 5-5 OPNET Simulation Network

The following sections show some of the different simulations that were performed using this network and variations of it, in both the unicast and multicast scenarios, to show the differences between the two and the benefits of developing a multicast-enabled DMIF system.

5.3.1 NETWORK CONGESTION

Network congestion can occur at two locations within a network. It can occur in the backbone, where overwhelming traffic can cause loss and delay of delivered data, or it can occur at the edge of the network, in the access networks. Although the locations differ, the effect of these two congestion scenarios is similar at the client (receiver).

5.3.1.1 Backbone Congestion

During these simulations the sender (Admin_Sender) transmits an exponentially distributed 56KBps video feed (this is analogous to the vh64.mp4 video) to each of the receivers in the network (a total of 280KBps). All point-to-point connections between terminals are 100BaseT Ethernet connections (operating at 100Mbps), except for the connection between the two backbone routers, which is a DS1 IP connection (operating at 1.544Mbps). 100BaseT Ethernet cabling does not operate over long distances, thus the alternative must be used. In this simulation scenario it can be seen that the congestion, if present, will exist in the backbone, due to the limited bandwidth through the DS1 connection. The effect on the client will thus be either queuing delay of the received media or media loss through packet dropping at the congested node.
If there exists a queuing mechanism at the ingress node of the network (where the congestion is taking place, e.g., Backbone Router 1), then the media will be delivered to the receiver with a delay but none of the data will be lost. However, if the node does not have a queuing mechanism, or has only a small one, it will be a matter of time before media data arrives that cannot be sent through the next link, and the congested node (i.e., Backbone Router 1) will have no alternative but to drop the media packets and hence induce network loss. The duration of the following simulation was thirty minutes; however, the Admin_Sender only sent media for a period of 21 minutes in both the unicast and the multicast scenarios. In this scenario, 'Backbone Router 1' has a large buffer capacity, thus sent data will never get lost, but the queuing delay will increase linearly while data is being sent at a constant bitrate. The media in the unicast instance has to be sent to each of the receivers separately, and hence requires 2.24Mbps of bandwidth; however, the DS1 connection connecting the backbone routers only supports 1.544Mbps (193KBps).

bandwidth = 56KBps * 5 = 280KBps = 2240Kbps

However, within the multicast scenario, the media is sent once through the backbone and replicated at the egress nodes to the receivers that have joined the multicast session. Thus, only 56KBps is sent through the backbone, and the utilization of the backbone is considerably lower, as seen in Figure 5-6.

Figure 5-6 Unicast vs. Multicast Backbone Utilization (x-axis: Time (sec))

As seen from the figure, the unicast scenario reaches maximum utilization without being capable of transmitting all the required media.
The media will be queued at the ingress router and sent when there is available bandwidth; thus, the media session will take 1690secs (~28mins). As discussed previously, this introduces queuing delay, as seen in Figure 5-7. In order to deliver quality media in the unicast scenario, it is evident that some changes have to be made to the network or media bandwidth. One alternative would be for the server to deny service to the last two clients and thus, with three clients, only require 1.344Mbps, which would be deliverable across the backbone without inducing end-to-end packet delays. Another solution would be to replace the DS1 connection with a more substantial link. However, this incurs additional financial cost, and laying the necessary fibre is not always feasible. In multicast, the server load is also considerably lower than in the unicast scenario.

Figure 5-7 Unicast vs. Multicast Queuing Delay (x-axis: Time (sec))

5.3.1.2 Access Network Congestion

In the following simulation scenario the sender (Admin_Sender) transmits an exponentially distributed 825KBps video feed to each of the receivers in the network (a total of 4.125MBps). All point-to-point connections between terminals are 100BaseT Ethernet connections (operating at 100Mbps), except for the connections to and from the 'ECE_Hub' and the backbone connection. The connections to and from the ECE_Hub are all 10BaseT connections (operating at 10Mbps), and the connection between the two backbone routers is a DS3 IP connection (operating at 44.736Mbps). In this scenario, it can be seen that congestion will occur in the access network, between 'Backbone Router 2' and the 'ECE_Hub'. The reason for this is that the 10BaseT connection cannot support the 13.2Mbps throughput that is required.
bandwidth = 825KBps * 2 = 1650KBps = 13.2Mbps

The utilization of each of the links is shown in Table 5-1. Although the theoretical maximum utilization is 100%, the link will have a maximum throughput that is always below the value of the available bandwidth [Garc00].

Table 5-1 Link Utilization

Rank  Object Name                                    Min  Ave   Max   Std Dev
1     Backbone Router2 <-> ECE_Hub [0] -->           0    45.9  98.7  48.8
2     Backbone Router1 <-> Backbone Router2 [0] -->  0    25    80.2  35.1
3     Admin_Sender <-> Admin_Hub [0] -->             0    11.4  36.9  16
4     Backbone Router2 <-> Chem_Hub [0] -->          0    2.3   7.9   3.2
5     Backbone Router2 <-> Phy_Hub [0] -->           0    2.3   7.9   3.2
6     Backbone Router2 <-> Math_Hub [0] -->          0    2.2   8.3   3.1

The behaviour in this scenario, however, is different from the previous case. Here the backbone is not congested, and thus there will be no queuing delay for any of the clients except for those that are on the ECE subnet. The Admin_Sender sent media data for 10 minutes while the simulation was thirty minutes long. As opposed to the unicast scenario, in the multicast scenario the Admin_Sender only sends the media data once, and it is only replicated at the edge routers when it is required. Thus only 6.6Mbps (825KBps) will be sent to the ECE_Hub, which is within its bandwidth limitation. Once at the ECE_Hub, the data will be replicated and sent across each of the receiver links. Figure 5-8 shows a comparison between the queuing delay for the Chem_Receiver and the ECE_Receiver_1 within the unicast scenario. It can be seen from the figure that the media ends within 10 minutes of its start for the Chem_Receiver, whereas the ECE_Receiver_1 continues to receive queued media until 880secs (~15mins).
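The per-link reasoning of this scenario can be checked with a short calculation. The following Python sketch is an illustration only: the link list and downstream receiver counts are assumptions read off the topology described above (two receivers behind the ECE_Hub, one behind each other hub), 1 KB is taken as 1000 bytes as in the bandwidth formulas, and the loads are nominal, ignoring protocol overhead and the exponential traffic distribution, so they will not match the simulated values of Table 5-1 exactly.

```python
STREAM_KBIT_PER_S = 825 * 8  # 825KBps video feed in kbit/s, taking 1 KB = 1000 bytes

# link name -> (capacity in Mbps, number of receivers downstream of the link)
LINKS = {
    "Admin_Sender <-> Admin_Hub": (100.0, 5),
    "Backbone Router1 <-> Backbone Router2": (44.736, 5),
    "Backbone Router2 <-> ECE_Hub": (10.0, 2),
    "Backbone Router2 <-> Chem_Hub": (100.0, 1),
    "Backbone Router2 <-> Phy_Hub": (100.0, 1),
    "Backbone Router2 <-> Math_Hub": (100.0, 1),
}


def offered_load_mbps(receivers: int, multicast: bool) -> float:
    """Nominal load on a link: one copy per receiver in unicast, one copy in multicast."""
    copies = min(receivers, 1) if multicast else receivers
    return copies * STREAM_KBIT_PER_S / 1000.0


def overloaded_links(multicast: bool) -> list:
    """Names of the links whose nominal load exceeds their capacity."""
    return [
        name
        for name, (capacity, receivers) in LINKS.items()
        if offered_load_mbps(receivers, multicast) > capacity
    ]


if __name__ == "__main__":
    for name, (capacity, receivers) in LINKS.items():
        unicast = offered_load_mbps(receivers, multicast=False)
        multicast = offered_load_mbps(receivers, multicast=True)
        print(f"{name}: unicast {unicast:.1f} Mbps, "
              f"multicast {multicast:.1f} Mbps, capacity {capacity} Mbps")
```

Under these assumptions only the link to the ECE_Hub exceeds its capacity in the unicast case, and no link does in the multicast case, which is consistent with the congestion point identified in the text.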
Figure 5-8 Queuing Delay for two Receivers within Different Subnets (x-axis: Time (sec))

As the data is being sent, the media arrives without delay at the three single-receiver subnets. However, at the ECE subnet, the media will be queued due to bandwidth limitations at the ECE_Hub until the end of the transmission (i.e., when bandwidth becomes available) in order to deliver the queued media. This is shown in Figure 5-9, where the received traffic at the end of the transmission drops to 10Mbps. This is due to the queued traffic at the ECE_Hub that is being sent at the maximum allowable throughput.

Figure 5-9 Unicast vs. Multicast Received Traffic (x-axis: Time (sec))

The solutions that were proposed for the previous scenario also apply here. Both the previous scenarios show the benefit of developing a multicast-enabled hybrid DMIF framework.

5.3.2 MEDIA SCALABILITY

The multicast functionality becomes more significant when we consider the potential for MPEG-4 scalability. As was mentioned previously, there are a number of ways a presentation can be scalable in MPEG-4. Firstly, the media can be compressed with scalable layers and provided to the receivers through various sessions. The receivers, however, will be in control of evaluating and adapting to the network capacity. One such technique is Receiver-driven Layered Congestion control (RLC). This technique has been shown to work well for multicast networks and allow for dynamic bandwidth fluctuations [Vici98]. It supports a variable transmission rate by using a layered organization of data, transmitting each layer in a separate multicast group, and letting receivers adapt to the available bandwidth by joining one or more multicast groups.
Each receiver takes decisions autonomously, but techniques are used to synchronize receivers behind the same bottleneck (and belonging to the same protocol instance), so that they can cooperate in controlling the congestion at the shared bottleneck. Another form of scalability that is available in MPEG-4 is due to the object-oriented presentation of media. The latest scalability mechanism is FGS. These techniques can be used in both the unicast and multicast scenarios. The difference between the two scenarios is that the extension layers will always be delivered through their own sessions. Thus, a client will only subscribe to layers (i.e., sessions) that it can support. The bandwidth saved when going from the unicast to a multicast scenario is the same as the single session case. There will always be an 80% improvement in bandwidth utilization in the multicast scenario with five clients. As the number of receivers increases, the limit of percentage improvement goes to 100%. This can be seen in Table 5-2.

Table 5-2 Bandwidth Improvement based on Number of Receivers

Number of   Percentage     Number of   Percentage
Receivers   Improvement    Receivers   Improvement
1           0.00%          13          92.31%
2           50.00%         14          92.86%
3           66.67%         15          93.33%
4           75.00%         16          93.75%
5           80.00%         17          94.12%
6           83.33%         18          94.44%
7           85.71%         19          94.74%
8           87.50%         20          95.00%
9           88.89%         21          95.24%
10          90.00%         22          95.45%
11          90.91%         23          95.65%
12          91.67%         24          95.83%

In this simulation scenario, the media is assumed to consist of two layers, a base layer and an enhancement layer. The duration of the simulation is thirty minutes and the media is delivered for the length of the simulation. Each of the layers is 16KBps. In the unicast scenario, all five receivers have requested the base layer, and equivalently all five receivers have joined the base layer session in the multicast scenario.
However, only the Phy_Receiver and the ECE_Receiver_2 have requested (joined the session for) the enhancement layer. Thus the bandwidths required in the two cases are:

Unicast: bandwidth = 16KBps * 5 + 16KBps * 2 = 112KBps = 896Kbps
Multicast: bandwidth = 16KBps + 16KBps = 32KBps = 256Kbps

Again the benefits of designing a multicast-enabled MPEG-4 system can be seen through Figure 5-10 and Figure 5-11. The first figure shows the backbone utilization benefits derived through the multicast scenario, while the second figure shows the traffic that is being sent from the server to the receivers. In both scenarios the total media received by all the receivers amounts to 112KBps.

Figure 5-10 Backbone Utilization (x-axis: Time (sec))

Figure 5-11 Unicast vs. Multicast Sent Traffic (x-axis: Time (sec))

5.4 ADVANTAGES OF MULTICAST SYSTEMS

As was evident through the results of the previous section, there are considerable advantages to supporting multicast media delivery. Some of these advantages are server based, in that the server will not have to negotiate or directly communicate with any of the receivers. Another very important factor is the reduced load that the server experiences, which follows the same lines as seen in Table 5-2. However, the major advantages are due to the network impact. Through the simulations, the reduced network bandwidth presented itself as the most prominent advantage in the current state of the World Wide Wait (WWW). Also, as routing technology improves and packet losses become less significant, queuing delays have to be addressed in multimedia data delivery. These unacceptable delays can be minimized or resolved through the use of multicast transmissions.
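The server and backbone savings referred to above follow from simple counting: with N receivers of the same stream, unicast carries N copies and multicast carries one, so the saving is 1 - 1/N. The Python sketch below is a hedged illustration of that reasoning, with hypothetical function names; it reproduces the percentages of Table 5-2 and the sender bandwidths of the two-layer example of Section 5.3.2 (16KBps per layer, five receivers on the base layer and two on the enhancement layer).

```python
def multicast_saving(receivers: int) -> float:
    """Fraction of sender bandwidth saved by multicast for one stream: 1 - 1/N."""
    if receivers < 1:
        raise ValueError("at least one receiver is required")
    return 1.0 - 1.0 / receivers


def sender_bandwidth(layer_rate: float, subscribers_per_layer, multicast: bool) -> float:
    """Total rate the sender emits for a layered presentation.

    Unicast sends one copy of a layer per subscriber; multicast sends each
    layer once, provided at least one receiver has joined its session.
    The result is in the same unit as layer_rate.
    """
    if multicast:
        copies = sum(1 for subscribers in subscribers_per_layer if subscribers > 0)
    else:
        copies = sum(subscribers_per_layer)
    return layer_rate * copies


if __name__ == "__main__":
    for n in (1, 2, 5, 24):
        print(f"{n} receivers: {multicast_saving(n) * 100:.2f}% improvement")
    print("unicast:", sender_bandwidth(16, [5, 2], multicast=False), "KBps")
    print("multicast:", sender_bandwidth(16, [5, 2], multicast=True), "KBps")
```

The saving grows towards, but never reaches, 100% as receivers are added, which is why the benefit is most visible for popular sessions.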
Due to the reduced bandwidth requirements in the multicast scenario, the link costs can be significantly reduced; however, this comes at the cost of upgrading the existing router technology to support IP multicast. These advantages outweigh some of the disadvantages that multicast may have. To some extent, multicast sessions can be considered a non-demand service, since they are server initiated. Also, users must view the data at the same time to achieve the advantages associated with multicast. There are, however, suggestions for future work to alleviate some of these issues; some of these are presented in the following chapter. Multicast support is especially important due to the direction in which the Internet is moving. Since their inception, multicast protocols have been slowly gaining popularity, and more and more organizations are willing to bear the additional expense to obtain the tangible benefits. This can be seen through the growth of both the unicast Internet and its multicast counterpart. There have been numerous attempts to quantify the size of the Internet. One of the techniques that seems to provide some insight into this quandary is tracking the number of autonomous systems (AS's). An AS is a network participating in the Internet but with its own routing policy. Thus the Internet can be viewed as a collection of AS's communicating through the Border Gateway Protocol (BGP). Through the use of these BGP tables, with which a complete picture of all the connections can be made, a good estimate of the size of the Internet is determinable. The Multicast BGP (MBGP) provides this information for multicast routing [Mult01, Mant01]. Figure 5-12 shows the percentage of AS's within the Internet that support IP multicast.
The linearity of this graph indicates that the rate at which multicast networks are being deployed is comparable to that of unicast networks. There is still considerable ground for multicast applications to cover before they could compete with unicast connectivity and functionality, but there is a trend towards this, and as more research is done within this field more of the Internet will become multicast enabled. This graph represents the period from April 2nd, 2001 to July 1st, 2001.

Figure 5-12 Multicast Networks within the Internet (x-axis: Time (days))

The data in the previous figure is obtained through Multicast Technologies Inc. databases. The data logging only started on a daily basis in March 2001, and is currently being updated four times daily. This information is obtained through the Multicast Technologies AS 16517.

CHAPTER 6. CONCLUSION

Through advances in technology, the human race seeks solutions for problems that it did not have. The unimaginable growth of the Internet is one such breakthrough. With its widespread adoption, many shortcomings have become noticeable. One such limitation is the capability to efficiently stream multimedia data to multitudes of users. This limitation prompted the direction of this thesis. This thesis has attempted to provide a streaming solution for one of the newest multimedia standards, MPEG-4. This system has also been adapted to a multicast network due to its future potential and benefits. As seen in Figure 5-12, 97% of all AS's on the Internet only support unicast systems, thus the unicast system implementation is crucial, especially for an infant technology such as MPEG-4. This provides a means of streaming complex multi-object presentations in the network framework existent today.
Furthermore, the merits of IP multicast in reducing network load for persistent sources (e.g., video streaming) were also shown. The first section lists the contributions that have been made through this thesis and their impact on the Internet and multimedia research areas. The second section discusses some of the future research that can be derived from this work or performed to enhance these systems.

6.1 CONTRIBUTIONS

The contributions offered through this work can be classified into two categories: those that provide a purely academic and research-furthering basis, and those that provide much needed industrial solutions. At the start of this research, version 2 of the MPEG-4 standard had not been standardized. This version also contained part 6, which comprises DMIF. Due to the early stage of the standardization, there had not been any unifying work on streaming MPEG-4 media based on a fixed standard. Until then, all the work that had been performed had been based on proprietary, application-specific delivery techniques and did not exhibit any of the advantages inherent to the use of DMIF. This provided the challenge of developing a system that could satisfy the demands set out in the delivery standard and provide sufficient communication capabilities to support all the traits associated with MPEG-4 traffic. Through this work, the need for server technology arose, which led to some research in areas including scheduling as well as rate control. The need for efficient multimedia delivery then led to the design of a multicast-enabled DMIF layer that provides the same advantages on a more global scale. This work resulted in the first DMIF-based MPEG-4 unicast streaming system. The code has since been donated to the ISO MPEG group and is now part of the reference software for the standard.
Through a demonstration that was given to the MPEG standardization committee, the conformance and feasibility of this system were shown, not only through its operation with the IM1 software but also through its integration with commercial MPEG-4 client software. The previously discussed work has resulted in three papers being published [Alnu01, Asra01, Pour01]. Through these publications and the association with the MPEG committee, the software is now referenced through many of the MPEG-4 sites. The success of the unicast scenario led to the design of a hybrid DMIF that supports multicast networks. This system takes advantage of the properties of multicast as well as those inherent to MPEG-4. Chapter 5 discussed the design of this system and the advantages associated with such a development. Copies of the published papers and the various components of the system software can be obtained at the following website: http://lan.ece.ubc.ca/apadana.html/. This code, along with the IM1 software, can be used to recreate the results obtained here and to further enhance the field of streaming MPEG-4.

6.2 FUTURE RESEARCH

There are numerous avenues that can be studied based on this research, both in the area of multimedia technology delivery and in underlying network development. Most of the research directions that are mentioned deal with multimedia delivery based on various existing network constraints. The designs for the unicast and multicast systems present opportunities to adapt the DMIF framework for different delivery scenarios. Two such situations are wireless and satellite DMIF implementations.
There have been a few wireless MPEG-4 solutions, one of which is Packet Video's system; however, an implementation and design based on DMIF would add all the advantages associated with DMIF and ensure compliance with a standard, which in turn means interoperability. The satellite scenario, on first inspection, appears to be straightforward. This is due to the high bandwidth availability, which ensures high-quality video in exchange for an initial delay. However, this delay is only an issue during control plane signaling; once the media is being sent, only the first packet will experience a noticeable delay, as the others are sent consecutively. This type of transmission requires longer buffers at the client to ensure that jitter on account of long distances is not noticeable.

As more work is done on improving the DMIF standard, further issues will be addressed. One area that requires further research is security constraints within DMIF signaling and data transmissions. As with any type of media sent over 'cyber space', security issues are a concern. The most serious factor is the ease with which different objects could be combined to make up a presentation. For example, an audio object could be combined with a facial mesh of a prominent figure to convey misleading information.

With regard to the multicast scenario, restoring some of the unique features of MPEG-4 that have been sacrificed is an important step. An avenue of interest would be an interactive multicast scenario. This would address one of the disadvantages of multicast, namely that it is a non-demand service. Such an idea would require a shift from traditional multicast systems. In an interactive multicast scenario, a convenor (one of the receivers) would be the terminal that creates a session to which the sender will send media.
However, there will always have to be a point-to-point control session available between the convenor and the sender. As such, the media data will be sent over a typical multicast session, but the convenor will have control over the media content. This scenario is evidently not identical to the unicast system, where each receiver has the opportunity to control its presentation, but it addresses the issue of interaction in a group situation. It is often the case that only one member of the group would require multimedia control over the session.

Another adaptation for the multicast system would be to develop a hierarchical client stream merging or skimming technique to reduce the bandwidth used by the server and the network [Eage00, Eage01]. It is often the case that most of the receivers join the required multicast session prior to its start. For the receivers that join the session after the start, the technique proposed in this thesis means that they are able to receive only the remainder of the presentation. However, with client-based stream merging, the server would provide VOD to all clients that join later than the start of the multicast session. This media would be sent at a bitrate faster than the media bitrate to catch up to the first multicast session that is already in progress. The play rate, of course, will be that of the media, thus the client would require substantial media buffers for this technique to be feasible. Once the client has caught up to the multicast session, it can join the multicast session and continue to receive the remainder of the presentation, and the individual session can be closed. This reduces the overall server bandwidth used.

Through this research it has become noticeable that the current state of the Internet leaves a great deal to be desired with respect to streaming technologies.
Ideally, the Internet should become a transparent medium that is accessed much like a local medium. In the long run, this is possible through fully optical networks. However, research in the areas of QoS support through IntServ, DiffServ, and reservation protocols can potentially solve many of the uncertainty issues that are inherent to the Internet today [Fall01].

With the current sophistication of multimedia technologies and the complexity and speed of networks, we are still on the brink of what will be the next technological revolution. Gradually, science fiction will become science fact. The bounds of research in this field are limitless.

ABBREVIATIONS

Expression  Brief Description                                 Location
AS          Autonomous System                                 5.4
AU          Access Unit                                       2.1.1
BGP         Border Gateway Protocol                           5.4
BIFS        Binary Format for Scenes                          2.1.1
CAT         Channel Association Tag                           3.3.2
DA          DMIF Application                                  3.3.3.1
DAI         DMIF Application Interface                        2.2
DDSP        DMIF Signaling Protocol                           2.2
DLL         Dynamic Link Library                              3.2
DMIF        Delivery Multimedia Integration Framework         2.1.3
DN          DMIF Network protocol stack                       3.3.3.1
DNA         DMIF Network Access                               3.3
DNI         DMIF Network Interface                            2.2.1
DPI         DMIF Plug-in Interface                            3.2
DS          DMIF Service                                      3.3
ES          Elementary Stream                                 2.1.2
ESD         Elementary Stream Descriptor                      2.1.2
FGS         Fine Granular Scalability                         2.1
FMC         FlexMux Channel                                   2.1.2.2
IETF        Internet Engineering Task Force                   2.2.3
IGMP        Internet Group Management Protocol                4.3
IM1-2D      Implementation 1 - 2-dimensional rendering        3.1
IOD         Initial Object Descriptor                         2.1.2
ISO         International Organization for Standardization    1.1
LAN         Local Area Network                                4.1
MBGP        Multiprotocol BGP                                 5.4
MBONE       Multicast Backbone                                4.1
MPEG        Moving Picture Experts Group                      1.1
NVOD        Near Video On Demand                              4.1
OD          Object Descriptor                                 2.1.2
OS          Operating System                                  3.6
QoS         Quality of Service                                2.1.1
RLC         Receiver-driven Layered Congestion control        5.3.2
RMP         Reliable Multicast Protocol                       4.1
RR          Round-Robin Scheduling                            3.6
RTP         Real-time Transport Protocol                      3.3
RTSP        Real Time Streaming Protocol                      2.2.3
SCE         Single Connection Emulation                       4.1
SAP         Session Announcement Protocol                     4.1
SD          Scene Descriptor                                  3.2
SDP         Session Description Protocol                      4.1
SL          Sync Layer                                        2.1.2
SPI         Server Provider Interface                         3.4
TAT         Transmux Association Tag                          3.3.2
TCP         Transmission Control Protocol                     2.2.2
UDP         User Datagram Protocol                            3.2
VOD         Video on Demand                                   3.2
WFQ         Weighted-Fair Queuing                             3.6

BIBLIOGRAPHY

[Alnu01] Y. Pourmohammadi, K. Asrar-Haghighi, A. Kaheel, H. Alnuweiri, S. Vuong, "On the Design of a QoS-Aware Multimedia Server," IEEE International Symposium on Telecommunications 2001, September 2001.

[Asra01] K. Asrar Haghighi, Y. Pourmohammadi, H. M. Alnuweiri, "Realizing MPEG-4 Streaming Over the Internet: A Client/Server Architecture using DMIF," IEEE Proceedings of the Information Technology: Coding and Computing Conference 2001, April 2001.

[Batt99] S. Battista, F. Casalino, and C. Lande, "MPEG-4: A Multimedia Standard for the Third Millennium, Part 1," IEEE Periodicals, 1999.

[Chad96] N. Chaddha and A. Gupta, A Frame-work for Live Multicast of Video Streams over the Internet, Computer Systems Laboratory, Stanford University, 1998.

[Chaf98] G. Chaffee, IP Multicast and Multicast Routing, Berkeley Multimedia Research Center, University of California at Berkeley, available at http://bmrc.berkeley.edu/people/chaffee/advnet98/mcast.ppt, March 1998.

[Chan97] S. Chang, A. Eleftheriadis, D. Anastassiou, H. Jacobs, H. Kavla, and J. Zamora, "Columbia's VoD and multimedia research testbed with heterogeneous network support," J. Multimedia Tools Application, vol. 5, no. 4, pp. 385-431, May 1997.

[Civa01] R. M. Civanlar, V. Balabanian, A. Basso, S. Casner, C. Herpel, C. Perkins, "RTP Payload format for MPEG-4 Streams," Internet-Draft draft-ietf-avt-rtp-mpeg4-01.txt.

[Dang96] W. Ding and B. Liu, "Rate Control of MPEG Video Coding and Recording by Rate-Quantization Modeling," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, pp. 12-20, February 1996.

[Eage00] D. Eager, M. Vernon, J. Zahorjan, "Bandwidth Skimming: A Technique for Cost-Effective Video-on-Demand," Proceedings of Multimedia Computing and Networking 2000, January 2000.

[Eage01] D. Eager, M. Vernon, J. Zahorjan, "Minimizing Bandwidth Requirements for On-Demand Data Delivery," IEEE Transactions on Knowledge and Data Engineering 2001, September 2001.

[Enca99] "Gutenberg, Johannes," Microsoft Encarta Encyclopedia 99. © 1993-1998, Microsoft Corporation. All rights reserved.

[Fall01] Y. Pourmohammadi, Integrating DMIF and Internet Standard Protocols for QoS-Aware Delivery of MPEG-4, University of British Columbia, 2001.

[Fran00] G. Franceschini, "The Delivery Layer in MPEG-4," Signal Processing: Image Communication, vol. 15, pp. 347-363, 2000.

[Garc00] A. Leon-Garcia and I. Widjaja, Communication Networks: Fundamental Concepts and Key Architectures, Boston, McGraw-Hill Companies Inc., 2000.

[Ghan89] M. Ghanbari, "Two-Layer coding of video signals for VBR networks," IEEE J. on Selected Areas in Communications, vol. 7, pp. 771-778, June 1989.

[Herp00] C. Herpel, A. Eleftheriadis, "MPEG-4 Systems: Elementary Stream Management," Signal Processing: Image Communication, vol. 15, pp. 299-320, 2000.

[ISO14496-1] Coding of Audio-Visual Objects - Part 1: Systems, ISO/IEC 14496-1 International Standard, ISO/IEC JTC1/SC29/WG11 N2501, March 2000.

[ISO14496-6] Coding of Audio-Visual Objects - Part 6: Delivery Multimedia Integration Framework (DMIF), ISO/IEC 14496-6 International Standard, ISO/IEC JTC1/SC29/WG11 N2501, March 2000.

[Koen99] R. Koenen, "MPEG-4 Multimedia of our Time: The Latest Multimedia Standard Excels Audiovisually, Husbands every bit, and Invites the Viewer to Join the On-Screen Action," IEEE Spectrum Magazine, February 1999, available at www.cselt.it/mpeg/documents/koenen/mpeg-4.htm.

[Koen00] R. Koenen, "MPEG-4 Overview - V.16 - La Baule Version," available at www.cselt.it/ufv/leonardo/mpeg/standards/mpeg-4/mpeg-4.htm, October 2000.

[Kiku01] Y. Kikuchi, T. Nomura, S. Fukunaga, Y. Matsui, H. Kimata, "RTP payload format for MPEG-4 Audio/Visual streams," Internet-Draft draft-ietf-avt-rtp-mpeg4-es-04.txt.

[Leeh00] Hung-Ju Lee, Tihao Chiang, and Ya-Qin Zhang, "Scalable Rate Control for MPEG-4 Video," IEEE Transactions on Circuits and Systems for Video Technology, vol. 10, no. 6, pp. 878-894, September 2000.

[Mant01] Cooperative Association for Internet Data Analysis (CAIDA), "MANTRA - Monitoring Multicast on a Global Scale - Monitoring MBGP Routes," available at http://www.caida.org/tools/measurement/Mantra/route-mon/route-mon-mbgp.html, July 2001.

[Moha01] A. Mohamed, H. M. Alnuweiri, "MPEG-4 Broadcast: A Client/Server Framework for Multi-Service Streaming Using Push Channels," Proceeding of IEEE Multimedia Signal Processing Conference 2001, 2001.

[Mult01] Multicast Technologies Inc, "Multicast Status Web Page," available at www.multicasttech.com/status/, July 2001.

[Pour01] Y. Pourmohammadi, K. Asrar Haghighi, A. Mohamed, H. M. Alnuweiri, "Streaming MPEG-4 over IP and Broadcast Networks: DMIF Based Architectures," Proceeding of Packet Video Workshop 2001, pp. 218-227, May 2001.

[Rama00] S. Raman, A Framework for Interactive Multicast Data Transport in the Internet, University of California at Berkeley, 2000.

[RFC1112] S. Deering, "Host Extensions for IP multicasting," Request for Comment 1112, August 1989.

[RFC2236] W. Fenner, "Internet Group Management Protocol, Version 2," Request for Comment 2236, November 1997.

[RFC2327] M. Handley and V. Jacobson, "SDP: Session Description Protocol," Request for Comment 2327, April 1998.

[RFC2974] M. Handley, C. Perkins, and E. Whelan, "SAP: Session Announcement Protocol," Request for Comment 2974, October 2000.

[Scha00] M. van der Schaar, H. Radha, and C. Dufour, "Scalable MPEG-4 Video Coding with Graceful Packet-Loss over Bandwidth-Varying Networks," IEEE Conference Proceedings, 2000.

[Talp95] R. Talpade, M. H. Ammar, "Single Connection Emulation (SCE): An Architecture for Providing a Reliable Multicast Transport Service," IEEE International Conference on Distributed Computing Systems 1995, 1995.

[Tane96] A. Tanenbaum, Computer Networks, New Jersey, Prentice-Hall Inc., 1996.

[Vici98] L. Vicisano, J. Crowcroft, L. Rizzo, "TCP-like Congestion Control for Layered Multicast Data Transfer," IEEE INFOCOM'98, April 1998.

[Walk01] M. Walker, R. Jacobs, and M. Nilsson, "Adaptive Multimedia Streaming over IP," Proceeding of Packet Video Workshop 2001, pp. 15-21, 2001.

[Whet95] B. Whetten, Design, Implementation and Verification of the Reliable Multicast Protocol, Masters Thesis, University of Illinois at Urbana-Champaign, 1995.

[Wuda00] Dapeng Wu, Yiwei Thomas Hou, Wenwu Zhu, Hung-Ju Lee, Tihao Chiang, Ya-Qin Zhang, H. Jonathan Chao, "On End-to-End Architecture on Transporting MPEG-4 Video over the Internet," IEEE Transactions on Circuits and Systems for Video Technology, vol. 10, no. 6, pp. 923-941, September 2000.

[Yang01] D. Yang, W. Liao, Y. Lin, "MQ: An Integrated Mechanism for Multimedia Multicasting," IEEE Transactions on Multimedia, vol. 3, no. 1, March 2001.

[Zhang99] Z. Zhang, S. Nelakuditi, R. Aggarwa, and R. P. Tsang, "Efficient Server Selective Frame Discard Algorithms for Stored Video Delivery over Resource Constrained Networks," IEEE INFOCOM, pp. 472-479, March 1999.

APPENDIX I

The files that make up the implementation of the DMIF Remote Instance DLL can be seen in Table I-1.

Table I-1 Remote Instance DLL Files

File name            Content, comment
RemoteInstance.cpp   DMIF Remote Instance DLL implementation
DN_UDP.cpp           DNI functions (request functions)
DN_common_udp.cpp    Supplementary functions used by DN_udp.cpp
DNUDP.h              The main header file for DNI implementation
DN_definitions.h     Definition of constants used in DDSP messages
Messages.h           DS messages, defined as structures

The files that make up the implementation of the MPEG-4 streaming server can be seen in Table I-2.

Table I-2 Server Files

File name            Content, comment
DN_Server.cpp        Main listening thread (DN Daemon)
DN_udp_threads.cpp   Secondary DN_Daemon threads handling individual client requests
DN_Callback.cpp      DNI callback functions; called by secondary threads
DN_Server_udp.h      The main header file for DNI implementation
DN_definitions.h     Definition of constants used in DS messages
Messages.h           DS messages, defined as structures

The IM1 client program that utilizes the Remote Instance functionality can be seen in Figure I.1.

[Figure I.1: IM1 Client Program]

APPENDIX II

The native DN functions behave similarly, as shown in Figure II.1. The client sends a request to the DN server and blocks on the response. At the server, the DN daemon receives and parses the message. It then spawns a new thread to process the client request and resumes listening for additional requests. The secondary thread services the request by calling DN callback functions and terminates upon sending a response back to the client. This is true for the UDP scenario; in the TCP scenario, the server thread that was created and dedicated to the connection services all of that client's future requests.
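
The thread-per-request behaviour described for the UDP scenario can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: the Request and Response types, runDaemon, and sessionSetupCallback are hypothetical names, received datagrams are simulated by a vector, and the threads are joined at the end only so the sketch can be exercised deterministically.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for parsed messages; not the real DS message types.
struct Request {
    int transactionId;
    std::string name;   // e.g. "SessionSetupRequest"
};

struct Response {
    int transactionId;
    std::string name;   // e.g. "SessionSetupConfirm"
};

// Daemon loop for the UDP case: one short-lived secondary thread per request.
// `incoming` simulates datagrams that were already received and parsed.
std::vector<Response> runDaemon(
        const std::vector<Request>& incoming,
        const std::function<Response(const Request&)>& callback) {
    std::vector<Response> sent;       // responses "sent" back to clients
    std::mutex sentMutex;
    std::vector<std::thread> workers;
    for (const Request& req : incoming) {
        // Spawn a secondary thread, then immediately go back to "listening".
        workers.emplace_back([&sent, &sentMutex, &callback, req] {
            Response rsp = callback(req);              // DN callback does the work
            std::lock_guard<std::mutex> lock(sentMutex);
            sent.push_back(rsp);                       // send, then the thread ends
        });
    }
    for (std::thread& t : workers) t.join();           // for the sketch only
    return sent;
}

// Hypothetical callback that always confirms the request.
Response sessionSetupCallback(const Request& req) {
    return Response{req.transactionId, "SessionSetupConfirm"};
}
```

Because each request runs on its own thread, responses may complete in any order, which is one reason the client-side functions described below match responses to requests by identifier.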
[Figure II.1: UML Description of Session Setup (Client, DN Daemon, Secondary Thread; SessionSetupRequest, SessionSetupConfirm, Begin thread, SessionSetupCallback)]

II.1 SESSIONSETUP[CALLBACK]

BOOL DN_SessionSetup(
    long double networkSessionId,
    LPCSTR calledAddress,
    LPCSTR callingAddress,
    int compatibilityDescriptorIn,
    int16 *response,
    unsigned char *compatibilityDescriptorOut)

The purpose of this function is to establish an end-to-end network session with a remote interactive target DMIF. Only the originating DMIF application calls this function. The DN_SessionSetup function will get a handle for a socket that will be used by the DMIF remote instance DLL. This socket is then bound to a local port and is used for all subsequent communications with the target DMIF peer by all the DN client functions. Once the communication medium is established, the function will build the dynamic components of the session setup message, call the getStream function (which is used by all DN functions to create a correct stream for transmission to the target DMIF), and send the created stream in a datagram to the target DMIF. Once the message has been sent, the function will listen for an incoming response on the same port. If the receive function times out, the function will attempt to send the message again. When the response has been received, it will be converted from a data stream into a SessionSetupConfirmMessage using the createObjFromStream function (this conversion facilitates access to the data elements in the received message). The function will ensure that the message was intended for it by verifying the transactionId and messageId. If the received message is correct, a triplet of networkSessionId, socket_handle, and destination address of the target DMIF will be added to the NetSessionList. This list can be accessed later by other client functions to make use of the newly created socket.
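
The resend-on-timeout, verification, and NetSessionList steps can be sketched as below. This is only an illustration: the ConfirmMessage layout, the kSessionSetupConfirmId value, the SendAndReceive stub, and the sessionSetup helper are all assumptions made for the example, networkSessionId is modelled as a 64-bit integer rather than a long double, and treating a mismatched response like a timeout is a choice of this sketch rather than something the text specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical, simplified confirm message; the real message layout differs.
struct ConfirmMessage {
    std::uint16_t transactionId;
    std::uint16_t messageId;
    std::int16_t  response;
};

// One NetSessionList entry: socket handle plus the target DMIF address.
struct NetSessionEntry {
    int socketHandle;
    std::string targetAddress;
};

// networkSessionId -> entry, completing the triplet described in the text.
using NetSessionList = std::map<std::uint64_t, NetSessionEntry>;

// Transport stub: sends the request, then returns false on a receive timeout
// or true once `out` has been filled with a response.
using SendAndReceive = std::function<bool(ConfirmMessage& out)>;

const std::uint16_t kSessionSetupConfirmId = 2;  // assumed value, illustration only

// Sends the request up to maxAttempts times and registers the session only
// after a confirm carrying the expected transactionId and messageId arrives.
bool sessionSetup(std::uint64_t networkSessionId, std::uint16_t transactionId,
                  int socketHandle, const std::string& targetAddress,
                  const SendAndReceive& exchange, int maxAttempts,
                  NetSessionList& sessions, std::int16_t& response) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        ConfirmMessage msg{};
        if (!exchange(msg)) continue;                          // timed out: resend
        if (msg.transactionId != transactionId) continue;      // not our transaction
        if (msg.messageId != kSessionSetupConfirmId) continue; // not a setup confirm
        sessions[networkSessionId] = NetSessionEntry{socketHandle, targetAddress};
        response = msg.response;
        return true;
    }
    return false;                                              // gave up
}
```

Later client functions would look up the socket by networkSessionId in the same list instead of opening a new one.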
The function will then parse the message and extract the target DMIF response. This can then be returned to the calling function in the DMIF remote instance DLL.

int DN_SessionSetupCallback(
    long double networkSessionId,
    LPCSTR calledAddress,
    LPCSTR callingAddress,
    int compatibilityDescriptorIn,
    int16 *response,
    unsigned char *compatibilityDescriptorOut)

This function serves as the corresponding function on the target DMIF application to the DN_SessionSetup function. This function is invoked by the DN server daemon. The createObjFromStream function is used to create an incoming message object. Based on the received data, the function informs the target DMIF layer of the networkSessionId that is to be used for the session. This function determines whether the target DMIF is able to start a new session, and specifies the response that the target DMIF will send to the originating DMIF application. If the target DMIF is unable to initiate a new session, a RESPONSE_ERROR will be sent; otherwise the target DMIF returns a RESPONSE_OK.

II.2 SESSIONRELEASE[CALLBACK]

BOOL DN_SessionRelease(
    long double networkSessionId,
    int16 reason,
    int16 *response)

The purpose of this function is to close all relations with the target DMIF. Once the network session is no longer required, a call to this function will send a message to the target DMIF indicating the networkSessionId to invalidate and the reason for doing so. The socket is determined through the NetSessionList and a message is sent to the target DMIF. When the response has been received, the function will ensure that the message was intended for it by verifying the transactionId and messageId. If the received message is correct, the triplet of networkSessionId, socket_handle, and destination address of the target DMIF will be deleted from the NetSessionList. The client will then close the socket to the target DMIF. The function will then parse the message and extract the target DMIF response.
This can then be returned to the calling function in the DMIF remote instance DLL.

BOOL DN_SessionReleaseCallback(
    long double networkSessionId,
    int16 reason,
    int16 *response)

This function serves as the corresponding function on the target DMIF application to the DN_SessionRelease function. This function is invoked by the DN server daemon and informs the target DMIF layer of the reason for terminating a particular network session. This network session will be identified by its networkSessionId. The target DMIF will invalidate its network session and return a RESPONSE_OK message to the client.

II.3 SERVICEATTACH[CALLBACK]

BOOL DN_ServiceAttach(
    long double networkSessionId,
    int16 serviceId,
    const char *serviceName,
    const char *ddDataIn,
    int16 *response,
    char *ddDataOut)

The purpose of this function is to initialize an end-to-end DMIF service session with a remote DMIF peer. The DN_ServiceAttach function, issued by the originating DMIF, establishes a service session with the target DMIF. This service session is established inside a previously established network session (created during SessionSetup), identified by the networkSessionId. The DN_ServiceAttach function will associate a serviceId tag with a service session. The serviceId and the networkSessionId together uniquely identify a DAI service session. This function accesses the NetSessionList to determine the handle for the SigSocket that was previously created by the DN_SessionSetup function for communicating with the target DMIF (except for DN_SessionSetup, all subsequent functions determine the socket handle using this method). When the response has been received, the function will ensure that the message was intended for it by verifying the transactionId and messageId. The function will then parse the message and extract the target DMIF response. This can then be returned to the calling function in the DMIF remote instance DLL.
The service is maintained by the upper layers and thus the DN will not save a record of the service session that has just been created.

int DN_ServiceAttachCallback(
    long double networkSessionId,
    int16 serviceId,
    const char *serviceName,
    const char *ddDataIn,
    int16 *response,
    char *ddDataOut)

This function serves as the corresponding function on the target DMIF application to the DN_ServiceAttach function. This function is invoked by the DN server daemon and informs the target DMIF layer of the serviceId and its associated networkSessionId that are to be used for initializing the DAI service session. This function informs the DAI of the client request for a new service, and specifies the response that the target DMIF will send to the originating DMIF application. This function has to call a DAI callback function to get the necessary response to the client service-attach request. If the target DMIF is unable to initialize a new service session, a RESPONSE_ERROR will be sent; otherwise the target DMIF returns a RESPONSE_OK.

II.4 SERVICEDETACH[CALLBACK]

BOOL DN_ServiceDetach(
    long double networkSessionId,
    int16 serviceId,
    int16 reason,
    int16 *response)

The purpose of this function is to detach a service session that was previously established using the ServiceAttach primitive. The service session is identified by its serviceId inside the network session that, in turn, is identified by its networkSessionId. When the response has been received, the function will ensure that the message was intended for it by verifying the transactionId and messageId. The function will then parse the message and extract the target DMIF response. This can then be returned to the calling function in the DMIF remote instance DLL. The upper layer will delete the service session that is no longer needed.
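
Since service sessions are tracked by the upper layers and identified by the pair of networkSessionId and serviceId, the bookkeeping could look roughly like the sketch below. ServiceSessionTable and its member functions are hypothetical names introduced for illustration, and networkSessionId is modelled as a 64-bit integer rather than the long double used in the signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>

// Hypothetical upper-layer registry of DAI service sessions.
// The DN layer itself keeps no such record.
class ServiceSessionTable {
public:
    // (networkSessionId, serviceId): the pair that identifies a service session.
    using Key = std::pair<std::uint64_t, std::int16_t>;

    // Records a new service session; returns false if the pair already exists.
    bool attach(std::uint64_t networkSessionId, std::int16_t serviceId) {
        return sessions_.insert(Key(networkSessionId, serviceId)).second;
    }

    // Removes a service session; returns false if the pair was never attached.
    bool detach(std::uint64_t networkSessionId, std::int16_t serviceId) {
        return sessions_.erase(Key(networkSessionId, serviceId)) == 1;
    }

    // Reports whether the given service session is currently attached.
    bool contains(std::uint64_t networkSessionId, std::int16_t serviceId) const {
        return sessions_.count(Key(networkSessionId, serviceId)) == 1;
    }

private:
    std::set<Key> sessions_;
};
```

Keying on the pair means the same serviceId can be reused in two different network sessions without conflict.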
BOOL DN_ServiceDetachCallback(
    long double networkSessionId,
    int16 serviceId,
    int16 reason,
    int16 *response)

This function serves as the corresponding function on the target DMIF application to the DN_ServiceDetach function. This function is invoked by the DN server daemon and informs the target DMIF layer of the serviceId and its associated networkSessionId that are to be used for closing the DAI service session. Upon receiving this callback message, the target DMIF will issue a DA_ServiceDetachCallback and return the response from this function to the client.

II.5 TRANSMUXSETUP[CALLBACK]

BOOL DN_TransMuxSetup(
    long double networkSessionId,
    GenericMsgLoop *TMReqLoop (TAT, direction, qosDescriptor, resources()),
    GenericMsgLoop *TMConfLoop (response, resources()))

This function is used by a DMIF peer to establish one or more Data Plane transport (Transmux) channels inside an existing network session previously established with the target DMIF. The network session is identified by its networkSessionId. The TMReqLoop will contain a list of one or more TATs, directions, qosDescriptors, and resources to be used for the new Transmux channels. The Transmux Association Tag (TAT) identifies a Transmux channel on both peers. The TAT has to be created at the originating DMIF, and its value is therefore set to the port number used for the new data plane socket. These data channels are unidirectional, and hence a direction field is required as well. The qosDescriptor is set based on the information contained in the qosDescriptors passed in the DA_ChannelAdd() and related to the Elementary Streams being carried in the Transmux channel. The resources() parameter contains the description of the network resources to be reserved for the Transmux channel. Prior to sending this message, the originating DMIF will create a new socket for each of the requested Transmux channels.
When the response has been received, the function will ensure that the message was intended for it by verifying the transactionId and messageId. The function will then parse the message and extract the list of responses and resources that the target DMIF has returned. The received parameters correspond to the list that the originating DMIF had sent. The Transmux channels that resulted in a RESPONSE_OK from the target DMIF are added to the NetSessionList and associated with their corresponding networkSessionId. This allows the DN package to determine which Transmux channels are in use. The Transmux channels that resulted in a RESPONSE_ERROR, however, will not be added to the NetSessionList and their sockets will be closed by the originating DMIF. These responses and resources can then be returned to the calling function in the DMIF remote instance DLL.

BOOL DN_TransMuxSetupCallback(
    long double networkSessionId,
    GenericMsgLoop *TMReqLoop,
    GenericMsgLoop *TMCnfLoop)

This function serves as the corresponding function on the target DMIF application to the DN_TransMuxSetup function. This function is invoked by the DN server daemon to inform the target DMIF of the new Transmux channels that have been requested by the originating DMIF. The target DMIF will possibly complete and update the resources parameter and reply to the originating DMIF peer with a response code. A RESPONSE_OK will be sent for each Transmux channel that the target DMIF peer is able to set up.

II.6 TRANSMUXCONFIG[CALLBACK]

BOOL DN_TransMuxConfig(
    long double networkSessionId,
    GenericMsgLoop *TMReqLoop (TAT, ddDataIn()),
    GenericMsgLoop *TMCnfLoop (response))

DN_TransMuxConfig is issued by the originating DMIF to reconfigure one or more Transmux channels previously established inside a network session. The Transmux channels are identified by their TAT inside the network session identified by the networkSessionId. For each Transmux channel a tuple of parameters is provided.
The TAT contains the Association Tag of the Transmux channel; ddDataIn() contains the DMIF descriptor carrying the FlexMux information (this field contains the changes in configuration that have to be made for the channel). When the response has been received, the function will ensure that the message was intended for it by verifying the transactionId and messageId. The function will then parse the message and extract the target DMIF responses per channel. These responses can then be returned to the calling function in the DMIF remote instance DLL.

BOOL DN_TransMuxConfigCallback(
    long double networkSessionId,
    GenericMsgLoop *TMReqLoop,
    GenericMsgLoop *TMCnfLoop)

This function serves as the corresponding function on the target DMIF application to the DN_TransMuxConfig function. This function is invoked by the DN server daemon to inform the target DMIF that the configuration of existing Transmux channels must be modified. The target DMIF peer will return a response code for each channel. A RESPONSE_OK will be sent for each Transmux channel that the target DMIF peer is able to reconfigure.

II.7 TRANSMUXRELEASE[CALLBACK]

BOOL DN_TransMuxRelease(
    long double networkSessionId,
    GenericMsgLoop *TATLoop (TAT),
    GenericMsgLoop *RespLoop (response))

This function is issued by a DMIF peer to close all logical channels making use of the one or more indicated Transmux channels. Under normal conditions, it is only invoked when all logical channels related to the indicated TATs inside the network session identified by the networkSessionId have already been detached (see DN_ChannelDelete[Callback]). When the response has been received, the function will ensure that the message was intended for it by verifying the transactionId and messageId. The function will then parse the message and extract the target DMIF responses per channel.
The channels for which the target DMIF peer has sent a RESPONSE_OK will be deleted from the NetSessionList and their sockets will be closed. These TATs will become invalid. These responses can then be returned to the calling function in the DMIF remote instance DLL.

BOOL DN_TransMuxReleaseCallback(
    long double networkSessionId,
    GenericMsgLoop *TATLoop,
    GenericMsgLoop *RespLoop)

This function serves as the corresponding function on the target DMIF application to the DN_TransMuxRelease function. This function is invoked by the DN server daemon to inform the target DMIF that existing Transmux channels must be closed. The target DMIF peer will return a response code for each channel. A RESPONSE_OK will be sent for each Transmux channel that the target DMIF peer closes down successfully.

II.8 CHANNELADD[CALLBACK]

BOOL DN_ChannelAdd(
    long double networkSessionId,
    int16 serviceId,
    GenericMsgLoop *ReqLoop (CAT, direction, channelDescriptor, ddDataIn()),
    GenericMsgLoop *CnfLoop (response, TAT, ddDataOut()))

DN_ChannelAdd is issued by the originating DMIF to open one or more logical channels inside a service session. The service session is identified by the serviceId inside the network session identified by the networkSessionId. For each logical channel to be established, a tuple of parameters is provided; these are derived from parameters passed in the DA_ChannelAdd() and relate to the Elementary Stream being carried in the logical channel. The CAT is a network-session-wide unique identifier assigned by the originating DMIF entity. The direction is set based on the related direction parameter passed in the DA_ChannelAdd(). The channelDescriptor contains the complete description requested for a particular channel as well as possible application-specific descriptors. It is set based on the information contained in the related channelDescriptor passed in the DA_ChannelAdd().
The ddDataIn() contains the related uuData() provided by the DMIF User in the DA_ChannelAdd(). When the response has been received, the function will ensure that the message was intended for it by verifying the transactionId and messageId. The function will then parse the message and extract the target DMIF responses per channel. The response message contains a tuple of parameters. The TAT contains the Association Tag of the Transmux channel carrying the logical channel (see DN_TransMuxSetup[Callback]). These responses can then be returned to the calling function in the DMIF remote instance DLL.

BOOL DN_ChannelAddCallback(
    long double networkSessionId,
    int16 serviceId,
    GenericMsgLoop *CHReqLoop (CAT, channelDescriptor, direction, ddDataIn()),
    GenericMsgLoop *CHCnfLoop (response, TAT, ddDataOut()))

This function serves as the corresponding function on the target DMIF application to the DN_ChannelAdd function. This function is invoked by the DN server daemon to inform the target DMIF to open additional channels inside an existing service session. The target DMIF peer will return a response code for each channel and a corresponding TAT within which to carry the logical channel.

II.9 CHANNELADDED[CALLBACK]

BOOL DN_ChannelAdded(
    long double networkSessionId,
    int16 serviceId,
    GenericMsgLoop *ReqLoop (CAT, channelDescriptor, direction, TAT, ddDataIn()),
    GenericMsgLoop *CnfLoop (response, ddDataOut()))

This function is invoked by the originating DMIF peer to inform the target DMIF that one or more logical channels within a service session have been added. The service session is identified by its serviceId inside a network session, which is identified by its networkSessionId. For each logical channel that has been added a tuple of parameters is provided. The CAT is a unique network-session-wide identifier assigned by the originating DMIF entity. The direction is based on the direction of data flow within the logical channels.
The channelDescriptor contains the complete description requested for a particular channel as well as possible application-specific descriptors. It is set based on the information contained in the related channelDescriptor passed in the DA_ChannelAdded(). The ddDataIn() contains the related uuData() provided by the DMIF User in the DA_ChannelAdded(). The TAT contains the Association Tag of the Transmux channel carrying the logical channel (see DN_TransMuxSetup[Callback]).

When the response has been received, the function will ensure that the message was intended for it by verifying the transactionId and messageId. The function will then parse the message and extract the target DMIF responses per channel. The response message contains a tuple of parameters. The ddDataOut() contains the uuData() given by the DAI of the target DMIF peer. These responses can then be returned to the calling function in the DMIF remote instance DLL.

BOOL DN_ChannelAddedCallback(long double networkSessionId, int16 serviceId, GenericMsgLoop *ReqLoop, GenericMsgLoop *CnfLoop)

This function is the counterpart, on the target DMIF application, of DN_ChannelAdded. It is invoked by the DN server daemon to inform the target DMIF that additional logical channels inside an existing service session have been created. The target DMIF peer will return a response code for each channel and a corresponding ddDataOut().

II.10 CHANNELDELETE[CALLBACK]

BOOL DN_ChannelDelete(long double networkSessionId, GenericMsgLoop *CATReasonLoop (CAT, reason), GenericMsgLoop *Resploop (response))

DN_ChannelDelete is issued by the originating DMIF to close one or more logical channels inside a network session identified by the networkSessionId. For each logical channel to be closed, a tuple of parameters is provided.
When the response has been received, the function will ensure that the message was intended for it by verifying the transactionId and messageId. The function will then parse the message and extract the target DMIF responses per logical channel to be closed. These responses can then be returned to the calling function in the DMIF remote instance DLL.

BOOL DN_ChannelDeleteCallback(long double networkSessionId, GenericMsgLoop *CATReasonLoop, GenericMsgLoop *Resploop)

This function is the counterpart, on the target DMIF application, of DN_ChannelDelete. It is invoked by the DN server daemon to inform the target DMIF to close existing logical channels. The target DMIF peer will return a response code for each channel.

II.11 USERCOMMAND[CALLBACK]

BOOL DN_UserCommand(long double networkSessionId, ddDataClass *ddDataIn, GenericMsgLoop *CATLoop (CAT))

This function only passes the user command, e.g. PLAY, to the DMIF peer. Nothing specific needs to be done in the DAI and DNI layers, and the user data that is passed is opaque to DMIF. Like the other DN functions, this function obtains the sigSocket related to its networkSessionId and sends its DS message through it. The UserCommand function does not expect any response from the DMIF peer; it can therefore return as soon as it has finished sending the data.

BOOL DN_UserCommandCallback(long double networkSessionId, ddDataClass *ddDataIn, GenericMsgLoop *CATLoop)

This function is the counterpart, on the target DMIF application, of the DN_UserCommand function. It is invoked by the DN server daemon and informs the target DMIF layer of the networkSessionId and CATs that are to be used for finding the corresponding service provider to which the user data is routed. DN_UserCommandCallback does not reply to the peer DMIF that has sent the message.
The user data carried in the DS message is opaque to DMIF and is passed to the application layer intact. This function has to call a DAI callback function to hand the user data over.

II.12 USERCOMMANDACK[CALLBACK]

BOOL DN_UserCommandAck(long double networkSessionId, ddDataClass *ddDataIn, GenericMsgLoop *CATLoop (CAT))

This primitive provides functionality similar to UserCommand and, in addition, obtains the response from the DMIF peer. In UserCommand this step is skipped, whereas in UserCommandAck the call to the DN function blocks until the response from the remote DMIF peer has been received.

BOOL DN_UserCommandAckCallback(long double networkSessionId, ddDataClass *ddDataIn, GenericMsgLoop *CATLoop)

This function is the counterpart, on the target DMIF application, of the DN_UserCommandAck function. It is invoked by the DN server daemon and informs the target DMIF layer of the networkSessionId and CATs that are to be used for identifying the service provider to which user data should be sent. In contrast to DN_UserCommandCallback, this function has to reply to the DMIF peer by sending the response it gets from the upper layer (DAI_Callback). The user data carried in the DS message is opaque to DMIF and is passed to the application layer intact. This function has to call a DAI callback function to pass the user data and get the response; when the call to the DAI callback function returns, the DS_confirm message is sent back to the requesting DMIF peer.

The DDSP messaging that takes place between the client and server for the initiation of a service can be seen in Figure II.2. For additional messaging diagrams refer to [ISO14496-6].
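The difference between the two user-command primitives described in the USERCOMMAND and USERCOMMANDACK sections can be illustrated with a minimal C++ sketch. It is not the thesis implementation: the DsMessage layout, the message identifier values, the Peer and DaiCallback types, and the function names SendUserCommand, SendUserCommandAck and MakeTargetPeer are all hypothetical, and an in-memory function call stands in for the signalling socket. The check of messageId and transactionId on the confirm is assumed by analogy with the other DN functions.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical message layout for illustration only; this is not the
// DS message format defined in ISO/IEC 14496-6.
struct DsMessage {
    std::uint16_t messageId;          // identifies the primitive
    std::uint32_t transactionId;      // pairs a confirm with its request
    std::vector<std::uint16_t> cats;  // CATs the command applies to
    std::string ddData;               // user data, opaque to DMIF
};

// Illustrative message identifiers (assumed values).
const std::uint16_t kUserCommand       = 1;
const std::uint16_t kUserCommandAckReq = 2;
const std::uint16_t kUserCommandAckCnf = 3;

// In-memory stand-in for the signalling socket plus the remote peer.
// Returns true and fills `cnf` only when the peer sends a confirm.
typedef std::function<bool(const DsMessage& req, DsMessage& cnf)> Peer;

// Application-level handler reached through a DAI callback (hypothetical).
typedef std::function<std::string(const std::string& uuData)> DaiCallback;

// Fire-and-forget: returns as soon as the message has been handed over.
bool SendUserCommand(const Peer& peer, std::uint32_t transactionId,
                     const std::vector<std::uint16_t>& cats,
                     const std::string& ddDataIn) {
    DsMessage req = {kUserCommand, transactionId, cats, ddDataIn};
    DsMessage ignored = {};
    peer(req, ignored);  // no confirm is expected or inspected
    return true;
}

// Acknowledged variant: waits for the confirm (a synchronous call here) and
// accepts it only if messageId and transactionId match the request.
bool SendUserCommandAck(const Peer& peer, std::uint32_t transactionId,
                        const std::vector<std::uint16_t>& cats,
                        const std::string& ddDataIn, std::string& ddDataOut) {
    DsMessage req = {kUserCommandAckReq, transactionId, cats, ddDataIn};
    DsMessage cnf = {};
    if (!peer(req, cnf)) return false;
    if (cnf.messageId != kUserCommandAckCnf) return false;
    if (cnf.transactionId != transactionId) return false;
    ddDataOut = cnf.ddData;
    return true;
}

// Target side: hands the user data to the application unchanged and replies
// only to the acknowledged variant.
Peer MakeTargetPeer(DaiCallback dai) {
    return [dai](const DsMessage& req, DsMessage& cnf) -> bool {
        std::string reply = dai(req.ddData);
        if (req.messageId != kUserCommandAckReq) return false;
        DsMessage out = {kUserCommandAckCnf, req.transactionId, req.cats, reply};
        cnf = out;
        return true;
    };
}
```

In this sketch a caller would build a peer with MakeTargetPeer, pass it a handler that consumes the opaque user data, and then call SendUserCommand for commands that need no reply or SendUserCommandAck when the application-level response is required. A real implementation would replace the Peer call with a send on the sigSocket and, for the acknowledged variant, a blocking receive.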
[Figure II.2: message sequence chart. Columns, left to right: Originating DMIF Terminal (Application, DAI, DMIF Layer); DNI + Network + DNI; Target DMIF Terminal (DMIF Layer, DAI, Application). Messages shown, in order:
DA_ServiceAttach (IN: DMIF_URL, uuData) (OUT: rsp, ssId, uuData), at the originating DAI;
DN_SessionSetup (IN: nsId, CalledAddr, CallingAddr, CapDescr) (OUT: rsp, CapDescr);
DN_ServiceAttach (IN: nsId, serviceId, serviceName, ddData) (OUT: rsp, ddData);
DA_ServiceAttach (IN: ssId, serviceName, uuData) (OUT: rsp, uuData), at the target DAI.
Annotations on the chart: "determine whether a new network session is needed"; "attach to the service"; "connect to the application running the service"; "the application running the service replies".]

Figure II.2 Initiation of a Service in a Remote Interactive DMIF

APPENDIX III

Figure III.1 shows the client behaviour based on media bitrate. The two lines in the graph represent the same media sequence at different bitrates. The video in the Linda.mp4 sequence has almost twice the bitrate of the vh64.mp4 sequence. In this case the client reads a larger burst of the higher-bitrate video at the start of the sequence. This is simply a characteristic of the IM1 player.

[Figure III.1: plot. Axis label: Time (msec).]

Figure III.1 Client Media Session Buffering Scheme
