UBC Theses and Dissertations

Maximizing buffer and disk utilizations for news on-demand. Yang, Jinhai. 1994.

MAXIMIZING BUFFER AND DISK UTILIZATIONS FOR NEWS ON-DEMAND

By

Jinhai Yang
B.Sc. (Computer Science), University of Science and Technology of China

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
COMPUTER SCIENCE

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
August 1994
© Jinhai Yang, 1994

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Computer Science
The University of British Columbia
2366 Main Mall
Vancouver, Canada
V6T 1Z4

Abstract

In this thesis, we study the problem of how to maximize the throughput of a multimedia system, given a fixed amount of buffer space and disk bandwidth, both pre-determined at design time. Our approach is to maximize the utilization of disk and buffers. We propose two methods. First, we analyze a scheme that allows multiple streams to share buffers. Our analysis and simulation results indicate that buffer sharing could lead to as much as a 50% reduction in total buffer requirements. Second, we develop three prefetching strategies: straightforward prefetching (SP) and two types of intelligent prefetching (IP1 and IP2). All these strategies try to read a certain amount of data into memory before a query is actually activated. We demonstrate that SP is not effective at all, but both IP1 and IP2 can maximize the effective use of buffers and disk, leading to as much as a 40% improvement in system throughput.
We also extend the two intelligent prefetching schemes to a multiple disk environment.

Table of Contents

Abstract  ii
List of Figures  vi
Acknowledgments  vii
1 Introduction  1
1.1 Distributed multiple-user multimedia systems  1
1.2 Motivation and contribution of this thesis  3
1.3 Outline of this thesis  4
2 Preliminary  6
2.1 General framework  6
2.2 Fixed order reading scheme  7
2.3 Determining the lower bound of t  10
2.4 Buffer allocation: determining an upper bound for t  12
2.5 Approximation of non-contiguously placed data  14
2.6 Summary  16
3 Related Work  18
3.1 Data placement  18
3.2 Intelligent disk scheduling  20
3.3 Scheduling and on-line admission control  21
3.4 Summary  23
4 Buffer Sharing  24
4.1 Benefits of buffer sharing  24
4.2 A simple case: all the streams have identical playback rates  25
4.3 General case  28
4.4 Summary  32
5 Prefetching  33
5.1 Benefits of prefetching  33
5.2 Straightforward prefetching strategy (SP)  36
5.3 Intelligent prefetching strategy 1 (IP1)  41
5.4 Intelligent prefetching strategy 2 (IP2)  47
5.5 Summary  51
6 Multiple Disk Environment  53
6.1 System configuration of multiple disk environment  53
6.2 Extension from a single disk environment  54
6.2.1 Handling buffer sharing and prefetching  55
6.2.2 Handling synchronization  57
6.3 Summary  59
7 Performance Evaluation by Simulation  60
7.1 Simulation methodology and program design  60
7.2 Implementation concerns  61
7.2.1 When to activate the admission controller  61
7.2.2 Buffer releasing  62
7.2.3 Transient period  62
7.3 Evaluation of buffer sharing with varying disk rates  63
7.4 Evaluation of buffer sharing with non-contiguous data placement  65
7.5 Evaluation of prefetching  66
7.6 Evaluation of multidisk environment algorithm  69
8 Conclusions  71
8.1 Summary  71
8.2 Future works  72
8.2.1 Improvement of current work  72
8.2.2 New directions of research  73
Bibliography  75

List of Figures

2.1 Fixed order cyclic reading  8
2.2 The curve of disk rate R and cycle length t relation  11
2.3 The buffer requirement curve  13
2.4 Using a continuous curve to approximate block reading  15
4.1 Buffer sharing for three streams with identical consumption rates  26
5.1 Reducing the consumption rate by prefetching  35
5.2 How to decide the reading period length t for prefetching  38
5.3 The control flow of IP1  47
7.1 The benefit of buffer sharing  64
7.2 Handling non-contiguous data placement using approximation  65
7.3 SP vs IP1 vs IP2: relative finish time  67
7.4 SP vs IP1 vs IP2: average disk utilization  68
7.5 The result of the multiple disk algorithm  70

Acknowledgments

First of all, I would like to express my sincere thanks to Dr. Raymond T. Ng, my supervisor, for his guidance, commitment and understanding throughout my research work.

I would also like to thank my co-supervisor, Prof. Samuel T. Chanson. He introduced UBC to me, gave me much help in my early years at UBC, and gave me much wonderful advice on how to do research.

Special thanks also go to Prof. Kellogg Booth for being the second reader of this thesis, and for all the valuable suggestions he gave me.

The financial support from the Department of Computer Science at the University of British Columbia, in the form of a research assistantship under NSERC strategic grants, is gratefully acknowledged.

Chapter 1
Introduction

1.1 Distributed multiple-user multimedia systems

Multimedia systems are computer systems that can support multiple media. Here, "media" refers to text, static graphics, audio data, video data, etc. Over the last several decades, the computational power of modern computers has increased dramatically. However, user interfaces and software technology have not improved at the same pace. Conventional computer systems usually support only text and static graphics, so their application domains are limited. A multimedia computer system, on the other hand, can support not only text and graphics, but also audio and video data. In addition, we expect all the media to work in a consistent and cooperative manner.
This gives users a better opportunity to perform many kinds of tasks that are otherwise not possible. The application of multimedia computer systems has been spreading very quickly in areas such as distance education, teleconferencing, news-on-demand, video-on-demand, computer games, etc.

Multimedia systems on single-machine, single-user platforms have been studied and used for several years. However, research and development in a multiple-user, distributed environment is just getting started. With the advancement of techniques in storage systems, computer networks, compression methods and the promised "Information Superhighway", distributed multiple-user multimedia systems will become a reality in the near future.

Multimedia systems present tremendous challenges to current technologies, especially when a distributed, multiple-user environment is being considered. A multimedia file server is the core of such an environment, and is the focus of this thesis. In a distributed environment, a multimedia file server should have the ability to support multiple users simultaneously. The users can be local users who submit requests directly to the server, or remote users who submit requests through the network. The response time for both the local users and the remote users should be guaranteed. In addition to response time (latency), multimedia requests have a further requirement called the continuity requirement: once a stream (e.g. video) is started, it should not be interrupted until it finishes.

A closer study shows that disk bandwidth and memory are the two most important system constraints in a multimedia file server. How to make efficient use of these two resources is a key performance issue when designing and implementing such servers. Consider a simple example. A digital video stream with acceptable quality requires 30 frames per second. Suppose each frame contains 400 x 200 pixels.
For each pixel, we use 9 bits to represent color. So for each second of this video stream, the data requirement is 30 x 400 x 200 x 9 = 21,600,000 bits, which is about 2.7 Mbytes. Even if we use compression (and assume a compression ratio of 10:1), each second still requires 270 Kbytes of data. For a video stream one hour long, the data requirement is 972 Mbytes! As for the disk bandwidth requirement, if the server needs to support a maximum of 50 users, the disk bandwidth required would be at least 13.5 Mbytes/s.

1.2 Motivation and contribution of this thesis

Many excellent studies regarding multimedia file servers have been conducted [Y+89], [RV93], [LS93], [Gem93], [RW93], [CKY93], [RV91], [GC92], [RVR92], [AOG92], [TB93] and [Pol91]. With respect to the topic of this thesis, these works can be grouped into three major categories. The first group [Y+89] [RV93] [LS93] is primarily concerned with data placement, i.e. how to organize data on disk more efficiently so that the data retrieval speed can be improved. The second group [Gem93] [RW93] [CKY93] studies the issue of intelligent disk scheduling. The basic idea of these approaches is to optimize disk arm movement so that the disk seek time can be reduced. The third group [RV91] [GC92] [RVR92] [AOG92] [TB93] [Pol91] discusses how to build a multimedia file server. The key problems studied by these papers are admission control and on-line scheduling. To a large extent, most of these proposals aim to minimize disk seek latency so as to satisfy the continuity requirements of multimedia streams. Most of the proposed solutions are designed from a static point of view.

Complementary to the problems addressed in the studies mentioned above, in this thesis we study the multimedia file server in a more dynamic environment with randomly arriving queries. A typical example of this kind of system is a multimedia file server for news-on-demand.
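The storage and bandwidth arithmetic from the example in Section 1.1 can be reproduced in a few lines. This is only a sketch of that back-of-envelope calculation; the frame size, bit depth, compression ratio and user count are the example values assumed above, and Kbytes/Mbytes are taken as decimal units as in the text:

```python
# Storage and bandwidth arithmetic for the example video stream of Section 1.1.
FPS = 30                   # frames per second
WIDTH, HEIGHT = 400, 200   # pixels per frame
BITS_PER_PIXEL = 9
COMPRESSION = 10           # assumed compression ratio 10:1
USERS = 50                 # maximum number of concurrent users

bits_per_second = FPS * WIDTH * HEIGHT * BITS_PER_PIXEL     # 21,600,000 bits
raw_mbytes = bits_per_second / 8 / 1_000_000                # 2.7 Mbytes/s raw
compressed_kbytes = bits_per_second / 8 / COMPRESSION / 1_000   # 270 Kbytes/s
one_hour_mbytes = compressed_kbytes * 3600 / 1_000          # 972 Mbytes/hour
bandwidth_mbytes = compressed_kbytes * USERS / 1_000        # 13.5 Mbytes/s
```

Even after 10:1 compression, a single one-hour stream approaches a gigabyte of storage, and 50 concurrent streams already demand 13.5 Mbytes/s of sustained disk bandwidth.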
More specifically, given a fixed amount of buffer space and disk bandwidth, both pre-determined at design time, we study how to maximize the throughput of a multimedia system and minimize the response time of queries.

We propose two key ideas to improve disk and buffer utilization and system throughput: prefetching and buffer sharing. The idea of buffer sharing is to allow multiple streams to dynamically share buffers. Our analysis and simulation results show that this can reduce the overall buffer requirement by up to 50%. The idea of prefetching is to read some data into memory even if the system does not have enough resources to serve the corresponding query at this moment. Because of the prefetched data, the request will have a better chance of being admitted at a later time. Simulation results show that prefetching can improve system performance by up to 40%.

This thesis mainly concentrates on the discussion of these two key ideas. More specifically, this thesis makes the following contributions:

• analyzing buffer sharing in the simple case where all the streams have the same consumption rates;
• analyzing buffer sharing in the more general case in which streams can have different playback rates;
• using simulation to evaluate these two schemes;
• analyzing prefetching;
• proposing three prefetching strategies: SP, IP1 and IP2;
• using simulation to evaluate the proposed prefetching schemes;
• studying extensions of buffer sharing and prefetching to a multiple disk environment.

1.3 Outline of this thesis

The thesis is organized as follows. Chapter 2 defines the basic concepts of our framework and introduces some basic formulas used in our later analysis. Chapter 3 is a brief summary of related work. We list the important work that other researchers have done in this area. Chapters 4 and 5 discuss buffer sharing and prefetching, respectively. The most important results of this thesis are given in these two chapters.
To achieve better performance, using multiple disks is the most obvious solution. Chapter 6 discusses how to extend the results of Chapters 4 and 5 to a multiple disk environment. Chapter 7 describes our simulation package and gives some important simulation results. The last chapter summarizes the results of the thesis.

Chapter 2
Preliminary

In this chapter, some preliminary knowledge (e.g. basic notation, formulas) for our work is given. After a brief description of the general framework, the fixed order reading scheme, which is the basic framework that this thesis is built upon, is described. Following that, the basic formulas for calculating the reading period length t and the buffer requirement B are given. Finally, a scheme for approximating non-contiguous data placement by a contiguous curve is presented.

2.1 General framework

Due to the real-time nature of multimedia data and memory constraints, most multimedia file servers work in a round-robin manner. The server serves all the requests cyclically. In each cycle, the server reads a certain amount of data for each request. This amount of data must be large enough to last the period of the cycle. This requirement is called the continuity requirement for a multimedia stream. If the condition cannot be satisfied, we say a starvation occurs.

In this thesis, we consider the following scenario: a multimedia file server serves multiple user requests concurrently. We call a user's request for multimedia data a query. The multimedia data that a query requests is called a stream. The server maintains a working set and a waiting queue. A query in the working set is being served by the server. The queries that cannot be served at the current time are put in the waiting queue, and
Queries arrive dynamically,and the data streams that queries request may have different lengths and consumptionrates. With the frequent arrival and termination of queries, the server state changesrapidly. There are basically two system states, we call them stable state and transientstate. When there is no admission and/or termination of queries happening, the serveris in the stable state. When the server is in the processing of admitting incoming queriesand/or deleting terminated queries, it is in the transient state. The time period whenthe server is in the transient state is called the transient period. Handling the transientperiod in a multimedia file server is challenging work.2.2 Fixed order reading schemeIn the cyclic reading scheme, the reading order within a period (cycle) can be fixed orvariable. Variable reading order provides a better chance to optimize disk arm movement,so as to minimize total seek time and improve system throughput. On the other hand,the behavior of fixed reading order scheme is more predictable, so that we might be ableto allocate buffer in a more efficient way. In addition, buffer sharing is also easier tohandle in the case of fixed reading order.In this thesis, only the fixed reading order scheme is used. In our initial algorithm,we assume that the data for each stream are continuously placed on the disk. In Section3.4, we will discuss how to remove this constraint.Figure 3.1 shows how fixed order cyclic reading works. In this example, there are threestreams being served. In each cycle, these streams are served in the order: S1,S2, 83. AtChapter 2. Preliminary 8the end of the cycle, there might be certain amount of disk idle time left.Si S2 S3 Idle Si S2 S3 Idle Si S2 S3 IdleFigure 2.1: Fixed order cyclic readingMore generally, let there be n multimedia streams denoted by S1,S2, ..., S. Let theconsumption rate of Stream S be P. Let t be the amount of time required to read Sin each period. 
Let t be the total length of a period, and let be the seek time fromstream S to stream S,. Obviously, in each period:ttl+t2+...+tfl+sl,s3.+ fl1 (2.1)To simplify notation, let s = si,2 + s2,3 + ... + s,i, and P = P1 + P2 + ... + Pn.Disk utilization is defined as:tl+t2+...+tn+S(2.2)Our notation of the disk utilization only refers to the n multimedia streams beingserved. If the system also serves other users, those disk usages are not counted. Thedisk utilization p only represents how busy the disk is. Since the disk time includes bothreading time and seek time, it does not represent how much real work the disk does.The following table summarizes the meanings of the symbols to be used in this thesis.Chapter 2. Preliminary 9Symbol Meaning of symbolBmax maximum number of available buffersB total buffer consumption of n streamsBshar total buffer consumption of n streams with buffer sharingB buffer consumption of stream S2L block size in non-contiguous placementG seek time between non-adjacent blocks in non-contiguous placementP total consumption rate of all the n streamsP2 consumption rate of stream S.jpft adjusted consumption rate of stream S. after prefetchingR maximum disk reading rateSj the i-th streams total switching time within a periodswitching time between stream S2 and stream St length of a cyclereading time for S within a cycleT, length of stream Sp disk utilizationIn the fixed reading order scheme, in each period, the data consumed (played back)by stream S is t x P.s, and the data produced (read) for stream S is t2 x R. The continuityrequirement can be expressed more precisely as follows:t2xRtxP, 1i<n (2.3)In order to avoid overflow and reduce buffer requirement, the ideal case is:Chapter 2. Preliminary 10txR=txP, 1i<n (2.4)From Equation 2.4, we derive:ti Pi1<i,j<n (2.5)j jTo minimize buffer consumption, the reading time for each stream should be proportional to its consumption rate. 
By combining Equations 2.2 and 2.5, t_i can be determined exactly as:

    t_i = (rho x t - s) x P_i / P    (2.6)

2.3 Determining the lower bound of t

Equation 2.6 can be used to calculate t_i in each cycle, provided t is given. A lower bound on t can be established from Equations 2.3 and 2.6:

    t >= (s x R) / (rho x R - P)    (2.7)

This equation leads to two interesting observations. First, the equation is valid only if rho x R - P > 0. Even if the disk utilization rho is set to the maximum value 1, it is still necessary that R > P. This is the most obvious admission control criterion. That is, without violating their continuity requirements, a system cannot admit so many streams that their total consumption rate P exceeds the disk bandwidth R. In Chapter 5, this constraint is relaxed by prefetching.

Second, the required disk utilization is inversely related to t. In other words, the longer the length of the period, the less utilized the disk becomes (which means the disk has more free time). This is because as t increases, the proportion of time wasted in switching (i.e. s/t) within every period becomes smaller. In other words, a longer period corresponds to a higher percentage of useful work (reading) done by the disk, and the disk becomes more effective. Hence, the proportion of idle disk time becomes higher. In Chapter 5, we will show how to make use of this relationship between t and rho to maximize prefetching.

Figure 2.2 helps us understand the relationship between disk rate R and cycle length t. In this figure, the R and t relation curve is drawn under the condition that both P and rho are fixed. It is easy to see that the required disk rate R decreases as the period length t increases. As expected, the curve corresponding to rho = 0.5 is above the curve for rho = 1.

Figure 2.2: The curve of disk rate R and cycle length t relation (curves for rho = 1.0 and rho = 0.5, with period length t on the horizontal axis)
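The cycle-length bound of Equation 2.7 and the per-stream reading times of Equation 2.6 can be sketched as a small helper. This is only a sketch: the function names are ours, and the numeric rates and seek time below are made-up example values, with all quantities in consistent units (e.g. Mbytes and seconds):

```python
def min_cycle_length(s, R, P, rho=1.0):
    """Lower bound on the cycle length t (Equation 2.7): t >= s*R / (rho*R - P).

    s   -- total switching (seek) time per cycle
    R   -- maximum disk reading rate
    P   -- total consumption rate of all admitted streams
    rho -- target disk utilization, 0 < rho <= 1
    """
    if rho * R - P <= 0:
        # The basic admission criterion: total consumption must stay below rho*R.
        raise ValueError("stream set not schedulable: rho*R - P <= 0")
    return s * R / (rho * R - P)

def reading_times(t, s, rates, rho=1.0):
    """Per-stream reading times (Equation 2.6): t_i = (rho*t - s) * P_i / P."""
    P = sum(rates)
    return [(rho * t - s) * P_i / P for P_i in rates]

# Example: three streams at 0.27 Mbytes/s each, a 4 Mbytes/s disk,
# 0.05 s of total seek time per cycle, full utilization (rho = 1).
rates = [0.27, 0.27, 0.27]
s, R = 0.05, 4.0
t_min = min_cycle_length(s, R, sum(rates))   # t_min is about 0.063 s here
t_is = reading_times(t_min, s, rates)
# At the lower bound, each t_i * R exactly equals t_min * P_i (Equation 2.4),
# i.e. every stream reads in each cycle exactly what it will consume.
```

Note how the schedulability check in `min_cycle_length` is exactly the first observation above: no cycle length can rescue a stream set whose total rate P reaches rho x R.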
2.4 Buffer allocation: determining an upper bound for t

Thus far, we have analyzed the handling of multiple streams primarily from the viewpoint of disk bandwidth allocation, and we derived a formula for a lower bound on t. There is, however, another very important issue: the allocation of buffers.

In this section, we show how buffers should be allocated under the condition that no buffers are shared among the streams. Sharing buffers among all the streams is discussed in Chapter 4.

As stated earlier, in this thesis the reading order of all the streams within a cycle is assumed to be fixed. Without loss of generality, we assume that S_1, S_2, ..., S_n are served in that order. If stream S_i is the i-th stream served in period 1, it will still be the i-th stream served in all of the following periods. Under this assumption, the next read for a stream starts exactly t time units after the start time of its current read. So, in each period, each stream must get enough data to last a time of t. (In a variable reading order scheme, this value might be as large as 2t, since a stream can be served first in the current cycle but last in the next cycle.)

As can be observed from Figure 2.3, for stream S_i, the maximum number of buffers is needed right after S_i has just finished reading. Thus, the number of buffers required by S_i is:

    B_i = t_i x R - t_i x P_i    (2.8)

By substituting Equation 2.6 into the above, we get:

    B_i = P_i x (R - P_i) x (rho x t - s) / P    (2.9)

Figure 2.3: The buffer requirement curve

Thus, the total buffer requirement for the n streams is:

    B = sum_{i=1..n} B_i = ((rho x t - s) / P) x sum_{i=1..n} P_i x (R - P_i)    (2.10)

Two important observations can be drawn from Equation 2.10. First, the exact number of buffers each stream needs can be calculated. If the buffer requirement cannot be satisfied, the stream set should not be accepted. On the other hand, giving more buffers than the minimum necessary to a specific stream does not help at all.
All the extra buffers will be wasted.

Second, it is obvious that the longer the period length t, the higher the value of B will be. Since B must be bounded by the maximum available buffer space B_max (i.e. B <= B_max), an upper bound for t can be derived by substituting into Equation 2.10:

    t <= (1 / rho) x ( B_max x P / sum_{i=1..n} [P_i x (R - P_i)] + s )    (2.11)

Combining Equations 2.7 and 2.11, a simple admission control policy can be given.

Admission Control 1

Let S_1, ..., S_n be all the streams in the current cycle, and let S_{n+1} be the stream whose admission is to be decided.

1. Compute the lower bound for t using Equation 2.7 and the upper bound for t using Equation 2.11.
2. If the lower bound is strictly greater than the upper bound, then it is not possible to add S_{n+1} without violating the continuity requirements.
3. Otherwise, S_{n+1} can be admitted to form a new cycle, and any value between the lower and upper bound can be chosen as the length of the new cycle.

When we discuss prefetching in Chapter 5, we will return to the issue of picking a value for t in this range.

2.5 Approximation of non-contiguously placed data

So far, we have assumed that for each stream the data is contiguously placed on the disk; this means that there is no seek during t_i when stream S_i is being read. This data placement policy is possible when using a spiral disk. However, it might not be practical in many other situations. In this section, we show how to use the equations established so far to handle the case when the data are not placed so perfectly.

Figure 2.4: Using a continuous curve to approximate block reading

Consider a simple case first. Assume that data are stored in blocks, the block size is L, and the seek time between adjacent blocks is G. As shown in Figure 2.4, the solid line represents the reading curve of a stream of this kind. To use the equations that we have developed so far, we can approximate the given reading curve by one that assumes
contiguous placement. This approximate curve is a straight line that is always below the original reading curve, but with a slope as large as possible. The dotted line in Figure 2.4 represents the approximate curve. By a simple coordinate-geometry analysis, we can calculate that the dotted line has a slope R_approx given by:

    R_approx = L / (G + L / R)    (2.12)

This value can be used to replace R in all the equations that we have encountered so far.

Note that since the approximate reading curve is always below the actual reading curve, at various points in time the disk may read faster than approximated. This leads to two observations. First, the continuity requirement will not be violated by the approximation, because the disk is never slower than approximated. Thus, there is no need to worry about starvation.

Second, extra buffers are needed to store the data that are retrieved faster than expected. See Figure 2.4 for an illustration. By another simple coordinate-geometry analysis, the number of extra buffers we need is given by:

    B_extra = L - L^2 / (L + G x R)    (2.13)

B_extra is less than the block size L. In many cases, this is not a big burden on the system.

Similar approximations can be applied to other non-contiguous data placement situations. We omit those analyses here.

The above method is not the only method of approximation. There are many other ways to do it, too. For example, if a short overload (dropping one or two frames) is allowed, the approximation curve need not always be below the real reading curve. It can pass right through the middle of the real reading curve. This method is, in some sense, more accurate than the above one.

2.6 Summary

In this chapter, the fixed reading order scheme, which is the basic framework this thesis is built upon, was described. First, after describing the general framework this thesis is based on, we introduced the basic notation and formulas. Then, we analyzed disk scheduling
and buffer allocation, and gave the formulas for calculating the cycle length t and the buffer requirement B. Based on these analyses, we gave a simple admission control policy based on the upper and lower bounds of t. Finally, in Section 2.5, we proposed a scheme for approximating non-contiguously placed data by a contiguous curve.

Chapter 3
Related Work

Much excellent work has been done in the area of multimedia file servers. With respect to the topics discussed in this thesis, this work can be grouped into the following categories:

• Data placement
• Intelligent disk scheduling
• On-line scheduling and admission control

This chapter gives a brief summary of the work in these areas.

3.1 Data placement

The speed of retrieving multimedia data from a disk depends to a large extent on how the data are organized on the disk. Many researchers have studied how to wisely organize data on disk so that retrieval can be performed under the constraint of the continuity requirement.

If there is only one stream on the disk, the solution is quite straightforward. Based on the consumption rate requirement, the block size (length of contiguously stored media) and scattering parameter (length between two adjacent blocks) can be calculated. Then all the data can be stored accordingly.

Several papers have discussed how to store two or more streams on the same disk. When the streams are played back, the continuity requirement should be satisfied for all the streams.

Yu and his colleagues [Y+89] discussed how to merge two or more media streams on one disk in the strictest manner. The key idea in their method is to fit each stream into the other one's gaps, and the continuity requirement must be strictly satisfied at any time point for any stream. In their scheme, a logical block is not allowed to be broken into small pieces. This method does not need buffering and read-ahead.
However, multiple media streams usually do not fit together well enough to satisfy this merging condition.

Rangan and Vin [RV93] studied the same problem, but their approach is quite different. In their model, buffering and read-ahead are used to streamline the video stream so that the continuity requirement is relaxed. Their requirement is that, over a certain time period constrained by the buffers, the actual data read from the disk should be greater than the sum of the playback rates of all the streams. In addition, a logical block can be broken into small pieces to increase disk utilization. This method can achieve better disk utilization than Yu's approach (theoretically, it can approach 100%). However, the computing and buffering overhead should not be underestimated.

There is a serious drawback in both of the above methods: they both assume the stream accessing patterns to be fixed. For example, if two streams A and B are merged together, they are supposed to be played back at the same time. This is not practical in an interactive environment. If stream A is played back now and a request for stream B comes ten minutes later, the system may not be able to handle this situation properly, and the continuity requirement could be violated.

Lougher and Shepherd [LS93] studied how to use a disk array (also called striped disks) to increase disk bandwidth. In their experimental implementation, they have one dictionary disk and two data disks. The dictionary disk is used to store metadata. The real multimedia data are stored on the data disks. Editing is performed at the metadata level. The real multimedia data on the data disks are never changed after being placed.

3.2 Intelligent disk scheduling

Intelligent disk scheduling schemes like SCAN, C-SCAN, SSTF and EDF have been used in conventional file servers for a long time.
Recently, several researchers have studied the possibility of using intelligent disk scheduling techniques in the multimedia file server environment [Gem93] [RW93] [CKY93].

Schemes like SCAN and SSTF can optimize disk arm movement, reduce seek time and improve system throughput. However, these schemes do not consider the special real-time requirements of multimedia requests.

Reddy and Wyllie [RW93] proposed an algorithm called SCAN-EDF. This algorithm combines the features of SCAN-type seek optimizing algorithms with Earliest Deadline First (EDF) real-time scheduling algorithms. The SCAN-EDF scheme guarantees that the real-time requirement of a user request will not be violated. Under this constraint, it tries to optimize disk arm movement.

The other two approaches [CKY93] [Gem93] are quite similar to the SCAN-EDF scheme. These two schemes are called the sort-set algorithm (SSA) [CKY93] and group sweeping scheduling (GSS) [Gem93]. The basic idea is this: in a pure SCAN-type scheme, the serving order of all the requests in the task set is rearranged in order to optimize disk arm movement. If the task set is large, the serving time for a particular request may vary greatly between the current and the next cycle. Due to this uncertainty in serving time, sometimes the continuity requirement of the request may not be satisfied. In the SSA and GSS schemes, the whole task set is divided into several groups. The groups are served in a fixed order. However, within each group, SCAN or some other scheme is used to optimize disk arm movement.

Although buffer requirements are analyzed in almost all these schemes, optimizing disk arm movement is their major goal, not buffer usage.

3.3 Scheduling and on-line admission control

There are many papers that study the problem of scheduling and admission control in a multimedia file server environment.
As first observed in [RV91] and [GC92], the best way to do scheduling in a multimedia file server is to do it in a round-robin (cyclic) manner. The key issues are how to decide the cycle length and how to serve all the requests within each cycle. [RV91] and [GC92] only deal with a simplified scenario which assumes that all the users' requests have the same playback rates. In addition, they assume that all the data blocks have the same size. Under these conditions, a simple round-robin scheme is sufficient.

Rangan's group [RVR92] proposed a scheduling scheme called Quality Proportional Multi-subscriber Servicing (QPMS). According to this scheme, during each reading cycle, the data read for each user is proportional to its playback rate. However, QPMS statically tests whether a task set is schedulable or not. The problem of the transient period is also introduced in this paper. The transient period is the length of time during which a new request is added to the server or a finished request is deleted from the server. In this period, the continuity requirements of all the requests already being served should not be violated. A simple scheme is given in this paper to deal with this problem. However, this simple scheme will usually result in a long transient period.

Anderson et al. [AOG92] described the design of a multimedia file system called CMFS. First, an interface is designed for the user. When the user submits a request, the system creates a "session". A session has information such as the file name, required data rate, buffer requirement, the amount of read-ahead allowed, etc. The CMFS uses this information to do an "acceptance test". If the request passes the test, the system will accept it and re-schedule all the "sessions" in the system. The key part of CMFS is the "acceptance test", which is done statically, based on the worst-case situation.
However, the actual scheduling algorithm uses some dynamic methods to improve performance.

Polimenis [Pol91] described a file system that supports multimedia data. Its approach to admission control is similar to the "acceptance test" in [AOG92]. A schedulable task set is constrained by two criteria: the total consumption rate should be less than the disk bandwidth, and the total read-ahead should be bounded by the available buffer space. These two factors are related to each other. A faster disk will require a smaller buffer space. On the other hand, more buffer space will usually reduce the number of seeks and increase the actual disk transfer rate. This paper gives a detailed analysis of these problems. The problem of the transient period is also analyzed more thoroughly, and a good solution is given. However, as will be shown later, the approach of this paper differs from ours in two ways. First, the buffer management policy is static and is performed per task, not dynamically and globally. Second, the author did not consider looking ahead in the request queue (prefetching).

3.4 Summary

As we noticed in Section 3.1, a static placement scheme does not work well if the accessing pattern is dynamic. For all the techniques described in Section 3.2, optimizing disk arm movement (seek time) is the big concern; minimizing buffer usage does not get much attention. Section 3.3 describes several scheduling and admission control algorithms. Those algorithms all work in more or less the same manner; however, none of them studies sharing buffers among all the streams or looking ahead in the request queue (prefetching). In the following chapters, we will address the above limitations.

Chapter 4

Buffer Sharing

4.1 Benefits of buffer sharing

In the previous chapter, a scheme to do admission control and buffer allocation was established. Multimedia systems require a large amount of buffers.
Given a fixed amount of buffer space, predetermined at design time, maximizing buffer utilization is a very important design goal. In this chapter, we will study how to share buffers among the served streams so that buffer utilization can be improved.

As defined in Equation 2.9, the total buffer requirement of n streams is based on the assumption that each stream S_i occupies B_i buffers throughout each cycle. However, S_i may not need all the B_i buffers at all times during the cycle; B_i is just the peak buffer requirement.

Further study shows that when one of the streams (say S_i) requires its maximum number of buffers, all the other streams do not need their maximum number of buffers at that moment. Thus, a simple way to minimize total buffer consumption and maximize buffer utilization is to allow the n streams to share buffers.

In the following sections, we will first show how buffer sharing works in a simplified situation in which all the streams have the same consumption rates. Then we generalize to the more general situation of buffer sharing among streams with different consumption rates. Some implementation considerations will be discussed at the end of this chapter.

4.2 A simple case: all the streams have identical playback rates

First, we study a simple case in which all the streams have the same consumption rates. Figure 4.1 shows three streams S_1, S_2, S_3 in the cycle, all of which have the same consumption rate. Thus, by Equation 2.9, each stream has an equal amount of reading time, i.e. the same t_i. Since the cycle length t is normally much larger than the total switching time s, here we consider the simplified situation in which t_i = t/3. Let us consider the total buffer requirement at time 4t/3, at which point S_1 has just finished reading and requires b buffers, the maximum number of buffers that it ever needs. S_2, which is about to start reading, has run out of data. Thus, the buffer requirement of S_2 is 0.
As for S_3, it had b buffers at time t, and by time 5t/3 all the data in those buffers will have been consumed. Thus, at the current time 4t/3, S_3 needs b/2 buffers. Hence, the total number of buffers required by all three streams is b + 0 + b/2 = 3b/2. Note that if all the streams have identical consumption rates (and we assume p equals 1), their total buffer requirement does not change with time. Thus, 3b/2 buffers are all the three streams need. However, without buffer sharing, 3b buffers are required. Thus, buffer sharing gives a 50% reduction in total buffer consumption.

In the following, we will analyze the situation when there are n streams with identical consumption rates and the disk utilization is 1, which means each stream will get t/n time for reading and there is no disk idle time.

Since the consumption rates are the same for all the streams, the reading time t_i for each stream is also the same, say equal to t_0. Similarly, the buffer requirement B_i is the same for each stream, say equal to b. Now, consider the time when S_n has just finished reading. The buffer requirement for each stream is listed in the following table:

Figure 4.1: Buffer sharing for three streams with identical consumption rates

Stream | Buffers needed
S_1    | 0
S_2    | (1/(n-1)) b
S_3    | (2/(n-1)) b
...    | ...
S_n    | b

First, S_n has just finished reading; thus it requires all b buffers. S_1 is about to start reading, so it has 0 buffered data at this time. S_2, at an earlier point of time, had b buffers of data, which are supposed to cover the consumption of S_2 for a period of (n - 1) × t_0. At the point when S_n has just finished reading, (n - 2) × t_0 of that period has elapsed, or alternatively, S_2 will run out of data t_0 seconds later. Thus, the current level of buffered data for S_2 is (t_0/((n-1) × t_0)) × b = (1/(n-1)) b. Similarly, it is not difficult to see that the current level of buffered data for S_i is ((i-1)/(n-1)) b. Hence, the total buffer requirement for all the n streams is:
B_shar = Σ_{i=1}^{n} ((i-1)/(n-1)) b = (n/2) b    (4.1)

Since the buffer requirement does not change with time¹, the above number is the maximum buffer requirement of the n streams. Remember that in the no-sharing case the buffer requirement is nb; thus buffer sharing achieves a 50% savings in buffer space when p = 1.

Now we relax the assumption that p = 1, and study buffer sharing in a more general situation. We still calculate the buffer required for each stream at the time S_n has just finished reading. The above table can be generalized to:

Stream | Buffers needed
S_1    | c b
S_2    | (c + p/(n-p)) b
S_3    | (c + 2p/(n-p)) b
...    | ...
S_n    | (c + (n-1)p/(n-p)) b

In this table, c = n(1-p)/(n-p). As we will prove later, if p ≠ 1, the maximum buffer requirement will not be a constant throughout the whole period. However, it will occur at the moment S_n has just finished reading. Summing the numbers in the table together, we calculate the total buffer requirement as follows:

B_shar = Σ_{i=1}^{n} (c + ((i-1)p)/(n-p)) b = ((2n - np - p)/(2(n - p))) × nb    (4.2)

¹Under the condition p = 1. We will prove this later.

Notice that if p = 1, B_shar will be nb/2, which means the 50% savings in buffer space occurs when the disk utilization is 1, agreeing with Equation 4.1. Moreover, we can see that as p decreases, B_shar will increase, which means 50% is the maximum saving we can achieve using buffer sharing.

Using Equation 2.9, we can substitute for b in Equation 4.2. The total buffer requirement can then be expressed as:

B_shar = (R - P) × (t × p - s) × ((2n - np - p)/(2(n - p)))    (4.3)

This equation can replace Equation 2.10 (and thus Equation 2.11) in the admission control test shown in Section 2.4.

This completes the analysis of the case in which all the streams have identical consumption rates. Simulation results of this buffer sharing scheme will be discussed in Chapter 7.

4.3 General case

In this section, we will discuss buffer sharing in the most general case. The constraint that all the streams have identical playback rates will no longer be required.
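Before setting up the general-case notation, the identical-rate result of Section 4.2 can be sanity-checked numerically. The sketch below is our illustration, not part of the thesis: it ignores switching time, uses arbitrary parameter values, simulates the steady-state buffer level of each stream at every read-completion epoch, and compares the peak total against the closed form of Equation 4.2.

```python
# Check the identical-rate bound B_shar = ((2n - np - p) / (2(n - p))) * n * b
# against a direct steady-state simulation (switching time ignored).

def shared_buffer_peak(n, p, b, t=1.0):
    """Peak total buffer level, evaluated at each read-completion epoch.

    Streams S_1..S_n read back to back for t0 = t*p/n each at the start of
    the cycle; a full read of b bytes then drains linearly over t - t0.
    """
    t0 = t * p / n
    peak = 0.0
    for j in range(1, n + 1):          # epoch: S_j has just finished reading
        total = 0.0
        for i in range(1, n + 1):
            # time since S_i last finished its read (wrapping to last cycle)
            elapsed = (j - i) * t0 if i <= j else t - (i - j) * t0
            total += b * (1 - elapsed / (t - t0))
        peak = max(peak, total)
    return peak

def closed_form(n, p, b):
    return (2 * n - n * p - p) / (2 * (n - p)) * n * b

for n, p, b in [(3, 1.0, 100.0), (5, 0.8, 64.0), (10, 0.6, 32.0)]:
    sim, formula = shared_buffer_peak(n, p, b), closed_form(n, p, b)
    assert abs(sim - formula) < 1e-6, (n, p, sim, formula)
    print(f"n={n} p={p}: shared peak {sim:.1f} vs no-sharing {n * b:.1f}")
```

For n = 3, p = 1, b = 100 this reports a shared peak of 150 against 300 without sharing, i.e. the 50% reduction derived above; for p < 1 the saving shrinks, as Equation 4.2 predicts.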
First, we introduce some additional notation that will be used in this section. B_i is the buffer requirement of stream S_i in the no-sharing case, BS_i is the buffer requirement of stream S_i in the sharing case, and BA_i is the total buffer requirement of all the streams at the time S_i has just finished reading.

Recall from the formulas in Chapter 2 that, without buffer sharing, the maximum buffer requirement of stream S_i occurs at the end of the reading of S_i, and that amount of data will last t - t_i time units.

Let the playback rates P_i be variable and let the disk utilization p be less than 1, but still assume that in each period the reading order of all the streams is S_1, S_2, ..., S_n.

At the beginning of the cycle (which is the time when S_1 just starts reading), the buffer requirements of all the streams are:

Stream | Buffers needed
S_1    | BS_1 = 0
S_2    | BS_2 = (t_1/(t - t_2)) B_2
S_3    | BS_3 = ((t_1 + t_2)/(t - t_3)) B_3
...    | ...
S_n    | BS_n = ((t_1 + t_2 + ... + t_{n-1})/(t - t_n)) B_n

The above formulas are easy to derive. S_1 is about to start reading; it has no data left, thus it requires no buffers. The buffered data of S_2 will last t - t_2 time units, and its reading is still t_1 time units away. Right then the amount of buffered data is (t_1/(t - t_2)) B_2. The buffer requirements of all the other streams can be calculated in a similar way. Summing the buffer requirements of all the streams together, the total buffer requirement at the time S_1 is about to start reading will be:

BA_0 = Σ_{i=2}^{n} BS_i = Σ_{i=2}^{n} ((t_1 + ... + t_{i-1})/(t - t_i)) B_i    (4.4)

After deriving BA_0, it is not difficult to derive BA_1, BA_2, ... To calculate these values, we can compare the buffer requirements at the time S_{i-1} finishes reading and at the time S_i finishes reading, and then compute the difference.

First, at the time S_1 finishes reading, the buffers required by S_1 will have increased from 0 to B_1. However, for all the other streams, since t_1 time has elapsed, their buffer requirements will each have decreased by B_j × t_1/(t - t_j).
Thus,

BA_1 = BA_0 + B_1 - Σ_{j=2}^{n} (t_1/(t - t_j)) B_j    (4.5)

More generally,

BA_i = BA_{i-1} + B_i - Σ_{j=1, j≠i}^{n} (t_i/(t - t_j)) B_j,   1 ≤ i ≤ n    (4.6)

Using Equations 4.4 and 4.6, BA_1, BA_2, ..., BA_n can be calculated easily. The maximum of these values will be the total buffer requirement (we do not need to count BA_0; it is always equal to or less than BA_n). That is to say,

B_shar = Max(BA_i)   (i ∈ [1, n])    (4.7)

After deriving this more general formula, we can verify the result presented in Section 4.2. If all the P_i are the same, it is easy to see that BA_{i+1} is always greater than or equal to BA_i. That is to say, the maximum buffer requirement always occurs at the end of the reading of S_n (when p is 1, the buffer requirement is a constant). A detailed derivation is as follows.

Assuming B_i = b for all i, Equation 4.4 becomes:

BA_0 = (n(n-1)p/(2(n-p))) b    (4.8)

and Equation 4.6 becomes:

BA_{i+1} = BA_i + (n(1-p)/(n-p)) b    (4.9)

If p = 1, then BA_{i+1} = BA_i, which means the buffer requirement remains a constant through the whole cycle.

If p ≠ 1, then BA_{i+1} > BA_i. So the maximum buffer requirement occurs at the moment S_n finishes reading, i.e.

B_max = BA_n = BA_0 + n × (n(1-p)/(n-p)) b = ((2n - np - p)/(2(n - p))) × nb    (4.10)

This is exactly the same result that we derived in Section 4.2 (Equation 4.3).

This completes the analysis of buffer sharing in the general case. These formulas are somewhat complicated. In order to calculate B_shar, all of the BA_i must be calculated first, and then the maximum selected. In a real system, this computing overhead may affect performance slightly. It is interesting to observe that if the streams are served in ascending order according to their playback rates, the maximum value will occur at the moment S_n finishes reading. If this is the case, BA_n will be B_shar, and no comparison is needed. If the buffer requirement were the only concern, we might be led to believe that reordering the streams in a cycle is a good thing to do.
However, as we will show later, this would be too expensive if we also need to consider prefetching and handling the transient period at the same time². In most cases, it is not worth doing.

The implementation of a buffer sharing scheme is not a trivial task. The difficulty lies in the fact that the buffers used by all the streams are changing from time to time. In other words, each stream does not have a dedicated buffer set. A study of the implementation issues is under way, but is beyond the scope of this thesis.

²Prefetching and the transient period will be discussed in more detail in Chapters 5 and 6, respectively.

4.4 Summary

In this chapter, buffer sharing among all the served streams in the fixed-reading-order scheme was analyzed. Buffer sharing in the simplified scenario in which all the streams have the same playback rates was studied first. After that, the analysis was extended to buffer sharing in the more general case in which playback rates can be different. We showed that by sharing buffers, the total buffer requirement can be reduced by up to 50%.

Chapter 5

Prefetching

In this chapter, prefetching as a scheme to improve disk and buffer utilization is introduced. First, the benefits of prefetching are discussed in Section 5.1. Then, a simple straightforward prefetching scheme, SP, is described in Section 5.2. As our analysis and simulation results show, SP does not give satisfactory performance. An intelligent prefetching scheme, IP1, designed to overcome the shortcomings of SP, is described in Section 5.3. Finally, IP2 is discussed in Section 5.4 as a further refinement of IP1. Our analysis shows that the intelligent prefetching strategies IP1 and IP2 can improve the performance of a multimedia file server dramatically.

5.1 Benefits of prefetching

The admission control and scheduling schemes we have discussed so far do not consider prefetching of data.
On receiving a new request for a stream (referred to as a new query from now on), the admission controller that we have discussed so far simply checks whether there is enough disk bandwidth and buffer space to accept the new query, using Equations 2.7 and 2.11. If there are enough resources, the query is activated. Otherwise, the query is left idle in the waiting queue. Consequently, there are resources (buffers and disk bandwidth) that are not utilized at all.

We first consider a simplified example. Suppose there is a server whose disk's maximum reading speed is 1000KB per second. Furthermore, suppose that all the requests have the same consumption rate, equal to 260KB per second. If we do not consider switching time and there are always enough buffers, the system can support 3 streams simultaneously. That will use 260 × 3 = 780KB/s of disk bandwidth. Consequently, 220KB/s of disk bandwidth will be wasted. If there are enough buffers, we can make use of this wasted disk bandwidth to do some prefetching for the next waiting request. By doing this, the prefetched request will have a better chance of being admitted (since its actual consumption rate will be reduced after prefetching). For the same reason, the server will have more free disk bandwidth to accept more new requests.

Usually, the system performance of a multimedia file server is measured by throughput and query response time. For a system with pre-determined (at design time) disk bandwidth and amount of buffer space, the response time and throughput are primarily determined by the utilization of disks and buffers. Our goal here is to try to use these resources as much as possible.
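The arithmetic behind the example above can be sketched in a few lines. This is our illustration only: the waiting query's length (15 seconds) and the 10 seconds of prefetching time are assumed values, not figures from the text.

```python
# Spare-bandwidth arithmetic for the example above: a 1000 KB/s disk serving
# three 260 KB/s streams leaves 220 KB/s idle, which can be used to prefetch
# for a waiting query.  The waiting query's length and the prefetching time
# are assumed values for illustration.
disk_rate = 1000            # KB/s, maximum disk reading speed
active = [260, 260, 260]    # consumption rates of admitted streams, KB/s
spare = disk_rate - sum(active)
print(f"spare bandwidth: {spare} KB/s")        # 220 KB/s goes unused

wait_rate, wait_len = 260, 15                  # waiting query: KB/s, seconds
prefetch_time = 10                             # seconds spent prefetching
prefetched = spare * prefetch_time             # 2200 KB read ahead
effective = wait_rate - prefetched / wait_len  # reduced rate after prefetching
print(f"effective rate after prefetching: {effective:.1f} KB/s")
```

With these assumed numbers, the waiting query's effective rate drops from 260 to about 113 KB/s, so it then fits within the disk bandwidth alongside the three active streams.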
More specifically, we will explore how data prefetching can maximize resource utilization, and thus lead to an increase in system throughput.

Intuitively, there are several ways in which prefetching can help a query:

• First, if a query has a consumption rate P_i that is larger than R, then the playback of that query cannot be started immediately after its reading begins. In this case, prefetching some data ahead of time is a must.

• Second, prefetching portions of a query before activation has the effect of reducing the effective consumption rate of the query after activation. This is illustrated in Figure 5.1. The solid line represents the original consumption rate P_i. If an amount pf_i is prefetched, then the new, prefetched consumption rate is given by the slope of the dotted line. A simple analysis shows that if T_i is the length of the query, the consumption rate after prefetching is given by:

P_i^pft = P_i - pf_i/T_i    (5.1)

Since the new rate is less than the original rate, there is a possibility that the new rate may pass the admission control test while the old one cannot. Whenever this happens, the response time of the query is substantially reduced.

Figure 5.1: Reducing the consumption rate by prefetching

• Even if the above conditions do not apply, for a normal query, prefetching can still help the query reduce its response time by one cycle. To see this, assume that S_1, S_2, ..., S_n are the active queries. At some time point, S_1 has finished and S_{n+1} is activated. Because of the transient period (which will be discussed in Chapter 6), the reading order for the new set should be S_2, ..., S_n, S_{n+1}. If no data has been prefetched for S_{n+1}, then its playback cannot be started until it gets a chance to start reading, which is at the end of the cycle. However, if there is a sufficient amount of prefetched data for S_{n+1}, its playback can be started immediately at the beginning of the cycle.
Thus, there is a difference in response time which may be as large as one cycle length.

The above claims will be shown more formally in the following sections. Our analysis will mainly concentrate on the second situation. First, we will discuss how to decide the cycle length t in the prefetching situation, and we will give a straightforward prefetching strategy, SP. Then, after observing that the effectiveness of SP may be hindered by several shortcomings, we will develop another prefetching strategy, IP1, which tries to maximize overall system throughput. However, there are many intuitive and heuristic factors in the strategy IP1. We will therefore analyze more thoroughly what is happening in the system and give a more formal IP strategy called IP2. Interestingly, as will be shown in Chapter 7, IP2 does not significantly outperform IP1, which means the heuristics used in IP1 are good enough in most cases.

5.2 Straightforward prefetching strategy (SP)

Just like normal data retrieval from disk, prefetching requires both disk bandwidth and buffers. One obvious way to allow prefetching is to dedicate a certain level of disk bandwidth and buffers to prefetching. But this is not a good solution, as it would reduce the disk bandwidth and buffers available to activated queries. We should make sure that prefetching is not done at the expense of activated queries. To this end, recall that the cycle length t for the activated queries (streams) S_1, ..., S_n is bounded below and above, respectively, by Equations 2.7 and 2.11. If the system does not support prefetching at all, any value in this range can be picked as the value of t. However, to support prefetching, an immediate question to answer is how to pick t so as to maximize
prefetching, but not at the expense of the activated queries.

Setting t to any value between the upper and lower bounds does not have any influence whatsoever on the completion times of the activated queries, as the completion time of a query is determined by its consumption rate and length. However, the value of t will affect prefetching.

First, consider setting t to its lower bound. As discussed in Chapter 2, this corresponds to setting the disk utilization p to 1. In other words, all the disk bandwidth is used up by the activated queries, and there will be nothing left for prefetching. On the other hand, consider setting t to its upper bound. From the point of view of disk bandwidth allocation, this time there is ample room for prefetching. As discussed in Chapter 2, a longer cycle length corresponds to a lower disk utilization p. However, the trouble is that all the buffers will be used up by the activated queries. Thus, in the end there will be no prefetching done either, not because of lack of disk bandwidth, but because of lack of buffer space. Hence, the problem to address is which value of t in between the upper and lower bounds maximizes prefetching. Since the disk utilization versus cycle length (p vs t) and buffer utilization versus cycle length (B vs t) functions are unlikely to be linear, just selecting the average of the upper bound and lower bound may not be a good decision.

Figure 5.2 can help us to understand this better. In Figure 5.2, we define the disk utilization U_disk (which is actually p) and the buffer utilization U_buffer as:

U_disk = p
U_buffer = (Σ_{i=1}^{n} t_i × (R - P_i)) / B_max    (5.2)

We plot how they change with the cycle length t. For both U_disk and U_buffer, we draw the curves when there are n requests and n + 1 requests being served. First, we can see from the graph that if U_disk approaches 1, even though at this time U_buffer is low, which means there are many buffers left for prefetching, the disk is too busy and we cannot do very much prefetching.
A similar situation happens when U_buffer approaches 1. Second, if the system load is increased (more requests are being served at the same time), the choices for t will be narrowed, which also means that the room left for prefetching will be reduced.

Figure 5.2: How to decide the reading period length t for prefetching

We can see from the above discussion that there is no obvious way to get a good value of t for prefetching. The main reason for this is that the information we have considered so far is not sufficient. Actually, there is another factor that affects the amount of prefetching that can be done. All the above analysis is based on the assumption that the cycle for the current set of activated queries keeps on going. Let T_finish denote the time at which the earliest next activated query will terminate. We should notice that the range bounding t is only valid before T_finish. After that time the system state will be changed, and a new cycle will be recalculated. If we select prefetching as much as possible before T_finish as the goal, this leads to a straightforward prefetching strategy, SP. In addition, it suggests a way to calculate t. This is formalized as follows.

First, if we consider just the free disk bandwidth, it is obvious that the amount of data that can be prefetched in time T_finish is:

D_prefetch = T_finish × R × (1 - p)    (5.3)

This means we can only use the free disk bandwidth (1 - p) to do prefetching. By substituting for p in Equation 5.3, we have:

D_prefetch = T_finish × R × (1 - Σ_{i=1}^{n} P_i/R - s/t)    (5.4)

On the other hand, if we consider just the available free buffers, the buffers available for prefetching should be:

B_prefetch = B_max - B = B_max - Σ_{i=1}^{n} (t × P_i/R) × (R - P_i)    (5.5)

where each reading time is t_i = t × P_i/R. D_prefetch increases with t, and B_prefetch decreases with t. Furthermore, it is necessary that D_prefetch ≤ B_prefetch. Thus, to maximize prefetching, we should let D_prefetch = B_prefetch, which is equivalent to:

T_finish × R × (1 - Σ_{i=1}^{n} P_i/R - s/t) = B_max - Σ_{i=1}^{n} (t × P_i/R) × (R - P_i)    (5.6)

This is a quadratic equation in t, which is equivalent to
a × t² + b × t + c = 0    (5.7)

where the coefficients are:

a = (1/R) × Σ_{i=1}^{n} P_i × (R - P_i),
b = T_finish × (R - Σ_{i=1}^{n} P_i) - B_max,
c = -T_finish × R × s.

Solving this quadratic equation gives a positive solution v_0 (and a negative solution that we ignore). If v_0 falls within the lower and upper bounds of t, which occurs more often than not, v_0 is the value of t. Otherwise, if v_0 is strictly less than the lower bound, t will be set to the lower bound. And if v_0 is strictly greater than the upper bound, the upper bound becomes the value of t.

The equations presented above do not assume buffer sharing; the calculation of buffers is based on Equation 2.10. Since prefetching is orthogonal to buffer sharing, a similar set of equations can be derived for the buffer sharing case based on Equation 4.7.

We summarize the straightforward prefetching strategy SP as follows.

Strategy SP

Let S_1, ..., S_n be all the activated queries, as allowed by the admission controller. Let S_{n+1} be the query at the head of the waiting queue.

• Use Equation 5.7 to determine the length t of the cycle for S_1, ..., S_n.

• Use the remaining disk bandwidth and buffers to prefetch for S_{n+1} at the end of each cycle.

• Prefetching stops when an activated query has finished, or the system has run out of buffers, or the entire query has been read into memory.

5.3 Intelligent prefetching strategy 1 (IP1)

The straightforward prefetching strategy SP can maximize prefetching for the query at the head of the waiting queue. However, as a result, S_{n+1} may use up too many system resources (particularly free buffers) for its own good, but not necessarily for the overall performance of the system. More specifically, SP just lets S_{n+1} prefetch as much as possible, but does not consider whether S_{n+1} really needs that much data or not. This may have two bad effects. First, since accepting a new query always needs some working buffers, this may affect the possibility of accepting a new query. Second, it also reduces the possibility of prefetching for another query.
Because of this, the data prefetched for a query should be just enough for its activation.

The above discussion indicates that while maximizing prefetching, SP's approach of prefetching as much as possible for the query at the head of the waiting queue may not be sufficient. Naturally, the next question that needs to be answered is: considering all the queries in the waiting queue, if there is a chance, which query should be selected on which to perform prefetching? Using the marginal gain analysis technique used in [NFS91], we can show that if First-Come-First-Serve for admission control still applies, then given the same amount of buffer space, prefetching for a shorter query can give a greater benefit than for a long query, which means prefetching for the head of the waiting queue may not be a good decision. This is illustrated by the following example.

Consider a situation where there are four activated queries (S_1, S_2, S_3, S_4) being served, each with a consumption rate of 240KB/s. S_1 will finish in 10 seconds, but the other three will last longer. The disk has a maximum reading rate of R = 1150KB/s. There is 1MB of free buffer space left at the current moment. S_5 and S_6 are two queries in the waiting queue, both having a consumption rate of 240KB/s. S_5 is 30 seconds long, and S_6 is 15 seconds long. We will show what will happen if we prefetch either S_5 or S_6.

If we decide to prefetch data for S_5, the maximum amount that can be prefetched is 1MB. By Equation 5.1, the consumption rate of S_5 can be reduced to 240 - 1000/30 ≈ 207KB/s. At the time S_1 finishes, the total consumption rate of all the other queries will be 3 × 240 + 240 + 207 = 1167KB/s, which is greater than the maximum disk reading rate. This means that by prefetching S_5, we cannot admit both S_5 and S_6 when S_1 finishes. So consider prefetching S_6 instead. By prefetching 1MB, the consumption rate of S_6 can be reduced to 240 - 1000/15 ≈ 173KB/s. So the total consumption rate will only be 3 × 240 + 240 + 173 = 1133KB/s.
This is less than 1150, which means that when S_1 finishes, we can admit both S_5 and S_6 at the same time. □

To address the problems in SP, we propose the following new strategy. With this strategy, we will find the shortest query to prefetch, so as to maximize prefetching and the number of queries that can be activated once an active query has completed.

Strategy IP1

Let S_1, ..., S_n be all the activated queries, as allowed by the admission controller. Among them, let S_j (1 ≤ j ≤ n) be the query that will finish the earliest. Also let S_{n+1}, S_{n+2}, ... be the queries in the waiting queue, and let B_free be the total number of buffers available for prefetching.

The algorithm works as follows:

1. Calculate t
Use Equation 5.7 to determine the length t of the cycle for S_1, ..., S_n.

2. Initialization
Set target to S_{n+1}, candidateSet to {S_{n+1}}, and finalAmt to 0.

3. First chance
If the combined consumption rate of all the streams in candidateSet is not greater than the consumption rate of S_j (i.e. Σ_{S_k ∈ candidateSet} P_k ≤ P_j), go to Step 6.

4. Second chance
Otherwise,

(a) Calculate the necessary prefetched consumption rate P_target^pft of target so that all the streams in candidateSet can possibly be activated when S_j has finished, i.e. P_target^pft + Σ_{S_k ∈ candidateSet, S_k ≠ target} P_k ≤ P_j + (1 - p) × R.

(b) Use Equation 5.1 to calculate the amount that needs to be prefetched in order to reduce the consumption rate of target to P_target^pft, i.e. targetAmt = (P_target - P_target^pft) × T_target.

(c) If targetAmt > B_free, then go to Step 5 to try the next condition.

(d) Otherwise, use the admission control test given in Chapter 2 to determine if all streams in candidateSet, including the prefetched one, can fit into a cycle with all the currently activated queries except S_j. If the admission control test fails, go to Step 5.

(e) Otherwise, set finalTarget to target and finalAmt to targetAmt. Go to Step 6.

5. Third and final chance
(a) Set targetAmt to B_free.

(b) Use Equation 5.1 to calculate the prefetched consumption rate after prefetching, i.e. P_target^pft = P_target - targetAmt/T_target.

(c) Use the admission control test given in Chapter 2 to determine if all streams in candidateSet, including the prefetched one, can fit into a cycle with all the currently activated queries except S_j. If the admission control test fails, go to Step 7.

(d) Otherwise, set finalTarget to target and finalAmt to targetAmt. Go to Step 6.

6. See if more queries can be activated
Consider the next query S_next in the waiting queue that is not in candidateSet. Add S_next to candidateSet. Compare the length of S_next with the length of the target. Set the target to be the stream with the shorter length. Go back to Step 3.

7. No more queries can be activated
If finalAmt > 0, prefetch finalTarget for the amount finalAmt. □

The above algorithm is fairly lengthy. We will explain it in more detail.

Step 1 calculates t. This calculation is separated from the remaining part of the algorithm, which means changing the admission control policy will not affect the calculation of t.

Step 2 is the initialization part. Target is the query we perform prefetching on. CandidateSet is our working set; we consider all the queries in this set as a whole when doing admission control. And finalAmt is the amount of data we will prefetch for target.

The purpose of introducing candidateSet is to enforce FIFO in the activation of queries. If S_k is before S_{k+1} in the waiting queue, we should not admit S_k later than S_{k+1}. Even though it is highly possible that we will prefetch S_{k+1} instead of S_k, if that is the case, S_{k+1} and S_k must be accepted together. Breaking FIFO would give better chances to improve performance, but it also introduces many other problems like fairness, deadlock prevention, etc. These are not discussed in this thesis.

Steps 3-6 are the main part of the algorithm.
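The candidate-set loop of Steps 3-6 can be sketched compactly as follows. This is a simplified illustration, not the thesis's implementation: the full Chapter 2 admission test is reduced to a pure rate comparison (switching time and cycle bookkeeping are ignored), and all function and variable names are ours.

```python
# Simplified sketch of IP1's candidate-set loop (Steps 3-6).  Admission is
# reduced to a rate test: the candidate rates must fit within the rate freed
# by the finishing query S_j plus the spare disk bandwidth.

def ip1_prefetch_plan(p_j, spare_bw, b_free, waiting):
    """waiting: list of (rate, length) queries in FIFO order.
    Returns (index of query to prefetch, amount) or (None, 0.0)."""
    candidates = []                       # indices into `waiting`
    final = (None, 0.0)
    for nxt in range(len(waiting)):
        candidates.append(nxt)            # Step 6: grow the candidate set
        target = min(candidates, key=lambda i: waiting[i][1])  # shortest query
        total_rate = sum(waiting[i][0] for i in candidates)
        if total_rate <= p_j:             # Step 3: first chance
            final = (None, 0.0)           # no prefetching needed
            continue
        others = total_rate - waiting[target][0]
        # Step 4: rate the target must be reduced to, and the data that takes
        pft_rate = p_j + spare_bw - others
        amount = max(0.0, (waiting[target][0] - pft_rate) * waiting[target][1])
        if amount <= b_free:              # Step 4: second chance
            final = (target, amount)
            continue
        # Step 5: third chance -- spend every free buffer on the target
        pft_rate = waiting[target][0] - b_free / waiting[target][1]
        if pft_rate + others <= p_j + spare_bw:
            final = (target, b_free)
            continue
        break                             # Step 7: no more queries fit
    return final

# The Section 5.3 example: S_1 (240 KB/s) finishing; with four 240 KB/s
# queries active, the spare bandwidth is 1150 - 4*240 = 190 KB/s; 1000 KB
# of buffers are free; the queue holds S_5 (30 s) then S_6 (15 s).
plan = ip1_prefetch_plan(p_j=240, spare_bw=190, b_free=1000,
                         waiting=[(240, 30), (240, 15)])
print(plan)  # → (1, 750.0): prefetch 750 KB for the shorter 15 s query
```

On the Section 5.3 example, the sketch picks the shorter query S_6 and prefetches only the 750 KB needed to bring its rate down to 190 KB/s, so that both waiting queries fit exactly within the 1150 KB/s disk once S_1 finishes; this mirrors the "just enough for activation" principle stated above.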
In each iteration of IP1, a new query is added to the candidateSet, and the shortest one in the candidateSet is selected as the target for possible prefetching. If all the queries in the candidateSet can pass the admission test as a whole, a new iteration begins; otherwise, the algorithm stops. There are three possibilities for all the queries in the candidateSet to be activated once S_j has completed (Steps 3, 4 and 5):

• First (Step 3), if the combined consumption rate of all the queries in candidateSet does not exceed the consumption rate of S_j, all the queries in candidateSet are guaranteed to be activated once S_j has completed. In addition, nothing needs to be prefetched in this case. Execution goes to Step 6 to see if there are more queries in the waiting queue that can be activated.

• Second, if the first condition fails, execution goes to Step 4. In this case, IP1 tests if a sufficient amount of target can be prefetched so that all the queries in candidateSet can be activated, provided that this amount of data does not exceed the number of buffers currently available for prefetching (Step 4c). If admission control in Step 4d verifies that all queries can be activated with the help of prefetching, both target and the prefetching amount targetAmt are recorded in the variables finalTarget and finalAmt. Execution then goes to Step 6 to try to add another query from the waiting queue to candidateSet, and a new iteration begins.

• Third, if both of the above steps fail, IP1 tries the "last resort". It simply tests if using all free buffers to prefetch for target will be sufficient to activate all queries in candidateSet. If admission control returns a positive answer, all the necessary work will be done in Step 5d and Step 6, and a new iteration begins.

If all of the above steps fail, this means not all the queries in the candidateSet can be activated. More precisely, all but the last added query in candidateSet can be activated once S_j has completed.
Step 7 prepares for this event by correctly setting finalTarget and finalAmt.

The control flow of IP1 is illustrated in Figure 5.3.

At the beginning of this chapter, we listed three possibilities for how prefetching can help a query. However, algorithm IP1 only handles Case 2: reducing the effective consumption rate. When this is impossible, prefetching data for one cycle could at least improve the response time to some degree. This can be considered an improvement to algorithm IP1; however, it is not as significant as reducing the effective consumption rate.

In the above algorithm IP1, if everything fails, the scheduler does not do any prefetching. Actually, it might be a good idea to apply SP if this happens. Although such prefetching will not help the query to be admitted right away, there is a possibility that it will save the system some disk bandwidth in the future, thus improving performance.

As we noticed earlier, buffer sharing is orthogonal to prefetching. It is not difficult to combine buffer sharing and prefetching together; we omit the result for simplicity.

[Figure 5.3: The control flow of IP1. After initialization and the calculation of t, the first, second and third chances are tried in turn; each success adds a new query and starts a new iteration, and the algorithm stops when all three fail.]

5.4 Intelligent prefetching strategy 2 (IP2)

An intelligent prefetching strategy (IP1) was introduced in the last section. In contrast to SP, which always prefetches the first query in the waiting queue, IP1 tries to prefetch the shortest possible query in the waiting queue, so that system performance can be improved. However, closer study of strategy IP1 reveals some problems. First, in Step 3, if we take switching time into consideration, the condition Σ_{S_k ∈ candidateSet} P_k ≤ P_j is not sufficient to guarantee an admission. The reason is that if we let more than one query in, the switching time of these queries may be much greater than what S_j used. Similarly, in Step 4, the condition P'_target + Σ_{S_k ∈ candidateSet, S_k ≠ target} P_k
≤ P_j + (1 - p) × R cannot guarantee an admission either. This means the value calculated in Steps 4(a) and 4(b) might not pass the test in Step 4(d). A similar situation may occur in Step 5.

Thus, the next question is: how do we decide exactly whether a query can be accepted at the time the prefetching amount is calculated? And what is the optimal amount to prefetch?

These are hard questions to answer, for the reasons presented next.

First, we have to keep track of all the system buffers. In order to accept a new request, we need some working buffers for cyclic reading, and we may also need some buffers for prefetching. The buffers available now can be used for either prefetching or cyclic reading. The buffers released at T_finish by S_j can only be used for cyclic reading, because those buffers will not be released until T_finish, while prefetching is performed before that time. There might also be some buffers released between now and T_finish (depending on whether the prefetching buffers are released gradually or at the termination of the query). Keeping track of all these buffers is complicated.

Second, even if the above problem is solved, the story is still not finished. The remaining question is: what is the optimal amount of data to prefetch? On one hand, prefetching can reduce the consumption rate for a query, so the query will have a better chance to be admitted. On the other hand, if we use too many buffers for prefetching, we may not have enough buffers left for cyclic reading. Because the total number of buffers is fixed, the problem is how to balance between these two extremes.

To simplify the analysis, we assume that the prefetching buffers are released at the termination of the query. Let the total system buffer space be B, the buffers used for cyclic reading be B_c, and the buffers occupied by prefetched data be B_p.
The number of free buffers is then:

B_free = B - B_c - B_p (5.8)

Assume that B_j is the number of prefetching buffers that will be released at T_finish by S_j. Suppose the new request to be admitted is S_new, its playback rate is P_new, and its length (duration) is l. Let D be the amount of data that will be prefetched for S_new. Now the question is how to decide on a value of D so that S_new can be admitted.

First, the modified consumption rate of the query after prefetching is:

P'_new = P_new - D/l (5.9)

The amount of buffers left for cyclic reading at T_finish is:

B'_c = B - B_p + B_j - D (5.10)

In order to admit S_new, the first condition is that D ≤ B_free, which is equivalent to

D ≤ B - B_c - B_p. (5.11)

In addition, the admission control criteria defined in Equations 2.7 and 2.11 must be satisfied, i.e.

(s × R) / (R - P) ≤ t ≤ (R × B'_c) / ( Σ_{i=1}^{n-1} P_i(R - P_i) + P'_new(R - P'_new) ) (5.12)

In order to obtain a valid t, the upper bound should be greater than the lower bound, i.e.

(s × R) / (R - P) < (R × B'_c) / ( Σ_{i=1}^{n-1} P_i(R - P_i) + P'_new(R - P'_new) ) (5.13)

The total consumption rate of all the queries, where P_1, ..., P_{n-1} are the rates of the active streams other than S_j, is:

P = Σ_{i=1}^{n-1} P_i + P'_new (5.14)

Substituting this value for P into Equation 5.13, we obtain:

B'_c × (R - Σ_{i=1}^{n-1} P_i - P'_new) - s × ( P'_new(R - P'_new) + Σ_{i=1}^{n-1} P_i(R - P_i) ) > 0 (5.15)

which, after substituting Equations 5.9 and 5.10, is equivalent to the following quadratic inequality in D:

aD² + bD + c > 0 (5.16)

where the coefficients are:

a = s/l² - 1/l

b = -R + Σ_{i=1}^{n-1} P_i + P_new + (B - B_p + B_j + sR - 2sP_new) / l

c = (B - B_p + B_j)(R - Σ_{i=1}^{n-1} P_i - P_new) - s( P_new(R - P_new) + Σ_{i=1}^{n-1} P_i(R - P_i) )

Solving this inequality together with Equation 5.11, we obtain a range of values for D that allows S_new to be admitted.

Thus, we have finished the presentation of how to calculate D. This calculation can replace Steps 4 and 5 in IP1. More importantly, by using this equation, an admission control attempt is guaranteed to succeed whenever a valid (positive) solution for D is obtained.

Strategy IP2

Replace Steps 4 and 5 in IP1 by: Calculate D using Equation 5.16.

The above formula seems quite complicated. However, the computing overhead is actually much smaller than that of IP1, because we eliminate all the failed attempts at admission control.
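As a concrete illustration (not code from the thesis), the coefficients of Equation 5.16 and the resulting range for D can be computed as follows. The function and parameter names are my own, and the sketch assumes s < l (so that a < 0 and the inequality holds between the two roots) and that R exceeds the total consumption rate, as in the derivation above.

```python
import math

def prefetch_amount(B, B_c, B_p, B_j, s, R, rates, p_new, l):
    """Valid range of prefetch amounts D for admitting S_new (Equation 5.16).

    B: total buffer space; B_c: cyclic-read buffers; B_p: prefetch buffers;
    B_j: prefetch buffers released by S_j at its finish time;
    s: total switching time per cycle; R: disk rate;
    rates: playback rates P_1..P_{n-1} of the other active streams;
    p_new, l: playback rate and length of S_new.
    Returns (d_lo, d_hi), or None if no valid D exists.
    """
    s1 = sum(rates)                          # sum of P_i
    s2 = sum(p * (R - p) for p in rates)     # sum of P_i * (R - P_i)
    a = s / l**2 - 1 / l
    b = -R + s1 + p_new + (B - B_p + B_j + s * R - 2 * s * p_new) / l
    c = (B - B_p + B_j) * (R - s1 - p_new) \
        - s * (p_new * (R - p_new) + s2)
    disc = b * b - 4 * a * c
    if disc < 0:
        return None                          # no D satisfies Equation 5.16
    r1, r2 = sorted(((-b - math.sqrt(disc)) / (2 * a),
                     (-b + math.sqrt(disc)) / (2 * a)))
    # With a < 0, the quadratic is positive strictly between the roots;
    # intersect with 0 <= D <= B_free from Equation 5.11.
    d_lo, d_hi = max(r1, 0.0), min(r2, B - B_c - B_p)
    return (d_lo, d_hi) if d_lo <= d_hi else None
```

For example, with illustrative figures close to the simulations of Chapter 7 (R = 1000 KB/s, three active 240 KB/s streams, a 240 KB/s 90-second newcomer, s = 0.1 s), the function returns the interval of prefetch amounts, capped above by the free buffer space.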
One interesting thing to notice is that although IP2 is more accurate in calculating D, and its computing overhead is greatly reduced, the actual system throughput is almost the same as with IP1. This will be shown in Chapter 7 when we discuss simulation results.

5.5 Summary

Prefetching is a method to improve the performance of multimedia file servers. The basic idea of prefetching is to read some data for a new query before the query is admitted, by using otherwise wasted disk bandwidth and buffer space. A simple straightforward prefetching strategy, SP, was described. SP prefetches as much as possible for the first query in the waiting queue, and usually performs poorly. An intelligent prefetching strategy, IP1, was introduced as an improvement to SP. IP1 performs prefetching in a more intelligent way and usually gives much better performance than SP. A further improvement to IP1, aimed at reducing the overhead of admission control, is the intelligent prefetching strategy IP2. Preliminary analysis shows that the intelligent prefetching strategies greatly improve system performance.

Chapter 6

Multiple Disk Environment

6.1 System configuration of multiple disk environment

To improve system throughput, using multiple disks is an obvious extension to the single disk solution. While there are many possible system configurations for a multiple disk environment, in this thesis we assume that a system consists of n disks and that all the disks can transfer data concurrently. In addition, we assume that the I/O bus can transfer data as fast as the disks can read/write.

In such an environment, two factors affect system performance: how the system buffers are distributed and used, and how the data is organized on the disks.

There are two possible ways to manage system buffers. All the disks can share one system buffer pool, or each disk can have its own buffer set that cannot be shared with other disks. The latter case is trivial.
But its system performance is not as good as in the first case, so in our discussion we will concentrate on the case where there is a single shared buffer pool.

There are several methods for data placement in a multiple disk environment:

1. The simplest is to replicate all the data on all the disks. Because all the disks keep the same copy of the data, real concurrency can be supported. However, this scenario has only limited application; it is not practical.

2. All the data are striped perfectly across all the disks. In this case, the data unit for striping has a fixed size, and the data are placed on the disks in strict order. If implementation details are not considered, we can treat this as a single disk with a higher bandwidth.

3. Any single stream is placed on only one disk, so the whole system can be viewed as a collection of independent disks.

This list is more or less a theoretical classification. Practically, we can use a RAID disk array [PGK88] as a study example. Basically, data in a RAID disk array can be read/written concurrently. Depending on the striping data unit (from bits to blocks to whole files), the reading behavior of a RAID disk array is similar to Case 2 or 3 above.

6.2 Extension from a single disk environment

In this section, we discuss how to extend our original algorithms so that they work well in a multiple disk environment.

If the data is replicated or perfectly striped (Cases 1 and 2), there is no problem at all. We can logically treat the array as a single disk, and use whatever policies we used in the single disk case.
If R_all = Σ R_i is the sum of the disk rates of all the disks in the system, the only modification to our original algorithm is to replace R with R_all in all the formulas.

However, if the data is placed according to Case 3 (a stream is placed on only one disk), we have to make some modifications to our algorithm to make it work.

Our goals here are to achieve high throughput and high concurrency, but at the same time we want the disks to work as independently as possible, so that the overhead of coordinating the actions of different disks can be minimized.

In the single disk case, there is only one request queue. In the multiple disk case, there is one system request queue, and each disk has its own request queue. When a request is submitted to the system queue, it is re-submitted to the corresponding disk queue, depending on where the data are stored. For each disk queue, the admission control policy is still First-Come-First-Served. We treat each disk as a subsystem, and we can use whatever policy we like within the subsystem. As a first step, no synchronization is considered among user requests, so admission control and buffer management are simplified, and only minor modifications need to be made to the original single disk algorithms.

Specifically, if each disk has its own buffer space and no sharing is allowed, it works independently of all the other disks. However, if a shared system buffer pool is used, then all the per-disk subsystems compete for buffers. Keeping track of all the system buffers is a nontrivial task. We also need to enforce some fairness rules to prohibit any subsystem from occupying all the buffers at any point in time.
These issues will be discussed in the remainder of this section.

6.2.1 Handling buffer sharing and prefetching

There are two important issues that we need to deal with in a multiple disk environment, namely buffer sharing and prefetching.

Buffer allocation and buffer sharing

Within each per-disk subsystem, we perform buffer sharing using the same scheme as in the single disk case. Since each subsystem is independent of all the others and their admission controls are not synchronized, performing buffer sharing among all the active requests at the whole-system level would be too complicated.

However, since all the subsystems compete for a fixed number of system buffers, the number of buffers assigned to each subsystem should not be fixed, in order to improve buffer utilization. This can be thought of as a higher-level form of buffer sharing.

There are different ways to assign buffers to the subsystems and to enforce fairness among them. Studying the efficiency of these methods is a complicated issue and beyond the scope of this thesis. In the algorithm given later in this section, a simple scheme is used.

As for implementation, we have to keep track of all the buffers used by all the disks and requests in the system at any one time. A simple way to do this is the following: whenever a subsystem wants some buffers (it may not need all of them at this moment, but will need them later), it sends a request to the system buffer manager. If the buffer manager's response is favorable, the subsystem reserves the buffers so that no other subsystem can use them. When the subsystem is finished with the buffers, it returns them to the system buffer manager.

Prefetching

Since we are not considering the problem of synchronization at this moment, each disk works independently of all the other disks.
So prefetching is also performed at the per-disk subsystem level.

As we already noticed in the single disk case, the most difficult problem in handling prefetching is to figure out exactly how many free buffers the system has now and will have when the new request is actually admitted. However, using the request/reserve/release scheme discussed earlier, this is not a problem, although the result may not be optimal in the global sense.

Just as for the algorithms without prefetching, the problem of fairness also exists with prefetching. Since prefetching usually consumes many more buffers than regular reading, the situation is even worse here: buffers occupied by prefetched data in one subsystem affect the performance of all the other subsystems. How to coordinate the prefetching in all the subsystems is a tough and interesting topic, but again it is beyond the scope of this thesis. In our algorithm, we use a simple policy: at any time, only one subsystem is allowed to do prefetching.

6.2.2 Handling synchronization

Synchronization is the most difficult issue when considering multiple stream retrieval. We will show how it is handled in our multiple disk environment. We use a very simple scheme, which may not be optimal; however, it shows that synchronized requests can be handled correctly within our framework.

We use the simplest synchronization scenario as an example: multiple streams that are to be started at the same time.

In the single disk case, because there is only one admission controller, the problem is relatively easy. We can bundle all the requests into one, and always admit or reject them together.

In a multiple disk environment, we have multiple admission controllers, and the requests may belong to different subsystems. We need to design a scheme for the subsystems (on different disks) to communicate. When one subsystem is ready and another is busy, the ready one should wait until the busy one is ready too.
While it is waiting, the ready subsystem should not admit other requests. When all the requests can be admitted, all the subsystems should admit them at the same time.

As we said earlier, this is not an optimal solution. When one disk is ready and another is busy, the ready disk will be idle for some time, so disk utilization suffers. However, this simple solution guarantees that no deadlock can happen, and it is relatively fair. As a counter-example, suppose we have two disks serving a synchronized request that requires two streams, one from each disk. At time t, disk D1 is ready and D2 is busy. If we do not want to waste any disk time on D1 and let other requests waiting on D1 get in now, then when D2 becomes ready, D1 may be busy serving the new request, so D2 will have to wait for D1. If this situation is not handled carefully, we may easily get into circular waiting, and the synchronized requests will never be served.

It is possible to design a scheme that achieves both high disk utilization and fairness, but that needs more research, so we leave it as a future improvement. Instead, we outline a simple algorithm for the multiple disk environment as follows.

Algorithm MD:

1. All the requests submitted to the system waiting queue are re-submitted to the corresponding subsystem waiting queues.

2. A system buffer manager handles all the buffer requests from the subsystems.

(a) In order to use buffers, a subsystem should first send a request to the system buffer manager.

(b) The system buffer manager assigns buffers to subsystems according to a predefined policy.

(c) Subsystems reserve the buffers after they get them from the system buffer manager, and return them to the system buffer manager when finished.

3. Each subsystem has its own admission controller and scheduler, and works independently of the others.

4.
Synchronization is handled using the simple stop-wait scheme described above.

In the above algorithm, the buffer manager can use different buffer assignment policies. These policies should guarantee that no subsystem can grab all the system buffers at any time. We use a simple heuristic that no subsystem is allowed to get more than 50% of the total buffers; simulation results are shown in Chapter 7.

6.3 Summary

This chapter described a preliminary study of how to extend the work of Chapters 4 and 5 to a multiple disk environment. The study showed that the framework established in the last two chapters also works in the multiple disk environment. Specifically, we discussed how to handle buffer sharing, prefetching, and synchronized requests.

Some simulation results will be given in Chapter 7, and directions for further study are discussed in the last chapter of this thesis.

Chapter 7

Performance Evaluation by Simulation

7.1 Simulation methodology and program design

We have implemented a simulation package to evaluate the algorithms discussed in this thesis. The package runs under Unix on Sun Sparc workstations and consists of about 5,000 lines of C source code.

The methodology used in the simulation package is discrete event-driven simulation. Everything that can cause a change in the system state, such as request arrivals, request completions, cyclic read completions, etc., is defined as an event. The main job of the program is the generation and processing of these events.

The simulation package can produce two different kinds of output. The first requires all the requests to be submitted to the request queue at the beginning of the simulation. In this case, we can report the total finishing time, peak disk utilization, peak buffer utilization, average disk utilization, average buffer utilization, etc. The second kind of output simulates an interactive system.
In this case, in addition to the above results, we can also report query response times. For simplicity, most of the simulation results described in this chapter are reported using the first kind of output.

Our simulations are based on realistic figures, which makes the simulation results more meaningful. The following table lists the ranges of the key values we selected.

    min seek time (per track)    5 msec
    max seek time                25 msec
    query playback rate          240 KB/s
    max disk rate                2000 KB/s
    block size                   20 KB
    buffer space                 100 KB - 8 MB

More details of the simulation package are given in the following sections.

7.2 Implementation concerns

To implement and simulate our ideas, many issues need to be considered very carefully. In this section, we discuss several issues we encountered in the design of the simulation package.

7.2.1 When to activate the admission controller?

In our algorithms, the reading of streams is performed in a cyclic manner. When a request finishes, the admission controller can be activated either right after the reading for this request is finished, or after the whole cycle finishes. When the cycle length is large and disk utilization is low (which means a big chunk of free time is left at the end of each cycle), the two schemes differ considerably in system performance. We selected the first scheme in our simulation. Although it is more complicated and harder to implement, we verified that it does improve system performance.

7.2.2 Buffer releasing

If prefetching is used in our algorithm, the prefetched data will occupy a certain number of buffers. We can release these buffers in two ways: release all the buffers after the query finishes; or, in each reading cycle, release the buffers gradually as the buffered data is consumed.
Releasing buffers gradually improves performance to some degree, but it significantly increases the complexity of our analysis (and thus the complexity of the admission controller and scheduler). In our simulation package, we use the simpler (first) scheme. By doing this, the computing overhead of the scheduler is greatly reduced.

7.2.3 Transient period

As discussed in the literature, e.g. [RVR92] and [Pol91], in a cyclic-reading scheme, starvation may happen when the cycle length is changed (e.g. increased). For example, suppose there are n streams being served, and the current reading period length is t_n. When a new request comes in, the new reading period length becomes t_{n+1}. In our algorithm, t_{n+1} is always bigger than t_n. However, the data we read in the current period can only last for a length of t_n, so starvation will happen.

To solve this problem, we divide the transient period into two steps. First, we change the reading period for the n existing requests from t_n to t_{n+1}. Then, we do some prefetching for the new request so that it is ready for startup. After these two steps, we can start the normal cyclic reading with the new query added.

Chapter 5 has already shown how to prefetch data for a request. We discuss here how to change the reading period length from t_n to t_{n+1}.

The idea is to do it gradually. Basically, the system tries to read some extra data in each cycle, so that the next cycle can be a little bit longer. These steps are repeated until we reach t_{n+1}.

Suppose the disk utilization is p. The actual time used to read data in each cycle is t_n × p - s. If we use all the disk idle time to read data, the actual read time becomes t_n - s.
This amount of data will support the consumption of the next reading cycle, so the reading period length for the next cycle can be changed to

t(1) = t_n × (t_n - s) / (t_n × p - s) (7.1)

We can repeat this procedure to obtain t(2), t(3), t(4), ..., until t(i) ≥ t_{n+1}, at which point the transient period is completed.

From Equation 7.1, we can see that the rate at which t(i) increases depends on p. If p = 1, nothing can be done. So when performing scheduling and calculating t, we cannot push p as high as 1; that would make accepting a new request without losing data impossible. In our simulation package, we set the maximum of p to 0.95.

In the process of changing t, we do not need to worry about buffer limitations. This is because buffer usage only increases during this procedure, and we already know there are enough buffers for the final state t_{n+1}, as guaranteed by the admission control test.

7.3 Evaluation of buffer sharing with varying disk rates

In Chapter 4, we analyzed buffer sharing, and the result showed that it can lead to a tremendous reduction in total buffer requirements. In the ideal case (disk utilization p equal to 1), the savings can be 50%. Here we simulated a situation where p keeps changing and has an average value less than 1. In this series of simulations, we used 50 queries, each with a consumption rate of 240 KB/s. The lengths of the queries were from 20 to 120 seconds, with the average being 60 seconds. In order to support a significantly high number of concurrent queries, the maximum disk reading rate was set to R = 2000 KB/s. The graph in Figure 7.1 shows the minimum buffer space needed when the number of concurrent queries varies from 3 to 7, with and without buffer sharing.

[Figure 7.1: The benefit of buffer sharing (buffer requirement vs. number of activated queries, with and without buffer sharing).]

As expected, in all cases, buffer sharing requires less buffer space than without buffer sharing.
The savings in buffer space were between 20% and 40%, depending on the average disk utilization.

7.4 Evaluation of buffer sharing with non-contiguous data placement

In Chapter 2, we studied how to use approximation to deal with the non-contiguous data placement case. By using approximation, the buffer sharing scheme can also be applied to non-contiguously placed data, so performance can be improved.

In this series of simulations, we attempted to show that non-contiguously placed streams can benefit from the approximation, which makes them amenable to buffer sharing (and to the kind of prefetching we propose in this thesis). In particular, we repeated the simulation described in the previous section with two different kinds of queries. The first kind requests streams which were non-contiguously placed in blocks of size Bl = 20 KB (i.e. roughly one disk track), with each block separated by a gap G = 5 ms. The second kind uses streams which are approximations of the above streams, obtained using Equations 2.12 and 2.13. Queries of the second type were allowed to share buffers (we cannot share buffers for the first type of queries using our algorithm, because all our analysis is based on the contiguously-placed case). Analogous to Figure 7.1, Figure 7.2 shows the minimum buffer space (in KB) needed for both kinds of queries, when the number of concurrent queries varies between three and five.

    number of concurrent queries            3     4     5
    non-contiguous streams                130   360  2520
    approx. streams with buffer sharing   110   250  1420

    Figure 7.2: Handling non-contiguous data placement using approximation
As can be seen clearly from the table, in all cases it is beneficial to approximate non-contiguous streams with contiguous ones, and when allowed to share buffers, the approximating streams lead to a reduction in total buffer requirements.

7.5 Evaluation of prefetching

In this series of simulations, we evaluated the effectiveness of our prefetching strategies. We again used 50 queries, each with a consumption rate of 240 KB/s and a length of 90 seconds. The maximum disk reading rate was set to 1000 KB/s. The graphs in Figures 7.3 and 7.4 show the time taken to complete the 50 queries and the average disk utilization with varying amounts of buffer space. In both graphs, the x-axis is the amount of buffer space, varying from 5 MB to 8.5 MB. In Figure 7.3, the y-axis is the total time taken to complete the 50 queries using SP, IP1 and IP2, normalized by the time taken without prefetching. Thus, the horizontal line at 1.0 in Figure 7.3 represents the situation without prefetching. With a small amount of space available for prefetching, intelligent prefetching (both IP1 and IP2) does not lead to any gain in performance. However, as more and more space becomes available for prefetching, IP1 and IP2 are able to activate more and more queries faster than if no prefetching were allowed. Consequently, the total time taken becomes smaller. As shown in Figure 7.3, intelligent prefetching (both IP1 and IP2) can lead to a 30% saving in total time.

Figure 7.4 provides more insight on the comparisons. If no prefetching takes place, the average disk utilization is around 0.8. But as more buffer space becomes available for prefetching, IP1 and IP2 are able to better utilize the disk by prefetching, and the average disk utilization gradually climbs to 1.0. Moreover, the utilization of buffers follows a similar trend.
[Figure 7.3: SP vs IP1 vs IP2: relative finish time vs. buffer size (MB).]

Another interesting thing shown in Figure 7.4 is that the average disk utilization for SP is still higher than if no prefetching were allowed. This indicates that while the disk is kept busy by prefetching, the way that SP conducts prefetching is problematic and totally ineffective.

We can also observe from the graph that IP1 and IP2 do not differ significantly in performance. As we pointed out in Chapter 5, this shows that the heuristics used in IP1 are good enough in most cases. However, we should point out that the computing time used for admission control and scheduling is not shown in the simulation.

[Figure 7.4: SP vs IP1 vs IP2: average disk utilization vs. buffer size (MB).]

IP2 has much less computing overhead than IP1. The reason is that every time IP2 calculates a prefetch value, a successful admission is guaranteed. On the other hand, for the reasons discussed at the beginning of Section 5.4, IP1 may make many failed attempts. This increases the computing overhead, and in a real system it would have some impact on performance.

The series of simulations discussed above did not allow buffers to be shared. In another series of simulations, we allowed buffers to be shared, and used the version of admission control based on Equation 4.5 rather than Equation 2.10. The results of this series of simulations were very similar to those presented above. The only difference was that buffer sharing saved a few hundred KB of buffer space and made it available for prefetching.
Thus, the point at which IP1 and IP2 started to show improvement now came a few hundred KB earlier than shown in Figure 7.3. For simplicity, we omit the results here.

7.6 Evaluation of the multidisk environment algorithm

In this section, we give preliminary simulation results for our multidisk environment algorithm, described as MD in Section 6.2. In this series of simulations, we used a three-disk array. We assumed that a stream is placed on only one disk, and that there is no duplicated data. There was a system buffer pool shared by all three disks, with the buffer size set to 10 MB.

Again, we used 150 queries (roughly 50 queries per disk), each with a consumption rate of 240 KB/s and a length of 90 seconds. Each query was assigned randomly to one of the three disks.

We assumed all three disks have the same speed, and in this simulation we varied the disk speed from 1000 KB/s to 2000 KB/s, reporting the total query finish time.

For comparison, we also report the query finishing time on a single disk server which has the same disk speed as the disks in the array. In our multiple disk algorithm, the maximum buffer space any subsystem can get is 50% of the total buffer space, i.e. 5 MB in this situation. To be fair, we set the buffer space of the single disk server to 5 MB too.

In Figure 7.5, the solid line is the total query finish time using the disk array, and the dotted line is the result obtained using a single disk server.

[Figure 7.5: The result of the multiple disk algorithm (total finish time vs. disk rate in KB/s, three disks vs. single disk).]

Figure 7.5 shows that by using this disk array, we can achieve a speedup factor close to three over the range of disk speeds.

Chapter 8

Conclusions

8.1 Summary

Designing a high performance multimedia file server is a research topic of great interest and value.
In this thesis, we studied one of the key problems encountered in such systems: given a fixed amount of buffer space and disk bandwidth, both pre-determined at design time, how to maximize the throughput of the system. Our approach was to maximize the utilization of buffers and disk. To achieve this goal, we proposed and studied two schemes, buffer sharing and prefetching.

Buffer sharing in the fixed reading order scheme was discussed in Chapter 4. We first analyzed a simple case in which all the streams have identical playback rates. The analysis showed that buffer sharing can lead to a 50% saving when the disk utilization approaches 1. Removing the restriction that all the streams have the same playback rate, we further analyzed buffer sharing in the general case. Simulation results showed that buffer sharing can lead to 30% to 40% savings in general cases.

Prefetching is the other scheme we used to improve disk and buffer utilization. In Chapter 5, we first discussed the benefits of prefetching: a query can be helped by prefetching in several ways. A simple prefetching scheme, SP, was given, and its performance was analyzed and simulated. Not satisfied with the performance of SP, we proposed a more intelligent prefetching scheme, IP1, which performs much better than SP and can fully utilize the disk and buffers. IP2 is a further revision of IP1 in which the admission control overhead of the algorithm itself is reduced. Simulation results show that the intelligent prefetching schemes IP1 and IP2 can lead to up to a 40% improvement in system performance.

Some preliminary investigation of how to apply our buffer sharing and prefetching schemes to a multiple disk environment was described in Chapter 6. Our analysis showed that there is no major difficulty in this extension.
Our simulation results also supported this conclusion.

8.2 Future work

There are at least two main categories for future research:

• Improvement of current work
• New directions

8.2.1 Improvement of current work

For the study of buffer sharing, one thing we should do is design an implementation scheme. In the buffer sharing case, there is no dedicated buffer set for each stream; buffers are assigned dynamically and globally. Designing a good implementation scheme is a challenging goal.

In this thesis, we only considered the case in which the reading order of all the streams is fixed. How to apply buffer sharing in a variable reading order scheme is an interesting and challenging topic. If we can share buffers in the variable reading order case, we will gain both the advantage of buffer sharing (reduced buffer usage) and that of variable order reading (optimized disk reading).

For the study of prefetching, as in the buffer sharing case, we also need to design an efficient implementation scheme that can fully achieve the results obtained by theoretical analysis. In addition, we need to determine under what conditions prefetching gives the optimal result. Further study of the behavior of IP1 and IP2 is also needed. We believe further studies of these problems will lead to some interesting results, and will give us a better understanding of the buffer-disk relation in multimedia file server systems.

For the study of how to extend our work to a multiple disk environment, our research is just a preliminary study. There are many issues we have not considered very carefully. Some examples are: What are the advantages and disadvantages of sharing a global buffer pool? How will the system bus affect performance? What are good ways to organize metadata? What are good ways to coordinate prefetching across all the disks? Further work is necessary on all these topics.

8.2.2 New directions of research

There are several important issues we have not considered in this thesis.
The first problem is synchronization. A query may require data stored on different media; a typical example is a query that requires video, audio and text at the same time, when these are stored separately. The synchronization requirements may vary from application to application. How to handle synchronization is one of the most important issues in a multimedia system, and many studies have been conducted in this area. However, we need to find out how our buffer sharing and prefetching schemes behave when handling synchronization.

Another important issue is the impact of networking. A multimedia file server usually operates in a network environment: the server is at one location, and the user can be at any location reachable through the network. First, the network bandwidth may be a serious constraint on server throughput. Second, problems specific to the network, such as network buffering, data loss, and network jitter, will also affect the operation of the server. When designing a networked multimedia file server, all these factors must be taken into consideration.

How to apply real-time scheduling theory to multimedia file server design is also an interesting research direction. Multimedia systems have some inherent real-time requirements; scheduling a query for a video stream is quite similar to scheduling a real-time task. But multimedia systems have some special requirements that an ordinary real-time system does not have. It will be very interesting to study how to apply real-time scheduling theory in multimedia systems, and what kinds of modifications and adjustments have to be made for this special environment.

