"Applied Science, Faculty of"@en . "Electrical and Computer Engineering, Department of"@en . "DSpace"@en . "UBCV"@en . "Murali, Sriram"@en . "2012-02-22T21:42:34Z"@en . "2012"@en . "Master of Applied Science - MASc"@en . "University of British Columbia"@en . "We provide two approaches to handling overload in shared resources offering realtime\nguarantees. We first provide a technique (based on mathematical optimization)\nfor identifying the possible causes for an overload situation by computing the\nworst-case demand of the system, depending upon the amount of requests serviced.\nWorst-case analysis of response time has a pseudo-polynomial time complexity,\nand when there is no knowledge about the workload, the complexity further increases.\nWe provide polynomial-time heuristics to reduce the computation time\nof the algorithm. Further, we evaluate it against other techniques using stochastic\nanalysis to stress on the accuracy and ease of estimation of the result. The scheduling\npolicy based on the approach is useful to detect an overload in the resource and\nto allow us to make responsible decisions on it. Secondly, we present a scheduling\npolicy (obtained through stochastic approximation) to handle overload in real-time\nsystems. Competitive analysis of online algorithms has commonly been applied to\nunderstand the behavior of real-time systems during overload conditions. While\ncompetitive analysis provides insight into the behavior of certain algorithms, it is\nhard to make inferences about the performance of those algorithms in practice.\nSimilar on-line scheduling approaches tend to function differently in practice due\nto factors. Further, most work on handling overload in real-time systems does\nnot consider using information regarding the distribution of arrival rates of jobs\nand execution times to make scheduling decisions. With some information about\nthe workload, we aim to improve the revenue earned by the service provider, in\na scenario when each successful job completion results in revenue accrual. We\nprove that the policy we outline does lead to increased revenue when compared to\na class of scheduling policies that make static resource allocations to different service classes. We also use empirical evidence to underscore the fact that this policy\nperforms better than a variety of other scheduling policies. The ideas presented\ncan be applied to several soft real-time systems, specifically systems with multiple\nservice classes."@en . "https://circle.library.ubc.ca/rest/handle/2429/40839?expand=metadata"@en . "Response-time Analysis and Overload Management in Real-time Systems by Sriram Murali B. E., Electronics and Communications Engineering, Anna University, 2008 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Applied Science in THE FACULTY OF GRADUATE STUDIES (Electrical and Computer Engineering) The University Of British Columbia (Vancouver) February 2012 c\u00C2\u00A9 Sriram Murali, 2012 Abstract We provide two approaches to handling overload in shared resources offering real- time guarantees. We first provide a technique (based on mathematical optimiza- tion) for identifying the possible causes for an overload situation by computing the worst-case demand of the system, depending upon the amount of requests serviced. Worst-case analysis of response time has a pseudo-polynomial time complexity, and when there is no knowledge about the workload, the complexity further in- creases. We provide polynomial-time heuristics to reduce the computation time of the algorithm. Further, we evaluate it against other techniques using stochastic analysis to stress on the accuracy and ease of estimation of the result. The schedul- ing policy based on the approach is useful to detect an overload in the resource and to allow us to make responsible decisions on it. Secondly, we present a scheduling policy (obtained through stochastic approximation) to handle overload in real-time systems. Competitive analysis of online algorithms has commonly been applied to understand the behavior of real-time systems during overload conditions. While competitive analysis provides insight into the behavior of certain algorithms, it is hard to make inferences about the performance of those algorithms in practice. Similar on-line scheduling approaches tend to function differently in practice due to factors. Further, most work on handling overload in real-time systems does not consider using information regarding the distribution of arrival rates of jobs and execution times to make scheduling decisions. With some information about the workload, we aim to improve the revenue earned by the service provider, in a scenario when each successful job completion results in revenue accrual. We prove that the policy we outline does lead to increased revenue when compared to a class of scheduling policies that make static resource allocations to different ser- ii vice classes. We also use empirical evidence to underscore the fact that this policy performs better than a variety of other scheduling policies. The ideas presented can be applied to several soft real-time systems, specifically systems with multiple service classes. iii Preface Chapter 4 consists of work continued by Sriram Murali, based on the initial re- search conducted by Dr. Sathish Gopalakrishnan1. 1Supervisor, Assistant Professor, Department of Electrical and Computer Engineering, University of British Columbia iv Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Overload Management in Real-time Systems . . . . . . . . . . . 2 1.2 Overview of our Work . . . . . . . . . . . . . . . . . . . . . . . 3 1.2.1 On-line Scheduling Approach to Detect Overload . . . . . 3 1.2.2 Maximizing Revenue when the System is Overloaded . . . 4 1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Background in Real-time Scheduling . . . . . . . . . . . . . . . . . . 7 2.1 Basics of Schedulability Theory . . . . . . . . . . . . . . . . . . 7 2.1.1 Simple Schedulability Tests . . . . . . . . . . . . . . . . 8 2.1.2 Worst-case Response-time (WCRT) Analysis of Periodic Task Systems . . . . . . . . . . . . . . . . . . . . . . . . 9 v 2.1.3 WCRT Analysis of Tasks with Arbitrary Deadlines . . . . 10 2.1.4 Competitiveness of On-line Scheduling Algorithms for Real- time Systems . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2 Basics of Combinatorial Optimization . . . . . . . . . . . . . . . 11 3 On-line Scheduling Policy for Overload Management . . . . . . . . 13 3.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.2 System and Task Model . . . . . . . . . . . . . . . . . . . . . . . 15 3.2.1 The General Task Model (GTM) . . . . . . . . . . . . . . 15 3.2.2 Simple Task Scheduling Model using Rate Monotonic Schedul- ing (RMS) . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3.2.3 Practical Considerations for Scheduling in a GTM . . . . 18 3.3 WCRT Analysis of a General Task Model . . . . . . . . . . . . . 18 3.3.1 Problem Description . . . . . . . . . . . . . . . . . . . . 19 3.3.2 Exact WCRT Analysis . . . . . . . . . . . . . . . . . . . 21 3.3.3 Off-line Schedulability Test with Exact WCRT . . . . . . 29 3.4 Reducing the Complexity of WCRT Analysis . . . . . . . . . . . 37 3.4.1 Approximated WCRT Analysis Formulation . . . . . . . . 38 3.4.2 Empirical Evaluation of Approximated WCRT Analysis . 41 3.4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 43 4 Scheduling to Improve Rewards during Overload . . . . . . . . . . 45 4.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 4.2 System and Task Model . . . . . . . . . . . . . . . . . . . . . . . 48 4.3 Identifying a Good Scheduling Policy . . . . . . . . . . . . . . . 48 4.3.1 Optimal Fractional Resource Allocation . . . . . . . . . . 49 4.3.2 An Improved Policy for Online Job Selection . . . . . . . 51 4.4 Empirical Evaluation . . . . . . . . . . . . . . . . . . . . . . . . 55 4.4.1 Comparison with Stochastic Dynamic Programming (SDP) 55 4.4.2 Comparison with ROBUST . . . . . . . . . . . . . . . . . 58 4.4.3 Comparison with REDF . . . . . . . . . . . . . . . . . . 61 4.4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 63 vi 5 Conclusions and Future Improvements . . . . . . . . . . . . . . . . 65 5.1 Summary of the Contributions . . . . . . . . . . . . . . . . . . . 65 5.1.1 On-line Scheduling Policy to Detect Overload . . . . . . . 65 5.1.2 Overload Management to Improve Rewards by Stochastic Approximation . . . . . . . . . . . . . . . . . . . . . . . 66 5.2 Potential Future Improvements . . . . . . . . . . . . . . . . . . . 67 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 vii List of Tables Table 3.1 Task Schedulability Points Si . . . . . . . . . . . . . . . . . . 27 Table 3.2 Utilization and Response-time Bounds for a Barely Schedulable Task Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 Table 3.3 Comparison of Tight Upper Bounds on Response-time . . . . . 42 Table 4.1 Task Stream Parameters to Compare the Performance of the Proposed Policy with Other Policies (Two Task Streams) . . . 56 viii List of Figures Figure 2.1 Binding Constraints in a Linear Program (LP) Problem . . . . 12 Figure 3.1 Illustration of GTM for Two Tasks . . . . . . . . . . . . . . . 16 Figure 3.2 Occurrence of (a)Processor Idle Time, with Sufficient Test for Schedulability, and (b) Processor Time Demand Overflow with Necessary Tests for Schedulability . . . . . . . . . . . . . . . 25 Figure 3.3 Comparison of Sufficient and Exact Test for Schedulability . . 28 Figure 3.4 (a)Worst-case Processor Demand Bound, and (b) Correspond- ing Utilization Bound, Uubi , for a Barely Schedulable Task Set 34 Figure 3.5 Space of Binary Search to find Rubi . . . . . . . . . . . . . . 36 Figure 3.6 Timeline of P j for the Example in Table 3.1 . . . . . . . . . 40 Figure 3.7 Empirical evaluation of Approximated WCRT Analysis . . . . 43 Figure 4.1 Performance of Policy Z Compared with the Optimal Frac- tional Policy and SDP . . . . . . . . . . . . . . . . . . . . . 57 Figure 4.2 Performance of Policy Z Compared with the ROBUST Policy when Slack Factor is 2 . . . . . . . . . . . . . . . . . . . . . 59 Figure 4.3 Performance of Policy Z Compared with the ROBUST Policy when the Slack Factor is 4 . . . . . . . . . . . . . . . . . . . 60 Figure 4.4 Performance of Policy Z Compared with the REDF Policy (Ran- dom Rewards) . . . . . . . . . . . . . . . . . . . . . . . . . 62 Figure 4.5 Performance of Policy Z Compared with the REDF Policy (Lin- ear Rewards) . . . . . . . . . . . . . . . . . . . . . . . . . . 63 ix Glossary FOGS The Faculty of Graduate Studies UGF University Graduate Fellowship EDF Earliest Deadline First EPU Effective Processor Utilization FAP Fractional Allocation Policy FPS Fixed-Priority Scheduling FPPS Fixed-Priority Preemptive Scheduling FPTAS Fully Polynomial-Time Approximation Scheme MPEG Motion Picture Experts Group GTM General Task Model IRIS Increased Reward with Increased Service LP Linear Program MF Multiframe Task Model PTAS Polynomial-Time Approximation Scheme QOS Quality-of-Service QRAM QoS-based Resource Allocation Model x RMA Rate Monotonic Analysis RMS Rate Monotonic Scheduling RTA Response-time Analysis RTQT Real-time Queuing Theory SDP Stochastic Dynamic Programming SLA Service Level Agreement WCRT Worst-case Response-time xi Acknowledgments I extend my gratitude to faculty and students at UBC and my friends, who inspired me to write the thesis. I am sincerely thankful to my advisor, and mentor Dr. Sathish Gopalakrishnan, without whom I would not be the person I am today. His guidance helped me address several hard research problems, as well as gain a good perspective of the growing job industry. My sincere thanks to all my course supervisors from Electrical and Computer Engineering and Computer Science Departments, for their support throughout the course of my degree. I am also thankful for the past and present members of Radical lab, whose company made me feel at home, and willing to learn. Debojit, Theepan, Maliha, Karim and Bader were always around, whether for a relaxing conversation, or a heated discussion after the Paper Reading seminar. I am grateful for having Madhu, Mrigank, Karthik, Jagath and Shankar beside me during the times of need. Finally, my special regards to my parents, grand-parents, Bharath (brother), and cousins, who helped me at different stages of my life. I am grateful to The Faculty of Graduate Studies (FOGS), University of British Columbia (UBC) for granting the University Graduate Fellowship (UGF)2 for pursuing my degree. I dedicate this thesis to my guru Prof. N Venkateswaran (Waran), who guided me to pursue this career. 2UBC Annual fellowship awards for meritorious students pursuing graduate degree xii Chapter 1 Introduction Large scale Internet-based operators provide a variety of services today. These services range from simple HTML content retrieval to sophisticated infrastructure services. Amazon.com, for example, offers a storage service (S3) for developing flexible data storage capabilities, a database with support for real-time queries over structured data (SimpleDB), and a computation cloud for web-scale computing (Elastic Cloud) [2]. Such services are offered at a basic support level, and at pre- mium support levels with more stringent Service Level Agreement (SLA). These SLAs specify the availability, reliability, and response times that customers can ex- pect for the services provided. Further, several services are offered on a pay-for-use model rather than on the basis of long-term contracts. Whereas most service providers size their systems to meet the normal demand and some spikes in workload, studies on Internet service workload have noted that peak-to-average ratio of workload varies from 1.6:1 to 6:1 [12]. This large vari- ation makes it exceedingly difficult for service providers to size their systems to handle all possible workload scenarios. Systems should, therefore, be designed to gracefully degrade under overload conditions. Web services are illustrative of sys- tems that need to handle heavy workload and respond to requests within bounded durations to adhere to SLAs with clients. These services are like any other application, characterized by a group of re- quests competing for a shared or distributed resource. However, the important difference is that they have predefined priority levels, and have deadlines to be met 1 based on the offered SLA. Soft real-time systems are a class of such systems in which the deadlines can be missed, but results in the loss of revenue to the ser- vice providers. The revenue earned directly depends upon the requests serviced within the expected response times. Since many service providers have abundant resources, extra resources can be provisioned for the clients who require more. However, during times of severe overload, or when there is a massive outage of sys- tem resources, alternative approaches such as dropping the jobs that are requested by clients with lower Quality of Service (QoS) guarantees. 1.1 Overload Management in Real-time Systems In the case of a large amount of requests, the system administrator might provision additional resources to serve the clients, or use a scheduling policy to drop cer- tain requests and (preferentially) provide service to requests from clients that offer better revenue (have chosen a higher QoS. Finding the exact characteristics of the scheduling algorithm gives a few metrics to compare, but are not ideal for handling hugely varying workloads [12]. Approaches towards assuming the behavior of the workload, and making scheduling decisions based on it have proven to be efficient, but are based on stochastic modeling. Finding the best possible solution among the two is hard, but the knowledge of both is ideal for making meaningful decisions. It is important because, even though the SLA may indicate peak workload, the average workload might be much lower than the peak workload. In order to conserve resources, often the total available processing power of the system would be significantly lower than the cumulative requirement of all the tasks serviced over a period of time. Since most of the requests are short lived, the resources gets freed-up very quickly. Service providers also multiplex service among many clients, and this increases the need for managing situations when the requests from clients overload the system; the duration of the overload maybe a few minutes to a few hours and a good scheduling policy will lead to optimal (or near-optimal) revenue for the service provider per unit time. By using on-line schedulability tests, these policies can be made efficiently. Schedulability analysis is essential to validate if the requests can meet their dead- lines. The system resource has a peak utilization of 100%, that is shared among a 2 stream of requests. These requests comprise of a stream of jobs, having an arrival time that is periodic, or arbitrary, a deadline relative to its arrival time Di, an ex- ecution time requirement ei, and a fixed reward for successful completion. These parameters would be part of the SLA between the client and the service provider. 1.2 Overview of our Work We now discuss two important contributions of this work, in which we address the Overload management problem in real-time systems. First, in a system agnostic of the workload, we show that the possibility of an overload can be identified by understanding the worst-case performance of the system. Consequently, system resources can be increased to accommodate the overload. In the case of limited system resources, we try to we make use of the knowledge about the workload, and develop scheduling policies for maximizing the amount of requests serviced. We also analyze the optimality of these policies based on empirical evaluations and worst-case performance. 1.2.1 On-line Scheduling Approach to Detect Overload On-line scheduling using different scheduling models have been studied widely. In a real-time scenario, where we do not have much information about the system, it is only possible to determine the task schedulability for worst-case assumptions i.e. the one which yields maximum response time. We restrict our scope to Rate Monotonic Scheduling (RMS), which is the most common scheduling algorithm used, in which the tasks arrive periodically, and are prioritized based on the rate of arrival. Even though Rate Monotonic Analysis (RMA) of tasks scheduled under RMS have pseudo-polynomial time complexity, they are acceptable if the analysis can be done prior to the system deployment (off-line test). These tests were traditionally used for designing small real-time systems such as sensor networks, or systems with known requirements, in case of aviation systems. However, in the case of web services with soft real time requirements, and workload varying over time, the Worst-case Response-time (WCRT) analysis is different. We consider the problem of task schedulability on a uniprocessor system with 3 dynamic task assumptions. To solve this problem, we make use of mathematical optimization to obtain the worst-case task set for the given system, where the task periods are known. Further, any knowledge about task computation times can be easily encoded into the problem to determine a suitable solution. The task will have a certain worst-case execution time ei,max, and a best-case execution time ei,min, between which the resulting task set lies. We determine the absolute WCRT of all possible task streams, and show that if the system requires additional requirements than available already. This helps the system designer set the basic characteristics of the application, so as to maximize the system utilization. In the case of dynamic applications, it helps to check the feasibility of tasks during run-time, since it is sufficient to find the WCRT of the lowest-priority task in the system. In sum, we have attempted to address the following issues: \u00E2\u0080\u00A2 Provide a new approach to find the absolute WCRT for a system of client requests to be scheduled in a uniprocessor system. \u00E2\u0080\u00A2 Provide an approximation scheme to identify the worst-case fractional re- source allocation, such that the requests are schedulable under RMS, if the processor capacity is known at the time of design. When some information about workload is available, it can also be programmed in the model. \u00E2\u0080\u00A2 Devise a scheduling policy based on WCRT analysis, for the use in dynamic real-time applications. \u00E2\u0080\u00A2 Analyze stochastically, the approximation scheme with respect to other ef- forts. This evaluation is important, because the amount of deviation of re- sponse time from the average response time of all possible combination of resource allocations is a useful metric for evaluation. 1.2.2 Maximizing Revenue when the System is Overloaded When the system resource cannot be adapted to handle worst-case workload de- mands, it is necessary to have some requests discarded. Even though there are some heuristics that perform well under such circumstances, it is hard to prove the tightness of the bounds on the actual amount of requests serviced. In order to 4 address that, we make use of empirical evidence to show that maximum resource utilization is possible. Several scheduling policies have been proposed to sched- ule the request effectively on the available resource to reduce the frequency of deadline-misses, and thereby maximize the revenue a service provider may gain. These policies are based on either 1. the occupance of an event, such as job arrival, or completion, or if it a dead- line is missed, or 2. work-conservation (or) minimizing the idle time of the system. We consider the behavior of the scheduling policy over an infinite horizon. Note that a short overload duration (5 \u00E2\u0080\u0093 10 minutes) is sufficiently long to motivate the use of infinite horizon policies when a system is receiving several hundred (or thousand) service requests per second, as is common for many Internet services. If a system were to respond to 100 service requests per minute, a 10-minute interval would yield 60000 jobs. We also aim to maximize the average reward earned per time step; this is closely related to maximizing the total reward obtained. This model relies on the use of micro-payments, which are becoming a popular pricing design, and other pricing schemes can be approximated using micro-payments. In sum, we have attempted to address the following issues: \u00E2\u0080\u00A2 To develop a scheduling policy to improve revenues when a system is over- loaded, if we knew, a priori, some information about future workload. \u00E2\u0080\u00A2 Provable effectiveness of such a scheduling policy. \u00E2\u0080\u00A2 Evaluate the policy against other efforts aiming to handle overload situations, using empirical analysis, and determine the optimality (or near-optimality) of the policy. \u00E2\u0080\u00A2 Analyze the importance of having any knowledge about future job arrivals. The scheduling policy we have derived is based on a stochastic improvement approach, and this approach is likely to be useful in a variety of other real-time scheduling problems. 5 1.3 Outline We discuss different response time analysis techniques that we analyze in varying details in the Chapter 2. We provide the worst-case bounds on response times, and derive the worst possible request allocation to detect possible system overload in the Chapter 3. In Chapter 4, we discuss the scheduling policy that handles real- time systems under overload and maximize the revenue for the service provider. Finally, we provide conclusions and possible extensions in Chapter 5 6 Chapter 2 Background in Real-time Scheduling We discuss Schedulability Tests for Periodic Task Scheduling, particularly about the on-line and off-line schedulability tests from Real-time Systems by Jane W.S. Liu [34]. Further, we discuss the fundamentals of Combinatorial Optimization from the Papadimitriou and Steiglitz [39], and explain the formulation of Linear Programming in some detail. This section can be skipped if the reader has funda- mental knowledge of Scheduling Theory, and Linear Programming. 2.1 Basics of Schedulability Theory A schedulability test of a real-time system is the verification of a request or a set of requests to be permitted on a processor for execution, resulting in a yes/no an- swer, i.e. the requests are accepted or not accepted. For requests scheduled using dynamic-priority scheduling like Earliest Deadline First (EDF), the schedulability test is simple. If the total utilization of all the requests is less than the processor ca- pacity, the requests are schedulable. However for Fixed-Priority Scheduling (FPS), such as RMS, the schedulability analysis is little more complicated. The main ad- vantage of using FPS is the simplicity of deployment on an embedded device, where the timeliness of a real-time system is a much sought after quality. In RMS the requests with higher priorities can preempt the ones with lower priorities and 7 hence the scheduling is known as Fixed-Priority Preemptive Scheduling (FPPS). We will discuss the properties of FPPS and the efforts to provide a schedulability test for FPPS. Some of the most common scheduling tests [10, 16, 28, 33, 38, 41], as ex- plained below, check for feasible scheduling of periodic task sets. 2.1.1 Simple Schedulability Tests Liu and Layland Bound The initial work on scheduling theory was done by Liu and Layland in 1973, which provides an off-line schedulability test. Some of the assumptions are, (1) the dead- line of the tasks are equal to their periods, (2) the tasks are independent and peri- odic, with fixed computation times, and (3) there is negligible release jitter. Theorem 1. (Theorem 5 in [33]). The task set \u00CE\u0093= {\u00CF\u00841, . . . ,\u00CF\u0084n} is schedulable iff: n \u00E2\u0088\u0091 i=1 ei Pi \u00E2\u0089\u00A4 n( n \u00E2\u0088\u009A 2\u00E2\u0088\u00921) =Ubound . (2.1) The schedulability analysis is very easy in this case with a complexity of O(n). The bound provided by the work is a sufficient, but not necessary bound, leaving a large room for improvement. It results in very low utilization bound Ubound for large task sets as shown: lim n\u00E2\u0086\u0092\u00E2\u0088\u009En( n \u00E2\u0088\u009A 2\u00E2\u0088\u00921) = ln2' 0.6931 (2.2) This however inspired further research in schedulability analysis. Hyperbolic Bound The hyperbolic bound [10] is an improvement over the Liu and Layland bound having a larger region of coverage, but using the same assumptions as before. The algorithm has a complexity of O(n), and uses the processor utilization of each task to evaluate their feasibility. The theorem also assumes that the tasks suffer from a worst-case processor bound when initiated at the critical instant. The condition for 8 schedulability is given by, n \u00E2\u0088\u008F 1=1 (Ui+1)\u00E2\u0089\u00A4 2 (2.3) The simple schedulability tests are often pessimistic, and as a result will not permit a lot of tasks that can potentially be scheduled successfully. Therefore, better tests have been sought after. The optimal test for schedulability in fixed- priority scheduling makes use of WCRT analysis. 2.1.2 WCRT Analysis of Periodic Task Systems Liu and Layland [33] first showed that the task \u00CF\u0084i has its maximum possible re- sponse time, when released along with the higher priority tasks. This instant is called critical instant, and the response time Ri will be the WCRT of the task. Worst-case Response-time Analysis (RTA) is a sufficient and necessary schedula- bility test proposed by Lehoczky et al. [32]. It is based the processor time-demand analysis, where the total processor time requirement for each task is computed, and check if the requirement is met before the deadline of the task expires. Naturally, the lower priority tasks must also account for the time demand of the higher priority tasks. The workload for the tasks are given by the relation Wi(t) = ei+\u00E2\u0088\u0091 jUub is unschedulable with the same processor. 18 3.3.1 Problem Description Liu and Layland proved that the WCRT of a task is found when performing RMA on tasks released at the critical instant. For a GTM, there are many possible WCRTs, corresponding to every possible task assumption. We propose a new approach to determine Response-time bound. It is defined as Definition 2. The upper-bound on Response-times, Rubi is the maximum Response- time among all tasks sets of GTM, when scheduled using RMS. Rubi = maxE j,\u00E2\u0088\u0080 j\u00E2\u0089\u00A4i Ri (3.4) For a given system utilization, U, we aim to find Rubi . In order to achieve that, we need a mathematical optimization approach, such as LP to maximize Ri over the possible values of E j,\u00E2\u0088\u0080 j \u00E2\u0089\u00A4 i. However, this approach has a few drawbacks: 1. We must always check the Response-time of all the jobs in the level-i busy period, and find its maximum to determine the WCRT by finding. 2. We must use an exit mechanism for the algorithm, when the end of the level- i busy period is reached. In the work of Park et al. [41], the algorithm exits, when there is no feasible solution. Therefore, we obtain WCRT with an alternative representation, using a LP. Here, we find a barely-schedulable task set for an assumed Response-time bound Rubi . The task set will have a total utilization equal to the bound U ub. Any task set with a smaller utilization will be schedulable. Therefore, Uub and Rubi forms the upper bound on utilization and WCRT respectively. We then compare it with the given system utilization, U, and re-iterate the LP with a near optimal value of Rubi , until we find a sufficiently accurate Response-time bound, close to R. Constraints of the LP We now show the sufficient and necessary conditions required for the feasibility of a task, and how it directly relates to the formulation of LP problem to find 19 utilization bound. A level-i utilization bound Uubi is said to be sufficient if the total utilization U \u00E2\u0089\u00A4Uubi . Further, the processor time demand must be greater than the available computing power, at all times, until the task\u00E2\u0080\u0099s Response-time. These constraints are listed below: 1. The processor time-demand function of all the tasks \u00CF\u00841, . . . ,\u00CF\u0084i meets the line f (t) = t at Rubi . This is the WCRT of the low priority task \u00CF\u0084i. It is the cumulative processor usage of all tasks, when initiated at the critical instant. The first constraint is therefore given by: Wi(t) = ei+\u00E2\u0088\u0091 j t. This is a continuous constraint, in which one must check at each instant in (0,Rubi ]. Lehoczky et al. [32] showed that it is sufficient and necessary to check for this condition only at the preemption points given by Si, rather than all the time instants. If we only check if the processor time demand is higher than the computing power, at the points of time in Si\u00E2\u0088\u0092{Rubi }, we will have the necessary conditions for schedulability. Wi(t)> t;\u00E2\u0088\u0080t \u00E2\u0088\u0088 Si\u00E2\u0088\u0092{Rubi } (3.6) These set of conditions will not check if the processor time demand will over-flow after time Rubi . We must therefore look at both the conditions above in order to find schedulability test. Consequently, the Response-time Rubi , for which these conditions are satisfied will be the WCRT of the task set. Also, the LP, that aims to find the barely- schedulable task set satisfying these conditions, will have a utilization equal to the level-i barely-schedulable utilization bound Uubi , as shown in [28], we can 20 refer to the WCRT of any feasible task set as the level-i comfortably-schedulable Response-time bound. This is because, by the definition of the LP, any task set with utilization U \u00E2\u0089\u00A4Uubi , will be schedulable. Definition 3. The level-i comfortably-schedulable Response-time bound for any task set, with utilization U \u00E2\u0089\u00A4Uubi is given by Rubi . By picking the value of Rubi from the possible values, we get an initial feasible Uubi . By comparing it with the given system utilization bound U and searching linearly for theR that yields the actual solution, we obtain the sufficiently accurate value for WCRT. Hardness of the Problem The proposed LP must satisfy several constraints that the affect the task utiliza- tions. We will prove that we must satisfy a pseudo-polynomial number of con- straints in order to get the exact WCRT. Further, the number of jobs to be veri- fied within the level-i busy period is large, when the system utilization U is ar- bitrarily large. These factors make the problem Pseudo-polynomial, and simpler polynomial-time formulations must be used, when we need to use this technique for on-line schedulability tests. 3.3.2 Exact WCRT Analysis Determining the Barely Schedulable Task Set The tight level-i utilization bound Uubi is the utilization of the barely-schedulable task set, given the input Response-time Rubi . It is first defined by Lee et al. [28] as the utilization bound that guarantees schedulability of tasks \u00CF\u00841, . . .\u00CF\u0084i. The objective function of the LP is therefore given by i \u00E2\u0088\u0091 j=1 e j Pj =Uubi (3.7) The level-i utilization bound has an interesting property, which allows us to restrict the number of LPs required to obtain Uub to 1, instead of i, when there 21 are i tasks in the system. It is shown by Lee et al. [28] that the system-level utilization bound Uub as the smallest value among all of the level-k utilization bounds, where k= 1, . . . , i utilization. The following lemma determines the system- level utilization bound, under a special condition. Lemma 1. If the Response-time of the task \u00CF\u0084i yields a level-i utilization bound Uubi , the following condition holds: Uubj \u00E2\u0089\u00A4Uubi , \u00E2\u0088\u0080 Rubj \u00E2\u0089\u00A4 Rubi (3.8) Proof. Let us consider j = i\u00E2\u0088\u0092 1, be a next highest priority task to \u00CF\u0084i. The level- (i-1) comfortably-schedulable Response-time bound is Rubi\u00E2\u0088\u00921, and the corresponding utilization bound is Uubi\u00E2\u0088\u00921. Since \u00CF\u0084i\u00E2\u0088\u00921 has higher priority than \u00CF\u0084i, the period Pi\u00E2\u0088\u00921\u00E2\u0089\u00A4Pi. The processor time demand functions for the two tasks are given as follows: W (ti) :\u00E2\u0088\u0091 j\u00E2\u0089\u00A4i e j \u00E2\u008C\u0088 ti Pj \u00E2\u008C\u0089 \u00E2\u0089\u00A5 ti W (ti\u00E2\u0088\u00921) :\u00E2\u0088\u0091 j Rubi i \u00E2\u0088\u0091 j=1 \u00E2\u008C\u0088Rubi Pj \u00E2\u008C\u0089 e j + ei = Rubi +\u00CE\u00B4 j Let us consider a new task set, that is barely schedulable \u00CE\u0093\u00E2\u0080\u00B2 with processor time demand at Rubi is R ub i . This can be created by rewriting from the task set, we arrive at a new task set \u00CE\u0093\u00E2\u0080\u00B2, with e\u00E2\u0080\u00B2j = e j\u00E2\u0088\u0092 \u00CE\u00B4 j\u00E2\u008C\u0088 Rubi Pj \u00E2\u008C\u0089as follows. ( \u00E2\u008C\u0088Rubi Pj \u00E2\u008C\u0089 e j\u00E2\u0088\u0092\u00CE\u00B4 )+ i \u00E2\u0088\u0091 k 6= j,k=1 \u00E2\u008C\u0088Rubi Pk \u00E2\u008C\u0089 ek + ei = Rubi ( \u00E2\u008C\u0088Rubi Pj \u00E2\u008C\u0089 e\u00E2\u0080\u00B2j)+ i \u00E2\u0088\u0091 k 6= j,k=1 \u00E2\u008C\u0088Rubi Pk \u00E2\u008C\u0089 ek + ei = Rubi When there is an overflow of \u00CE\u00B4 j, the overall utilization is reduced by \u00CE\u00B4 j/Pj, because the overflow is eliminated. In order to fill up the idle time, more computa- tion can be added, which is equal to bPi/Pjc\u00CE\u00B4 j, and correspondingly, the utilization is increased by(bPi/Pjc\u00CE\u00B4 j)/Pj, which is less than \u00CE\u00B4 j/Pj. Therefore, the utilization 24 of the task set \u00CE\u0093\u00E2\u0080\u00B2 is less than that of \u00CE\u0093. We can conclude that, when we use the sufficient condition for schedulability defined by the Equation (3.10) in a LP to find Uub,mini , will return a barely-schedulable task set. From the Figure 3.2 (a), we can see that, if we only check for completion of task \u00CF\u00843 at time 8, the test will pass, but still have idle times in between 0 and 8. This will make the WCRT to be less than 8, and result in the insufficiency of RTA. The LP using only the constraint from Equation (3.5) will give the minimum uti- \u00CF\u00841 \u00CF\u00842 \u00CF\u00843 t 1 2 3 4 5 6 7 8 9 \u00CF\u00841 \u00CF\u00842 \u00CF\u00843 t 1 2 3 4 5 6 7 8 9 Processor Idle Time Processor Time Demand Overflow Figure 3.2: Occurrence of (a)Processor Idle Time, with Sufficient Test for Schedulability, and (b) Processor Time Demand Overflow with Neces- sary Tests for Schedulability lization bound Uub,mini , while the LP using all the constraints from Equations (3.5) and (3.6) will give rise to the maximum possible utilization bound Uubi . This result is important, because the number of constraints in Equation (3.6) is pseudo- polynomial with respect to the number tasks. Therefore, in order to provide the worst-case bounds on the Uubi or correspondingly the WCRT, the U ub,min i is helpful. Necessary and Sufficient Conditions for Utilization Bound We now provide the LP using all the constraints specified in (3.9) that gives the barely-schedulable task set, and the corresponding upper bound on utilization Uubi . In addition to having the processor complete all its computation requirements by Rubi , it must also be fully utilized until the time t < R ub i . The condition is given 25 by Equation (3.6). For example, if we only consider \u00E2\u0088\u0091 j t, \u00E2\u0088\u0080t \u00E2\u0088\u0088 Si, in Figure 3.2 (b), we will not be able to detect the overflow, thereby failing the schedulability test. Therefore, in order to show that the processor demand meets line f (t) = t only at time t = Rubi , we must use both the conditions and have an exact schedulability test. However, the LP formulation can be relaxed to accept anything in between the sufficient and exact test. This means that, the utilization bound obtained with a set of constraints, can be within the maximum limit of Uubi when the exact schedulability test is used, or Uub,mini , when only the sufficient con- straints are used. The corresponding WCRT will be greater when only the sufficient constraints are used, as opposed to all the constraints. This is given by the follow- ing theorem. Theorem 3. The solution of following LP will always be in the limit [Uubi ,U ub,min i ] Minimize \u00E2\u0088\u0091 j\u00E2\u0089\u00A4i e j Pj Sub ject to \u00E2\u0088\u0091 j 5 P1 2e1+ e2+ e3 > 10 2\u00E2\u0088\u0097P1 3e1+ e2+ e3 > 14 P2 3e1+2e2+ e3 > 15 3\u00E2\u0088\u0097P1 4e1+2e2+ e3 > 20 4\u00E2\u0088\u0097P1 5e1+2e2+ e3 > 25 5\u00E2\u0088\u0097P1 6e1+2e2+ e3 > 27 P3 6e1+2e2+ e3 > 28 2\u00E2\u0088\u0097P2 6e1+3e2+ e3 > 30 6\u00E2\u0088\u0097P1 7e1+3e2+ e3 = 31 Rub4 \u00EF\u00A3\u00BC\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00BD\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00B4\u00EF\u00A3\u00BE (3.12) where } denotes logical AND operation, making all the conditions necessary to prove that Rub4 as the Response-time of \u00CF\u00844. Further, the task with least priority \u00CF\u00844 is schedulable under RMS if and only if Rub4 \u00E2\u0089\u00A4 D4. 27 Tightness of the Estimated Utilization Bound Following the proof of Lemma 2, we can say that any LP that uses the suffi- cient condition in Equation (3.5), and one or more necessary conditions from the Equation (3.6), will give a utilization upper bound Uub,approxi , which is within the range [Uub,mini ,U ub i ]. Figure 3.3 shows the plot of Utilization Bounds against Response-Time. Utilizations are obtained using the LPs that used (1) only the suffi- cient conditions for schedulability, and (2) both sufficient and necessary conditions for schedulability. 0.4 0.5 0.6 0.7 0.8 0.9 1 0.45 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 U \u00E2\u0088\u0086 = Ri ub/Ti Sufficient and Necessary Conditions Sufficient Conditions Preemption Points Si Ri ub Ui ub,min = 0.61 Ui ub = 0.70 Figure 3.3: Comparison of Sufficient and Exact Test for Schedulability 28 Finding the Exact WCRT The formulation of our LP problem guarantees a one-to-one correspondence be- tween the utilization bound and response time bound. For instance, a task set which is barely schedulable under RMS has a utilization bound Uubi , and has as Response-time Rubi . Given the maximum possible utilization for the system U, the system designer should be able to obtain the corresponding WCRT bound. There- fore, in order to obtain the R among all possible jobs having utilization U, we must search the solution space the LP. The bounds of the space can be as large as the hyperperiod of the tasks in the system, i.e. (0,LCM(\u00CF\u00841, . . . ,\u00CF\u0084i)]. Since this is a huge space to search from, we narrow down the scope to an smaller space, corresponding to a smaller number of points from Si. Once we narrow down the scope to these points, the solution close to R can be obtained by further search. If we use Binary search, the algorithm converges in polynomial-time, with a complexity of \u00CE\u0098(log 1\u00CE\u00B5 ), for an approximation of (U)\u00C2\u00B1 \u00CE\u00B5 . We show, by brute-force, how the WCRT corresponding to a Utilization bound can be determined. Table 3.2 shows the relative response time bound and utiliza- tion bounds for a two task set P1 = 46,P2 = 65. We calculated the utilization bounds for each time instant in the sample range [46,92] using the LP model discussed in Section 3.3.2. If we have a task set \u00CE\u0093 with utilization U = 0.863, then we have to find the closest approximation Rubi = 71, which corresponds to the U ub = 0.8665, such that U\u00E2\u0089\u00A4Uub. Here, Rub will give the WCRT bound for the any task set, with utilization. The advantage of searching the space of LP solutions is that, it provides only the barely-schedulable task sets, and the solution is piece-wise linear, between the time instants marked by the preemption points Si. 3.3.3 Off-line Schedulability Test with Exact WCRT We now provide an off-line schedulability test using the exact WCRT obtained by incrementally searching the LP solutions on the space of Si. Since we do not have any limitations with respect to the complexity of computation, we give the Exact and Optimal LP formulation. This is acceptable for off-line schedulability tests, when the system designer is only cares about the WCRT of the GTM, rather than 29 Rub2 U ub 2 R ub 2 U ub 2 R ub 2 U ub 2 46 0.707692 62 0.809365 78 0.911037 47 0.714047 63 0.815719 79 0.917391 48 0.720401 64 0.822074 80 0.923746 49 0.726756 65 0.828428 81 0.9301 50 0.73311 66 0.834783 82 0.936455 51 0.739465 67 0.841137 83 0.942809 52 0.745819 68 0.847492 84 0.949164 53 0.752174 69 0.853846 85 0.955518 54 0.758528 70 0.860201 86 0.961873 55 0.764883 71 0.866555 87 0.968227 56 0.771237 72 0.87291 88 0.974582 57 0.777592 73 0.879264 89 0.980936 58 0.783946 74 0.885619 90 0.987291 59 0.790301 75 0.891973 91 0.993645 60 0.796656 76 0.898328 92 1 61 0.80301 77 0.904682 Table 3.2: Utilization and Response-time Bounds for a Barely Schedulable Task Set the amount of time it takes to compute. Once we are sufficiently close to the solution, we use Binary Search between the two adjacent points on Si, that yields the utilization bound. We also provide the complexity analysis of the algorithm, and show how we can improve upon it. The LP formulation to find the Exact WCRT bound involves all the constraints mentioned in Section 3.3.1, because an exact schedulability test encompasses suf- ficient and necessary tests. We also account for level-i busy period, when the input Rubi to the LP is larger than Pi. This is because, if the Response-time is greater than the period, it must be the largest of all the Response-times in the level-i busy period. Since we are performing an Off-line test, we can determine the WCRT within the busy period without any problem. The solution to the LP is the level-i barely schedulable task set, and the WCRT, FT corresponding to that task set can be easily obtained using the Equation (2.5), for each job j = 1, . . . until the busy period ends. If FT is 30 within the input Response-time Rubi of the LP formulation, then we can exit the algorithm. Again, this algorithm extends until the end of level-i busy period, and is of pseudo-polynomial time complexity. It is only acceptable when it is an off-line schedulability test, and approximation schemes are to be followed to approximate the solution in the case of on-line schedulability test. 31 Algorithm 1 SearchExact(U,) Result: Find the WCRT of the system Data: P\u00E2\u0086\u0090 // Task Periods 1 \u00CE\u00A6\u00E2\u0086\u0090<> 2 while (1) do 3 i\u00E2\u0086\u0090 1 4 for ( j\u00E2\u0086\u0090 1 to n) do 5 if (i%Pj is 0) then 6 Sn\u00E2\u0086\u0090 Sn\u00E2\u0088\u00AA{i} 7 id\u00E2\u0086\u0090 Sn.last() // id forms the Rubn input for LP 8 Break // To add redundant constraints only once // Call LP to find Uub 9 Uubn \u00E2\u0086\u0090 FindUtilBound(P,Sn, id) 10 if (Uubn \u00E2\u0089\u00A5 U) then 11 Rubn \u00E2\u0086\u0090 BSearch(U,P,Sn, i\u00E2\u0088\u00921, i,(1+\u00CE\u00B5)) // Call Search Routine 12 return Rubn // Found the WCRT // When Rubn is not found, continues here 13 if (i > Pn) then // If the Response-time exceeds Pn, find end of level-n busy period 14 FT \u00E2\u0086\u0090 LevelNPeriodEnd(E,P) 15 if (FT \u00E2\u0089\u00A4 Pn) then 16 Rubn \u00E2\u0086\u0090 BSearch(U,P,Sn, i \u00E2\u0088\u0092 1, i,(1 + \u00CE\u00B5)) // Call Search Routine 17 return Rubn // Found the WCRT 18 else 19 Continue // When Rubn is not found, increment i 20 i\u00E2\u0086\u0090 i+1 32 Searching for Optimal WCRT with Binary Search We have calls to a BSearch routine( 2) upon finding a WCRT sufficiently close to the solution. It helps narrowing down the WCRT between the two recent points of time i\u00E2\u0088\u0092 1, i. The worst-case running time of the Binary Search algorithm can be estimated in terms of the accuracy factor \u00CE\u00B5 , and the size of the search space [t1, t2]. But we first need to establish the fact that we have a continuous function between those two points. Therefore, we show that the function of Uub has no discontinu- ities in the region [t1, t2], and show the maximum possible distance between t1 and t2, in order to perform binary search. The definition of the worst-case processor demand bound function is important for the proof. Definition 4. For a sub task set \u00CE\u0093 = {\u00CF\u00841,\u00CF\u00842, . . . ,\u00CF\u0084i}, the worst-case processor de- mand bound at time t is given by \u00E2\u0084\u00A6i(t) = i \u00E2\u0088\u0091 j=1 \u00E2\u008C\u0088 t Pj \u00E2\u008C\u0089 e j\u00E2\u0088\u0092 t (3.13) The interpretation of the processor demand bound function is given by Figure 3.4. We can clearly see that the processor usage decreases over a period of time, when the tasks get computed. However, there are discontinuities at the time in- stances which are multiples of the task periods, because newer jobs are introduced in the system. The Response-time of the lowest priority task Ri is the time when the function \u00E2\u0084\u00A6i(t) = 0, or the processor is idle. Lemma 3. BSearch( 2) Algorithm has a worst-case running time of O(log(n/\u00CE\u00B5)), for an accuracy of \u00CE\u00B5 in the range (0,1), n = Rb\u00E2\u0088\u0092Ra, the difference between the minimum and maximum range of search for the algorithm. Proof. We know that the processor demand \u00E2\u0088\u0091ij=1 \u00E2\u008C\u0088 t Pj \u00E2\u008C\u0089 e j > t, until the jobs of tasks \u00CF\u0084 j, j \u00E2\u0089\u00A4 i are completed. It is clear from the Processor Demand Bound in Figure 3.4(a), has local maxima at time when new jobs are introduced in the system, and is idle at Ri. The only discontinuities in the processor demand bound function are at points Si = kPj | \u00E2\u0088\u0080 j \u00E2\u0089\u00A4 i,k = 1,2, . . . . The utilization bound Uubi is found at these time instants, using Equation (3.11) with time t = Rubi as input. We see that the utilization bound function, Uubi has a maxima at the points represented 33 0 5 10 15 20 25 30 35 40 45 0 5 10 15 20 t W i(t) T1 T2 2*T1 2*T2 T3 T4 3*T1 T5 R5(t) 3*T2 10 20 30 40 50 60 70 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 R5 ub = t U i u b 0.55 0.7991 U5 ub 0.725 R5 lb = T5 t (a) (b) Figure 3.4: (a)Worst-case Processor Demand Bound, and (b) Corresponding Utilization Bound, Uubi , for a Barely Schedulable Task Set by Si. Since we check for the utilization bound at Si, we have a function that is continuous in the interval (ta, ta+1],\u00E2\u0088\u0080t \u00E2\u0088\u0088 Si. The open interval after time instant in Si can be neglected while performing the binary search, since we are looking for the first occurrence of U, and subsequent occurrence smaller than any occurrence can be neglected. Figure 3.4 (b) shows the utilization bound function for the same task set. The minimum and maximum range for the Binary Search algorithm is therefore continuous [Ra,Rb], and the binary search will run in a time polynomial with respect to n and \u00CE\u00B5 . The complexity of computing the Response-Time bounds by searching for the sub-optimal solution using binary search is O(log( t\u00CE\u00B5 )), where t is the distance be- tween two points ta and ta+1. We showed, by Lemma 3 that the maximum distance between those two points can be the distance between the consecutive points in Si, where the discontinuities in the processor demand occurs. 34 Algorithm 2 Helper Functions for Exact WCRT Estimation // Binary Search to return Response-time bound 1 Procedure BSearch(U,P,Sn,min,max,\u00CE\u00B4 ): 2 if (min\u00E2\u0089\u00A4 max) then 3 Rubn \u00E2\u0086\u0090 (min+max)/2 4 Uubn \u00E2\u0086\u0090 FindUtilBound(P,Sn,Rubn ) 5 if (Uubn \u00E2\u0089\u00A5 U+\u00CE\u00B4 ) then 6 max\u00E2\u0086\u0090 BSearch(U,P,Sn,Rubn ,max,\u00CE\u00B4 ) 7 else if (Uubn \u00E2\u0089\u00A4 U\u00E2\u0088\u0092\u00CE\u00B4 ) then 8 min\u00E2\u0086\u0090 BSearch(U,P,Sn,min,Rubn ,\u00CE\u00B4 ) 9 else return Rubn 10 return 0 // Find the time when Level-n Busy Period ends 11 Procedure LevelNPeriodEnd(E,P): 12 t\u00E2\u0086\u0090 Pn 13 while 1 do 14 j\u00E2\u0086\u0090 bt/Pnc 15 CT \u00E2\u0086\u0090 n \u00E2\u0088\u0091 k=1 \u00E2\u008C\u0088 t Pk \u00E2\u008C\u0089 ek // Completion time of job j 16 if CT \u00E2\u0089\u00A4 t then 17 FT \u00E2\u0086\u0090CT \u00E2\u0088\u0092 ( j\u00E2\u0088\u00921)Pn 18 return FT // End of busy period for the task (E,P) 19 t\u00E2\u0086\u0090 t+Pn The search space of BSearch 2 is shown in Figure 3.5, for the same five-task set. Let us start with the preliminary solution for Uubi at the two points ta and ta+1 of time, using Rubi = ta and R ub i = ta+1 respectively. We continue searching for the solution, until we reach a sufficiently accurate solution U\u00C2\u00B1 \u00CE\u00B5 using BSearch. The BSearch module has a polynomial running time in both t and \u00CE\u00B5 , and therefore, has a very fast convergence rate. 35 The LP formulation in the algorithm makes use of job number as well, in order 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 U \u00E2\u0088\u0086 = R iu b /T i Linear Program solution Si Scheduling Points Uub (t a )UinputUub (ta\u00E2\u0088\u00921) Ri ub (t a\u00E2\u0088\u00921) Ri ub (t a ) Ri ub (exact) Figure 3.5: Space of Binary Search to find Rubi to account for all the jobs in the level-i busy period. This is useful for finding the WCRT of task sets when the system utilization is arbitrarily high. Particularly for systems with high computing power, and when off-line scheduling is possible, this formulation is ideal. 36 Algorithm 3 LP Formulation to find Utilization Bound // Returns minimum utilization Procedure FindUtilBound(P,Sn,Rubn ): Minimize n \u00E2\u0088\u0091 i=1 ei Pi subject to n\u00E2\u0088\u00921 \u00E2\u0088\u0091 k=1 \u00E2\u008C\u0088 j Pk \u00E2\u008C\u0089 ek + \u00E2\u008C\u008A j cn \u00E2\u008C\u008B en \u00E2\u0089\u00A5 j | \u00E2\u0088\u0080 j \u00E2\u0088\u0088 Sn, and n\u00E2\u0088\u00921 \u00E2\u0088\u0091 k=1 \u00E2\u008C\u0088Rubn Pk \u00E2\u008C\u0089 ek + \u00E2\u008C\u008A j cn \u00E2\u008C\u008B en = Rubn return Uubn 3.4 Reducing the Complexity of WCRT Analysis So far we have discussed about the Exact test for finding the Utilization upper bound Uubi , and a binary search technique to find the WCRT from the space of LP solutions. This test is very accurate, and provides the absolute WCRT of a task set, given a utilization bound U. The test is often done off-line, i.e. prior to the execution of the tasks. However, for applications with varying computation requirements, an approximate algorithm is essential. We first describe a polynomial-time algorithm that will result in a sub-optimal solution of the WCRT. The accuracy of the algorithm can be bounded from below with the sufficient test, defined already. Therefore, the approximation schemes will be better than this worst-case scenario. We evaluate the performance of the algo- rithm with other comparable bounds using empirical studies. Finally, we provide insights into improving the accuracy of the algorithm by using a near-optimal al- gorithm based on the work related to pruning the constraints of schedulability test, by Bini and Buttazzo [8]. The running time of the ideal search algorithm to obtain Rubi is pseudo-polynomial, and it is acceptable for polynomial-time algorithms that delivers near optimal re- sults. The factors responsible for making the algorithm pseudo-polynomial are: 37 1. Pseudo-polynomial nature of the problem, where the problem size is depen- dent on the value of the inputs. 2. The overhead of LP to find the bounds, and 3. The running time of the search algorithm to find the correct solution. Since we need to search the solution space of several LPs to identify the WCRT, it is hard to quantify it mathematically. Therefore, we provide a simple polynomial- time heuristic to minimize the complexity of the problem. By modifying the im- plementation of the search algorithm to identify Rubi , we can obtain a sub-optimal solution running in time polynomial to the number of tasks. Depending upon the accuracy of the output expected, the polynomial-time search approximation is be implemented using approximation factor \u00CE\u00B4 = (1+ \u00CE\u00B52N ), for a task set with N tasks. Also, the number of constraints for the LP, given by the array Sn makes the algo- rithm run in pseudo-polynomial time. Therefore, we can reduce them to one per task, when an approximation factor \u00CE\u00B3 = 0,1 is set to 1. In order to use the search algorithm to use all the constraints, we set \u00CE\u00B3 = 0. 3.4.1 Approximated WCRT Analysis Formulation The two important factors that contributes to the hardness of the problem formu- lation are the number of constraints required for the LP formulation, and the total search space of all jobs that must be checked in the level-i busy period. We reduce the number of jobs to be checked to a polynomial number by using the approximate request bound function, first proposed by Fisher and Baruah [20]. It is given by \u00CE\u00B4 (\u00CF\u0084i, t) = { \u00E2\u008C\u0088 t Pi \u00E2\u008C\u0089 for t \u00E2\u0089\u00A4 (k\u00E2\u0088\u00921)Pi (t+Pi)Ui otherwise (3.14) where k = d1/\u00CE\u00B5e\u00E2\u0088\u00921, for an approximation factor 0 < \u00CE\u00B5 < 1. The approximate cumulative processor demand function is given for each job \u00CF\u0084i, l as Wi,l = lei+\u00E2\u0088\u0091 j,\u00CE\u00B5) P\u00E2\u0086\u0090 Jn\u00E2\u0086\u0090 AcquireSearchRange(P,Pn,\u00CE\u00B5) Sn\u00E2\u0086\u0090 AcquireReducedConstraints(P,Pn) for i \u00E2\u0088\u0088 Jn do Uubn \u00E2\u0086\u0090 FindUtilBoundP,Sn, i if Uubn \u00E2\u0089\u00A5 U then Rubn \u00E2\u0086\u0090 BSearch(U, ,P,Sn, i\u00E2\u0088\u00921, i,(1+ \u00CE\u00B5)) Procedure AcquireSearchRange(P, t,\u00CE\u00B5): Jn\u00E2\u0086\u0090<> k\u00E2\u0086\u0090 d1/\u00CE\u00B5e\u00E2\u0088\u00921 for j\u00E2\u0086\u0090 1 to k - 1 do Jn\u00E2\u0086\u0090 Jn\u00E2\u0088\u00AA{ j.Pn} return Jn In order to provide a tighter bound, close to the exact Response-time bound provided by Lehoczky et al. [32], we modify the algorithm to make use of an optimal subset of the total constraints possible. Prior work in analyzing the space of RMS schedulability shows that, for the entire list of scheduling points, Sn, there exists a reduced scheduling point set Pn \u00E2\u008A\u0086 Sn [8, 36]. It is formally stated as: The reduced scheduling set Pi is defined by the recurrent expression: P1(t) = t (3.16) Pi(t) = Pi\u00E2\u0088\u00921 (\u00E2\u008C\u008A t Pi \u00E2\u008C\u008B Pi ) \u00E2\u0088\u00AAPi\u00E2\u0088\u00921(t) (3.17) Despite being complex, the equation is inherently simple. In a system of n tasks, 39 by simply including the scheduling points that lie immediately before a schedul- ing of a low priority task, we can fill the list Pn. This means that the scheduling points P j depends on Pi, j < i. We make use of the reduced set of constraints for our LP formulation to obtain a tighter bound for Response-Time Rub. Again, this formulation doesn\u00E2\u0080\u0099t provide necessary and sufficient conditions for schedulability, as discussed earlier. We illustrate it with a simple example from Table 3.1. The exhaustive set of constraints are given by S4 = {5,10,14,15,20,25,27,28,30,31}. By recur- sively calculating the constraints for higher priority tasks \u00CF\u0084 j, with the knowledge of constraints for the lower priority tasks \u00CF\u0084i, j < i, we can find out the set P4 = {10,14,25,27,28,30,31}. The algorithm to determine the reduced set has a com- plexity of O(n), and therefore is polynomial in the number of tasks n. 5 10 15 20 25 30 35 14 28 27 35 0 t P1 = 5 P2 = 14 P3 = 27 P4 = 35 R 4 = 31 P = {10,14,25,27,28,30,31} Figure 3.6: Timeline of P j for the Example in Table 3.1 The only modification to the SearchApprox ( 4) is in the procedure Acquir- eReducedConstraints. The procedure AcquireReducedConstraints( 5) eliminates the need for a \u00CE\u00B3 option to choose minimal constraints or not. The algorithm recur- sively find the constraints of higher priority tasks, and runs in polynomial time. It is therefore ideal to use the method to determine the Response-time bound. 40 Algorithm 5 Procedure to Acquire Reduced Number of Constraints for the LP Procedure AcquireReducedConstraints(P, t): Data: Pn\u00E2\u0086\u0090< t > 1 for i\u00E2\u0086\u0090 n - 1 to 1 do 2 Li\u00E2\u0086\u0090<> // Li is the list of constraints for task \u00CF\u0084i 3 for j in Pi+1 do 4 Li\u00E2\u0086\u0090 AddConstraints(i, j,P) 5 Pn\u00E2\u0086\u0090 Pn\u00E2\u0088\u00AA{Li} 6 return Pn // Adds constraints for task \u00CF\u0084k 7 Procedure AddConstraints(k, time,P): 8 Constraint = \u00E2\u008C\u008A time Pk \u00E2\u008C\u008B Pk 9 return Constraint 3.4.2 Empirical Evaluation of Approximated WCRT Analysis The tightness of the algorithm can be verified with respect to conventional ap- proaches to finding Response-times with known task computation times [18, 38, 45, 48]. For a set of given task periods, we pick computation times that are uni- formly distributed and find the Response-time bound for all the samples and com- pare it with the Response-time bound obtained using SearchApprox( 4). Though the approach of generating synthetic task sets doesn\u00E2\u0080\u0099t reflect the application\u00E2\u0080\u0099s char- acteristics, it is ideal to test the average case performance of a RTA technique, given the input utilization bound U. Several other algorithms to obtain Uniform Distri- bution of task utilizations are available, some of which are listed in [9]. We use UUnifast [7] for its simplicity in formulation and adaptability for larger task sets. We use UUnifast to obtain the task sets, and determine the Response-time bounds using other RTA. Rub, obtained by procedure SearchApprox ( 4) is obtained for the Task periods assumed, for comparison. We calculate the exact formulation of Response-time bounds using the formulation given by Lehoczky et al. [32] to verify if the value is less than Rub. A well known bound that could be computed 41 efficiently, known as the Fluid-flow model, proposed by [48] is also evaluated. The formulation of Response-time bound using Fluid-flow model is by expanding the ceiling function in the exact characterization as follows: Ri = ei+ \u00E2\u0088\u0091 j\u00E2\u0088\u0088hp(i) \u00E2\u008C\u0088Ri Pj \u00E2\u008C\u0089 e j Ri \u00E2\u0089\u00A4 ei+ \u00E2\u0088\u0091 j\u00E2\u0088\u0088hp(i) (Ri Pj +1 ) e j Ri \u00E2\u0089\u00A4 \u00E2\u0088\u0091 j\u00E2\u0088\u0088hp(i) e j 1\u00E2\u0088\u0092\u00E2\u0088\u0091 j\u00E2\u0088\u0088hp(i) e jPj = R f luidi (3.18) Despite its simplicity, the Fluid-flow model has a very bad worst-case perfor- mance, which we portray in the experiment. Therefore, we also compareRub with another computationally efficient method proposed by Bini et al. [11], which has a bounded worst-case performance. The related works that are used in the evalu- ation, and the Response-time bound formulation, are listed in Table 3.3. Figure Cited article Response-time Model Rub [32] Exact Characterization Rubi = ei+ \u00E2\u0088\u0091 j\u00E2\u0088\u0088hp(i) \u00E2\u008C\u0088Rubi Pj \u00E2\u008C\u0089 e j [48] Fluid-flow Model R f luidi = \u00E2\u0088\u0091 j\u00E2\u0088\u0088hp(i) e j 1\u00E2\u0088\u0092 \u00E2\u0088\u0091 j\u00E2\u0088\u0088hp(i) U j [11] Linear-approximation Model Rboundi = ci+\u00E2\u0088\u0091 j\u00E2\u0088\u0088hp(i) e j(1\u00E2\u0088\u0092U j) 1\u00E2\u0088\u0092 \u00E2\u0088\u0091 j\u00E2\u0088\u0088hp(i) U j This LP based work Optimization Table 3.3: Comparison of Tight Upper Bounds on Response-time 3.7 shows the comparison of various techniques to obtain Response-time bounds that we discussed against our bound. We show only a small range of samples, of 42 the whole experiment for clarity. It is found that the Response-time bound using the algorithm specified in our work is close to the asymptotic value of the exact characterization. This gives us confidence that the upper bound is a sufficient value to determine task schedulability. 0 100 200 300 400 500 600 700 800 900 0 20 40 60 80 100 120 140 Samples R il b Comparison of Response Time Bounds with U = 0.84 Exact Characterization Baruah et al. Approx This Work Fluid Flow Model ( ) Figure 3.7: Empirical evaluation of Approximated WCRT Analysis 3.4.3 Discussion Our experimental analysis shows that the WCRT bound obtained using LP ap- proach is a useful number to determine the system characteristics, allowing the system designers to assign a fraction of the resource, and still be feasible. In ap- plications like Web services, where we assume a stream of requests arriving at a shared resource, WCRT bound , and thereby loss of QOS guarantees can be es- timated using our approach. The LP accepts a minimum and maximum solution 43 for the task computation times, which can be represented as mini \u00E2\u0089\u00A4 ei \u00E2\u0089\u00A4 emaxi , for a task \u00CF\u0084i. By enforcing the constraints of the computation times, the LP will give a feasible Response-time bound that suits GTM. There are studies discussing utiliza- tion bounds as a feasibility test for similar applications such as MF, indicated by Lee et al. [28]. This ensures that there are other applications in which the WCRT Analysis can be used for. An on-line scheduling algorithm is the only option in a system whose future workload is unpredictable, because it can accommodate dynamic variations in user demands and resource availability. The benefit of having a single metric to detect the worst-case Response-time of a group of requests in a system is itself useful for various applications with dynamic application requirements. The lack of knowl- edge of the workload is not an issue in the case of GTM. On-line Schedulability tests on the approximate WCRT Analysis, shown in our work is important to deter- mine the system load whenever a new request is received. Even though fast schedulability tests using Liu and Layland bound [33], or Hyperbolic Bound [10] are available, they are pessimistic and unsuitable for sys- tems processing a large number of incoming requests. We proposed a new WCRT Analysis technique for a system of tasks, when the system does not have infor- mation about the workload. We showed that the complexity of the technique is pseudo-polynomial, and proposed approximation techniques to minimize the com- putational complexity. The proposed approximation involves some heuristics to reduce the number of parameters involved in the computation of WCRT, to make it suitable for On-line scheduling test. Approximate WCRT is comparable to the average performance of the best known Response-time Bound analysis techniques. The major benefit of the model is that it is applicable for a GTM, and predicts the worst-case scenario of the system, when large number of requests are received. Any scheduling policy built on top of the idea will help the system administrator make responsible decisions for handling overload in real-time systems. 44 Chapter 4 Scheduling to Improve Rewards during Overload 4.1 Related Work In the context of soft real-time systems, where real-time jobs can be executed with some flexibility, many techniques have been presented for maximizing a utility function subject to schedulability constraints. While Buttazzo, et al. [13] provide a detailed exposition on soft real-time systems, some approaches that are more closely related to the work described in this work involve the imprecise compu- tation [17] and the Increased Reward with Increased Service (IRIS) [19] task models. In these models, a real-time job is split into a mandatory portion and an optional portion. The mandatory portion provides the basic (minimal) quality of service needed by a task; the mandatory portion has to be completed before the job\u00E2\u0080\u0099s deadline. The optional part can be executed if the system has spare capacity, but it too must be completed before the job\u00E2\u0080\u0099s deadline. The optional portion results in a reward, and the longer the optional portion can execute the greater is the reward garnered. The reward for executing the optional portion is described using a func- tion of the extent to which the option portion is executed. Along these lines, Aydin, et al. presented techniques for optimal reward based scheduling for periodic real- time tasks [3]. Other techniques for maximizing utility (which can be considered as revenue/rewards) include the use of linear and non-linear optimization [47], and 45 heuristic resource allocation techniques such as QoS-based Resource Allocation Model (QRAM) [43, 44]. Our work is distinct from the imprecise computation model or the IRIS model because jobs in our task model do not have a mandatory or an optional portion. Further, a fixed revenue accrues with each job completion and this is unlike prior work we have highlighted where the reward is a function of the optional portion. Overload in real-time systems has also received attention. Baruah and Haritsa described the ROBUST scheduling policy for handling overload [6]. Baruah and Haritsa used the effective processor utilization as a measure of the \u00E2\u0080\u009Cgoodness\u00E2\u0080\u009D of a scheduling policy. The Effective Processor Utilization (EPU) is the fraction of time during an overload that the system executes tasks that complete by their deadlines. When the EPU is used as a metric for measuring the performance of a scheduling policy the task model is a special case of scheduling to improve rewards: in this model the reward for a job completion is equal to the execution time of the job. The task model studied by Baruah and Haritsa made no assumptions about the arrival rates of jobs. Each job was characterized by its arrival times, its execution time and its deadline. The ROBUST scheduler is an optimal online scheduler among schedulers with no knowledge of future arrivals. Baruah, et al. established that no online scheduler is guaranteed to achieve an EPU greater than 0.25 [4]. When the value of a job need not be related to the execution length, Baruah, et al. [5] provided a general result that the competitive ratio for an online scheduling policy cannot be guaranteed to be better than 1 ( \u00E2\u0088\u009A k+1)2 where k is the ratio of the largest to smallest value density among jobs to be scheduled. The value density of a job is its value-to-execution length ratio. For systems where a job\u00E2\u0080\u0099s value need not be directly related to its execution length, Koren and Shasha developed the Dover online scheduling policy [24], which provides the best possible competitive ratio relative to an off-line (or clairvoyant) scheduling policy. Koren and Shasha also developed the Skipover scheduling ap- proach [25] to deal with task sets where certain jobs can be skipped to ensure schedulability at the cost of lower quality of service. While Skipover was devel- oped as a mechanism for dealing with overload, it is not suited to the application scenarios we have described earlier. Hajek studied another special case when all jobs are unit length and concluded 46 that the competitive ratio for online scheduling of such jobs lies in the interval [0.5,\u00CF\u0086 ] where \u00CF\u0086 = \u00E2\u0088\u009A 5\u00E2\u0088\u00921 2 \u00E2\u0089\u0088 0.618, the inverse of the golden ratio [21]. Competitive analysis of scheduling policies provides us good insight into the behavior of different policies but does not address all issues. The job arrival pattern that leads to poor performance of a policy \u00CE\u00B6 may be extremely rare in real systems. Additionally, two online algorithms with the same competitive ratio might have sig- nificantly varied performance in practice. Koutsoupias and Papadimitriou discuss the limitations of competitive analysis and suggest some refinements that could make problem formulation more realistic [26]. The limitations of competitive anal- ysis have spurred investigations into several heuristics that offer good performance in most settings. For example, Buttazzo, et al. have described experiences with robust versions of the earliest deadline first algorithm [14, 15]. With regard to prior work on handling overload in real-time systems, we study a general revenue model where the revenue earned on completing a job need not be related to the execution time of the job. Moreover, we propose a scheduling policy that has limited awareness of the characteristics of the workload. While in prior work ([4, 6, 14, 15, 24]) no assumptions were made about future job arrivals, we use estimates of arrival rates to make better decisions. Such information can easily be measured, or specified, in a system, and is often described in the service level agreements between service providers and customers. This information is, therefore, not unreasonable to expect for the class of systems that we are interested in. Furthermore, Stankovic, et al. [49] have stressed the need to incorporate more information about the workload. Writing about competitive analysis for overload scheduling ([49], p. 17) they note that \u00E2\u0080\u009CMore work is needed to derive other bounds based on more knowledge of the task set.\u00E2\u0080\u009D Although our work does not lead to deriving bounds on competitive performance of online scheduling policies, we use information concerning the task streams to develop a scheduling policy to improve revenues in the presence of overload. Lam, et al. [27] have presented a scheme that uses faster processors to handle overload. We have proposed a scheme that is suited to situations where extra re- sources may not easily be available, or cannot be deployed quickly, to ameliorate overload. Finally, we note that we use stochastic models for soft real-time systems. Real- 47 time queueing theory [30] deals with probabilistic guarantees for real-time systems but Real-time Queuing Theory (RTQT) does not provide tools either for analyzing overload conditions or for maximizing rewards in a real-time system. 4.2 System and Task Model The system and task model that we consider is that of n streams, {S1, . . . ,Sn}, with preemptible jobs; all jobs are executed on a uniprocessor system. Within a particular stream Si jobs arrive with a mean inter-arrival time Pi; the inter-arrival times are governed by a Poisson process with rate ri = 1Pi . 1 The execution time of each job may also vary; for stream Si we consider the execution time of jobs to be governed by an exponential distribution with mean ei. Each job also has a deadline; the deadlines for jobs of Si follow an exponential distribution with mean Di. When a job belonging to Si is completed prior to its deadline expiring a fixed revenue of vi(> 0) is earned. We will use the terms revenue, value and reward interchangeably for the rest of this work. In this work, we provide a method for achieving high average revenue over an infinite time horizon. An optimal scheduling policy, \u00CE\u00B6 \u00E2\u0088\u0097, is one that will achieve the supremum V \u00CE\u00B6 \u00E2\u0088\u0097 = limsup t\u00E2\u0086\u0092\u00E2\u0088\u009E {V \u00CE\u00B6 (t) t+1 } where V \u00CE\u00B6 (t) is the revenue obtained using policy \u00CE\u00B6 over the interval [0, t). The scheduling policies of interest are non-idling, or work conserving, policies that make decisions whenever the state of the system changes: when a new job arrives, when a job finishes, or when a deadline expires. This model also generalizes the traditional periodic task model studied by Liu and Layland. No relationship need exist between the deadlines and the rates of the tasks. 4.3 Identifying a Good Scheduling Policy Before we develop some intuition regarding scheduling policies that optimize the average revenue earned over a long run of the system, we note that this discussion 1The inter-arrival times correspond to peak workload. 48 is particularly relevant for overloaded systems, i.e., for systems where \u00E2\u0088\u0091ni=1 ei Pi = \u00E2\u0088\u0091ni=1 eiri > 1. If the system was under-utilized then such a policy is optimal and would generate an average revenue of \u00E2\u0088\u0091ni=1 viri; the earliest deadline first policy, in fact, emulates this allocation when the utilization is \u00E2\u0089\u00A4 1. Whenever the system is not overloaded, we will assume the use of the EDF policy. Notice that a system is guaranteed to meet all deadlines when \u00E2\u0088\u0091i ei/Di \u00E2\u0089\u00A4 1. In general, when a system operates at utilization less than 100% then EDF maximizes the average revenue earned. We shall identify an ideal policy by first determining an optimal static alloca- tion of the processor among the different job streams, and then improving that allo- cation at each decision step. Our first goal is to determine fractional allocations of the processor among the n streams. Essentially we seek a vector f= { f1, f2, . . . , fn} such that fi represents the proportion of processor time allocated to stream Si. In other words, such a static allocation would allocate an fi fraction of each time unit to task stream Si. Although this may be an impractical policy \u00E2\u0080\u0093 because of the ex- cessive context switching overhead \u00E2\u0080\u0093 we shall use this as an initial step to obtaining a more practical policy. 4.3.1 Optimal Fractional Resource Allocation We would like to partition the processor\u00E2\u0080\u0099s efforts among the n streams to optimize the revenue earned. fi represents that long-run fraction of time spent by the pro- cessor servicing jobs of stream Si. When dealing with systems subject to overload, job queue lengths may grow rapidly but the system is kept stable by the fact that jobs have deadlines. We let Li(t) represent the length of the queue of jobs from Si at time instant t. The n queue lengths are stochastic processes that evolve depending on the scheduling policy chosen; further the queue lengths are independent of each other because each queue is guaranteed a fraction of the processor. The queue length Li(t) is, therefore, a simple birth-death process with the rate of arrivals to the queue being ri and the departure rate being fi ei + lDi [influenced by job completions and deadline expirations] when the state of the queue, the queue length, is l. If we use terms that are more common to queueing systems, then the service rate si = 1ei , the deadline 49 miss rate di = 1Di , and the departure rate for the queue length process is fisi+ ldi. Applying some standard results concerning birth-death processes [40], the sta- tionary distribution for Li(t), when stream Si is allotted an fi proportion of the processor, is given by \u00CE\u00A0i(l, fi) = (ri)l \u00E2\u0088\u008Flm=1(si fi+mdi) \u00CE\u00A00(ri,si fi,di), (4.1) where l is the state of queue i and \u00CE\u00A00(ri,si fi,di) = ( \u00E2\u0088\u009E \u00E2\u0088\u0091 l=0 (ri)l \u00E2\u0088\u008Flm=1(si fi+mdi) )\u00E2\u0088\u00921 . (4.2) The average revenue obtained using scheduling policy \u00CE\u00B6f that allocates fi pro- portion of the processor to stream Si is Vf = n \u00E2\u0088\u0091 i=1 visi fi [1\u00E2\u0088\u0092\u00CE\u00A00(ri,si fi,di)] (4.3) and the optimal fractional allocation policy \u00CE\u00B6 \u00E2\u0088\u0097 is that policy that picks the maxi- mizing vector f: V \u00E2\u0088\u0097 = max{Vf| fi \u00E2\u0089\u00A5 0, n \u00E2\u0088\u0091 i=1 fi = 1}. (4.4) We will initially assume that we have obtained the optimal fractional allocation policy and suggest a mechanism to improve on policies that pre-allocate processor shares. We will refer to \u00CE\u00B6 \u00E2\u0088\u0097, the optimal fractional allocation policy, as Fractional Allocation Policy (FAP). Further, we noted earlier that the fractional allocation policy might require each time step to divided among all queues, which might lead to unacceptable overhead. The improvement step will result in a policy that can be applied at every time instant when the state of the system changes, i.e., whenever a new job arrives, or when a job is completed, or when a job misses its deadline. 50 4.3.2 An Improved Policy for Online Job Selection We will improve upon a fractional allocation policy \u00CE\u00B6f by defining a priority index Zi(li) that indicates the priority of a stream when there are li queued jobs belonging to that stream. Then, at any time t when the scheduler needs to make a decision, the scheduler will activate a job from the stream with the highest priority index; thus stream Si will be chosen if and only if Zi(li) = max k {Zk(lk)} . (4.5) A scheduling decision is made whenever the state of any of the queues changes. The approach underlying our improved policy is to assume that at every decision instant a particular job is scheduled and that from the next decision instant policy FAP will be applied; the selection of the job at the first decision instant is based on improving the revenue in comparison to a consistent use of FAP. By applying the improvement step (as dictated by the priority indices) at each decision instant we can obtain consistently better performance than FAP. This approach can be re-stated as follows: \u00E2\u0080\u00A2 If t = 0 is the first decision instant then we will select a job and execute it till the second decision instant. \u00E2\u0080\u00A2 Assume that FAP will be used from the second decision instant. Therefore, pick a job at t = 0 that will lead to an improved revenue when compared with the use of FAP from t = 0. \u00E2\u0080\u00A2 If we treat every decision instant exactly like the first decision instant then the modified policy will consistently outperform FAP. In this work we shall denote the policy that uses the above priority index as Policy Z.We shall now state the main theorem and then proceed to prove this theo- rem. Theorem 4. The scheduling policy that improves upon the fractional allocation policy \u00CE\u00B6f is the policy that chooses to service task stream i when li > 0 and Zi(li) = max k;lk>0 {Zk(lk)} 51 where Zi(li) = visi [ 1\u00E2\u0088\u0092 (si fi)\u00CE\u00A00(ri,si fi,di) (si fi+ lidi)\u00CE\u00A00(ri,si fi+ lidi,di) ] . Understanding the modified policy. The prioritization suggested by the up- dated scheduling policy is greedy. This is expected when scheduling tasks with deadlines. The priorities are based on the highest possible revenue rate (visi). At the same times, the priority attempts to delay those streams that typically have longer deadlines; draining queues that have jobs that can wait would, at later time instant, lead to serving jobs that do not yield high revenues and this is reflected by the zero probability term \u00CE\u00A00(ri,si fi,di). However, if a queue is sufficiently long then we can serve jobs in that queue without worrying about draining that queue and this is reflected by the\u00CE\u00A00(ri,si fi+ lidi,di) term. Also, when deadlines are short the deadline miss rate (di) is high and this is captured by the term (si fi + lidi)\u00E2\u0088\u00921 that boosts the priority of streams with shorter deadlines. Whenever a scheduling decision is to be made, the optimal choice would de- pend on whether executing a job now is better than deferring its execution. The penalty that one may incur by deferring the execution of a job is that the job may miss its deadline thereby resulting in no revenue. We denote the expectation of the revenue earned from Si by applying the fractional allocation policy when the state of queue i is li as Vf,i(li). The priority of each stream can then be computed as Zi(li) = si[vi\u00E2\u0088\u0092{Vf,i(li)\u00E2\u0088\u0092Vf,i(li\u00E2\u0088\u00921)}]. (4.6) Proof outline. In computing the priorities we essentially account for the poten- tial loss in revenue if we defer the execution of a job to a later time instant. The highest priority job is that job that will result in the maximum loss if its execution were to be deferred and its deadline were to expire as a consequence of the deferral. It becomes essential to compute the expected change in revenue, Vf,i(li)\u00E2\u0088\u0092Vf,i(li\u00E2\u0088\u00921) before we can determine the priority of a job. The rest of this section is dedicated to a discussion on how we can recover this quantity. To understand the long-run average reward obtained from a particular class of workload, we consider the evolution of the queue {Li(t), t \u00E2\u0088\u0088 R+} with initial condition Li(0) = li and being awarded a fraction fi of processing time. The queue 52 length will evolve as a birth-death process with birth rate ri and death rate si fi+ ldi at time t with l \u00E2\u0088\u0088 Z+, l = Li(t). A scheduling policy that apportions fractional processing to different job streams is guaranteed an average revenue of fisivi from stream i as long as queue i is never empty. If we have determined the optimal fractional allocations then a scheduling policy can attain high value by not allowing queues to empty: jobs that provide high revenue and have short deadlines may be preferred. We will, therefore, un- derstand the variation in the emptying time of a queue if a job is processed at time instant t or at a later time instant. The remainder of the proof is devoted to identifying the quantity Vf,i(li)\u00E2\u0088\u0092 Vf,i(li\u00E2\u0088\u00921). Proof. The stopping time for the birth-death process {Li(t), t \u00E2\u0088\u0088 R+} when the scheduling policy uses fractional allocations defined by the vector f is defined as \u00CF\u0084f,i(li) := inf{t| t > 0 and Li(t) = 0}. (4.7) The expected value obtained from queue i in the interval [0,\u00CF\u0084f,i(li)) is denoted V\u00CC\u0082f,i(li). Further, we denote the expectation for the stopping time as T f,i(li) := E[\u00CF\u0084f,i(li)]. (4.8) From standard results concerning Markov Decision Processes [42], we can es- tablish that the 0 state is a regeneration point for the queuing process {Li(t), t \u00E2\u0088\u0088 R+}. We can then obtain Vf,i(li)\u00E2\u0088\u0092Vf,i(li\u00E2\u0088\u00921) = V\u00CC\u0082f,i(li)\u00E2\u0088\u0092V\u00CC\u0082f,i(li\u00E2\u0088\u00921)\u00E2\u0088\u0092 [T f,i(li)\u00E2\u0088\u0092T f,i(li\u00E2\u0088\u00921)]Vf,i \u00E2\u0088\u0080li \u00E2\u0088\u0088 Z+, 1\u00E2\u0089\u00A4 i\u00E2\u0089\u00A4 N. (4.9) Notice that if we define an alternative stopping time \u00CF\u0084\u00CC\u0082f,i(li) := inf{t| t > 0 and Li(t) = li\u00E2\u0088\u00921} (4.10) then V\u00CC\u0082f,i(li)\u00E2\u0088\u0092V\u00CC\u0082f,i(li\u00E2\u0088\u00921) is the value derived from servicing queue i, which is gov- 53 erned by the MDP {Li(t), t \u00E2\u0088\u0088 R+} with Li(0) = li during the interval [0, \u00CF\u0084\u00CC\u0082f,i(li)). Also, T f,i(li)\u00E2\u0088\u0092T f,i(li\u00E2\u0088\u00921) = E[\u00CF\u0084\u00CC\u0082f,i(li)]. (4.11) We shall now introduce a shadow process {L\u00CC\u0082i(t), t \u00E2\u0088\u0088 R+} to ease our analysis. This process shadows the queueing process {Li(t), t \u00E2\u0088\u0088 R+} with some subtle dif- ferences. The shadow process is a birth-death process with birth rate ri and death rate si fi+(li+m)di in state (li+m),m \u00E2\u0088\u0088 N. The death rate is 0 in states where the queue length is less than li. The initial state of the shadow process is L\u00CC\u0082i(0) = li. The shadow process is identical to the original queue length process {Li(t), t \u00E2\u0088\u0088R+} when the queue length is greater than li\u00E2\u0088\u00921 but the shadow process cannot enter the state where the queue length is li\u00E2\u0088\u00922. The shadow process has as its regeneration point the state li\u00E2\u0088\u00921 and the reward derived from the shadow process per unit time is V\u00CC\u0083f,i = V\u00CC\u0082f,i(li)\u00E2\u0088\u0092V\u00CC\u0082f,i(li\u00E2\u0088\u00921) r\u00E2\u0088\u00921i +T f,i(li)\u00E2\u0088\u0092T f,i(li\u00E2\u0088\u00921) . (4.12) In the expression for V\u00CC\u0083f,i, the numerator represents the reward earned when the original MDP transitions from state li to li\u00E2\u0088\u00921; the denominator is the expected duration for the shadow process to return to its initial state, i.e., start from the initial state of li, transition to state li\u00E2\u0088\u00921 and then return to state li. From standard results regarding birth-death processes [40] we can obtain the stationary distribution for {L\u00CC\u0082f,i(t), t \u00E2\u0088\u0088 R+} as \u00CE\u00A0\u00CC\u0082i(l) = \u00EF\u00A3\u00B1\u00EF\u00A3\u00B4\u00EF\u00A3\u00B2\u00EF\u00A3\u00B4\u00EF\u00A3\u00B3 r l\u00E2\u0088\u0092li+1 i { \u00CE\u00A00(ri,si fi+(li\u00E2\u0088\u00921)di,di) \u00E2\u0088\u008Flm=li (si fi+mdi) } , l \u00E2\u0089\u00A5 li\u00E2\u0088\u00921 0, l \u00E2\u0089\u00A4 li\u00E2\u0088\u00922 (4.13) The value obtained per unit time for the shadow process, which does not earn any revenue in state li\u00E2\u0088\u00921, is given by visi fi(1\u00E2\u0088\u0092 \u00CE\u00A0\u00CC\u0082i(li\u00E2\u0088\u00921)) = visi fi[1\u00E2\u0088\u0092\u00CE\u00A00(ri,si fi+(li\u00E2\u0088\u00921)di,di)]. (4.14) 54 Further, we can use (4.3) and (4.12) to infer that (V\u00CC\u0082f,i(li)\u00E2\u0088\u0092V\u00CC\u0082f,i(li\u00E2\u0088\u00921)) r\u00E2\u0088\u00921i +T f,i(li)\u00E2\u0088\u0092T f,i(li\u00E2\u0088\u00921) = visi fi[1\u00E2\u0088\u0092\u00CE\u00A00(ri,si fi+(li\u00E2\u0088\u00921)di,di)] (4.15) and that (T f,i(li)\u00E2\u0088\u0092T f,i(li\u00E2\u0088\u00921)) r\u00E2\u0088\u00921i +T f,i(li)\u00E2\u0088\u0092T f,i(li\u00E2\u0088\u00921) = 1\u00E2\u0088\u0092\u00CE\u00A00(ri,si fi+(li\u00E2\u0088\u00921)di,di). (4.16) We can now combine (4.9), (4.14), (4.15) and (4.16) to conclude that Vf,i(li)\u00E2\u0088\u0092Vf,i(li\u00E2\u0088\u00921) = [visi fi][\u00CE\u00A00(ri,si fi,di)] [si fi+ lidi][\u00CE\u00A00(ri,si fi+ lidi,di)] . (4.17) Finally, we use (4.17) in (4.6) to complete the theorem. 4.4 Empirical Evaluation Having described the structure of a policy for job selection to maximize rewards, we shall now describe simulation results that compare the performance of our pol- icy with other approaches. Before elaborating on empirical evaluation, we emphasize that it is extremely difficult to exhaustively evaluate, via simulation, different scheduling policies, es- pecially when rewards can be assigned arbitrarily. The proof that Policy Z can yield strong, and increased, revenue (Theorem 4) is what should suggest the \u00E2\u0080\u009Cgoodness\u00E2\u0080\u009D of the policy. The empirical evaluations are only indicative of the general applica- bility of that result. 4.4.1 Comparison with Stochastic Dynamic Programming (SDP) Optimal solutions to the scheduling problem of interest can be recovered using stochastic dynamic programming [46]. Stochastic dynamic programming is, how- ever, computationally expensive and is not practical for most applications. For a simple workload with at most two task streams it is computationally feasible to re- sort to SDP; we used this case to compare the performance of the proposed policy 55 with the optimal policy. We begin by making two comparisons: 1. Optimal fractional allocation ( FAP) vs. Policy Z, and 2. Policy Z vs. the optimal policy via SDP. For these comparisons we used many task streams, and we present the results from a representative set of simulation runs (parameters in Table 4.1). Each run consisted of two task streams, and the simulations were performed for 9,00,000 time steps. Each task stream had the same average inter-arrival time of 350 time units, and the revenue earned for every job of task stream 2 was 1.0, i.e., v2 = 1.0. We also kept the same mean deadline for each task stream, D = D1 = D2. For some simulation runs the mean execution time was longer than the mean deadline, making scheduling decisions even harder. Experiment D e1 e2 v1 f \u00E2\u0088\u00971 E1 1000 600 600 1.0 0.50 E2 1000 620 725 1.1 0.54 E3 1000 580 790 1.2 0.57 E4 1000 545 855 1.3 0.61 E5 1000 520 925 1.4 0.64 E6 1000 500 1010 1.5 0.67 E7 500 610 735 1.1 0.55 E8 500 530 900 1.3 0.63 E9 500 475 1110 1.5 0.70 E10 250 590 765 1.1 0.56 E11 250 495 1020 1.3 0.67 E12 250 435 1430 1.5 0.77 E13 165 575 785 1.1 0.58 E14 165 465 1170 1.3 0.72 E15 165 400 2000 1.5 0.84 Table 4.1: Task Stream Parameters to Compare the Performance of the Pro- posed Policy with Other Policies (Two Task Streams) We describe our results for each experiment (Figure 4.1). The optimal frac- tional allocation is described with other parameters (Table 4.1). Recall that f \u00E2\u0088\u00972 = 56 0 5 10 15 20 25 30 35 40 45 50 55 E1 E2 E3 E4 E5 E6 E7 E8 E9 E10 E11 E12 E13 E14 E15 Pe rc en ta ge d iff er en ce Experiment FAP vs. Z Z vs. OPT/SDP Figure 4.1: Performance of Policy Z Compared with the Optimal Fractional Policy and SDP 1\u00E2\u0088\u0092 f \u00E2\u0088\u00971 . Policy Z clearly improves over FAP; the percentage improvement in av- erage revenue is at least 15% (red bars in the graph). We compute the percentage improvement as follows: If VZ was the revenue accrued by Policy Z at the end of an experiment and if VFAP was the reward accrued using FAP, then the percentage improvement is 100\u00C3\u0097 VZ\u00E2\u0088\u0092VFAPVFAP . In comparison to the optimal policy recovered using SDP, we determined the loss in average revenue (percentage loss = 100\u00C3\u0097 VSDP\u00E2\u0088\u0092VZVSDP ) using policy Z (yellow bars); the maximum loss was not more than 4%. This confirms the dramatic im- provement that can be obtained over the FAP and indicates that the suggested policy has a performance that is very close to the optimal SDP policy. The performance of Policy Z improves when the rate of deadline misses increases. 57 4.4.2 Comparison with ROBUST Baruah and Haritsa developed the ROBUST scheduler [6] for achieving near-optimal performance during overload for a specific class of systems where \u00E2\u0080\u00A2 The value of a job is equal to its execution length, and \u00E2\u0080\u00A2 Each job has a slack of at least s, i.e., Diei \u00E2\u0089\u00A5 s. The performance of the ROBUST scheduler is near-optimal in the sense that it can, asymptotically, match the performance of the optimal online scheduling policy for the mentioned class of systems. They showed that the best performance that an online scheduler can guarantee is an EPU of dse/(dse+ 1) and that the ROBUST scheduler guarantees an EPU that is at most 2/s(s+ 1) fractionally off from the optimum [6]. We provide a brief description of the ROBUST scheduler before detailing some empirical comparisons between the Policy Z and ROBUST. The ROBUST sched- uler partitions an overloaded interval into an even number of contiguous phases (Phases-1, . . . ,2a). The length of each even numbered phase is equal to a 1/(s\u00E2\u0088\u0092 1) fraction of the length of the preceding odd numbered phase. At the start of an odd phase, the algorithm selects the longest eligible job and executes it non- preemptively. This job may have been executed in the previous even numbered phase; the length of the odd numbered phase is equal to the execution time remain- ing for that job. An odd phase concludes with the termination of the chosen job. During an even numbered phase, the scheduler selects a job with maximum length; this job may be preempted if another job arrives with longer execution length. To compare Policy Z with the ROBUST scheduler, we used several simulations. For two sets of simulated runs, we chose a fixed slack factor of 2; for the other two sets of runs we chose a slack factor of 4. Each simulated run lasted 1,000,000 time units and involved four task streams. The execution time for jobs belonging to the same task stream were drawn from the same exponential distribution (the mean ex- ecution times for the four task streams were 50, 100, 200 and 400 respectively); the deadline for each job was set based on the slack factor. For simplicity we chose the same arrival rate for all streams; based on the desired workload intensity the arrival rate was determined.We did perform a variety of simulation studies with different 58 arrival rates for different task streams. The performance of the scheduling policies when the arrival rates for different streams are different is similar to the results reported in this work. Only Policy Z is concerned with task streams; the ROBUST scheduler simply schedules on a job-by-job basis. The reward for completing a job successfully was equal to the execution time for that task stream. We do not in- tend this empirical analysis to be exhaustive but merely indicative of the benefits of using stochastic approximation to derive scheduling policies. For each data point, we averaged 50 independent simulation runs and compared the behavior of the two policies. We found that Policy Z outperformed ROBUST in all scenarios (Figures 4.2 and 4.3). Policy Z is not clairvoyant, but the awareness of potential future arrivals enables it to make better decisions. With a slack factor of 2 (s = 2), we were able to improve the per-time step rewards in excess of 15% in some cases. When the slack factor increases (s = 4), Policy Z was able improve revenue per time step but the increases are smaller. When the slack factor is high, most policies will be able to recover from a poor decision and still generate near-optimal revenue. 0 5 10 15 20 25 30 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3 2.4 2.5 Pe rc en ta ge im pr ov em en t Workload With exact information Without exact information Figure 4.2: Performance of Policy Z Compared with the ROBUST Policy when Slack Factor is 2 59 0 2 4 6 8 10 12 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3 2.4 2.5 Pe rc en ta ge im pr ov em en t Workload With exact information Without exact information Figure 4.3: Performance of Policy Z Compared with the ROBUST Policy when the Slack Factor is 4 The ROBUST scheduler requires accurate knowledge of the execution times of jobs and their deadlines. Policy Z is obtained via stochastic approximation and is more tolerant of errors in the parameters. When the ROBUST scheduler is only provided with the mean execution time for a job its performance drops significantly and the improvement noticed by using Policy Z is more pronounced. (The red bars in Figures 4.2 and 4.3 are based on the ROBUST scheduler using exact information; the orange bars are based on approximate information.) Another observation is that when the extent of overload is small, both policies perform equally well (or equally poorly). Similarly, when the system experiences heavy overload, most choices are equally good and the two policies have smaller differences. 60 4.4.3 Comparison with REDF The ROBUST scheduler is targeted at systems with known slack factors and with a job\u00E2\u0080\u0099s value being equal to its execution time. The policy we have developed, however, is also suited to arbitrary reward assignments and to situations when jobs do not have a guaranteed slack. To understand the performance of Policy Z under general workloads we com- pared its performance with the performance offered by the Robust EDF heuris- tic [14, 15]. The REDF policy is identical to EDF when the system is not over- loaded. Whenever a job arrives a check is performed to determine if the system is overloaded. (If tasks are scheduled using EDF and ei/Di \u00E2\u0089\u00A4 1 then the system is not overloaded.) When an overload is detected, the least value task that can prevent the system from being overloaded is removed from the queue of pending jobs to a reject queue.2 If some job completes ahead of time then jobs from the reject queue whose deadlines have not expired may be brought back to the pending queue. But- tazzo, Spuri and Sensini showed that REDF is well behaved during overloads [15], and we used additional simulations to understand the performance of REDF and Policy Z, and to contrast the two approaches. For these simulations, we used the task streams similar to those in our com- parisons with ROBUST. For each run we used four task streams, S1,S2,S3,S4, with mean execution times of 150, 100, 200 and 400 respectively. The deadlines for jobs of the four task streams were drawn from exponential distributions with mean 600, 800, 1600 and 3200 respectively. The arrival rate was chosen to generate the required workload. Similar to the previous evaluation, each stream had the same arrival rate. We compared the performance of REDF with Policy Z under two reward mod- els: \u00E2\u0080\u00A2 The rewards associated with jobs of the four streams were 150, 300, 400 and 200 respectively. These were chosen to represent a random ranking of task streams in terms of value. \u00E2\u0080\u00A2 The reward associated with each stream was inversely related to the mean 2This policy can be modified and a smart search strategy might remove multiple jobs of low value to prevent overload. We have not implemented this approach in our evaluation. 61 0 1 2 3 4 5 6 7 8 9 10 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3 2.4 2.5 Pe rc en ta ge im pr ov em en t Workload Z vs. REDF -- random rewards Figure 4.4: Performance of Policy Z Compared with the REDF Policy (Ran- dom Rewards) deadline for that stream, i.e., shorter the deadline greater the reward. The rewards associated with S1, . . . ,S4 were 450, 300, 200 and 100 respectively. This reward model was intended to be approximately linear in terms of job deadlines. We note that Policy Z results in measurable improvements in revenue when compared with REDF using both the random (Figure 4.4) and the linear (Fig- ure 4.5) reward models. The linear reward model indicates greater differences because REDF has to choose to drop jobs that may yield high rewards (because higher rewards are connected to higher utilization, and one job providing high re- ward may dropped in place of multiple jobs that jointly yield a smaller reward) to ensure that other jobs meet their deadlines. 62 0 2 4 6 8 10 12 14 16 18 20 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3 2.4 2.5 Pe rc en ta ge im pr ov em en t Workload Z vs. REDF -- linear rewards Figure 4.5: Performance of Policy Z Compared with the REDF Policy (Linear Rewards) 4.4.4 Discussion On the basis of the three different comparisons (with SDP, with ROBUST, with REDF), we were able to ascertain the uniformly improved performance that the pro- posed scheduling approach (Policy Z) is able to offer. These comparisons strongly indicate that using knowledge of future workload does increase the revenue one can earn. The improvement in revenue can be at least 10%, and is likely higher when perfect information regarding the temporal requirements of jobs is not available. The improvements in revenue obtained using Policy Z diminish when the system is extremely overloaded; this hints at the possibility that most scheduling decisions are likely to be reasonable in those situations. We speculate that if Policy Z is near optimal (as is the case when there are two task streams \u00E2\u0080\u0093 see Figure 4.1) then other scheduling policies (e.g., ROBUST, REDF) are also likely to be only about 20 to 25% away from optimality (even less in some cases) in practice, and that is an encouraging result concerning the 63 practical applicability of those policies. The structure of the priority index for Policy Z is intuitive and can form the basis for obtaining good scheduling heuristics even when workload might not con- form to simple probability distributions. Implementation considerations. Policy Z requires a priority for each class of requests, and this dynamic priority depends on the length of the corresponding queues. It is possible to compute the priorities at different queue lengths off-line and use a table lookup to identify the priorities of tasks online. This makes the proposed policy easy to implement. We also need to identify the optimal fractional allocation policy, and this is also an offline operation. Identifying the optimal frac- tional allocation is an optimization problem in itself and we use a search over the space of possible allocations to determine the optimal allocation. This is feasible when the number of service classes is limited. It is likely that some sub-optimal ini- tial allocations may not affect the behavior of Policy Z significantly but this notion requires further study. 64 Chapter 5 Conclusions and Future Improvements Overload in certain soft real-time systems (such as Internet-based services) is of- ten unavoidable because the costs of provisioning for peak load are significantly greater than the costs of handling typical load. In such systems, service providers need to provide the best possible service to customers who demand higher quality of service and are willing to pay more for better QoS. In this thesis, we have addressed the issue of handling overloads in Real-time Systems, and have taken two different approaches. We initially proposed a way of find the conditions causing overload of the system, and avert it by provision- ing extra resources, or dropping requests. Subsequently, we proposed a fractional allocation policy based on stochastic approximation to make a decision to drop requests, while maximizing the revenue at the same time. 5.1 Summary of the Contributions 5.1.1 On-line Scheduling Policy to Detect Overload The problem of finding the WCRT for Rate Monotonic Scheduling is a challeng- ing problem. The past and current efforts on Rate Monotonic Analysis have been attempted at finding the best match of accuracy and computational complexity, but 65 not when there is no information about the workload. Our work shows that it is pos- sible to find an upper bound on WCRT for any possible distribution of resources among the task streams, with reduced complexity. We showed the hardness of the problem when Stochastic Analysis is done, even when PTAS are used to minimize the complexity. By using mathematical optimization techniques, we proved that the maximum WCRT can be determined with lesser complexity, but it is still not applicable for embedded devices with little processing power. In order to address the issue of complexity in the proposed WCRT analysis, we provided heuristics to minimize the problem size at different stages of the analysis. Therefore, we showed that by using polynomial number of constraints, the Linear Program converges to a suboptimal solution faster. In order to bound the worst- case accuracy of the approximation technique, we established worst-case limits on the possible solutions of the Linear Program formulation. Then, we followed an approximation scheme proposed by Fisher and Baruah [20] to restrict the search space of the solution, instead of incrementally checking all possible future arrivals of the task. WCRT gives a good indication of the possible overload to the system administrator, depending on the arrival of the requests and allows for handling overloads efficiently. 5.1.2 Overload Management to Improve Rewards by Stochastic Approximation When the resources available are stringent, we made assumptions on the workload characteristics and presented a scheduling policy for handling overload conditions while improving the revenue earned. The policy that we present, Policy Z, is based on stochastic approximation, using the information about future job arrivals. It is an on-line scheduling policy and therefore, approximate information about future workload is sufficient to make meaningful decisions. Empirical evidence suggests that Policy Z has an excellent performance when compared with other scheduling policies for value maximization in the presence of overload. 66 5.2 Potential Future Improvements In large-scale systems having multiple processing units to serve the requests from different clients are prevalent. Schedulability analysis, and fractional allocation are far more challenging in such scenarios. With more cores being added in the same processor in embedded systems, makes it more compelling to have the schedula- bility analysis extended for multiprocessor /multicore systems. The problem of scheduling of these resources, based on allocating jobs to different cores(bin pack- ing) is widely studied, and estimated to be a NP Complete problem. Therefore, identification of WCRT for dynamic task assumptions is an interesting area for research. Our scheduling policy, using stochastic approximation is sufficiently general and can be used in multiprocessor systems as well. However, we have restricted the discussion in this article to uniprocessor systems, but it is possible to use the policy in a system with m processors by selecting the top m jobs based on their priority indices. Even though we make some minor assumptions in our formulation, it can be removed and the model can be extended to such cases. In the case of Schedulability analysis, the assumptions such as periodic arrival of tasks, task independence, lack of release jitter induced delays, etc. can be eliminated, since the Linear Program formulation is fairly generalized. Similarly, we make some assumptions about job arrival rates and deadlines, we believe that the approach of generating an initial policy and then improving upon that policy (as we do with the optimal fractional allocation policy and Policy Z) is a useful tool for decision making in real-time systems that can be generalized and applied to other problems as well. 67 Bibliography [1] K. Albers and F. Slomka. An event stream driven approximation for the analysis of real-time systems. In ECRTS, pages 187\u00E2\u0080\u0093195, 2004. \u00E2\u0086\u0092 pages 14 [2] Amazon.com. Amazon web services. http://www.amazon.com/, May 2008. \u00E2\u0086\u0092 pages 1 [3] H. Aydin, R. Melhem, D. Mosse\u00CC\u0081, and P. Mej\u00C4\u00B1\u00CC\u0081a-Alvarez. Optimal reward-based scheduling for periodic real-time tasks. IEEE Transactions on Computers, 50(2):111\u00E2\u0080\u0093130, February 2001. ISSN 0018-9340. doi:http://dx.doi.org/10.1109/12.908988. URL http://portal.acm.org/ft gateway.cfm?id=365334&type=external&coll= Portal&dl=GUIDE&CFID=68485383&CFTOKEN=40244428. \u00E2\u0086\u0092 pages 45 [4] S. Baruah, G. Koren, B. Mishra, A. Raghunathan, L. Rosier, and D. Shasha. On-line scheduling in the presence of overload. In Proceedings of the Symposium on Foundations of Computer Science, pages 100\u00E2\u0080\u0093110, October 1991. ISBN 0-8186-2445-0. \u00E2\u0086\u0092 pages 46, 47 [5] S. Baruah, G. Koren, D. Mao, B. Mishra, A. Raghunathan, L. Rosier, D. Shasha, and F. Wang. On the competitiveness of on-line real-time task scheduling. Real-Time Systems, 4(2):125\u00E2\u0080\u0093144, May 1992. ISSN 0922-6443. doi:http://dx.doi.org/10.1007/BF00365406. \u00E2\u0086\u0092 pages 46 [6] S. K. Baruah and J. R. Haritsa. Scheduling for overload in real-time systems. IEEE Transactions on Computers, 46(9):1034\u00E2\u0080\u00931039, September 1997. ISSN 0018-9340. doi:http://dx.doi.org/10.1109/12.620484. URL http://portal.acm.org/ft gateway.cfm?id=265217&type=external&coll= Portal&dl=GUIDE&CFID=68485383&CFTOKEN=40244428. \u00E2\u0086\u0092 pages 46, 47, 58 [7] E. Bini. The Design Domain of Real-Time Systems. PhD thesis, Scuola Superiore S.Anna, Pisa, 2004. \u00E2\u0086\u0092 pages 41 68 [8] E. Bini and G. C. Buttazzo. The space of rate monotonic schedulability. In IEEE Real-Time Systems Symposium, pages 169\u00E2\u0080\u0093178, 2002. \u00E2\u0086\u0092 pages 37, 39 [9] E. Bini and G. C. Buttazzo. Biasing effects in schedulability measures. In ECRTS. IEEE Computer Society, 2004. ISBN 0-7695-2176-2. \u00E2\u0086\u0092 pages 41 [10] E. Bini, G. C. Buttazzo, and G. Buttazzo. Rate monotonic analysis: The hyperbolic bound. IEEE Trans. Computers, 52(7):933\u00E2\u0080\u0093942, 2003. \u00E2\u0086\u0092 pages 8, 13, 44 [11] E. Bini, T. Nguyen, P. Richard, and S. Baruah. A response-time bound in fixed-priority scheduling with arbitrary deadlines. IEEE Trans. Computers, 58:279\u00E2\u0080\u0093286, 2009. \u00E2\u0086\u0092 pages 42 [12] E. A. Brewer. Lessons from giant-scale services. IEEE Internet Computing, 5(4):46\u00E2\u0080\u009355, Jul./Aug. 2001. \u00E2\u0086\u0092 pages 1, 2 [13] G. Buttazzo, G. Lipari, L. Abeni, and M. Caccamo. Soft Real-Time Systems: Predictability vs. Efficiency. Springer, 2005. \u00E2\u0086\u0092 pages 45 [14] G. C. Buttazzo and J. A. Stankovic. RED: a robust earliest deadline scheduling algorithm. In Proceedings of the International Workshop on Responsive Computing Systems, pages 100\u00E2\u0080\u0093111, September 1993. \u00E2\u0086\u0092 pages 47, 61 [15] G. C. Buttazzo, M. Spuri, and F. Sensini. Value vs. deadline scheduling in overload conditions. In Proceedings of the IEEE Real-Time Systems Symposium, pages 90\u00E2\u0080\u009399, December 1995. \u00E2\u0086\u0092 pages 47, 61 [16] D. Chen, A. K. Mok, and T.-W. Kuo. Utilization bound re-visited. In RTCSA, pages 295\u00E2\u0080\u0093302, 1999. \u00E2\u0086\u0092 pages 8, 13, 18 [17] J.-Y. Chung, J. W. S. Liu, and K.-J. Lin. Scheduling periodic jobs that allow imprecise results. IEEE Transactions on Computers, 39(9):1156\u00E2\u0080\u00931174, September 1990. ISSN 0018-9340. doi:http://dx.doi.org/10.1109/12.57057. URL http://portal.acm.org/ft gateway.cfm?id=102826&type=external&coll= Portal&dl=GUIDE&CFID=68484521&CFTOKEN=56294152. \u00E2\u0086\u0092 pages 45 [18] R. Davis and A. Burns. Response time upper bounds for fixed priority real-time systems. In Real-Time Systems Symposium (RTSS), pages 407\u00E2\u0080\u0093418. IEEE Computer Society, December 2008. \u00E2\u0086\u0092 pages 14, 41 [19] J. K. Dey, J. Kurose, and D. Towsley. On-line scheduling policies for a class of IRIS (increasing reward with increasing service) real-time tasks. IEEE 69 Transactions on Computers, 45(7):802\u00E2\u0080\u0093813, July 1996. doi:http://dx.doi.org/10.1109/12.508319. URL http://portal.acm.org/ft gateway.cfm?id=627153&type=external&coll= Portal&dl=GUIDE&CFID=68485383&CFTOKEN=40244428. \u00E2\u0086\u0092 pages 45 [20] N. Fisher and S. Baruah. A polynomial-time approximation scheme for feasibility analysis in static-priority systems with arbitrary relative deadlines. In In Proceedings of the 13th International Conference on Real-Time Systems, pages 117\u00E2\u0080\u0093126. IEEE Computer Society Press, 2005. \u00E2\u0086\u0092 pages 13, 14, 38, 66 [21] B. Hajek. On the competitiveness of on-line scheduling of unit-length packets with hard deadlines in slotted time. In Proceedings of the Conference on Information Sciences and Systems, pages 434\u00E2\u0080\u0093439, March 2001. \u00E2\u0086\u0092 pages 47 [22] C.-C. Han and H.-Y. Tyan. A better polynomial-time schedulability test for real-time fixed-priority scheduling algorithm. In IEEE Real-Time Systems Symposium, pages 36\u00E2\u0080\u009345, 1997. \u00E2\u0086\u0092 pages 13 [23] X. S. Hu and G. Quan. Fast performance prediction for periodic task systems. In CODES, pages 72\u00E2\u0080\u009376, 2000. \u00E2\u0086\u0092 pages 22 [24] G. Koren and D. Shasha. Dover: an optimal on-line scheduling algorithm for overloaded uniprocessor real-time systems. SIAM Journal of Computing, 24 (2):318\u00E2\u0080\u0093339, April 1995. \u00E2\u0086\u0092 pages 46, 47 [25] G. Koren and D. Shasha. SkipOver: algorithms and complexity for overloaded systems that allow skips. In Proceedings of the IEEE Real-Time Systems Symposium, pages 110\u00E2\u0080\u0093119, December 1995. ISBN 0-8186-7337-0. URL http://portal.acm.org/ft gateway.cfm?id=828929&type=external&coll= Portal&dl=GUIDE&CFID=29442874&CFTOKEN=47928360. \u00E2\u0086\u0092 pages 46 [26] E. Koutsoupias and C. H. Papadimitriou. Beyond competitive analysis. SIAM Journal of Computing, 30(1):300\u00E2\u0080\u0093317, January 2000. ISSN 0097-5397. doi:http://dx.doi.org/10.1137/S0097539796299540. URL http://portal.acm.org/ft gateway.cfm?id=587029&type=external&coll= Portal&dl=GUIDE&CFID=29061345&CFTOKEN=21883824. \u00E2\u0086\u0092 pages 47 [27] T.-W. Lam, T.-W. J. Ngan, and K.-K. To. Performance guarantee for EDF under overload. Journal of Algorithms, 52(2):193\u00E2\u0080\u0093206, August 2004. ISSN 0196-6774. doi:http://dx.doi.org/10.1016/j.jalgor.2003.10.004. \u00E2\u0086\u0092 pages 47 70 [28] C.-G. Lee, L. Sha, and A. Peddi. Enhanced utilization bounds for qos management. IEEE Trans. Computers, 53(2):187\u00E2\u0080\u0093200, 2004. \u00E2\u0086\u0092 pages 8, 14, 18, 20, 21, 22, 23, 44 [29] J. P. Lehoczky. Fixed priority scheduling of periodic task sets with arbitrary deadlines. In IEEE Real-Time Systems Symposium, pages 201\u00E2\u0080\u0093213, 1990. \u00E2\u0086\u0092 pages 10 [30] J. P. Lehoczky. Real-time queuing theory. In Proceedings of the IEEE Real-Time Systems Symposium, pages 186 \u00E2\u0080\u0093 195, Dec. 1996. \u00E2\u0086\u0092 pages 48 [31] J. P. Lehoczky. Real-time queuing network theory. In Proceedings of the IEEE Real-Time Systems Symposium, pages 58\u00E2\u0080\u009367, Dec. 1997. \u00E2\u0086\u0092 pages 10 [32] J. P. Lehoczky, L. Sha, and Y. Ding. The rate monotonic scheduling algorithm: Exact characterization and average case behavior. In IEEE Real-Time Systems Symposium, pages 166\u00E2\u0080\u0093171, 1989. \u00E2\u0086\u0092 pages 9, 13, 16, 18, 20, 39, 41, 42 [33] C. L. Liu and J. W. Layland. Scheduling algorithms for multiprogramming in a hard-real-time environment. J. ACM, 20(1):46\u00E2\u0080\u009361, 1973. \u00E2\u0086\u0092 pages 8, 9, 13, 17, 44 [34] J. W.-S. Liu. Real-time systems. Prentice Hall, 2000. ISBN 978-0-13-099651-0. \u00E2\u0086\u0092 pages 7 [35] R. MacKenzie, D. Hands, and T. O\u00E2\u0080\u0099Farrell. Packet handling strategies to improve video qos over 802.11e wlans. In PIMRC, pages 1173\u00E2\u0080\u00931177, 2009. \u00E2\u0086\u0092 pages 15 [36] Y. Manabe and S. Aoyagi. A feasibility decision algorithm for rate monotonic and deadline monotonic scheduling. Real-Time Systems, 14(2): 171\u00E2\u0080\u0093181, 1998. \u00E2\u0086\u0092 pages 39 [37] A. K. Mok and D. Chen. A multiframe model for real-time tasks. IEEE Trans. Software Eng., 23(10):635\u00E2\u0080\u0093645, 1997. \u00E2\u0086\u0092 pages 14 [38] T. Nguyen, P. Richard, and E. Bini. Approximation techniques for response-time analysis of static-priority tasks. Real-Time Systems, 43: 147\u00E2\u0080\u0093176, 2009. ISSN 0922-6443. \u00E2\u0086\u0092 pages 8, 13, 14, 18, 41 [39] C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, 1982. ISBN 0-13-152462-3. \u00E2\u0086\u0092 pages 7, 11 71 [40] A. Papoulis and S. U. Pillai. Probability, Random Variables and Stochastic Processes. McGraw-Hill, 4 edition, 2002. \u00E2\u0086\u0092 pages 50, 54 [41] D.-W. Park, S. Natarajan, A. Kanevsky, and M. J. Kim. A generalized utilization bound test for fixed-priority real-time scheduling. In RTCSA, pages 73\u00E2\u0080\u0093, 1995. \u00E2\u0086\u0092 pages 8, 18, 19 [42] M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley and Sons, New York, 5 edition, 2005. \u00E2\u0086\u0092 pages 53 [43] R. Rajkumar, C. Lee, J. P. Lehoczky, and D. Siewiorek. A resource allocation model for QoS management. In Proceedings of the IEEE Real-Time Systems Symposium, pages 298\u00E2\u0080\u0093307, Dec. 1997. \u00E2\u0086\u0092 pages 46 [44] R. Rajkumar, C. Lee, J. P. Lehoczky, and D. Siewiorek. Practical solutions for QoS-based resource allocation. In Proceedings of the IEEE Real-Time Systems Symposium, pages 296\u00E2\u0080\u0093306, Dec. 1998. \u00E2\u0086\u0092 pages 46 [45] P. Richard and J. Goossens. Approximating response times of static-priority tasks with release jitters. In In the WiP segment of the 18th Euromicro Conference on Real-Time Systems, 2006. \u00E2\u0086\u0092 pages 41 [46] S. M. Ross. Introduction to stochastic dynamic programming. Academic Press, 1995. \u00E2\u0086\u0092 pages 55 [47] D. Seto, J. P. Lehoczky, and L. Sha. Task period selection and schedulability in real-time systems. In Proceedings of the IEEE Real-Time Systems Symposium, pages 188\u00E2\u0080\u0093198, Dec. 1998. \u00E2\u0086\u0092 pages 45 [48] M. Sjo\u00CC\u0088din and H. Hansson. Improved response-time analysis calculations. In IEEE Real-Time Systems Symposium, pages 399\u00E2\u0080\u0093408, 1998. \u00E2\u0086\u0092 pages 18, 41, 42 [49] J. A. Stankovic, M. Spuri, M. D. Natale, and G. C. Buttazzo. Implications of classical scheduling results for real-time systems. Computer, 28(6):16\u00E2\u0080\u009325, June 1995. ISSN 0018-9162. doi:http://dx.doi.org/10.1109/2.386982. URL http://portal.acm.org/ft gateway.cfm?id=620236&type=external&coll= Portal&dl=GUIDE&CFID=28250344&CFTOKEN=38174394. \u00E2\u0086\u0092 pages 47 72"@en . "Thesis/Dissertation"@en . "2012-05"@en . "10.14288/1.0072592"@en . "eng"@en . "Electrical and Computer Engineering"@en . "Vancouver : University of British Columbia Library"@en . "University of British Columbia"@en . "Attribution-NonCommercial-NoDerivatives 4.0 International"@en . "http://creativecommons.org/licenses/by-nc-nd/4.0/"@en . "Graduate"@en . "Response-time analysis and overload management in real-time systems"@en . "Text"@en . "http://hdl.handle.net/2429/40839"@en .