UBC Theses and Dissertations

A portable real time threads environment Mechler, Roland 1997


A Portable Real Time Threads Environment

by

Roland Mechler

B.Sc., University of British Columbia, 1992

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE STUDIES (Department of Computer Science)

We accept this thesis as conforming to the required standard

The University of British Columbia
April 1997
© Roland Mechler, 1997

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Computer Science
The University of British Columbia
Vancouver, Canada

Abstract

Real Time Threads (RT Threads) is a threads package which provides real time scheduling semantics, application portability to a variety of host operating systems and hardware architectures, and support for distributed application programming via location transparent message passing between threads in distinct address spaces. RT Threads was initially designed as a programming environment for the development of the Continuous Media File System (CMFS), a distributed multimedia application requiring real time scheduling of asynchronous tasks.
In particular, real time scheduling is required for timely disk reading and network transmission of continuous media data at the server nodes, as well as for synchronization of multiple media streams at clients. The goal of this thesis is to show that RT Threads has performance characteristics making it practical for use both as a general purpose threads package and as a platform for developing applications requiring real time scheduling. In fact, it will be shown that RT Threads performs acceptably well (for non-critical applications), even in a non-real time operating system environment.

Contents

Abstract
Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
2 Real Time Threads Overview
  2.1 Real Time Scheduling
  2.2 Thread Communication and Synchronization
  2.3 External I/O
3 Application Programming Interface
  3.1 Types and Constants
  3.2 Thread Management
    3.2.1 Thread Creation
    3.2.2 Thread Termination
    3.2.3 Thread Scheduling
    3.2.4 Miscellaneous
  3.3 Synchronization
  3.4 Communication
    3.4.1 Intra-Address Space Communication
    3.4.2 Inter-Address Space Communication
  3.5 Dynamic Memory Allocation
  3.6 Miscellaneous
  3.7 External I/O
4 Implementation
  4.1 Data Structures
  4.2 System Startup
  4.3 Thread Management
  4.4 Queue Management
  4.5 Thread Scheduling
    4.5.1 System Clock
    4.5.2 Context Switching
    4.5.3 Modifying Scheduling Attributes
  4.6 Semaphores
  4.7 Inter Thread Communication
    4.7.1 Cross Address Space ITC
  4.8 Dynamic Memory Allocation
  4.9 External I/O
  4.10 Mapped Implementation
  4.11 Native Microkernel Implementation
5 Performance
  5.1 Functional Overheads
    5.1.1 Thread Creation
    5.1.2 Semaphore Overhead
    5.1.3 Context Switching
  5.2 Inter Thread Communication
  5.3 Real Time Performance
6 Related Work
  6.1 POSIX
  6.2 Real Time Support for Multimedia Applications
  6.3 Inter Thread Communication
7 Additional Work
8 Conclusions
Bibliography
Appendix A Example Program

List of Tables

3.1 RT Threads Interface
3.2 Interface to UNIX I/O System Calls
5.1 Host operating systems and architectures
5.2 Thread creation times in microseconds
5.3 Context switch times in microseconds
5.4 Local send/receive/reply times
5.5 Cross address space send/receive/reply times
5.6 Base Test: No background threads, no background processes
5.7 1 background thread, no background processes
5.8 No background threads, 1 background process
5.9 1 background thread, 1 background process
5.10 No background threads, 2 background processes
5.11 1 background thread, 2 background processes
5.12 No background threads, 1 background compile
5.13 1 background thread, 1 background compile
5.14 2 RT Threads environments
5.15 2 RT Threads environments, 1 with background thread
5.16 2 RT Threads environments, both with background thread
5.17 One RT Threads environment with 100 threads
5.18 One RT Threads environment with 200 threads
5.19 One RT Threads environment with 300 threads
5.20 One RT Threads environment with 300 threads, measured in kernel
5.21 Multimedia client scheduling performance

List of Figures

2.1 ITC via Message Passing
2.2 Client/Server Communication
4.1 RttThreadId

Acknowledgements

First and foremost, I wish to thank my supervisor, Dr. Gerald Neufeld, for the tremendous support, guidance, and encouragement he has given me, for which I will be forever grateful. I would also like to thank Dr. Norman Hutchinson for his invaluable generosity in lending advice and opinion. Many thanks to David Finkelstein and Dwight Makaroff, who were both closely involved in the implementation and testing of RT Threads, and with whom it has been a pleasure to work. Acknowledgement and thanks go as well to Murray Goldberg, whose Pthreads preemptive threads package was the basis for this implementation, and to Ann Lo, whose thesis work on real time scheduling was incorporated into RT Threads. Finally, I thank my parents for their patience, their love, and for always standing behind me. None of this would have been possible without them.

ROLAND MECHLER

The University of British Columbia
April 1997

Chapter 1

Introduction

This thesis is concerned with the design, implementation and performance of Real Time Threads (RT Threads). RT Threads is a threads package which supports distributed application programming and application portability across operating systems and machine architectures. The first motivation for RT Threads was the desire for a portable multi-threaded programming platform on which to build distributed multimedia applications in a heterogeneous computer network environment. Another motivating factor was that a real time scheduling policy (with a well-defined scheduling behaviour) was considered best suited to the time critical nature of multimedia applications, particularly with respect to the synchronization of multiple media objects and the timely delivery of continuous media data.

RT Threads has been ported to several versions of UNIX (AIX, SunOS 4.1.3, Solaris, Linux, FreeBSD), Microsoft's Win32 software development platform (Windows NT, 95), as well as the Kea microkernel platform [30]. Machine architectures supported are Sun SPARC, IBM PowerPC, and Intel x86/Pentium.
All of the implementations are at the user level, except the Kea version, which is implemented in the kernel.

Threads packages are often closely associated with a particular operating system or software development platform. A key problem with porting threaded applications among various threads packages (as a means of porting between platforms) is that these packages do not all have a common set of functions and policies. For example, AIX Pthreads [1] does not support semaphores in its interface. Win32 Threads, on the other hand, has only five priority levels and does not support real time scheduling of threads¹. Defining a set of thread functions and policies with a single interface, and supporting that interface on various platforms, was chosen as a way to make the applications easily portable.

General purpose threads packages tend to use a time sliced round robin scheduling policy among ready threads at the highest priority level which has ready threads. Often the priorities are not fixed, i.e., the system may temporarily modify priorities. The advantage of time slicing is that the application is not responsible for fairness among threads of equal priority. The advantage of non-fixed priorities is that the system has some control of fairness among threads in different processes (i.e., to prevent a high priority thread in one process from starving other processes of CPU resources), since threads have a global scheduling scope in many of these systems. The disadvantage of both policies is that they take control of scheduling away from the application.

¹A time sliced round robin policy is the only one supported.

RT Threads, on the other hand, uses a fixed priority scheduling policy with no time slicing. Threads run to completion unless preempted by another thread of higher priority, or of equal priority and earlier deadline.
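The preemption rule just stated (higher priority first; among equal priorities, the earlier deadline first) can be sketched as a comparison function. The sketch below is illustrative only and is not code from the RT Threads implementation; the struct and field names are assumptions, and a thread with no deadline is modeled here as LONG_MAX so that, within a priority level, it sorts after all threads that do have deadlines.

```c
#include <assert.h>
#include <limits.h>

/* Illustrative scheduling attributes; names are assumptions,
 * not taken from the RT Threads sources. */
struct attr {
    int priority;   /* 0 is the highest priority */
    long deadline;  /* absolute deadline; LONG_MAX models "no deadline" */
};

/* Nonzero if a precedes (may preempt) b: a lower priority number wins;
 * among equal priorities, the earlier deadline wins (EDF). */
static int precedes(const struct attr *a, const struct attr *b)
{
    if (a->priority != b->priority)
        return a->priority < b->priority;
    return a->deadline < b->deadline;
}
```

Note that with this encoding, equal priority and equal deadline yields `precedes() == 0` in both directions, leaving the tie to be broken first come first served, as the package specifies.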
The lack of time slicing means that the application has greater control over the scheduling of threads, but also that the application itself is responsible for fairness among its threads. This can be thought of as a cooperative threads environment. An RT Threads application can in fact implement its own time slicing, if this is desired. Starvation of system processes is not an issue since RT Threads has a per process scheduling scope for threads (although operating in a multi-process environment may have consequences for the real time response of RT Threads).

Several aspects of the design of UBC's distributed Continuous Media File Server (CMFS) [19, 18, 20] require real time scheduling. At the server, threads must be scheduled to read data from disk and to submit data for network transmission according to strict time constraints, in order to guarantee timely delivery of continuous media streams to clients. At the clients, real time scheduling is used to synchronize multiple media streams, e.g., video, audio and closed captioning text. Synchronization is achieved by scheduling real time threads to submit the data for each stream to the appropriate display device at the correct time.

RT Threads does not provide hard real time guarantees. Its scheduling algorithm is optimal, meaning it will produce a feasible schedule if such a schedule exists [12], but no dynamic schedulability analysis is done, and threads which pass their deadline continue executing until completion. Hard real time scheduling was not implemented for RT Threads because such guarantees are not possible when porting to non-real time operating systems such as UNIX and Windows NT, and because the overhead of dynamic schedulability analysis would likely be self defeating. RT Threads was ported to a microkernel environment in an effort to be able to provide real time guarantees for the CMFS server. Results from [2] showed that implementing a threads package on a bare machine provides a high degree of timing predictability as compared to implementation on top of UNIX. However, our results show that even on a non-real time general purpose operating system like UNIX or Windows NT, reasonably accurate timing predictability is achieved, so long as the load from other non-real time processes is relatively light. Being able to support applications with soft real time requirements, such as multimedia display clients, on widely used general purpose platforms has obvious advantages over requiring special purpose operating system environments.

An alternative to implementing a new threads package for portability would have been to use a standard threads interface, i.e., POSIX threads (Pthreads). Unfortunately, the POSIX interface does not provide the functionality required for the target real time applications. In particular, the specification does not provide deadline scheduling, the ability for one thread to suspend another thread or put another thread to sleep, or support for distributed programming by allowing direct communication between threads in distinct address spaces via message passing. Also, POSIX does not specify a level of scheduling control considered important for the target applications. For example, the interface does not specify that a thread which signals a higher priority thread blocked on a condition variable² should be preempted by that thread immediately.

²Pthreads uses the condition variable as its basic synchronization primitive, whereas RT Threads uses semaphores.

Thesis statement: It is possible to implement a real time threads environment on a general purpose operating system, such that acceptable results for real time scheduling will be achieved even under moderate system loads.

The remainder of this thesis is organized as follows. An overview of RT Threads is given in chapter 2. Chapter 3 describes the application programming interface. Chapter 4 describes how RT Threads was implemented. Performance results are given in chapter 5. Related work and conclusions follow in chapters 6 and 8.

Chapter 2

Real Time Threads Overview

RT Threads has the following key features:

• A real time scheduling algorithm based on the starting time, priority and deadline attributes of threads
• Application modification of thread scheduling attributes
• Thread creation/destruction
• Thread synchronization (semaphores)
• Inter-thread communication (ITC)
• Globally unique (on the Internet) thread identifiers, allowing transparent message passing between address spaces (and machines)
• Asynchronous I/O (blocking I/O calls block only the calling thread)

2.1 Real Time Scheduling

Real time scheduling in RT Threads is defined by three scheduling attributes: starting time, priority, and deadline. Starting times and deadlines are absolute time values (measured in seconds and microseconds), and priorities are integer values from 0 to 31, where 0 is the highest priority. No thread is made ready before its starting time. The currently executing thread is called the running thread.
Threads are scheduled first by priority, i.e., threads are preempted by higher priority ready threads. Threads of equal priority are scheduled according to the Earliest Deadline First (EDF) rule [13], which states that threads are to be scheduled in order of non-decreasing deadlines. Threads of equal priority and equal deadline are scheduled on a first come first served basis. Within a given priority level, threads with no deadline are scheduled after threads with deadlines. A running thread will run to completion unless preempted by another thread which becomes ready and has higher precedence according to the above algorithm. Such a thread can become ready in one of the following ways:

• Its starting time has been reached (i.e., the thread was sleeping).
• It has become unblocked due to completion of I/O.
• It has been signaled by the running thread (i.e., it was blocked on a semaphore or an ITC call).
• Its scheduling attributes have been explicitly modified by the running thread to give it higher precedence.
• It has been created by the running thread, with scheduling attributes giving it higher precedence.

[Figure 2.1: ITC via Message Passing — RttSend() blocks the sending thread from the send until the reply; RttReceive() blocks the receiver until a request arrives; RttReply() does not block.]

2.2 Thread Communication and Synchronization

RT Threads provides two main mechanisms for inter-thread communication (ITC): shared memory and message passing. All threads within an RT Threads environment share a common address space. RT Threads provides semaphores to synchronize access to this shared memory. Semaphores are standard counting semaphores. A semaphore has a value which is decremented each time RttP() is called and incremented each time RttV() is called. Calling RttP() when the semaphore value is zero or negative will cause the calling thread to block. Calling RttV() when the semaphore value is negative will cause one blocked thread to become unblocked.
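The counting-semaphore arithmetic just described can be modeled in a few lines. This is an illustrative model of the value bookkeeping only — the real RttP()/RttV() block and unblock actual threads — and the names below are assumptions, not RT Threads source code.

```c
#include <assert.h>

/* Toy model of RT Threads counting-semaphore arithmetic: a negative
 * value records how many callers are currently blocked. Illustrative
 * only; names are not taken from the RT Threads sources. */
struct toy_sem {
    int value;    /* current semaphore value */
    int blocked;  /* number of callers currently blocked */
};

/* P: decrement the value; the caller blocks if it drops below zero. */
static void toy_p(struct toy_sem *s)
{
    s->value--;
    if (s->value < 0)
        s->blocked++;
}

/* V: increment the value; wake one blocked caller if any is waiting. */
static void toy_v(struct toy_sem *s)
{
    if (s->value < 0)
        s->blocked--;
    s->value++;
}
```

For example, a semaphore created with value 1 lets one RttP() through, blocks the second caller, and releases that caller on the next RttV().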
A semaphore's initial value is specified when it is created with RttAllocSem(). Semaphores can be created in one of two modes. In first come first served mode, threads become unblocked in the same order in which they blocked, regardless of priority. In priority mode, threads become unblocked according to the same priority and deadline based algorithm used by RT Threads to schedule ready threads.

Message passing uses a send/receive/reply model. RttSend() sends a message (request) to a specified thread and blocks until it receives a reply. RttReceive() receives a message from any thread, blocking if there are no messages waiting for the calling thread. RttReply() sends a reply to a specified thread, which must be waiting for a reply, and does not block. Figure 2.1 shows the temporal relationship and blocking behaviour of these calls. Thread A and Thread B can be in the same address space or in different address spaces. Messages of arbitrary length can be sent using this mechanism. RT Threads' message passing facilities provide reliable communication between threads in distinct address spaces (and distinct machines), so as to provide support for programming distributed applications. These facilities are similar to the distributed inter-process communication (IPC) provided by the V Kernel [4]. RT Threads provides location transparency for message passing ITC using thread ids that are globally unique across the Internet. This is accomplished by embedding an IP address and a local port number in the identifier.

Each RT Threads environment (i.e., local address space) is identified by an IP address and port number, specified using the RttNetInit() call. An RT Threads environment can create one special name server thread using the RttCreateNS() call. The thread id for the name server thread in a remote RT Threads environment can be obtained using the RttGetNS() call, which takes the IP address and port number of the remote environment as parameters. RttGetNS() does not perform any network communication; the IP address and port number provide enough information to uniquely identify the name server thread in a remote RT Threads environment. The name server thread simply provides a mechanism by which each RT Threads environment can create an easily locatable thread. Whether this mechanism is actually used to implement a name service, with which local threads can register, is left up to the application.

[Figure 2.2: Client/Server Communication — an administrator thread receives requests from remote clients and dispatches them to local worker threads, which reply directly.]

Cross address space ITC provides a convenient mechanism for implementing RPC style client/server transactions. Figure 2.2 shows an example of how a distributed client/server application might be designed using ITC. In the figure, the server consists of an administrator thread which creates local worker threads to handle client requests. The numbers in the figure indicate the order of events in the handling of a single request from a remote client. Note that administrator/worker communication is arranged such that the administrator never does an RttSend(), because RttSend() blocks and the administrator must always be available to receive requests.

2.3 External I/O

RT Threads provides an I/O interface which is essentially a subset of the UNIX interface. Operations for file and socket I/O are supported. I/O calls which block will block only the calling thread. When a thread's I/O is complete it is immediately made ready.

Chapter 3

Application Programming Interface

An RT Threads application begins execution at the first statement of the routine mainp(). Mainp() is passed the environment arguments argc and argv[]. Mainp() is itself a thread and has the highest possible priority and earliest possible deadline (it cannot be preempted).

Table 3.1 shows the operations which make up the RT Threads interface. All RT Threads interface routines have an int return value, returning either RTTOK (for success) or RTTFAILED (for failure), unless indicated otherwise in the following interface description.

Thread Management
  RttCreate()               Create a new thread
  RttMyThreadId()           Get thread id of running thread
  RttGetThreadSchedAttr()   Get scheduling attributes of given thread
  RttSetThreadSchedAttr()   Set scheduling attributes of given thread
  RttNameToThreadId()       Get thread id given name
  RttThreadExists()         Given thread id, determine whether thread exists
  RttKill()                 Given thread id, terminate thread
  RttExit()                 Terminate running thread
  RttGetTimeOfDay()         Get the current time

Thread Synchronization and Communication
  RttAllocSem()             Allocate a new semaphore
  RttFreeSem()              Free a semaphore
  RttP()                    Wait on a semaphore
  RttV()                    Signal a semaphore
  RttSemValue()             Get the value of a semaphore
  RttSend()                 Send a message to a given thread and receive a reply
  RttReceive()              Receive a message from another thread
  RttMsgWaits()             Determine whether RttReceive() would block
  RttReply()                Reply to a message from given thread
  RttNetInit()              Enable cross address space ITC
  RttMyIP()                 Get IP address of local RT Threads environment
  RttMyPort()               Get port number of local RT Threads environment
  RttCreateNS()             Create new thread as name server thread
  RttGetNS()                Get thread id of name server, given IP address and port

Memory Management
  RttMalloc()               Allocate memory from heap
  RttFree()                 Free memory previously allocated with RttMalloc()
  RttRealloc()              Reallocate memory previously allocated by RttMalloc()
  RttCalloc()               Allocate and clear memory from heap

Miscellaneous
  RttRegisterExitRoutine()  Register a routine to be executed when application exits
  RttBeginCritical()        Disable preempting events (I/O completion and clock)
  RttEndCritical()          Re-enable preempting events
  RttGetData()              Get data associated with thread
  RttSetData()              Set data associated with thread

Table 3.1: RT Threads Interface

3.1 Types and Constants

The type RttThreadId is an abstract data type used to identify threads at the user level. The routine xdr_RttThreadId() is provided to facilitate external data representation of RttThreadIds so they can be transmitted as data between RT Threads environments on heterogeneous architectures. RttSem is another abstract data type, used to identify semaphores. The type RttTimeValue is used to represent time values in RT Threads. The type is transparent, i.e., the application has knowledge of its fields seconds and microseconds.

typedef struct {
    long seconds;
    long microseconds;
} RttTimeValue;

The seconds field can take on values from 0 to 2147483647 and the microseconds field values from 0 to 999999. Variables of type RttTimeValue can express both relative times and absolute times. Absolute times are determined using the RttGetTimeOfDay() call.

The type RttSchAttr is used to specify the scheduling attributes of a thread. Its fields startingtime, priority and deadline are visible to applications.

typedef struct {
    RttTimeValue startingtime;
    int          priority;
    RttTimeValue deadline;
} RttSchAttr;

The startingtime and deadline fields are used to specify the starting time and deadline, respectively, of a thread. Both express an absolute time. The priority field is used to specify the priority of a thread. Priorities can take on values from 0 to 30 (inclusive), where 0 is the highest priority and 30 is the lowest. It is highly recommended that threads use only priorities in the range 10 to 30 inclusive, so as not to interfere with higher priority threads used by the underlying system (i.e., threads used to implement cross address space communication). For applications that do not require a fine granularity of priority, the following priority constants are available:

RTTHIGH = 10, RTTNORM = 20, RTTLOW = 30

The following constant values are also available as a convenience for scheduling time values:

RttTimeValue RTTZEROTIME;
RttTimeValue RTTINFINITE;
RttTimeValue RTTNODEADLINE;

RTTZEROTIME can be used to specify either the starting time or the deadline of a thread. Giving a thread a starting time of RTTZEROTIME will make it ready immediately (assuming it is not blocked). A deadline of RTTZEROTIME is the earliest possible deadline for a thread, causing it to be scheduled ahead of any thread with the same priority and a different deadline. RTTINFINITE can be used to specify a thread's starting time. A thread with a starting time of RTTINFINITE will effectively be suspended (it can only run again if another thread modifies its starting time). RTTNODEADLINE can be used to specify a thread's deadline. A thread with a deadline
The stksize parameter is the size, in bytes, that should be allocated for the new thread's stack. The name parameter is a null-terminated string used to specify a thread name for debugging purposes. The name is copied, so it's memory may be released after the R t t C r e a t e O call returns. More than one thread may have the same name. The fifth parameter, arg, is an argument (parameter) for the created thread. The entry routine for the new thread receives this argument as a parameter. It can be used to pass a 32-bit value, or a pointer to a larger data structure. Note that passing the address of an automatic variable is dangerous since the procedure calling R t t C r e a t e O may return before the thread  16  ever runs. The sixth argument, schedAttr, specifies the scheduling attributes of the created thread. If the starting time indicated is in the future, the thread will not become ready until that time.The final argument, level, indicates whether the created thread is a system (RTTSYS) or an application (RTTUSR) thread. The only difference between the two is that the R T Threads process (i.e., threads environment) will exit when there are no more RTTUSR level threads. Therefore, perpetual server threads should be created with level RTTSYS (unless they service requests from other R T Threads environments, in which case they will likely want to use RTTUSR).  3.2.2  Thread Termination  Threads terminate in one of three ways. Threads can return from their entry routine (either with a r e t u r n ( ) statement or by falling off the end of the routine). Equivalently, they can terminate themselves using the R t t E x i t O call. Lastly, threads can be killed by other threads using the R t t K i l l O call.  void R t t E x i t O  R t t E x i t O allows a thread to terminate itself,  int  R t t K i l K R t t T h r e a d l d thread)  R t t K i l l O allows a thread to terminate another thread, specified with the thread parameter.  
17  3.2.3  Thread Scheduling  A thread's scheduling attributes are set at creation time. These attributes may also be retrieved and modified (by the thread itself or by another thread) at any time during the thread's lifetime.  i n t RttGetThreadSchedAttr(RttThreadId t h r e a d , R t t S c h A t t r * s c h e d A t t r ) This call is used to retrieve the current values of the scheduling attributes of the thread specified by the thread parameter. The attributes are returned by reference via the schedAttr parameter. This call has no effect on the scheduling of the specified thread.  i n t RttSetThreadSchedAttr(RttThreadId thread, RttSchAttr schedAttr) This call is used to set the scheduling attributes of the thread specified by the thread parameter, as specified in the schedAttr parameter. Any system scheduling effects caused by a change in scheduling attributes will take place immediately.  3.2.4  Miscellaneous  RttGetTimeOfDayO, RttMyThreadldO and R t t T h r e a d E x i s t s ( ) are miscellaneous routines which are useful for thread scheduling.  i n t RttGetTimeOfDay(RttTimeValue *time)  18  This call returns, by reference, the current clock time in seconds and microseconds since January 1, 1970 . 1  R t t T h r e a d l d RttMyThreadldO  This call returns the currently running thread's R t t T h r e a d l d .  i n t R t t T h r e a d E x i s t s ( R t t T h r e a d l d thread) This call is a predicate which returns 1 if the thread specified by parameter thread exists, and 0 otherwise. Use of this call is discouraged since even with a return of 1, by the time the following statement executes, the specified thread may no longer exist (preemption may occur between the two statements).  The call is included  primarily for reasons of backward compatibility.  3.3  Synchronization  Counting semaphores are provided as the primary means of thread synchronization. Two attributes are specified by the user when allocating a new semaphore, value and mode. 
Value indicates the number of times RttP() (see below) can be called before a calling thread will block, assuming no RttV() calls are made. Mode indicates the desired semantics for the order in which semaphore blocked threads are made ready. RTTFCFS specifies first come first served semantics, in which threads are made ready in the order in which they were blocked, regardless of priority. RTTPRIORITY specifies highest priority first semantics, in which threads of higher priority will be made ready before threads of lower priority, with threads of equal priority being made ready in earliest deadline first order. Threads of equal priority and equal deadline are made ready in first come first served order in RTTPRIORITY mode.

¹RttGetTimeOfDay() is based on the UNIX gettimeofday() system call. On other architectures, the base time may be different.

Since memory is shared within an RT Threads environment, semaphores can easily be shared among threads within a single environment. In keeping with the cooperative nature of an RT Threads environment, it is left up to the user to ensure that a semaphore is no longer in use when it is freed. To help detect such a situation, an attempt to free a semaphore on which other threads are blocked will result in failure. In the event that a thread is killed while blocked on a semaphore, the state of the semaphore is automatically modified (its value is incremented).

Other threads packages provide primitives intended specifically for providing mutually exclusive access to shared data (these are usually called mutexes). RT Threads does not provide special "mutex" primitives; semaphores are lightweight in RT Threads and can easily be used for this purpose.

RT Threads provides semaphore operations RttAllocSem(), RttFreeSem(), RttP(), RttV() and RttSemValue(). Semaphore variables are of type RttSem.
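The difference between the two wakeup modes described above can be sketched as a selection over the list of blocked threads. This is an illustrative model only, not RT Threads source; the struct and field names are assumptions.

```c
#include <assert.h>

/* Hypothetical record of a blocked thread; names are illustrative. */
struct waiter {
    int priority;   /* 0 is highest */
    long deadline;  /* absolute deadline */
    int arrival;    /* order in which the thread blocked */
};

/* Index of the waiter a V operation would wake. In FCFS mode the
 * earliest arrival wins; in priority mode, priority, then earliest
 * deadline, then arrival order decide. */
static int pick_waiter(const struct waiter *w, int n, int priority_mode)
{
    int best = 0;
    for (int i = 1; i < n; i++) {
        if (!priority_mode) {
            if (w[i].arrival < w[best].arrival)
                best = i;
        } else if (w[i].priority < w[best].priority ||
                   (w[i].priority == w[best].priority &&
                    (w[i].deadline < w[best].deadline ||
                     (w[i].deadline == w[best].deadline &&
                      w[i].arrival < w[best].arrival)))) {
            best = i;
        }
    }
    return best;
}
```

So with one priority-20 waiter that blocked first and two priority-10 waiters behind it, RTTFCFS would wake the priority-20 thread, while RTTPRIORITY would wake the priority-10 thread with the earlier deadline.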
int RttAllocSem(RttSem *sem, int value, int mode)

This call allocates a new semaphore, returned by reference via the sem parameter. The initial value of the semaphore is specified by the value parameter, and must be greater than or equal to 0. The mode for the semaphore (RTTFCFS or RTTPRIORITY) is specified by the mode parameter.

int RttFreeSem(RttSem sem)

This call frees a previously allocated semaphore, specified by the sem parameter. RttFreeSem() will return RTTFAILED if any threads are blocked on the semaphore.

int RttP(RttSem sem)

This is the conventional semaphore wait primitive. It decreases the value of the semaphore specified by parameter sem by one, and blocks if the resulting value is less than zero.

int RttV(RttSem sem)

This is the conventional semaphore signal primitive. It increases the value of the semaphore specified by parameter sem by one, and unblocks one thread blocked on the semaphore if the value was less than zero when the call was made.

int RttSemValue(RttSem sem, int *value)

This call returns, via reference parameter value, the current value of the semaphore specified by parameter sem. This call should be used with caution, since the semaphore's value could change between the return from the RttSemValue() call and the following statement.

3.4 Communication

An inter-thread communication (ITC) facility is provided by RT Threads, allowing message passing between threads both within an RT Threads environment, and between threads in different environments. A single interface for both types of communication is made possible because RttThreadIds are universally unique on the internet.

3.4.1 Intra-Address Space Communication

RT Threads provides blocking send/receive/reply style inter-thread communication primitives. A thread sending a message using RttSend() is blocked until the message has been received by the destination thread and a reply has been made.
A thread receiving a message using RttReceive() is blocked until some other thread sends it a message. RttReply(), a non-blocking operation, is used to reply to a received message. There is also a non-blocking routine RttMsgWaits() that allows a thread to test whether it has a message waiting to be received. RttSend() and RttReply() both copy their message (request and reply) data. A description of the message passing routines follows.

int RttSend(RttThreadId to, void *sData, u_int slen, void *rData, u_int *rlen)

This call sends a message to the thread specified by the to parameter, and blocks waiting for a reply. The request message is specified by a buffer pointer, the sData parameter, and a length, the slen parameter. A buffer must be supplied for the reply to be copied into, given as the rData parameter. The parameter rlen is an input/output parameter, specifying the maximum reply length as input, and returning the actual length of the reply message as output. If the input length is less than the length of the reply sent, the reply will be truncated. RttSend() returns RTTOK on success, RTTNOSUCHTHREAD if the specified destination thread does not exist, and RTTFAILED on failure for any other reason.

int RttReceive(RttThreadId *from, void *data, u_int *len)

This call receives a message which was sent by another thread using RttSend(). The RttThreadId of the sending thread is returned by reference via the from parameter. A buffer must be supplied for the message to be copied into, given as the data parameter. The parameter len is an input/output parameter, specifying the maximum message length as input, and returning the actual length of the received message as output. If the input length is less than the length of the message sent, the message will be truncated.
int RttReply(RttThreadId sndr, void *data, u_int len)

This call sends a reply message back to a thread which is blocked on an RttSend() call. The destination thread (i.e., the original sender) is specified using the sndr parameter. A pointer to the reply message buffer and its length are given as the data and len parameters, respectively. RttReply() returns RTTOK on success, RTTNOTBLOCKED if the thread indicated by sndr is not blocked sending, and RTTFAILED for any other failure.

int RttMsgWaits()

This non-blocking call returns 1 if there is a message waiting to be received by the calling thread, and 0 otherwise.

3.4.2 Inter-Address Space Communication

The message passing routines described in the previous section can also be used to send messages between threads in distinct RT Threads environments, i.e., across address space, and possibly machine, boundaries. The routines described in this section make this possible.

int RttNetInit(unsigned long ipAddr, unsigned short portNo)

This call is used to uniquely identify the calling RT Threads environment on the internet, and to enable cross address space ITC. It must be called in mainp(), before any RttCreate() calls, so that the IP address and port number specified by the ipAddr and portNo parameters can be embedded in all RttThreadIds. The user can let the system choose values for the IP address and/or the port number by specifying 0 for the respective parameters.

int RttMyIP(unsigned int *ipAddr)
int RttMyPort(unsigned int *portNo)

These calls return, by reference, the IP address and port number, respectively, which identify the local RT Threads environment. They are useful if 0 was specified as a parameter to RttNetInit(). Both return RTTFAILED if RttNetInit() has not yet been called.
int RttCreateNS(RttThreadId *thread, void (*addr)(), int stksize, char *name, void *arg, RttSchAttr schedAttr, int level)

This call takes the same parameters as RttCreate() and has the same semantics, except that the created thread is marked by RT Threads as the unique name server thread for the environment. RttCreateNS() may be called only once (it will return RTTFAILED on subsequent calls).

int RttGetNS(unsigned int ipAddr, int portNo, RttThreadId *nServThreadId)

This call returns, by reference, the RttThreadId of the name server thread for the RT Threads environment specified by IP address ipAddr and port number portNo.

bool_t xdr_RttThreadId(XDR *xdrs, RttThreadId *objp)

This call is provided so that RttThreadIds being sent in a message between address spaces can be encoded and decoded using the XDR external data representation.

3.5 Dynamic Memory Allocation

RT Threads provides thread-safe wrappers for the standard C library heap memory allocation routines malloc(), calloc(), realloc() and free(), declared as follows:

    void *RttMalloc(int size)
    void *RttCalloc(int numElems, int elemSize)
    void *RttRealloc(void *ptr, int size)
    void RttFree(void *mem)

Other C library routines which do not have RT Threads equivalents should be protected from context switches with RttBeginCritical() and RttEndCritical() calls.

3.6 Miscellaneous

int RttRegisterExitRoutine(void (*routine)())

This call registers procedure routine (which takes no parameters) to be called when RT Threads exits. RttRegisterExitRoutine() may be called more than once to register multiple exit routines.
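To make the exit-routine hook concrete, here is a hedged sketch of the registration pattern. The fixed-size table, the dispatcher, and the example routine below are all stand-ins for the library's internal bookkeeping; only the RttRegisterExitRoutine() signature is taken from the text.

```c
/* Stand-in registry: the real library keeps its own record of registered
   routines and calls them when the RT Threads environment exits. */
#define MAX_EXIT_ROUTINES 8
static void (*exit_routines[MAX_EXIT_ROUTINES])(void);
static int n_exit_routines = 0;

static int RttRegisterExitRoutine(void (*routine)(void))
{
    if (n_exit_routines >= MAX_EXIT_ROUTINES)
        return -1;                          /* stub failure code */
    exit_routines[n_exit_routines++] = routine;
    return 0;
}

/* Called by the stub "environment" at exit, in registration order. */
static void run_exit_routines(void)
{
    for (int i = 0; i < n_exit_routines; i++)
        exit_routines[i]();
}

/* Example cleanup routine: takes no parameters, as required. */
static int log_flushed = 0;
static void flush_log(void) { log_flushed = 1; }
```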
int RttRegisterDescriptor(int fd)

Any file descriptor obtained from a non-RT Threads call (e.g., one obtained in a non-RT Threads process and passed to an RT Threads environment) should be registered with RT Threads using the RttRegisterDescriptor() call.

void RttBeginCritical()
void RttEndCritical()

These calls should surround any system calls or library calls which do not have RT Threads equivalents and may not be re-entrant. They delimit a critical section within which no RT Threads context switches will occur.

int RttSetData(RttThreadId thread, unsigned long data)
int RttGetData(RttThreadId thread, unsigned long *data)

These calls allow data to be associated with a given thread, and for that data to be retrieved.

3.7 External I/O

RT Threads also includes interface routines for many UNIX-style I/O calls. Blocking calls will block only the thread making the call, and not the entire process in which RT Threads is running. The parameter lists are identical and the semantics are the same as for the corresponding UNIX calls. See the UNIX man pages for details of a particular call. Table 3.2 shows the RT Threads calls which map directly to UNIX calls.

    RTT Routine      UNIX Call      RTT Routine      UNIX Call
    RttOpen          open           RttPipe          pipe
    RttSocket        socket         RttClose         close
    RttBind          bind           RttListen        listen
    RttAccept        accept         RttConnect       connect
    RttRead          read           RttWrite         write
    RttLseek         lseek          RttRecvfrom      recvfrom
    RttSendto        sendto         RttRecvmsg       recvmsg
    RttSendmsg       sendmsg        RttGetsockopt    getsockopt
    RttSetsockopt    setsockopt     RttGetsockname   getsockname

    Table 3.2: Interface to UNIX I/O System Calls

Some other routines which do not map directly to UNIX calls are provided for convenience. They are described as follows:

int RttReadN(int fd, char *buf, int numbytes)

This call has the same semantics as the UNIX read(), except that while read() may return after reading less than numbytes bytes, RttReadN() will block until numbytes bytes have been read (unless an error or the end of file is encountered).

int RttWriteN(int fd, char *buf, int numbytes)

This call has the same semantics as the UNIX write(), except that while write() may return after writing less than numbytes bytes, RttWriteN() will block until numbytes bytes have been written (unless an error is encountered).

int RttSeekNRead(int fd, char *buf, int numbytes, int seekpos, int seekmode)

This call combines the functionality of RttLseek() and RttRead() into a single call.

int RttSeekNWrite(int fd, char *buf, int numbytes, int seekpos, int seekmode)

This call combines the functionality of RttLseek() and RttWrite() into a single call.

Chapter 4

Implementation

Three models were used for implementing RT Threads: a native host model, where RT Threads implements its own context switching; a mapped model, where RT Threads threads are mapped to existing threads provided by the underlying operating system (or a separate user level threads layer); and a native microkernel model, where RT Threads forms part of a microkernel implementation.

Most of the implementation is shared between the native host and mapped implementations.
The source code is shared between the two models for all architectures, with precompiler directives used to switch between model and architecture specific parts of the code. The general description of the implementation here will follow that of the native host version, that being the original model and environment (UNIX) used to implement RT Threads. Features which differ in the mapped implementation are described in section 4.10. The native microkernel is briefly described in section 4.11.

4.1 Data Structures

The overall state of each thread in the system is maintained in a structure called a thread control block (TCB). This structure has the type TCB.

    typedef struct tcb {
        RttThreadId threadId;
        int errNo;
        ThreadMgmtInfo threadMgmtInfo;
        QueueMgmtInfo queueMgmtInfo;
        SchedInfo schedInfo;
        ITCInfo itcInfo;
        SemInfo semInfo;
        IOInfo ioInfo;
    } TCB;

Each thread is identified within the system by a pointer to its TCB. The threadId field of the TCB contains the thread's thread id (an identifier of type RttThreadId, described later in this section). The errNo field of the TCB is a thread specific version of the errno global of the underlying operating system (for UNIX and Win32). The remaining fields of the TCB are structures specific to the various functional modules of the RT Threads implementation: threadMgmtInfo contains information used by the thread management module, queueMgmtInfo by the queue management module, and so on. Each will be described in greater detail in its respective section of this general implementation description.

RT Threads has a few global variables.
Most are part of the _Globals_ global variable, which is a structure of the following type:

    struct globals {
        TCB *currthread;
        int inOsFlag;
        int numbThreads;
        int numbReadyThreads;
        int numbSysThreads;
        int ioPending;
        int tickPending;
        int nameServerCreated;
        RttThreadId nameServerId;
        u_long hid;
        u_long lid;
        struct tcb tcbTbl[NUMBTCBS];
        QUEUE qTbl[NUMQS];
        QUEUE rdyqTbl[NUM_READYQ_PRIORITIES];
    };

The currthread field is a pointer to the TCB of the currently running thread. The inOsFlag field is used to control access to shared data within the RT Threads kernel. Fields numbThreads, numbReadyThreads and numbSysThreads are the number of threads currently alive, the number which are ready, and the number which are "system" threads (threads created with level RTTSYS). The ioPending and tickPending fields are flags which indicate whether I/O processing and clock tick handling, respectively, have been deferred. The nameServerCreated field is a flag indicating whether the special name server thread has been created, and nameServerId is the thread id of the name server thread, if there is one. The hid and lid fields are the host id and local id (IP address and port) identifying the local RT Threads environment (see section 4.7.1). The tcbTbl, qTbl and rdyqTbl fields are the global TCB table, queue table and ready queue table, respectively.

Access to the shared data structures of the RT Threads implementation, such as TCBs, is protected by setting inOsFlag whenever a thread is in a critical section in the RT Threads kernel. This flag is checked whenever a preempting event occurs in the kernel, i.e., whenever a clock tick occurs (see section 4.5) or I/O has completed (see section 4.9). If the flag is set when such an event occurs, any associated processing and scheduling is deferred (by setting the ioPending or tickPending flag) until the original thread leaves its critical section.
These critical sections are delimited with ENTERINGOS and LEAVINGOS, which are macros for the enteringOs() and leavingOs() routines. enteringOs() simply sets inOsFlag. leavingOs() clears the flag, and also checks ioPending and tickPending to see whether any processing has been deferred, calling the respective handlers directly if necessary.

The publicly visible type for threads, RttThreadId, is a 64-bit value which contains enough information to uniquely locate a thread among RT Threads environments on the internet. Figure 4.1 shows the layout of the RttThreadId. It is divided into two parts, the host id (hid) and the local id (lid). The host id contains the 32-bit IP address associated with the thread's RT Threads environment, used to locate its host machine. The local id identifies the thread on a particular machine, using the 16-bit port number, and within a particular threads environment, using the instance number and index.

    hid: IP address (32 bits) | lid: port (16 bits), instance (4 bits), index (12 bits)

    Figure 4.1: RttThreadId

The index is a 12-bit index into the TCB table, allowing for as many as 4096 simultaneously alive threads per address space. The 4-bit instance number is incremented each time a TCB structure is reused, to help prevent the thread id of a terminated thread from mistakenly being accepted as a reference to a newer thread using the same TCB table index. The instance number can range from 1 to 15, instance 0 being reserved to indicate the special name server thread.

4.2 System Startup

This section describes how the RT Threads system (i.e., an RT Threads environment running in an underlying process) is initialized. From the user application's point of view, an RT Threads application starts in the same manner as a normal console application, except that the entry routine is named mainp() rather than main().
The actual main() routine is defined in the RT Threads library, and is the entry point for RT Threads startup. In main(), the routine osInit() is first called to initialize data structures, and then the null thread and main thread are created. After these threads are created, the StartOs_() routine (of the scheduling module, see section 4.5) is called to start the scheduling clock.

static void osInit()

This routine first does dynamic initialization of the global variables currthread and inOsFlag, and then calls the routines InitReadyQueue_(), InitThreadMgmt_() and InitIO_() to do module specific initialization. These routines are described in their respective sections.

RTTTHREAD NullThread_()

This is the entry routine for the null thread. The null thread is created with a special priority lower than that of any other thread in the system, and thus runs only when no other threads are ready to run. The null thread simply loops on the pause() system call, which yields control of the processor, allowing other processes in the underlying system to run while the RT Threads environment is idle waiting for threads to wake up or I/O to complete. When such an event occurs, a signal is generated, causing pause() to return. This return will not actually happen until the threads scheduled by the preempting event have completed their tasks, at which time the RT Threads system becomes idle again, and the null thread loops back to call pause() again.

RTTTHREAD mainThread()

This is the entry routine for the main thread, which simply calls the user defined mainp() routine, passing along the argc and argv parameters from main().
The main thread is created with the highest possible priority and the earliest possible deadline (RTTZEROTIME), and as such it will not be preempted by any threads created in mainp() (the intended purpose of mainp() is user application initialization, e.g., command line parsing and the creation of one or more initial threads).

4.3 Thread Management

The thread management module is responsible for the creation and termination of threads, and for maintaining state specifically related to the running of a thread, such as its entry point and stack. The ThreadMgmtInfo structure is shown below.

    typedef struct {
        int inUse;
        RttSchAttr attrs;
        TSTATE state;
        char name[MAXTNAMELEN + 1];
        char *namePtr;
        int *stmembeg;
        int *stkend;
        int *sp;
        int stackSize;
        void (*threadAddr)();
        void *threadArg;
        unsigned long uVal;
        int level;
    } ThreadMgmtInfo;

The inUse field is a flag indicating whether the TCB is currently in use. The attrs field maintains the current scheduling attributes of the thread in an RttSchAttr structure, as described in section 3.1. The current state of the thread is maintained in the state field, which is of the enumerated type TSTATE, with possible values as follows:

    typedef enum {
        RBLK = 0,               /* receive blocked    */
        SBLK = 1,               /* send blocked       */
        WTFR = 2,               /* waiting for reply  */
        SLEP = 3,               /* sleeping           */
        SEMB = 4,               /* semaphore blocked  */
        IOBK = 5,               /* I/O blocked        */
        THREAD_FREE = 20,       /* not in use         */
        THREAD_READY = 21,      /* ready              */
        THREAD_RUNNING = 22,    /* running            */
        THREAD_DYING = 23,      /* dying              */
        NOQ = 0xff              /* not on any queue   */
    } TSTATE;

The name and namePtr fields are used to hold and refer to the string name of the thread.
The thread's stack is maintained using the stmembeg, stkend, sp and stackSize fields, which indicate the beginning and end of the stack in memory, the current stack pointer, and the stack size, respectively. The threadAddr and threadArg fields are the thread entry point (i.e., the address of the entry routine) and the argument passed into the entry routine. Thread specific user data is stored in the uVal field. The level field indicates the thread level, as specified at thread creation and described in section 3.2.1.

The following routines, defined in the thread management module, are used to assist in thread management:

static TCB *findUnusedTCB()

This routine searches the TCB table for a TCB whose inUse field is 0. If one is found, a pointer to that TCB is returned; otherwise TNULL is returned (TNULL is a null TCB pointer). The last index tried is maintained in a static local variable as an optimization, so that the search can continue with the next TCB in the table the next time findUnusedTCB() is called.

static void initTCB(int tcbNo, TCB *tcbPtr)

This routine initializes a TCB at system startup. Most notably, the inUse field is set to 0, the state is set to THREAD_FREE, and the instance number is set to 1 (instance number 0 is reserved for the special name server thread).

static void threadIsLeaving(TCB *tcbPtr)

This routine is used to call cleanup routines for exiting threads, as defined in other modules. ItcThreadCleanup_() and SemCleanup_() are called.

static void killCurrthread()

This routine is called when a thread terminates itself, either by returning from its entry routine or by calling RttExit() or RttKill(), to remove itself from the system. The numbThreads field of _Globals_ is decremented, as is the numbSysThreads field if the thread is a system thread. The threadIsLeaving() routine is called to do any cleanup for other modules.
SysFree_() is used to free the thread's stack. The TCB's state is set to THREAD_FREE, the instance number is incremented (wrapping to 1 if it exceeds 15), and the inUse field is set to 0. If numbThreads is greater than numbSysThreads, the next ready thread is scheduled by using DeqReadyThread_() to assign a new currthread, and then calling SchedThread_() to schedule it. Otherwise, there are no RTTUSR threads left in the system, and the RT Threads environment first calls CallExitRoutines_() to invoke any user registered exit routines, and then exits using the underlying exit() call.

void InitThreadMgmt_()

This routine is used to initialize thread management. The numbThreads and numbSysThreads fields of _Globals_ are set to 0, and the initTCB() routine is called for each TCB in the TCB table.

void CallExitRoutines_()

This routine is called when the RT Threads environment exits naturally. It traverses a linked list of routines previously registered by the application with the RttRegisterExitRoutine() API call.

void CallNclean_()

CallNclean_() is the entry routine used by the RT Threads implementation to start each new thread the first time it is scheduled to run. The actual entry routine for the thread, as specified in the RttCreate() call, is explicitly called by CallNclean_(). When CallNclean_() is called, currthread will already have been set to point to the new thread's TCB by the scheduler, so the entry routine address and argument can be obtained from the threadAddr and threadArg fields of the TCB's threadMgmtInfo field. Should the user specified thread entry routine return, the killCurrthread() routine is called to remove the thread from the system.

Threads are created using the RttCreate() interface routine. First, an unused TCB is obtained using the findUnusedTCB() call. If none is available, RttCreate() returns RTTFAILED. Otherwise, the new thread's threadMgmtInfo field is initialized. The inUse field is set to 1, and the attrs, name, stackSize, threadAddr, threadArg and level fields are assigned the values passed in as parameters to RttCreate(). A stack is allocated for the thread using SysMalloc_(). The IP address and port number of the threadId field of the TCB are set to the values of the hid and lid globals. The thread's state is set to THREAD_READY, and the new field of the TCB's schedInfo field is set to 1, to indicate to the scheduler that this is a new thread. The new thread is now ready to be scheduled. The current time is determined using RttGetTimeOfDay(), and is compared with the starting time attribute of the new thread to see whether it should be put to sleep or made ready. If the starting time has been reached, the thread is made ready by calling EnqueueEdfTail_(); otherwise it is put to sleep by calling SetTimeout_() (see section 4.5). If the thread's level is RTTSYS, the number of system threads (numbSysThreads) is incremented, and the next ready thread is scheduled using the SchedM macro (this will be either the currently running thread, i.e., the one which called RttCreate(), or the newly created thread, depending on their scheduling attributes).

RttExit() and RttKill() can be used to terminate threads. RttExit() terminates the currently running thread by calling killCurrthread(). RttKill() first checks whether its thread id parameter refers to the currently running thread, in which case it too calls killCurrthread(). Otherwise, it terminates the specified thread in much the same way that killCurrthread() does, except that it must also dequeue the thread from whichever scheduling queue it is on, determined by its state (and priority if it is a ready thread), using the DEQUEUE_ITEM() macro.
4.4 Queue Management

All threads except the running thread are maintained on queues. The queues described in this section are used primarily by the scheduler, and will be referred to generally as scheduling queues. Ready threads are maintained in an array of ready queues, one for each priority level. Queues are implemented as doubly linked lists, for efficient removal of a thread from a queue. The QueueMgmtInfo structure is shown below.

    typedef struct {
        struct tcb *prev;
        struct tcb *next;
        u_char queue;
    } QueueMgmtInfo;

The prev and next fields are links into whichever queue the thread is currently on. The queue field is an 8-bit integer value, corresponding to the current state of the thread, indicating which queue the thread is on. The routines ENQUEUE_HEAD(), ENQUEUE_TAIL(), ENQUEUE_ITEM(), DEQUEUE_HEAD() and DEQUEUE_ITEM(), defined as macros for efficiency, allow threads to be placed on or removed from anywhere in a queue. The FindThreadOnQ_() macro is used to determine whether a thread is on a particular queue.

A thread is always on a single scheduling queue, unless it is the currently running thread (state THREAD_RUNNING), in which case it is not on any queue. Threads which are ready to run, i.e., are not blocked, sleeping, or running, have state THREAD_READY and are maintained on one of the ready queues. There is one ready queue for each priority level, these queues being maintained in an array called the ready queue table (field rdyqTbl in the _Globals_ structure). The ready queue module provides routines which place threads in the appropriate position (according to the EDF rule) on the appropriate queue (i.e., the one corresponding to the thread's priority). Thus, the thread at the head of each priority queue is always the one which should be scheduled next for that priority. The ready queue routines are:
The ready queue routines are:  v o i d EnqueueEdfTail.(TCB *thread) This routine places a ready thread, specified as a pointer to its T C B , on the ready queue for its priority, behind all threads with an earlier deadline. The thread is 41  placed immediately following (at the tail of) any other threads with the same deadline.  v o i d EnqueueEdfHead_(TCB *thread) This routine places a ready thread, specified as a pointer to its T C B , on the ready queue for its priority. The thread is placed immediately preceding (at the head of) any other threads with the same deadline. This routine is used when a thread is made ready after being blocked sending or receiving a message, or doing external I/O.  TCB *DeqReadyThread_(void) This routine returns a pointer to the T C B of the thread which is at head of the highest priority, non-empty ready queue, removing that thread from the queue.  TCB *DeqIfReady_() Like DeqReadyThread_(), this routines removes the highest priority thread and returns a pointer to its T C B , but only if that thread would preempt the currently running thread. If so, the currently running thread, i.e., the thread indicated by the global variable c u r r t h r e a d , is placed on its appropriate priority queue. Otherwise, the value of c u r r t h r e a d is returned.  v o i d DeleteReadyThread.(TCB *thread) 42  This routine removes the thread specified by thread from its priority queue.  Threads which are not ready or running are either sleeping or blocked, and will be on an appropriate queue, having been placed there with one of the macro routines described earlier in this section. Sleeping threads, i.e., those whose starting time is in the future, are maintained in the sleep queue, ordered by start time for efficient waking. A l l other threads are on various blocked queues, one each for semaphores, sending, waiting for a reply, receiving and I / O .  
4.5 Thread Scheduling

The key features of thread scheduling are:

• determining which thread should be scheduled to run at any given time
• a clock for scheduling future events
• a mechanism for context switching, including saving thread state (registers)

Determination of which ready thread should be next to run at any given time is handled largely by the queue management routines described in section 4.4. The following sections describe the implementation of the system clock for RT Threads, and how context switches are performed.

4.5.1 System Clock

A "clock" is required by RT Threads for the purpose of making threads ready at their starting time. This is implemented using an interval timer (setitimer()) which causes periodic alarm signals to be sent to the process. These signals are caught by a signal handler called the tick handler. The tick handler checks the sleep queue to see whether any threads have a starting time less than or equal to the current time (obtained using RttGetTimeOfDay()). All such threads are taken off the sleep queue and made ready. If there is a thread which should preempt the currently running thread, a context switch to that thread takes place immediately, within the signal handler. This means that there may be no actual return from the signal handler for an indefinite period of time, possibly longer than the interval between clock ticks. Thus, in order to ensure that subsequent alarm signals are caught at the correct time, the signal mask for the process must be modified before such a context switch in order to unblock alarm signals.

The following scheduling module routines are used to implement the clock and related functions:

static int wakeSleepers()

This routine is called by the tick handler to take off the sleep queue all threads whose starting times have passed, and make them ready by putting them on their respective ready queues.
RttGetTimeOfDayO is called to get the current time for comparison, and the sleep queue is traversed using the queue management macros, dequeuing threads until one whose starting time is still in the future is encountered. 44  Threads are ordered in the sleep queue by starting time, so it is not necessary to search the entire queue. Each thread which is dequeued from the sleep queue is put on its ready queue using EnqueueEdfTail_(). wakeSleepersO then returns the number of threads which were awakened.  s t a t i c void tickHandlerO The tick handler is a routine registered with the operating system to to be called at regular intervals. It is a signal handler which catches the SIGALRM signal. When it is called, the inOsFlag is first checked to see if it set. If so, the tickPending field of _Globals_ is set, to defer the signal handling, and t i c k H a n d l e r returns immediately. If not in a critical section, it calls wakeSleepersO to wake (make ready) any threads whose starting time has passed. If wakeSleepersO returns a non-zero value, at least one thread has been awakened, meaning that a context switch may be required. The SchedM scheduling macro (see section 4.5.2) is called to perform such a switch, if required. However, while in the signal handler, the underlying operating system blocks SIGALRM signals until the signal handler routine returns. If a context switch occurs, the routine will not return for an indefinite period of time, during which the SIGALRM signal would continue to block, thus delaying future clock ticks indefinitely. To overcome this problem, the SigMask_() routine (see below) is called to unblock the SIGALRM signal before calling SchedM.  int  SetTimeout_(TCB * t h r e a d , RttTimeValue wakeTime)  The SetTimeout_() routine is used to put a thread to sleep by placing it on the time ordered sleep queue. The thread's state is set to SLEP, and its starting time is set to 45  wakeTime. 
The sleep queue is traversed using the queue management macros until a thread with a later starting time is encountered, and the new sleeper is linked into the queue ahead of that thread, unless the end of the queue is reached, in which case the new thread is linked to the tail of the queue.

void SigMask_(int how, int sig)

SigMask_() is a convenience routine used to block or unblock signals. The how parameter is used to select whether the signal is to be blocked or unblocked, taking as a value either SIG_BLOCK or SIG_UNBLOCK. The sig parameter is the signal to block or unblock, for example SIGALRM. The underlying sigprocmask() call is used.

void StartClock_(int useconds)

StartClock_() is used to start the system clock. It uses the setitimer() call to initiate the generation of SIGALRM signals at regular intervals, that interval being specified in microseconds via the useconds parameter.

void StartOs_()

StartOs_() is used to initiate the signal related functionality of the system. The tick handler and SIGIO handler (see section 4.9) are registered with the operating system using the sigaction() call, and the RT Threads system clock is started by calling StartClock_().

4.5.2  Context Switching

The following data structure, SchedInfo, is the type for the schedInfo field of a TCB.

typedef struct {
    int savearea[36];
    int new;
} SchedInfo;

The savearea field is a statically allocated piece of memory which is used to store the register state of a thread when execution switches to the context of another thread. Its size, 144 bytes, is large enough to accommodate the necessary registers for all of the architectures on which RT Threads is implemented. The new field is a flag used to determine whether a thread is being scheduled to run for the first time.
The routines which perform the actual context switches are implemented in assembler code because they require access to the registers for saving thread state, and because they employ a "trick" involving return addresses. (Solaris provides support for user level context switching (setcontext() and getcontext()); this support was not used in the RT Threads implementation, in part because it is not portable to other operating systems.) The assembler code implements three functions used to achieve user level context switching:

void startNewThread(void (*func)(), void *sp)

This routine is used to start a new thread. It sets the initial stack pointer for the thread to point to sp, the beginning of the stack memory for the thread as allocated in RttCreate(), and then jumps to func, which is the address of the entry routine for the new thread (in reality, the address of CallNclean_() is passed in, so that this routine can do cleanup if the actual thread entry routine returns).

int saveThreadContext(void *savearea)

The saveThreadContext() routine is used to save the state (i.e., the registers) of a thread prior to a context switch. It takes a parameter savearea which is a pointer to the memory used to save the current state of the thread. The savearea field of the TCB's schedInfo field is passed in for this purpose. When called to save the state of a thread, saveThreadContext() returns a non-zero value, so that the context switching code will know that another thread must be resumed.

void returnToThread(void *savearea)

The returnToThread() routine is used to resume a thread which had at some point previously saved its state using saveThreadContext(). The contents of the memory referred to by its parameter savearea are loaded into the registers.
returnToThread() does an "unnatural" return, i.e., the function returns to the same address as the last call to saveThreadContext() for that thread did, except this time the return value is 0. Thus the context switching code can tell that a thread is being resumed and that it should continue with execution of user code for that thread.

The following routines are also involved in context switching:

void SchedThread_()

This routine is called to perform a context switch, by calling the appropriate assembler routine. When called, currthread must point to the TCB of the next thread to run. The state of currthread is set to THREAD_RUNNING. Then, if the new field for the thread is set, startNewThread() is called, and otherwise returnToThread() is called, resulting in the actual context switch. SchedThread_() never returns.

SchedM

The SchedM routine is defined as a preprocessor macro for efficiency. It is called by the RT Threads kernel whenever there is the possibility that a context switch is needed, and it calls the appropriate routines to effect the context switch if required. It must be called within a kernel critical section (i.e., ENTERINGOS must have been called previously), and LEAVINGOS is called within the macro. When it is called, the current thread must either be in the THREAD_RUNNING state, or one of the blocked states (e.g., SEMB). In the latter case, the thread will have just been put in a blocked state by the kernel code, and SchedM will put the thread on the appropriate scheduling queue. A context switch will necessarily occur, because the current thread is now blocked. On the other hand, if the current thread was in the THREAD_RUNNING state before the SchedM call, a context switch might not occur, since there might not be a ready thread which would preempt the current one. DeqIfReady_() is called to determine the next thread to run.
If it returns currthread, no context switch takes place; otherwise saveThreadContext() is called to save the state of the current thread in preparation for a context switch. The saveThreadContext() call will return a non-zero value, and this condition will result in currthread being set to the TCB pointer returned by DeqIfReady_(). The same procedure is followed if the originally current thread was in a blocked state, except that DeqReadyThread_() is used to obtain the TCB pointer for the next thread to run, since it is known that it will not be currthread. At this point SchedThread_() is called to perform the context switch. The saveThreadContext() call is in a conditional statement which, when the return value is zero, results in a LEAVINGOS call and a return to the code which called SchedM. This occurs when the originally preempted thread is scheduled to run again.

4.5.3  Modifying Scheduling Attributes

The RttSetThreadSchedAttr() API routine can be used to modify the scheduling attributes of a thread. This can result in an immediate context switch. If the thread is sleeping, it is first removed from the sleep queue with the DEQUEUE_ITEM() queue management macro and its state is set to THREAD_READY. The sleeping thread is dequeued because if the thread is to remain sleeping, but with a modified starting time, it will need to be re-enqueued on the sleep queue to ensure it ends up in the correct position in the time ordering. If the thread is on a ready queue, it is removed from that ready queue with DeleteReadyThread_() because it may change state or priority. RttGetTimeOfDay() is then called to determine the current time for comparison.
If the new starting time for the thread is in the future, the thread's scheduling attributes are assigned the values indicated by the attr parameter, and if the thread was running or ready, it is placed on the sleep queue using SetTimeout_() and SchedM is called to schedule the next ready thread. Otherwise, if the thread having its attributes set is the currently running thread, its attributes are simply assigned their new values and SchedM is called. If the thread was not the currently running thread and its state is now THREAD_READY (because it had been ready or sleeping), it is put on the appropriate ready queue with EnqueueEdfTail_() and SchedM is called to schedule the next ready thread (which may or may not be the one just made ready).

4.6  Semaphores

The publicly visible type for semaphores is the opaque type RttSem (defined as a void * in the interface). In the semaphore implementation, the RttSem type maps to the private type SemaphoreStruct, i.e., RttSem variables passed into the semaphore routines are actually pointers to SemaphoreStructs, and are cast accordingly within the routines. The SemaphoreStruct type is a structure defined as follows:

typedef struct semaphore {
    int magicNumb;
    TCB *firstBlockedThread;
    TCB *lastBlockedThread;
    int semaphValue;
    int mode;
} SemaphoreStruct;

The magicNumb field is a magic number used to ensure that the semaphore pointer is valid. The firstBlockedThread and lastBlockedThread fields are TCB pointers for the first and last thread in the queue of threads blocked on the semaphore. The semaphValue field is the current value of the semaphore. The mode field indicates the mode of the semaphore, either first come first served (RTTFCFS) or priority based (RTTPRIORITY).
Semaphores are created using the RttAllocSem() routine, which allocates a SemaphoreStruct dynamically using SysMalloc_(), and initializes its fields (the semaphValue and mode values are obtained from the value and mode parameters). The pointer to the allocated memory is cast to RttSem and returned by reference. RT Threads semaphores are destroyed by calling the RttFree() routine, which frees the memory allocated for the semaphore using SysFree_(). RttFree() first checks, however, whether the semaphore value is less than zero, indicating that threads are blocked on the semaphore, in which case it returns RTTFAILED without freeing the semaphore.

Threads which are blocked on a semaphore maintain the pertinent information in the semInfo field of the TCB, which is a structure of type:

typedef struct {
    struct tcb *semaphNext;
    struct tcb *semaphPrev;
    RttSem semaphore;
} SemInfo;

The semaphNext and semaphPrev fields are links used to place the thread in the queue of threads blocked on a particular semaphore. The semaphore field is the semaphore which the thread is blocked on (or NULL if the thread is not blocked on a semaphore).

The following are routines local to the semaphore module used to add and remove threads from a semaphore's queue of blocked threads:

static void addBlockedThread(SemaphoreStruct *semaph)

This routine links the currently running thread (currthread) into the blocked queue for the semaphore indicated by the pointer semaph. Links from the semInfo field of the thread's TCB are used. The lastBlockedThread field of semaph is set to currthread, placing the thread at the tail of the queue.

static void removeThread(SemaphoreStruct *semaph, TCB *tcbPtr)

This routine removes the thread indicated by tcbPtr from anywhere in the blocked queue for semaph.
This is an efficient operation because the TCB has the previous and next links which locate the thread in the queue. The removeThread() routine is used when a thread waiting on a semaphore becomes unblocked (due to an RttV() call on the semaphore), or when a semaphore blocked thread is terminated.

static TCB *dequeueFirstBlockedThread(SemaphoreStruct *semaph)

This routine calls removeThread() with the firstBlockedThread field of semaph to remove the first blocked thread from the semaphore's blocked queue. It is used for first come first served mode semaphores.

static TCB *dequeueHighPrioBlockedThread(SemaphoreStruct *semaph)

This routine searches the blocked queue for semaph to determine which thread has the highest priority. If more than one thread has this priority, the one with the earliest deadline is chosen, and if there is more than one such thread, the one closest to the first blocked thread is chosen. The resulting thread is then removed from the queue using removeThread(). This routine is used for priority mode semaphores.

void SemCleanup_(TCB *tcbPtr)

SemCleanup_() is called by a terminating thread to do semaphore related cleanup. If the thread indicated by tcbPtr is blocked on a semaphore, it is removed from the semaphore's blocked queue using removeThread(), and the semaphore's value is incremented to reflect the fact that one less thread is blocked on it.

Threads can wait on or signal a semaphore using the RttP() and RttV() API routines. Each takes a semaphore variable (type RttSem), previously created with RttAllocSem(), as its single parameter, and casts that parameter to a pointer of type SemaphoreStruct* so it can access the semaphore's state. Each also checks the semaphore's magic number to ensure it is a valid semaphore.

In RttP(), the semaphore's value is first decremented. If the new value is less than zero, the thread will block on the semaphore.
If this is the case, addBlockedThread() is called to put the thread (i.e., the currently running thread, currthread) at the tail of the blocked queue for the semaphore. Then the state of the thread is set to SEMB (semaphore blocked), and the scheduling macro SchedM is called to put the thread on the semaphore blocked queue (the scheduling queue, not to be confused with the semaphore's own blocked queue) and schedule the next ready thread (see section 4.5). If the new value of the semaphore is not less than zero, the thread does not block and RttP() returns immediately.

RttV() increments the semaphore's value. If the resulting value is less than or equal to zero, then there was at least one thread blocked on the semaphore and thus a blocked thread should be made ready. If the semaphore's mode is RTTFCFS, then the thread to be made ready is chosen with dequeueFirstBlockedThread(), and if the mode is RTTPRIORITY, the thread is chosen with dequeueHighPrioBlockedThread(). The chosen thread is made ready with EnqueueEdfTail_(), and the scheduling macro SchedM is called. If the newly ready thread has scheduling attributes which take precedence over those of the currently running thread, the current thread will be preempted immediately.

4.7  Inter Thread Communication

Information used for RT Threads' message passing facility is maintained in the itcInfo field of the TCB of each thread.
This field is a structure of type ITCInfo, defined as follows:

typedef struct {
    int remoteItc;
    RttThreadId remoteThreadId;
    void *sendmsg;
    u_int sendlen;
    struct tcb *peer;
    void *receivedmsg;
    u_int receivedlen;
    struct tcb *waitingFirst;
    struct tcb *waitingLast;
    struct tcb *waitingNext;
    struct tcb *waitingPrev;
} ITCInfo;

The first field, remoteItc, is a flag indicating whether the thread is currently in the midst of a cross address space message passing transaction. The second field, remoteThreadId, is the thread id of the remote sending thread in such a transaction (see section 4.7.1). The fields sendmsg and sendlen are the message buffer and buffer length for the request message, maintained by the sending thread in the event that the intended receiver is not blocked receiving a message at the time the message is sent. The peer field is used to communicate the identity of the sending thread to the receiving thread. The receivedmsg and receivedlen fields are the buffer and buffer length for depositing the request message at the receiving thread, and for depositing the reply message at the sending thread. The remaining fields are used to implement a queue of waiting senders at a receiving thread, for the case where multiple threads send to a single thread before it does a receive. The waitingFirst and waitingLast fields are used to mark the ends of the queue in the intended receiver's TCB. The waitingNext and waitingPrev fields are links in the TCB of each of the sending threads in the queue.

The following local routines are defined in the ITC module:

static void addWaitingSender(TCB *to)

This routine uses the links in the itcInfo field of a sending thread's TCB to place the thread in the queue of waiting senders for an intended receiver thread, as specified by the to parameter.
static void dequeueSender(TCB *tcbPtr)

This routine dequeues the first sender thread from the queue of waiting senders maintained at a receiving thread. A first come first served policy is used for enqueuing and dequeuing senders.

static TCB *getTcbFromRemoteThreadId(RttThreadId thread)

This routine maps a remote thread id to the TCB in which it is stored. The TCB table is searched until a match is found.

void ItcThreadCleanup_(TCB *tcbPtr)

This routine is called to perform ITC related cleanup when a thread terminates. If any sender threads are on the waiting sender queue of the thread referred to by tcbPtr, they are taken off the scheduler's send blocked queue and made ready.

The RttSend() routine first compares the IP address and port number in the destination thread id (determined using the GetIpAddrFromThreadId() and GetPortNoFromThreadId() macros) with the local RT Threads environment's host id and local id to see whether the intended receiver is a local thread. If they differ, the receiver is in another threads environment and ExecuteRemoteRequest_() is called to perform the cross address space communication (see section 4.7.1). Otherwise, the receiver is local, in which case a pointer to the TCB of the intended receiver is obtained to do the processing of the local send (the sending thread is the currently running thread, i.e., currthread). The macro getTcbFromThreadId() is used to convert a thread id to a TCB pointer. This is done with the to parameter thread id, unless its instance number is 0, which indicates that the intended receiver is the local name server thread (see section 4.7.1), in which case getTcbFromThreadId() is used with the globally stored name server thread id to determine its TCB pointer.
This will occur either if the name server thread is the destination in a cross address space communication, in which case this will be the local proxy send, or if the name server thread id was obtained locally with the RttGetNS() call. The receivedmsg field for the sender thread is then set to the reply buffer address, passed in as the rData parameter to RttSend(). The FindThreadOnQueue_() macro is used to see if the intended receiver is on the receive blocked (RBLK) scheduling queue. If so, the receiver will have set the receivedmsg and receivedlen fields of its itcInfo structure, and the sender copies the request data (parameter sData) into the receiver's buffer. The length of the copy is the minimum of the send length specified by the sender (parameter slen), and the maximum receive length specified by the receiver (the value in the receivedlen field). The sender then sets itself as the receiver's peer, sets its own state to be waiting for reply (WTFR), makes the receiving thread ready by changing its state to THREAD_READY and putting it on its ready queue using EnqueueEdfHead_(), and then blocks itself and makes the next thread ready (possibly the receiver) by calling the scheduling macro SchedM. In the event that the intended receiver exists but is not blocked receiving, the sender puts itself on the receiver's queue of waiting senders using addWaitingSender(), sets its own sendmsg and sendlen fields for the receiver to eventually copy the request message from, sets the receiver as its peer, sets its own state as send blocked (SBLK) and calls SchedM to block itself and schedule the next ready thread. Note that RttSend() will always block. When the sending thread eventually becomes unblocked, it checks its own receivedlen field to find out the length of the reply message, as set by the replying thread, and returns (the reply length is returned by reference).
In RttReceive(), the receiving thread starts by checking the waitingFirst field of its TCB's itcInfo field. If it is not TNULL, there is at least one message waiting, and the thread can pick it up right away without blocking. That is, the waitingFirst field will be the TCB pointer for the next sender, and the receiver copies the message stored in its sendmsg field into its own receive buffer (the data parameter to RttReceive()), the copy length being the minimum of the send length (specified in the sender's sendlen field) and the receive length (the dereferenced value of the len parameter to RttReceive()). The receiver then changes the sender's state from send blocked (SBLK) to waiting for reply (WTFR), switching it to the appropriate scheduling queue, and then returns. In the event that there are no messages waiting for the receiver, it sets its own receivedmsg and receivedlen fields to indicate the buffer to eventually be filled with a request message, then sets its state to receive blocked (RBLK) and calls SchedM to block itself and schedule the next ready thread.

RttMsgWaits() simply checks waitingFirst for the calling thread, returning 0 if it is NULL and 1 otherwise.

RttReply() first checks the thread id to which the reply is being sent (parameter sndr) to see if it is a local or a remote thread. If it is a local thread, its TCB pointer is obtained using the getTcbFromThreadId() macro; otherwise the getTcbFromRemoteThreadId() routine is used (this maps the remote thread id to the TCB for a proxy thread which does a local send for the remote thread). In the local case, FindThreadOnQ_() is used to see whether the destination thread (i.e., the original sender) is actually blocked waiting for a reply. If not, RTTNOTBLOCKED is returned.
Otherwise, the reply is copied into the receivedmsg field in the sender's TCB, the sender is made ready by setting its state to THREAD_READY and calling EnqueueEdfHead_(), and SchedM is called. RttReply() never blocks, but there may be an immediate context switch to the sender if it has a higher priority, or equal priority and an earlier or equal deadline.

4.7.1  Cross Address Space ITC

The cross address space ITC facility is for the most part built on top of RT Threads, but since it makes use of a few of the kernel's facilities to make message passing location transparent, it is considered part of the RT Threads kernel. Location transparency is achieved by embedding each thread's location in the RttThreadId, and extracting it within the RttSend() call. TCP/IP is used as the underlying protocol for sending messages across address space boundaries. Thus, if within the RttSend() call it is determined that the RttThreadId of the intended receiver refers to a thread with an IP address and port number for an RT Threads environment other than the local one, ExecuteRemoteRequest_() is called to set up a TCP connection to the remote environment.

For receiving messages from remote senders, each RT Threads environment can enable an ITC server system thread which listens for incoming connection requests, and spawns worker threads which forward ITC messages to the appropriate local thread using a local RttSend(). The following data structures are used by the cross address space ITC implementation:

struct ITCinfo {
    u_int slen;
    u_int rlen;
    RttThreadId sender;
    RttThreadId receiver;
} ITCinfo;

The ITCinfo structure is used for transmitting control information. It precedes the actual message data in every transmission from sender to receiver. The slen field is the message data length, and rlen is the maximum length reply the sender will accept.
The sender field is the sender thread id (which will be returned to the receiving thread by the RttReceive() call at the remote end) and receiver is the thread id of the intended receiver, so it can be located when the message gets to the remote threads environment.

struct ReplyInfo {
    int retCode;
    u_int replyLen;
} ReplyInfo;

The ReplyInfo structure is used to transmit control information back to the sender along with the reply message. The retCode field is used to relay the return code for the remote RttSend() back to the sender, and the replyLen field indicates the length of the reply.

The RttNetInit() API routine is used to enable cross address space ITC. It sets the host id (hid) and local id (lid) for the local RT Threads environment according to the ipAddr and port parameters passed to it. Either or both of these parameters may be zero, in which case system chosen values are used. In the case of the IP address, the underlying gethostbyname() call is used to determine the local IP address (which may not be the desired one if the host is multi-homed). In the case of the port number, the underlying bind() call is used to determine the port, if one is not chosen by the application.

The routines of the remote ITC module which implement cross address space message passing are described below:

int ExecuteRemoteRequest_(RttThreadId to, u_int slen, void *sData, u_int *rlen, void *rData)

This routine is used to perform a remote send. The parameters are the same as those passed to the RttSend() call from which ExecuteRemoteRequest_() is called. A socket is created, bound to a system chosen port, and used to connect (using RttConnect()) to the IP address and port embedded in the to parameter thread id.
An ITCinfo structure is then filled in with the slen and dereferenced rlen parameter values, the thread id of the currently running thread for the sender, and the receiver being the to parameter. This control information structure is byte order encoded for network transmission using the locally defined encodeInfo() routine, which converts each field to network byte order using htonl(). The control information structure (variable info) is then sent to the intended receiver using RttWriteN(), followed by another RttWriteN() call to send the actual message data. The message is considered to be a stream of bytes, and any byte order considerations are the responsibility of the higher level application calling RttSend(). At this point ExecuteRemoteRequest_() will block waiting for the reply from the remote receiver. First RttReadN() is called to receive the reply control information, which is read into a ReplyInfo structure. If the return code (retCode) is not RTTOK, the TCP socket is closed and the return code value is returned to RttSend(), where it is returned to the original caller. Otherwise, the replyLen field is used to determine how many more bytes are to be read from the connection in a subsequent RttReadN() call, which reads the data into the rData buffer supplied by the original caller. The dereferenced value of the rlen parameter is also set to the reply length. The TCP connection is then closed and RTTOK is returned.

static RTTTHREAD itcServer()

This routine defines the ITC server thread, created in the RttNetInit() call. The ITC server first calls RttListen() on the local socket descriptor (also created by RttNetInit()) to listen for incoming connection requests from other RT Threads environments doing remote sends to this one. The server then goes into a loop, first calling RttAccept() to accept any incoming connections.
Whenever RttAccept() returns with the socket descriptor for a new connection, it calls RttCreate() to create an ITC worker thread (see the itcWorker() description below), passing it the socket descriptor as its argument.

static RTTTHREAD itcWorker(void *arg)

An ITC worker thread is created each time a new connection corresponding to an incoming remote send request arrives, leaving the ITC server available to handle more incoming requests. The arg argument of itcWorker() contains the socket descriptor for the TCP connection on which the request control information and data is being sent. RttReadN() is first called to receive the control information into an ITCinfo structure, the contents of which are then decoded to host byte order using the locally defined decodeInfo() routine. The slen field from that structure is then used to determine how long the message data will be, so that a buffer (local variable sData) can be allocated for it and it can be read off the TCP connection using RttReadN(), provided it is not a zero length message, in which case sData is set to NULL. At this point the ITC worker thread prepares to do a local send, with the ITC worker acting as a proxy for the remote thread originally doing the send. RttSend() is used to do the local send. However, the corresponding local RttReceive() needs to return the thread id of the remote sender, and not that of the ITC worker (proxy), as would normally be the case. To accomplish this, the remoteItc field of the ITC worker's itcInfo TCB field is set to 1, and the remoteThreadId field is set to the thread id of the remote sender, as obtained from the sender field of the control information structure received on the TCP connection.
Within the local RttReceive() implementation, before the sender thread id is returned to the receiving thread (via its reference parameter), the remoteItc flag is checked, and if it is set, the remoteThreadId thread id is returned instead of that of the local sender. Once the local RttSend() returns to the ITC worker, the ITC worker can send the reply to the remote sender. The retCode field of a ReplyInfo structure is assigned the return value of the RttSend() call. If the return value is RTTOK, the replyLen field is set to the reply length returned by RttSend(); otherwise it is set to zero. The fields of the ReplyInfo structure are then converted to network byte order using htonl(), and it and the reply message are sent to the remote sender using RttWrite(). The TCP connection is then closed and the ITC worker thread terminates.

The cross address space ITC facility uses a name server thread mechanism so that a thread in one RT Threads environment can locate (i.e., find out the thread ids of) threads in other environments. RT Threads does not provide an actual name service, but rather provides a way to locate a single well known thread in any given RT Threads environment, such an environment itself being uniquely identified by its IP address and port number, as set with the RttNetInit() call. The mechanism used to identify the special name server thread is to give it an instance number of 0, while no other threads are allowed to have an instance number of 0. This is done by the RttCreateNS() call. A remote thread can use RttGetNS() to construct (without doing any communication) a thread id containing the IP address and port number of the destination threads environment. When RttSend() is called with this thread id, the remote receiver will notice the 0 instance number and send the message to the well known name server thread there.
This is actually done within the local RttSend() call at the remote receiver.

4.8  Dynamic Memory Allocation

The RT Threads routines for dynamic allocation of heap memory are implemented simply as wrappers around the standard C library routines malloc(), calloc(), realloc() and free(). The corresponding API routines add protection from context switches (using ENTERINGOS and LEAVINGOS), because the underlying library routines are not necessarily re-entrant. Private (to the RT Threads kernel) routines SysMalloc_() and SysFree_() are also provided for memory allocation within the kernel, and must be called from within a critical section (i.e., kernel code protected with ENTERINGOS and LEAVINGOS). These routines simply call the underlying malloc() and free(), but are provided to facilitate alternate implementations.

4.9  External I/O

The asynchronous I/O features of UNIX are used to implement asynchronous I/O in RT Threads, i.e., all file (and socket) descriptors are set to be non-blocking and to have a SIGIO signal generated whenever I/O is complete. When an I/O routine is called by a thread (e.g., RttRead()), the underlying UNIX system call (e.g., read()) is called. If that system call returns with an indication that the call would block, the thread is placed on the I/O blocked (IOBK) queue. SIGIO signals are caught by the I/O handler routine, which uses the select() system call (for Solaris, the poll() system call is used) to determine which file descriptors have completed I/O. Each blocking API routine has a corresponding local routine which is called to complete the I/O operation when it is known that the descriptor is ready. For example, the read() system call is called in the RttRead() routine to read from the file or socket indicated by the file descriptor passed to it.
If the read() call returns with an error and errno indicates that the call would have blocked, the calling thread is placed on the I/O blocked queue and changes its state to IOBK. When the I/O is ready, the SIGIO handler will catch a SIGIO signal and the local doRead() routine will be called to call read() again. If it completes successfully, the thread will be made ready again and a context switch will occur if the thread should run. The following structure, which is maintained as the ioInfo field of the TCB, is used to communicate the I/O operation to be performed and its parameters to the lower level I/O routine (e.g., doRead()).

typedef struct {
    char  ioOp;        /* I/O operation to perform */
    int   ioRes;       /* result of the underlying I/O operation */
    char *ioBuf;
    int   ioNumBytes;
    int   ioFlags;
    char *ioSrcBuf;
    int  *ioSblPtr;
    int   ioSeekPos;
    int   ioSeekMode;
    int   ioA1;        /* always the file descriptor argument */
    int   ioA2;        /* general purpose argument fields */
    int   ioA3;
    int   ioA4;
    int   ioA5;
} IOInfo;

The ioOp field indicates the I/O operation which is to be performed. Constants are assigned to indicate various I/O operations; for example, the SNDM constant flags the sendmsg() operation. The ioRes field is used to communicate the result of the underlying I/O operation back to the calling thread. The other fields of the IOInfo structure are used to communicate various parameters to the lower level routine. The ioA1, ioA2, etc., fields are general purpose fields used for any I/O call arguments, except that ioA1 is always used for the file descriptor argument.

static int enableSigio(int fd)

This routine uses ioctl() system calls to mark the given file descriptor for non-blocking I/O and asynchronous I/O notification with the underlying operating system. This means that normally blocking system calls on the file descriptor will not block, but will return an error value and set the errno system global to indicate that the call would block (EWOULDBLOCK or EAGAIN, depending on the version of UNIX).
It also means that whenever I/O is ready for completion at the operating system level, for example if data has arrived on the network for a given socket, a SIGIO signal will be generated, and a subsequent call for the associated file descriptor will not block.

void InitIO_()

InitIO_() is called to initialize I/O for an RT Threads environment. It clears the global signal masks and then sets descriptors 0, 1 and 2 (standard input, standard output and standard error) in fdmask. It also enables SIGIO generation for standard input by calling enableSigio().

int checkBlkIoOperations(int selected)

This routine traverses the I/O blocked queue, and for each thread in the queue, checks whether the appropriate global signal mask (wiomask or riomask) has been set for its file descriptor (which is in the ioA1 field of the TCB). If so, I/O is ready for completion and the appropriate lower level I/O routine can be called. A switch is done on the ioOp field of the thread's TCB to determine the appropriate signal mask to check (e.g., wiomask for output I/O such as write() or sendmsg()), and the appropriate lower level routine to call (e.g., doWrite() for an ioOp of WRITE). The number of operations which were completed is returned.

void IoHandler_()

This routine defines the SIGIO handler, registered to be the signal handler for SIGIO signals. When called, the global inOsFlag is first checked to see if the interrupted thread was in a critical section in the kernel. If so, the handling must be deferred, which is done by setting the ioPending field of _Globals_ and returning immediately. Otherwise, the signal handler enters a critical section by calling ENTERINGOS. Then, select() is called to determine which file descriptors (hence, which threads) have I/O ready for completion. The descriptors to be checked are maintained in a global (to the I/O module) mask, called fdmask, which is of type sigset_t. Two other masks global to the module, riomask and wiomask, are set to
Two other masks global to the module, riomask and wiomask are set to  69  fdmask and passed to s e l e c t ( ) to determine which file descriptors are ready for reading (riomask), and which are ready for writing (wiomask). If any descriptors are selected, c h e c k B l k l o Q p e r a t i o n s O is called to complete the operations. A t this point SIGIO signals are unblocked using SigMask_(), and SchedM is called in case one of the threads whose I / O was just completed would preempt the thread which was interrupted by the handler.  4.10  Mapped Implementation  Neither Win32 nor Windows N T provide an equivalent to the U N I X system call s e t i t i m e r O for generating alarm signals at a regular interval. Win32 does provide similar functionality with its multimedia timer facilities, allowing a callback routine to be called at regular intervals. Thus the tick handler could be implemented as one of these multimedia callbacks. Unfortunately, callbacks will not occur if there is an outstanding callback which has not returned. Thus a context switch within the tick handler would effectively cause the scheduling clock to stop until that thread ran to completion. This could easily lead to incorrect scheduling behaviour (i.e., a thread with higher scheduling precedence might miss its "wake up call"). A suitable solution was not found, which provided the motivation for trying a new approach, the mapped model.  In the mapped model, R T Threads is implemented on top of an existing threads package, with each R T Threads thread mapped directly to a thread native to the underlying package. Thus the RT Threads implementation can take advantage of some of the features of the underlying package, such as context switching 70  and asynchronous I / O among threads. A mapped implementation of R T Threads was done for both A I X 4.1 PThreads and for the Win32 Software Development K i t . 
The Win32 implementation, which provides portability between Windows NT and Windows 95, is described here, i.e., those features which differ from the native host implementation.

All of the underlying Win32 threads are suspended at any given time, except the one underlying the currently running RT Thread. Context switches are performed by suspending the currently running thread's underlying Win32 thread, and resuming that for the new thread to be scheduled. The tick handler runs as a separate Win32 thread which executes in a loop, blocking on a Win32 semaphore on each pass. The multimedia timer callback routine simply signals this semaphore to effect the clock tick. The tick handler can then perform context switches by suspending one thread and resuming another. Note that the RT Threads scheduler is still used and not the Win32 scheduler, since the Win32 scheduler does not give the desired scheduling behaviour. Mapping to underlying Win32 threads is used merely as a technique for performing context switches.

Not all context switches occur in the tick handler. Often a thread yields control itself, for example by blocking on a semaphore. In this case the thread cannot suspend itself first and then resume the next thread. Instead, it first resumes the next thread and then suspends itself. This could, however, cause a context switch before it actually suspends itself. To overcome this problem, the thread first boosts its Win32 priority before resuming the next thread, ensuring that a context switch does not take place until it suspends itself. When the original thread is resumed again, it lowers its priority to the normal priority reserved for the underlying Win32 threads.

Mapping RT Threads to Win32 threads made it possible to take advantage of Win32's asynchronous I/O among threads.
I.e., this made it possible for the RT Threads I/O routines to simply call the underlying Win32 I/O routines (e.g., _read()) and block if necessary, rather than using an I/O queue. Additional scheduling is required to resume another thread before the actual I/O call (in case it blocks), and to ensure that the correct thread is scheduled to run when I/O completes, since during the I/O call (and immediately before and after), more than one underlying Win32 thread is in an unsuspended state.

4.11 Native Microkernel Implementation

There were at one time plans to implement RT Threads as a stand-alone kernel on top of bare hardware for the Intel Pentium architecture. The motivation for this was to provide a very efficient environment for implementing the Continuous Media File Server, one in which real time guarantees could be provided. However, the Kea microkernel operating system had already been developed at UBC, allowing the RT Threads scheduler to be dropped into the kernel, so that Kea's threads would be scheduled according to RT Threads' scheduling policy. This put RT Threads close to the hardware, made it possible to make the RT Threads environment the only process (domain, in Kea terminology) in the system, and eliminated the need to develop new operating system support from the ground up.

Chapter 5

Performance

The performance of a threads package can be measured on a number of criteria. The most straightforward is measurement of the overhead of specific calls and threads related functional aspects. The term functional overheads will be used in this thesis to describe such overheads. The most important of these are for synchronization calls (e.g., for mutexes), for thread creation, and for context switching.

Less straightforward is the impact on overall application performance of using a threads package.
While functional overheads play a part in overall performance, their effect may be insignificant compared to performance improvements due to concurrency achieved by using threads for blocking I/O calls, so that computation may continue while I/O requests are being serviced by external hardware. The performance of a particular threaded application may be highly dependent on the nature of the application and the way it is implemented (i.e., its decomposition into concurrent tasks). It is difficult to choose representative benchmark applications which will show performance differences due to differences in scheduling policies or implementations between different threads packages. Instead, the approach taken here is to try to measure how well RT Threads performs in terms of meeting its own real time scheduling criteria.

In the tests described in this section, performance measurement times were averaged over several iterations, and unrelated overheads were subtracted where possible. All times listed in the tables are in microseconds. Table 5.1 shows a list of hostnames of the machines used in the tests, and their operating systems and architectures.

  Hostname       Operating System   Architecture
  Stephen        SunOS 5.5.1        SPARC 2
  rickards       SunOS 5.5.1        Pentium 166
  cascade        SunOS 5.5.1        UltraSPARC
  ice            SunOS 4.1.3C       SPARC 10
  blackdog       AIX 4.1            PowerPC 601
  appaloosa      AIX 3.2.5          PowerPC 601
  dsg            AIX 3.2.5          PowerPC 601
  monteith       Windows NT 3.51    Pentium 100
  grouse         Windows NT 4.0     Pentium 200
  sumas          Windows NT 4.0     Pentium 200
  asp            Windows 95         Pentium 120
  robson-win95   Windows 95         Pentium 200

Table 5.1: Host operating systems and architectures

5.1 Functional Overheads

The following sections describe experiments used to measure the functional overheads of thread creation, semaphore calls and context switching for RT Threads. Measurements were made for other threads packages as well, for comparison.
5.1.1 Thread Creation

The importance of thread creation performance depends on the nature of the application using threads. If only a few long lived threads are created, then creation time is not critical. However, if threads have a low overhead for creation ("cheap" threads), implementation approaches which make liberal use of dynamic thread creation may become practical. An example would be protocol stacks which allocate a thread per packet for processing.

The results for thread creation testing are summarized in table 5.2. Thread creation times for RT Threads were highly dependent on the size of the stack allocated (i.e., they reflect the time taken to allocate the stack using malloc() within the call). Solaris Threads and Pthreads allow a stack to be allocated by the user, and stacks were allocated using malloc() in the tests, the time for which is included in the results. Values in parentheses show creation times with average malloc() times factored out. For Win32 Threads, a stack size is specified at creation, but the stack is allocated by the system. For the Win32 mapped RT Threads implementation, the stack of the underlying Win32 thread is utilized, and no explicit malloc() is done.

Creation times were measured as an average of 1000 sequential creation calls. A fairly high degree of variability was observed between successive repeats of the same test, thus the results shown are very approximate. They do show, however, that creation times for RT Threads are comparable to those for the other threads packages tested. They are generally better than those for the UNIX thread packages, several times better than those for AIX 4.1 Pthreads.
They are only slightly worse than those for Win32 Threads, a good result considering that an underlying Win32 Thread creation is included in the cost of an RT Thread creation in the mapped implementation.

  Hostname       Threads Package   1024 bytes   4096 bytes   16384 bytes
  Stephen        RT Threads        280 (65)     922 (110)    1830 (95)
                 Solaris Threads   300 (117)    856 (144)    1744 (265)
                 Pthreads          305 (118)    867 (169)    1698 (240)
  rickards       RT Threads        82 (32)      230 (40)     425 (40)
                 Solaris Threads   93 (46)      249 (59)     468 (85)
                 Pthreads          100 (52)     245 (50)     448 (82)
  blackdog       RT Threads        103 (64)     275 (145)    297 (144)
                 Pthreads          626 (589)    816 (680)    887 (720)
  monteith       RT Threads        67           93           138
                 Win32 Threads     62           86           129
  grouse         RT Threads        45.5         65           100
                 Win32 Threads     45.2         63           93
  robson-win95   RT Threads        27           99           106
                 Win32 Threads     25.5         95           98

Table 5.2: Thread creation times in microseconds

5.1.2 Semaphore Overhead

The overhead of semaphores is measured here for two reasons. Firstly, semaphores are the most common way of controlling access to shared data structures in RT Threads (semaphores can easily provide the same functionality as "mutex" constructs often found in other threads packages). Threaded applications will often require a high number of such calls, although few might actually result in context switches if there is low contention for such access, as is likely to be the case in a uniprocessor environment. Thus the overhead of the call itself can be important. Secondly, context switching performance was measured using semaphores to cause the switches, and the overhead of the calls needs to be determined so it can be factored out.

Results for semaphore call overheads are included in table 5.3 along with results for context switching. Semaphore overhead is lower for RT Threads in all cases. This is to be expected, since RT Threads is implemented at user level and there is no system call overhead for these calls.
5.1.3 Context Switching

The context switching performance of RT Threads was tested on various machines, comparing the results with those for native thread packages on each machine. For Solaris and AIX, the native host implementation of RT Threads was used, and for Windows NT and Windows 95 the mapped implementation was used. The test consisted of two threads alternately blocking on and signaling semaphores. Semaphores are implemented using condition variables and mutexes for Pthreads, since semaphores are not part of the Pthreads interface. Table 5.3 shows the results, where times are averaged over 100,000 iterations. Context switch times include the overhead for semaphore calls. An estimate of the overhead for semaphore calls (one P and one V per switch) is also shown, determined by timing calls which did not result in context switches.

For context switching, the native host implementation of RT Threads outperformed the native threads packages in all cases. The mapped implementation showed a considerable performance penalty as compared to the underlying Win32 threads alone. While the mapped implementation can obviously not perform better than the threads package it is built on top of, work is ongoing to reduce this overhead.

  Hostname   Threads Package   Context Switch   Semaphore Overhead
  Stephen    RT Threads        61               7
             Solaris Threads   99               10
             Pthreads          117              8
  cascade    RT Threads        10               2
             Solaris Threads   24               4
             Pthreads          28               3
  rickards   RT Threads        16               1
             Solaris Threads   21               5
             Pthreads          25               2
  blackdog   RT Threads        38               4
             Pthreads          115              27
  monteith   RT Threads        81               4
             Win32 Threads     23               17
  asp        RT Threads        182              5
             Win32 Threads     82               7

Table 5.3: Context switch times in microseconds

5.2 Inter Thread Communication

Tests were made for RT Threads to determine the average times for send/receive/reply transactions, measured as the blocking time for the RttSend() call. Results were averaged over several iterations, the number of which varied with the message size.
Message sizes of 0, 1024, 65536 and 1048576 bytes were tested, both for local ITC transactions (i.e., between threads in the same RT Threads environment), and for cross address space ITC between machines on a local Ethernet network. Longer times for longer messages would reflect the overhead in copying the message in the local case, and the network transmission overhead in the cross address space tests. The other threads packages tested did not have a communication facility comparable to RT Threads' message passing, so comparisons could not be made.

Table 5.4 shows times, in microseconds, for ITC between two threads in a single RT Threads environment, for various message sizes. Times for a zero length message would reflect the overhead for two context switches, plus the overhead of the RttSend(), RttReceive() and RttReply() calls combined.

  Host           0 bytes   1024 bytes   64 KBytes   1 MByte
  Stephen        129       707          36458       557847
  rickards       28        121          7342        126042
  ice            155       190          3360        118748
  blackdog       85        103          4075        68528
  appaloosa      71        101          4172        58207
  cascade        20        48           1104        17803
  monteith       147       183          2310        49832
  grouse         81        116          2513        51269
  robson-win95   164       210          2773        57403

Table 5.4: Local send/receive/reply times

Table 5.5 shows times, in microseconds, for cross address space ITC between various machines on a 10BaseT Ethernet network.

5.3 Real Time Performance

Experiments were designed to determine how well RT Threads does in terms of real time scheduling in various environments.
To truly be a "real time" threads environment (i.e., such that threads will start as close to their scheduled starting times as is possible given the priorities, deadlines and execution times of other threads in the environment), RT Threads threads would have to be the only threads in the overall system on a computer, or preempt other non-RT Threads threads or processes instantaneously whenever ready to run. Nevertheless, it is of interest to see whether RT Threads would be useful on a variety of platforms where such guarantees are not possible. In other words, can user applications comfortably make use of the real time aspects of RT Threads' scheduling in the context of a general purpose operating system such as UNIX or Windows NT.

  Sending Host   Receiving Host   0 bytes   1024 bytes   64 KBytes   1 MByte
  ice            appaloosa        26549     24186        190584      2554876
  appaloosa      ice              21355     20682        217334      2646952
  dsg            appaloosa        20092     21526        228083      2211720
  appaloosa      dsg              20058     66237        200006      2194095
  cascade        appaloosa        26404     32858        403443      3117465
  appaloosa      cascade          25841     31825        400061      3136351
  Stephen        appaloosa        31016     33668        398772      2620522
  appaloosa      Stephen          20225     23389        411676      2841969
  rickards       appaloosa        21535     20107        409943      2823639
  appaloosa      rickards         21322     21810        415503      2958978
  grouse         appaloosa        13048     10750        203123      2323293
  appaloosa      grouse           20073     20073        168049      2274666
  monteith       appaloosa        13607     11505        201410      2412353
  appaloosa      monteith         20014     20035        200548      2497584
  robson-win95   appaloosa        12209     10394        201115      2382625
  appaloosa      robson-win95     20103     20002        201727      2413129
  robson-win95   grouse           6433      12892        309948      2270538
  grouse         robson-win95     10176     10174        301699      2410048
  grouse         sumas            6966      7669         148257      2267732
  sumas          grouse           6905      6831         138490      2153997

Table 5.5: Cross address space send/receive/reply times

The basic test program consists of a thread which repeatedly schedules itself with a starting time which is a random short time in the future.
The rand() library function is used to derive a delay time of between 20 and 40 milliseconds. Each time the thread wakes up, the difference between the scheduled starting time and the actual starting time is measured, and the average and standard deviation are calculated at the end of the test. In the base test, no other user threads were running while the test thread was sleeping. Tests were run for 10,000 iterations. In the tables, "Idle System" means there were no user processes running, other than those described for the tests, but there may have been active system processes. "Loaded System" refers to a system with several active user processes. In tests labeled "RT Threads in Real Time Process", RT Threads was run as a UNIX real time process. In Solaris, real time processes cannot be preempted by non-real time processes, including system processes.

Unfortunately, the resolution of the system clock used for scheduling threads was 10 milliseconds (10,000 microseconds) in all cases, so even under ideal conditions (i.e., no other competition for CPU resources), a thread could start as much as just under 10 milliseconds, plus any post interrupt scheduling overhead, after its scheduled starting time. With a uniform random distribution of starting times, one would expect the actual starting times to be roughly 5 milliseconds plus the scheduling overhead later than the scheduled starting times, on average, with a standard deviation of around 3 milliseconds.

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        5856                 3932                 101182
  rickards       5566                 3044                 46489
  ice            5389                 3340                 77746
  blackdog       5738                 2921                 13093
  appaloosa      5721                 2922                 12835
  monteith       5022                 2887                 10961
  grouse         5017                 2890                 10909
  robson-win95   5969                 3799                 14954
  Loaded System
  cascade        5540                 5206                 415719
  RT Threads in Real Time Process
  Stephen        5689                 2926                 10494
  rickards       5548                 2940                 10349

Table 5.6: Base Test: No background threads, no background processes
Table 5.6 shows results for the base test. As expected, results show an average difference somewhat higher than 5,000 microseconds (5 ms), and standard deviations of around 3,000 microseconds (3 ms). Maximum differences tend to be somewhat over 10,000 microseconds (10 ms), as expected, except on the various versions of SunOS where considerably higher maximums are seen. For the loaded system, a relatively high standard deviation (5.2 ms) and maximum (415 ms) were observed, yet the average was still reasonable. The results for RT Threads running in a real time process, the case which should have the best results, were only marginally better.

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        5940                 5641                 299452
  rickards       5623                 3414                 90510
  ice            6350                 26121                1966924
  blackdog       5763                 3088                 85118
  appaloosa      5733                 2977                 40758
  monteith       5027                 2889                 14322
  grouse         5007                 2887                 10946
  robson-win95   5955                 3774                 19356
  Loaded System
  cascade        165153               137988               814844
  RT Threads in Real Time Process
  Stephen        5694                 2926                 10711
  rickards       5545                 2940                 10240

Table 5.7: 1 background thread, no background processes

Table 5.7 shows results for the base test, with a low priority spinning (i.e., in a tight loop using up CPU resources) thread running in the same RT Threads environment (since they are of higher priority, the test threads which take the timing measurements should preempt the spinning thread). For the most part, results were only slightly worse than for the base test, as one would expect. The exception was the loaded system, where the result was far worse. One possible explanation is that with the spinning thread, the RT Threads environment process is taking up considerable CPU resources and the underlying operating system's scheduler is less likely to schedule it when its alarm signal is generated (in other words, it will have used up its quantum).
Table 5.8 shows results where there is a background spinning process, which presumably would compete with the RT Threads environment for CPU resources. Results are comparable to the base test, which suggests that an RT Threads application which is not compute intensive should still observe reasonably good real time scheduling behaviour, even when other processes are active in the system.

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        5852                 3767                 71459
  rickards       5600                 3491                 160631
  ice            5528                 7451                 379967
  blackdog       5749                 2919                 12824
  appaloosa      5728                 2945                 38246
  monteith       5021                 2889                 18729
  grouse         5019                 2896                 28729
  robson-win95   5376                 3557                 13998
  Loaded System
  cascade        5469                 3141                 38402
  RT Threads in Real Time Process
  Stephen        5699                 2925                 10610
  rickards       5553                 2938                 10340

Table 5.8: No background threads, 1 background process

Table 5.9 shows results where there is a low priority background thread and a background spinning process. In this case there is greater competition for CPU resources, which is reflected in the results. For the various versions of SunOS, the results are quite poor, especially for Solaris with an average of over 38 ms on both the SPARC 2 and the Pentium 166, and a standard deviation of around 54 ms. However, for AIX and NT, the results are actually quite good, especially for AIX 4.1 and NT 3.51. For the loaded system, the results are similar to those in table 5.7, which is not surprising since the background process may not be adding much to the overall background load. For the RT Threads environment running in a real time process, the results are still virtually the same as for the base test, as would be expected.
  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        38978                54476                252087
  rickards       38174                53869                216343
  ice            22187                15538                106325
  blackdog       5881                 3259                 49079
  appaloosa      7701                 4880                 39121
  monteith       5249                 3930                 49692
  grouse         5967                 10452                129904
  robson-win95   6081                 4486                 34924
  Loaded System
  cascade        170841               149734               820978
  RT Threads in Real Time Process
  Stephen        5689                 2926                 10717
  rickards       5538                 2941                 10233

Table 5.9: 1 background thread, 1 background process

Table 5.10 shows results where there is no low priority background thread, but there are two background spinning processes. Results are similar to those for the same test with only one background process.

Table 5.11 shows results where there is a low priority background thread and two background spinning processes. Results are somewhat worse than for the same test with only one background process, but still quite good for the various Windows operating systems.

Tables 5.12 and 5.13 show results with a compilation (GNU gcc for UNIX, Visual C++ 4.0 for Windows) being performed in the background, which may be a more realistic situation than having a process spinning in a tight loop. Results are similar to those with a background spinner, although generally somewhat worse.
The results for the loaded system are better, but that may be due to changes in the overall load over time.

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        5861                 4028                 103529
  rickards       5571                 3050                 45683
  ice            5377                 3355                 74688
  blackdog       5742                 2917                 11015
  appaloosa      5726                 2922                 15373
  monteith       5023                 2886                 10961
  grouse         5021                 2971                 66134
  robson-win95   3632                 1971                 7999
  Loaded System
  cascade        5484                 3258                 102751
  RT Threads in Real Time Process
  Stephen        5703                 2924                 10561
  rickards       5554                 2939                 10308

Table 5.10: No background threads, 2 background processes

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        73425                99654                380421
  rickards       71768                98779                378353
  ice            30867                45873                363165
  blackdog       10153                9143                 67849
  appaloosa      13856                10482                60720
  monteith       6928                 5608                 89987
  grouse         7142                 22149                250817
  robson-win95   5044                 3550                 24279
  Loaded System
  cascade        173509               181477               714466
  RT Threads in Real Time Process
  Stephen        5692                 2925                 10672
  rickards       5542                 2941                 10286

Table 5.11: 1 background thread, 2 background processes

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        6203                 4412                 85392
  rickards       5734                 3511                 101128
  ice            5767                 3504                 46513
  blackdog       6070                 5900                 466406
  appaloosa      5786                 2995                 24153
  monteith       5022                 2887                 10961
  grouse         5026                 2895                 24697
  robson-win95   9672                 6323                 302643
  Loaded System
  cascade        5522                 3167                 28866

Table 5.12: No background threads, 1 background compile

Tables 5.14, 5.15 and 5.16 show the results of tests where two RT Threads environments were run concurrently on the same machine. In the first test (table 5.14), neither has a spinning background thread. In the second (table 5.15), one of the two RT Threads processes has a spinning background thread, and in the third (table 5.16), both do. Two results are shown in each case, one for each environment. In table 5.14, the results in all cases are reasonably close to those for the base case, except that the maximum difference tended to be much higher. In the second test, with one of the environments having a background thread spinning, results were again similar but slightly worse. In the third case, where both environments had a background spinning thread, the results are considerably worse. For the UNIX operating systems, the results are similar to those for table 5.9 (one background thread, one background process), which is not a surprising result as each of the RT Threads environments would act like a background spinning process for the other.

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        39494                18157                390007
  rickards       39887                16900                526672
  ice            31452                125322               1533903
  blackdog       13165                36437                631339
  appaloosa      12799                42353                628420
  monteith       5073                 3131                 47836
  grouse         5133                 3758                 169975
  robson-win95   7473                 7068                 378022
  Loaded System
  cascade        35452                65177                724476

Table 5.13: 1 background thread, 1 background compile

For the first time, however, some very poor results are observed for the Windows operating systems. In each case only one of the RT Threads environments exhibits the poor results, the one which was started first. For Windows NT 4.0 and Windows 95 the average was almost double that for the base test, for the first environment started. For Windows NT 3.51, the average was 39 ms, the standard deviation 360 ms, and the maximum nearly 4 seconds. It is difficult to interpret these results; it may be a problem with interaction of the multimedia timers in the two processes, but that is merely speculation.

Tables 5.17, 5.18 and 5.19 show results for the base test, but with several threads doing the scheduling test simultaneously (100, 200 and 300 respectively). The results show totals for all threads combined. As the number of threads increases, the results show considerably larger differences in the scheduled and actual starting times for threads.
This result probably reflects the fact that many threads will be scheduled to run at approximately the same time, and by the time they all get a chance to run briefly, the accumulated overhead will add up to cause a high average delay in actual starting times.

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        5949/5930            4139/4189            81745/81525
  rickards       5581/5587            3029/3028            38304/38040
  ice            5483/5430            3069/3040            55919/55742
  blackdog       5760/5737            3057/2919            95803/13274
  appaloosa      5713/5828            2927/2909            24606/12268
  monteith       5024/5018            3088/2896            113711/27781
  grouse         5023/5015            3105/2887            116683/10946
  robson-win95   5556/5649            3633/3621            14990/14982
  Loaded System
  cascade        5493/5491            3140/3185            35649/32227

Table 5.14: 2 RT Threads environments

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        6023/5981            6199/4436            386794/99719
  rickards       5590/5576            3030/3013            39661/45026
  ice            5402/5414            3429/3012            138975/44699
  blackdog       5767/5751            3376/2917            167307/10735
  appaloosa      5767/5733            3298/2920            149607/12443
  monteith       5023/5017            3034/2893            97619/18353
  grouse         5044/5014            3296/2891            159975/12650
  robson-win95   5782/9887            3729/5400            24741/24127
  Loaded System
  cascade        6802/5474            8488/3273            254772/103873

Table 5.15: 2 RT Threads environments, 1 with background thread

  Hostname       Average Difference   Standard Deviation   Maximum Difference
  Idle System
  Stephen        33287/27166          51808/42670          352237/211320
  rickards       32459/26167          50755/42056          239984/241409
  ice            22008/22252          16807/16107          565581/99998
  blackdog       4707/5354            6802/8588            172364/43439
  appaloosa      8196/5574            9705/9343            595853/42113
  monteith       39060/5181           360349/3471          3893468/40304
  grouse         9922/5021            14915/2886           169975/10907
  robson-win95   10082/5240           8954/3430            261841/22954
  Loaded System
  cascade        35079/32557          55748/52008          569994/384839

Table 5.16: 2 RT Threads environments, both with background thread
To test this idea, the test with 300 threads was run again, but measuring the time at which each thread is made ready in the RT Threads kernel rather than its actual starting time. Results are shown in table 5.20, and are more reasonable, with average differences ranging from 5 ms to 10 ms and standard deviations of around 3 ms.

Hostname        Average     Standard    Maximum
                Difference  Deviation   Difference
Idle System
Stephen         9054        9777        617879
rickards        7251        2309        67894
ice             7525        3368        191491
blackdog        7645        3199        169761
appaloosa       8451        3118        196294
monteith        5046        3530        298528
grouse          5051        3247        178528
robson-win95    9507        3737        98975
Loaded System
cascade         6002        3128        94445

Table 5.17: One RT Threads environment with 100 threads (times in microseconds)

Hostname        Average     Standard    Maximum
                Difference  Deviation   Difference
Idle System
Stephen         30756       22385       1602504
rickards        10245       2000        207351
ice             5878        14419       376447
blackdog        14307       21088       822681
appaloosa       27473       5371        318741
monteith        4182        5130        374181
grouse          5424        14117       1756181
robson-win95    12716       9299        1195975
Loaded System
cascade         3880        7248        145242

Table 5.18: One RT Threads environment with 200 threads (times in microseconds)

Hostname        Average     Standard    Maximum
                Difference  Deviation   Difference
Idle System
Stephen         59979       39486       1930563
rickards        4026        12885       415807
ice             39954       16533       1085964
blackdog        46488       53473       4108626
appaloosa       57031       62089       4254461
monteith        18921       4239        229975
grouse          69442       8868        6277325
robson-win95    33167       12136       1224975
Loaded System
cascade         12978       18406       512005

Table 5.19: One RT Threads environment with 300 threads (times in microseconds)

Hostname        Average     Standard    Maximum
                Difference  Deviation   Difference
Idle System
Stephen         5308        6086        354863
rickards        5167        3256        93237
ice             5071        3432        191146
blackdog        5029        3607        151694
appaloosa       5058        3891        148228
monteith        5018        2889        19189
grouse          5092        3246        129425
robson-win95    2912        2122        404975
Loaded System
cascade         7344        13224       329269

Table 5.20: One RT Threads environment with 300 threads, measured in kernel (times in microseconds)

Table 5.21 shows results for a scheduling test done with a real multimedia client application used with the CMFS file server. The application presents JPEG motion video using a Parallax hardware decoder card, synchronized with a simultaneous audio stream, and in some cases a closed captioning text stream. Threads, one for each stream, are scheduled periodically by the client to display their respective media data. Measurements are made in the RT Threads kernel to see how close to their scheduled starting time the threads are made ready, on average. Two versions of the same video clip showing the music group Yes, one at 15 frames per second and one at 30, both with synchronized audio, were tested, as was a clip from the movie "Reservoir Dogs" with synchronized 20 frames per second video, audio and text. Measurements were taken over a two minute period.

Title                     Media Type  Average     Standard   Maximum
                                      Difference  Deviation  Difference
Yes Video (15 fps)        Video       4925        2936       29381
                          Audio       4963        3151       42423
Yes Video (30 fps)        Video       6247        2603       24592
                          Audio       7348        7971       65987
Reservoir Dogs (20 fps)   Video       4394        3373       47382
                          Audio       4779        4617       29077
                          Text        4890        5124       32107

Table 5.21: Multimedia client scheduling performance (times in microseconds)

Results were generally as expected. They were somewhat poorer for the 30 frame per second clip, although that may in part be explained by the fact that calls to the X server for displaying video frames were protected by the RttBeginCritical() and RttEndCritical() calls, which block signals to the RT Threads kernel and may have caused scheduling clock ticks to be delayed.

Chapter 6

Related Work

A number of other threads packages and operating systems with kernel thread support are available, such as AIX 4.1 Pthreads [1], Solaris Threads [23, 9], Mach C Threads [5], etc. Some of these provide real time functionality.
For example, AIX 4.1 Pthreads can specify a fixed priority FIFO policy, but this requires root authority, due to the danger of starving system processes because of global thread scheduling. Solaris threads offer a real time class of threads, but these real time threads also require root authority and a high degree of caution, since they preempt system processes [11] and could cause the entire system to deadlock.

Other work has been done to provide real time support for existing packages. The Real Time Thread Package (RTTP) developed at Brown University is a C++ framework which extends the Brown Thread Package. The RTTP implementation described in [24] is a wrapper for Solaris Threads, providing a mechanism for non-super users to create real time threads and protection against runaway real time threads taking over system resources.

Real-Time Threads [26] is a real time threads package based on Mach C Threads. One of its main goals, like RT Threads, is to provide portability among operating system platforms. Unlike RT Threads, this package does dynamic schedulability analysis, and offers support for multiprocessor platforms. The admission algorithm does, however, require static analysis of maximum execution times for threads, and threads must be assigned to specific nodes in multiprocessor environments.

Alternate techniques for providing user-level thread scheduling have been implemented. In [14], shared memory records in user space are used to communicate scheduling information efficiently between kernel space and user space, helping user space thread schedulers make scheduling decisions based on kernel-level process scheduling and system call blocking behaviour. Most notably, software interrupts are delivered to user space when system calls block and unblock, and optionally to warn of imminent preemption to assist in avoiding context switches while holding locks.
This framework was implemented in the Psyche parallel operating system and was used to implement more than one threads package, with different packages able to coexist and communicate with each other. A similar shared data mechanism called a Meta-Level Scheduler (MLS) is presented in [22]. User-Level Schedulers (ULSs) communicate their state to the kernel via a shared memory region in the MLS (in user space), so the kernel scheduler can schedule the highest priority thread. This mechanism was implemented for the ARTS real time operating system [27] (see section 6.3). Another example of a shared memory mechanism for coordinating user level thread scheduling with the kernel can be found in [6].

6.1 POSIX

In [21], experience with the POSIX 1003.4 and 1003.4A standard proposals is discussed. POSIX 4 (1003.4) is a proposal for adding base real time support to the POSIX standard, while POSIX 4A provides a thread extension (the "Pthreads" interface). LynxOS, a commercial UNIX compatible real time operating system, was modified to provide an interface and functionality conforming to 1003.4 and 1003.4A, with some of the new functionality added at the kernel level but as much as possible implemented as user libraries. POSIX 4 requires that at least two scheduling policies be implemented: Round-Robin and FIFO. Round-Robin is a round robin time-slicing policy, whereas FIFO is a fixed priority policy where threads of a given priority run to completion unless preempted by higher priority threads (threads have a global scheduling scope, so single threaded processes can be thought of as threads in the overall system in terms of scheduling policy). The FIFO policy is similar to the RT Threads scheduling policy, except that it excludes deadline scheduling. With respect to POSIX 4A, an interesting result found by the authors was unexpectedly poor performance of the mutex and condition variable synchronization mechanisms.
A strictly user level implementation of Pthreads is described in [15]. Similarities with the RT Threads implementation include the use of a monolithic monitor for protecting kernel data structures, i.e., a single global flag to indicate execution in a kernel critical section, and a flag used to defer signal handling. Performance results were similar to those experienced with RT Threads; synchronization and context switching were as good as or better than kernel implementations, and dynamic allocation of stacks was found to be the main source of overhead for thread creation. Most of the complexity of the implementation was in providing for the delivery of signals to individual threads, as required by the POSIX standard. RT Threads does not provide this functionality because using a synchronous programming approach to handling asynchronous events with blocking threads was considered preferable to the asynchronous programming approach of using signals and signal handlers. In fact, this is one of the main motivations for providing threads in the first place, a sentiment also expressed in [21].

MiThOS (Microkernel Threads Operating System) [3], developed at Florida State University, provides deadline and periodic scheduling extensions to a POSIX Pthreads library, and is implemented on a bare machine.

6.2 Real Time Support for Multimedia Applications

Other work has been done in providing real time support for multimedia applications, particularly in terms of resource allocation in a shared system. Much of the work takes a Quality of Service (QoS) approach in which threads that cannot meet their real time requirements dynamically lower those requirements. Real time thread support focuses on periodic scheduling and handling of missed deadlines.
Another issue which has prominence in the area of real time systems is the problem of priority inversion, where a low priority thread holding a shared resource can cause a medium priority thread to hold up a high priority thread.

In [28] and [16], periodic real-time threads are used to handle periodic events in a multimedia client application, such as the timely display of continuous media data units (e.g., video frames). In both cases the test application was a QuickTime video player. Supporting multiple clients on a single machine with graceful degradation required a mechanism for QoS management, and global scope for scheduling of threads. Real-Time Mach [29, 25] was used as the underlying operating system, providing such real time threads. One of the methods demonstrated was a self stabilization scheme whereby a missed deadline handler for each thread modified the thread's timing attributes, i.e., period and deadline, such that a thread would lower its frame rate if it was missing deadlines. Experiments in [16] showed good results when running multiple video clients with varying priorities, i.e., only the quality of the low priority client was degraded. The same test run with a time-sliced scheduler showed erratic degradation among the clients, such that the high priority stream was as likely to reduce its frame rate as the low priority one.

In [17], a complex protocol for dealing with priority inversion, called the Semaphore Dependency Protocol, is presented. This protocol is less conservative than the priority ceiling protocol, and results in fewer context switches than the priority inheritance protocol. The authors go on to conjecture, however, that the structure of multimedia applications is usually not complex enough to warrant such an elaborate protocol, and that simple priority inheritance should suffice.
The philosophy of RT Threads is that even priority inheritance is not required, i.e., that it should be left up to the application to deal with problems like priority inversion.

6.3 Inter Thread Communication

RT-IPC, a real time extension to Mach IPC for Real-Time Mach, is discussed in [10]. Like RT Threads, Mach provides a facility for message passing between threads, where multiple messages arriving at a receiver are queued awaiting reception by the receiving thread (a port mechanism is used in Mach, but the effect is the same). As is the case with RT Threads, messages are dequeued in FIFO order, regardless of the priority of the sender. RT-IPC adds a number of options with respect to dequeuing policy and priority of the receiver. For dequeuing order, a choice of FIFO or priority based policy is available, much like the first come first served and priority modes available for RT Threads semaphores. Priority inheritance and priority handoff options are also available to deal with priority inversion problems which may occur between sender and receiver, much as similar protocols are available for mutex locks in many real time operating systems. In RT Threads, only FIFO semantics are provided for message queuing. In keeping with RT Threads' cooperative programming philosophy, it is considered the application's responsibility to ensure that reasonable priorities are used for the message receiver, and that an appropriate design is used for handling received messages so as not to unduly hold up senders waiting in the message queue (for example, the client/server model depicted in figure 2.2).

The Chant threads package [7] extends the POSIX Threads interface with a mechanism called talking threads for distributed programming. Talking threads allow threads on different machines to communicate directly using message passing primitives.
Thread identifiers are made universally unique by embedding in them a machine identifier, process identifier and thread identifier, a technique similar to that used by RT Threads. Notification of message reception is done by polling the underlying communication system in the Chant implementation, whereas RT Threads uses underlying asynchronous signals (as part of the external I/O implementation) to unblock threads waiting for messages.

HARTOS [8] is a distributed real time operating system developed at the University of Michigan. HARTOS is part of HARTS, the Hexagonal Architecture for Real-Time Systems, where several multiprocessor nodes are connected on a hexagonal mesh interconnection network. Dedicated network processors and per-hop deadlines for data transmission are among the techniques used to implement a real time message passing facility.

ARTS is an object oriented distributed real time operating system. Its communication facilities include a synchronous request/response style message passing mechanism similar to that of RT Threads. Real time support for this mechanism includes transmitting the sender's time fence (deadline) with the request message, which is compared with the receiver's time fence, taking communication delays into account, to determine whether the request can be processed in time. Also, a communication manager is employed at the receiver with prioritized message queues and prioritized, preemptable message handling worker threads, using a priority inheritance protocol to avoid priority inversion within the communication subsystem.
Chapter 7

Additional Work

Work which remains to be done with RT Threads includes:

• testing the native microkernel implementation

• testing FreeBSD and Linux versions of RT Threads, and comparing the results with those for Windows NT and Windows 95 on the same hardware

• implementing a mapped version of RT Threads on Solaris Threads, testing it, and comparing the results with those for the native host implementation for Solaris.

Chapter 8

Conclusions

RT Threads has been used in a UNIX environment for approximately two years. Although refinements have been made over that time, it has proven to be stable. Both the server components and the clients for the CMFS distributed continuous media file server project have been implemented using RT Threads. These applications have a complex design of communicating threads, including network communication (and synchronization) between client and server components using the cross address space inter-thread communication facilities of RT Threads. Real time scheduling functionality has been used to implement timely disk reading and network transmission of continuous media data, fine grained network rate control for the purpose of traffic shaping, and display synchronization for multiple media streams. RT Threads has provided a convenient and powerful model for programming cooperating real time tasks.

Performance results show comparable or better functional overheads for RT Threads than for the other threads packages tested in a UNIX environment (Solaris Threads and POSIX Threads). This suggests that RT Threads is a practical alternative for use as a general purpose threads package in such an environment.
Results for functional overheads in the Win32 environment were somewhat worse for RT Threads than for Win32 threads in the case of thread creation and context switches, which is to be expected considering that for RT Threads these functions are built on top of the Win32 facilities. Still, the results were close enough that RT Threads should be a practical alternative for most applications.

Real time scheduling tests showed that RT Threads, in most cases, exhibits the expected real time scheduling behaviour in lightly loaded systems. Scheduling was almost always accurate to within the limiting underlying system timer clock resolution of 10 milliseconds. Results worsened when the underlying system had a heavy load, or when multiple compute intensive RT Threads applications were run on the same machine. These experiments were intended to test the limits of the RT Threads package. Generally, one would not expect to be able to run a compute intensive real time application and share computing resources successfully with other processes in a general purpose system.

Thus RT Threads can be successfully used as the basis for applications which require non-critical real time scheduling of events, even in a non-real time operating system environment. This result was shown in practical terms by testing the performance of a synchronized multimedia client.

Bibliography

[1] AIX Version 4.1 General Programming Concepts: Writing and Debugging Programs, October 1994.

[2] T. P. Baker, Frank Mueller, and Viresh Rustagi. Experience with a Prototype of the POSIX "Minimal Realtime System Profile". In the 11th IEEE Workshop on Real-Time Operating Systems and Software, May 1994.

[3] T. P. Baker, Frank Mueller, and Viresh Rustagi. MiThOS - A Real-Time MicroKernel Threads Operating System. In IEEE Real-Time Systems Symposium 1995, 1995.

[4] David R. Cheriton. The V Kernel: A Software Base for Distributed Systems. IEEE Software, 1(2):19-42, April 1984.

[5] Eric C.
Cooper and Richard P. Draves. C Threads. Technical Report CMU-CS-88-129, Carnegie Mellon University, April 1988.

[6] Ramesh Govindan and David P. Anderson. Scheduling and IPC Mechanisms For Continuous Media. In Proceedings of the 13th Symposium on Operating Systems Principles, pages 68-80, October 1991.

[7] Matthew Haines, David Cronk, and Piyush Mehrotra. On the Design of Chant: A Talking Threads Package. In Supercomputing 94, April 1994.

[8] Dilip D. Kandlur, Daniel L. Kiskis, and Kang G. Shin. HARTOS: A Distributed Real-Time Operating System. ACM Operating Systems Review, 23(3):72-89, July 1989.

[9] Sandeep Khanna, Michael Sebree, and John Zolnowsky. Realtime Scheduling in SunOS 5.0. In Winter USENIX, pages 375-390, January 1992.

[10] Takuro Kitayama, Tatsuo Nakajima, and Hideyuki Tokuda. RT-IPC: An IPC Extension for Real-Time Mach. In Proceedings of the 2nd USENIX Conference on Microkernels and Other Kernel Architectures, 1993.

[11] Bil Lewis and Daniel J. Berg. Threads Primer. SunSoft Press, 1996.

[12] Siu Ling Ann Lo. An Operating System Architecture for Real-Time Tasking Abstractions. PhD thesis, University of British Columbia, April 1995.

[13] Siu Ling Ann Lo, Norman C. Hutchinson, and Samuel T. Chanson. Architectural Considerations in the Design of Real-Time Kernels. In IEEE Real-Time Systems Symposium, pages 138-147, December 1993.

[14] Brian D. Marsh, Michael L. Scott, Thomas J. LeBlanc, and Evangelos P. Markatos. First-Class User-Level Threads. In Proceedings of the 13th Symposium on Operating Systems Principles, pages 110-120, October 1991.

[15] Frank Mueller. A Library Implementation of POSIX Threads under UNIX. In Proceedings of the Winter USENIX Conference, pages 29-41, January 1993.

[16] Tatsuo Nakajima and Hiroshi Tezuka. A Continuous Media Application supporting Dynamic QOS Control on Real-Time Mach. In ACM Multimedia '94, October 1994.

[17] Roger Needham and Akira Nakamura.
An Approach to Real-time Scheduling but is it Really a Problem for Multimedia? In Proceedings of the 3rd International Workshop on Network and Operating System Support for Digital Audio and Video, pages 32-39, 1993.

[18] Gerald W. Neufeld, Dwight J. Makaroff, and Norman C. Hutchinson. Design of a Variable Bit-Rate Continuous Media Server. In Proceedings of the 5th International Workshop on Network and Operating System Support for Digital Audio and Video, pages 375-378, Durham, NH, April 1995.

[19] Gerald W. Neufeld, Dwight J. Makaroff, and Norman C. Hutchinson. Internal Design of the UBC Distributed Continuous Media File Server. Technical Report 95-06, University of British Columbia, April 1995.

[20] Gerald W. Neufeld, Dwight J. Makaroff, and Norman C. Hutchinson. Design of a Variable Bit-Rate Continuous Media Server for an ATM Network. In Proceedings of the IS&T/SPIE Multimedia Computing and Networking Conference, San Jose, CA, January 1996.

[21] Bill O. Gallmeister and Chris Lanier. Early Experience With POSIX 1003.4 and POSIX 1003.4A. In Proceedings of the IEEE Symposium on Real-Time Systems Conference, pages 190-198, July 1991.

[22] Shuichi Oikawa and Hideyuki Tokuda. User-Level Real-Time Threads: An Approach Towards High Performance Multimedia Threads. In Proceedings of the 4th International Workshop on Network and Operating System Support for Digital Audio and Video, pages 66-76, November 1993.

[23] M. L. Powell, S. R. Kleiman, S. Barton, D. Shah, D. Stein, and M. Weeks. SunOS Multi-thread Architecture. In Proceedings of the Winter USENIX Conference, pages 65-76, 1991.

[24] Kostadis Roussos. Real-Time Thread Package. Technical Report CS-96-18, Department of Computer Science, Brown University, May 1996.

[25] Stefan Savage and Hideyuki Tokuda. Real-Time Mach Timers: Exporting Time to the User. In Proceedings of the 3rd USENIX Mach Conference, pages 111-118, April 1993.

[26] Karsten Schwan, Hongyi Zhou, and Ahmed Gheith.
Real-Time Threads. ACM Operating Systems Review, 25(4):35-46, October 1991.

[27] Hideyuki Tokuda and Clifford W. Mercer. ARTS: A Distributed Real-Time Kernel. ACM Operating Systems Review, 23(3):29-53, July 1989.

[28] Hideyuki Tokuda and Takuro Kitayama. Dynamic QOS Control based on Real-Time Threads. In Proceedings of the 4th International Workshop on Network and Operating System Support for Digital Audio and Video, pages 114-121, November 1993.

[29] Hideyuki Tokuda, Tatsuo Nakajima, and Prithvi Rao. Real-Time Mach: Towards a Predictable Real-Time System. In Proceedings of the 1st USENIX Mach Symposium, October 1990.

[30] Alistair C. Veitch and Norman C. Hutchinson. Kea - A Dynamically Extensible and Configurable Operating System Kernel. In Proceedings of the Third Conference on Configurable Distributed Systems (ICCDS '96), May 1996.

Appendix A

Example Program

The following simple example program is included to demonstrate the basic structure of an RT Threads program. Note that the header file rtthreads.h must be included. In the program, two threads are created, each being passed an integer to identify it in the output. Both threads are created with the same priority, RTTHIGH. Each thread lowers its priority, in turn, causing the other to be scheduled. This is done within a critical section, protected with a semaphore. The example program demonstrates the scheduling effects of both scheduling attribute modification and the use of synchronization primitives. Sample output is shown in comments at the end.
#include <stdio.h>
#include "rtthreads.h"

static RttThreadId threada, threadb;
static RttSem sem;

RTTTHREAD thr(void *arg)
{
    RttSchAttr myattrs;
    RttThreadId me;
    int i, threadNum;

    threadNum = (int)arg;
    me = RttMyThreadId();
    for (i = 0; i < 3; i++) {
        printf("%d: Wait\n", threadNum);
        RttP(sem);

        RttGetThreadSchedAttr(me, &myattrs);
        printf("%d: My priority is %d\n", threadNum, myattrs.priority);

        /* lower my priority */
        myattrs.priority++;
        RttSetThreadSchedAttr(me, myattrs);

        printf("%d: Signal\n", threadNum);
        RttV(sem);
    }
    printf("%d: Done\n", threadNum);
}

mainp()
{
    RttSchAttr attrs;

    RttAllocSem(&sem, 1, RTTFCFS);

    attrs.startingtime = RTTZEROTIME;
    attrs.priority = RTTHIGH;
    attrs.deadline = RTTNODEADLINE;

    RttCreate(&threada, thr, 16000, "a", (void *)1, attrs, RTTUSR);
    RttCreate(&threadb, thr, 16000, "b", (void *)2, attrs, RTTUSR);

    printf("threads created\n");
}

/*  Sample Output:

threads created
1: Wait
1: My priority is 10
2: Wait
1: Signal
2: My priority is 10
2: Signal
2: Wait
2: My priority is 11
1: Wait
2: Signal
1: My priority is 11
1: Signal
1: Wait
1: My priority is 12
2: Wait
1: Signal
2: My priority is 12
2: Signal
2: Done
1: Done
*/
