UBC Theses and Dissertations

A request/response protocol to support ISO remote operations Goldberg, Murray Warren 1989

A REQUEST/RESPONSE PROTOCOL TO SUPPORT ISO REMOTE OPERATIONS

By Murray Warren Goldberg
B.Sc., University of Victoria, 1985

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES (DEPARTMENT OF COMPUTER SCIENCE)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October 1989
© Murray W. Goldberg, 1989

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

Abstract

The remote operations protocol [X229] is used by one application process to exchange operation information with a remote process. An operation request, with associated parameters, is sent using the Invoke() primitive, and the outcome is returned using the Result(), Error(), or Reject() primitives. ISO defines a stack of protocol layers to support the remote operations protocol. There is redundancy in this stack and services provided by the stack do not closely match the needs of remote operations. Therefore a more suitable supporting protocol is necessary.

A supporting protocol must be efficient in terms of association setup and data transfer, must support both interested and disinterested servers, and must provide only the services required by remote operations. Association setup must be efficient as remote operation associations can be numerous and short lived.
Interested servers are common, warranting support, though disinterested servers must not suffer the overhead accompanying this support. Several efficient request/response protocols exist capable of supporting remote operations, though each has disadvantages.

This thesis defines a request/response protocol, RRP, which satisfies the criteria above. RRP requires no separate PDU transfer to establish or release an association. Data transfer is efficient in terms of network utilization and avoids unnecessary retransmissions and acknowledgements. RRP assumes an unsequenced, unreliable, and error prone datagram service. RRP provides a reliable, sequenced, synchronous request/response service. RRP provides optional peer process monitoring for interested servers.

Contents

Abstract
List of Tables
List of Figures
Acknowledgement
1 Introduction
1.1 Overview
1.2 Definitions
2 Remote Operations and ISO Support
2.1 Distributed Systems
2.2 Communication in Distributed Systems
2.2.1 Remote Procedure Call
2.2.2 Remote Operations
2.3 ISO Remote Operations
2.4 ISO Support for Remote Operations
2.5 Evaluation of ISO Support for Remote Operations
2.5.1 Connections
2.5.2 Data Transfer
2.5.3 Other Remote Operations Support
2.5.4 Comments on OSI
2.6 Model for Remote Operations Support
3 Existing Protocols for Remote Operations
3.1 Versatile Message Transaction Protocol
3.2 Remote Procedure Call
3.3 Delta-T Protocol
4 Request/Response Protocol
4.1 RRP and the OSI Reference Model
4.2 RRP Properties
4.2.1 Connections
4.2.2 Flow Control and Selective Retransmission
4.2.3 Acknowledgement Strategy
4.2.4 Notification of Peer Unreachability
4.2.5 Segmentation and Reassembly
4.2.6 At-Most-Once Delivery
4.3 RRP Protocol Data Units
4.4 Elements of Procedure
4.4.1 RRP PDU Transfer
4.4.2 RRP Service Interface
4.4.3 RRP Timers
4.4.4 Connection Establishment
4.4.5 Connection Release
4.4.6 Monitoring Reachability of Server-RRP
4.4.7 Monitoring Reachability of Client-RRP
4.4.8 Data Transfer
4.4.9 General Retransmission Scheme
4.4.10 Flow Control and Selective Retransmission
4.4.11 Retention of Results for Retransmission
4.5 RRP Protocol States
4.6 RRP State Tables
4.7 Arguments for Correctness
5 Implementation
5.1 Threads
5.1.1 Scheduling and Context Switching
5.1.2 Memory Management
5.1.3 I/O
5.1.4 Sleep
5.1.5 Interprocess Communication
5.2 Framework for Protocol Implementation
5.3 The Request Response Protocol (RRP)
5.3.1 State Machine Implementation
5.3.2 Connection Control Blocks
5.3.3 Network Access
5.3.4 Sending and Receiving Data Segments
5.3.5 Worker Processes
5.3.6 Initialization
5.4 The ISO Stack
5.5 ROSE
5.6 User Interface
6 Performance Evaluation
7 Conclusion
7.1 Summary
7.2 Contributions
7.3 Further Work

List of Tables

4.1 RRP State Table, Client States
4.2 RRP State Table, Server States
6.1 Connection Establishment and Release Performance
6.2 Data Transfer Performance

List of Figures

2.1 Remote Procedure Call
2.2 OSI Protocol Stack
2.3 Sliding Window Flow Control
2.4 Retransmit-From-N Retransmission Scheme
4.1 RRP Position in the OSI Stack
4.2 Single Segment Request/Response Interaction
4.3 Multiple Segment Request Transfer
4.4 Multiple Segment Request Transfer in the Presence of Error
4.5 RREQ PDU
4.6 RRESP PDU
4.7 RWFR PDU
4.8 RUNBIND PDU
4.9 RAYA PDU
4.10 RIAA PDU
4.11 RSIP PDU
4.12 RSEGREQ PDU
5.1 RRP Connection Control Block Contents
5.2 ISO Protocol Implementation Over TCP
5.3 Process Structure of RRP and ISO Protocol Implementation

Acknowledgement

I would like to thank Dr. Gerald Neufeld for his patience, support and guidance as my thesis supervisor.
I would also like to thank him for his understanding and flexibility as the leader of the project of which I am a member. I am also grateful to Sam Chanson for his reading of my final draft. I would also like to thank my close friends Ian Cavers and Barry Brachman for their careful scrutiny of my initial draft. Donald Acton and Rick Morrison also provided valuable assistance when it came to formatting and organizing this thesis. I wish to thank my parents Harry and Fran, and my brother Lindsay. Their example, guidance and financial support were instrumental in allowing me to pursue my education. Finally, I am most grateful to my wonderful wife Ann. Her company, support, drive and inspiration are a gift I will always feel honored to receive. She also makes great diagrams.

Chapter 1 Introduction

1.1 Overview

Computer to computer communication enhances functionality through the sharing of data and resources. Data sharing involves the transfer of messages and files, and is accomplished by bulk data transfer protocols. Resource sharing involves the transfer of small or large amounts of data, and is accomplished by remote operations protocols (RO protocols).

Generally, access to resources is available through a resource manager. The resource manager provides an interface through which the resource is accessed. An example of such a resource is a file system. Access is provided through open, close, read and write operations. Each of these operations requires parameters, and each returns some outcome.

The RO protocol supports access to a remote resource manager through the following interface primitives: Invoke() sends the remote operation request, with the associated parameters, to the remote resource manager process. The remote resource manager process (or its worker processes) performs the requested operation and returns an outcome using Result(), Error() or Reject().

The RO protocol assumes a relatively error free supporting protocol service.
This supporting protocol must have features such as low association establishment and maintenance overhead, and efficient data transfer. ISO defines a stack of protocols to support the RO protocol, but this stack does not satisfy these requirements.

The goal of this thesis is to design an optimal protocol to support the RO protocol. Several steps are required to accomplish this goal. First, an evaluation of the requirements of Remote Operations is necessary. This includes an examination of existing ISO support, as well as a categorization of the types and needs of server applications which use Remote Operations. From this examination, a model of an ideal supporting protocol is derived. A new protocol conforming to the ideal model is designed, and finally an implementation is performed to verify the model and test performance.

The following chapter describes the ISO RO protocol and its supporting protocol stack. Problems with the supporting stack are discussed and a list of desirable features for a supporting protocol is given. Chapter three compares several existing supporting protocols against this desired model. The fourth chapter describes our supporting protocol RRP, and includes correctness arguments. Chapter five describes the implementation of RRP. The last two chapters provide a performance evaluation and conclude the thesis.

1.2 Definitions

The following definitions are used throughout the thesis.

call - a request/response transaction.

client application - a process that makes a request to a server.

client-RRP - the RRP entity supporting the client application.

connection, association - a relation between communicating processes (e.g. a client and a server) necessary for communication. This relationship typically exists across multiple requests and responses.

connection oriented protocol - a communication protocol whose communicating parties must exchange connection information prior to data transfer.

connectionless protocol - a communication protocol whose communicating parties may transfer data without a preliminary exchange of connection information.

disinterested server - a server application not interested in the reachability of a connected client. Generally a server application that does not allocate resources on a per-association basis.

(N)-Entity - an instance of the (N)-protocol-layer (e.g. a Session Entity is an instance of the Session Layer).

header - control information at the beginning of a PDU indicating, among other things, the PDU type.

idempotent request - a request that does not alter the observable state of the server (e.g. a time of day request, but not an append-to-this-file request).

interested server - any server application interested in the reachability of a connected client. Generally a server application that allocates resources on a per-association basis.

operation - a task that can be performed by a server (e.g. disk write, memory read).

OSI protocols - a set of communication protocols defined by the International Organization for Standardization (ISO). The intention of these standards is that conformance will allow interoperability.

outstanding request - a request for which no response has yet been received.

packet - an individual unit of information transferred across a network.

PDU - (Protocol Data Unit) one unit of formatted data transferred between peer protocol entities.

request - information transferred to a server application with the intent of having it perform the indicated operation.

response - information transferred from a server to a client consisting of the result of a previously requested operation.

ROSE - (Remote Operations Service Element) provides the remote operations service.
server application - a process that performs operations requested by clients.

server-RRP - the RRP entity supporting the server application.

service user - the immediate user of a protocol layer (e.g. the Session layer is the service user of the Transport layer).

unreachable - a protocol entity is unreachable if communication with it is no longer possible.

Chapter 2 Remote Operations and ISO Support

2.1 Distributed Systems

There is a recent trend in computing facilities away from centralization and toward distribution. A centralized system generally consists of a single large, fast processor accessed by tens or hundreds of users at once. A distributed system generally consists of 2 to 2000 (or more) smaller processors (workstations) connected by a local area network. The local area network provides communication between the workstations at a speed on the order of 10 megabits per second.

There are four major reasons why distribution is being favored over centralization. First, distributed computing facilities are extensible. When the computing needs of a distributed facility grow, it can be adapted through the addition of more workstations or peripherals. Adding more workstations increases the computing power of the facility, allowing the addition of more users or increased performance for existing users. If the computing needs of a centralized system grow, small incremental performance improvements may be made (such as adding more memory), but only to a limited extent. Any significant growth requires the replacement of the centralized processor with a faster one. This causes greater disruption and is much more expensive than the addition of workstations to a distributed system.

The second advantage is a reduction in computation time. Each workstation in a distributed system is often dedicated to one user at a time.
This provides more predictable response time than a centralized system accessed by a varying number of users. Also, if a particular workstation is overloaded, it is possible to move some of the computations to other lightly-loaded workstations.

The third advantage is increased reliability. If a major component of a centralized system fails (e.g. the processor), then the system is completely inaccessible. If one or more workstations in a distributed system fail, most users will be unaffected. The computing power of the facility will be reduced by the absence of the failed workstations, but no user will be significantly affected unless one of the failed workstations provides some service not available on the remaining workstations. For example, it is possible that a failed workstation controls a disk containing the files of several users. In this case, those users will not have access to their files, but the remaining users will be unaffected. This is a more desirable situation than that arising from failure in a centralized system.

The final advantage is resource sharing. In a distributed system it is not necessary for each workstation to have all its own peripherals such as a disk drive and printer. Instead, these resources may be located at only a few workstations, and can be shared by the entire system. A workstation with a shared resource acts as a server, receiving and executing requests for access to the resource. Shared resources are not restricted to printers and disk drives. Modems, memory, and display devices may all be shared resources. Even the CPU of one workstation may be considered a shared resource as it may perform computations requested by processes on other workstations.

2.2 Communication in Distributed Systems

The existence of distributed systems requires that remote[1] processes be able to communicate.
Two models of remote interprocess communication are available which are extensions of existing local[2] constructs. The first is Remote Procedure Call (RPC), and the second is Remote Operations (RO) [X229]. Both models allow a process on one machine to request that a remote process perform some operation.

[1] Remote means across machine boundaries.

2.2.1 Remote Procedure Call

RPC is modeled after local subroutine calls. The difference between a local subroutine call and RPC is that the invoked subroutine is executed locally (and within the same process) in the former case, and remotely (by a different process) in the latter case. RPC is intended to be a transparent mechanism where the programmer of the calling and invoked routines does not need to know if the subroutine call is local or remote. In practice, RPC is not completely transparent as pointer parameters are difficult to handle (and often not allowed).

RPC is implemented using stubs and RPC servers. A stub packages and transfers the subroutine execution request to the remote RPC server which executes it. For example, imagine a process P which calls subroutine S(a,b), where a and b are parameters. In the local case, P calls S(a,b) using a local subroutine call. For RPC, S(a,b) must have a subroutine Sstub(a,b) and a process RPCserver associated with it. Here, process P calls subroutine Sstub(a,b) in the same way it would have called S(a,b) locally. Sstub(a,b) creates a message containing the subroutine execution request and the parameters, and transfers it to the machine where the remote subroutine is to be performed. When the message arrives, the RPCserver process is created which makes the local subroutine call to S(a,b) using the parameters received in the message. When S(a,b) returns, the RPCserver sends the result back and terminates. Sstub(a,b) receives the result and returns it to P.
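The stub mechanism described above can be sketched in a few lines of Python. This is a minimal single-process illustration under stated assumptions, not the mechanism of any particular RPC system: the names s, s_stub and rpc_server mirror S, Sstub and RPCserver from the text, while the JSON message format, the PROCEDURES table, and the transfer function standing in for the network are hypothetical choices made for the example.

```python
import json


def s(a, b):
    """The subroutine S(a,b) that does the actual work (here, addition)."""
    return a + b


# Hypothetical table of procedures the RPC server is willing to execute.
PROCEDURES = {"s": s}


def rpc_server(message: bytes) -> bytes:
    """Stand-in for RPCserver: unpack the request, call locally, pack the result."""
    request = json.loads(message.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def transfer(message: bytes) -> bytes:
    """Stand-in for the network; a real system would ship the bytes to another machine."""
    return rpc_server(message)


def s_stub(a, b):
    """Stand-in for Sstub(a,b): same signature as s, but the call travels as a message."""
    message = json.dumps({"proc": "s", "args": [a, b]}).encode("utf-8")
    reply = transfer(message)
    return json.loads(reply.decode("utf-8"))["result"]


if __name__ == "__main__":
    # Process P obtains the same result from the local call and from the stub.
    print(s(2, 3), s_stub(2, 3))
```

Because the arguments travel by value inside a message, an address taken from the caller's address space would have no meaning to the server, which is one way to see why pointer parameters are awkward for RPC.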
Most implementations use a communication server which transfers the messages and creates the RPCserver process on the remote machine. Figure 2.1 shows a typical RPC interaction.

[2] Local means within machine boundaries.

[Figure 2.1: Remote Procedure Call. Labels: Local, Process P, RPC, Machine A, Machine B, Network.]

2.2.2 Remote Operations

Remote Operations is modeled after local Send/Receive/Reply (S/R/R) message passing. A process sending to another process is blocked until the receiving process receives and replies to the message. A receiving process is blocked until some process sends to it. Remote Operations extends this interprocess communication across machine boundaries. It also provides alternative synchronization classes allowing a sending process to continue execution without awaiting a reply.

Remote Operation interactions follow the client/server model. In this model one process (the client) requests that an operation be performed by a remote process (the server). Unlike RPC, the server exists indefinitely to perform subsequent requests from this and other clients. The life of the server is not restricted to the duration of one remote operation.

A typical server application is written as an infinite loop. At the top of the loop the server waits to receive a request. Once a request is received, the requested operation is performed. When the server is ready to return the result (if any) to the client, a result message is sent. If the client was blocked, the receipt of the result allows it to continue execution. Results do not have to be returned in the order that the corresponding operations were received by the server. It is possible for the server to delay a response until some other event occurs. The server's ability to field multiple requests and respond to them in any order allows it to synchronize the activities of its clients. This is not true for RPC.
For example, consider a distributed case of the bounded-buffer, producer-consumer problem. Here, producer processes create items to be consumed by consumer processes. Produced items are placed into buffers and linked into a list for a consumer to remove. The linked list is a shared resource, so its manipulation is a critical section. There are a finite number of buffers available, so producers may have to wait when they have items and all buffers are full. Also, consumers may have to wait if they are ready to consume an item and there are none available. Suppose the producers and consumers are on separate machines, and the buffers are on a third.

If RPC is used to add and remove items from the list, a conflict may arise when more than one producer or consumer tries to manipulate the list. Also, some form of synchronization is necessary to block the producers when there are no buffers to fill, or the consumers when there are no items to consume. Both of these are general process coordination problems and can be solved using semaphores or some other scheme.

If a server process is created to control access to the shared list, and Remote Operations are used to communicate with the server, no other process coordination scheme is necessary. The server is the only process manipulating the linked list, and therefore no conflict may arise from concurrent access. Also, the server can synchronize the actions of the producers and consumers. If there are no empty buffers and a producer requests that the server enqueue an item, the server can delay its response to the producer until an empty buffer is available. Likewise, if a consumer requests an item when there are none available, the server may delay its response until some producer has created one and delivered it to the server. The sending process cannot continue until a response is received, and therefore synchronization is achieved.
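The delayed-response idea can be illustrated with a small Python sketch. It is only a model under stated assumptions: the BufferServer class, its method names, and the replies list (which records each response in the order the server would send it) are hypothetical, and requests are handled one at a time in a single thread rather than arriving over a network.

```python
from collections import deque


class BufferServer:
    """Hypothetical server that owns the bounded buffer and answers one request at a time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()              # produced items awaiting a consumer
        self.waiting_producers = deque()  # (client, item) requests not yet answered
        self.waiting_consumers = deque()  # clients whose dequeue is not yet answered
        self.replies = []                 # (client, outcome) pairs in the order sent

    def _reply(self, client, outcome):
        # Stand-in for returning a result to a blocked client.
        self.replies.append((client, outcome))

    def enqueue(self, client, item):
        if self.waiting_consumers:
            # A consumer is already blocked: answer it with this item.
            self._reply(self.waiting_consumers.popleft(), item)
            self._reply(client, "ok")
        elif len(self.items) < self.capacity:
            self.items.append(item)
            self._reply(client, "ok")
        else:
            # No empty buffer: delay the response, leaving the producer blocked.
            self.waiting_producers.append((client, item))

    def dequeue(self, client):
        if self.items:
            self._reply(client, self.items.popleft())
            if self.waiting_producers:
                # A buffer was freed: accept a delayed item and unblock its producer.
                producer, item = self.waiting_producers.popleft()
                self.items.append(item)
                self._reply(producer, "ok")
        else:
            # Nothing to consume: delay the response, leaving the consumer blocked.
            self.waiting_consumers.append(client)


if __name__ == "__main__":
    server = BufferServer(capacity=1)
    server.dequeue("C1")        # no item yet, so C1 stays blocked
    server.enqueue("P1", "x")   # unblocks C1 and P1
    print(server.replies)
```

Only the server touches the buffer, so no semaphore appears anywhere in the sketch; blocking falls out of the server choosing when to respond.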
RPC provides remote interprocess communication, but Remote Operations provides a more general scheme for remote interprocess communication and synchronization.

2.3 ISO Remote Operations

ISO/CCITT X.229 [X229] defines a Remote Operations Protocol. The service provided by this protocol is referred to as the Remote Operations Service (ROS). The protocol and service together are referred to as the Remote Operations Service Element (ROSE). ROS allows a process to invoke an operation at another machine. Depending on the class of operation performed, the result or error outcome of the operation may or may not be returned to the invoker. An application process accesses ROSE services through the following interface.

• BIND is used by the client or server to establish an association.
• INVOKE is used by the client to transfer a request and its associated parameters to a server.
• RESULT is used by the server (in response to an INVOKE) to transfer the operation outcome to the client. It is used only if the operation was performed successfully.
• ERROR is used by the server (in response to an INVOKE) to transfer the operation outcome to the client. It is used only if the operation did not succeed (e.g. divide by zero error).
• REJECT is used by the server or client to reject the INVOKE, RESULT or ERROR received (e.g. if there is a mistyped argument).
• UNBIND is used by the client or server to release the association.

There are five Operation Classes supported by X.229. Each provides a different level of operation synchronization and reporting. The synchronous class allows only one outstanding operation per connection. The results of the previous operation must be received before another operation may be performed. The asynchronous classes allow multiple outstanding operations per connection. The Operation Classes follow.

1. Synchronous - reporting success or failure (RESULT or ERROR).
2. Asynchronous - reporting success or failure (RESULT or ERROR).
3. Asynchronous - reporting failure (ERROR) only.
4. Asynchronous - reporting success (RESULT) only.
5. Asynchronous - outcome not reported.

There are three Association Classes defined by X.229. These classes determine which connected process is capable of requesting operations (acting as the client).

1. Only the association initiator (process that does BIND) may act as the client.
2. Only the association responder (process that receives BIND) may act as the client.
3. Either associated process may act as the client.

2.4 ISO Support for Remote Operations

X.229 defines PDUs for the transfer of requests and responses. This protocol, however, does not recover from transmission errors, missequenced PDUs, or duplicates. Therefore, the protocols supporting X.229 must be capable of these and other functions. ISO/CCITT defines a set of connection oriented protocols to support X.229[3]. These protocols are organized as a stack, with the protocol at layer N requiring the services provided by the protocol at layer N-1. Including X.229, there are seven protocol layers in this stack. The layers used by Remote Operations consist of the following, with each layer requiring the services of the layer below[4].

• Application Layer (includes X.229 Remote Operations and X.227 Association Control [X227])
• Presentation Layer [X226]
• Session Layer [X225]
• Transport Layer [X224]
• Network Layer [X25]
• Data Link Layer [X25]
• Physical Layer [X21]

[3] ISO/CCITT also defines a set of connectionless protocols, though not for the support of X.229.

[4] The descriptions of the layers are taken, in part, from the X.200 reference model definition.

The Application Layer is the highest layer, and its purpose is to provide a means for an application process to access the OSI communication services.
This layer serves as a window through which connected application processes exchange information. An Application Entity (AE) is made up of one User Element (UE), and a set of Application Service Elements (ASEs). The UE is the application process making use of the OSI communication services. The ASEs (X.229 and X.227 being two) provide different interfaces to the OSI services, and generate PDUs to exchange with peer ASEs. One or more ASEs may serve a UE. The ASEs rely on the services of each other, and of X.226.

A Presentation Entity (PE) provides access to all Session Layer services, as well as data syntax conversion and management. Data syntax is the format in which data is represented. The syntax selected for data transfer by the Presentation Entity is the Transfer Syntax. This layer converts data between the syntaxes of the two AEs and the Transfer Syntax. Transfer syntax selection and subsequent modification are functions of this layer. Syntax independence is also provided for AEs, meaning that an AE need not know how its peer AEs represent data internally.

The Session Layer allows correspondent PEs to organize and synchronize their communication. A session connection is requested by a PE, and exists until released by a PE, or by the Session Entity (SE). The PEs may exchange normal or expedited (urgent) data over the connection. This layer provides a turn management facility allowing only the PE possessing the turn to send data. A quarantine service is also provided where an SE gathers received data units, and either delivers them to the PE or discards them as requested by the sending PE. Finally, a mechanism is provided for correspondent PEs to synchronize their communication by resetting their session connection to a defined state.

The Transport Layer provides reliable, cost effective and transparent transfer of data between two SEs.
This is the lowest layer with end-to-end significance, meaning that each transport connection has only two Transport Entities (TEs), even if the connection spans several nodes or networks. The functionality of this layer varies depending on the quality of service available from the supporting Network Layer. Five functionality classes (numbered 0 through 4) are defined. This layer provides end-to-end sequencing, error detection and recovery, segmentation and concatenation, and flow control. Duplex communication over a connection is also provided. The mapping of transport connections onto network connections may be one-to-one, many-to-one, or one-to-many.

The Network Layer provides transfer of data between TEs. The interface provided to the TE is independent of underlying communication media characteristics (other than quality of service). One or more network connections, in series or in parallel, are required to support a transport connection. This layer provides routing management and may also provide flow control, sequencing, and error notification (depending on the service provided by the Data Link Layer).

The Data Link Layer provides connections between Network Entities (NEs). Sequencing, flow control, and error correction and notification are also provided by this layer. The Data Link Layer delimits Physical Layer data (i.e. bits) so that network data may be sent as discrete groups of bits, rather than as a stream of individual bits.

The Physical Layer is responsible for sequenced bit transmission between Data Link Entities. Physical Layer Entities are connected by a physical medium. This layer uses the mechanical, electrical and physical properties of the transmission medium to represent and transmit bits.

Figure 2.2 shows the layers of the OSI protocol stack.
2.5 Evaluation of ISO Support for Remote Operations

This section examines the strengths and weaknesses of the OSI stack, and the support it provides for Remote Operations. Through this examination of current Remote Operations support, a model is derived. This model is used as a basis for judging other supporting protocols, and as an ideal to be used in the design of a new supporting protocol.

[Figure 2.2: OSI Protocol Stack. Layers shown: Application (ROSE, ACSE), Presentation, Session, Transport, Network.]

2.5.1 Connections

For ROSE entities to communicate, the supporting OSI stack requires that a connection be previously established. To accomplish this, ROSE requests ACSE to establish a connection with a peer ACSE entity. In turn, ACSE requests a Presentation connection, which in turn requests a Session connection, and so on down to the Data Link layer. Connection establishment at each layer requires a single PDU exchange. PDUs at the ACSE, Presentation and Session layers can be concatenated and transferred as a single packet exchange. This is not true for layers below Session, and therefore each of these layers must exchange individual PDUs. The result is that before peer ROSE entities can communicate, as many as eight packets can be exchanged between their supporting entities.

Connection release is similar. An exchange of packets is required at the ACSE, Session and Network layers, with ACSE and Session release PDUs concatenated.

Once a connection exists, state information is maintained by the communicating protocols for the duration of the connection. This may be a waste of resources, depending on the level of connection support required by the client and server.

This connection setup and release arrangement may be suitable for bulk data transfer protocols where large amounts of data are transferred over each connection.
In this case, the cost of connection establishment and release may be amortized over the life of the connection. This is not suitable, however, for Remote Operations support. Consider as a typical example of an application using the Remote Operations protocol, an X.500 directory service. Due to the distributed nature of the X.500 database, a search operation might need to establish connections to several Directory Service Agents. Each connection would typically be short lived and would transfer only a small amount of data. It is inefficient to transfer connection establishment and release PDUs considering the small amount of X.500 data transferred. This approach is too slow when a human user is waiting for the results of an interactive operation. A connection establishment and release scheme requiring fewer PDU transfers and little connection establishment time would be more suited to ROSE support than that currently provided by the OSI stack.

One possible way to avoid the overhead of connection establishment and release is to establish one permanent connection between communicating machines. Any process wanting to communicate with a process on a connected machine is able to use the pre-established connection. In this way, the connection setup overhead in terms of packet exchange and resource usage is minimized. Any number of communicating processes may use this one machine-to-machine connection. The problem with this scheme is flow control. If one process is receiving packets faster than it is able to process them, the backlog of unprocessed packets is forced down into the machine-to-machine connection. Buffer space can become exhausted and the connection may be unable to deliver any packets until the offending process clears up its backlog. This affects all processes using that connection to communicate.

One argument in favor of a PDU exchange for connection establishment is authentication.
Authentication is the process by which a server verifies the identity of the client before performing operations for it. Some claim that connection establishment is the appropriate time for authentication, and that PDUs might as well be exchanged as they are required for authentication anyhow. There is, however, a counter argument to this claim. Authentication is an application issue and is not universally required. It is inefficient to require all communicating applications to exchange connection establishment PDUs because some of these applications perform authentication at connection establishment time. It is more appropriate for applications to perform authentication within the scope of a connection. This way, applications not requiring authentication do not suffer the overhead of an initial PDU exchange. Also, applications requiring authentication may implement a more flexible authentication scheme. If established at connection time, the authentication level may not be changed during the life of the connection. If a client requests an operation which requires a higher level of authentication, it must release the connection and create a new one. If instead, authentication is done within a connection, a new level may be established dynamically, without the need to create a new connection.

2.5.2 Data Transfer

Flow Control and Acknowledgements

The OSI stack uses sliding window flow control implemented at the Data Link and Network layers. This scheme requires PDUs to be numbered sequentially modulo eight (normal) or 128 (extended). The sender is free to send all PDUs numbered ACKED + 1 through ACKED + WINDOW-SIZE (using modulo arithmetic). ACKED is the highest numbered acknowledged PDU and WINDOW-SIZE is implementation dependent, known to both the sender and receiver. The receiver must send acknowledgements (and therefore credit further PDUs) within a specified time period (typically about 1 second).
Over lightly loaded connections, this scheme may result in an explicit acknowledgement for each data PDU. For heavily loaded connections, the result is similar to a blast protocol. The sender transmits a number (less than WINDOW-SIZE) of PDUs in succession and then awaits a single acknowledgement before sending more.

Sliding window flow control is well suited to low propagation delay, low speed networks. Here, the network is well utilized because the time spent awaiting acknowledgement is small compared to the time spent transmitting. Modern networks, however, tend to provide high speed transfer with high propagation delay. As examples, a cross-country optical fibre link may have a round-trip delay of 200 milliseconds, and satellite link propagation delay is generally about one second^5. For these high speed, high propagation delay networks this scheme is less acceptable as the time spent awaiting acknowledgement is large in relation to the time spent actually transmitting. This can cause low network utilization and unnecessarily slow bulk transfer.

^5 The source for these propagation delays was personal communication with Telecom Canada.

Figure 2.3 shows an example of sliding window flow control where the window size is 4, and the number of packets to transmit is 8. Notice that the high speed, high propagation delay network is poorly utilized while the low speed, low propagation delay network is well utilized. Increasing the window size for the high speed network alleviates the problem somewhat, but does not reduce the time spent awaiting acknowledgement. Remote Operations must be capable of efficient bulk transfer, and therefore a more efficient flow control scheme is desirable.

[Figure 2.3: Sliding Window Flow Control]

Retransmission Strategy

OSI stack retransmission is done in the Data Link, Network, and Transport Layers.
When the receiver receives an out-of-sequence or damaged PDU, it immediately requests retransmission starting at the lost or damaged PDU. In modern high-speed networks, however, it is common for receiver overrun to occur. The network will deliver packets faster than the receiver is able to process them, causing packet loss. This often occurs in intervals causing the loss of every Nth packet. N may be large or small depending on the differential in packet delivery and processing speeds.

On a high propagation delay, high speed network, a large number of packets may be sent following a lost one before a retransmission request is received. For example, consider a sender blasting 100 PDUs in succession over a high propagation delay network, and then awaiting an acknowledgement. If the receiver is slow and drops every tenth PDU, it requests retransmission starting at PDU number 10. The sender now blasts PDUs 10 through 100, but the receiver drops PDU number 19. The receiver requests retransmission of PDUs 19 through 100, and so on until all PDUs have been successfully transferred. This requires approximately 450 PDU transmissions to successfully transfer 100 PDUs, a clearly unacceptable situation.

Figure 2.4 shows an example of this retransmission scheme. Notice that the loss of an early packet requires a large number of unnecessary retransmissions. This problem is especially apparent when the window size is large and the propagation delay is high.

2.5.3 Other Remote Operations Support

Notification of Peer Unreachability

Server applications can be divided into two classes. The first class are those which dedicate resources to clients on the basis of an association held with that client. These servers are called interested servers. The second class does not dedicate resources to clients on the basis of associations held with them. These servers are called disinterested servers.
Disinterested servers may still dedicate resources to their clients, but the dedication and release of these resources is not related to the duration of the association with the client.

One example of an interested server is a file server. An interaction with this server consists of creating an association with the server (and opening a file as a result), making read or write requests on the open file, and then releasing the association (and closing the file as a result). If the server maintains open file descriptors for each open file, then it must be notified if a client with an open file becomes unreachable. This way, the resources dedicated to the client's open file descriptor may be released. This is an interested server as it dedicates resources to clients on the basis of open connections.

An example of a disinterested server is a file server which does not require a file to be opened before requests can be made. An interaction with this server consists of creating an association with the server, making read or write requests, and then releasing the association. The read and write requests are performed without initially opening the file. This server has no open file descriptors, and therefore dedicates no association-related resources. If the client becomes unreachable or closes the connection, the server does not care because the release of the connection does not affect the state of the server. There are no resources to be reclaimed.

The distinction between interested and disinterested servers is not necessarily related to the distinction between servers performing idempotent and non-idempotent requests. For example, a disinterested server may perform idempotent or non-idempotent requests. If the disinterested file server above required that write requests include the location at which to write the data, it would be performing idempotent requests.
If instead, written data was always appended to the end of the file, the server would be performing non-idempotent requests. In either case, this server remains disinterested. The same argument can be made about the interested file server. Idempotency is an inherent property of the service, while interest can be an implementation decision.

The OSI stack does not provide unreachability notification for interested servers. Therefore, the server is required to periodically probe the client, adding complexity to both the server and client applications. There is clearly a need for this service as some networks have modified the Network or Data Link layer in order to provide it. One modification, at the Data Link layer, probes a connected peer using an RR PDU with the poll bit set. The expected response is an RR PDU with the final bit set. Another modification, at the Network layer, probes the connected peer using a ClearRequest PDU on Logical Channel 0. The expected response is a ClearConfirm PDU. These ad hoc techniques can provide notification of peer unreachability but are not included in the OSI protocol specifications.

Segmentation and Reassembly

The data transferred by Remote Operations varies in length from small (e.g. time of day request), to almost unlimited (e.g. mail messages or file transfers). Most underlying transmission media have packet length restrictions. As a result, supporting protocols must perform segmentation and reassembly of large PDUs. In this case, a large PDU is divided into transfer-size pieces, and reassembled on receipt. The OSI protocol stack performs segmentation and reassembly at the Transport layer.

At-Most-Once Delivery

A server using Remote Operations may support non-idempotent requests. A duplicate request PDU may result in repeated execution of the request. Therefore, supporting protocols must ensure at-most-once request delivery.
The OSI stack ensures at-most-once delivery of PDUs using sequence numbers in the lower layers. A PDU will be discarded if it contains a duplicate sequence number.

2.5.4 Comments on OSI Layering

Layering of the OSI stack modularizes the protocols making them easier to understand, implement, and interface. However, it also introduces inefficiencies. For example, most layers add a header to data PDUs before requesting their transmission by the layer below. This causes redundant labeling. Before being sent, each data PDU transferred by Remote Operations will accumulate four separate headers, each indicating the PDU type. In a naive implementation, the layered structure of the protocols can also cause unnecessary message passing or buffer copying between layers. Although implementation is not dictated by the OSI standard, this type of inefficiency is encouraged by the logical layering.

2.6 Model for Remote Operations Support

The preceding examination of OSI support for Remote Operations is the basis for the following model. This model represents an ideal supporting protocol for Remote Operations.

Connections

• Connections should be efficiently created and released. Ideally, connection establishment and release should require no separate PDU exchange.
• Maintenance of existing connections should require minimal state information.

Data Transfer

• Flow control must not restrict data transfer when the sender is prepared to send and the receiver is ready to receive.
• Acknowledgements for received data should be implicit whenever possible so that unnecessary traffic is reduced.
• Retransmission must efficiently accommodate receiver overrun.

Other Remote Operations Support

• An interested server must be informed when a connected client becomes unreachable. A server not needing this service should not be required to bear the overhead associated with it.
• Segmentation and reassembly are required for large PDU transfers.
• Delivery of PDUs must have at-most-once semantics if non-idempotent services are to be accommodated.

Chapter 3

Existing Protocols for Remote Operations

In this chapter, other protocols capable of Remote Operations support are discussed. Each protocol is evaluated by comparison against the model presented in the previous chapter.

3.1 Versatile Message Transaction Protocol

VMTP [VMTP] is a transport protocol designed to support both Remote Procedure Call and multicast communication.

Connections

VMTP does not require any separate PDU transfer for connection establishment or release. VMTP interactions are based on the message transaction (request/response pair). An initial request to a server creates a connection record at the server. This record persists for at least T-rec seconds^1, and is maintained for detection of duplicate request PDUs. Once T-rec seconds have elapsed without receipt of another PDU, the record is discarded. Therefore, VMTP connections are created and released efficiently, but lack of persistent connection records precludes VMTP from providing some necessary connection-related services (i.e. notification of client unreachability, discussed below).

^1 ... described in [VMTP].

Data Transfer

• VMTP uses Packet Group Based flow control. This scheme blasts up to 16 kilobytes of data in one operation. This 16 KB packet group must be received and acknowledged before another packet group is sent.
• A response acknowledges receipt of a request. A subsequent request, explicit acknowledgement, or timeout acknowledges receipt of a response. This scheme avoids unnecessary explicit acknowledgements.
• VMTP supports selective retransmission to recover from receiver overrun.

Other Remote Operations Support

• A client will be notified if it has an outstanding request to a server, and the server becomes unreachable. If a client becomes unreachable, no server will be notified.
A server that wishes to test for client unreachability must probe its client's T-stable^2 identifier at least once every T seconds. The lack of automatic notification complicates server application implementation.
• Segmentation and reassembly are performed by VMTP, though VMTP imposes a maximum length of 16 KB on request and response data. This 16 KB restriction simplifies protocol implementation, but forces more complexity into clients and servers wishing to transfer more than 16 KB at one time.
• At-most-once delivery is provided by VMTP. To detect duplicates, VMTP assigns transaction identifiers to its PDUs, and keeps a record of all transaction identifiers received over a minimum of the last T-rec seconds. All received PDUs are checked against recorded transaction identifiers for duplicates. VMTP assumes a maximum packet lifetime to ensure that a duplicate does not arrive more than T-rec seconds after the original. If a crash occurs, all transaction identifier records are lost. Therefore, duplicate detection will not function immediately following a crash and restart. Duplicate detection, in this case, is left to the VMTP service user.

^2 A T-stable identifier is guaranteed not to be re-used for at least T seconds once it becomes invalid.

Other VMTP Comments

If a server processing a request responds after a long period of time (longer than T-stable), then the possibility exists that the T-stable identifier of the client that made the original request will now be bound to some other client. To avoid this, VMTP requires servers to respond to requests within T-rec seconds of their receipt. This is inappropriate for servers that potentially require a long time to satisfy requests.

3.2 Remote Procedure Call

Remote Procedure Call [RPCp], [RPCt] describes a communication model and protocol designed to simplify distributed computing.

Connections

RPC maintains connections on two levels.
At the machine-to-machine level, a connection is established on first communication between two machines. The purpose of this connection is to identify delayed packets received from a node that has since crashed, rebooted, and re-connected. At the process-to-process level, a connection is established implicitly when a request is transmitted. The connection records are maintained after the completion of the call, long enough to ensure that no duplicates arising from this call can still be received.

No explicit connection release exists. A connection is released implicitly when the connection records are discarded. RPC suggests, but does not require, that these records be discarded once no retransmissions are possible over that connection.

RPC connection establishment and release is efficient because no extra PDUs are exchanged to establish or release a connection.

Data Transfer

• Flow control for multiple segment requests and responses is Stop-and-Wait. Each segment of a segmented request or response (except the last) must be individually acknowledged before the next segment is sent. This method of flow control is unacceptable for any network with long propagation delays.
• A single segment request, or the last segment of a multiple segment request, is acknowledged in one of two ways. In the first, a result arriving without delay serves as an acknowledgement. In the second, if the result is delayed, a retransmission of the unacknowledged segment will force an explicit acknowledgement. Similarly, a one segment result, or the last segment of a multiple segment result, is either acknowledged implicitly by the next request from the same client, or explicitly after retransmission. This scheme minimizes acknowledgements during periods of high activity over a connection.
• RPC expects an implicit or explicit acknowledgement for each segment transmitted.
RPC retransmits any segment that is not acknowledged within a small time interval.

Other Remote Operations Support

• A server using RPC is not notified if a client becomes unreachable between calls.
• Segmentation and reassembly of large request and response PDUs is performed by RPC. The flow control scheme for segments is inefficient (as mentioned above).
• RPC detects duplicate segments of a request or response through call-relative (unique within the scope of the call) sequence numbers. Duplicate segments are discarded. Duplicate requests and responses are detected through connection-relative sequence numbers. To detect a duplicate request or response, the sequence number of the previous request or response is maintained in the connection record for comparison. As mentioned previously, a connection record must exist longer than any delayed duplicate.

Other RPC Comments

RPC is efficient in terms of connection setup and release. Data transfer for small requests and responses is very efficient, often requiring no PDUs other than the request and response. This is made efficient at the expense of large requests and responses. Their transmission requires an explicit acknowledgement for each segment, causing a performance loss.

The RPC model of remote interprocess communication is not general enough to fully support Remote Operations. As mentioned in Chapter 2, Remote Operations allows the server process to synchronize the actions of its clients. RPC cannot provide this service.

3.3 Delta-T Protocol

Delta-T [DeltaTp] is a transport protocol designed to support both request/response and stream communication. Delta-T's main contribution is in the area of connection establishment and maintenance.

Connections

Delta-T connections do not require opening or closing packet exchanges. Logically, the state information needed to maintain a connection always exists for all possible connections.
Connections that are not active, and have not been active for some time are in a "default" state. Connections in the default state do not need to maintain any connection records. When a connection becomes active for the first time or after a period of inactivity, a connection record is created because the connection is no longer in the default state. If a connection is inactive longer than the life of a possible retransmission, the connection resumes its default state, and the explicit connection records may be discarded. Lack of explicit connection state for the duration of a connection precludes Delta-T (like VMTP and RPC) from providing some connection-oriented functions desirable for Remote Operations support (i.e. notification of client unreachability).

Data Transfer

• The flow control used by Delta-T is the sliding window scheme. This scheme is undesirable for Delta-T for the same reasons discussed above in Chapter 2.
• Delta-T will retransmit data if an acknowledgement is not received within a small interval. It does not require a separate acknowledgement for each received PDU, but does require that one be sent a short time after receipt to avoid causing the sender to time-out and retransmit. Therefore, if the connection is busy, one acknowledgement may acknowledge more than one data PDU. Selective retransmission is not supported.

Other Remote Operations Support

• Delta-T has no facility to inform a server in case of client unreachability.
• Delta-T does not support PDU segmentation and reassembly. It assumes that this service, if necessary, is provided by the network layer protocol.
• Delta-T uses a sequence numbering scheme to detect duplicates. For this scheme to work, connection state (other than the default state) must be maintained for a connection during the interval in which a duplicate PDU may still be received.
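The timed connection records that VMTP, RPC, and Delta-T all rely on for duplicate detection can be illustrated with a minimal sketch. It is not taken from any of the three protocol specifications: the class name ConnectionRecords, the hold_time parameter, and the use of a single increasing sequence number per connection are assumptions made for illustration only. The sketch shows why a record must outlive any possible duplicate: once a record is discarded, an old sequence number is accepted again.

```python
class ConnectionRecords:
    """Hypothetical per-connection duplicate filter with timed records."""

    def __init__(self, hold_time):
        # hold_time: seconds a record is kept after the last PDU on the
        # connection. It is assumed to exceed the life of any duplicate.
        self.hold_time = hold_time
        self._records = {}  # connection id -> (last sequence, last activity)

    def expire(self, now):
        """Discard records of connections idle for hold_time or longer."""
        stale = [conn for conn, (_, last_seen) in self._records.items()
                 if now - last_seen >= self.hold_time]
        for conn in stale:
            del self._records[conn]

    def active_connections(self):
        return len(self._records)

    def accept(self, conn, seq, now):
        """Return True if the PDU is new, False if it is a duplicate."""
        self.expire(now)
        record = self._records.get(conn)
        if record is not None and seq <= record[0]:
            # Duplicate: keep the sequence number, refresh the idle timer.
            self._records[conn] = (record[0], now)
            return False
        # First PDU on an idle connection, or the next one in sequence.
        self._records[conn] = (seq, now)
        return True
```

With a hold time of 10 seconds, a repeated sequence number is rejected while the record exists, but is accepted again once the connection has been idle long enough for the record to be discarded. This is safe only under the assumption stated above, that no duplicate can survive in the network longer than the hold time.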
Chapter 4

Request/Response Protocol

This chapter describes the Request/Response Protocol. RRP is designed in accordance with the ideal model described in Chapter 2 to support Remote Operations.

4.1 RRP and the OSI Reference Model

RRP provides a session service. It is designed to make use of the ISO connectionless Transport service (ISO 8602), which uses the connectionless Network (ISO 8473) and Data Link (Logical Link Control type 1) services. ISO Remote Operations uses the services provided by RRP.

RRP may be used with either the connectionless (addendum to ISO 8822) or connection oriented Presentation layers. The connectionless presentation layer provides a more efficient service and may be used when prior agreement regarding transfer syntax and encoding rules exists between communicating entities. The connection oriented Presentation service should be used in systems where there is no such prior agreement. In this case, transfer syntax and encoding rule negotiation may be performed at Presentation connection establishment time. Use of the connection oriented Presentation layer requires that ACSE be used for connection establishment and release. Figure 4.1 shows both alternatives.

[Figure 4.1: RRP Position in the OSI Stack. Two stacks are shown. First: Application (ROSE), Connectionless Presentation, Connectionless Transport. Second: Application (ROSE, ACSE), Presentation Layer, Request/Response Protocol, Connectionless Transport.]

4.2 RRP Properties

4.2.1 Connections

RRP requires no separate connection establishment or release PDUs. A RRP connection is formed when a client application attempts communication with a server. Prior to this, the client indicates its desire to connect to the server, and some local initialization is performed by RRP.
When the first request from the client to this server is transferred, the server-RRP informs the server application that a client is beginning a connection, and then RRP delivers the request.

Each RRP connection is represented by two identifiers, one created by the client-RRP, and the other created by the server-RRP. Identifiers are unique within each entity. Each identifier consists of two parts: a generation number, and a connection identifier. The generation number is incremented on each incarnation of the entity. The connection identifier is incremented by the entity each time it creates a new connection. The first PDU from the client to the server over a new connection contains a reserved value for the server identifier. The server-RRP recognizes this PDU as the first of a new connection, and in response creates a new identifier for this connection. Both identifiers are transmitted in each subsequent PDU exchanged between the client and server.

If either party wishes to release the connection, its RRP service discards the identifier associated with that connection. Any further PDUs received with the invalid identifier are ignored. This scheme assumes the existence of a higher level connection. Generally, the client and server applications agree to close their higher level connection (e.g. closing a file), and can both close their ends of the RRP connection at the same time. This allows a higher level protocol to close the RRP connection without RRP having to transfer connection release PDUs.

RRP provides two connection modes. The first mode supports interested servers where the server application needs to be informed in the case of client unreachability. The second mode is for disinterested servers which do not require this function.

4.2.2 Flow Control and Selective Retransmission

RRP flow control is driven by the receiver.
Initially, the sender assumes existing permission to send the first DEFAULT segments of a request or response. Following these initial segments, the sender sends only those segments explicitly solicited by the receiver. If all initial segments of a request are lost, the receiver will not solicit further segments. In this case the client-RRP times-out and retransmits the segments. Once some of the initial segments have been received, the receiver explicitly asks the sender for outstanding segments.

RRP places no restriction on which outstanding segments may be solicited by the receiver. RRP does, however, suggest guidelines in order to reduce unnecessary retransmissions and increase network utilization. The first guideline is that no outstanding segment should be solicited for the Nth time, until each other outstanding segment has been asked for at least N-1 times. The initial DEFAULT segments discussed above are considered to have been solicited once at the outset. This scheme gives delayed segments as much time as possible to arrive before they are solicited again, thus reducing unnecessary retransmissions. The second guideline is that a receiver should solicit a new group of segments before all previously solicited segments have been received. This reduces the time a sender spends awaiting a request for more segments. This increases network utilization and reduces request and response transfer time.

This flow control scheme is simple to implement. The receiver solicits as many outstanding segments as it can accommodate. The sender simply sends those segments explicitly solicited by the receiver.

This scheme also provides selective retransmission without the need for a separate mechanism. No distinction is made between segments that have not yet been transmitted, and those which have been transmitted and lost. All outstanding segments are treated equally and may be solicited by the receiver.
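The first guideline above can be illustrated with a small sketch of the receiver's bookkeeping. This is a hedged illustration, not the thesis implementation: the class name SegmentSolicitor, the default_window parameter (standing in for DEFAULT), the capacity argument, and segment numbering starting at 1 are all assumptions. The sketch solicits only from the group of outstanding segments that has been asked for the fewest times, which is one simple way to satisfy the guideline.

```python
class SegmentSolicitor:
    """Hypothetical receiver-side record of which segments to solicit."""

    def __init__(self, final_segment, default_window):
        # Times each segment has been solicited. Segments are assumed to
        # be numbered 1..final_segment.
        self.solicited = {seg: 0 for seg in range(1, final_segment + 1)}
        # The initial DEFAULT segments count as solicited once at the outset.
        for seg in range(1, min(default_window, final_segment) + 1):
            self.solicited[seg] = 1
        self.received = set()

    def on_segment(self, seg):
        """Record the arrival of a segment (duplicates are harmless)."""
        self.received.add(seg)

    def outstanding(self):
        return [seg for seg in self.solicited if seg not in self.received]

    def complete(self):
        return not self.outstanding()

    def next_request(self, capacity):
        """Choose up to `capacity` outstanding segments to solicit next."""
        pending = self.outstanding()
        if not pending:
            return []
        # Only the least-solicited segments are eligible, so no segment is
        # asked for an Nth time while another has fewer than N-1 requests.
        fewest = min(self.solicited[seg] for seg in pending)
        chosen = sorted(seg for seg in pending
                        if self.solicited[seg] == fewest)[:capacity]
        for seg in chosen:
            self.solicited[seg] += 1
        return chosen
```

For an eight segment transfer with four default segments, if segment 3 is lost the receiver first solicits segments 5 through 8, and asks for segment 3 a second time only after those have been requested. Lost and not-yet-sent segments are handled by the same code path, which is the sense in which selective retransmission needs no separate mechanism.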
If flow control and selective retransmission are treated as separate mechanisms, the protocol specification and implementation become unnecessarily complex.

Figure 4.2 shows a typical RRP interaction consisting of a single segment request and response. Figure 4.3 shows a possible flow control scenario for the transfer of a multi-segment request. In this example, the sender is assuming a standing request for the first four segments. Figure 4.4 shows the transfer of a multisegment request in the presence of error. In this case, segment 3 is given extra time to arrive because the receiver solicits previously unsolicited segments before asking for segment 3 a second time.

[Figure 4.2: Single Segment Request/Response Interaction. Columns: Client, Network, Server. Primitives shown: RDataReq, RDataInd.]

4.2.3 Acknowledgement Strategy

A request PDU is acknowledged in one of two ways. If a response is quickly generated, the response itself acknowledges receipt of the corresponding request. A server which is slow to respond acknowledges the request by sending an explicit acknowledgement PDU.

A response is acknowledged implicitly by the receipt of a subsequent request over the same connection, or a time-out. A subsequent request on the same connection indicates that the client received the response. A time-out (indicating that receipt of a retransmission from the client is now impossible) indicates that either the response was received, or that the client has become unreachable.

4.2.4 Notification of Peer Unreachability

A client-RRP with an outstanding request to a server is notified if the server becomes unreachable. A client with an outstanding request starts a timer, and expects one of two events to occur before timer expiry. Either the server responds, or it sends an acknowledgement PDU.
If the server application is slow in responding, the server-RRP periodically sends a PDU to the client-RRP indicating that the server application received the request. These periodic acknowledgements are an indication that the server-RRP is still reachable. Each time an acknowledgement is received by the client-RRP, the timer is reset. If the timer expires, the client sends a PDU prompting the server-RRP for an immediate indication that it is reachable. If none is forthcoming, the client will retry several times before giving up.

A server application may be notified if a client-RRP that it is connected to becomes unreachable. This is an optional service. If this service is selected, the server-RRP will periodically probe the client-RRP for an indication that it is reachable.

The probe and acknowledgement PDUs are small, infrequent, and may be grouped together on a machine-to-machine basis^1.

^1 Grouping on a machine-to-machine basis, while possible, is not included in this implementation.

4.2.5 Segmentation and Reassembly

RRP segments a request or response into packets small enough to be transmitted by the underlying network. The segmented PDU is reassembled at the destination. Each segment of a PDU contains a sequence number indicating its relative position within the PDU. Each also contains the sequence number of the request or response to which it belongs.

4.2.6 At-Most-Once Delivery

RRP uses sequence numbers over each connection to identify request/response pairs. A duplicate received request or response is discarded. A request must have a sequence number one greater than the previous request over that connection or it is discarded.

4.3 RRP Protocol Data Units

This section lists and describes the RRP protocol data units. Abbreviations used:

SRCID - connection identifier of the sending RRP entity.
DESTID - connection identifier of the destination RRP entity.
CEXT - client application extension.
SEXT - server application extension.
TYPE - PDU type.
REQNO - request number that this PDU is associated with.
SINGS - number of singly requested segments.
SINGLES - singly requested segments.
RANGF - range flag - indicates whether a range of segment requests follow.
RANGEC - range of requested segments.
SEGN - segment number.
FINSEG - number of last segment for this request or response.
DATA - user data.

Figures 4.5 to 4.12 show the RRP PDUs. Each figure shows the fields of a PDU with the field length in bytes above. All fields have the most significant bit on the left.

Bytes:  4      4       4     4     1     4      4     4       N
Field:  SRCID  DESTID  CEXT  SEXT  0x09  REQNO  SEGN  FINSEG  DATA

The Rrp-REQuest PDU (RREQ) transfers one segment of a request from the client-RRP to the server-RRP.
Figure 4.5: RREQ PDU.

Bytes:  4      4       4     4     1     4      4     4       N
Field:  SRCID  DESTID  CEXT  SEXT  0x0a  REQNO  SEGN  FINSEG  DATA

The Rrp-RESPonse PDU (RRESP) transfers one segment of a response from the server-RRP to the client-RRP.
Figure 4.6: RRESP PDU.

Bytes:  4      4       4     4     1     4
Field:  SRCID  DESTID  CEXT  SEXT  0x0b  REQNO

The Rrp-Waiting-For-Response PDU (RWFR) indicates to the server-RRP that the client-RRP has not yet received a response to its request.
Figure 4.7: RWFR PDU.

Bytes:  4      4       4     4     1
Field:  SRCID  DESTID  CEXT  SEXT  0x0c

The Rrp-UNBIND PDU (RUNBIND) indicates that the sending entity is releasing the association.
Figure 4.8: RUNBIND PDU.

Bytes:  4      4       4     4     1
Field:  SRCID  DESTID  CEXT  SEXT  0x0d

The Rrp-Are-You-Alive PDU (RAYA) is a probe from the server to the client.
Figure 4.9: RAYA PDU.

Bytes:  4      4       4     4     1
Field:  SRCID  DESTID  CEXT  SEXT  0x0e

The Rrp-I-Am-Alive PDU (RIAA) is sent in response to the RAYA probe.
Figure 4.10: RIAA PDU.

Bytes:  4      4       4     4     1     4
Field:  SRCID  DESTID  CEXT  SEXT  0x0f  REQNO

The Rrp-Server-Is-Processing PDU (RSIP) is an indication from the server that it has received the request, but has not yet responded to it.
Figure 4.11: RSIP PDU.
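The RREQ and RRESP layouts in Figures 4.5 and 4.6 can be sketched with Python's struct module. This is only an illustration under stated assumptions: the text says each field has its most significant bit on the left, and the sketch assumes that multi-byte fields are therefore big-endian unsigned integers. The function names pack_data_pdu and unpack_data_pdu and the dictionary keys are hypothetical and do not come from the thesis.

```python
import struct

RREQ = 0x09   # TYPE value from Figure 4.5
RRESP = 0x0A  # TYPE value from Figure 4.6

# SRCID, DESTID, CEXT, SEXT (4 bytes each), TYPE (1 byte),
# REQNO, SEGN, FINSEG (4 bytes each). ">" selects big-endian with no
# padding, which is an assumption about the wire byte order.
_HEADER = struct.Struct(">IIIIBIII")


def pack_data_pdu(pdu_type, srcid, destid, cext, sext,
                  reqno, segn, finseg, data):
    """Build one RREQ or RRESP segment as bytes."""
    if pdu_type not in (RREQ, RRESP):
        raise ValueError("only RREQ and RRESP carry SEGN, FINSEG and DATA")
    header = _HEADER.pack(srcid, destid, cext, sext,
                          pdu_type, reqno, segn, finseg)
    return header + data


def unpack_data_pdu(packet):
    """Split a received RREQ or RRESP segment into its fields."""
    (srcid, destid, cext, sext,
     pdu_type, reqno, segn, finseg) = _HEADER.unpack_from(packet)
    return {
        "srcid": srcid, "destid": destid, "cext": cext, "sext": sext,
        "type": pdu_type, "reqno": reqno, "segn": segn, "finseg": finseg,
        "data": packet[_HEADER.size:],  # remaining N bytes of user data
    }
```

Under these assumptions the fixed part of a data PDU occupies 29 bytes, with the TYPE byte at offset 16, and everything after the header is treated as DATA. The shorter PDUs (RWFR, RUNBIND, RAYA, RIAA, RSIP) would use a prefix of the same layout.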
Bytes:  4      4       4     4     1     4      1      N        1      0 or 8
Field:  SRCID  DESTID  CEXT  SEXT  0x10  REQNO  SINGS  SINGLES  RANGF  RANGEC
The Rrp-SEGment-REQuest PDU (RSEGREQ) is a request from a receiver to a sender for particular segments. The number of SINGLES is given in SINGS. If present, each four byte segment number in SINGLES solicits the indicated segment. RANGEC is present only if RANGF is 1. If present, RANGEC consists of two four-byte segment numbers. These numbers form a range of requested segments.
Figure 4.12: RSEGREQ PDU.
4.4 Elements of Procedure
4.4.1 RRP PDU Transfer
Purpose: An underlying transfer service must exist to transfer PDUs between RRP entities. RRP is designed to use the services of the connectionless Transport layer, but any transfer service satisfying the following requirements may be used.
Requirements: The transfer service supporting RRP PDU transfer must have the following characteristics:
• It must provide a datagram service.
• It must return the source address of received datagrams.
• The maximum packet life in the network must be bounded and known.
Reliability and sequenced delivery are not necessary characteristics of the transfer service.
4.4.2 RRP Service Interface
The services of RRP are provided through the following interface. In our implementation, service primitives are transferred to RRP as events using the framework module's PostEvent() subroutine call. Events generated by RRP are delivered through a boundary routine interface. The PostEvent() interface and the boundary routines are described in Chapter 5.
RBindReq
RBindReq is invoked by the client application to prepare the client-RRP for connection establishment with the server-RRP.
Parameter:
• ADDRESS - identifies the server application with which communication is desired. Consists of an address² to identify the server-RRP entity, and an extension to identify the individual application.
Returns:
• CONNECTION IDENTIFIER - used by the client application in subsequent service requests to identify this connection. The connection identifier consists of an 8 bit incarnation number³, and a 24 bit connection number⁴.
² The format of the address is dependent on the transfer service used.
³ Incremented each time the client-RRP starts up.
⁴ Incremented each time the client-RRP creates a new connection.
RUnbindReq
RUnbindReq is invoked by the client or server application when an existing connection is no longer required.
Parameter:
• CONNECTION IDENTIFIER - identifies the connection to be released.
RDataReq
RDataReq is invoked by the client or server application to transfer a request or response, respectively.
Parameters:
• CONNECTION IDENTIFIER - identifies the connection over which this data is to be sent.
• DATA LENGTH - indicates the amount of data to send, in bytes.
• DATA - address of the data to be sent.
RConnInd
RConnInd is delivered to the server application to indicate that a new connection has been established.
Parameter:
• CONNECTION IDENTIFIER - identifies this connection in further interaction with the server-RRP.
RDataInd
RDataInd is delivered to the client or server application to signal the receipt of a response or request, respectively.
Parameters:
• CONNECTION IDENTIFIER - indicates the connection on which this data was received.
• DATA LENGTH - indicates the amount of received data, in bytes.
• DATA - the address of the received data.
RDiscInd
RDiscInd is delivered to the client or server application when a connection has been closed by a previously connected peer.
Parameter:
• CONNECTION IDENTIFIER - indicates the connection that has been closed.
RAbrtInd
RAbrtInd is delivered to a client or server application when a protocol error has occurred, or when RRP detects that a connected peer has become unreachable.
Parameter:
• CONNECTION IDENTIFIER - indicates the connection that has been abnormally closed.
4.4.3 RRP Timers
Purpose: RRP timers time retransmission and idle link intervals.
Description: There are two RRP interval timers, the retransmission timer (RTIMER), and the idle link timer (IDLETIMER). The RTIMER is used to time retransmission intervals for unacknowledged segments. This RTIMER interval may be any reasonable value such as one or two seconds. Passage of the IDLETIMER interval indicates that no retransmissions of a previously received PDU still exist in the network. The duration of the IDLETIMER must be greater than (MAX × RTD) + MPL. MAX is the maximum number of client-RRP retransmission attempts, RTD is the duration of RTIMER, and MPL is the maximum life of a packet in the network.
4.4.4 Connection Establishment
Purpose: This procedure is used to create a connection between a client-RRP and server-RRP for the purpose of transferring requests and responses.
Summary: Each RRP entity creates a locally unique identifier for each of its connections. A connection is established when the server-RRP and the client-RRP each know the other's connection identifier for this connection. The trading of these identifiers occurs with the first request/response exchange. Both identifiers are used in all subsequent transmissions to identify a PDU as belonging to this connection.
Action Taken by Initiator
• The client-RRP sends a connection establishing PDU to the server-RRP. This PDU is a RDATA PDU containing the first request over the connection. This first PDU has the DESTID field set to NOID, indicating that this connection is not yet established. It also has the SRCID set to the identifier created by the client-RRP when the RBindReq event was received.
• The client-RRP expects in response a PDU from the server-RRP indicating connection acceptance. This PDU returns the identifier created by the server-RRP for this connection, as well as the identifier originally sent by the client-RRP. This PDU must be one of RRESP, RSIP, or RSEGREQ. Data transfer procedures, described below, are used following receipt of one of these PDUs.
• If the client-RRP fails to receive one of these PDUs, the general retransmission strategy described below is used. Continued failure results in failure to establish a connection.
Action Taken by Responder
• A RDATA PDU with the DESTID field set to NOID is tested to verify that it is not a duplicate. This is tested using the PDU's source address, and the SRCID contained in the PDU. Together, this pair uniquely identifies the request.
• The server-RRP creates a locally unique identifier for this new connection. This identifier, as well as the received client-RRP identifier, are used in all subsequent transmissions over this connection.
• A RConnInd is delivered to the server application.
• A RDataInd is delivered to the server application.
• Data transfer procedures, described below, are used to respond to this, and further requests over this connection.
4.4.5 Connection Release
Purpose: This procedure is used to release a previously created connection.
Summary: A connection is released when either or both of the client-RRP and server-RRP invalidate their identifier for that connection. Subsequent received PDUs bearing the invalidated connection identifier will be ignored.
Action Taken by RRP Entity Releasing Connection
In response to a RUnbindReq, a server-RRP entity performs the following actions:
• Send a RUNBIND PDU to the client-RRP (optional).
• Start the IDLETIMER.
• Prior to timer expiry, the server-RRP must not respond to new requests. It must, however, respond to retransmitted PDUs in the normal fashion.
• On expiry of the timer, the server-RRP releases the connection by discarding any record of the identifier associated with this connection.
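The server-side release steps above, together with the IDLETIMER bound of Section 4.4.3, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the thesis implementation. The class and method names, the explicit `now` clock argument, the returned action strings, and the rule that a retransmission is recognised by its repeated request number are all hypothetical.

```python
def min_idle_interval(max_retx, rtimer_interval, max_packet_life):
    """Bound from Section 4.4.3: IDLETIMER must exceed (MAX x RTD) + MPL."""
    return max_retx * rtimer_interval + max_packet_life


class ServerConnection:
    """Tracks one server-side connection through release (hypothetical)."""

    def __init__(self, conn_id, idle_interval):
        self.conn_id = conn_id
        self.idle_interval = idle_interval
        self.closing = False
        self.idle_deadline = None
        self.last_reqno = None  # request number most recently accepted

    def unbind(self, now):
        """RUnbindReq: start the IDLETIMER; the identifier stays valid."""
        self.closing = True
        self.idle_deadline = now + self.idle_interval

    def on_request(self, reqno, now):
        """Classify a request PDU arriving at time `now`."""
        if self.conn_id is None:
            return "ignore"            # identifier already discarded
        if self.closing and now >= self.idle_deadline:
            self.conn_id = None        # IDLETIMER expired: release
            return "ignore"
        if reqno == self.last_reqno:
            return "retransmit"        # duplicate: answer as usual
        if self.closing:
            return "drop"              # no response to new requests
        self.last_reqno = reqno
        return "new"
```

With MAX = 5, an RTIMER interval of 2 seconds and a maximum packet life of 30 seconds (illustrative values only), `min_idle_interval(5, 2.0, 30.0)` gives 40 seconds, so any IDLETIMER interval strictly greater than that satisfies the bound.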
In response to a RUnbindReq, a client-RRP entity performs the following actions:
• Send a RUNBIND PDU to the server-RRP (optional).
• The client-RRP releases the connection by discarding any record of the identifier associated with this connection.
Action Taken by Entity Receiving RUNBIND PDU
In response to a RUNBIND PDU, a server-RRP entity performs the following actions:
• Start the IDLETIMER.
• Prior to timer expiry, the server-RRP must not respond to new requests. It must, however, respond to retransmitted PDUs in the normal fashion.
• On expiry of the timer, the server-RRP releases the connection by discarding any record of the identifier associated with this connection.
In response to a RUNBIND PDU, a client-RRP entity releases the connection by discarding any record of the identifier associated with this connection.
4.4.6 Monitoring Reachability of Server-RRP
Purpose: This procedure is used by a client-RRP to determine if a connected server-RRP entity is unreachable.
Summary: The client-RRP periodically probes the server-RRP to which it has an outstanding request. If no PDU is received in response to the probe, the probe PDU is retransmitted. Following several unacknowledged probes, the client-RRP informs the client application of server unreachability.
Action Taken by Client-RRP
• A client-RRP starts the RTIMER after transmitting any part of a request to the server-RRP. Also, the variable CNT is set to 0. In response to the transmitted request, the client-RRP expects one of three PDUs: RRESP, RSIP or RSEGREQ. If any one of these is received, CNT is reset to 0 and the RTIMER is restarted. On receipt of a complete response, the RTIMER is stopped.
• On RTIMER expiry, the RWFR PDU is transmitted. This PDU is retransmitted using the retransmission strategy below until one of the three PDUs is received or the retransmissions reach a maximum number.
• If the general retransmissions reach a maximum number, the client-RRP delivers a RAbrtInd to the client application and invalidates the identifier associated with this connection.
Action Taken by the Server-RRP
On receipt of a RREQ or RWFR PDU, the server-RRP takes the appropriate action as dictated by the data transfer procedures below.
4.4.7 Monitoring Reachability of Client-RRP
Purpose: This procedure is used by a server-RRP supporting an interested server application to determine if a connected client-RRP is unreachable.
Summary: While there are no outstanding requests, the server-RRP probes the client-RRP. If these probes go unanswered for some time, the server-RRP informs the server application that the client-RRP is unreachable, and the connection is released.
Action Taken by the Server-RRP
• On server-RRP startup, and after the transmission of a complete response, the server-RRP starts the RTIMER, and sets the variable CNT to 0.
• On RTIMER expiry, the server-RRP transmits a RAYA PDU, and restarts the RTIMER. In response to the RAYA PDU, the server-RRP expects one of three PDUs: RIAA, RWFR or RREQ. If RIAA is received, CNT is reset to 0, and the RTIMER is restarted. If either RWFR or RREQ is received, then data transfer procedures given below apply.
• The RAYA PDU is retransmitted using the general retransmission strategy below until one of the three PDUs is received, or the retransmissions reach a maximum number.
• If the retransmissions reach a maximum number, the server-RRP delivers a RAbrtInd to the server application, and invalidates the identifier associated with this connection.
Action Taken by the Client-RRP
On receipt of a RAYA PDU, the client does the following:
• If there is no outstanding request, the RIAA PDU is transmitted.
• If there is an outstanding request, data transfer procedures given below apply.
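The server-side probing of Section 4.4.7 can be pictured as a small event-driven monitor. The Python sketch below is illustrative only and is not the thesis implementation. The class name, the callback parameters, and the exact counting convention (how CNT is compared with MAX) are assumptions. Timer restarts are left to the caller, and RWFR or RREQ are treated simply as evidence of reachability, since their full handling belongs to the data transfer procedures.

```python
class ClientMonitor:
    """Server-side reachability monitor for one connection (hypothetical)."""

    REACHABLE = ("RIAA", "RWFR", "RREQ")

    def __init__(self, max_retx, send, abort):
        self.max_retx = max_retx  # MAX: probes allowed without any reply
        self.send = send          # callable taking a PDU name, e.g. "RAYA"
        self.abort = abort        # callable delivering RAbrtInd upward
        self.cnt = 0              # CNT: probes sent since the last reply
        self.open = True

    def on_pdu(self, pdu_type):
        """Any expected PDU from the client resets the probe counter."""
        if self.open and pdu_type in self.REACHABLE:
            self.cnt = 0

    def on_rtimer_expiry(self):
        """RTIMER expired: probe again, or give up after MAX probes."""
        if not self.open:
            return
        if self.cnt < self.max_retx:
            self.cnt += 1
            self.send("RAYA")
        else:
            self.open = False     # identifier is invalidated
            self.abort()
```

Driving the monitor with a list's `append` as the `send` callback makes it easy to check how many RAYA probes go out before the application is told that the client is unreachable.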
4.4.8 Data Transfer
Purpose: This procedure enables the transfer of PDUs which may be larger than the supporting communication network will allow.
Summary: If a request or response is too large to be transferred as one unit, it is divided into appropriately sized segments. Segments of a request or response PDU are transmitted (and, if necessary, retransmitted) until the entire PDU has been received.
Action Taken by the Client-RRP When Sending Requests
• Each request is transferred in one or more segments.
• Each segment contains its call number (incremented on each new request), as well as a request-relative sequence number.
• Each segment must contain at least one byte of data, and no more than the amount allowed by the supporting network. For efficient transfer, it is suggested that all but the last segment contain the maximum amount of data allowed.
• Segments are transmitted according to the flow control scheme described below. When the first group of segments is sent, CNT is set to 0 and the RTIMER is started. In response to transmitted segments, the client-RRP expects one of four PDUs: RSIP, RRESP, RSEGREQ or RAYA.
• If a RSIP PDU is received, this indicates that the server-RRP has delivered the request to the server application. The client-RRP restarts the RTIMER and resets CNT to 0.
• If a RRESP PDU is received, the client-RRP takes appropriate action as described below.
• If a RSEGREQ PDU is received, the appropriate segments are sent, as described in "Flow Control and Selective Retransmission". CNT is reset to 0, and the RTIMER is restarted.
• If a RAYA PDU is received, this indicates that the server-RRP did not successfully receive the initial request PDUs. The RTIMER is restarted, CNT is set to 0, and a RWFR PDU is sent.
• If the RTIMER expires before any of these PDUs are received, a RWFR PDU is transmitted, and possibly retransmitted using the general retransmission scheme given below. If the retransmission scheme fails to produce one of the above PDUs, a RAbrtInd is delivered to the client application, and the connection identifier is invalidated.
Action Taken by the Client-RRP When Receiving Results
• A received result segment is tested in case it is a duplicate. CNT is set to 0, and the RTIMER is started.
• If the received segment completes the result, a RDataInd is issued to the client application. The RTIMER is stopped.
• If some number of result segments are still outstanding, a RSEGREQ PDU may be sent (according to the flow control section below), CNT is reset to 0, and the RTIMER is restarted.
• If the RTIMER expires before more result segments are received, a RSEGREQ PDU is transmitted or retransmitted according to the retransmission scheme given below.
• If the retransmission scheme fails to produce further response segments, a RAbrtInd is delivered to the client application and the connection identifier is invalidated.
Action Taken by the Server-RRP When Sending Results
• Each result is transferred in one or more segments.
• Each segment contains its call number, as well as a result-relative sequence number.
• Each segment must contain at least one byte of data, but no more than is allowed by the supporting network. For efficient transfer, it is suggested that all but the last segment contain the maximum amount of data allowed.
• Segments are transmitted according to the flow control scheme described below. When the first group of segments is sent, CNT is set to 0 and the RTIMER is started. In response to transmitted segments, the server-RRP expects one of three PDUs: RREQ, RWFR or RSEGREQ.
• If a RREQ segment is received, it is a duplicate and may be discarded. CNT is reset to 0, and the RTIMER is restarted.
• If a RWFR PDU is received, the most recently sent data is retransmitted. CNT is reset to 0 and the RTIMER is started.
• If a RSEGREQ PDU is received, the appropriate segments are sent, as described in "Flow Control and Selective Retransmission". CNT is reset to 0 and the RTIMER is restarted.
• If the RTIMER expires before any of these PDUs are received, a RAYA PDU is transmitted, and possibly retransmitted using the general retransmission scheme given below. If the retransmission scheme fails to produce one of the above PDUs, a RAbrtInd is delivered to the server application, and the connection identifier is invalidated.
Action Taken by the Server-RRP When Receiving Requests
• A received request segment is tested in case it is a duplicate. CNT is set to 0, and the RTIMER is started.
• If the received segment completes the request, a RDataInd is issued to the server application. The RTIMER is stopped.
• If some number of request segments are still outstanding, a RSEGREQ PDU may be sent (according to the flow control section below), CNT is reset to 0 and the RTIMER is restarted.
• If the RTIMER expires before more request segments are received, the RSEGREQ PDU is retransmitted according to the general retransmission scheme given below.
• If the retransmission scheme fails to produce further request segments, a RAbrtInd is delivered to the server application and the connection identifier is invalidated.
4.4.9 General Retransmission Scheme
Purpose: This procedure is used by the sender to retransmit any PDU for which an acknowledgement is expected, but not received.
Summary: Some PDUs require an acknowledgement. An acknowledgement can take the form of a RSIP, RWFR, RIAA, RRESP or RSEGREQ PDU, depending on the circumstances as described in this section. If the acknowledgement does not arrive within an acceptable time limit, the PDU is retransmitted.
The retransmission and timeout are repeated until the acknowledgement is received, or until a maximum number of retransmissions have been attempted.
Action Taken by the Sending RRP
• When a PDU requiring an acknowledgement is sent, the RTIMER is started, and CNT is set to 0.
• If the acknowledgement arrives, the RTIMER is stopped.
• If the RTIMER expires, CNT is incremented. If CNT is less than or equal to MAX, the PDU is retransmitted and the RTIMER is restarted. If CNT exceeds MAX, then a RAbrtInd is delivered to the application and the connection identifier is invalidated.
4.4.10 Flow Control and Selective Retransmission
Purpose: This procedure allows the receiving RRP entity to control the sending of request or response segments. It is also used by the receiver to solicit missing request or response segments.
Summary: RRP uses a receiver-controlled scheme which unifies flow control and selective retransmission. Here, the sender only sends PDUs specifically requested by the receiver.
Action Taken by the Sending RRP Entity
• A sender is allowed to send the first DEFAULT segments of a request or response without receiving permission. DEFAULT is a value agreed to by both the sender and receiver.
• If there are more segments to send, the sender awaits a RSEGREQ PDU. CNT is set to 0 and the RTIMER is started.
• If no RSEGREQ PDU arrives, the general retransmission scheme described above is used to retransmit the segments. If still no RSEGREQ PDU arrives, a RAbrtInd is delivered to the application and the connection identifier is invalidated.
• If a RSEGREQ PDU arrives, the segments requested by the RSEGREQ are sent. Again, if there are more segments to send, the sender awaits a RSEGREQ PDU, and the procedure is repeated.
Action Taken by the Receiving RRP Entity
• A missing segment is any segment of a request or response which has not yet been received.
• The receiver is free to request any number of missing segments at a time.
• Each missing segment should be requested exactly N times before any other is requested N+1 times.
• Missing segments should be requested in order of their segment numbers.
• It is suggested that requests be sent for further missing segments after some portion (approximately half) of the previously requested segments have arrived.
• Each time a RSEGREQ PDU is sent, the RTIMER is started and CNT is set to 0. If no requested segments are received before RTIMER expiry, the RSEGREQ PDU is to be retransmitted using the general retransmission strategy given above. If, after these retransmissions, no further segments arrive, a RAbrtInd is delivered to the application, and the connection identifier is invalidated.
4.4.11 Retention of Results for Retransmission
Purpose: This procedure allows retransmission of unsuccessfully transmitted results.
Summary: All results are to be retained until such time as no request retransmissions are possible from the client-RRP.
Action Taken by Interested Server
• When the server-RRP transmits the final segment of a result, the result is retained and the IDLETIMER is started.
• If a PDU from the client indicates that part or all of the result was not received, the unreceived portion is retransmitted.
• When the IDLETIMER expires, the saved result may be discarded.
Action Taken by Disinterested Server
The actions taken here are identical to those for the interested server. In addition, when the IDLETIMER expires, the disinterested server-RRP may discard all state information other than the connection identifiers. This is not possible for interested servers, as other state information is required to periodically probe the client-RRP.
4.5 RRP Protocol States
The RRP protocol states are divided into client and server states. Including the closed state, there are six client states and five server states. The client-RRP states are as follows.
CLOSED - no connection
CIDLE - connection open, no outstanding request
CSENDING - in the process of sending a multi-segment request to the server-RRP
CSENT - all segments of request sent at least once, waiting for a response
CPROCESSING - all segments sent, and server-RRP has acknowledged their receipt
CRECEIVING - receiving a multi-segment response
The server-RRP protocol states consist of the following.
CLOSED - no connection
SIDLE - connection open, no outstanding request
SRECEIVING - receiving a multi-segment request
SPROCESSING - all segments received, and RDataInd delivered to server application
SSENDING - in the process of sending a multi-segment response to the client-RRP
4.6 RRP State Tables
The state tables for the RRP protocol are given in Tables 4.1 and 4.2. The following information is necessary to interpret the state tables.
• States are listed across the top, events are listed down the left side. The action to be taken is at the row and column intersection of the current state and the received event.
• On startup, RRP enters state CLOSED.
• PDUs in the event column indicate receipt of that PDU.
• Any received PDU with an inappropriate request number is to be discarded.
• RTIMER and IDLETIMER in the event column indicate expiry of that timer.
• Conditional actions are interpreted as follows: The predicate is followed by a question mark. The action to be taken if the predicate is satisfied is listed either to the right of the predicate, or indented below the predicate. The action to be taken if the predicate evaluates to false is optionally listed to the right of, or indented below an else statement. Indentation is significant in matching predicates and their corresponding else statements. The negation of a predicate is indicated by a preceding caret.
• A CPR predicate is to be interpreted as "have all segments of this request or response been received?"
• An AS predicate is to be interpreted as "have all the segments of this request or response been sent at least once?"
• Blank actions indicate a protocol error. The action taken is implementation dependent.
• A state name as an action means proceed to that state.
• A PDU as an action indicates that the PDU is to be sent. If the PDU is a RREQ or a RRESP PDU, the solicited segments should be sent.
• A CR as an action indicates that a RSEGREQ PDU may potentially be sent according to elements of procedure above.
• A SRT as an action indicates that the RTIMER is to be started.
• A KRT as an action indicates that the RTIMER is to be stopped.
• A SIT as an action indicates that the IDLETIMER is to be started.
• A KIT as an action indicates that the IDLETIMER is to be stopped.
• CNT is an integer variable.
• MAX is the maximum number of retransmissions.
• CLOSING is a boolean variable.
• The ignore action indicates that no action is to be taken.
CLOSED CIDLE CSENDING CSENT CPROCESSING CRECEIVING
RRESP: ignore CPR? KRT, RDataInd CIDLE else CNT=0, CR, SRT, CRECEIVING CPR? KRT, RDataInd CIDLE else CNT=0, CR, SRT, CRECEIVING CPR? KRT, RDataInd CIDLE else CNT=0, CR, SRT
RAYA: RIAA CNT=0 DATA SRT CNT=0 RWFR SRT CNT=0 RWFR SRT CNT=0 CR SRT
RSIP: CNT=0 SRT CPROCESSING CNT=0 SRT CNT=0 SRT
RSEGREQ: CNT=0 RREQ AS? CSENT SRT CNT=0 RREQ SRT
RDataReq: RREQ AS? CSENT else CSENDING SRT
RBindReq: CIDLE
RUnbindReq: CLOSED
RTIMER: CNT++<MAX? RWFR, SRT else RAbrtInd CLOSED CNT++<MAX? RWFR, SRT else RAbrtInd CLOSED CNT++<MAX? RWFR, SRT else RAbrtInd CLOSED CNT++<MAX? CR, SRT else RAbrtInd CLOSED
Table 4.1: RRP State Table, Client States
CLOSED SIDLE SRECEIVING SPROCESSING SSENDING
RREQ: CPR? RConnInd RDataInd SPROCESSING, SRT else CR, SRECEIVING, SRT ^CLOSING? CNT=0, KIT, SRT CPR? RDataInd, SPROCESSING else CR, SRECEIVING CPR? RDataInd SPROCESSING else CR CNT=0, SRT CNT=0 RSIP SRT CNT=0 RRESP SRT
RWFR: CR SRT SIT SIDLE ^CLOSING? DUP? RRESP, SIT INTS? SRT else CR, SRECEIVING SRT CNT=0 CR SRT CNT=0 RSIP SRT CNT=0 RRESP SRT
RIAA: CNT=0 INTS? SRT
RSEGREQ: CNT=0 RRESP INTS? SRT SIT KRT, CNT=0, RRESP AS? INTS? SRT SIT, SIDLE else SRT
RDataReq: CNT=0, KRT, RRESP AS? INTS? SRT SIT, SIDLE else SRT, SSENDING
RUnbindReq or RUNBIND: CLOSING=true KRT SIT CLOSING=true KRT SIT SIDLE CLOSING=true KRT SIT SIDLE CLOSING=true KRT, SIT SIDLE
RTIMER: CNT++<MAX? RAYA INTS? SRT else RAbrtInd, SIT CLOSING=true CNT++<MAX? CR, SRT else RAbrtInd CLOSING=true SIDLE, SIT RSIP SRT CNT++<MAX? RAYA, SRT else RAbrtInd CLOSING=true SIDLE, SIT
IDLETIMER: CLOSING? CLOSED DINTS? CLOSED
Table 4.2: RRP State Table, Server States
4.7 Arguments for Correctness
This section presents informal arguments in favor of the correctness of RRP. Arguments 1, 2, 4 and 6 pertain to connections between a client and either an interested or a disinterested server. Arguments 3 and 5 pertain to connections between a client and an interested server.
1. No PDU from a previously released connection can initiate a new connection and cause the acceptance of duplicate data.⁵
• A connection will only be opened if a PDU is received for a connection in the closed state (CLOSED).
• For duplicate data to be accepted and to create a new connection, it is necessary that first the original PDU is received and accepted, then the server-RRP changes state to CLOSED, and finally the duplicate PDU arrives.
• Through examination of the states CIDLE through CPROCESSING, it can be seen that the maximum time between receipt of the original RREQ PDU and receipt of the duplicate PDU is (MAX × RTIMER-INTERVAL) + MPL⁶.
• Through examination of states SIDLE through SSENDING, it can be seen that the minimum time between when a server-RRP closes a connection, and when the connection goes to state CLOSED is IDLETIMER-INTERVAL. During this time, no PDUs will be accepted over this connection. Therefore, the original PDU, to be accepted, must be received before this interval begins.
• By definition, IDLETIMER-INTERVAL is greater than (MAX × RTIMER-INTERVAL) + MPL. Therefore, the duplicate PDU must be received before the server-RRP changes to state CLOSED. Thus, the server-RRP is not in state CLOSED when any duplicates arrive and a new connection is not created.
• A server-RRP which crashes must wait IDLETIMER-INTERVAL before accepting new requests. This guards against the situation where a server performs a request, crashes, and then reboots to receive a duplicate of the previously performed request.
• In short, a closing connection will wait until no duplicate (of a previously accepted PDU) is possible before changing to state CLOSED.
⁵ Because of the similarity between RRP connection management and Delta-t connection management, this argument follows the corresponding argument in [DeltaTp].
⁶ MAX is the number of possible retransmissions, and MPL is the maximum life of a packet in the network.
2. No PDU from a (closed or open) connection can interfere with some other open connection.
• Each connection is identified by a client-id and a server-id. Each id is unique within the entity that creates it.
• Each id consists of a generation identifier and a connection identifier. The generation identifier is incremented on each reincarnation of the entity. The connection identifier is incremented on each creation of a new connection.
• A PDU is only accepted as data on a connection if both identifiers in the PDU match the connection identifiers.
• Therefore, no PDU from a connection (open or closed) may be accepted as data on another open connection.
3. If a client-RRP becomes unreachable, all servers connected to that client-RRP will be informed.
• A server-RRP with an open connection will be in any of states SIDLE through SSENDING.
• Examining these states, it can be seen that each time a PDU is received from the client-RRP, the RTIMER is restarted and CNT is reset to zero.
• If the server-RRP is in one of states SIDLE, SRECEIVING or SSENDING (i.e. not processing a request), a RTIMER expiry causes the transmission of a PDU to the client-RRP requesting an immediate reply. CNT is also incremented.
• If the client-RRP does not respond, CNT reaches MAX and the server-RRP informs the server application that the client-RRP is unreachable.
• The only way to avoid CNT reaching MAX is for a PDU to arrive from the client-RRP (indicating that it is reachable).
• If the client becomes unreachable while the server application is processing a request, then the server-RRP is in state SPROCESSING. Once the request is processed, the server-RRP moves either to state SIDLE or SSENDING. In either case, the above mechanism for these states detects client-RRP unreachability.
4. If a server-RRP becomes unreachable while processing a request, the client application with the outstanding request will be notified.
• A client-RRP with an outstanding request is in one of states CSENDING through CRECEIVING.
• Examination of these states shows that each time a PDU is received from the server-RRP, CNT is reset to 0 and the RTIMER is started.
• If the RTIMER expires, CNT is incremented and a PDU is sent requesting an immediate reply from the server-RRP.
• If the server-RRP is unreachable, this PDU goes unanswered, the RTIMER expires, and CNT eventually reaches MAX. In this case, the client application is informed of server unreachability.
• The only way for CNT not to reach MAX is for a PDU to be received from the server-RRP indicating server reachability.
5. In the case of temporary network partition, either both the client-RRP and server-RRP will detect the partition, or neither will.
• A partition may occur where the duration is such that only one of the client-RRP and server-RRP detects it.
• In this case, the entity that detects the partition closes the connection by invalidating the identifier associated with that connection.
• When the entity which did not detect the partition attempts communication over the closed connection, it is ignored and therefore detects the partition.
6. Transmission is at-most-once.
• Each request is assigned an identifier unique within its connection (the call or request number). A response contains the call number of its corresponding request.
• On receipt of a request, the server-RRP records the call number and delivers the request to the server application. The call number is held for at least IDLETIMER-INTERVAL.
• Holding the call number for at least IDLETIMER-INTERVAL guarantees (in an argument analogous to Argument 1 above) that it will be available while the possibility of a duplicate still exists. Received call numbers can be compared and duplicate requests ignored.
• On receipt of a response, the client-RRP verifies the call number and delivers the response to the client application. The client-RRP then moves to state CIDLE.
• In state CIDLE, any received response is assumed to be a duplicate and is ignored.
• In any other state, the call number of a received response must match that of the original request.
Chapter 5 Implementation
The implementation of the RRP protocol described in Chapter 4 is written in C running under UNIX (SunOS 4.0), and consists of several modules. The first is a sub-kernel operating system called Threads. Threads provides efficient and convenient process creation, deletion and communication primitives not available in SunOS 4.0. Next is a framework module for protocol implementation.
The framework module provides interval timers and a facility for communication between protocol layers. Enforcing a common, well defined method of communication between protocol layers simplifies testing and development. The third module is the RRP protocol. It is implemented to test the operation and performance of the protocol. A portion of the ISO stack including Transport, Session, Presentation and ACSE layers is implemented to gain experience with the services required by remote operations, and for performance comparison purposes. Finally, the Remote Operations protocol and a user interface are implemented. These are implemented as a practical exercise to test the interface and performance of RRP. This chapter describes the implementation details of each software module.

5.1 Threads

Threads is a sub-kernel running inside a UNIX process. Threads provides an environment suitable for running a group of cooperating processes. Light-weight processes may be created and destroyed dynamically. All Threads processes run in a shared memory space. Interprocess communication primitives, a sleep facility and memory management routines allowing fast memory allocation resistant to fragmentation are provided. A Threads process has access to all existing libraries and UNIX system calls. Some of the popular blocking UNIX primitives are rewritten in Threads so that when they are called, only the calling Threads process is blocked, not the entire UNIX process.

5.1.1 Scheduling and Context Switching

Threads is not a time-slicing system. Context switches are performed only when Threads system calls are made by a Threads process. Normally, a context switch occurs as a result of I/O calls, sleep calls, or IPC calls. Also, a process may explicitly send itself to the end of the ready queue by calling the scheduling function, Sched(). Knowing when a context switch is likely to occur is useful when programming cooperating processes.
Execution inside a critical section will not be interrupted so long as Threads system calls are not made.

Each Threads process has its own stack. To perform a context switch, the Threads context-switching subroutine first saves the state of the running process and then loads the state of a ready process. Saving the running process' state consists of saving the register contents onto the stack and saving the stack pointer into the process control block. To restore a ready process, the registers are loaded from the stack of the ready process and then the saved stack pointer is loaded into the stack pointer register. When the context-switching subroutine returns, the return address on the new stack will be used, transferring control to the readied process.

5.1.2 Memory Management

Threads memory management provides fast memory allocation and deallocation, and keeps memory fragmentation to a minimum. Memory is allocated by Threads on a first-fit basis. Each request for memory is rounded up to the next multiple of MINSIZE to reduce fragmentation. All free memory is kept in an ordered list. Returned memory is added to the list and joined to adjacent free memory segments. Threads also provides a Temporary Memory facility. A process can free all temporary memory it previously allocated using a single call.

5.1.3 I/O

Routines which support I/O in Threads include NonBlkRead(), NonBlkWrite(), Read(), Write(), ReadN(), WriteN(), Accept(), Connect(), Recvfrom(), Open(), Close() and Socket(). Each of these routines is meant to replace the corresponding UNIX routine. The routines Open(), Close() and Socket() are provided to allow Threads to keep a count of the number of open file descriptors and sockets. Each of the remaining routines is provided because its corresponding UNIX call potentially blocks the calling process. It is unacceptable for all Threads processes to block as a result of one Threads process doing I/O.
To avoid this situation, Threads replaces common blocking UNIX system calls with similar Threads calls. The Threads I/O calls block only the calling Threads process. Threads processes awaiting I/O are placed on a queue. The system process doIO periodically checks the queued I/O requests using the UNIX select() call. If a request can be satisfied, the I/O is performed, and the process is readied.

5.1.4 Sleep

Threads processes can put themselves to sleep for any number of seconds. Sleeping processes are queued in order of wake time. The first process in the queue is checked at every context switch to see if it should be awakened. Because Threads is not a time-slicing system it cannot guarantee that a process is restarted at exactly the time desired.

5.1.5 Interprocess Communication

Threads processes communicate via Send(), Receive() and Reply() primitives. A sending process is blocked until some other process replies to it. A receiving process is blocked until some other process sends it a message. All Threads processes share one memory space, so no copying of data is done. Instead, a pointer and a length (or actually any two four-byte values) are transferred in Send() and Reply().

Send() first checks to see if the destination process is currently waiting for a message (via Receive()). If so, the pointer and length are transferred and the receiver is returned to the ready queue. The sender is placed on the "waiting for reply" queue. If instead the destination process is not waiting for a message, a check is made to verify that the destination process exists. The sender is then blocked (on the send blocked queue) pending a Receive() operation by the destination.

Receive() works in a similar way. First, Threads checks for any process blocked waiting to send to the receiving process. If one is found, the pointer and length are transferred and the receiver is readied. The sender is placed on the "waiting for reply" queue.
Otherwise, the receiving process is blocked pending some other process sending to it.

Reply() is very simple. Unless an error has occurred, Reply() will find that its destination process (the original sender) is waiting on the "waiting for reply" queue. A verification of this is performed, and the reply pointer and length are transferred. Also, the original sender is returned to the appropriate ready queue.

5.2 Framework for Protocol Implementation

The Remote Operations service is implemented as a group of cooperating processes. The main process consists of a protocol server. This process runs the protocol state machines and acts on requests from the user processes. The protocol server employs a group of worker processes. Workers include readers (to read from the network), writers (to write to the network), and timers (supplying an interval timer service). Two forms of communication are present in this model. The first is inter-process communication between the protocol server, its workers, and its users. The second is inter-layer communication between the protocol layers in the protocol server process. The framework module provides communication primitives for inter-layer communication, and a uniform format for inter-process communication.

All inter-layer communication is in the form of generated or received events. An event consists of the event type and some data. These events correspond to protocol state machine events (e.g. for the Transport layer - TDiscInd). Each protocol layer receives events through five boundaries. These boundaries are the Control-Up, Control-Down, Data-Up, Data-Down and Timer boundaries. The Control-Up boundary receives non-data events passed upward from the protocol layer below (or service provider). Control-Down events are non-data events passed downward from the layer above (or service user).
A Data-Up event is a data transfer indication from a service provider, and a Data-Down event is a data transfer request from a service user. Timer events signal the expiry of a timer for that protocol layer.

On startup, each protocol layer registers one subroutine for each boundary. This registration tells the framework module which subroutine to call when an event of a particular type is received for that protocol layer. The subroutine is passed the event type and all associated data as parameters. It is responsible for taking appropriate action (such as running a state machine) depending on the event received.

The PostEvent() subroutine is used to generate events. The parameters to PostEvent() include the protocol layer for which the event is destined, the event boundary and event type, and any other data associated with the event. Any layer may use this routine to generate an event for any other layer. When PostEvent() is called, the framework module calls the appropriate boundary routine, passing to it the event type and associated data. This method of generating and fielding events standardizes and simplifies inter-layer boundaries. This improves code readability, and simplifies addition of new layers or replacement of existing layers.

Communication between the protocol server process and its workers and users is also standardized by the framework module. The framework module provides a dispatcher subroutine which acts as a translator between inter-process communication and inter-layer communication. Workers and users use Send() to deliver events to the protocol server process. The dispatcher runs inside this process. The dispatcher uses Receive() to receive the event from the user or worker, and then uses PostEvent() to deliver the event to the appropriate protocol layer. Therefore, events are delivered to protocol layers in a uniform way, regardless of their origin, simplifying protocol implementation.
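The registration and dispatch scheme described above can be illustrated with a small C sketch. It is not the thesis code: only PostEvent(), the five boundary names and the TDiscInd event come from the text, while the type names, enum constants, RegisterBoundary(), the argument order and the return codes are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical identifiers: the text names only PostEvent(), the five
   boundaries and the TDiscInd event; everything else is assumed here. */
enum { CONTROL_UP, CONTROL_DOWN, DATA_UP, DATA_DOWN, TIMER, NUM_BOUNDARIES };
enum { LAYER_RRP, LAYER_TRANSPORT, LAYER_SESSION, NUM_LAYERS };
enum { EV_TDISCIND = 1 };            /* assumed code for the TDiscInd event */

/* A boundary routine receives the event type and its associated data. */
typedef void (*BoundaryFn)(int event_type, void *data, size_t len);

/* One routine per (layer, boundary) pair; static storage starts out NULL. */
static BoundaryFn handlers[NUM_LAYERS][NUM_BOUNDARIES];

/* Called by a layer at startup, once for each of its boundaries. */
void RegisterBoundary(int layer, int boundary, BoundaryFn fn)
{
    assert(layer >= 0 && layer < NUM_LAYERS);
    assert(boundary >= 0 && boundary < NUM_BOUNDARIES);
    handlers[layer][boundary] = fn;
}

/* Deliver an event by calling the routine registered for that boundary.
   Returns 0 on delivery, -1 if the target is invalid or unregistered. */
int PostEvent(int layer, int boundary, int event_type, void *data, size_t len)
{
    if (layer < 0 || layer >= NUM_LAYERS)
        return -1;
    if (boundary < 0 || boundary >= NUM_BOUNDARIES)
        return -1;
    if (handlers[layer][boundary] == NULL)
        return -1;
    handlers[layer][boundary](event_type, data, len);
    return 0;
}

/* Example boundary routine: a real layer would run its state machine
   here; this one only records the last event it was given. */
int last_session_event = 0;

void SessionControlUp(int event_type, void *data, size_t len)
{
    (void)data;
    (void)len;
    last_session_event = event_type;
}
```

In this sketch a layer would call RegisterBoundary(LAYER_SESSION, CONTROL_UP, SessionControlUp) at startup, after which any layer (or the dispatcher) could call PostEvent(LAYER_SESSION, CONTROL_UP, EV_TDISCIND, NULL, 0) to deliver a disconnect indication. A table indexed by layer and boundary keeps dispatch to a single indirect call, which is one plausible way to obtain the uniform delivery the text describes.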
The framework module also provides an interval timer service. An interval timer can run for any number of seconds. Starting a timer creates a worker process which sleeps for the required interval. Once the process awakens, it sends a message to the dispatcher. The dispatcher then uses PostEvent() to inform the appropriate protocol layer.

5.3 The Request Response Protocol (RRP)

The RRP protocol specified in the previous chapter is implemented using the Threads environment and the Framework module. This section describes the RRP implementation, the problems encountered, and their solutions.

5.3.1 State Machine Implementation

Each state of the RRP protocol is implemented as one C subroutine. The parameters to these subroutines include the connection id, the connection control block, and the event. The RRP state machine is run by making a call to a state subroutine. The entry point of each state machine subroutine is held in a global array, and the appropriate state is called by using the state number as an index into this array. Inside each state machine subroutine is a C "switch" statement. There is one case for each possible event, as well as a default case for protocol errors. This arrangement parallels the RRP state table specification.

5.3.2 Connection Control Blocks

Each RRP connection is represented by two connection control blocks (one at each connected RRP entity). RRP connection control blocks (CCBs) are stored in a hash table for quick location. The hash key is the locally unique identifier associated with each connection. The contents of a CCB are shown in Figure 5.1.

NAME             DESCRIPTION
ROLE             indicates whether this RRP entity is a client, interested server, or disinterested server
ID_NUMBERS       connection identifiers and extensions
STATE            current protocol state
RETRANS_COUNTER  number of retransmission attempts
REQUEST_NO       number of last request
SEGS_RECVD       number of segments received since last credit was sent
SDATA            pointer to send data
RDATA            pointer to received data
RCREDITS         pointer to last credit PDU received
SCREDITS         number of segments to credit
HCS              highest numbered credited segment
HSR              highest numbered segment received
TIMERS           timer numbers for retransmission and idle timers
ADDR             address of connected peer

Figure 5.1: RRP Connection Control Block Contents

5.3.3 Network Access

RRP requires a datagram service to transfer PDUs. Normally this would be provided by a connectionless Transport service, but no such service was available during the implementation of RRP. Therefore, UNIX datagram sockets were used by RRP to transfer PDUs instead. UNIX datagram sockets use the User Datagram Protocol (UDP) and provide an unreliable, connectionless service. This service is very similar to that provided by connectionless Transport and is therefore suitable for use with RRP.

5.3.4 Sending and Receiving Data Segments

Receiver overrun is a serious problem when large back-to-back PDU segments are transferred. Frequently (with a 5000 byte packet size), only the first one or two segments of a batch of ten will arrive at the destination. This is an unacceptable loss rate. To alleviate this problem, a small delay is introduced between transmitted segments. This delay varies with the segment size. For a 2 kilobyte segment, the delay is approximately 40,000 microseconds between segments. This delay significantly reduces segment loss. A better, though more complicated, solution might be to base the delay on past transmission success, rather than segment size.

A PDU in the process of being transferred is queued at the CCB. Segments are numbered sequentially starting from zero. The portion of a request or response to send as a particular segment is calculated as an offset from the beginning of the data.
The offset is (SegmentSize x SegmentNumber). Copying of the segment is avoided in RRP through the use of the UNIX sendmsg call. This call takes any number of separate buffers and sends them as one packet.

Received segments are queued in a linked list in order of segment number. Duplicate segments are discovered in the queueing process, and are discarded. Once the entire request or response has been received, the segments are copied into a contiguous buffer (unless there is only one segment), and the buffer is delivered to the RRP service user.

This implementation does not restrict maximum packet lifetime in the network. Instead, a reasonable upper bound was assumed. A correct implementation needs explicit bounds on maximum packet life.

5.3.5 Worker Processes

RRP uses one worker process to read data from the network. This process executes a continuous loop reading from the network, and sending the read data to RRP through the dispatcher process. The reader uses Send() to transfer the data to the dispatcher, and the dispatcher uses PostEvent() to signal the event to RRP.

5.3.6 Initialization

RRP performs some initialization when it begins execution. First, it initializes the CCB hash table. Then it creates its address and binds a UNIX socket to that address. Next, the RRP reader worker process is created. Finally, the RRP boundary routines are registered with the framework module.

5.4 The ISO Stack

The ISO stack layers implemented include the Transport, Session, Presentation and Association Control layers (and, of course, the Remote Operations layer). These layers are implemented in the Threads environment using the interfaces provided by the Framework module described previously. The Transport layer is interfaced onto an existing TCP implementation. The interface between the Transport layer and TCP sockets conforms to the model presented in [OSITCP]. Figure 5.2 shows the ISO protocol stack implementation running over TCP.
The Transport class [X224] implemented is class 0. The Session layer [X225] Basic Combined subset is implemented, without the half duplex functional unit. Included in this service subset are the Kernel functional unit and the Duplex functional unit. The Presentation layer [X226] Kernel functional unit is implemented. The Association Control implementation is the full ACSE as defined by [X227].

Figure 5.2: ISO Protocol Implementation Over TCP

5.5 ROSE

The ROSE implementation supports Operation class 1 and Association class 1. Operation class 1 allows synchronous invokes reporting success or failure outcomes. Only one invoke may be outstanding on any connection at any one time. Other operation classes allow for asynchronous requests with various levels of error and result reporting. Association class 1 only allows the association initiator to Invoke an operation.

5.6 User Interface

The user interface makes the ROSE service available to applications. The user interface is designed in two parts: an upper half and a lower half. The upper half consists of stub routines which are called by the application. Their purpose is to package requests into a format dictated by the framework module, and Send() them to the dispatcher. The dispatcher uses PostEvent() to deliver the request to the lower half. The lower half of the user interface runs in the protocol server process.

The upper half of the user interface provides access to the ROSE routines Invoke, Result, Error and Reject. It also provides access to the ACSE, allowing an application process to establish and release application associations. Finally, it provides an operation which blocks awaiting the arrival of an invoke, connection request, or connection release.
The lower half of the user interface consists of a state machine reflecting the possible states of the application. The application process which initiates the application association is referred to as the client, and the responding process is the server. There are eight states in the state machine, four client states and four server states. The client states reflect the instances where the client is opening an association, closing an association, idle, or awaiting a response to a request. The server states reflect the instances where the server has received a Bind request but has not yet replied, has received an Unbind request but has not yet replied, is awaiting a request, or is processing a request.

Figure 5.3: Process Structure of RRP and ISO Protocol Implementation

Chapter 6

Performance Evaluation

The performance of RRP can be judged in two ways. The first performance measurement is the number of PDUs transferred to create and release connections, and to complete a request and response. RRP is efficient in this respect, having no separate connection establishment or release PDUs (the only exception is an optional RUNBIND PDU on connection release). No separate acknowledgements are required for transferred data. A selective retransmission scheme is used to avoid unnecessary retransmissions. Also to avoid unnecessary retransmissions, segments are credited in a way that allows delayed segments as much time as possible to arrive before being retransmitted. If a server application responds to a request quickly, RRP sends no more PDUs than are required to transfer the data (in the absence of failure).

The second performance measurement is the time required to create and release connections, and to transfer data. To provide a comparative benchmark, the ISO stack was implemented in (as far as possible) the same environment used for the RRP implementation. Tests were performed comparing the RRP implementation against the Session and Transport layers interfaced to TCP sockets.

The tests consist of repeated connection establishment, data transfer, and connection release. Message sizes are equal for the requests and responses, varying from one byte to 8500 bytes. The number of request/response interactions per connection varies from one to 500.

Protocol  Time (ms)
RRP       74
ISO       726

Table 6.1: Connection Establishment and Release Performance

Table 6.1 shows the time required to create and release five connections. Connection release in this test does send the optional UNBIND PDU. Table 6.2 shows the time to complete the indicated number of calls (request and result), of the indicated length, over each of five connections. The units are milliseconds of real time (elapsed time). Both the client and server processes run on SUN 3/50 computers running UNIX. The computers are connected by a 10 Mbit ethernet. The tests were performed during a time of light to moderate network activity.

The results of these tests should be viewed with caution. Inequalities exist in the implementations which affect the results. For example, retransmission and error recovery is performed in the UNIX kernel when using TCP sockets. The Session and Transport implementations do no error recovery and retransmission in user space. However, the datagram sockets used by RRP provide an unreliable service. Therefore, RRP does its own retransmission and error recovery in user space.

These tests indicate a saving in connection establishment and release time using RRP. This saving is significant for applications using a high number of short lived connections. It is also significant in interactive applications where a fast response is required. By combining the connection establishment and release times with the data transfer times, the performance improvement for short lived connections is especially apparent.
For example, creating five RRP connections, transferring a request and response of 1500 bytes over each connection, and releasing all five connections would take 314 milliseconds (74 + 240) according to the performance measurements. Doing the same with ISO protocols would take 1126 milliseconds (726 + 400), or more than three and a half times as long.

                    Number of Calls
MSGLEN  Protocol      1     10     50     100     300     500
     1  RRP         180   2000  10060   20160   60780  103440
     1  ISO         320   3140  15700   29480   81800  135920
   500  RRP         240   2200  11000   21960   66080  112800
   500  ISO         420   3440  17060   31580   90240  150380
  1000  RRP         240   2380  11960   25920   73900  122700
  1000  ISO         340   3560  17500   32540   93460  156080
  1500  RRP         240   2720  13680   27320   90000  139620
  1500  ISO         400   4000  19440   37100  105620  175240
  2000  RRP         340   2880  15000   30540   86340  155580
  2000  ISO         600   4120  19980   37820  110640  182760
  2500  RRP         660   6980  32940   72820  207620  337320
  2500  ISO         800   8020  40300   73960  208300  353980
  4000  RRP         860   7020  35160   70280  215520  353140
  4000  ISO        1500   9040  40960   77600  227060  384740
  5500  RRP        1240  11040  57780  111180  340000  555400
  5500  ISO        2200  11080  51160  100860  302200  497660
  7000  RRP        1880  15740  78280  151800  456040  753620
  7000  ISO        2420  14060  60080  119520  353040  597140
  8500  RRP        2300  19260  98540  195360  597040  999420
  8500  ISO        4400  15820  74720  146140  420980  704540

Table 6.2: Data Transfer Performance (milliseconds of elapsed time; MSGLEN in bytes)

When message sizes and transfers become large, RRP is comparatively slow transferring data. One possible explanation is that increasing message size and transfer frequency increase the possibility of receiver overrun. The actual frequency of lost packets varied from nearly zero to as much as 30 percent. This frequency fluctuated quickly and was difficult to predict. Packet loss causes increased switches between kernel and user space in the case of RRP, but not in the case of TCP.

Chapter 7

Conclusion

7.1 Summary

RRP provides protocol support for ISO remote operations.
It requires no explicit connection establishment or release PDUs. RRP segments and concatenates data too large to be transmitted by the underlying network. It uses selective retransmission to help deal with receiver overrun. RRP optionally monitors client reachability, and monitors server reachability while the client has an outstanding request. RRP is especially efficient for server applications requiring a large number of short-lived connections which transfer small amounts of data.

7.2 Contributions

Contributions of RRP are derived from each step of its development and testing. Several important features of Remote Operations protocol support are recognized by analyzing the Remote Operations model and evaluating existing protocols. For example, connection establishment must be fast, and connection maintenance should require a minimum of resources. Receiver overrun must be handled efficiently. Sliding window flow control can be inadequate for high propagation delay networks. A distinction is made between interested and disinterested servers, and the support each requires. An interested server must be notified in the case of client unreachability, but a disinterested server must not suffer the overhead accompanying this service. Providing this service at the supporting protocol level saves the server application from explicitly monitoring client reachability. This reduces the complexity of the client and server application implementations.

Using the information derived from the above examination, a model of an ideal supporting protocol is created. This model is intended to be an ideal against which other supporting protocols may be compared. RRP is designed to conform to this ideal model. Also, a unified approach to flow control and selective retransmission is used in RRP which simplifies the protocol specification and implementation.

Finally, an implementation of RRP is performed to verify the ideal model and RRP design.
The ISO supporting protocol stack is also implemented to provide a basis for RRP performance comparisons.

7.3 Further Work

At least two areas require further work in the RRP protocol. The first area is receiver overrun. A solution more efficient than imposing a fixed gap between transmitted packets is desirable. A slightly better solution might be to estimate an appropriate gap based on the frequency of retransmission requests. The second area concerns the overhead associated with the detection of client unreachability. A solution requiring fewer packet exchanges would be very desirable.

Bibliography

[RPCp] Andrew Birrell and Jay Nelson. Implementing remote procedure calls. ACM Transactions on Computer Systems, 2(1):39-59, February 1984.

[X25] CCITT. Interface Between Data Terminal Equipment (DTE) and Data Circuit-Terminating Equipment (DCE) for Terminals Operating in the Packet Mode and Connected to Public Data Networks by Dedicated Circuit. In Red Book, Volume VIII - Fascicle VIII.3, 1985.

[X21] CCITT. Interface Between Data Terminal Equipment (DTE) and Data Circuit-Terminating Equipment (DCE) for Synchronous Operation on Public Data Networks. In Red Book, Volume VIII - Fascicle VIII.3, 1985.

[X225] CCITT. Session Protocol Specification for Open Systems Interconnection for CCITT Applications. In Red Book, Volume VIII - Fascicle VIII.5, 1985.

[X224] CCITT. Transport Protocol Specification for Open Systems Interconnection for CCITT Applications. In Red Book, Volume VIII - Fascicle VIII.5, 1985.

[X227] CCITT. Association Control Protocol Specification for Open Systems Interconnection for CCITT Applications, 1987.

[X226] CCITT. Presentation Protocol Specification for Open Systems Interconnection for CCITT Applications, December 1987.

[VMTP] David Cheriton. VMTP: Versatile Message Transaction Protocol, Protocol Specification, February 1988.

[X229] CCITT and ISO. Remote Operations: Protocol Specification - Information Processing Systems - Text Communication - Remote Operations Part 2: Protocol Specification, November 1987.

[RPCt] Bruce Jay Nelson. Remote procedure call. Technical Report CSL 81-9, Palo Alto Research Center, 1981.

[OSITCP] Marshall T. Rose and Dwight E. Cass. OSI Transport Services on Top of the TCP. Computer Networks and ISDN Systems, 12(3):159-173, 1986.

[DeltaTp] Richard W. Watson. Timer-based mechanisms in reliable transport protocol connection management. Computer Networks, pages 47-56, 1981.

[DeltaTt] Richard W. Watson. Delta-T protocol specification. Technical Report UCID - 19293, Lawrence Livermore Laboratory, April 1983.
