MULTI-PROCESS STRUCTURING OF X.25 SOFTWARE

by

STEPHEN EDWARD DEERING
B.Sc., The University of British Columbia, 1973

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, Department of Computer Science

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October 1982
(c) Stephen Edward Deering, 1982

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Computer Science
The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3

Abstract

Modern communication protocols present the software designer with problems of asynchrony, real-time response, high throughput, robust exception handling, and multi-level interfacing. An operating system which provides lightweight processes and inexpensive inter-process communication offers solutions to all of these problems. This thesis examines the use of the multi-process structuring facilities of one such operating system, Verex, to implement the protocols defined by CCITT Recommendation X.25. The success of the multi-process design is confirmed by a working implementation that has linked a Verex system to the Datapac public network for over a year.

The processes which make up the Verex X.25 software are organized into layers according to the layered definition of X.25. Within the layers, some processes take the form of finite-state machines which execute the state transitions specified in the protocol definition. Matching the structure of the software to the structure of the specification results in software which is easy to program, easy to understand, and likely to be correct. Multi-process structuring can be applied with similar benefits to protocols other than X.25 and systems other than Verex.

Table of Contents

Abstract
List of Figures
Acknowledgements
1. Introduction
2. Implementation Environment
3. Client Interface
3.1 Call Handling
3.2 Data Transfer
3.3 Interrupts and Resets
3.4 Summary
4. Server Design
4.1 The Frame Layer
4.2 The Link Layer
4.3 The Packet Layer
4.4 The Transport Layer
4.5 Buffer Management
5. Evaluation
Bibliography

List of Figures

Figure 1. Message-Passing in Verex
Figure 2. Layer Structure of the X.25 Software
Figure 3. Process Structure of the Link Layer
Figure 4. Main Procedure of the Link Server
Figure 5. Process Structure of the Packet Layer

Acknowledgements

This thesis owes its existence to the perseverance and guidance of Dr. David Cheriton. I would especially like to thank David for creating such a stimulating research environment and for leading me into the field of communication software.
Many thanks are due to Pat Boyle, John Demco, Gerald Neufeld, and Stella Atkins for their helpful comments and their fine editing skills. I am grateful for the support of the U.B.C. Department of Computer Science and the Natural Sciences and Engineering Research Council of Canada.

1. Introduction

This thesis presents a design for X.25 interface software. The software is structured as a collection of processes communicating via messages. The organization of the processes reflects the architecture of the protocol layers; their internal structure is derived from state-transition descriptions of the individual layers. This design has been successfully implemented on a small timesharing system, linking it to Canada's Datapac network.

CCITT Recommendation X.25 [10] defines a multi-layer protocol for interfacing computer equipment to a packet-switched network. It has become a standard for access to public and private networks in many countries. The interconnection of those networks has made an enormous population of computers accessible at very low cost to any system which supports X.25. Unfortunately, software to support an X.25 interface places significantly greater demands on the resources and capabilities of a computer system than that required for most other peripherals. It requires greater skill on the part of the software designer to meet those demands.

Designers of software for computer communications have many choices but few guidelines for structuring and organizing their systems. The well-developed techniques for structuring sequential programs do not address many of the requirements of modern communication protocols, such as concurrency, time-critical response, efficient data movement, robust recovery from errors and exceptions, and multi-level interfacing with other software. Many mechanisms have been provided or proposed to solve each of these problems; little is known of how to select and combine these various techniques to most effectively translate the description of a protocol into a working implementation.

This thesis explores the use of processes and messages as structuring primitives for communication software. The evolving tools for describing protocols, such as the ISO Reference Model for Open Systems Interconnection [19] and finite-state specifications of protocol entities [1], are used as blueprints for an assembly of communicating processes. The resulting software is produced quickly and models closely the protocol descriptions, contributing to its understandability and maintainability.

The abstract notions of "processes" communicating via "messages" were introduced by researchers in operating systems [6], [13] and programming languages [18] to handle the complexities of real-time, parallel programming. Traditional, procedure-based operating systems have proven awkward or unsuitable for supporting the patterns of data flow, the degree of asynchrony, and the critical timing requirements of communication software. For example, efforts to support communication protocols within the classically-structured UNIX operating system [25] have forced implementors to extend UNIX with additional interprocess communication facilities [17], [24] or discard it altogether in favour of more special-purpose systems [7].
On the other hand, most so-called "real-time" operating systems provide an ad hoc assortment of synchronizing and communicating facilities such as semaphores, signals, monitors, software interrupts, etc., each designed to solve a different problem. In contrast, the message-passing process model is a simple and unifying approach to synchronization, data transfer, concurrency, and distribution of function across processors and memories.

The process and message-passing model presented by one operating system, Verex, is described in the following chapter. Verex's small number of primitive operations provides a hospitable environment for a diversity of applications, in particular, our X.25 implementation. Chapter 3 describes the interface presented by the X.25 software to client processes within Verex, either user application programs or higher-level protocol software. Chapter 4 is a discussion of the multi-process structuring methodology and a layer-by-layer analysis of its application to the Verex X.25 service; detailed knowledge of X.25 is assumed of the reader. In the final chapter, the software and its design are evaluated and its strengths and weaknesses examined in light of other X.25 implementations, other protocols, and other operating system environments.

2. Implementation Environment

The X.25 software has been implemented as part of an experimental operating system named Verex. Much of the design of that software has been influenced by the environment provided by Verex. This chapter briefly describes that environment.

Verex, a descendant of Thoth [14], is a portable operating system for small computers. The system is structured as a kernel and a collection of processes [22]. The kernel provides process creation and destruction, interprocess message-passing, and low-level device support. Processes provide the bulk of the operating system services, such as memory management, a file system, terminal support, interactive session management, and network access. Application programs execute as processes which obtain services by sending messages to system processes. With an appropriate selection of processes, Verex can be configured as a dedicated real-time system or as a multi-user, interactive programming environment.

Central to the design of Verex is the use of multiple processes to support concurrent and asynchronous behaviour. To make multi-process structuring attractive, the implementation guarantees that processes are cheap to create and destroy, that process switching is efficient, and that message-passing is fast. Short, fixed-length messages are sent directly to processes; there are no connections and no ports. Each process has a unique, global identifier and a single FIFO queue for incoming messages. The kernel provides the following message-passing primitives:

Send( message, id )
    sends the specified message to the process identified by id and suspends the sending process until a reply is received. The contents of message are changed by the reply.

id = Receive( message )
    retrieves the first message from the incoming message queue and returns in id the process identification of the sender. If the queue is empty, the receiver is suspended until a message arrives.

Reply( message, id )
    returns a reply message to the process identified by id. This enables the sender to resume execution.

Forward( message, id1, id2 )
    forwards a message received from process id1 to process id2. The effect is as if id1 had originally sent the message to id2.
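To make the request/reply pattern concrete, the following C sketch shows a trivial client and server built from these primitives. The message layout, the primitive signatures, and the names (adder_server, ADD_REQUEST) are illustrative assumptions, not the actual Verex declarations.

    /* Assumed Verex-style primitives; these signatures are illustrative only. */
    typedef unsigned process_id;
    extern process_id Receive( void *msg );               /* block until a message arrives  */
    extern void       Reply( void *msg, process_id id );  /* unblock a Send-blocked sender  */
    extern void       Send( void *msg, process_id id );   /* send and block awaiting reply  */

    #define MSG_WORDS   8        /* all Verex messages are 8 words long */
    #define ADD_REQUEST 1        /* hypothetical request code           */

    struct message { int code, a, b, result, pad[MSG_WORDS - 4]; };

    /* Server: loop forever, servicing one request per iteration. */
    void adder_server( void )
    {
        struct message msg;
        for (;;) {
            process_id client = Receive( &msg );   /* suspends if the queue is empty */
            if (msg.code == ADD_REQUEST)
                msg.result = msg.a + msg.b;
            Reply( &msg, client );                 /* resumes the blocked client     */
        }
    }

    /* Client: Send behaves like a procedure call; the reply overwrites msg. */
    int add( process_id server, int x, int y )
    {
        struct message msg = { ADD_REQUEST, x, y, 0, {0} };
        Send( &msg, server );
        return msg.result;
    }

Note that the server only ever blocks in Receive; it never Sends to its clients, which is the discipline that keeps such structures free of deadlock.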
Figure 1 illustrates the use of these message-passing primitives between processes.

Figure 1. Message-Passing in Verex
[Figure omitted: a sender's Send travels to receiver 1, which may Reply directly or Forward the message to receiver 2, which then Replies.]

In the current implementations of Verex, all messages are 8 words long. This is sufficient for the exchange of control information and small amounts of data. For the movement of larger blocks of data, the kernel provides the following two operations:

Transfer-from( id, n-bytes, remote-vec, local-vec )
    copies n-bytes of data from remote-vec in the address space of process id to local-vec in the address space of the invoking process.

Transfer-to( id, n-bytes, remote-vec, local-vec )
    similarly copies n-bytes of data to remote-vec in the address space of process id from local-vec in the address space of the invoking process.

Both of these operations are permitted only when the process identified by id is suspended awaiting reply from the invoking process. The provision of a separate mechanism for communicating large amounts of data is analogous to the DMA (direct memory access) facility of hardware interfaces or the "reference parameter" facility of procedure interfaces. It allows the interprocess communication mechanisms to be optimized for the amount of data being communicated.

The message-passing primitives described above support the remote procedure call style of interprocess communication. Sending a message and awaiting its reply is much like passing arguments to a procedure and receiving its results. However, processes that receive messages are more powerful than procedures: they can reply to the messages in any order or forward them to other processes for reply.

These message facilities are often used to establish a client/server relationship between processes. A client process requests some service by sending a message to a server process and waiting for a reply. The request message contains all necessary fields to specify the service required; the reply message contains the results of servicing the request. The client is given the simple procedure-like interface of Send, whereas the server can use the greater flexibility of a receiver to provide sophisticated scheduling and interleaving of services.

An example of a server-based facility is the Verex I/O system [12]. Processes obtain input and generate output by sending requests to file servers. A file server is any process which implements the Verex file access interface; it may provide disk storage, device access, or any other source/sink of data that can fit the file model. Clients initiate file access by requesting a file server to create an instance of a file meeting certain specifications. For example, a file server which provides spooled printer access may be requested to create an instance of a writeable file, or a disk file server may be asked to create an instance of a readable file corresponding to a previously written file. Once an instance is created, a client may read or write blocks of data via further requests to the server. When finished, the client sends a message asking the server to release the file instance.
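The pairing of a small request message with a Transfer operation on the reply-blocked requester is the usual pattern for moving blocks of data. The sketch below shows the server side of a block read, again in illustrative C: the request layout, the Transfer_to spelling, and the helper routine are assumptions made for illustration.

    /* Assumed Verex-style primitives (illustrative signatures). */
    typedef unsigned process_id;
    extern process_id Receive( void *msg );
    extern void       Reply( void *msg, process_id id );
    extern void       Transfer_to( process_id id, int n_bytes,
                                   char *remote_vec, char *local_vec );

    #define READ_BLOCK      3       /* hypothetical request codes */
    #define ILLEGAL_REQUEST (-1)
    #define BLOCK_MAX       512

    extern int next_block( char *buf, int max );   /* stand-in for the server's data source */

    struct io_request { int code; char *buf; int max; int count; };

    void serve_one_read( void )
    {
        struct io_request msg;
        char block[BLOCK_MAX];

        process_id client = Receive( &msg );        /* client is now reply-blocked on us */
        if (msg.code == READ_BLOCK) {
            int n = next_block( block, msg.max < BLOCK_MAX ? msg.max : BLOCK_MAX );
            Transfer_to( client, n, msg.buf, block );   /* legal only while the client awaits our reply */
            msg.count = n;                              /* small result travels in the reply message    */
        } else {
            msg.count = ILLEGAL_REQUEST;
        }
        Reply( &msg, client );                      /* unblock the client */
    }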
File servers do not provide character-string names for their files. Instead, file names are maintained for all file servers by a single name server which has a well-known process identifier. The name server manages a directory of names and corresponding server information. A client can send a name to the name server and receive in reply the process identifier of the appropriate file server and some server-specific file identification. This file identification can then be passed to the file server as an argument in a "create instance" request.

A standard library of I/O procedures provides a conventional byte-stream interface (Open, Close, Get-byte, Put-byte) to the block-oriented message interface. No "asynchronous" I/O interface is provided whereby a process can start a read operation, continue execution for a while, and then pick up the input. Such concurrency is easily obtained in Verex by using multiple, independently executing processes.

An important feature of the Verex environment is the support for multiple processes executing within a single address space. The set of one or more processes which reside in an address space is called a team. Each process on a team has a separate execution stack in the common space but shares code and global data with the other processes. This organization provides efficient support for programs which employ multiple identical processes or applications which require closely cooperating processes to share large amounts of data.

Within a team, each process is assigned a priority. The kernel guarantees that the execution of a process will not be preempted by any process of the same or lower priority on the same team. Thus, processes of the same priority can access shared data without danger of mutual interference. Teams as a whole also have scheduling priorities. By suitable assignment of priorities to teams and processes, real-time response to external events can be controlled and guaranteed.

Most of the Verex kernel and all of the system and application processes are written in a portable, high-level programming language named Zed [15] which is quite similar to C [21]. The X.25 software to be described is also written in Zed and is installed in a version of Verex running on a Texas Instruments 990/10 minicomputer.

3. Client Interface

The X.25 Recommendation [10] defines the interface between a host computer system and a network; it does not define the interface presented to application (client) processes within the host. The client interface depends on the nature of the host operating system and the requirements of client processes.

In the design of X.25 software for Verex, the primary goal was to provide general access to X.25 virtual circuits for a variety of application programs and higher-level protocol software. It was not to be restricted to handling, say, incoming terminal connections only. A secondary goal was to provide the X.25 service via the standard Verex file access interface rather than inventing an entirely new, customized interface. If virtual circuits could be made to look like Verex "file instances", clients could take advantage of the existing library of file access procedures to handle X.25 data traffic. Adopting the same interface as other devices and files would contribute to the device-independence of application software and to a decrease in the amount of new documentation required.

However, the Verex file access interface does not encompass all of the facilities of X.25 virtual circuits, such as "qualified" data or data-bearing interrupts. In order to satisfy the goal of a general X.25 service it would be necessary to supplement the file access interface with mechanisms to access these additional features.
A description of the resulting client interface and a discussion of the design problems encountered are presented below.

3.1 Call Handling

Access to one or more X.25 networks is provided by a single team of processes, collectively known as the X.25 server. The X.25 server supports the establishment of virtual circuits both by placing outgoing calls to the networks and by accepting incoming calls from the networks. These two methods of connection are provided to client processes through two different interfaces.

To place an outgoing call, a client process must first locate the X.25 server by sending to the name server a query message specifying the name of the desired network, e.g. "Datapac". If the X.25 server is present in the system and offering service on the named network, the reply from the name server contains the identifier of a process on the X.25 team. To that process the client then sends a "create instance" request message containing the following information:

- the "create instance" request code.
- an access code requesting both read and write access.
- the logical channel number to be used for the virtual circuit. This value is normally zero to indicate that any free logical channel will do.
- the address of a vector containing the destination DTE (host) address, source DTE address, optional facilities, and call user data. This information is formatted according to the specifications of the X.25 Call Request packet, excluding the first three octets.
- the length of the Call Request vector.

The first two items are standard for all Verex file servers; the last three are specific to the X.25 server. One possible layout of this request is sketched below.
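In this C sketch the field names and widths are illustrative; only the five items listed above are taken from the text.

    /* Hypothetical layout of the "create instance" request for an outgoing call. */
    struct x25_call_request_msg {
        unsigned code;       /* the "create instance" request code                    */
        unsigned access;     /* access code asking for both read and write            */
        unsigned channel;    /* logical channel number; normally 0 = any free channel */
        char    *call_vec;   /* destination DTE address, source DTE address, optional */
                             /* facilities and call user data, formatted as in a Call */
                             /* Request packet with its first three octets omitted    */
        unsigned call_len;   /* length of the Call Request vector in bytes            */
    };
    /* The client fills this in and Sends it to the process named by the name server. */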
This is not a problem with current applications but it remains to be seen whether raw X .25 virtual circuits would benefit f rom having names. If the C a l l Request is successful and the v i r tual c i rcui t is established, the cl ient receives a reply message containing: — a code indicating success. - 11 -— the identif ier of a server process to handle all requests pertaining to the new virtual c i rcu i t . This process may or may not be the same as the one that received the "create instance" request. — a unique v i r tual c i rcui t identif ier which is to be included in all subsequent requests. The server uses this identif ier to detect attempts by a c l ient to reference an earlier v i r tual c i rcui t which has been c leared. The C a l l Request may f a i l for many reasons, such as lack of free logical channels (either local ly or at the destination), fai lure of the network or links to the network, or invalid C a l l Request packet format . In al l cases, the reason for fai lure is returned to the cl ient as the only i tem in the reply message. True to the X.25 def ini t ion, the server does not t ime out or retry C a l l Requests; the cl ient can easily do so if desired. Incoming cal ls are received and accepted by the X.25 server. When the server is in i t ia l i zed , it is given the name of a f i le containing a program (team) for handling incoming ca l ls . Whenever a ca l l comes in , the specif ied program is invoked with the C a l l Request packet as an" argument. Based on the contents of the packet, the program can reject the c a l l , invoke or notify a particular cl ient to take responsibil ity for the c a l l , or assume the role of c l ient i tself . This interpretation of incoming C a l l Request data by software outside of the X.25 server is consistent with the handling of outgoing cal ls . It yields a convenient modularity — different configurations of the system can use different ca l l handling software with the same X.25 server — and keeps the X .25 server smaller and simpler . The expl ic i t invocation of a ca l l handling program has some advantages over the common technique of having cl ients wait for incoming calls to arr ive. No system resources such as memory or swap space are consumed by wait ing c l ients . It avoids the issue of what to do when a ca l l arrives and no cl ient is wait ing. Unfortunately , - 12 -incoming ca l l service is thus available only in systems configured to support local program invocation, a relat ively high- level operation requiring, for example, access to a f i le store. A v irtual c i rcui t is cleared when a cl ient sends a "release instance" request to the X .25 server. Any other c l ient process awaiting a reply f rom the specif ied c i rcu i t is immediately returned an indication that the c i rcui t has been c leared. The c i rcui t may also be cleared by request of the network, request of the remote host, or failure of the link to the network. Another cause for clearing is the disappearance of the owner of the virtual c i r cu i t . Ownership resides in i t ia l ly with the cl ient process that established the virtual c i rcu i t ; it may be transferred to another process at the owner's request. The continued existence of al l current owners is checked by the X .25 server whenever a new cl ient attempts to establish a virtual c i rcu i t . If an owner has disappeared, its c i rcu i t is c leared, freeing up a logical channel for other ca l ls . 
This lazy form of resource recovery is not completely successful since the X.25 server detects cal l attempts only by local c l ients . Incoming calls are turned away by the network whenever al l logical channels are in use. The X.25 server receives no not i f icat ion of such events and therefore is not provoked into checking for abandoned c i rcu i ts . This problem could be al leviated for the price of a t imer which invokes periodic channel rec lamat ion. 3.2 Data Transfer Once a virtual c i rcui t is established, clients transfer data across it using "read instance" and "write instance" requests. A "read instance" request takes the form of a message containing the request code, the v i r tual c i rcui t ident i f ier , the address of a buffer and its length. When a data packet arrives, the X .25 server transfers the data portion of the packet, preceded by a one-byte header, into the buffer . The cl ient - 13 -then receives a reply message specifying the length of .the buffer's new contents. The header byte contains two single-bit flags corresponding to the X .25 Q-bi t and M-bi t received with the incoming packet. For a "write instance" request, the cl ient must provide the address and length of a s imilar buffer containing a header byte and the data to be transmitted. The X.25 server delays the reply to a write request unti l packet - leve l f low control allows transmission of the packet. The passing of the Q and M bits as a header in the client's buffer ref lects the view that packet "qual i f icat ion" and fragmentation/reassembly are the concern of the layer of protocol above X . 2 5 . This is c learly true for the Q -b i t , but not necessarily for the M-b i t ; the X .25 server could perform automatic fragmentation/reassembly for the case where cl ient buffers are larger than packets. However, at the Verex f i le access interface, the size of the blocks exchanged between cl ient and server is chosen for the convenience of the server, not the c l ient , and the blocks are used to carry byte streams with no natural record boundaries. The goal of providing general X .25 access here confl icts with the desire to conform to the f i le access interface: the header byte provides ful l X .25 control to higher- level protocol software, but obstructs straightforward use of v i r tual c i rcuits as byte streams insofar as the cl ient must be aware of packet boundaries. <.> Many protocols impose some kind of record structure on top of communication links which are essentially byte or bit streams, for the purposes of error control and flow control . Such records also provide a convenient unit for separating control information f rom data. Unfortunately , when the application requires a simple byte st ream, it appears that an extra layer of protocol is required to transform the record structure back into a byte st ream. An example of this phenomenon was observed when attempting to provide sophisticated terminal support across the network. The Verex terminal handling processes use the byte stream f i le access interface to access local terminal - 14 -interfaces. In order to use the same processes to handle network terminals , it was necessary to interpose another process whose only function was to insert/delete a header byte of zeroes in all blocks exchanged w i t h the X .25 server, and whose only ef fect was to degrade terminal response t ime. 
3.3 Interrupts and Resets In order to exchange X.25 Interrupt packets, clients are provided with two addit ional, X .25-server -spec i f ic requests: "send interrupt" and "acknowledge interrupt". A "send interrupt" request message contains a single byte of data to accompany an outgoing Interrupt packet. A reply is returned when the network acknowledges the interrupt. Incoming interrupts are presented to the cl ient as a special reply code and single byte of data in the reply message to a "read instance" request. A f te r receiving such a reply, the cl ient must send an "acknowledge interrupt" request to enable reception of further interrupts. A problem arises with this scheme: interrupts become subject to the same flow control as data at the client/server interface. A cl ient normally exerts flow control on incoming Data packets by withholding "read instance" requests. However, when no read request is outstanding, Interrupt packets cannot be received either. The provision of a separate "receive interrupt" request was considered but rejected as an unnecessary compl icat ion for those applications that either do not require f low control or perform their own using higher- level protocols — they would require an extra process dedicated to waiting for interrupts. Instead, a mechanism for asynchronously notifying the cl ient of the Interrupt was adopted from the terminal servers' support for the "break" funct ion. A cl ient may nominate a " v i c t i m " process to be destroyed by the X.25 server when an Interrupt packet is received. The death of the v ic t im is detected by another process on the client's team (normally the creator of the v ic t im process) which can then issue a "read instance" request to pick - 15 -up the byte of interrupt data. This differs f rom dedicating a process to waiting for interrupts in that the v ic t im process can serve other purposes on the client's team and is only required when the cl ient needs asynchronous not i f icat ion . This method relies on the fact that process creation and destruction are relat ively inexpensive operations in Verex. X .25 Resets can occur as a result of temporary hardware failures of network access links or software errors in either the hosts or the network. A reset can result in the loss of an unspecified and indeterminable number of packets. To guarantee a reliable end-to-end communication path, higher- level software (a "transport service") must be used to detect and retransmit lost packets. Not i f icat ion of reset is insuff ic ient for complete recovery and when such higher- level protocols are employed, not i f icat ion of reset serves l i t t le purpose. Therefore, in the current design, the X .25 server hides resets f rom cl ients . When a reset occurs, any outstanding "write instance" or "send interrupt" request is satisf ied immediately with no indication of reset; an outstanding "read instance" request remains outstanding. The decision to hide resets contradicts the goal of providing ful l X .25 access to cl ients and may be re-evaluated in light of application requirements. The abil i ty to generate and detect resets could conceivably be used by some applications as an additional signall ing method. Also, there is at least one case where not i f icat ion of reset is suff ic ient for recovery: when the cl ient is a v i r tual terminal program, a message to the human user saying "Reset has occurred" is enough to alert the user to possible packet loss. 
The user can then perform whatever interaction is necessary to evaluate the damage and recover.

3.4 Summary

Apart from the handling of resets just described, the primary requirement of general access to full X.25 service is satisfied by the client interface. The generality of the interface causes it to fall short of the secondary goal of compatibility with the Verex file access interface. In particular, the inability to use Open to create a virtual circuit and the need for clients to be aware of packet boundaries prevent the transparent use of virtual circuits as byte streams and force X.25 dependencies on the clients. Perhaps it would be more worthwhile to cast files in the image of communication circuits, rather than the other way around. In any case, partial conformance to the file access interface does allow clients to use common library routines which provide block-level access to standard "file instances".

The interface currently supports two applications: a program which emulates an X.29 Network Interface Machine [9] to provide local terminal access to remote hosts, and a matching host-side X.29 module [27] which supports access to the local host from remote terminals. These clients influenced the interface design; future applications may contribute to its evolution.

4. Server Design

Communication software, like most other software, presents the designer with a multitude of choices: how to break it into parts, how many parts to have, how to connect the parts. Unlike most software, the parts that make up communication software often must satisfy rigorous constraints on real-time response, must support a high degree of concurrency, and must provide sufficient throughput for large volumes of data. This chapter describes and justifies the choices made in the design of X.25 software for the Verex operating system. Many of the decisions are applicable to other protocols and other operating systems.

The Verex model of multiple processes interacting via messages was designed to support concurrent and real-time behaviour, precisely the kind of behaviour required of communication software. However, it is not obvious how best to exploit those multi-process facilities to produce software which meets particular performance goals and is also correct, clear, and maintainable. The methodology for multi-process structuring developed by Cheriton [13] for the Thoth operating system and subsequent experience with other (non-protocol) software in Verex have led to the following criteria for using multiple processes:

- Logical Concurrency: A program which is required to perform two or more logically concurrent activities is often best implemented as multiple processes, each a simple sequential program, rather than as a single process with intertwined control structures. Examples are applications requiring concurrent garbage collection, producer/consumer problems, and full-duplex communication handlers. This is the class of programs for which coroutines are often suggested.

- Real Concurrency: Multiple processes can be used to achieve real concurrency, even within a single processor. For example, a file copy program, though logically sequential, can benefit from separate reader and writer processes which control overlapping operation of external devices. Recognizing parallelism in the form of multiple processes is also the first step towards exploiting multiprocessor machines and distributed systems.
- Synchronization: Programs which must handle asynchronous events, such as device handlers of all kinds, can conveniently be structured as a collection of processes, each waiting for (i.e. synchronizing with) a particular event. Where necessary, the various processes can synchronize with each other using message-passing. This is an attractive alternative to schemes involving a single process which polls for events or traps "software interrupts".

- Access Control: Access to data structures which are shared among multiple processes can be synchronized by placing the shared data under the exclusive control of a single process. Operations on the data are performed only by the managing process in response to request messages from other processes. The data are therefore safe from the dangers of concurrent access. This structure is commonly used to control access to operating system resources, where a resource is represented by a "control block" or "state descriptor" under the management of a single server process.

- Modularity: Clearly identified "services", such as those provided by an operating system (device access, directory lookup, etc.) or layers of protocol software, should be partitioned into separate processes even if none of the above criteria directly apply. Such partitioning simplifies understanding, maintenance, replacement (perhaps by hardware), sharing, and possible distribution across machines.

Concurrent processes communicating via synchronized messages are vulnerable to deadlock. Cheriton [13] discusses deadlock in the context of the Thoth operating system, which is essentially the same as Verex. He shows how deadlock can be avoided by disciplined use of the message-passing primitives, such that there are no circularities in the blocking behaviour of processes. The essential rule is that processes be ordered hierarchically, with higher-level processes using the Send primitive to pass messages downward and lower-level processes using only the non-blocking Reply to communicate upward. It is possible to violate this rule without incurring deadlock, but it then becomes much harder to demonstrate that the resulting communication structure is in fact deadlock-free.

Within the above guidelines, the number of processes should be minimized. Proliferation of processes can lead to complex and awkward interprocess communication structures as well as unnecessary overhead for process switching and message-passing. For example, several closely related shared variables are best placed under the control of one process rather than several processes, in order to reduce the amount of message-passing required to examine or manipulate them.

In addition to choosing a process structure, the designer of Verex software must also decide how to distribute the processes across address spaces. Verex allows a team of processes to share a common address space. The shared memory is initialized with code and data from a file and is the unit of swapping. The following considerations govern the use of teams:

- Code Sharing: Multiple identical processes can share code, and thus reduce memory requirements, by residing on the same team. A good use of this facility is for multiple servers which handle identical devices, such as terminals or disk drives.

- Data Sharing: If a large amount of data must be shared between several processes, placing the processes on the same team can eliminate much costly data copying.
- Swapping Control: For applications which require some non-swapping memory, such as DMA device servers, the program can sometimes be divided into two teams, one memory-resident and the other swappable, to reduce the amount of dedicated memory.

- Program Size Constraints: An application which is too large to fit into the address space of a small machine can sometimes be partitioned into multiple teams. A common example is a multi-pass compiler.

- Dynamic Reconfigurability: A program implemented as a single team can dynamically reconfigure its process structure using process creation and destruction. A program implemented as several teams can, in addition, dynamically reconfigure parts of its executable code using team creation and destruction. This is the basic method used to configure different versions of Verex, to handle changes in device availability, and to execute arbitrary programs.

The above criteria for process and team structuring are not hard and fast rules. In some circumstances, one rule contradicts another and tradeoffs must be made. For example, using a single process to control access to shared data can sometimes create a bottleneck which reduces concurrency. Sometimes considerations of modularity or reconfigurability demand the use of multiple teams for processes which must communicate large amounts of data. This list of guidelines does not eliminate design choices; it merely enumerates them.

The application of these criteria to the design of X.25 software has resulted in a three-layer structure composed of several processes on a single team plus a kernel-level device driver (Figure 2). At the bottom is the frame layer, which is responsible for data framing, check sequence calculation, and manipulation of the hardware interface to the network. The frame layer is provided by a standard device driver in the Verex kernel. In the middle is the link layer, which handles the X.25 Link Access Procedure (LAP). The link layer comprises several processes, all on the same team, some of which communicate directly with the frame layer driver. At the top is the packet layer, another group of processes on the same team as the link layer processes, which implements the packet-level procedures of X.25. This layer provides the client interface described in Chapter 3, and uses the services of the link layer to exchange packets with the network.

Figure 2. Layer Structure of the X.25 Software
[Figure omitted: the client interface at the top; within the X.25 team, packet layer processes above link layer processes; below the kernel interface, the frame layer driver inside the kernel; the hardware interface at the bottom.]

This overall structure of the X.25 software obviously parallels the lowest three layers of the ISO Reference Model for Open Systems Interconnection [19]. This is not surprising, since many of the layering principles used in the Reference Model are similar to the above criteria for multi-process structuring. They are simply principles of modularity which can be applied with equal benefit to descriptive models, protocol definitions, and process structures used to implement protocols. The Verex software conforms to the structure of X.25, and X.25 conforms to the ISO Reference Model. The detailed design of each of the three layers of software is described below. The discussion illustrates the multi-process structuring methodology, showing where the above guidelines for using processes and shared memory are applied and where they are violated. The relationship of the three-layer X.25 software to the next higher layer of the ISO Reference Model, the Transport Layer, is also discussed. The design description closes with a discussion of buffer management issues which cut across all the layers.
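As a purely illustrative rendering of Figure 2, the sketch below shows how the X.25 team might be assembled at start-up: one shared address space, several processes, and per-process priorities chosen so that the time-critical reader runs first. Create_process and its arguments are hypothetical rather than the actual Verex team interface, and the process names anticipate the detailed design in the sections that follow.

    /* Hypothetical initialization of the X.25 team (one shared address space). */
    typedef unsigned process_id;
    extern process_id Create_process( void (*entry)( void ), int priority );  /* assumed */

    enum { PRIO_LOW = 1, PRIO_MID = 2, PRIO_HIGH = 3 };    /* higher value runs first */
    #define N_CHANNELS 3

    extern void frame_reader( void ), frame_writer( void ), timer( void );
    extern void link_server( void ), packet_distributor( void ), channel_server( void );

    void x25_team_init( void )
    {
        /* Link layer: the Frame Reader gets the highest priority so that a new
           receive buffer can be posted promptly after each incoming frame.    */
        Create_process( frame_reader, PRIO_HIGH );
        Create_process( frame_writer, PRIO_MID );
        Create_process( timer,        PRIO_MID );
        Create_process( link_server,  PRIO_MID );

        /* Packet layer: one Channel Server per logical channel plus the shared
           Packet Distributor, all drawing on the team's common buffer pool.   */
        Create_process( packet_distributor, PRIO_MID );
        for (int i = 0; i < N_CHANNELS; i = i + 1)
            Create_process( channel_server, PRIO_LOW );
    }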
4.1 The Frame Layer

At the bottom, communication software meets hardware. In Verex, low-level hardware access is provided by device drivers in the kernel. Only the kernel has (or should have) the appropriate level of privilege to access I/O devices and, due to the power of some device interfaces to write into arbitrary memory locations, the kernel must control I/O to maintain its own integrity. Since the kernel creates the process abstraction and implements message-passing, device drivers inside the kernel cannot themselves make use of the structuring tools of processes and messages. Therefore, it is desirable to keep device drivers small and simple, leaving as much of the work as possible to the more structured world outside the kernel.

The smallest, simplest driver for a typical synchronous link to an X.25 network would provide a way for a process to transmit one byte of data and receive one byte of data. Such a driver would be called many times to send or receive the sequences of bytes that make up X.25 link-level frames. However, there are several aspects of serial, synchronous communication that conspire against such a simple implementation. Firstly, unlike asynchronous communication, it is necessary that the bytes be produced and consumed at a fixed rate, at least within a frame. If the transmission rate is high enough and the process-switching time is long enough, processes will be unable to keep up. A full-duplex, 9600 bits per second interface can generate an interrupt approximately every 400 microseconds, time for only 100 to 200 instructions on a typical minicomputer. The speed problem can perhaps be alleviated by buffering in the kernel, but that greatly increases the complexity of the driver. Secondly, the driver must be aware of frame boundaries to properly switch the interface between transparent data mode and inter-frame idle mode and to make use of cyclic redundancy check (CRC) hardware, if available. This requires additional process-to-kernel interaction beyond simple reading and writing of bytes. Thirdly, for some machines there are direct memory access (DMA) interfaces which read and write whole frames between memory and the communication link. This kind of hardware is essential for very high-speed links (some public X.25 networks provide subscriber access at 56 or 64 kilobits per second). It would be wasteful and, in some cases, too slow to transfer frames between the communication link and kernel space and then pass the frames a byte at a time to and from processes, rather than exchanging complete frames directly with the processes' memory.

For these reasons, a better design has the driver deal with complete frames instead of single bytes, even though this may take more code in the kernel. Processes request the driver to read and write a frame as a unit, leaving the driver responsible for frame delimiting, transparency of data within the frame, and CRC generation/validation.
Adopting such a frame-at-a-time driver interface solves the above problems and provides some additional benefits. If DMA hardware becomes available to replace byte-interrupt hardware on a particular machine, only the driver need be changed: the processes and the process-to-kernel interface remain the same. The same is true if alternative framing, transparency, or checksumming mechanisms are employed; even an asynchronous link could be used to carry the frames. The X.25 processes are insulated from device- and processor-specific idiosyncrasies; they manipulate only the virtual, "ideal" device provided by the kernel and can therefore be transported unchanged from one computer to another.

The hardware used for the first implementation of this driver is far from ideal: the only synchronous communication interface available for the TI 990/10 has neither DMA nor CRC capabilities and, much worse, does not support the "bit-stuffing" required for X.25 framing and transparency. Fortunately, the Datapac network, in addition to bit-oriented X.25 framing, supports a byte-oriented framing structure which is compatible with our hardware [8]. Our access to this service is via a 1200 bits per second leased line. CRC calculations are done in software, using a fast table-lookup algorithm [29]. In the absence of DMA support, an effort is made to reduce processor overhead: only one pass is made over each frame on its way between process memory and I/O device; CRC updating and transparency are handled on the fly with no intermediate buffering in kernel address space. This "pseudo-DMA" strategy shares a drawback with real DMA: the processes providing or consuming the frames must be locked into memory for the duration of the transfer. Since frames can arrive, unsolicited, at any time, the receiving process must be permanently memory-resident.

There are real-time constraints both within frames and between frames. Within a frame, interrupts must be serviced at the fixed rate of the link to avoid character loss on input or garbage fill on output. Adequate response is guaranteed by placing the synchronous I/O device at a high interrupt priority. The Verex kernel, including its device drivers, is designed to disable interrupts only for very short, bounded periods of time, thus ensuring that interrupts from high-priority devices can always be serviced within a small, fixed time of their occurrence. The constraint on response between frames applies to reception only; the transmitter can idle between frames. Since receive buffers are provided by a process, that process must be ready with a new buffer very soon after reception of a frame. This requirement is met by assigning a high scheduling priority to the receiving process, so that it executes immediately after being unblocked by the driver. This is sufficient for the current implementation, where the byte-oriented framing sequence guarantees 4 or 5 byte times between incoming frames; it may not be adequate for handling a high-speed link with normal X.25 framing, where frames can arrive separated by only a single flag character. In that case, it would be necessary to either provide buffering in the kernel, with consequent extra overhead, or allow the receiving process to supply more than one buffer at a time for sequences of frames, at the cost of increased complexity at the process-to-kernel interface.
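The table-lookup CRC routine cited above [29] is not reproduced here; for orientation, a conventional byte-at-a-time computation of the HDLC frame check sequence looks roughly like the following sketch. The polynomial, initial value, and check constant are those of the standard CCITT FCS and may differ in detail from the code actually used.

    /* Table-driven CRC-CCITT (the HDLC/X.25 frame check sequence), reflected form. */
    static unsigned short fcs_table[256];

    void init_fcs_table( void )
    {
        for (unsigned v = 0; v < 256; v = v + 1) {
            unsigned short crc = (unsigned short)v;
            for (int i = 0; i < 8; i = i + 1)
                crc = (crc & 1) ? (unsigned short)((crc >> 1) ^ 0x8408)   /* 0x1021 reflected */
                                : (unsigned short)(crc >> 1);
            fcs_table[v] = crc;
        }
    }

    /* One pass over 'len' bytes; start from 0xFFFF.  A frame checked together with
       its received FCS yields the constant 0xF0B8 when it is error-free.          */
    unsigned short fcs_update( unsigned short fcs, const unsigned char *buf, int len )
    {
        while (len-- > 0)
            fcs = (unsigned short)((fcs >> 8) ^ fcs_table[(fcs ^ *buf++) & 0xFF]);
        return fcs;
    }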
Although the driver is part of the kernel, to the X.25 processes it looks and acts exactly like another process. Requests for reading and writing of frames take the form of messages to a special process identifier which is recognized by the kernel. Disguising the driver as a process has significant advantages for development, debugging, and performance monitoring: it is easy to substitute a real process for the driver or insert a process transparently between the driver and its clients. The format and semantics of the messages to the driver conform to the Verex file access interface [12], making the driver look like a standard file server. Scotton [27] points out the benefits of a uniform interprocess interface for all I/O-style data transfer: I/O applications are portable over different devices, processes can share a common library of I/O routines, and when used by components of layered communication protocols it allows simple stacking of layers and interchanging of components. The X.25 frame layer driver is thus designed to fit in as the bottom-most component in a hierarchy of protocol servers.

4.2 The Link Layer

The link layer comprises four processes on the X.25 team, as illustrated in Figure 3. The Link Server process maintains the state of the link according to the X.25 Link Access Procedure (LAP). It alters the state in response to request messages from the packet layer and notification messages from three "worker" processes, the Frame Reader, Frame Writer, and Timer. The Frame Reader requests incoming frames from the frame layer and notifies the Link Server whenever one arrives. The Frame Writer requests the frame layer to transmit frames and notifies the Link Server whenever one has been sent. The Timer delays for fixed intervals by sending requests to the system's Time Server (on another team) and notifies the Link Server whenever the time interval elapses.

Figure 3. Process Structure of the Link Layer
[Figure omitted: the Link Server receives read and write requests from the packet layer and notifications from the Frame Reader (frame received), the Frame Writer (frame sent), and the Timer (timeout); the Frame Reader and Frame Writer issue read and write requests to the frame layer, and the Timer issues delay requests to the Time Server.]

This process structure is typical of Verex server teams; it arises from application of the guidelines for multi-process structuring. The asynchronous, externally-generated events of frame arrival, frame departure, and timer expiration are each assigned a process which waits for the event's occurrence. The variables which represent the link state are affected by each of the events, and therefore are placed under the control of a single process, the Link Server, which updates them on behalf of the other processes. The Link Server itself never waits for specific events and is therefore always ready to receive requests from other processes; if the handling of a request must wait for another event to occur, the server internally queues the request and immediately becomes ready to accept the next request. A high frame arrival rate is handled by assigning the highest scheduling priority to the Frame Reader and guaranteeing quick service from the Link Server. A sketch of the Frame Reader's loop is given below.
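As with the earlier examples, the request codes, message fields, and process identifiers in this sketch are assumptions; only the shape of the loop (request a frame from the driver, notify the Link Server, pick up a fresh buffer from the reply) comes from the description in this chapter.

    /* Hypothetical sketch of the Frame Reader worker process. */
    typedef unsigned process_id;
    extern process_id Receive( void *msg );
    extern void       Send( void *msg, process_id id );

    extern process_id frame_driver_id;      /* the kernel driver, addressed as a process */
    extern process_id link_server_id;
    extern char      *first_buffer( void ); /* obtain a buffer from the team's pool      */

    #define FRAME_MAX 262                   /* illustrative maximum frame size           */
    enum { READ_FRAME = 1, FRAME_IN = 2 };  /* hypothetical request codes                */

    struct io_msg { int code; char *buf; int max; int count; };

    void frame_reader( void )
    {
        char *buf = first_buffer();
        for (;;) {
            struct io_msg rq = { READ_FRAME, buf, FRAME_MAX, 0 };
            Send( &rq, frame_driver_id );        /* blocks until a frame has arrived     */

            struct io_msg note = { FRAME_IN, buf, 0, rq.count };
            Send( &note, link_server_id );       /* notify; the Link Server now owns buf */
            buf = note.buf;                      /* the reply carries a fresh buffer     */
        }
    }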
It is natural and useful to view the Link Server as a f in i te -s tate machine (FSM), performing state transitions in response to events in the form of received messages, and causing external effects by replying to messages. State transit ion diagrams have become a standard tool for describing communication protocols. A correct implementation of a protocol can be produced quickly and easily by casting its state diagram in the form of an executing F S M . Unfortunately , the state diagrams specif ied in the X .25 definit ion [10] are not suitable for direct implementat ion. F i rs t l y , they are provided for the packet - leve l only — the l ink - leve l is defined only in English prose which is open to ambiguous interpretat ion. In fac t , the implementors of Datapac found it necessary to produce an additional document [26] (also in English prose) to correct many ambiguities and omissions in the X .25 def in i t ion. Secondly, the packet - level diagrams refer to the state of a v irtual c i rcu i t between two machines, whereas the implementor needs a state diagram for the entity at one end of the c i rcu i t only. This dist inct ion is discussed by Vuong and Cowen [28], who show how to derive separate state diagrams for the two ends, given a combined state diagram. Thirdly, the X .25 state diagrams do not include such state components as - 28 -sequence counters and packet queues, which are of great importance to the implementor. Therefore, it was necessary to derive the FSM structure of the Link Server f rom the prose description of X .25 L A P . The L A P definit ion clearly distinguishes transmitter (primary) functions f rom receiver (secondary) functions. Once the link is established, these two sets of functions proceed relat ively independently and therefore result in simpler state transition graphs if implemented as two separate FSMs. Since L A P defines a ful l -duplex l ink, logical ly the two FSMs execute concurrently . It is tempting to use two processes for the the two state machines — a Primary Server and a Secondary Server — but in this case there is l i t t le benefit in doing so. No real concurrency would be obtained by separating the two, since both would execute on a single processor and neither would ever block in the middle of a state transit ion. (Overlap of actual transmission and reception is provided by the separate Frame Writer and Frame Reader processes.) Moreover, considerable message t ra f f i c would be required between the two servers to support the "piggybacking" of secondary responses on primary f rames, to coordinate shared use of the single I/O device, and to synchronize in the case of l ink disconnection. Therefore, to avoid the extra message and process-switch overhead of two servers, both primary and secondary functions are implemented within the single Link Server. However, the dist inction between the two FSMs is retained in the control structure of the single process to take advantage of their simpler state transition graphs. Figure 4 shows the top - leve l ' control structure of the Link Server. It follows a standard pattern for Verex servers. A f te r al locating an in i t ia l ized state vector, state, the procedure executes an endless loop. A t the top of the loop, it waits to receive a message. The message format is defined by the template R E Q U E S T to contain an identifying request code plus request -specif ic f ie lds. 
(All requests except TIMEOUT have a single extra field containing a pointer to a frame (packet) buffer; TIMEOUT requests have no extra fields.) Based on the request code, the Link Server invokes a corresponding event handling routine which updates the state as required and returns either a reply code to be placed in the reply message or an indication that no reply is to be sent. This loop is the only place where the Link Server blocks doing a Receive; the non-blocking Reply is called within some of the subroutines to respond to those requesting processes which do not get an immediate reply.

    Link_server( device )
        template REQUEST, REPLY;
        unsigned id, reply;
        word state[], msg[MSG_SIZE];
    {
        state = Initialize( device );
        repeat
        {
            id = Receive( msg );
            select ( REQUEST_CODE[ msg ] )
            {
            case READ_BUF:   reply = Read_buf( state, msg, id );
            case WRITE_BUF:  reply = Write_buf( state, msg, id );
            case FRAME_IN:   reply = Frame_in( state, msg, id );
            case FRAME_OUT:  reply = Frame_out( state, msg, id );
            case TIMEOUT:    reply = Timeout( state, msg, id );
            default:         reply = ILLEGAL_REQUEST;
            }
            if ( reply == NO_REPLY )
                next;
            REPLY_CODE[ msg ] = reply;
            Reply( msg, id );
        }
    }

    Figure 4. Main Procedure of the Link Server

Write_buf and Timeout perform LAP primary functions, concerned with data transmission. Read_buf performs LAP secondary functions, concerned with data reception. Frame_in and Frame_out are demultiplexing routines that invoke either primary or secondary routines, depending on the frame type. In some cases, primary routines invoke secondary routines, and vice versa, to handle such matters as piggybacked responses in data frames and resetting of all state information on link disconnection. Therefore, all the routines receive as an argument the same state vector, which contains both primary and secondary state variables as well as such shared state as the transmit queue. This state machine structure sacrifices some readability (most of the control flow is implicit in the sequence of values taken on by the state variables rather than explicit in the sequence of statements) but avoids the use of exception trapping mechanisms, "long jumps", or convoluted code to handle exceptions to the "normal" flow of control. In a full-duplex protocol for an error-prone circuit, it is difficult and perhaps meaningless to identify the normal flow of control: exceptions are the rule.

The three worker processes are much simpler than the Link Server: each executes a single loop in which it waits for its particular event and then notifies the Link Server. The Link Server controls their execution by choosing when to reply to the notification messages. In the case of the Frame Reader and Frame Writer, the reply from the Link Server contains a pointer to a new buffer to be filled or emptied.

The Link Server implementation violates one of the main principles of protocol layering: it uses knowledge of higher layer behaviour. Packet layer behaviour influenced the following design decisions:

- The Link Server never transmits a Receive Ready (RR) frame in response to an incoming Information (I) frame unless the I frame has the poll bit on. Instead, the Link Server waits to piggyback the acknowledgement on the next outgoing I frame. The success of this strategy depends on the fact that at the packet layer most incoming packets generate outgoing packets in response, and therefore there is usually an I frame available for the piggyback. When there is not, the remote end eventually times out and retransmits its I frame with the poll bit on, prompting an RR frame from the Link Server.
The success of this strategy depends on the fact that at the packet layer most incoming packets generate outgoing packets in response, and therefore there is usually an I frame available for the piggyback. When there is not, the remote end eventually times out and retransmits its I frame with the poll bit on, prompting an RR frame from the Link Server. As a - 31 -result , the number of transmitted frames is cut almost in half , reducing processor overhead and increasing potential throughput. Retransmissions on timeout occur only when the link is otherwise idle; they do not represent delayed reception since the original frame is usually received successfully and the repeated frame discarded. This same technique could be applied to higher levels as we l l : if the packet layer was always used beneath a higher- level acknowledging protocol , packet layer RRs could be suppressed, and so on up the layers. — The link layer processes share a common buffer pool with the packet layer processes on the same team. The process holding a pointer to a buffer assumes ful l rights to modify or discard the buffer . A more complex buffer sharing arrangement would have to be imposed, or sharing el iminated altogether, if the link layer could not make assumptions about the buffer handling behaviour of the packet layer. For example, if the packet layer required a retransmission queue as does the link layer, a buffer might need to reside on both queues at once and could not be uni lateral ly discarded by one process. — The link layer does not impose any flow control on incoming frames, i.e., it never sends Receive Not Ready (RNR) frames and it always rotates its receive window when an I frame is successfully received. Instead, it depends on the flow control imposed by the packet level windowing mechanism to prevent overf low. This s impl i f ies the Link 5erver by reducing the number of states (no need for a "not ready" state) and el iminating al l the tests required to determine when buffers are exhausted and when they become avai lable. The gains in s impl ic i ty and performance resulting f rom these dependencies outweigh the disadvantages of t ight binding between the two layers: we do not anticipate using any protocol other than the X .25 packet layer on top of this link layer. - 32 -The link layer implements X .25 L A P as defined for Datapac in [8]. It deviates f rom that definit ion in some aspects. As mentioned above, R R frames are only i generated in response to polls, and R N R frames are never sent. Furthermore, received R N R s are treated as if they were R R s . This s impl i f icat ion could result in a greater number of discarded frames, and subsequent retransmissions, than if R N R s were honoured. However, we have never actually received an R N R over our link to Datapac, and if the other end was another instance of our link layer, we would be guaranteed never to see an R N R ! Another l iberty was taken with the L A P definit ion to permit two instances of the l ink layer to communicate without an intervening network. X .25 is defined as an asymmetr ic interface for connecting a host (DTE) to a network node (DCE) . This asymmetry is present in L A P only in the different address values used by the D T E and D C E in frame headers; it guards against accidental loopback of the communication l ink. However, a loopback capabil i ty is valuable for development and debugging. A symmetr ic interface allows transparent loopback as well as back- to -back connection of two DTEs . 
To obtain these benefits, a technique described by Bradbury [5] was adopted, whereby the link layer dynamically adapts to DTE or DCE mode, depending on the actions of the communicating partner. Basically, the method takes advantage of the fact that LAP commands and responses are encoded with different values, making the address fields redundant. The redundant information is enough to identify the role (DTE or DCE) that the remote link layer is playing.

It would be easy and sensible to modify the Link Server to conform to the Balanced Link Access Procedure (LAPB) of the more recent, revised X.25 recommendation [11] as it becomes available on Datapac and other networks. Unfortunately, LAPB has more asymmetries than LAP (in spite of its name) and, as Bradbury points out, the heuristics for dynamic adaptation to DTE or DCE mode are considerably more complex.

4.3 The Packet Layer

The packet layer of the X.25 team is made up of one Channel Server process for each logical channel and one shared worker process, the Packet Distributor. Figure 5 illustrates the relationship between these processes and the layers above and below for a three-channel X.25 service. Each Channel Server maintains the state of its own logical channel, responding to packet arrivals from the Packet Distributor and I/O requests from higher-level clients. To transmit packets, the Channel Servers send write request messages to the Link Server in the link layer. The Packet Distributor spends most of its time awaiting a reply to a read request to the Link Server. When it receives a packet, it determines which Channel Server to notify based on the logical channel identifier field in the packet header.

Packets with an invalid or zero channel number are handled specially by the Packet Distributor. For example, the Packet Distributor responds to a restart indication packet on channel zero by sending a restart confirmation packet back to the Link Server and notifying the highest-numbered Channel Server. The notification is passed down the line of Channel Servers using the Forward message primitive. The same forwarding technique is used to allocate logical channels: the highest-numbered Channel Server receives all client requests to establish a new virtual circuit and forwards them along to the first free Channel Server, much like a telephone "hunt group"; a sketch of this forwarding appears below.

[Figure 5, reproduced here only in outline: the three Channel Servers, chained from Channel Server 3 down to Channel Server 1, receive I/O requests from client processes above and packet-received notifications from the shared Packet Distributor below; the Channel Servers issue write requests, and the Packet Distributor issues read and write requests, to the link layer.]

Figure 5. Process Structure of the Packet Layer

The packet layer functions could have been incorporated into the lower-level Link Server process; separate processes were chosen in order to make very clear the distinction between the two protocol layers. It would be hard to keep this distinction visible within the state machine structure of the Link Server. The clear separation eases the task of programming from the protocol descriptions, greatly enhances understanding by others, and facilitates the replacement of a layer (for example, the replacement of the link layer by a hardware implementation [30]).
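The channel-allocation forwarding mentioned above can be sketched as follows. This is a minimal C rendering rather than the Zed source: Receive, Reply, and Forward are the Verex primitives named in the text, but their exact signatures and the helper routines used here are assumptions made for illustration.

    typedef unsigned pid;                         /* process identifier              */

    extern pid  Receive(void *msg);               /* block for the next request      */
    extern void Reply(void *msg, pid id);         /* unblock a waiting requester     */
    extern void Forward(void *msg, pid id, pid to);  /* re-aim a request at another
                                                        server on the requester's
                                                        behalf                        */

    extern int  channel_is_free(void);            /* hypothetical helpers            */
    extern void begin_call_setup(void *msg, pid client);
    extern void set_reply_code(void *msg, unsigned code);

    #define NO_FREE_CHANNEL  0xFF                 /* illustrative failure code       */

    /* Invoked by a Channel Server when a client asks to open a virtual
       circuit.  next_server is the next lower-numbered Channel Server,
       or 0 if this server is the lowest-numbered one. */
    void Call_request(void *msg, pid client, pid next_server)
    {
        if (channel_is_free()) {
            begin_call_setup(msg, client);        /* this channel takes the call     */
        } else if (next_server != 0) {
            Forward(msg, client, next_server);    /* pass the client down the line   */
        } else {
            set_reply_code(msg, NO_FREE_CHANNEL);
            Reply(msg, client);                   /* every channel is in use         */
        }
    }

The attraction of Forward in this arrangement is that the client remains blocked on a single request throughout; it neither knows nor cares which Channel Server eventually replies.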
The cost of this separation is the overhead of message-passing and process-switching between the layers, a relat ively small cost in the Verex environment. - 35 -Internally, the Channel Servers have the same f inite state machine structure as the Link Server. Identifying the states and valid transitions was made easier by the state transit ion diagrams provided for the packet layer in the X .25 spec i f icat ion . The FSM for each logical channel is embodied in a separate process in order to exploit possible concurrency. Because of swapping, the exchange of data packets with cl ients in other address spaces using the Transfer-to and Transfer-from pr imit ives can sometimes involve disk operations. Providing separate Channel Servers allows packet t ra f f i c on one logical channel to overlap disk t ra f f i c for another. The Channel Servers are all created at system in i t ia l izat ion t ime, rather than • dynamically in response to v irtual c i rcu i t requests. Dynamic creation and destruction of Channel Servers would complicate the structure of the layer: a Channel Server Manager would be required to handle the coming and going of Channel Servers and the communication between Channel Servers would have to take process creation and destruction into account. Since the maximum number of logical channels is f ixed at network subscription t ime, and since the X.25 service must have access to enough resources to support al l of the channels in use at the same t ime, it is reasonable and convenient to stat ica l ly allocate the maximum number of Channel Servers. The resources consumed by each additional Channel Server are stack space on the X .25 team (300 bytes in the current implementation), six fu l l - s i ze packet buffers in the team's buffer pool (267 bytes each for Datapac), and a process descriptor block (currently 56 bytes) in the kernel . Af ter taking into account the other demands on the team's address space, these requirements l imi t the number of logical channels that can be supported in the current implementation to about 24, more than enough for such a small machine. The use of mult iple Channel Servers necessitates the demultiplexing function of the Packet Distr ibutor . This function could have been performed within the Link Server, but since it is c learly a packet - layer funct ion, the above argument for layer - 36 -separation applies. Moreover, the Packet Distr ibutor process plays an important role as a relayer of messages f rom the Link Server to the Channel Servers. Without an •) intermediate process, message-passing between the two layers of servers could take i one of two forms: — The Link Server could send packet arr ival messages direct ly to the Channel Servers. Since write request messages are sent in the other d i rect ion, this would create great potential for deadlock. — The Channel Servers could send read request messages to the Link Server. Since packets can take indefinitely long to arr ive, this would cause the Channel Servers to block indefinitely wait ing for replies, rendering them deaf to other requests f rom cl ients . In part icular , a Channel Server could not service a write request for a cl ient while awaiting an incoming packet. Therefore, the Packet Distr ibutor allows ful l -duplex communicat ion between the two layers of servers, while providing deadlock immunity . 
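The division of labour just described is easiest to see from the Packet Distributor's side. The following is a rough C sketch of its main loop; Send is assumed to be the blocking Verex request primitive (the sender is suspended until the receiver replies), and the message layout and helper routines are illustrative assumptions rather than the thesis's Zed source.

    typedef unsigned pid;

    extern void Send(void *msg, pid to);          /* blocks until the receiver replies   */

    extern pid      link_server;                  /* the link layer server               */
    extern pid      channel_server[];             /* indexed by logical channel number,
                                                     1 .. num_channels                    */
    extern unsigned num_channels;

    extern void     make_read_request(void *msg);        /* hypothetical helpers          */
    extern unsigned logical_channel_of(void *msg);       /* from the packet header        */
    extern int      is_restart_indication(void *msg);
    extern void     send_restart_confirmation(pid link);
    extern void     make_packet_notification(void *msg);

    void Packet_distributor(void)
    {
        unsigned msg[8];                          /* fixed-size message (size assumed)    */

        for (;;) {
            make_read_request(msg);
            Send(msg, link_server);               /* wait here for the next packet        */

            unsigned lcn = logical_channel_of(msg);

            if (lcn >= 1 && lcn <= num_channels) {
                make_packet_notification(msg);
                Send(msg, channel_server[lcn]);   /* blocks until that Channel Server
                                                     accepts the notification             */
            } else {
                /* Channel zero and invalid channels are handled specially;
                   a restart indication, for example, is confirmed and the
                   news is passed to the highest-numbered Channel Server,
                   which forwards it down the line. */
                if (is_restart_indication(msg))
                    send_restart_confirmation(link_server);
                make_packet_notification(msg);
                Send(msg, channel_server[num_channels]);
            }
        }
    }

Because both interactions are blocking requests, the Packet Distributor itself never needs to queue packets; the corresponding drawback, discussed later, is that it stalls whenever the target Channel Server is busy.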
The need for another worker, complementary to the Packet Distr ibutor , to wait on write requests to the Link Server is obviated by having the Link Server reply immediately to write requests, queueing the outbound packets for eventual transmission. From the point of view of the Channel Servers, transmission is instantaneous. Unfortunately , this reduces the opportunities for piggybacking of acknowledgements: when a Channel Server has an acknowledgement to send, there is never a data packet waiting to go out; instead, they queue up, out of reach, in the link layer. A lso , unlike the link layer, the packet layer cannot simply withhold al l acknowledgement packets and depend on retransmissions and higher- level t r a f f i c , because the packet layer does not retransmit and there are many possible higher levels. However, one simple scheme is employed to provide occasional piggybacking: an acknowledgement for an incoming data packet is withheld unti l the next I/O - 37 -request is received f rom the c l ient ; if the request is a wr i te , the acknowledgement is piggybacked on the outbound packet; if i t is a read, the acknowledgement is sent by i tself . This t r ick works well if the client's application is half -duplex in nature, such as a command/response terminal session or remote procedure invocation. It does not help at all for applications that receive many packets before generating any response, such as bulk f i le transfers. There are signif icant advantages to placing the packet layer processes on the same team as the link layer. Most important is the reduction of expensive data copying between address spaces. As indicated in the preceding discussion, the two layers share a common pool of packet buffers and simply exchange pointers to buffers; data is only copied once to or f rom the cl ients' address spaces and once in or out by the frame level driver. The two layers also benefit f rom shared code: processes in both layers use a common set of procedures for language support, buffer pool management, queue manipulation, and modulo ar i thmet ic . This sharing can be extended to multiple X.25 links by repl icat ing both layers within the single team, subject to address space l imitat ions. The implementation of the packet layer conforms to the 1976 version of X .25 as specif ied in [8]. There is no support for permanent v irtual c i rcui ts nor for the more recently defined datagram or "fast select" fac i l i t ies [11],,although any of these could easily be added if desired. L ike the link layer, the packet layer processes never transmit R N R packets and incoming R N R s are treated l ike R R s . Reject (REJ) packets are neither generated nor accepted, as specif ied for Datapac. Consideration of Bradbury's technique for DTE/DCE adaptation at the packet layer [5] reveals a problem with the multiple Channel Server structure: the method requires knowledge of the current state and history of each logical channel, knowledge that is distributed among the separate Channel Servers. This distribution of state would also be an obstacle to the implementation of a datagram-based - 38 -internetwork protocol , such as IP [23] or Pup [3], on top of X .25 v i r tual c i rcu i ts : the heuristics for deciding when to establish and when to tear down virtual c ircuits depend on the number of circuits in use, their destinations, and demand by competing c l ients . 
It would require much message t ra f f i c to co l lect that information f rom the distributed Channel Servers, and it would be impossible to obtain a consistent "snapshot" of the total state, due to their independent execution. Reca l l that the motivation for separate Channel Servers was to exploit concurrency between packet t raf f ic and disk transfers with swapped c l ients . Unfortunately , the behaviour of the Packet Distr ibutor can compromise this concurrency: if i t sends an incoming packet not i f icat ion to a Channel Server which is blocked awaiting a disk transfer, the Packet Distr ibutor must also wait , ef fect ive ly cutt ing off incoming t ra f f i c for other channels. Furthermore, if the X .25 communication link is slow compared to the disk transfer t ime — the usual case — or if the probabil ity of a cl ient being swapped out is low, any performance improvement due to concurrent execution of Channel Servers would be negligible. In retrospect, it appears preferable to combine the Channel Servers into a single process. This would enable the use of algorithms that depend on global packet layer state. As a minor benefit , it would also raise the cei l ing on number of logical channels, since a separate execution stack would not be required for each channel. If the X .25 team were to be used in a heavy swapping environment, concurrency could s t i l l be exploited by employing a smal l number of worker processes to perform the actual transfers across address spaces. - 39 -4.4 The Transport Layer X . 2 5 , and our implementation of X . 2 5 , fu l f i ls the requirements of the bottom three layers of the seven layer ISO Reference Model for Open Systems Interconnection. Within that model, the X .25 service would be expected to support layer four, the transport layer. Many of the functions required of the transport layer are s imilar in nature to those of the packet and link layers, and therefore the same mult i -process structuring methodology can successfully be applied to the design of transport layer software. Since the packet and link layer software share the same address space to avoid extra data copying, it is tempting to incorporate the transport layer into the X .25 team for the same reason. However, there are several arguments against doing so: — There is presently no widely accepted transport layer protocol . The carriers that provide X .25 networks are not currently compelled to adopt a standard transport protocol , since transport is an end-to-end protocol of concern only to software in the customers' computers. This may change as carriers start providing higher- level services such as inter -machine electronic ma i l . — For some applications, a transport layer is unnecessary or unwanted. For example, the X .29 remote terminal protocol [9] is defined for use direct ly on top of X . 2 5 . Since X.25 is a useful stand-alone service it is best packaged independently of any higher layer protocols, able to support a number of applications and communication architectures. — One of the major purposes of the transport layer is to provide application processes with a network-independent interface to mult iple networks wi th , perhaps, different low- level protocols. A transport layer team that is separate and independent of any lower - level protocol software would best accommodate a variety of networks. - 40 -— The buffer handling requirements of a transport service are unlikely to be compatible wi th those of the X .25 team. 
For example, the transport protocol might require its own retransmission queue to obtain rel iable end-to-end data transfer, necessitating a more complex buffer sharing regime. The buffer size might have to be increased if the transport service provides fragmentation/reassembly of large data blocks. The complications introduced by these incompatibi l i t ies are easily avoided by copying data between transport layer buffers and X .25 buffers. The cost of copying across address spaces is, one hopes, not much greater than copying within an address space. In general, protocol layers should be separated into different teams or spaces unless they are components of a coherent protocol " fami ly" , unlikely to be used individually. This is the case with the packet and link layers of X . 2 5 . The X .25 fami ly does not (yet) include a transport layer protocol . 4.5 Buffer Management To minimize data copying, all the processes on the X .25 team share a common pool of packet buffers. Any block of data is copied only once into the team's space and once out; the X .25 processes simply exchange pointers to buffers as the data makes its way through the layers. The buffer pool is implemented as a linked list of free buffers, pointed to by a global variable. The processes obtain and discard buffers as needed by di rect ly manipulating the free buffer l i s t . Concurrent access by multiple processes raises the possibility that the l ist may become corrupted by unsynchronized modif icat ions. This is avoided by taking advantage of a property of Verex process scheduling cal led relative i n d i v i s i b i l i t y : a process executes indivisibly with respect to other processes of the same or lower priority on the same team unti l i t blocks. In other words, within a team, a process can be preempted only by higher priority processes. - 41 -On the X.25 team, only the Frame Reader and Frame Writer have higher priority than other processes, in order to service the link device promptly . By arranging that those two processes never touch the free buffer l is t , al l the others are free to manipulate it without danger of mutual interference. Relat ive indivisibi l i ty provides a second mechanism for mutual exclusion, in addition to the use of message-passing. However, its benefits are minor — it avoids the overhead of message-passing for synchronization — and its use can always be replaced by a suitable process and message structure. For example, the X .25 free buffer l ist could be placed under the control of a single process which alone manipulates the l ist in response to request messages from the other processes. The position of the Link Server in the process structure of the team makes it a suitable candidate to provide this extra serv ice. A l l the buffers are the same s ize : large enough to handle a maximum-s ize f rame. Unfortunately , much of the t raf f ic consists of tiny control messages, such as link layer and packet layer acknowledgements. More ef f ic ient use of buffer memory could be made by supporting two sizes of buffers — large and smal l — or using buffer fragments that are chained together to hold large frames. As usual, such increased space ef f ic iency would be bought with increased execution t ime: more processing would have to be done to select buffer s izes, fol low chain pointers, e tc . Enough buffers are created at compile t ime to handle the observed worst-case demand. If ever the buffer pool is exhausted, the X.25 team halts execution, destroying al l connections. 
It must then be recompiled with more buffers. Obviously, it would be preferable to use fewer buffers and to survive the occasional shortage during periods of unusual demand. However, when a process finds that the buffer pool is empty it is not suff ic ient for it to just wait unti l another process frees a buffer — the other processes may require t ra f f i c (such as acknowledgement packets) from the blocked process before they wi l l free any buffers. To avoid such deadlock, - 42 -it would be necessary to free up some inessential buffers. Since buffers are distributed among separate processes, considerable synchronization and communication would be required to identify the disposable buffers and to transfer them to the needy process(es). This problem has not been successfully addressed in the current design. The only virtue of the straightforward scheme adopted is its simplicity as reflected in the ease of programming, speed of execution, and small size of the Link and Channel Servers — there is no code to handle buffer shortages. - 43 -5. Evaluation The X .25 software as described was implemented by the author par t - t ime over a period of four months and represents approximately two months of e f for t . Much of the development t ime was spent on the Frame Layer driver because of d i f f icul t ies with the available communication hardware. The packet layer was programmed after the link layer and took about half as long, with the benefit of experience. The author applied the same structuring methodology to the somewhat simpler X .29 protocol to produce a v irtual terminal program in l i t t le more than a weekend. (It was needed to test the X.25 software!) The X.25 packet layer comprises 1268 lines of Zed language source, the link layer is 1005 lines, and procedures shared between the two layers contribute another 370 l ines. The frame layer driver in the kernel includes 190 lines of Zed and 555 lines of Assembly Language. The large amount of Assembly Language ref lects the shortcomings of the hardware, and includes software C R C calculat ions. The use of a h igh- level , portable programming language and concern with portabi l i ty during programming have made all but the frame layer software easily transportable to other machines. On the TI 990/10, the software translates into 17,066 bytes of program code (1206 in the kernel), 1792 bytes of stack and data space per link (72 in the kernel), and 2248 bytes of stack and data space per logical channel (including buffers). Thus, a s ingle- l ink, three-channel X .25 service requires 24,324 bytes on the X .25 team and 1278 bytes in the kernel . - 44 -The software has been in use for over a year, supporting incoming and outgoing terminal (X.29) connections over Datapac. It has proven to be extremely rel iable: it has fai led only once by running out of buffers. (The network has fai led much more often.) Unfortunately , l i t t le performance data have been col lected on the software, due to lack of a suitable testing configuration. Our slow (1200 bits per second) network connection prevents signif icant loading of the software: even when handling heavy t ra f f i c on multiple channels, no degradation of overal l system performance can be perceived. A one-way f i le transfer to a faster receiver attains an ef fect ive data rate of 1117 bits per second — 93% of the link capacity . This ref lects the ef f ic iency of the protocol more than the software. 
The X.25 software design has been used by others as a model for different communication protocols. In addition to the virtual terminal program mentioned above, a host-side X .29 service has been implemented [27] and a Byte Stream Protocol server for the Cambridge Ring loca l -a rea network [20] was constructed in a month by two undergraduate students with no previous experience in implementing communication software. This is a measure of the understandability and maintainabi l i ty of the software and the generality of the structuring methods. Very l i t t le has been published of structuring methods for X .25 or other protocol software. In one of the few relevant papers, Bochmann and Joachim [2] describe an implementation derived from a formal specif icat ion of X .25 and based on the multi -process fac i l i t ies of Concurrent Pascal . They decompose the protocol into a logical structure of, modules communicating via messages. Their module structure is very similar to our col lect ion of processes — some modules are f inite state machines and some are workers performing I/O or mult iplexing functions. However, they then go on to describe a technique for translating the logical structure into a physical real izat ion using Concurrent Pascal processes and monitors. The resulting - 45 -implementation is considerably more complex and opaque, if the difference in their diagrams can be taken as evidence. The authors say they would have been happier with language fac i l i t ies (or, presumably, operating system faci l i t ies) that more closely matched their logical structure. Cohen [16] describes the implementation and performance of X .25 and X .29 software using the R S X - 1 1 M operating system on a PDP/11 computer. R S X - 1 1 M is a typical rea l - t ime system with a large col lect ion of fac i l i t ies for mult i -process, rea l - t ime programming. Cohen used these fac i l i t ies to build a configuration of separate processes supporting the various layers of protocol and communicating via messages. His only comment on those fac i l i t ies was that early versions of his software tended to use too many of them or use them in an ineff ic ient or incorrect manner. The fac i l i t ies for multi -process structuring provided by Verex have proven to be adequate and desirable for communication software. The main drawback of our methodology is that the Verex environment is not universally avai lable. However, there are other message-based systems which d i f fer in various degrees f rom Verex, which could support software l ike that described. Lack of shared address spaces, non-blocking message-passing, different scheduling pol icies, e tc . , would ef fect some aspects of the design, but not a l l . The cost of interprocess communication on some systems might require combining the Link Server and Channel Server functions into a single process in order to obtain reasonable performance. As pointed out in Chapter 4, the motivation for separating the layers across processes was modularity rather than concurrency. It may be possible to maintain the visible separation of the layers within a single process by use of language fac i l i t ies for module definit ion or other coding conventions. The tighter binding of the layers would present other opportunities for improved performance: access to the link layer queues by the packet layer would allow more packet - level piggybacking of acknowledgements and fac i l i tate high-prior ity handling of interrupt and reset packets. 
Recovery f rom buffer shortages would be considerably simpler with only one process to deal w i th . / The mechanisms described in Chapter 3 for handling incoming calls and for signall ing interrupts to clients seem awkward and heavy-handed, relat ive to the rest of the design. Both of these situations require communication ini t iated by a server (the X.25 team), directed to a c l ient , unlike the normal c l ient - in i t ia ted interactions which are served so wel l by the remote procedure ca l l style of message-passing. It is not clear whether this indicates an inadequacy of the interprocess communication primit ives or of the design. As demands on the X.25 software increase, changes wi l l be needed. In part icular , more sophisticated buffer management algorithms wi l l be required to make better use of memory as the number of logical channels increases. The newer features of X . 2 5 , such as L A P B and datagram support, should be incorporated into the software as the need arises. As higher speed links are adopted, performance measurements w i l l be needed to focus opt imizat ion ef forts . It should not be necessary to change the software s ignif icant ly to take advantage of multiprocessor or distributed computer architectures. The use of processes for the protocol layers and messages for communication between the layers should allow easy distribution of al l or part of the X .25 software to separate machines. An existing example is a multiprocessor version of Verex [4] which gives the X .25 team its own dedicated processor with no changes to any software outside of the kernel . In conclusion, it has been shown that software to support X .25 \ can be implemented quickly, c leanly, and correct ly . Mult i -process structuring methods applied to layered specif ications of f in i te -state machines can satisfy the - 47 -funct ional i ty and performance demands of not only X .25 but many modern communication protocols. - 48 Bibliography 1. Bochmann, G.V., "F ini te state description of communication protocols", Computer Networks 2, 4/5 (October 1978), 361-372. 2. Bochmann, G.V. and Joachim, T., "Development and structure of an X.25 implementat ion", IEEE Transactions on Software Engineering 5, 5 (September 1979), 429-439. 3. Boggs, D.R. , et a l , "Pup: an internetwork architecture" , IEEE Transactions on Communications 28, 4 (Apri l 1980), 612-623. 4. Boyle, P .D. , "The design of a distributed kernel for a multiprocessor system", University of Br i t ish Columbia , Department of Computer Science, M.Sc. Thesis, June 1982. 5. Bradbury, C , "X25 asymmetries and how to avoid them", A C M Computer Communicat ion Review 8, 3 (July 1978), 25-34. 6. Brinch Hansen, P., RC4000 Software Mult iprogramming System, A/S Regnecentralen, Copenhagen, 1969. 7. Bunch, S.R. and Day, J .D . , "Control structure overhead in T C P " , Proc . IEEE Trends and Appl icat ions: 1980, Computer Network Protocols, Gaithersburg, Maryland, May 1980, 121-127. 8. Computer Communications Group, "Datapac Standard Network Access Protocol Specif icat ion" , Trans Canada Telephone System, Ot tawa, Ontar io, March 1976. 9. Computer Communications Group, "Datapac Interactive Terminal Interface (ITI) Specif icat ion and User's Manual" , Trans Canada Telephone System, Ot tawa, Ontario, October 1978. 10. CC ITT Recommendation X . 2 5 , "Interface between DTE and D C E for terminals operating in the packet mode on public data networks", March 1976. 11. CCITT Draf t Revised Recommendation X . 
2 5 , "Interface between DTE and D C E for terminals operating in the packet mode on public data networks", February, 1980. • 12. Cher i ton, D.R. , "Distr ibuted I/O using an object-based protocol" , U B C Computer Science Technical Report 8 1 - 1 , University of Brit ish Co lumbia , January 1981. 13. Cher i ton , D.R. , The Thoth System: Mult i -Process Structuring and Portabi l i ty , Elsevier North -Hol land, New York , 1982. - 49 -14. Cher i ton, D.R. , Ma lco lm, M.A . , Melen, L.S. , and Sager, G .R . , "Thoth, a portable rea l - t ime operating system", Communications of the A C M 22, 2 (February 1979), 105-115. 15. Cher i ton, D.R. and Steeves, P . J . , "Zed language reference manual" , U B C Computer Science Technical Report 79-2 , University of Br i t ish Co lumbia , September 1979. 16. Cohen, N.B. , "Implementation and performance of an X.25 packet network interface machine", Technical Report T -78 , Computer Communications Networks Group, University of Waterloo, Waterloo, Ontar io, September 1978. 17. Haverty , J . F . and Rettberg , R .D . , "Interprocess communications for a server in UNIX" , Proc . IEEE Computer Society International Conf . on Computer Communications Networks, September 1978, 312-315. 18. Hoare, C . A . R . , "Communicat ing sequential processes", Communications of the A C M 21, 8 (August 1978), 666-677. 19. ISO/TC97/SC16, "Data Processing - Open Systems Interconnection - Basic Reference Model" , Document N537 Revised, December 1980. 20. Johnson, M .A . , "R ing byte stream protocol spec i f icat ion" , Computer Laboratory, Cambridge, Apr i l 1980. 21. Kernighan, B.W. and R i tch ie , D . M . , The C Programming Language, P r e n t i c e - H a l l , Englewood C l i f f s , New Jersey, 1978. 22. Lockhart , T.W., "The design of a ver i f iable operating system kernel" , U B C Computer Science Technical Report 79-15, November 1979. 23. Postel , J . , Editor , "DoD Standard Internet Protocol" , A C M Computer Communicat ion Review 10, 4 (October 1980), 12-51. 24. Rashid, R .F . , "An interprocess communication fac i l i t y for UNIX" , Technical Report C M U - C S - 8 0 - 1 2 4 , Department of Computer Science, Carnegie -Mel lon Universi ty , Pittsburgh, P A , revised June 1980. 25. R i tch ie , D . M . and Thompson, K., "The UNIX timesharing system", Communications of the A C M 17, 7 (July 1974), 365-375. 26. Rybczynski , A . , Weir, D., and Pal f raman, J . , "Questions and answers on X25 and characterist ics of intra -Datapac virtual c i rcu i ts" , Computer Communications Planning, Trans Canada Telephone System, Ot tawa, Ontar io, March 1978. 27. Scotton, G .R . , "An experiment in high level protocol design", University of Br i t ish Co lumbia , Department of Computer Science, M.Sc. Thesis, December 1981. 28. Vuong, S.T. and Cowen, D.D. , "Automated protocol validation v ia resynthesis: the CCITT X .75 packet level recommendation as an example", Technical Report C S - 8 0 - 3 9 , Computer Science Department, University of Waterloo, Waterloo, Ontar io, revised May 1981. - 50 -29. Wecker, S., " A table-lookup algorithm for software computation of cyc l i c redundancy check (CRC)" , D ig i ta l Equipment Corp. , Maynard, Mass. , January 1974. 30. Western D ig i ta l Corporat ion, "LSI packet network interface WD2501/11, short form data sheet", Prel iminary Specif icat ion, Irvine, Ca l i fo rn ia , June 1981. - 51 -
