UBC Theses and Dissertations

Records management attributes in international open document exchange standards Gregson, Harold Anthony 1996

Full Text

RECORDS MANAGEMENT ATTRIBUTES IN INTERNATIONAL OPEN DOCUMENT EXCHANGE STANDARDS

by

HAROLD ANTHONY GREGSON

B.A., University of Victoria, 1971

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARCHIVAL STUDIES in THE FACULTY OF GRADUATE STUDIES (School of Library, Archival and Information Studies)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

October 1995

© Harold Anthony Gregson 1995

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

Abstract

The thesis is a study of the ability of international open document exchange standards to capture the attributes of the archival document, or record in its electronic form using a set of decontextualized attributes developed by diplomatics. Open document exchange is the ability to exchange documents in their complete form between heterogeneous electronic document management systems.
Three standards are examined: ISO 8613 Open Document Architecture (ODA), ISO 10166 Document Filing and Retrieval (DFR), and ISO 8879 Standard Generalized Markup Language (SGML) as realized in the Text Encoding Initiative. As distributed computing becomes common, there is a growing need for such standards, but their usefulness to archivists and recordskeeping systems depends on their ability to recognize the record. Open document exchange standards function by setting broad descriptive requirements without prescribing the specific implementation by any given system. In describing documents, they are therefore in need of a terminology that is free of any particular records context.

The thesis proposes that the archival science of diplomatics is capable of providing a complete set of decontextualized descriptors, or attributes, that encompass all aspects of the archival document or record. Since diplomatics is based on a scientific analysis of documents, and makes use of terminology that has long been in use by records creators and keepers, it is proposed that diplomatics terminology should be treated as a surrogate international standard in defining the attributes of the document profile and the logical and layout structure of documents. Diplomatics concepts can then be used as the basis for describing records attributes in the document profile of electronic records management systems. The thesis demonstrates this proposition by means of a thesaurus which maps attributes and concepts of SGML, ODA, and DFR against diplomatic terminology.
Table of Contents

Abstract
Table of Contents
List of Figures
Acknowledgments
INTRODUCTION
    Role of Communication Standards
    Problem of Data Structure
    The Need for Decontextualized Attributes
    New Types of Documents
    Definition of the Record
CHAPTER ONE: ELECTRONIC DOCUMENT MANAGEMENT SYSTEMS
    Definition
    Structure of an EDMS
    Document Profile
    Role of Metadata
    Distributed Database Management
    Open Document Processing and Role of Standards
CHAPTER TWO: DIPLOMATICS
    Juridical System
    Facts
    Creation Process
    Procedures
    Phases of Procedure
    Persons
    Intrinsic Elements & Extrinsic Elements
    Transmission
    Other Aspects
CHAPTER THREE: INTERNATIONAL OPEN DOCUMENT EXCHANGE STANDARDS
    Standard Generalized Markup Language (SGML)
    Open Document Architecture (ODA)
    Document Filing and Retrieval (DFR)
CHAPTER FOUR: A THESAURUS OF RECORD ATTRIBUTES
    Rules & Terms
    Glossary Sources
CONCLUSION
BIBLIOGRAPHY

List of Figures

Figure 1: Basic Processing Model
Figure 2: Sample Business Letter for SGML TEI Text Encoding
Figure 3: Sample SGML TEI Encoded Document
Figure 4: ODA Logical and Layout Structure
Figure 5: ODA Correspondence between Logical and Layout Objects including Content Portion
Figure 6: ODA Content Architecture and Layout for Diplomatic Elements

Acknowledgements

I would like to acknowledge the interest and support of Clive Smith, World Bank Group Archivist, and Ana Flavia Fonseca, head of Document Management, in permitting me the time and opportunity to pursue the subject of this thesis, and those of my colleagues at the World Bank involved in the development of the electronic document management system who so freely gave of their time to discuss the various issues involved. I would like to thank Susan Rogers for her personal support.
INTRODUCTION

The subject of this thesis is the recognition within international open document exchange communication standards of those elements of form that permit the archival document to be distinguished as a record. The archival document is here understood to be "a document made or received in the course of activity as a means for and residue of it, and preserved for reference or in pursuance of legal obligation by its creator or legitimate agent, or successor."1 International open document communication standards are those standards promulgated by the International Organization for Standardization (ISO) in order to facilitate document interchange between heterogeneous electronic systems. Specifically, the standards considered in this thesis are ISO 8613 Open Document Architecture Standard (ODA), ISO 10166 Document Filing and Retrieval Standard (DFR), and ISO 8879 Standard Generalized Markup Language.

The thesis takes as its starting point the need to define the record, without which neither electronic records management systems, nor an exploration of diplomatic concepts, would have any point. In turn, the record cannot be explained without a definition of the document, a particularly important distinction given the indiscriminate use of the term in the electronic systems world. The thesis then goes on to explain the context of open document exchange standards, which is electronic records management, and the Open Systems Model, which in turn form a context of their own for the standards. Finally, the stage for an examination of the standards themselves, and the mapping of their attributes and concepts to a thesaurus of diplomatic terminology, is set by an explanation of diplomatics.

1 Duranti, Luciana, Managing Electronic Documents: Making Sense Out of Chaos or "Records Management is Dead! Long Live Records Management!" (Presentation to World Bank, April 27 1993, Washington, D.C.): 9. The term document will be used here to include all types of recorded materials of which archival documents, or records, are only one type. This is a necessary distinction to bear in mind because open document exchange standards are designed to capture all sorts of documents, and not just records.

The spread of distributed information technology systems has compelled many organizations to rely in the conduct of their affairs on different types of computers, peripherals, and software applications manufactured according to proprietary specifications. Such distributed information systems offer great benefits in the use of information, but these can be realized only if the information is "easily accessible and capable of being amended (e.g. annotated and updated) by all the possible organizational users."2 For example, with many different word processing or spreadsheet packages, how can a document be transmitted and received in the same format? With data, text, images, and graphics now frequently incorporated into the same document, each of which is generated by a software application of its own, the problem is compounded.

The purpose of open document exchange standards can be defined more specifically as providing computerized document management systems with the ability to exchange a record in its complete form, which diplomatics defines as "one that contains all the elements it is supposed to contain according to the administrative and legal system."3 The most complete form of a record is the original, which diplomatics defines as not only a complete (or perfected) record in terms of form, but also the first to be issued by the creator. Without the element of completeness, the reliability4 of a record is in doubt.
For instance, a contract that differs from the first to be created and enacted in its use of typefaces and layout of text is unlikely to be accepted in a court of law as anything but a simple copy, or a "mere transcription of the content from the original".5 The inability to transmit a record between heterogeneous computer systems in its complete form therefore has serious implications for everyone who must create and manage records. In the words of Alvin Tedjamulia, Executive Vice President of SoftSolutions Technology, "The purpose of document management is to know that an original is an original, who touches it, and when."6

2 Advisory Committee for the Co-ordination of Information Systems (ACCIS), Strategic Issues for Electronic Records Management: Towards Open Systems Interconnection (New York: United Nations, 1992): 3.
3 Duranti, Managing Electronic Documents, 9.
4 Reliability is the quality of trustworthiness conferred on a record "by its degree of completeness and the degree of control on its creation procedure and/or its author's reliability." Ibid., 9.
5 Luciana Duranti, Diplomatics: New Uses for an Old Science (Part I), Archivaria 28 (Summer 1989): 3.

Standardization has emerged as the obvious solution to this problem, and the movement known as Open Systems Interconnection (OSI) is part of that solution. The objective of OSI has been "to facilitate interworking between different organizations (or between parts of large organizations) - even when the organizations (or their parts) have very different kinds of computer technology."7 OSI is backed by various international standards-setting bodies such as CCITT (International Telegraph and Telephone Consultative Committee) and ISO (International Organization for Standardization), national standards bodies such as ANSI (American National Standards Institute) and BSI (British Standards Institution), manufacturers, and end user organizations.
OSI has been the active promoter of a number of international communication standards intended to standardize the exchange and storage of electronic documents.

The context of open document exchange is electronic records management (ERM). Electronic management of records applies to the whole spectrum of records-making and recordskeeping functionalities spanning the entire life-cycle of records. An ERM system comprises a full range of records management processes including the creation and identification, appraisal, control and use, and disposition of documents.8 At first sight, these processes are in many ways similar to records management in the paper environment, but there are some fundamental differences. In order to make information available, an ERM system must be capable of transferring records from a store of documents to the user or to other stores. The record must also be moved from the creator or modifier to the document store.9 This raises issues of security, access, and physical transfer that are relatively uncomplicated from a technical point of view (although intellectually equally important) in the paper records environment. Transfer of records becomes even more complex where the exchange must take place between different platforms or networks. This is hardly a problem where the physical medium of the record always has the same characteristics, as in the case of paper. ERM systems, however, must be capable of handling a great many different types of formats, as well as be capable of relating electronic records to paper records.

6 Andy Reinhardt, Managing the New Document, Byte (August 1994): 93.
7 ACCIS, Strategic Issues, 3.
8 ACCIS, Strategic Issues, 6.
9 Ibid., 11.

ERM also goes much further than traditional systems in the management of records. "Traditional records management systems have evolved in the context of custodians managing finished documents in aggregations of files and records series, essentially automating the paper system where the sheer volume of paper records to be managed precludes any finer degree of distinction."10 Records management in paper-based record environments is limited to the functions of use, maintenance and disposition of records. Electronic records management systems have the capability of controlling the actual creation and manipulation of records. "Electronic document management Systems . . . on the market today focus on enabling the author or members of a workgroup to manage the contents of an electronic document throughout the creation phase when multiple parts may be under revision separately but are eventually merged into a single, finished edition."11 The creation phase in turn may be broken down into a number of functional requirements including the entry of data, editing, assembling of document parts, distribution, collaborative processing, import of external documents and storage.12

The context of an ERM system is the record itself. The management of records shifts to the item level as opposed to the more traditional file or dossier level. Whereas in paper recordkeeping systems documents aggregate themselves into the physical equivalent of the intellectual relationship between them, within ERM systems the relationship between any one record and another must be precisely defined to the system beforehand if the record is to be managed at all. This entails managing the record at the moment of creation.

10 Diane Hopkins, Karl Lawrence, Irene Travis et al., Extending ARM Requirements at the World Bank, (internal report, Washington: World Bank, 1994): 1.
11 Hopkins et al., Extending ARM Requirements, 1.
12 EDMS Integration Team, Electronic Document Management System: Delivery Schedule Mapping to Functional Requirements, (internal report, Washington: World Bank, 1994): 1-15.

The international communications standards within which records management functionalities may be addressed are known as open document standards in that they permit documents to be created using software modeled to standards that are themselves "open", i.e. they can be exchanged and manipulated over a network. The question then becomes how the record can be defined and recognized within electronic document management systems employing open document communication standards. How must these standards capture the record and what must archivists know about these standards to appraise their application to a records system?

Before tackling this question, one more definition is needed. Oddly enough, there is no specific definition for a records system. A system itself has been defined as "a somewhat vague term that usually refers to a combination of components working together. For example, a computer system usually includes both hardware and software."13 While this may be true, a more accurate definition would treat a system as an architecture of components brought together to achieve a particular end. The components may be human and non-human. For instance, a typical electronic office records system today usually comprises staff responsible for the drafting and approval of records using desktop workstations, printers for output of hard copies for distribution, a dedicated computer for centralized storage of electronic records, administrative support staff responsible for maintaining classification schemes and filing hard copies, retention schedules for the disposition of records, policies and procedures designed to control the form, distribution, access to, and storage of records, and documentation designed to describe and maintain the computerized components.
It is important to note that the hardware and software comprising the computerized component are only a part of the records system, even though it is easy to think of them as being the system itself. This is because, in an ERM, many of these functions, including records creation, maintenance, and use, are all performed using computer equipment and software programs. But certain parts of the system, such as system documentation, and policies and procedures controlling the creation and maintenance of records, are necessarily independent of the ERM (as comprised of hardware and software) in that they govern its very existence by determining its uses and rules.14

13 Philip E. Margolis, The Random House Personal Computer Dictionary (New York: Random House, 1991): 452.

An implied qualification of system components is that all the parts that participate in the system are competent, that is, are "adequately qualified or capable" of carrying out their part. Competence, in this sense, is a term that can be applied to both human and non-human elements. For example, a structural member of a bridge is considered "competent" if it is able to withstand the load forces that it is designed to transmit as part of the structural design. This engineering definition of competence can be applied to the non-human elements of an ERM, that is, the hardware and software. The archival meaning of competent stems from the idea of a competence, defined as "the sphere of functional responsibility entrusted to an office or officer".15 From an archival viewpoint, therefore, competent means capable of carrying out a function by reason of being entrusted with that responsibility. If these two definitions are united within the context of a system, competent means capable of carrying out a designated responsibility towards the fulfillment of a general purpose or function. Such a definition can apply equally to both human and non-human components.
Before a records system per se can be defined, two further conclusions must be drawn about the nature of systems. The first is that a system cannot be defined by any one component. The reason components are brought together in a system in the first place is to accomplish an end that could not be achieved by any one component acting on its own. There is more to this than functional reality. Archival requirements of reliability and authenticity imply that no one element of a records system can be completely self-justifying. If the records created within a records system are to be deemed reliable and authentic, then the system itself must be able to guarantee these qualities. A recordskeeping system must therefore consist of elements that are able to, so to speak, bear witness against each other in order to ensure that a record is authentic and reliable. This includes those elements, such as system documentation and policies and procedures, that permit the user to understand the system as a whole and judge of its purposes.

We must therefore define a recordskeeping system as an architecture of competent records-creators, together with their equipment and support mechanisms, governed by policies and procedures for the management of documents made and received in the course of business, designed to ensure the reliability, authenticity and completeness of archival documents (or records) in the course of their creation, maintenance, use, and disposition.

14 This would be as true of paper records systems as computerized systems.
15 School of Library, Archival and Information Studies, Select List of Archival Terminology (unpublished glossary for Master of Archival Studies Program, Vancouver, University of British Columbia, 1990): 5.
An important proviso here is that open document exchange standards cannot in themselves form a complete recordskeeping system but can be only part of a recordskeeping system, in that they are a mechanism to capture and exchange documents. Therefore, the mechanism of document exchange cannot by itself meet all the requirements for guaranteeing reliability and authenticity. For instance, open document exchange systems are not designed to impose rules of governance on communicating systems.16 Their function is confined to communicating the completeness of a record. David Bearman has outlined system functionalities that he maintains are essential to guaranteeing the archival integrity of an electronic recordskeeping system, or in other words, the reliability and authenticity of its records. Since, however, Bearman's functionalities are intended to capture the archival qualities of an electronic recordskeeping system, they are not relevant to a discussion of open document communication standards per se, for the reason that such standards are only the messenger between systems and are not concerned with defining the recordskeeping system itself.

16 In fact, as standards, quite the opposite is true.

Role of Communication Standards

Standards may be defined in general as "publicly available definitions resulting from international, national, or industrial agreement." Such a broad definition applies to standards of any sort, communications or otherwise, and there are many thousands covering almost every conceivable manufacturing object and process. The aim of all standards is to impose some degree of consistency and as such, they are hardly new. The ancient world had its standard measures of weight and size for coins and many different types of goods. Communications is an area where there has always been a great deal of standardization of necessity, in writing, language, and the equipment and instruments used to communicate.
It is no less true of the forms used to convey documentary meaning, such as a letter or a legal contract. It is a point of fundamental importance in discussing the desirability of standards and the applicability of standardization as a concept. The question is not whether to apply standards but how to define something that is already implicit.

Standards tend to fall into two broad categories: informal standards, and those that are agreed through some structured process of development. Language itself might be taken as an example of an informal standard where a people have come to agree on accepted vocabulary and grammar.17 Informal standards evolve with a great deal of trial and error and spread because "something just works" and everyone can use it. The more formal standards are those that are agreed by standards-setting bodies, which may be established within an industry, at the national level, or at the international level. These standards evolve through a complex, bureaucratic process which is closed to all but the chosen stakeholders and experts. The relationship between the formal and the informal is dynamic: the one may borrow from the other. But the formal process is slower to develop, and tends to protect vested economic interests simply because it tends to arise at a later stage, after the genetic free-for-all, when the stakeholders have stabilized and are in a position where their continued survival depends on their ability to define and defend their interests. This does not mean that formal standards will necessarily prevail. It must be said that in the world of information and computer technology, standards will continue to come and go for as long as the industry remains in a state of explosive technological development.18 The international communications standards that are the subject of this thesis may well be obsolete or non-starters five or ten years from now.19

17 Although language, too, has its more formal standards, such as authoritative dictionaries, and even standards-setting bodies, such as the Académie française.
18 The Internet is an example of the informal paradigm of standards development where people continually put out ideas onto the Internet without any ability to control access through such mechanisms as copyright or pricing. In the Internet free-for-all, standards have emerged, such as HTML for the document profile. There is no telling how long they will survive or how far they will go because they have developed as need has arisen and do not respond to vested interests.
19 In fact, the ODA standard has already been declared a non-starter by many even though it has been the subject of years of development by international standards-making bodies.

In the realm of computer communications, the aim of standards is to permit different systems to communicate. From this point of view, standards have been defined more specifically as "standard sets of rules of cooperation between peer entities that govern two areas: service provision to the end-systems; and peer cooperation with an entity on the other end-system involved."20 They apply to programming languages, operating systems, data formats, protocols, and electrical interfaces. The fact that standards are rule-based is fundamental to our understanding of their requirements, for to be capable of standardization, a thing must itself be capable of being reduced to rules, or "principles to which actions conform."21 In other words, for archival documents to be captured by international communication standards, they must be capable of being reduced to a set of rules. That archival documents do, in fact, embody a set of rules, a set of rules, moreover, that is capable of being translated into the machine environment of computer communications, is the sine qua non of this thesis.

A second fundamental point to grasp about standards is that they are independent of any one system. The way that the Open Systems Interconnect Reference Model (OSI) permits retrieval of a file from a distant site is illustrative. The standard for this service, File Transfer, Access and Management (FTAM), nowhere sets out a command, "transfer this file". The software that would be based on FTAM might have such a command, but the standard itself defines only a set of more precise services that, when utilized in a processing order, will bring about file transfer. How the software program presents this service to the user or chooses to access the services is outside the standard because such considerations can be determined by the system itself. It is not unlike the choice of a telephone: telecommunications standards ensure that the device meets system specification for connection and operation, and have nothing to do with such matters as the choice of manufacturer or the uses to which the system may be put because these have no bearing on the ability of the system to communicate. They would, in any case, be too arbitrary to define. In other words, system-independent means independent of any particular context.22 If international communication standards are to address the communication of records, we must therefore assume that records have a set of characteristics that are free of any particular documentary context.23 Sets of characteristics that are free of any particular context are called metadata, or, broadly speaking, information about information.

20 Peter Henshall, Opening Up OSI: an illustrated introduction (Chichester, Sussex: Ellis Horwood, 1992): 50.
21 Oxford Modern English Dictionary, 1992, s.v. "rule."

International communication standards are concerned with the ability to store, retrieve and manipulate documents located at remote sites, that is, between networks.
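The FTAM point above, that a standard defines small services while the "transfer this file" command belongs to the application software, can be sketched in a few lines. This is a minimal illustration only: the class `RemoteFileStore`, its service names (`connect`, `select`, `open`, `read`, `close`, `release`), and the function `transfer_file` are hypothetical and are not drawn from the FTAM standard.

```python
from typing import Dict, List, Optional


class RemoteFileStore:
    """Toy remote system exposing only small, precise services.

    The service names are illustrative assumptions, not FTAM primitives.
    """

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files
        self._selected: Optional[str] = None
        self._is_open = False
        self.log: List[str] = []  # records the order in which services ran

    def connect(self) -> None:
        self.log.append("connect")

    def select(self, name: str) -> None:
        if name not in self._files:
            raise FileNotFoundError(name)
        self._selected = name
        self.log.append("select")

    def open(self) -> None:
        if self._selected is None:
            raise RuntimeError("no file selected")
        self._is_open = True
        self.log.append("open")

    def read(self) -> bytes:
        if not self._is_open or self._selected is None:
            raise RuntimeError("file is not open")
        self.log.append("read")
        return self._files[self._selected]

    def close(self) -> None:
        self._is_open = False
        self.log.append("close")

    def release(self) -> None:
        self._selected = None
        self.log.append("release")


def transfer_file(store: RemoteFileStore, name: str) -> bytes:
    """A user-facing command that exists only in the application software.

    It is built by calling the primitive services in a processing order.
    """
    store.connect()
    store.select(name)
    store.open()
    try:
        return store.read()
    finally:
        # Always release the remote resources, even if the read fails.
        store.close()
        store.release()
```

A different application could present the same services under another command name or in another user interface without touching the service definitions, which is the sense in which the standard is system-independent.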
A remote site means not just one that is distant but one that is serviced by a different system. A system that consists of a lot of end users all connected together using the same system (as on a Wide Area Network (WAN) or Local Area Network (LAN)) is what is called a sub-system. International communication standards are not concerned with how sub-systems operate because there are no differences to reconcile within such a system. Sub-systems connected through gateways form networks. This is the realm of the international communication standard, where the need to reconcile differences between sub-systems creates a need for a common language.

22 A rose is a rose is a rose, wrote Gertrude Stein. She may have been talking about ISO standards.
23 Standards agree on those elements that can be agreed. By agreeing upon them, they impart to them a certain universality, but the reverse is not necessarily true. For something to become standardized does not mean it must have universality to begin with. Standardization produces universality as an end product, but as a process it is not dependent on universality. A journey is not defined by its destination.

There are several types of standards. International standards are those accepted by an international standards-setting body such as the ISO or the CCITT. National standards are those that are agreed by national standards-setting bodies, such as ANSI or CSI. Very often the national bodies work in conjunction with international bodies, and a national standard can be recognized as international. Then there are industry standards which have been established by industry groups. And finally there are de facto standards which have emerged within an industry and become accepted. IBM has been a source of some of these in the computer industry; the Internet, another.

The international open document exchange standards within which records management functionalities may be addressed are known as open document standards in that they permit documents to be created using software modeled to standards that are themselves "open", i.e. they can be exchanged and manipulated over a network. The question then becomes how the record can be defined and recognized within electronic document management systems employing international open document communication standards. How must these standards capture the record and what must archivists know about these standards to appraise their application to records systems?

The international open document exchange standards that are here discussed have all received sanction by the ISO, and have been the subject of broad discussion and/or development, with actual implementations. ODA (Open Document Architecture, ISO 8613) and SGML (Standard Generalized Mark-up Language, ISO 8879) are both concerned with capturing the structure and representation of documents. DFR (Document Filing and Retrieval, ISO 10166) is a storage standard, discussed here because the storage of documents is in many ways the flip side of managing their representation and structure in that the object to be retrieved must mirror what was created.24 These standards address various aspects of electronic document management within a model known as Open Systems Interconnection (OSI). ODA and SGML are designed to encode the structure and representation of documents so that they can be transferred between networks. They are part of a family of standards which contribute different functionalities to the OSI. These other standards are not examined in this thesis.
The Virtual Terminal Service (VT) is another architecture and presentation standard, designed to permit end-user display of applications running on different networks, such as spreadsheet or word-processing packages.[25] Transfer standards include the File Transfer, Access and Management Service (FTAM), which provides "a generic way of getting files from one host computer system to another and also handles the problem of data conversion between different computers",[26] and the Message Handling Service (MHS), which is a world-wide electronic mail service that will be able to handle not only interpersonal messages but also the transfer of different parts of documents.[27] Information management standards include the Directory Service (X.500), which was designed as a sort of world-wide yellow/white pages and has been expanded to include access to other types of information, such as staff, organizations and documents. Another information management service is the Information Resource Dictionary System (IRDS), which with the Remote Database Access standard (RDA) and the Structured Query Language 2 (SQL2) is intended to provide access to distributed database management.[28] Finally, the OSI Management Service is intended to manage communication resources such as networks.[29]

[24] H. Fanderl, K. Fischer, and J. Kamper, The Open Document Architecture: From standardization to the market, IBM Systems Journal 31 (No. 4, 1992): 732.
[25] ACCIS, Strategic Issues, 25.
[26] Ibid., 35.

The heart of an electronic document management system is the document profile, for it is this which controls document definition and the document structure. The document profile is composed of attributes, each of which defines a particular characteristic of a document.[30] The attributes are defined, or not defined, in various ways by the different communication standards for their own purposes.
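The idea of a profile as a set of attributes, some of which a given standard may leave undefined, can be illustrated with a minimal sketch. The class, the attribute names ("title", "author", "date-created") and the list of required attributes are hypothetical and chosen only for illustration; they are not taken from ODA, DFR, or any other standard.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentProfile:
    """A profile holds named attributes, each describing one characteristic."""
    attributes: dict[str, str] = field(default_factory=dict)

    def define(self, name: str, value: str) -> None:
        # Record one characteristic of the document under an attribute name.
        self.attributes[name] = value

    def undefined(self, required: list[str]) -> list[str]:
        # Report which of the required attributes this profile does not define.
        return [name for name in required if name not in self.attributes]


# Hypothetical attribute names, used only for illustration.
profile = DocumentProfile()
profile.define("title", "Annual report")
profile.define("author", "Records Office")

missing = profile.undefined(["title", "author", "date-created"])
print(missing)
```

The point of the sketch is only that an attribute a standard does not define is simply absent from the profile, which is why the choice of attributes matters to anyone who later needs to identify the document.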
For those standards concerned with records management, document attributes should be defined in a way that is useful for archivists, and one of the purposes of the thesis is the development of a thesaurus of standard attributes that can be mapped to the standards for document architecture and representation.

[27] Ibid., 35.
[28] Ibid., 58.
[29] Ibid., 59.
[30] It must be borne in mind that SGML is different from ODA in that it does not use a document profile. SGML flags document characteristics by means of a markup language embedded in the document itself and defines attributes in a much narrower sense than ODA. In other words, SGML permits the compilation of a user-driven vocabulary of document characteristics into a syntax that can be exchanged and manipulated over a network. Nonetheless, the problem of standardizing the description of document characteristics remains regardless of the implementation. The Document Type Definition (DTD) or header used by SGML may be considered the conceptual equivalent of a profile.

Problem of Data Structure

One of the most daunting aspects of developing the document profile is that data processing deals with information in a highly structured manner, as tables. Documents must therefore be translated into this same, highly structured environment in order to be manipulated. "Once data is placed in documents, it becomes frozen in a medium that no longer allows its analysis or reuse."[31] Information managed in tabular or numeric format, arranged in columns or tables, is much easier to handle and is today easily available (e.g. an airline reservation system). Such systems, however, cannot address documentary forms, which comprise the bulk of the world's information. "Most information does not fit into a tabular model."[32] Documents resist the sort of re-use to which tabular information lends itself, because information once in document form is intended to be static.
Although word processors might appear to address the movement of information, they actually address only the appearance of the document. Word processing is incapable of recognizing the components that make up a document, such as the signature, the address, or the text, as independent entities. For this reason, "documents remain linear strings of words and pictures, with no inherent organization below the file level."[33] But the problem with document creation in a document management system is that documents may be assembled from parts of other documents or require specific components, such as electronic signatures. In this respect, "for document-based information to become . . . dynamic, it must possess a structure and organization analogous to the rows and columns in tabular information."[34]

The need for decontextualized attributes

It is retrieval strategy that is probably the key to an ERM. Open systems are intended to simplify the search for records by making it possible to standardize record attributes. The standards and the ERMs that use them, however, tend to reflect the purposes and context of their creators, with considerable variation in the types and definitions of attributes. "The organization of a document database," as one writer puts it, "is almost always a direct result of the pre-defined structure imposed on the document collection by the author."[35] Moreover, documents can be obtained from a variety of internal and external sources which have not been designed or pre-processed to work within existing information systems. For example, international standards make possible the exchange of documents between disparate systems. The possibility of document interchange between electronic document management systems therefore creates the need for an agreed standard of metadata that can be used to define documents. Documents have thus to be defined not only within the context of the system where they are generated, but also in terms of a system where they may be received.

[31] Lani Hajagos, Documents and SGML (the Standard Generalized Markup Language standard for document processing), UNIX Review 11 (No. 3, Mar. 1993): 38.
[32] Ibid., 38.
[33] Ibid., 39. File has here the data processing meaning of "a collection of records that all deal with the same sort of data." Advisory Committee for the Co-ordination of Information Systems (ACCIS), Management of electronic records: issues and guidelines (New York: United Nations, 1990): 154.
[34] Hajagos, Documents and SGML, 39.

The spread of personal computer-based computing is another aspect of this same problem, since it encourages the decentralization of document creation through such activities as distributed editing[36] and the development of collaborative tools. To operate at the PC level, the document profile must therefore become both "transparent", or user friendly, to individual records creators and also capable of accommodating their personal foibles. It is precisely the freedom of being able to do things one's own way that has made personal computing so popular and productive. Yet the profile must also be capable of imposing standardization of attributes in order to cope with decentralization and exchange between heterogeneous records systems. This problem is exacerbated by the complexity of modern bureaucracy and its procedures.

A further dimension to the need for decontextualized attribute definitions is the need for attributes that are both explicit and machine-readable in order to manipulate document structure.
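The need for attribute definitions that hold outside the system of origin can be sketched as a mapping from each system's local attribute names onto one shared vocabulary. Everything in the sketch is an assumption made for illustration: the two systems, their local names, and the shared terms ("author", "date_created", "subject") are hypothetical and do not come from any standard discussed here.

```python
# Hypothetical thesaurus: each system's local attribute names are mapped
# onto one shared, system-independent vocabulary.
THESAURUS = {
    "system_a": {"from": "author", "sent": "date_created", "re": "subject"},
    "system_b": {"originator": "author", "created_on": "date_created", "topic": "subject"},
}


def to_shared_profile(system: str, local_profile: dict[str, str]):
    """Translate a local profile into shared terms; keep what cannot be mapped."""
    mapping = THESAURUS[system]
    shared, unmapped = {}, {}
    for name, value in local_profile.items():
        if name in mapping:
            shared[mapping[name]] = value
        else:
            # Attributes with no agreed equivalent are set aside, not dropped.
            unmapped[name] = value
    return shared, unmapped


memo_a = {"from": "J. Smith", "sent": "1995-10-01", "re": "Budget"}
memo_b = {"originator": "J. Smith", "created_on": "1995-10-01",
          "topic": "Budget", "font": "Courier"}

shared_a, unmapped_a = to_shared_profile("system_a", memo_a)
shared_b, unmapped_b = to_shared_profile("system_b", memo_b)
print(shared_a == shared_b)
print(unmapped_b)
```

Once translated, the two descriptions of the memo are identical, while the attribute that has no agreed equivalent ("font" in this invented example) remains visible as something the shared vocabulary does not yet cover.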
To be accessible to a large audience in a heterogeneous computer environment, a "standard methodology for expressing a document content model, as well as a standard syntax for describing the model and marking document contents, is required."[37]

[35] Thomas K Kolopolous, ed., Handbook of Document Management Systems Evaluation and Design (Boston, Mass: Delphi Consulting Group, 1991): 7.
[36] The ability to edit a document which is on a remote system (remote editing) or edit a document simultaneously (joint editing). Ute Bormann and Carsten Bormann, Standards for open document processing: current state and future developments, Computer Networks and ISDN Systems 21 (North Holland: Elsevier Science Publishers BV, 1991): 158.
[37] Hajagos, Documents and SGML, 38.

[...] retrieval, the distinction between them is apt to be blurred, if not altogether lost. The first task is therefore to distinguish records from documents.

There are many definitions of a document. From the retrieval point of view alone, a document has been defined as "a structure of syntactically normalized, semantically resolved propositions . . ."[39] From a purely data processing viewpoint, a document has been defined as "a transaction set or message."[40] A broader records management definition defines a document as "a single record or manuscript item."
[41] The World Bank has defined document within its own electronic document management system as "an identifiable unit of information that must be managed as one entity to support a Bank business function."[42] According to the Bank, this could be a "coherent set of pages used to convey a complete message," but it could equally well be a spreadsheet table, a database row, or a single page.[43] A document may also be a data object, defined as "any collectivity of data operated upon by a software system as a logical entity, for example a record, a document, an image or a software routine."[44] A similarly broad definition is offered by the Oxford Companion to Law: "Anything on which signs have been marked to record or transmit any information, a category including books, letters, deed, title-deeds, maps, plans, drawings, photographs and the like."[45]

Within open document communication standards, documents are defined to suit the particular purposes of the standard. For instance, the Document Filing and Retrieval Standard (DFR), which is concerned with the access and storage of information at a remote site, defines a document as "a structured amount of information that can be filed, retrieved, and interchanged consisting of a DFR-object-class of the DFR-object."[46] A DFR document consists of content together with attributes which are associated with the content. The Open Document Architecture standard (ODA), on the other hand, defines a document as "a structured amount of information intended for human perception, that can be interchanged as a unit between users and/or systems."[47] SGML defines a document as a Document Type Definition and an instance or actual occurrence of text. The document profile itself is a document in its own right in that it can be exchanged by ODA, as can the Document Type Definition of SGML. In effect, profiles and DTDs are virtual documents in that they are rules for the formulation of a document.

[39] Karen Sparck Jones, Assumptions and Issues in Text-based Retrieval, in Text-based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval, Paul S. Jacobs, ed. (Hillsdale, New Jersey: Lawrence Erlbaum Associates, 1992), 160.
[40] ACCIS, Management of electronic records, 150.
[41] Ibid., 150.
[42] Information, Technology and Facilities Department: Information Engineering (ITFIE), ITF Staff Paper No. 12: Information Management Architecture: FY 94, Harold Steyer, ed. (Internal report prepared by the World Bank, Washington, D.C.: World Bank, 1993): 99.
[43] Ibid., 5.
[44] ACCIS, Management of electronic records, 147.
[45] David M. Walker, Oxford Companion to Law (Oxford: Clarendon Press, 1980), 371.

The conceptual relationships between information, documents and records can be modeled by an adaptation of the Tree of Porphyry, which sets them forth as a taxonomy.[48] Porphyry derives all documents from the summum genus of information. Information was defined by Samuel Johnson as "intelligence given," or knowledge shared by giving it to someone as a message. The Oxford Dictionary defines information as "something told; knowledge; items of knowledge, news."[49] Within an EDMS, this communication takes place across a physical or logical communications switch in the electronic system. Intelligence must be communicated to qualify as information. Dreams and thoughts kept in our own minds are, therefore, not information in this sense. Yet a note made to ourselves about these thoughts and dreams, or an account of them told to someone, qualifies our recollections as information.
The important point is that at the heart of the whole concept of information is a principle of objectification, whereby the intelligence we wish to convey is somehow separated from ourselves by putting it into a form that is capable of communication. Intelligence must also have a comprehensible meaning: raw, unstructured data (as opposed to simply garbled data) is not intelligence. Going further, intelligence must therefore be comprehensible, but not necessarily in only human-readable terms. Intelligence can also be comprehensible only by the computer itself.

[46] ISO/IEC JTC 1/SC 18 Text and Office Systems Secretariat: USA (ANSI), Revised Text of DIS 10166-1: Information Technology - Text and Office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures (New York: ANSI, 1991): 5.
[47] Canadian Standards Association, CAN/CSA-Z243.221-90 (ISO 8613-1:1989) Information Processing - Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 1: Introduction and General Principles (Rexdale, Ontario: Canadian Standards Association, 1990): 5.
[48] This adaptation of the Tree of Porphyry, which was originally intended to explain inorganic chemistry, was first proposed by Trevor Livelton in Public Records: A Study in Archival Theory (unpublished Master of Archival Studies thesis, UBC 1991): 75.
[49] Oxford Modern English Dictionary, s.v. "information."

The second level of Porphyry's taxonomy captures this distinction in the differentia of recorded and non-recorded information. The act of recording information creates a document, if not, as yet, a record. But what does it mean to record information? What is involved in the capture of information in a document? Diplomatics approaches this problem by defining the document as "the expression of ideas in a form both objectified (physical) and syntactic (governed by rules of arrangement).
A document's components are: (1) a message; (2) a medium; (3) an intellectual codification of ideas (information configuration: text, image, etc.); and (4) logical arrangement of the internal elements (intellectual form)."[50] These elements might be said to conceptually define any piece of information, recorded or unrecorded, but they must be present as both physical and intellectual realities to become recorded information or a document. In other words, all these elements must be brought together in one place for a document to come into existence.

These various elements of the document may be taken for granted as purely conceptual distinctions where they are brought together by the unitary nature of the paper document. But when it comes to dealing with electronic document management systems and open document exchange communication standards, the message, medium, intellectual codification, and logical arrangement become independent attributes, or groups of attributes, that must each be physically defined to the document profile before a given document can be created, exchanged, and retrieved. This is because the computer is essentially a "brainless" brain that forces us to define conceptually any object it is to handle or task it is to perform, and to break it down into its constituent elements: attributes in the case of objects, algorithms in the case of tasks. If these elements are independent, then each must be defined before they can be realized in the document profile.

[50] Duranti, Managing Electronic Documents, 9.

MESSAGE

If a document is an attempt to communicate ideas, then it must be about something, and the message is no more, nor less, than this: the intelligence or intellectual content of the document, be it machine or human comprehensible. This definition of message is captured by the telecommunications meaning of message, where the document is composed of an envelope and a body.[51] The body will contain a human-readable message, but the header will contain encoding that will permit the receiving application to interpret the content so that it can be read. Open document communication standards tend to associate message with the human-readable content. For instance, ODA defines content as "the information conveyed by the document, other than the structural information, and that is intended for human perception."[52] DFR defines content as "the prime information content of a DFR object".[53] Whatever the definition, the independence of content, or message, from structure is a physical reality in open document interchange, where content can be swapped about and imported into various documents.[54] For example, the aim of SGML is to permit the information in a report to be used in another document with quite another purpose.[55] To do so, however, the information in a document must be conveyed by means of a structure. It is structure that documents bring to information, and this is the reason that structure, and not the intelligence per se in a document, is the focus of open document standards.

[51] Henshall, Opening Up OSI, 104.
[52] ISO, ODA: Part 1: Introduction and General Principles, 4.
[53] ISO, DFR: Part 1: Abstract Service Definition and Procedures, 5.
[54] One is tempted to say data, but data is defined as "information formatted in a special way", and also as the plural of a datum, a single piece of information, which does not get us much further ahead. Definitions from Philip Margolis, The Random House Personal Computer Dictionary (New York: Random House, 1991): 112.
[55] Indeed, that is one of the characteristics of content, that it takes on the meaning of its context.

MEDIUM

It is, of course, impossible to contemplate the transmission of intelligence or knowledge without giving consideration to the medium.
From a data processing point of view, the medium is the electronic telecommunications switch that transmits the message as digital or analogue data. From an archival point of view, for the message to be transmitted as a document, it must be somehow captured or fixed in a storage medium in much the same way that a photographic print is "fixed" in the development process. A medium is therefore the "physical material or substance upon which information can be recorded or stored,"[56] and includes both those that are relatively permanent and those that are ephemeral. Writing is thus "fixed" on paper; voice can be recorded on tape; data is "written" to magnetic or optical disks. Electronic media encompass a very broad range of storage media such as tape, diskettes, and hard drives.[57] To be fixed in a medium, a message must be "written to" the medium, or physically attached. In traditional paper or parchment recordskeeping media, ink was the most common way of affixing the message.[58] Electronic data may be affixed to the storage medium by magnetic or optical means.

INTELLECTUAL CODIFICATION

The means of representation, or the codification of information, partly dictated by the physical constraints of the media, is the third, fundamental aspect of a document.[59] Text, raster or geographical graphics, voice, and digital or analogue data are all forms of intellectual codification by which the message may be represented in electronic documents and which must therefore be transmitted by open document standards. As with the message, the intellectual codification may be machine or human readable. For instance, digital or analogue data is not itself human-readable except with great difficulty, although it may be used to produce a human-readable form of intellectual representation, such as a map, or words. For example, a word-processing application will use a formatting language to encode instructions that will permit a screen display of a document.

[56] SLAIS, Glossary, 12.
[57] In networks, media refers to the various types of cables linking workstations together. Margolis, 291.
[58] The term formatting, as in formatting a disk, is not the same as affixing a message. To format a document meant to prepare it to receive a message, by drawing lines or making holes. Electronic storage media are formatted with the same purpose in mind, to make them capable of physically bearing a message. In medieval documents, formatting also had the purpose of indicating the purpose of a record. For example, privileges usually had holes, and charters were often a particular size. The purpose of this was partly to overcome the problems of functional illiteracy. Symbols could be used to achieve the same end, by use of colour codes etc.
[59] The term format must be distinguished from representation. The mode of representation is a broad term that takes in various forms of expression, such as writing, film etc., or what might also be called media. Format is the particular choice of codification available within each form of expression. Writing uses words; it can also use pictographs. Film uses 35 mm, video etc. EDMS will use digital or analogue formats to encode content.

LOGICAL STRUCTURE

Finally, information must have some form of logical structure in order to be interpreted for use. The logical structure is actually two types of structure: a structure determined by meaning, and a structure required by the physical requirements of the medium, or a layout structure. Since ODA and SGML both distinguish between these two types of structure, the distinction will be retained henceforth. The logical structure consists of elements such as sentences, paragraphs, and footnotes that are determined by the content of the message. Graphics may be part of a logical structure as illustrations or tables. A relational database will have a logical structure in the form of tables or fields.
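The separation of the two structures can be made concrete with a minimal sketch, in which the same sequence of logical elements is laid out under two different physical constraints. The element kinds, line counts, page sizes and the simple rule that an element is never split across pages are all assumptions made for illustration; they do not reproduce the layout process defined by ODA.

```python
from dataclasses import dataclass


@dataclass
class LogicalElement:
    kind: str   # determined by meaning, e.g. "heading", "paragraph", "footnote"
    lines: int  # space the content needs once it is laid out


def lay_out(elements: list[LogicalElement], lines_per_page: int) -> list[list[str]]:
    """Assign logical elements to pages; the medium decides where pages break."""
    pages, current, used = [], [], 0
    for element in elements:
        # Start a new page when the element will not fit on the current one.
        if current and used + element.lines > lines_per_page:
            pages.append(current)
            current, used = [], 0
        current.append(element.kind)
        used += element.lines
    if current:
        pages.append(current)
    return pages


# One logical structure ...
document = [
    LogicalElement("heading", 2),
    LogicalElement("paragraph", 30),
    LogicalElement("paragraph", 25),
    LogicalElement("footnote", 3),
]

# ... and two layout structures produced by different physical constraints.
small_pages = lay_out(document, lines_per_page=40)
large_pages = lay_out(document, lines_per_page=60)
print(small_pages)
print(large_pages)
```

The logical structure is identical in both runs; only the page breaks differ, which is the sense in which layout answers to the medium rather than to the message.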
The concept of logical structure is extended by compound documents that combine active links with databases or other documents in order to incorporate elements such as graphics, text and data into a single document. The links that make possible the compounding of the various parts of the document must be considered a part of the logical structure, as well as the various elements themselves.

Only in the very broadest sense can the layout structure of a document be considered logical, but since both tangible and electronic media place physical constraints on the ways and amount of information that may be recorded on them, choices must be made and intellectually accepted that determine the structure of the message. The traditional layout conventions of book publishing, such as page breaks, margins and the positioning of text and page numbers on the page, for instance, represent a creative and intellectual response to the physical constraints of paper and printing machinery in an effort to balance aesthetics, textual clarity, and economics.

From an archival viewpoint, the limitation of all these views of the document is that they do not address the nature of the document as a record. The definitions driven by retrieval are content-driven. In effect, they define the document by its subjects, as a mere container of topical information. The definitions used by international standards define the document in terms of its structure for purposes of processing but go no further. They are communication standards, designed to facilitate the exchange of the document as information. But to establish whether a document is a record, it is necessary to go further.

Like documents, records have been variously defined.
The data processing definition hardly distinguishes between a record and a document, and calls both a "set of related data or words, treated as a unit",[60] or "in database management systems, a complete set of information."[61] This type of record is composed of discrete fields which, brought together in some intellectual aggregation (such as an address, which might be made up of fields for name, street, city, and postal code), form a complete unit. The data processing record really corresponds to the file. Each of the fields can also be manipulated individually and aggregated through various search techniques to form other data processing records. Another data processing definition of record permits the record to be defined as a data structure which might consist of a combination of other data objects, for example, a number of different types of numbers and a character string. Regardless of the definition, since all information in a computer must be stored as one type of file or another, the purpose of the data processing definition of record is to permit the management of data for whatever processing purposes are required, such as retrieval, storage, and manipulation. This is quite opposite to the purpose of the archival management of the record as evidence. The fact that a record is intended as evidence implies preservation and not manipulation.

[60] ACCIS, Management of electronic records, 176.
[61] Margolis, Personal Computer Dictionary, 402.

An open-ended definition of the record is offered by Robek, Maedke and Brown, according to whom a record is "Recorded information of any kind and in any form."[62] The problem with this definition is that it offers nothing that the definition of document does not already cover, implying that all documents must therefore be records. At this point it is useful to turn once more to the Tree of Porphyry as adapted by Trevor Livelton[63] as a means of conceptualizing the relationship between document and record.
According to Livelton, records are really a species of document. The recorded form of information then becomes the subalternum genus of the document itself, which in turn divides into two differentia of its own: those documents that are made or received, and those that are neither made nor received. Records are an infima species of those documents that are made or received, in that they are produced in the course of some practical activity or transaction. Practical activity has been defined as "an activity whose purpose is not the activity itself but the production of effects capable of influencing situations."[64] This definition of a practical activity is related closely to that of a transaction. The ACCIS glossary offers a broad definition of a transaction as "information, communicated to other people in the course of business, via a store of information available to them."[65] The SAA Glossary defines transaction more strictly as "an act, or several interconnected acts, in which more than one person is concerned, and by which the relations of such persons between themselves are altered."[66] There are two fundamental points to note about this definition. First, there must be a relationship between at least two people, the sender, or creator of information, and the recipient. This relationship must be substantive, or have a separate and independent existence.[67] A transaction, then, is an act that takes place within a context, which in this case must be a substantive relationship. The second point to note is that a transaction consists of an act. The SAA definition hints at the nature of this act as something "by which the relations . . . are altered" between the people participating as sender and recipient. Transactions are therefore the result of practical activities, since the purpose of these is the "production of effects capable of influencing situations." The converse, however, is not necessarily true: not all practical activities result in transactions. A practical activity may be undertaken without producing any change in the status of those participating. An example would be the writing of a reminder to oneself.

[62] Wilmer O. Maedke, Mary F. Robek, and Gerald F. Brown, Information and Records Management (Mission Hills, CA: Glencoe/McGraw-Hill, 1987): 568. This vague definition reflects the management problem of capturing records where there are not the resources to be more exacting.
[63] Livelton, Public Records, 73-75.
[64] Duranti and Eastwood, Preservation of the Integrity of Electronic Records, 1.
[65] ACCIS, Management of electronic records, 184.
[66] Lewis J. Bellardo and Lyn Lady Bellardo, A Glossary for Archivists, Manuscript Curators, and Records Managers (Chicago: Society of American Archivists, 1990): 35.

There is broad agreement on the fundamental nature of the record as recorded information that manifests a transaction or a practical activity. Schellenberg defined the record as "All books, papers, maps, photographs, or other documentary materials, regardless of physical form or characteristics, made or received by any public or private institution in pursuance of its legal obligations or in connection with the transaction of its proper business [italics not those of citation] and preserved or appropriate for preservation by that institution or its legitimate successor as evidence of its functions, policies, decisions, procedures, operations or other activities or because of the information value of the data contained therein."
[68] Jenkinson's definition of archives is similar in its essentials to Schellenberg's idea of a record: "A document which may be said to belong to the class of Archives is one which was drawn up or used in the course of an administrative or executive transaction (whether public or private) of which itself formed a part; and subsequently preserved in their own custody for their own information by the person or persons responsible for that transaction and their legitimate successors."[69] The ACCIS glossary defines the record as "recorded information, regardless of form or medium created, received and maintained by an agency, institution, organization or individual in pursuance of its legal obligations or in the transaction of business."[70] Similarly, the Society of American Archivists' glossary defines the record as "a document created or received and maintained by an agency, organization or individual in pursuance of legal obligations or in the transaction of business."[71]

[67] For this reason, the mere giving of information without the context of a substantive relationship would not qualify as a transaction. For example, newscasts, such as hurricane warnings, frequently compel listeners to take action, but the information would not qualify as a transaction between the broadcaster and the listener because there is no substantive relationship between them (which is not to say the broadcaster is absolved of responsibility for their acts). The listener's life may be altered by the news, but the relationship between the broadcaster and the listener would remain insubstantial and unaltered because it has no existence outside the broadcast itself.
[68] T. R. Schellenberg, Modern Archives: Principles and Techniques (Chicago: University of Chicago Press, 1956; Midway Reprint, 1975): 16.

The idea that a record is a document created or received in the course of a transaction or a practical activity has been carried over into the realm of electronic records.
Within the electronic environment, David Bearman has defined the record as "any communication between one person and another, between a person and a store of information available to others, back from the store of information to a person or between two computers programmed to exchange data in the course of business."[72]

Finally, it remains to establish the relationship between records and archival documents. The distinction has revolved around the precise meaning of "preserved" in the definition of the record. By preserved, Jenkinson meant the retention of documents by their creators for their own purposes. Schellenberg, on the other hand, saw preservation to mean retention for purposes other than those legal and administrative purposes for which the record was originally created, that is, for the secondary values of research. The debate has been important because it has been used as a basis for distinguishing archives from records in that, according to Schellenberg, archives are a special group of records which have been selected by archivists on the basis of values other than those for which they were created.

[69] Hilary Jenkinson, A Manual of Archival Administration, revised edition (London: Percy, Lund & Humphries & Co., 1937): 10.
[70] ACCIS, Management of electronic records, 176.
[71] Society of American Archivists, Glossary, 28.
[72] David Bearman, Managing Electronic Mail, Archives and Manuscripts 22 (No. 1): 38.

Livelton sets this debate aside by equating archives with records. "Selection," he writes, "as has been seen, is implicit in the notion of preservation [in that records must first be selected to be preserved].
And, by leaving it implicit - that is, by refraining from qualifying the notion of preservation - this reformulation of the traditional definition can accommodate, albeit tacitly, both the 'records' and the 'archives' sides of Schellenberg's distinctions between the agents, the recipients, and the purposes of selection for preservation."73 In other words, Livelton's infima species of records is synonymous with archives. The archival document, therefore, defined as a document produced in the course of a practical activity, is synonymous with record.
One other point remains to be established about the nature of the archival document, and that is that records always exist in aggregations. "Archival documents or records are necessarily composed of documents and the complex of their relationships. Because of this, any document, of any nature, which acquires relationships with a group of archival documents or records, is to be considered a record itself, following the fundamental rule which governs every collectivity, according to which each individual entity acquires the nature and characteristics of the whole to which it belongs."74 All records are subject to the concept of the archival bond, defined as "the relationship that, because of the circumstances of their creation, records have with their creator, with the activity in which they participate, and among themselves. The archival bond is originary (it comes into existence when the record is made or received), necessary (it exists for every record), and determined (it is characterized by the purpose of the record)."75
Before proceeding to the issue of the completeness of a record, it is advisable to get a better picture of the electronic context by examining the electronic records management system (ERM) and the role and nature of standards in more depth.
73 Livelton, Public Records, 69.
74 Duranti and Eastwood, Preservation of the Integrity of Electronic Records, 4.
75 Ibid., 4.
CHAPTER ONE
ELECTRONIC DOCUMENT MANAGEMENT SYSTEMS
Definition
An ERM is the data processing environment in which archival documents are created, identified, and preserved within a recordskeeping system. In the sense that the definition of archival documents includes those both made and received, an ERM must make provision for those records produced outside its electronic boundaries by independent recordskeeping systems.
Electronic records management is the control of documents through electronic means. There are two levels of electronic document management. The less complex level is concerned with retrieval only, or the management of documents as "discrete objects", i.e. documents that cannot be modified. This is the focus of bibliographic retrieval systems and optical imaging systems. The more complex level attempts to go beyond management for retrieval only, to control of documents at their creation. Such systems permit documents to be created on the system as well as managed throughout their life cycle.
Electronic document management must be distinguished from electronic records management. This distinction lies in the fact that, as we have seen, not all documents are records. Electronic document management is concerned with the management of documents defined in their broadest sense, with the document as a form of information that includes both bibliographical materials as well as records. Electronic records management, on the other hand, is concerned only with the management of records as a species of documents. Just as electronic document management is the context of electronic records management, so it is with the systems themselves. Electronic document management systems (EDMS) are designed to manage all types of documents. An ERM might therefore be defined as the functionality of an EDMS that is specifically designed to manage records.
In discussing the design and structure of an ERM, it is appropriate for present purposes to refer to the design of an EDMS.76
76 The discussion of an EDMS owes much to the author's experience at the World Bank in Washington, DC, where a major electronic document management system is being designed and implemented at the time of writing. The World Bank is a special agency of the United Nations devoted to fostering economic development in second and third world countries. The system being designed will eventually serve some 10,000 employees. The World Bank EDMS consists of a software for the creation of
Structure of an EDMS
Electronic Document Design Methodology (EDDM) addresses "the creation of a flexible system of architecture, the preservation of system investment, the formulation of EDMS-based user needs, the resolution of EDMS data management anomalies, the integration of diverse data, and the minimization of risk."77 This is the world into which the archivist must venture. There are five components to an EDMS.
1. Architectural design consists of the hardware and software infrastructure within which the EDMS will function. This includes operating systems, networks, hardware platform, input devices, output devices, software utilities and environments, remote connections, and integrated facilities for the collection and dissemination of information.78 For instance, the architecture may specify the technical requirements for such functions as workflow, shared filing, document management, group authoring, and electronic mail.79
2. Database control and organization is the identification and definition of structured access paths. This is a necessary feature of larger systems, although in smaller systems retrieval can be based on direct content search or, in the case of images, by a single identifier.
3.
Application analysis and design identifies how the organization will use and react to the system. From a systems design viewpoint, this is considered the most difficult part.80 For example, in the World Bank, the document management regime has been divided up into four management domains within which documents are created, accessed, and stored. These domains are the personal domain, the work group domain, the business unit domain, and the institutional domain. Each has different rules for the creation, access and storage of documents. The World Bank Group Archives control documents in the business and institutional domains where they can no longer be altered and are intended for long-term preservation.
76 (cont.) documents, a document store managed by a database, and an imaging system to capture paper documents received from outside the Bank.
77 Kolopoulos, Handbook of EDMS, 6.
78 Architecture is defined as "a specification which determines how something is constructed, defining functional modularity as well as the protocols and interfaces which allow communication and cooperation among modules." ACCIS, Management of electronic records, 136. System architecture should not be confused with system configuration which is defined as "the arrangement of a computer system or network defined by the nature, number, and the chief characteristics of its functional units. More specifically, the configuration may refer to a hardware configuration or a software configuration." Ibid., 144. A graphics monitor and a video adapter could be a minimal configuration; the use of international communication standards could be part of the software configuration.
79 Organization and Business Practices Dept., Information Engineering, Electronic Document Management System (presentation by Robert Patt-Corner and Ronald Cutier, Washington, World Bank, 1994): 3.
80 Kolopoulos, Handbook of EDMS, 28-29.
4.
The document definition is the fundamental element that must be specific to each system. The document definition comprises an array of physical and intellectual elements such as the amount of space it may take up, type of document, and delineating features. Not only must the document itself be defined as an entity, but also, each different type of document must be defined if it is to be created and retrieved. The identification of document types is a critical task that requires, first, a rigorous analysis of document types as they are used across an organization as well as between organizations, then, the definition of attributes that will be mapped to the profile. This can be time-consuming.81
5. Document structure is the structure of the document including both logical components and the logical interactions of various document types and document objects,82 and its layout or physical organization. The logical structure or architecture is the way the document is perceived by a user, which may differ from its actual functioning or form. For instance, a paragraph or a chapter is a typical element of logical structure which can only be defined by the writer. The EDMS must define how a paragraph or chapter is to be encoded and retrieved, although the actual instance will be at the discretion of the writer.
81 For instance, a loan or project at the World Bank has been estimated to involve dozens of different document types, all of which must be defined to the EDMS.
82 A data object is a type of data structure which consists of a data type (text, image etc) packaged with programming that enables it to perform certain functions towards a given end. For example, a report consisting of text and tables could be programmed to get the tables from a certain spreadsheet. A document object is a document which also contains programming to make it perform in a certain way.
A document type is simply a document that is repeatedly used frequently enough to be identified as a type. Its form may be more or less strictly prescribed. In the World Bank, for instance, a President's Report is a document type used to present a loan proposal to the Board of Directors and is common to all loans, with its own prescriptive characteristics. An invoice, on the other hand, is a generic document type whose form is not formally prescribed in Bank procedure.
The problem of logical structure is complicated by compound documents in that they consist of a mixture of information configurations, such as text, sound, graphics and data drawn from different applications. For instance, a report could include text taken from several different reports, graphics uploaded from an image bank in a photo library, and tables drawn from a spreadsheet. The logical structure therefore also includes the links between different types of data that may be used in the document and the objects from which they are drawn. The links may not be just between different types of data, but between different objects, or data structures that include functionalities.83 For instance, a document may be made up of various parts, each of which is a different document in its own right. It may also have notes or annotations that may also be kept separately and which may be attached at some point in the writing or co-editing.
The layout of a document consists of its organization into pages, including the use of running heads, typefaces etc. It is important to recognize the distinction between formatting and document structure. Standards such as SGML and ODA are able to preserve the original author format of the document, but "formatting is only useful for viewing of document information in its complete, unabridged version. Once a
user begins to extract parts of the document in the original, formatting is no longer available to facilitate comprehension. In fact it begins to hinder the process of understanding the connections between differing pieces of information that are no longer connected in an authored format."84 There are only two ways to resolve this problem of loss of document organization. Either all documents must be placed into a single repository with a minimum of grammatical structure, or every document collection must be reconciled through a common structure. The second alternative favours the user but places a much greater burden on the system and is much more difficult to implement.85
83 Margolis, Personal Computer Dictionary, 329. For example, a table from a spreadsheet consists of data or numbers that may be manipulated using various commands or functions. This is opposed to the inclusion of an image which cannot be manipulated.
84 Kolopoulos, Handbook of EDMS, 7.
85 Ibid., 8.
It is obviously the second alternative that faces the archivist if the system is to respect both the needs of the creator and the retrieval requirements of users. In fact, these are likely to be synonymous if records are to be created from parts of other documents. The question then becomes, what should be the basis of a document structure that can be translated into machine-readable form, into tables that will permit not only the formatting of documents but their creation and manipulation? Document definition and the document structure are the features of the EDMS design that fall within the ken of the archivist.
Document Profile
The EDMS manages the document throughout its entire life by defining as attributes all the features of a document that reflect its physical and intellectual nature, as well as those others that determine its use, maintenance and disposition in a structure called the document profile.
The document profile is defined as "control information that is associated with a specific document."86 There are really no specific rules for a profile except that the profile tends to contain characteristics that apply to the document as a whole, as does the ODA profile. The profile consists of descriptive rules that may be applied at the time of creation or later when a document has been finalized and is ready for such final capture processes as imaging, but the rules must be agreed at the outset, during system design. One of the most familiar document profiles is the e-mail header. The header in an e-mail message always identifies the sender of the message, the addressee, and the time, and provides proof of the transaction.87
An attribute is defined as "a property or characteristic of one or more entities, for example, colour, weight, sex."88 In terms of records, the author, date, or title of a document are elements that may also be construed as attributes. Crucial to an understanding of their function is that attributes are managed independently of content. They can be read or changed without changing the content of the document. The document profile constitutes a family of different groups or types of attributes that control the management of the document.
86 ITFIE, Information Management Architecture, 5.
87 The distinction must be made between the document profile and the ability of an application to format. Document profiles include the logical and layout characteristics of a document which in its original form, as created on the native application, may be defined by the application itself (e.g. maximum number of characters in a field, range of typefaces etc) but to qualify as a profile, they must be resident in the system external to any one application and capable of being imposed on any document generated in the system.
88 ACCIS, Management of electronic records, 138.
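The e-mail header mentioned above gives a convenient way to see what it means for attributes to be managed independently of content. The following is a minimal sketch using Python's standard `email` package; the message, the addresses, and the helper name `read_profile` are invented for illustration and are not drawn from any system discussed in the thesis.

```python
from email import message_from_string

# Hypothetical message; the addresses and wording are invented for illustration.
RAW_MESSAGE = (
    "From: originator@example.org\n"
    "To: recipient@example.org\n"
    "Date: Mon, 02 Oct 1995 09:30:00 -0700\n"
    "Subject: Draft report\n"
    "\n"
    "The draft report is ready for review.\n"
)

# The header fields treated here as the "profile" of the message.
PROFILE_FIELDS = ("From", "To", "Date", "Subject")


def read_profile(message):
    """Return the header attributes without touching the content."""
    return {name: message[name] for name in PROFILE_FIELDS}


message = message_from_string(RAW_MESSAGE)
content_before = message.get_payload()

# Change one attribute value; the body is held separately from the headers.
message.replace_header("Subject", "Final report")

profile = read_profile(message)
content_after = message.get_payload()
```

After the change, `profile["Subject"]` holds the new value while `content_after` is identical to `content_before`, which is the property the text attributes to profile attributes in general.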
Attributes fall into various groups. Contextual attributes may define provenance, such as the organizations and individuals responsible for the document, the type of business involved, the document type, or dates of creation. Management attributes specify access, security, storage and classification, and the extent to which a document may be processed or modified. Physical attributes may define types of data structures (such as graphics, or text), the size of the file, and interchange formats. Attributes may also be mandatory or optional, user-defined or system-defined. Each open document exchange standard has its own requirements and therefore its own set of attributes. For instance, the Document Filing and Retrieval Standard (DFR) has its own attributes which are defined as data items that identify a DFR-object, describe its DFR-content, help control access to it, or are in some way associated with the DFR-object.89
Without doubt, the document profile is at the very heart of the electronic document management system. It is here that the record itself will be defined and, on the basis of those attributes, retrieved and managed. The profile is independent of the content of the document. For instance, when a document is created, its profile is the first thing to be defined and modified. This provisional profile, parts of which will be system-defined (such as identification of originating unit, business process, or document type) and others optional (such as a title), is subsequently linked to the content, which together form the document. In the same way, if the document is not preserved, all that may be retained will be the profile; the content will be eradicated, leaving an empty shell.
The concept of a document profile is not confined only to electronic systems: it is characteristic of all documents. For example, everyone who writes a letter does so with certain rules in mind about how it should look in order to be understood as a letter, i.e.
the inclusion of a salutation and closing etc. In this case, the "profile" is assumed or understood by the creator. Medieval notaries developed formularia which set out the rules for the creation of various types of documents by prescribing standard formulas, such as wordings or "boilerplate" text. These are the ancestors of the style guides and standard document forms that are today a common feature of life wherever a document is required as evidence of a transaction and are a form of profiling.90 In electronic information systems, the document profile resides as a functionality that must be invoked before a document is to be created.
89 ISO, DFR: Part 1: Abstract Service Definition and Procedures, 5.
Role of Metadata
The attributes are metadata, defined as information about information. The concept of metadata presumes that all information is made up of two parts: a conceptual form, or its description, and the actual occurrence of information fitting that description. A particular instance of the class of information indicated by the attribute type is called an attribute value. For instance, the attribute "date of creation" will define what constitutes a date of creation and in what form it may be recorded as data; the entry of an actual date of creation will be an attribute value. The attribute "author's name" becomes a particular item of metadata and the actual name of an author becomes a specific occurrence. There is no limit to the characteristics and functionalities of a document that can be established in the profile but defining the attributes is by no means easy. The document profile forms a relational database. This means that each attribute must be unique, cannot be confused with another, and must consist of a highly specific, quantifiable value that can be manipulated by the EDMS with consistent accuracy and results.
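The distinction between an attribute type (the description) and an attribute value (the occurrence) can be sketched in a few lines. This is only an illustration under stated assumptions: the class names `AttributeType` and `ProfileDefinition`, the attribute names, and the choice of ISO 8601 dates are all hypothetical and are not taken from ODA, DFR, or any other standard discussed here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable


def parse_name(raw: str) -> str:
    """Normalize whitespace and reject an empty author's name."""
    name = " ".join(raw.split())
    if not name:
        raise ValueError("author's name must not be empty")
    return name


@dataclass(frozen=True)
class AttributeType:
    """The description: what the attribute is and what form its value takes."""
    name: str
    description: str
    parse: Callable[[str], Any]


class ProfileDefinition:
    """A set of attribute types in which every attribute name is unique."""

    def __init__(self, attribute_types):
        self._types = {}
        for attribute_type in attribute_types:
            if attribute_type.name in self._types:
                raise ValueError(f"duplicate attribute: {attribute_type.name}")
            self._types[attribute_type.name] = attribute_type

    def values_for(self, raw_values):
        """Turn raw entries into attribute values, rejecting undefined attributes."""
        unknown = set(raw_values) - set(self._types)
        if unknown:
            raise KeyError(f"undefined attributes: {sorted(unknown)}")
        return {name: self._types[name].parse(raw) for name, raw in raw_values.items()}


# Hypothetical profile definition with two attribute types.
DEFINITION = ProfileDefinition([
    AttributeType("date_of_creation", "ISO 8601 calendar date", date.fromisoformat),
    AttributeType("author_name", "non-empty personal name", parse_name),
])

# Attribute values: actual occurrences fitting the descriptions above.
values = DEFINITION.values_for(
    {"date_of_creation": "1995-10-02", "author_name": "  A. N.  Author "}
)
```

An entry such as "October 1995" would be rejected by `date.fromisoformat`, which is one way of enforcing the requirement that each value be highly specific and consistently interpretable by the system.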
It is obviously preferable to avoid as far as possible user-defined attributes whose values are subject to human inconsistency, incompleteness, or inaccuracy. This is the problem of transparency: the more automatic the values of the profile, the less reliance on user input and the easier it is to use. Ideally, the profile should be defined in such a way that it depends on the user as little as possible.
90 It might well be argued that it is virtually impossible to sit down and write anything without quickly becoming aware of rules of formulation. This is true even of personal diaries, which have a recognized form even though they are not intended to be communicated to anyone but the author. The truth seems to be that communication itself, whether public or private, has implicit rules, and writing in particular, as one of the most structured forms of communication, especially so.
In the design of any database, the definition of the metadata is an exacting process that will determine ultimate success, all other elements being equal. It is therefore appropriate here to be careful about the use of the term metadata. Unless there are exponential leaps in artificial intelligence, EDMS will be successful only in dealing with highly definable data that is as little dependent on user-definition as possible. For instance, no EDMS is capable of grasping the abstract concept of a record series. The identification of a record series depends on the ability to recognize a number of physical and intellectual characteristics whose aggregation is arbitrary and whose occurrence cannot be defined with consistency. While series may be defined as metadata, one must be careful to recognize that all one is really doing is including human-generated metadata in an automated system without contributing to or enhancing its accuracy.
Metadata, in the case of a document profile, should be thought of as including only those attributes that can be defined in terms of automatic system manipulation.91
Distributed Database Management
It is very likely that the EDMS will operate within the environment of distributed data processing in which some or all of the processing, storage and control functions, in addition to input and output functions, are situated in different places and connected by transmission facilities. The EDMS may function as a distributed database which is a logical database that has been divided among physical locations within a distributed information system,92 or in other words, is the same database operating out of different physical locations. For example, in the World Bank, a system called Integrated Records and Archives Management (IRAMS) is used to control the accessioning of inactive records by the Archives, the receipt of project records by information service centers, and the documenting of Bank reports and publications by the Internal Documents Unit. Each of these units operates out of different locations and has different requirements, all of which must be captured as attributes in IRAMS.
91 This example leaves aside the question of whether series would continue to exist in an electronic document management system.
92 ACCIS, Management of electronic records, 149.
A more complex manifestation of the concept of distributed data processing is the federated database. A federated database is a system which enables searching and/or display of structured data stored in decentralized, heterogeneous databases. Software and data structures do not have to be compatible, but the fields in the databases must have compatible semantics and these must be defined in a consolidated repository.
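The federated arrangement described above can be pictured as two stores with different local field names and a consolidated repository that records which local field carries which shared meaning. The sketch below is a toy under stated assumptions: the store names, field names, and records are invented, and real federated systems involve far more than a lookup table.

```python
# Two autonomous stores with incompatible local field names (hypothetical data).
ARCHIVES = [
    {"creator": "Loan Department", "ttl": "President's Report"},
    {"creator": "Legal Department", "ttl": "Loan Agreement"},
]
DOCUMENTS_UNIT = [
    {"originating_unit": "Loan Department", "title": "Annual Review"},
]

DATABASES = {"archives": ARCHIVES, "documents_unit": DOCUMENTS_UNIT}

# Consolidated repository: shared semantic field -> local field, per database.
REPOSITORY = {
    "archives": {"author": "creator", "title": "ttl"},
    "documents_unit": {"author": "originating_unit", "title": "title"},
}


def federated_search(field, value):
    """Search every database on a shared field and return rows in shared terms."""
    hits = []
    for db_name, rows in DATABASES.items():
        mapping = REPOSITORY[db_name]
        local_field = mapping[field]
        for row in rows:
            if row.get(local_field) == value:
                # Translate the local row back into the shared vocabulary.
                shared = {common: row[local] for common, local in mapping.items()}
                hits.append((db_name, shared))
    return hits


results = federated_search("author", "Loan Department")
```

Neither store needs to change its own structure; only the repository has to know that `creator` and `originating_unit` mean the same thing.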
A federated database system is made up of cooperating but autonomous databases.93 The control of documents within a distributed data processing environment relies on a database management system. A database management system is a software product that controls a data structure containing interrelated data stored so as to optimize accessibility, control redundancy, and offer multiple views of the data to multiple application programs. Database management systems also implement data independence to varying degrees.94 At the heart of the database management system are the data dictionary and the data directory. The data dictionary is a repository of information about the definition, structure, and usage of data, or in other words, a library of metadata. It does not contain the actual data itself. In effect, the data dictionary contains the name of each data type (element), its definition (size and type), where and how it is used, and its relationship to other data. The data directory is a structured description of the relationships between data in a database, such as cross-reference information showing which programs access which data or which departments within an organization receive which reports.95
93 ITFIE, Information Management Architecture, 99.
94 ACCIS, Management of electronic records, 149.
95 Ibid., 149.
Open Document Processing and the Role of Standards
Distributed data processing, whether consisting of only one database operating out of different physical locations, or different databases that cooperate as a single system, raises the problem of standardization. This is exacerbated in a situation where databases that are not part of the same system must try to communicate and share the same information. Open document processing is the use of standardized information formats to create a common understanding
between originator and recipient about the information being interchanged.96 International communication standards are meant to facilitate the exchange of documents between different systems, which may or may not mean different organizations or, in archival terms, records-creators.
Figure 1. The Basic Processing Model.97 [Diagram labels: Source Document Type, Mapping Specification, Result Document Type; Source Document, Processing Step, Result Document.]
Open document processing functions by separating the information in the document from its layout or organization. In more specific terms, this requires "the clean separation between the document logic and the layout control of a document, i.e., the separation between the original document information (i.e. the structure and semantic categories of the information that the author has in mind and wants to convey to the mind of the reader) and the control information for processing steps to be performed on this original information, particularly control information for an automatic formatting program. This separation is important because . . . the creation of a document need not be performed at the same place or at the same time as the further processing steps of this document."98
96 Bormann & Bormann, Standards for open document processing, 149.
97 Ibid., 152.
98 Ibid., 149.
The heart of open document interchange is the automatic processing step which transforms the source information into the corresponding result information. This concept can be clearly understood from the basic processing model in which a source document goes through an automatic processing step in order to be received as a result document. Both source and result documents are generated from a pre-determined document type which must be mapped to the specifications of the automatic processor in order to be transmitted and interpreted correctly.99
For this to happen between communicating databases, the attributes of both the result and the source document must be agreed in the document profile, or in other words, standardized. Since many documents share similar characteristics, the support of sets of rules which can be used to control, through standardization, the creation and processing of specific documents of the same type is a necessity. These rules can be used to define both source and result documents.
Commercial applications of open document processing distinguish two application environments: the publishing environment, and the office environment. The publishing environment is concerned with the processing and distribution of publications of all kinds; the office environment provides for "the processing, forwarding and turn-around of business documents (such as letters, forms, and reports) as a routine form of communication . . . including bi-directional routine forms of communications . . ."100 These two environments have different emphases. In the publishing environment, the emphasis is on interchange of manuscripts between author and publisher, and between publisher and printer. The formatting of the document for layout is of prime importance. The author in the publishing environment has no control over layout which is controlled by the publisher. The document product will last a long time, so the time taken to prepare the document is of less importance, and the product is likely to be reproduced in different formats.
In the office environment, instead, the main emphasis is "on the blind interchangeability" of documents. That is, "it must be ensured that documents can be interchanged routinely between arbitrary originators and recipients without special pre-agreements between them and without restricting the ability to process the document by the recipient."101 There can be no delays in producing documents so there cannot be elaborate encoding of processing requirements.
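The basic processing model discussed above, in which a source document passes through an automatic processing step governed by an agreed mapping, can be sketched as follows. This is a deliberately small illustration and not a rendering of ODA or SGML: the element types, the layout rules, and the function name `process` are assumptions made for the example.

```python
# Source document: logical structure only (element type, text), no layout.
SOURCE_DOCUMENT = [
    ("title", "Loan Proposal"),
    ("paragraph", "The borrower requests funding."),
]

# Mapping specification: how each logical element type is laid out in the result.
# Both parties must agree on the element types for the mapping to work.
MAPPING_SPECIFICATION = {
    "title": lambda text: text.upper(),
    "paragraph": lambda text: "    " + text,
}


def process(source, mapping):
    """Automatic processing step: apply the mapping to every source element."""
    lines = []
    for element_type, text in source:
        if element_type not in mapping:
            # An element type outside the agreed document type cannot be processed.
            raise KeyError(f"no mapping for element type: {element_type}")
        lines.append(mapping[element_type](text))
    return "\n".join(lines)


result_document = process(SOURCE_DOCUMENT, MAPPING_SPECIFICATION)
```

Because the source holds only logical structure, the same source could be processed with a different mapping at another place or time, which is the point of separating document logic from layout control.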
Another major difference is that the originator, or author, of the document usually wants strict control over the layout of the document. The directives that define the document layout must therefore be interchanged together with the document content.
99 Ibid., 152.
100 Ibid., 150.
101 Ibid., 150.
The extended processing model of open document processing accommodates the need for extensions to enable documents to have the functionalities of both the office and publishing document environments. This is made possible by the use of extensions to the document modeling standards of ODA and SGML and fosters the convergence of these two basic standards. In the extended processing model, the source and result documents are defined by pre-determined conceptual documents, and then mapped to the specification of the automatic processor provided by the standard. In order not to complicate the description of documents by adding all the necessary information needed to define extensions, the functionalities of the extensions are added in additional processing steps mapped from a transit document.102
The most important standards for the purposes of open document exchange are those set by the International Organization for Standardization (ISO). These standards are part of a whole set of standards that govern the complete process of data interchange between computer systems. At the heart of ISO communications is the Open Systems Interconnection (OSI) Reference Model. The OSI model is a communication reference model that has been defined by ISO. It is a seven-layered communications protocol103 intended as a standard for the development of communications systems worldwide. From bottom to top, the layers of the OSI model are:
• Layer 1 - Physical Layer. The physical layer defines the actual set of wires, plugs and electrical signals that connect the sending and receiving devices to the network.
• Layer 2 - Data Link Layer. The data link layer is responsible for gaining access to the network and transmitting the physical block of data from one device to another. It includes the error checking necessary to ensure an accurate transmission.
• Layer 3 - Network Layer. The network layer establishes the connection between two parties that are not directly connected together. For example, this layer is the common function of the telephone system.
• Layer 4 - Transport Layer. The transport layer is responsible for converting messages into the structures required for transmission over the network. A high level of error recovery is also provided in this layer.
• Layer 5 - Session Layer. The session layer establishes and terminates the session, queues the incoming messages and is responsible for recovering from an abnormally terminated session.
• Layer 6 - Presentation Layer. The presentation layer is used to convert one data format to another, for example, one word processor format to another or one database format to another.
• Layer 7 - Application Layer. The application layer is the top layer. It is the set of messages that application programs use to request data and services from each other. Electronic mail and query languages are examples of this layer.
102 Ibid., 158-159.
103 A protocol is an "agreed format for transmitting data between two devices. The protocol determines . . . the type of error checking to be used; how the sending device will indicate that it has finished sending a message; and how the receiving device will indicate that it has received a message." Margolis, Personal Computer Dictionary, 389.
The OSI model is not complete because it does not define standards for user applications that lie above and beyond the Application Layer. OSI does not attempt to define standards for types of application such as spreadsheets, or word processing or EDMS.
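The layered arrangement listed above can be illustrated with a toy in which each layer on the sending side wraps what it receives from the layer above, and the matching layer on the receiving side unwraps it. The bracketed tags are purely an assumption made for this sketch; real OSI protocols define their own headers and services at each layer.

```python
# Layers from top (7, application) to bottom (1, physical).
LAYERS_TOP_DOWN = [
    "application",
    "presentation",
    "session",
    "transport",
    "network",
    "data link",
    "physical",
]


def send(message):
    """Pass a message down the stack; each layer adds its own wrapper."""
    unit = message
    for layer in LAYERS_TOP_DOWN:
        unit = f"[{layer}]{unit}"
    return unit


def receive(unit):
    """Pass a unit up the stack; each layer removes only its own wrapper."""
    for layer in reversed(LAYERS_TOP_DOWN):
        tag = f"[{layer}]"
        if not unit.startswith(tag):
            # Sender and receiver stacks do not match layer for layer.
            raise ValueError(f"expected wrapper for layer: {layer}")
        unit = unit[len(tag):]
    return unit


on_the_wire = send("query: loan reports")
delivered = receive(on_the_wire)
```

Each function deals only with its own tag and never inspects the others, so one layer's wrapper could be changed without touching the rest of the stack.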
What it does provide is the ability for these packages to communicate.

The OSI model depends for its functionality on a number of principles. First, the layers of the model must be exactly duplicated for both sender and receiver. Second, each layer must be self-sufficient, that is, independent in its functioning from any other layer and complete in itself. To make use of its functionalities, one has merely to plug in; it is not necessary to know anything about the layers beneath. This means that the model is essentially modular: standards can be developed at each layer without affecting the other layers or having to redesign all or part of the whole model. Finally, the same feature of the document profile that separates the description of data from its actual occurrence applies to standards. Standards are metadata meant to govern the design of an actual occurrence or implementation.

The document management standards that are the subject of this thesis are those found at the top of the OSI model, in Layer 7, the Application Layer. These standards address different types and functionalities of document communication and form, a considerable family. SGML (Standard Generalized Markup Language) is designed to permit the exchange of documents that are to be published. ODA (Office Document Architecture) is designed to handle office documents. ASN.1 (Abstract Syntax Notation One) is a standard language for defining data structures, which may then be encoded using binary encoding rules. CCF (Common Communication Format) is a standard for the transfer of bibliographic cataloguing and abstracting information between computer systems, originally produced by the UNESCO General Information Programme. MARC AMC is another standard used to transfer bibliographical and archival data.
Facsimile transmissions are governed by their own set of standards, while the X.500 standards govern remote directory services. Retrieval from remote storage on servers is governed by the DFR (Document Filing and Retrieval) standard and the FTAM (File Transfer, Access and Management) standard. The transfer of graphics and images is governed by yet other standards.

CHAPTER TWO

DIPLOMATICS

Electronic document management takes place at the document level, and archivists need a tool that will help them work at the level of the document. The science of modern diplomatics, as proposed by Luciana Duranti, offers just such a tool. As Duranti points out, "The boundary lines between the two disciplines is to be found in the series, the fonds, the archives as a complex of documents, as a whole, which constitutes the area of archival science. Instead, the single document, the elemental archival unit, is the area of diplomatics."[104]

The general theory of diplomatics defines diplomatics as "the discipline which studies the genesis, forms, and transmission of archival documents, and their relationship with the facts represented in them and with their creator, in order to identify, evaluate, and communicate their true nature."[105] By focusing on the "true nature" of documents, this definition goes well beyond the original purpose of diplomatics as it developed up until the French Revolution, which was "strictly linked to the need to determine the authenticity[106] of documents, for the ultimate purpose of ascertaining the reality of the rights or truthfulness of the facts represented in them,"[107] and even further beyond the nineteenth-century use of historical diplomatics as a tool of documentary criticism. The ability to determine the authenticity of documentary sources by the study of their forms and genesis will remain fundamental to the value of diplomatics.
Indeed, "as public officials who are professionally knowledgeable of the nature of records, archivists still have an important role to play in guaranteeing the authenticity of documents and may see that role grow in significance as they acquire machine-readable records."[108] [Note 104: Duranti, Diplomatics, Archivaria 28, 10.] [Note 105: Ibid., 10.] [Note 106: The diplomatically authentic document is defined as "those which are written according to the practice of the time and place indicated in the text, and signed with the name(s) of the person(s) competent to create them." Ibid., 17.] [Note 107: Duranti, Diplomatics, Archivaria 28, 17.]

But changes in the circumstances of document creation and the role of archives have led to a reassessment of the potential applications of diplomatics. Duranti writes that the principles and methods of diplomatics as they were developed in the nineteenth century cannot be readily applied to modern documents because of the "plurality and fragmentation of our sources, and because the formalism of old bureaucracies has atrophied in modern ones, creating forms of documents which do not often lend themselves to systematic analysis and description."
Yet despite a proliferation of laws and administrative bodies, the application of diplomatics is favoured by the "growing uniformity of laws, regulations, and structures, and of the way these activities are carried out because of the standardization promoted by records management, which is vital to an elephantine bureaucracy, and because freedom of information, underlining the accountability of administrative bodies and the citizen's right to control their activities, favour a better organization and determines the spreading of the knowledge of our social system, knowledge which is losing its elitist character."[109] Of particular consequence is the recognition that the boundary between records management and archives is a nineteenth-century historical aberration that can have no place in the management of electronic records.

Duranti believes that there is a particularly urgent need to apply diplomatic principles to electronic documents, where the central concern should be to ensure that records are not only authentic but also, even more important, reliable.[110] [Notes 108 and 109: Ibid., 23. Ibid., 9.] [Note 110: A record is considered reliable "when it can be treated as fact in itself, that is, as the entity of which it is evidence." Luciana Duranti, Reliability and Authenticity: the Concepts and their Implications (unpublished paper, 1995): 3.] "The easiness of electronic records creation and the level of autonomy that it has provided to records creators, coupled with an exhilarating sense of freedom from the claims of bureaucratic strictures, procedures and forms, have determined the sloppiest records creation in the history of records making. Too many persons and too many records forms generated in too many different contexts participate in the same transaction; too much information is recorded; too many duplicates are preserved; and too many different technologies are used.
In a word, electronic records, as presently generated, might be authentic, but they are certainly not reliable."[111]

These potentials are inherent in diplomatics as a science defined not by time and place, or by its application for historical, legal, or administrative purposes, but by the nature of its subject, the archival document. The archival document is broadly defined as a document made or received by a physical or juridical person in the course of a practical activity.[112] This definition distinguishes the archival document from the broader category of the written document, which is defined as "evidence . . . produced on a medium . . . by means of a writing instrument . . . or of an apparatus for fixing data, images and/or voices,"[113] the term "written" referring not to the physical act of writing, but rather to the "purposes and intellectual results of writing: that is, the expression of ideas in a form which is both objectified (documentary) and syntactic (governed by rules of arrangement)."[114]

Diplomatics posits that all written documents convey their information by means of rules of representation which are in themselves evidence of the intention to convey information. "These rules, which we call the form, reflect the political, legal, administrative, and economic structures, culture, habits, myths of society, and constitute an integral part of the written document, because they formulate or condition the idea or facts which we take to be the content of the documents." The important point is that these rules are independent of content. "The form of a written document is . . .
the whole of its characteristics which can be separated from the determination of the particular subjects, persons, or places it is about."[115] This separation of form and content is of profound significance for document management because it shifts the focus away from the document as an object of generic information retrieval to management of the document as a record, or archival document. [Note 111: Ibid., 9-10.] [Note 112: Duranti, Diplomatics, Archivaria 28, 15.] [Note 113: Ibid., 15.] [Note 114: Ibid., 15.] [Note 115: Ibid., 15.]

The diplomatic concept of form is a broad concept that should not be confused with the familiar connotations of form as an "arrangement of parts", a "shape", or a formula.[116] The diplomatic concept of form is actually the expression of a system of elements of which the document is the physical and intellectual manifestation. These elements are the juridical system, or social system organized according to a system of rules, which constitutes the context of document creation and gives it meaning and relevance; the act, or the movement of the will that gives origin to the document; the persons who participate in the creation of the document; and the procedures, or the genetic process by which the document is drawn up. All these elements are given expression in the documentary form, "which allows document creation to achieve its purpose by embracing all the relevant elements and showing their relationship."[117]

At the very heart of diplomatics lies the idea that all documents can be analyzed and understood in terms of this system of elements. Conceptually, that is, in terms of general diplomatics, or the theory of diplomatics, these elements are universal in their application and independent of any context of time and place; in other words, they are decontextualized in nature. If this is true, then these system elements would also have the character of metadata, that is, their definition is independent of any particular occurrence of the data.
In terms of special diplomatics, or the critical application of diplomatic theory to specific situations, these elements should then become capable of application within electronic document management systems as metadata within the register or document profile that can be used to define the data entry rules for the capture of records.

JURIDICAL SYSTEM

Diplomatics holds that all documents are created within the context of binding rules according to which social groups organize themselves. A juridical system is defined as a "collectivity based on a system of rules".[118] The system of rules is the legal system, and could be composed of customs, statutes, traditions, or even beliefs.[119] [Note 116: Oxford Modern English Dictionary, 415.] [Note 117: Luciana Duranti, Diplomatics: New Uses for an Old Science (Part IV), Archivaria 31 (Winter 1990-1991): 10.] [Note 118: Luciana Duranti, Diplomatics: New Uses for an Old Science, Archivaria 29 (Winter 1989-90): 5.] It is the juridical system which is the ultimate context of document creation, permitting us to recognize a document as something we might expect to encounter under given circumstances, and thus imparting relevance and meaning.

Understood as an abstraction, the concept of the juridical system is extremely flexible in its application and has the decontextualized nature of a theoretical construct: the system of laws could be those of any form of organization known to man as long as it has rules in some form: a modern bureaucratic state, a primitive tribe, a trade, a cult, or a family, at any time, in any place. Nor is the juridical system restricted to any particular physical or intellectual form of documents: the documents could be written or oral or be sacred objects or tokens.
A juridical system is the broad context within which the creation of a record is validated; however, it is an abstract context that should not be confused with the recordskeeping system itself, or its rules or means of control and regulation. The juridical system provides the broad legal context which sanctions the existence of a recordskeeping system. It cannot, therefore, be the recordskeeping system itself. A policy and procedure manual governing the maintenance of a recordskeeping system is not an element of the juridical system per se, but a belief that written rules are necessary to demonstrate and exercise control is reflected in the existence of such documents and may be characteristic of the juridical system. The distinction between the juridical system and a recordskeeping system applies equally to open document exchange standards as instruments of the recordskeeping system.

FACTS

Another vital diplomatic concept is that of facts. Diplomatics holds that all archival documents are created with a specific purpose in mind, that is, with intentional consequences, and therefore embody deliberate facts or, in diplomatic parlance, acts.[120] [Note 119: For instance, the World Bank is a juridical system in its own right whose creation of documents is governed by such factors as the Articles of Agreement, its own administrative laws and organizational competencies, the relationship with the legal systems of its members, and the rules and customs of banking and international financial transactions.] [Note 120: Facts themselves need not be deliberate.] Apart from the written archival document having a medium, form and content, it also implies "either the presence of a fact and a will to manifest [the fact], or of a will to give origin to a fact. It also indicates a purpose.
In fact, the existence of something written, directly or potentially, determines consequences, that is, it can create, preserve, modify or extinguish situations."[121] [Note 121: Duranti, Diplomatics, Archivaria 28, 16.]

Juridical systems attempt to anticipate facts, which are events or occurrences that have consequences. Those occurrences caused by humans are known as human facts; those, such as natural disasters, over which humans have no control, are known as natural facts.[122] Human facts which are the result of a deliberate intention are critical to the concept of the archival document. "Among human facts in general, the special type of fact which results from a will determined to produce it is called an action or act. The operation of will distinguishes an act from any other general fact. . . In other words, an act is a fact originated by a will to produce exactly the effect that it produces."[123]

For a document to be a record, it must manifest an act, or be the expression of a deliberate will to act. Those acts which are limited simply to the accomplishment of the act, known as mere acts, do not give rise to records: a decision to go for a walk, for example, would qualify as a mere act if the only intent was to get a little exercise. To give rise to a record, an act must have the character of a transaction, defined as "a declaration of will directed towards obtaining effects recognized and guaranteed by the juridical system."[124] The making out of an invoice, the drawing up of a bylaw, or the swearing of an oath are all examples of types of actions that would qualify as transactions because they can only be undertaken if their conceptual meaning is anticipated and accepted by the juridical system. For this reason, all the actions of public bureaucracies are construed as transactions.

Because of the relationship between acts and records, it is difficult to know the nature of a document without an understanding of the act in which it is involved. Diplomatics has
developed a taxonomy of decontextualized acts that is intended to encompass the whole range of potential administrative situations in which documents might participate as records. [Note 122: The pejorative phrase "facts of life" reflects this fundamental distinction, where humans are thought not to have control over certain parts of their own behaviour because of the role of inheritance, physiology, and instinct.] [Note 123: Duranti, Diplomatics, Archivaria 28, 6.] [Note 124: Ibid., 7.]

If the nature of records is determined by their participation in acts, their value as evidence depends on the kind of relationships they have with those acts. On this basis, it is possible to distinguish the relative weight of documents in so far as they participate in a given act. Thus, documents which give direct expression to the act (i.e. without which the act could not exist) are called "dispositive", and are considered juridically relevant, for example, a loan agreement, title deed, or sentence of consecration. The same is said of "probative" documents, which attest that an act occurred, for example, an oath or an invoice. Documents whose written form is not required but which are generated spontaneously in carrying out an act, such as a consultant's report or a discretionary memorandum of advice, are "supporting" and can only contribute indirect evidence of the actions they concern. Finally, "narrative" documents, which are about actions but neither take part in them, attest to them, nor support them, have no force as evidence of them.[125] The categorization of documents by their degree of relevance as evidence is tremendously important in analyzing the documents of organizations that are bureaucratic in nature.
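The four categories above can be sketched as a simple decision rule. The sketch below is only an illustration under stated assumptions: the names `DocumentCategory`, `EVIDENTIARY_WEIGHT`, and `categorize`, and the three yes/no questions it asks, are hypothetical and paraphrase the distinctions drawn in the text; they are not attributes defined by ODA, DFR, or SGML.

```python
from enum import Enum


class DocumentCategory(Enum):
    DISPOSITIVE = "dispositive"
    PROBATIVE = "probative"
    SUPPORTING = "supporting"
    NARRATIVE = "narrative"


# Evidentiary weight of each category, paraphrasing the text above.
EVIDENTIARY_WEIGHT = {
    DocumentCategory.DISPOSITIVE: "juridically relevant",
    DocumentCategory.PROBATIVE: "juridically relevant",
    DocumentCategory.SUPPORTING: "indirect evidence",
    DocumentCategory.NARRATIVE: "no evidentiary force",
}


def categorize(act_requires_document: bool,
               attests_act: bool,
               arises_in_carrying_out_act: bool) -> DocumentCategory:
    """Classify a document by its relationship to the act (hypothetical rule)."""
    if act_requires_document:
        # The act could not exist without the document, e.g. a title deed.
        return DocumentCategory.DISPOSITIVE
    if attests_act:
        # The document attests that an act occurred, e.g. an invoice.
        return DocumentCategory.PROBATIVE
    if arises_in_carrying_out_act:
        # Not required, but generated while carrying out the act.
        return DocumentCategory.SUPPORTING
    # Merely about the act.
    return DocumentCategory.NARRATIVE
```

In practice the answers to these questions depend on the juridical system concerned, so a rule of this kind could only be applied after the act itself has been identified.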
Diplomatics recognizes that bureaucratic facts have a special character: "they are juridical acts directed to the obtainment of effects recognized and guaranteed by the system, that is, they are transactions."[126] Only those documents that are "reliable and complete, that is, able to convey information, capable of being used in a transaction, and of reaching the purposes for which they have been produced, are transactions," and can be called records in this context.[127] The distinction lies in whether a document is the result of a procedure or a process. A procedure is "the body of written or unwritten rules whereby a transaction is effectuated, and comprises the formal steps to be undertaken in carrying out a transaction."[128] [Note 125: Duranti, Diplomatics, Archivaria 29, 9.] [Note 126: Ibid., 12.] [Note 127: Ibid., 12.] The documents created in the course of procedure are "at one" with it and must be identified as such so that [...] various activities of an organization. Precisely because it is a term that can be defined to describe individual circumstances, "business process" has a rather variable meaning.[131] This, in itself, is not necessarily a serious disadvantage. As Luciana Duranti has pointed out, recordskeeping systems exist to serve the aims and purposes of records creators - not vice versa. Provided the system is able to tie records to acts in such a way that the records creator can recognize them and is able to distinguish records involved in one type of transaction from another, the theory of acts has been realized in practice.

CREATION PROCESS

Closely allied to the principle that all records give expression to acts is the idea that all bureaucratic documents are the result of a creation process or procedures. Two conceptual moments are common to all records: the moment of action and the moment of documentation. The moment of documentation is that point in time when the action is documented.
The moment of action is when the decision is taken to act and the command to prepare a document is given. These two moments are recognized in international open document exchange standards by such attributes as SGML Release Date and ODA Creation Date and Time, and Release Date and Time (SEE Thesaurus - dates). Depending on the type of act, these two moments may occur simultaneously in the same document. For instance, a dispositive document unites the moment of action and the moment of documentation. These are separated in the probative document, where the document records an action that has already taken place. In medieval documents, the moment of action and the moment of documentation were usually united in the same document. A single document would encompass an entire act, from start to finish. Modern documentation, on the other hand, is fragmented: bureaucratic procedures are complex, involving many different acts, all of which may generate documents of their own. [Note 131: For example, business process at the World Bank is an attribute of the document profile. The value consists of cost-accounting codes assigned by the Controller, e.g. Lending - Pre-Appraisal (of loans). Some of these lump several different activities together, others go into considerable detail. The advantage of the use of business process in this context is that it encompasses the entire organization, reflects activities as they are understood and carried out, and, since it is tied directly to budgets and expenditures, must be used by everyone and is likely to be maintained and policed on a regular basis.]

PROCEDURES

All records are the result of a procedure, which is defined as a series of formal steps undertaken to give effect to an action.
Procedures are peculiar to every type of act and the juridical context in which the act takes place, but diplomatic analysis suggests that all procedures can be classified into four basic conceptual categories: organizational, instrumental, executive, and constitutive. Organizational procedures are "aimed at the establishment of organizational structure and internal procedures, and their maintenance, modification or extinction."[132] Executive procedures are "those that allow for the regular transaction of affairs within limits and according to norms established by a different authority."[133] Instrumental procedures are those connected with opinions or advice, while executive procedures allow for "the regular transaction of affairs within limits, and according to norms already established by a different authority."[134] Constitutive procedures are those which create, extinguish, or modify the exercise of power, and comprise a family of sub-types consisting of procedures of concession, limitation, and authorization.[135] [SEE Thesaurus - procedures for a complete list]. [Note 132: Luciana Duranti, Diplomatics: New Uses for an Old Science (Part IV), Archivaria 31 (Winter 1990-91): 19.] [Note 133: Ibid., 19.] [Note 134: Ibid., 19.]

The identification of the specific procedures peculiar to the organization is essential to understanding the interrelationships of documents within a given act and ought to be captured by the electronic document management system if the interrelationships between records are to be reconstructed. Of course, procedures cannot be captured without identification and understanding of the act to which they give expression. Identification of the act must therefore come first, then the procedures.

The fundamental aim of diplomatics is to determine the extent to which the archival document, or record, is a legitimate product of its context, the action which is realized within that context, and the procedures that put it into effect. These must be made manifest in the
document itself. [Note 135: Ibid., 19.] But with more complicated types of acts, such as acts on procedure and complex and continuative acts, many iterations and types of records may be involved, and no one record will embody the entire act. Some type of mechanism must therefore be identified that can electronically identify the act and the procedure and key these to the document.

Electronic recordskeeping systems execute processes in order to permit programs to be run, and procedures or routines that permit programs to execute a particular task. These are not defined in terms of bureaucratic acts, but in terms of how data is handled by the system.[136] Yet in order to recognize records, an electronic recordskeeping system must be able to acknowledge the bureaucratic procedures that lead to records creation. "Ultimately," writes Duranti, because it is an essential constituent of reliability, "the goal is to have restructured business procedures in which the records making and keeping function is a highly regulated and integral part of the usual and ordinary conduct of affairs."[137]

As with acts, no mechanism exists in ODA, DFR, or SGML to specifically flag procedures, although ODA offers the catch-all of "User Specific Codes". There is no theoretical reason why this level of document control could not be imposed. On a practical level, the main argument against it is the problem of transparency: in the process of creating documentation, will the user be faced with a complicated header with all sorts of mandatory fields that must be filled in before anything further can be accomplished? Further, with something as unique and often subtle in its working manifestation as procedure, can the user be trusted to enter reliable data? While this is not a basis of theoretical objection, it is a real constraint on present-day systems.
Again, the answer would seem to be that the more prescriptive, or dictated by legal requirements, the procedure is, the more likely an attribute of procedure is to be sought.

PHASES OF PROCEDURE

An additional complication is the sequencing of document creation in phases. A phase is a step in a procedure. Diplomatics posits a taxonomy of phases that breaks all procedures down into the same conceptual steps. [Note 136: For example, a system may automatically and routinely invert cumulative indexes of terms as they are added, or transfer certain records offline in order to save space. A process has been described as an envelope within which a program runs, the system assigning it a number and performing various "bookkeeping" functions. Margolis, Personal Computer Dictionary, 383.] [Note 137: Duranti, Reliability and Authenticity, 10.] The initiative phase is composed of acts which initiate the procedure. The preliminary phase, or inquiry, consists of the collection of information needed to evaluate the application. In the consultative phase, the information is evaluated and prepared for decision, which is taken in the deliberation phase. The deliberation phase may be followed by a phase of deliberation control, where persons who are not the authors of the decision check the decision for enforceability and compliance with policy and administrative norms. Finally, the decision is put into effect in the phase of execution.[138] [SEE Thesaurus - phases of procedure for a complete list].

ODA, DFR, and SGML have no attributes that account for phases of procedure. Control over the phases of procedure is, however, essential to the reliability of documents. In the electronic environment, it is necessary, at a minimum, to be able to demonstrate control over the initiation, deliberation, and execution phases of procedure.[139] Simply identifying the phase of procedure would not, however, be any indication of reliability.
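As a rough illustration of the taxonomy and of the minimum requirement just stated, the phases can be written as an ordered enumeration together with a check for the three phases over which control must be demonstrable. This is a sketch under stated assumptions: the names `Phase`, `MINIMUM_CONTROLLED`, and `uncontrolled_phases` are hypothetical, and, as the text notes, no such attribute exists in ODA, DFR, or SGML.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Phases of procedure, in the order given in the text."""
    INITIATIVE = 1
    INQUIRY = 2
    CONSULTATION = 3
    DELIBERATION = 4
    DELIBERATION_CONTROL = 5
    EXECUTION = 6


# Phases over which control must, at a minimum, be demonstrable.
MINIMUM_CONTROLLED = {Phase.INITIATIVE, Phase.DELIBERATION, Phase.EXECUTION}


def uncontrolled_phases(controlled):
    """Return the minimum phases, in procedural order, missing from `controlled`.

    `controlled` is the set of phases for which the recordskeeper can show
    that the rules of procedure were followed (a hypothetical input).
    """
    return sorted(MINIMUM_CONTROLLED - set(controlled))
```

The enumeration only names the phases. What counts as evidence of control in each phase is left to the recordskeeper and is not modelled here.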
Control is a matter of being able to indicate that rules were being obeyed in carrying out the phase of procedure. The presence, for instance, of appropriate signatures, use of correct document types, and dates that reflect moments of action and documentation are all means of indicating that at each phase of procedure, the necessary controls were in effect and that the rules of procedure were being respected.

Again, the term "business process" might here be useful as a profile attribute to capture the phases. It may be broken down by various degrees of refinement depending on the formality and complexity of the act and the prescriptive nature of the documents, a question that is determined by the business needs of each organization.[140] Rather than become prescriptive and attempt to assign a specific attribute for phases of procedure, it is preferable to let the need for reliability of any given recordskeeper determine the degree to which the concept of business process might be used as an attribute to break down their activities into acts, procedures, and phases. [Note 138: Duranti, Diplomatics, Archivaria 31, 15.] [Note 139: Duranti and Eastwood, Preservation of the Integrity of Electronic Records, 22.] [Note 140: For example, the business processes defined by the World Bank often encompass phases of procedure that are, however, sufficiently complex in themselves to justify a separate cost accounting code. For instance, the pre-appraisal stage in the granting of a loan is the equivalent of the initiation phase of procedure while the appraisal stage is the equivalent of the phase of consultation. Each of these has a distinct set of documents and rules.]

PERSONS

"Persons are the central element in any document."[141] Archival documents are the result of deliberate acts, and acts need persons to exist and be manifest. Diplomatics identifies four conceptual roles, or persons who participate in a document.
They are the author, who is "competent for the creation of the document, which is issued by him or by his command, or in his name."[142] There are really two authors, even if they may be the same person: the author responsible for the act and the author responsible for the document itself.[143] The writer is the person responsible "for the tenor and articulation of the writing,"[144] the one who actually draws the document up. Again, the writer is frequently the same person as the author, but the conceptual role is different. Every act is addressed to someone, defined as the addressee, who is "the person to whom the document is directed."[145] As with the author, there may be two addressees: the addressee of the act, or the person to whom the act is directed, and the addressee of the document, or the person to whom the document is actually directed. Duranti believes that electronic documents should also distinguish between the addressee of the act and those who are merely copied, a group called the receivers.[146] The reason for this distinction is that in electronic mail, the addressees may all be lumped together into a header under "to:" and "cc:" with the distinction between the two not always clear, whereas in paper documents, the receivers were usually those whose names were appended at the end of the document as part of the secretarial notes (SEE Thesaurus - secretarial notes). [Note 141: Duranti, Diplomatics, Archivaria 30, 5.] [Note 142: Ibid., 5.] [Note 143: The author of the act can also be identified from those who are responsible for preparing accompanying documentation or attachment.] [Note 144: Duranti, Diplomatics, Archivaria 30, 7.] [Note 145: Ibid., 6.] [Note 146: Duranti and Eastwood, Preservation of the Integrity of Electronic Records, 20.] A fourth group of persons are those who in some way validate the signature as witnesses or the form of the document as countersigners. Of the four groups, the author, writer and addressee are essential for a record to exist.
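The roles described above could be carried in a document profile roughly as follows. This is a minimal sketch, assuming a hypothetical `Persons` structure whose field names are invented for illustration and do not correspond to attributes of any standard discussed here. The check reflects only the statement that author, writer, and addressee are essential for a record to exist.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Persons:
    """Hypothetical profile fields for the diplomatic persons of a document."""
    author_of_act: str = ""
    author_of_document: str = ""
    writer: str = ""
    addressee_of_act: str = ""
    addressee_of_document: str = ""
    receivers: List[str] = field(default_factory=list)       # those merely copied
    witnesses: List[str] = field(default_factory=list)
    countersigners: List[str] = field(default_factory=list)

    def missing_essential_roles(self) -> List[str]:
        """Name the essential roles (author, writer, addressee) left empty."""
        missing = []
        if not (self.author_of_act or self.author_of_document):
            missing.append("author")
        if not self.writer:
            missing.append("writer")
        if not (self.addressee_of_act or self.addressee_of_document):
            missing.append("addressee")
        return missing
```

Keeping receivers separate from addressees mirrors the distinction between "to:" and "cc:" recipients, while witnesses and countersigners are optional because the text does not treat them as essential.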
The concept of persons is flexible, however precisely formulated. "In a diplomatic context, as well as in a legal one, persons are the subject of rights and duties; they are entities recognized by the juridical system as capable of or having the potential to act legally."[147] This concept of person is legal in nature: the persons involved in an act are defined in terms of their competencies and responsibilities, and not as individual human beings. They derive their existence from their recognition within the juridical system. They may be single individuals, or a collectivity. The author of the act, for example, may be a corporate body as broadly defined as the state, such as the Government of Canada; the addressee of the act may be a collectivity such as the people, or a congregation; the writer is often an individual, but may also be an organization, such as a department, a committee, or a work group.

The persons also need not be human. An electronic system or program is quite capable of producing a record provided it is operating as a juridical person, that is, "capable of acting . . . as having the will that can create, maintain, modify, or extinguish situations."[148] For instance, an ATM (Automatic Teller Machine) is capable of producing a valid record of cash transactions because it is designed to interact independently with the client through its artificial intelligence and to produce records that are recognized by banks and the legal system.

Compared to the concepts of the juridical system and the acts, the persons lend themselves to relatively direct translation into a precise set of attributes in the document profile and in those international document exchange standards where they are present in one form or another.
The persons have the added advantage of being at least familiar to document creators and users, even though there is a strong possibility of confusion with bibliographical definitions of persons that are not designed to achieve the same purpose.149 This is a practical consideration where the occurrence is unpredictable, as for instance in the case of documents authored by individuals, and data entry is dependent on the document creator.

148 Duranti, Preservation of the Integrity of Electronic Records, 16.
149 For instance, the bibliographical definition of author is based on information that must be cited as literally as possible from the publication itself, whereas the diplomatic definition is based on an understanding of the administrative, legal, and historical context of the document and may have to be inferred. This is because a publication is not a record per se: it does not manifest an act that has any other consequences than the writing itself. In so far as the information is not intended to effect a predictable result, the act of authorship or publication is therefore a simple act in that the act of publication is its own fulfillment. Unlike records, publications are self-contained in that they do not depend on a contextual understanding of their creator or on a relationship to a genetic process or to other records for their meaning.

ODA, DFR, and SGML use the broad category "Originators" to capture the concept of persons. All have an attribute called "Authors", which ODA/DFR defines as "the name(s) of the person(s) or organization(s) responsible for the preparation of the intellectual content of the document."150 There is some ambiguity here with ODA Owners, who are responsible for "the content of the document".151 ODA Authors is really the equivalent of the author of the document, whereas ODA Owners is equivalent to the author of the act. ODA Preparers is more clearly equivalent to the writer, because Preparers are defined as those who are responsible for the physical preparation of the document. ODA also recognizes a further attribute called Organizations, which is intended to associate the "originating organization" with the contents of the document. This may appear to be designed to accommodate a corporate version of the author of the act, as opposed to an authoring individual, and is an attribute of provenance. But once again there is ambiguity, because Organizations could also be confused with Owners. There is no recognition whatsoever within ODA, SGML or DFR of addressees or receivers, nor of witnesses and countersigners.

150 Canadian Standards Association: CAN/CSA-Z243.224-90 (ISO 8613-4: 1989). Information Processing - Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 4: Document (Rexdale, Ontario: Canadian Standards Association, 1990): 12.
151 DFR, while recognizing the ODA definition of Owners, also defines Owners as "a security subject who possesses rights to a specific DFR-object." This indicates custody of the object in terms of ownership or rights of possession, which is not the strict equivalent of author of the act. ISO/IEC JTC 1/SC 18 Text and Office Systems Secretariat: USA (ANSI). Revised Text of DIS 10166-1, Information Technology - Text and Office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures. (New York: ANSI, 1991): 7.

INTRINSIC AND EXTRINSIC ELEMENTS

Diplomatics holds that "the form of a document reveals and perpetuates the function it serves."152 This is to say that the context of the juridical system which gives the document meaning, the act that is the cause of the document, the persons, and the genetic process of procedure are expressed together in the form of the document.

152 Duranti, Diplomatics, Archivaria 32, 6.
An analysis of the form of a document will therefore tell us what function a document serves and whether or not it can be trusted as evidence. This analysis is inductive: diplomatic criticism works upwards, from the document form to the context of creation. It does not depend on the prior accumulation of information about the context of the document, on historical, legal and administrative research, in order to determine the nature of the document. Such research clearly has a role in elucidating the context, but the function and context of a record must be directly evidenced in the form of the document itself.

There are several implications in this approach to document form. First, diplomatic analysis has to begin with the identification of document types, since the ultimate test of meaning is found in the form. Therefore, in the design of an electronic document management system, this is clearly the focus of the document definition phase. Only when all the various types of documents have been identified for any given procedure can their interrelationships and roles in the act be mapped out. Secondly, the separation of form from content means that records must be identified by their formal constituents and not by the information they convey. While an electronic document management system must certainly be capable of retrieving documents on the basis of their subject, this function is conceptually quite separate from their identification as records. Thirdly, the separation of form and content is the conceptual equivalent of the separation of the document profile from the document contents (or context from content) in the model of open document interchange.
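The second and third implications can be made concrete with a small sketch. It assumes a hypothetical document object that keeps the profile and the content apart; the attribute names and the choice of which formal constituents to test are illustrative assumptions, not a list prescribed by diplomatics or by any of the standards. Identification as a record consults only the profile, while subject retrieval consults only the content.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed formal constituents, chosen for illustration only.
FORMAL_CONSTITUENTS = ("document_type", "author", "writer", "addressee")


@dataclass
class Document:
    profile: Dict[str, str]  # context: the formal attributes
    content: str             # the information the document conveys


def is_identified_as_record(doc: Document) -> bool:
    """Decide from the profile alone; the content is never inspected."""
    return all(doc.profile.get(name) for name in FORMAL_CONSTITUENTS)


def search_by_subject(docs: List[Document], term: str) -> List[Document]:
    """Subject retrieval looks only at the content, as a separate function."""
    return [d for d in docs if term.lower() in d.content.lower()]


memo = Document(
    profile={"document_type": "memorandum", "author": "Records Office",
             "writer": "J. Smith", "addressee": "Director"},
    content="Retention schedule for personnel files.",
)
note = Document(profile={"document_type": "note"},
                content="Retention ideas, unsent.")
```

Both documents are found by a subject search for "retention", but only `memo` carries the formal constituents this sketch requires of a record.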
Documents have both a physical form, or external make-up, composed of what are called the extrinsic elements, and an intellectual form, comprised of the intrinsic elements, which are the document's "internal articulation."153 "From a conceptual point of view, it may be said that intrinsic elements of form are those which make a document complete, and extrinsic elements are those which make it perfect, that is, capable of accomplishing its purpose."154

The recognition that documents have structure is not unique to diplomatics. Hajagos, for example, points out that "in fact documents do have an implicit structure. Letters consists of an addressee, salutation, body, and signature. Books contain chapters, sections, subsections, and so on ..."155 David Bearman defines documentary form as the "structure internal to the individual record dictating what data will be present for specific types of transactions and facilitate its recognition and use by signaling to readers, by means of typography, data structures, and electronic links, where particular information will be located."156

Diplomatic theory goes one step further in asserting that the form of archival documents in fact follows a predictable pattern that can be captured in a model. The diplomatic model of document form consists of a typical, ideal document comprised of all the elements which documents can be expected to include, "the most regular and complete", independent of provenance or difference in purpose. "Once the elements of this ideal form have been analyzed and their specific function identified, their variations and presence or absence in existing documentary forms will reveal the administrative function of the documents manifesting those forms."157 The great value of the diplomatic model of document form is that it is composed of elements, or rules, that are independent of context, that is, decontextualized.
It therefore becomes possible to use the elements of intellectual and physical form identified in the model to establish standard definitions that could be used to define document attributes in international standards and in the document profile of electronic document management systems. In effect, this amounts to proposing the equivalent of an idealized document profile.

154 Ibid., 6. Definitions of the various extrinsic and intrinsic elements will be found in the Glossary. The following discussion will focus on their adaptability to international standards and their applicability in the electronic document environment.
155 Hajagos, Documents and SGML, 39.
156 David Bearman, Record Systems as the Locus of Provenance (paper presented to the Ontario Association of Archivists Conference on Archives and Automation, May 13, 1993): 4.
157 Duranti, Diplomatics, Archivaria 32, 6.

The intrinsic elements which determine the intellectual form are "considered to be the integral components of its intellectual articulation: the mode of presentation of the document's content, or the parts determining the tenor of the whole."160 They consist of three groups or "ideal sub-structures" of elements which tend to appear together without any particular juxtaposition to each other: the protocol, the text, and the eschatocol.

THE PROTOCOL

The first of these, the protocol, consists of that part of the document which contains the administrative context of the action, consisting of an indication of the persons involved, time and place of documentation, subject, and any initial formulae. The protocol tends to appear at the beginning of the document and consists of the entitling, date, invocation, superscription, inscription, salutation, subject, the formula perpetuatis and the appreciation. Dates can be both topical (of place, e.g. "signed in the City of Victoria in the Province of British Columbia") and chronological.
The date also captures the moment of action and the moment of documentation. Dates are important because they capture the relationship between the author/writer and the fact or act in question. With traditional paper records, the date is usually added at the outset of compilation, but with electronic messages the date is either captured automatically in the header or is included in the system at the moment of transmission.161

The name of the author of a record is an essential element. It may be captured by several different intrinsic elements. The entitling is the part of the protocol that comprises the name, title, capacity or address of the physical or juridical person issuing the document or of which the author of the document is an agent. In contemporary paper documents, the entitling takes the form of the letterhead and is not as important for the reliability of a record as the signature. But in electronic messaging systems, the name of the person issuing the record is added automatically to the document at the moment of action, or actual transmission. The entitling becomes an electronic address from which the message is sent, and thus, "juridically, the person from whose address the message is sent is its author and writer, unless an attestation is attached to the record that would unequivocally demonstrate who its author/writer is, such as an electronic seal."162

160 Duranti, Diplomatics, Archivaria 32, 6.
161 Ibid., 19.

The superscription is the mention of the author of the document and/or of the action and often appears as the initial wording of the text (e.g. "I, Samuel Doe ..."). Nowadays, the superscription is often a part of the entitling. An example of a document where it appears by itself is a contract where the superscription names the first party.
In electronic messages, the superscription cannot take the place of the entitling, which indicates both the writer and author and is automatically added by the system.

The inscription is another part of the protocol that comprises the name, title, and address of the addressee of the document and/or of the action, and is thus essential to the existence of a record. With paper records, there is usually one addressee for each record, with copies sent to others on a distribution list. Electronic messages, however, may be sent to multiple addressees simultaneously and copied at the same time to distribution lists of receivers. The distinction between the two must be carefully maintained because receivers are not, by definition, objects of an act.

Other elements of the protocol are not essential to the capture of electronic records, although they may continue to appear. The invocation is the mention of the name of God, but it may occur in modern documents where a document claims to be invoked in the name of something, such as the people or the law. The symbol or logotype that makes up a letterhead can be treated as an invocation where it is included for purely symbolic purposes.163 The salutation is a form of greeting characteristic of letters. The formula perpetuatis is a formula typical of medieval and modern documents conferring titles or privileges. It consists of a statement that the rights put into existence by the document are not circumscribed by time, e.g. in perpetuum. The appreciation is a sort of prayer for the realization of the content of the document, e.g. "looking forward to ..."

162 Duranti, Preservation of the Integrity of Electronic Records, 19.
163 The invocation can be viewed as a way of evidencing the abstraction of the juridical system by indicating the moral authority under which the document is drawn up, such as "In the Name of God". Modern documents, especially business documents, have no such abstract elements in them that might be explicitly interpreted as an appeal to a moral authority greater than themselves, although the use of a coat of arms by an elected government might be taken to symbolize the system of beliefs that underlie the rule of a given state.

THE TEXT

The second group of elements, the text, is the central part of the document "where we find the manifestation of the will of the author, the evidence of the act, or the memory of it,"164 and consists of the preamble, the notification, the exposition, the disposition, and the final clauses. The preamble is that part of the document that expresses the ideal motivation for action, such as a citation of law or regulations or opinions which justify the act. The preamble is often a formula and can be treated as "boilerplate" text. The notification is the publication of the purport of a document, whose purpose is to express that the act consigned to the document is communicated to all who may be affected by it as well as those who are directly concerned. Usually the notification follows the preamble. The notification is typical of letters patent or a proclamation. The exposition is a part of the text where the immediate circumstances of the act are expressed or explained, and the reasons for the act given. The disposition is the expression of the will or judgment of the author, what the author wants done or intends to do. The final clauses are found within or following the disposition. [SEE Thesaurus - final clauses for a complete definition and list.]

THE ESCHATOCOL

The third group, the eschatocol, closes the document and consists of the corroboration, the complimentary clause, the attestation, qualification of signature and secretarial notes. The corroboration is the mention of the measures taken to make the document reliable and authentic.
The complimentary clause is a brief formula expressing respect, such as "sincerely yours". The attestation is the subscription of those persons who took part in the issuing of the document, and is the substance and core of the eschatocol. Attestations include the signatures of the author, the countersigners, and witnesses. As has been indicated above, the role of the attestation has been affected by electronic messages, where it is an alternative to the author and writer named in the entitling. The qualification of signature is the mention of the title and capacity of the signer that accompanies the signatures of attestation, e.g. "Division Chief", "Chairman of the Board". The secretarial notes follow the qualification of signatures and can comprise a number of elements, such as the initials of the typist, distribution lists, and mention of enclosures.

164 Duranti, Diplomatics, Archivaria 32, 12.

Like other intrinsic elements, the secretarial notes may now be treated as objects or attachments, making them individual documents in their own right with their own sets of attributes. For instance, secretarial notes can include distribution lists that in DFR are a separate object attached to the document. Moreover, filing and retrieval of electronic documents is a much more complex business, involving a standard of its own, DFR, and many more attributes than are traditionally associated with classification of paper documents. These elements may be captured by such attributes as DFR References-to-Other-Objects, but their association is by no means self-evident and must be individually mapped.

In terms of the structure of electronic documents, the intrinsic elements belong to the logical structure, that is, that part of the document whose components are determined by, and determine, meaning, and are separated out from the layout structure, which is determined by, and determines, the physical arrangement of the document.
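The three groups of intrinsic elements described above can be written down as a simple lookup structure. The following is a minimal sketch: the element names are taken from the text, but the dictionary layout, the function names, and the sample letter are assumptions made for illustration and do not correspond to any structure defined in ODA, DFR, or SGML. It compares the elements found in a document against the ideal form and reports presence and absence by group.

```python
from typing import Dict, List

# Element names follow the text; the grouping as a dict is illustrative.
IDEAL_FORM: Dict[str, List[str]] = {
    "protocol": ["entitling", "date", "invocation", "superscription",
                 "inscription", "salutation", "subject",
                 "formula perpetuatis", "appreciation"],
    "text": ["preamble", "notification", "exposition", "disposition",
             "final clauses"],
    "eschatocol": ["corroboration", "complimentary clause", "attestation",
                   "qualification of signature", "secretarial notes"],
}


def group_of(element: str) -> str:
    """Return the sub-structure an intrinsic element belongs to."""
    for group, elements in IDEAL_FORM.items():
        if element in elements:
            return group
    raise KeyError(f"not an intrinsic element of the ideal form: {element}")


def compare_with_ideal(present: List[str]) -> Dict[str, Dict[str, List[str]]]:
    """For each group, list the ideal elements a document shows and lacks."""
    found = set(present)
    return {
        group: {
            "present": [e for e in elements if e in found],
            "absent": [e for e in elements if e not in found],
        }
        for group, elements in IDEAL_FORM.items()
    }


# A hypothetical business letter.
letter = ["entitling", "date", "inscription", "salutation", "subject",
          "exposition", "disposition",
          "complimentary clause", "attestation", "qualification of signature"]
report = compare_with_ideal(letter)
```

The pattern of presence and absence in `report` is what a diplomatic analysis would read as an indication of the function of the document; the sketch only tabulates it.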
The intrinsic elements are purely intellectual elements that are entirely independent of physical structure. SGML approaches the division of the intrinsic elements most closely with its conceptual groupings of front, body, and back matter for office documents, and indeed, for most documents. In reality, the SGML groupings have little to do with the diplomatic groupings (SEE Chapter Three, SGML), tending to be based on a publication paradigm. DFR is not at all concerned with content, but the generic logical structure of ODA is capable of recognizing the groupings of protocol, text, and eschatocol if so required, although this has to be done by document classes or types. SGML is probably most capable of recognizing the various intrinsic elements because of its ability to code individual textual features, such as a signature, or standard clauses.165

Extrinsic Elements

The extrinsic elements break down into six groups of physical elements: the medium, the script, the language, special signs, seals, and annotations.

165 It must be remembered that in order to retrieve them through SGML, each tag must be searched individually by document because SGML coding is done by document. With ODA the search is faster because the search is by attributes in the profile, and whatever profile (which may be common to any number of documents) has that attribute, the document will be retrieved.

The medium consists of a group of features that physically carry the message and includes such considerations as the material, the format, and the sort of preparation given the material for receiving the message. In the past, this meant an examination of the physical type of writing medium, such as parchment or paper, its size and shape, or format, and the way the surface was prepared for writing with rules or lines. With electronic records, the meaning is ambiguous.
Generally, examination of the physical medium means identification of the type of electronic storage object, such as magnetic tape, CD-ROMs, or hard disks. The format becomes a description of the way the physical medium has been prepared or formatted to accept the information, for instance, magnetic or optical. But medium is also conflated with mode of representation, such as graphical formats, which are better treated under script. It can also have other meanings: in the design of networks, medium means the type of cabling used to connect stations.

The script determines the "layout and articulation" of the discourse,166 and includes such elements as layout, pagination and formatting, types of scripts, handwriting, typefaces and inks, paragraphing, punctuation, abbreviations and initialisms, erasures and corrections, and formulae for the composition of the text. Where electronic documents are concerned, the script may also be interpreted to include computer software "because it determines the layout and articulation of the discourse, and can provide information about provenance, procedures, processes, uses, modes of transmission, and last, but not least, authenticity."167 The script can also be extended so far as to include the system documentation, since such information determines how the system operates and the way that it articulates the records and gives them form.

Modern documents are increasingly complex as physical and intellectual objects, and perhaps nowhere is this more evident than with the script. The script has been subdivided into elements of content articulation, or the elements of writing and their arrangement, and content configuration, or the mode of expression of the content, which is equivalent to its intellectual representation as a map, or a graphic, or text. Content articulation includes all the elements of document architecture, including the layout, and recognition of various textual or content structures, such as paragraphs. A document architecture is defined by ODA as "rules for defining the structure of documents ..."168

166 Duranti, Diplomatics, Archivaria 32, 7.
167 Ibid., 7. Open standards, by their very nature, avoid prescribing how a system chooses to process a document and therefore are unlikely to specify the software needed to interpret a document. Thus, ODA does not specify particular word processing applications; it only prescribes that whatever word processing application is used, it must be able to handle ODA encoding. Open document exchange standards confine themselves to description only, leaving it up to the system to decide how the object described should be processed, according to its own rules and capabilities.

The script of electronic documents might also be divided into two groups which relate to each other as layers. The first layer is composed of human-readable elements: the image itself with its layout and logical elements such as paragraphing and headers, and elements of presentation, such as typefaces, use of bolding, etc. How these are actually encoded by the application is the second layer of script elements, which permits the application to articulate the document for human comprehension. It may be accessible to the user but would appear as screens of encoding that would not be readily understandable to any but a trained eye. A third layer might be the system information required by the application itself, such as how it configures the terminal, reads the operating system, and relates to other applications. This layer might be inaccessible altogether to the user. In so far as document exchange standards are concerned with preserving the integrity of records, the elements of content articulation and representation must be of vital concern.
It is less easy to argue that standards such as ODA should be concerned with how applications interpret the requirements, because to do so would require the standard to specify applications and other particular requirements which would limit their universality. ODA does, however, have the ability to accommodate the needs of particular documents through the use of document application profiles (DAPs), which are designed to permit a communicating system to determine if it can handle a particular type of document (SEE Chapter Three - ODA).

An element of the script that is not dealt with by traditional diplomatics is the concept of links. A link may be defined as "a pointer to another record."169 A link may be embedded in a document template that is used to pull in data from a database into the document. Links may also be used to bring several documents together into a single virtual document.170 The idea of links should be assigned to the script, and more specifically to the area of content articulation, because they are a physical means of joining the parts of a document together rather than an intellectual element in their own right, such as the entitling or the text. For example, an e-mail may consist entirely of pointers from a system account to a central mail box or records store. The presence of links may be essential to the completeness of a record, even if the links are invisible to the user. DFR, ODA and SGML all contain attributes designed to capture the idea of links: DFR User-References-to-Other-Objects and DFR User-Reference, as well as ODA External References, could all be used to capture links.

168 ISO, ODA, Part 1, 5.
169 Margolis, Personal Computer Dictionary, 270. Links should not be confused with the concept of inserts. An insert is a form of copy that is inserted into the body of another document. The pasting of other documents into the body of electronic documents is very common, but this is not a link in the sense of a pointer, or way of viewing another document without actually incorporating it into the body of a document. SEE Thesaurus - copy for a definition of insert.
170 A virtual record is defined as "pointers needed to create documents". Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 15. A definition more characteristic of data processing as opposed to archives, keeping in mind the more limited meaning of record, is "the characteristics of an entity as perceived by the user, regardless of how they have been physically represented in a database. Thus an employee would have one virtual record, but may have numerous physical records linked together to accommodate repeating addresses, jobs held, benefits received, etc." ACCIS, Management of electronic records, 186.

The element of language is a third basic extrinsic element that goes beyond simply the tongue of particular communities to include the vocabulary, phraseology, and styles of different social groups. In electronic documents, language must also deal with machine-readable languages, since some records (e.g. an EDI purchase) are substantively machine transactions. ODA and SGML are designed for human-readable documents; there is no assumption that they are intended to exchange documents that would be machine-readable only, even though, to be exchanged at all, a document must be completely machine-readable.

The special signs are a group of extrinsic elements that reveal the various persons participating in the document, such as symbols, personal marks, stamps and phrases. Like the secretarial notes and signatures, electronic special signs may exist as objects in their own right that become attachments.

The seals are a little used group of extrinsic elements today, although the concept of encrypted seals can be applied to electronic documents and, if so, would again be an attachment. Like the special signs, seals may have significance for security and access to documents: for instance, an electronic access code is a special sign, while a seal can be a way of securing a document against tampering or further knowledge. They have a role in determining the form of transmission171 of a document and thus in ensuring its authenticity.

The last group of extrinsic elements are the annotations. They are an important constituent of reliability because they "represent the conjunction between elements of intellectual form and of procedure. Thus, they are a bridge between the completeness aspect of a record and the procedural control on its creation."172 Annotations fall into three groups. Annotations of execution may be included in a document after its compilation as part of putting it into effect, or its execution. These include annotations of authentication, which are the express, legal recognition that a record or the signatures on it are what they claim to be. Annotations of execution may also include annotations of registration, which are a reference to a transcription of a record made in a register by an office different from the one creating the record. Both of these types of annotations of execution are peculiar to particular record forms. For instance, a copy of a birth certificate obtained from a government registry of births will indicate that it is an authentic copy. Another family of annotations are those added in the course of carrying out subsequent steps in a transaction, such as question marks, initials, dates of hearings or readings, or queries. They are made on received documents by the offices that carry out the related transaction, so they are made on finished documents.
Annotations of handling include those of instruction, which are the mention of previous handling, directions for transmission, classification etc., dates of hearings or readings (for formal documents of this type), and signs made by commentators or readers of the text indicating opinions or comments. Annotations of management include registry numbers, classification codes, cross-references, dates of receipt, and names of recipients or of the office receiving the document.

171 Form of transmission is the form that a record has when it is made or received.
172 Duranti, Preservation of Integrity of Electronic Records, 12.

The elements of documentary form come together in the concept of the status of transmission, which is concerned with the ability of the document to give effect to the act it embodies. The status of transmission defines the fundamental concepts of the original, the draft, and the copy. The original is defined as the form of the document that is perfect, in that it has all the elements necessary to give it the completeness of form required by the juridical system in which it was created, especially those of content articulation and annotations,174 is able to give effect to the act it embodies, and is the first to be issued.175 A draft is a sketch or outline of a definitive text prepared for purposes of correction. By definition it lacks completeness and effectiveness. Anything other than an original is therefore a draft. A copy reproduces to a greater or lesser degree, depending on the type of copy, the form of the original, but lacks primitiveness and is unable to give effect to the act. Diplomatics distinguishes between several different types of copy: an imitative copy can reproduce the original or the draft in almost every respect except in primitiveness.176 In the paper paradigm, copies made on copiers are imitative and can be virtually indistinguishable from the original.
On the other hand, a simple copy is a transcription of the contents of a document without respect to form. Notes transcribed from a report, for example, are a simple copy.177 A copy-in-the-form-of-an-original is a copy that is created when two originals of the same document, addressed to the same person and having the same date, are sent to the addressee in two subsequent deliveries, with the first considered the original, and the second, the copy.178 Another type of copy is the authentic copy, where officials who are authorized to execute such a function validate a copy so that it is capable of being used as evidence. Birth certificates are an example of a document which is often issued as an authentic or "certified" copy.179 Another form of authentic copy is the vidimus, which is an insert in another document whose purpose is to guarantee the conformity of the copy to the original.180 (SEE Thesaurus - Copy for a complete list of all the types of copies.)

174 Duranti, Preservation of the Integrity of Electronic Records, 5.
175 Duranti, Diplomatics, Archivaria 28, 19.
176 Ibid., 20.
177 Ibid., 21.
178 Ibid., 19.

The concept of the status of transmission applies no less to electronic documents than to paper records. The principles of perfection and primitiveness all apply but with fresh complications. Duranti has maintained that all electronic documents are created as drafts and received by the user as originals "in consideration of the fact that the records received contain elements automatically added by the system which are not included in the document sent, and which make them complete and effective."181 An example of this is an e-mail where the text is created at the workstation of the author but the electronic mail application will not add the date and time of the moment of action until the actual moment of dispatch.
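The e-mail example can be sketched as a small state change. This is a minimal illustration under stated assumptions: the `Message` fields, the `Status` values, and the `dispatch` function are hypothetical names, and a real messaging system would add considerably more at transmission. The point shown is only that the draft held by the author lacks the system-added date and time, and that the message becomes complete at dispatch.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Status(Enum):
    DRAFT = "draft"
    ORIGINAL = "original"
    COPY = "copy"


@dataclass(frozen=True)
class Message:
    author_address: str
    addressee_address: str
    text: str
    status: Status = Status.DRAFT
    sent_at: Optional[datetime] = None  # added by the system, not the author


def dispatch(draft: Message, now: Optional[datetime] = None) -> Message:
    """Return the message as received: the draft plus the system-added date."""
    if draft.status is not Status.DRAFT:
        raise ValueError("only a draft can be dispatched")
    stamp = now or datetime.now(timezone.utc)
    return replace(draft, status=Status.ORIGINAL, sent_at=stamp)


def imitative_copy(source: Message) -> Message:
    """Reproduce the form of the source; the result is a copy, not an original."""
    return replace(source, status=Status.COPY)


draft = Message("author@example.org", "addressee@example.org", "Please approve.")
received = dispatch(draft, datetime(1995, 10, 1, 9, 30, tzinfo=timezone.utc))
```

The `draft` object is left untouched by `dispatch`, so the sender's version and the received version can be compared: only the latter carries the date and time of the moment of action.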
In order to recognize the draft, the profile must be able to recognize versions, which means the ability to capture the date of creation, name of the author, persons involved in commenting on the draft, queries, and number of the version.[182] Records received from outside the system become originals once they are physically attached to it by means of a profile, which must be complete enough to capture all the original elements that make them complete, reliable and authentic. ODA, DFR, and SGML all contain a profile attribute, Status, designed to accommodate the states of draft and original. But since the concept of original requires the presence of other elements to bring it to completeness, merely indicating that something is an original is not sufficient. Completeness, for instance, is conveyed by such attributes as ODA Title, Start Date and Time, Subject, Authors, and Owners, and not merely by the attribute Status. SGML, ODA and DFR all contain an attribute for Revision History to capture how the document has been changed, while DFR attempts to track various versions through the attributes Version-Root, Next-Version, and Previous-Version. The concept of version control is, in fact, control over the drafting of a document.

[179] Ibid., 21.
[180] Ibid., 21.
[181] Duranti, Diplomatics, Archivaria 33, 10.
[182] Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 26.

Open document exchange standards are designed to exchange documents, or parts of documents, between heterogeneous networks so they can be processed, re-processed or read. This means that they must be capable of actually capturing the characteristics of a document or the document part as opposed to simply adding additional information. For instance, an electronic messaging system merely takes a message and adds address and routing information.
Open document standards are designed to encode the structure of the document and, in some cases, add contextual and management attributes. SGML is used to exchange all or part of a document by encoding its textual features. ODA is capable of encoding both the logical and layout structure and adds additional information in the form of attributes for security, access, author and originator, and dates of action and documentation. But to judge the ability of open document standards to capture the original against a paper paradigm would be misleading because of the definition of document. Both ODA and SGML define a document as consisting of a profile or document type declaration (DTD) and an instance of content, with the DTD and the profile both capable of being exchanged in their own right as documents. The attributes or tags are therefore actually part of an effective document and must be considered part of the original because without them, the document cannot be exchanged and therefore, comprehended. This can be seen from the way both ODA and DFR deal with the concept of draft and original as versions and editions. A DFR version is "a DFR-document specified by the user as a derivation of one or more other DFR-documents by means of specific DFR-attributes."[183] Printing the document out, of course, merely produces an imitative copy without the profile or DTD information.

[183] ISO, DFR: Part 1: Abstract Service Definition and Procedures, 20.

If attributes or DTD tags are to be considered part of the actual document, then attribute management becomes an issue in determining the authenticity of a document. How and when attributes are created, and by whom and when they are modified, will have to be captured in the profile. DFR, in fact, recognizes this function with such attributes as Attribute-create-date-and-time, and Attribute-modified-date-and-time.
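The attribute-management requirement described above can be illustrated with a minimal Python sketch. It is not the DFR data model: the class names `TrackedAttribute` and `Profile`, the method `set_attribute`, and the sample persons and dates are all hypothetical, chosen only to show how the creation and modification data for each attribute (in the spirit of Attribute-create-date-and-time and Attribute-modified-date-and-time) might be kept alongside the attribute value.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional


@dataclass
class TrackedAttribute:
    """One profile attribute plus audit data about the attribute itself (hypothetical)."""
    value: str
    created_by: str
    created_at: datetime
    modified_by: Optional[str] = None
    modified_at: Optional[datetime] = None


@dataclass
class Profile:
    """Hypothetical document profile: attribute name -> tracked attribute."""
    attributes: Dict[str, TrackedAttribute] = field(default_factory=dict)

    def set_attribute(self, name: str, value: str, person: str, when: datetime) -> None:
        current = self.attributes.get(name)
        if current is None:
            # First assignment: record who created the attribute and when.
            self.attributes[name] = TrackedAttribute(value, person, when)
        else:
            # Later change: keep the creation data, record the modification.
            current.value = value
            current.modified_by = person
            current.modified_at = when


# Illustrative use with invented persons and dates.
profile = Profile()
profile.set_attribute("Status", "draft", "writer", datetime(1995, 10, 2, 9, 0))
profile.set_attribute("Status", "original", "records officer", datetime(1995, 10, 3, 14, 30))
status = profile.attributes["Status"]
```

The sketch keeps only the most recent modification; a fuller audit trail would presumably retain every change, which is closer to what a Revision History attribute is meant to convey.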
Mode of Transmission

The mode of transmission is "the method by which a record is communicated over space or time."[184] Electronically, the mode of transmission therefore refers to the way a record is sent as data. The authenticity of a record is dependent on keeping track of the transmission process. With paper records, this involves capturing indications of how the document was handled by courier, messenger, and postal services, its distribution, and procedures for filing of incoming and outgoing documents by means of date stamps, registration marks, and the rigorous application of administrative procedures for the receipt and sending of documents. The critical factor is security. The recordskeeping system must be able to guarantee that a record was sent or received as intended. In electronic systems, security is a question of articulating the circumstances under which a document may be sent or received, and of providing an audit trail.[185]

Form of Transmission

The form of transmission refers to the form the record has when it is made or received. It is again an essential constituent of authenticity, because it should be possible to guarantee that a record was received in the same form in which it was sent. Form, in this case, refers to the presence of the persons, and certain extrinsic and intrinsic elements, all of which must be captured in the document profile or identified within the document itself. The intrinsic elements consist of the entitling, which captures the name of the author of the document, the inscription, which captures the name of the addressee of the document, the subject, and the date and time of transmission. In addition, there must be various annotations comprising a classification code, and, where applicable, a registry number. In paper documents, it was possible to include other devices such as watermarks, seals, and special signs peculiar to the author such as a personal monogram. The use of an encrypted seal or the inclusion of a graphic whose meaning is shared only by the author and the addressee are two ways that such elements could be included in electronic documents.[186]

[184] Ibid., 12.
[185] Ibid., 25.

OTHER ASPECTS

The elements of a document traditionally recognized by diplomatics are able to accommodate almost all the needs of electronic documents if they are to be recognized as records. But there are some aspects where diplomatics needs to recognize some additional elements that are peculiar to electronic documents.

Security and Access

The first of these are the concepts of security and access as they arise in electronic systems. Security has been defined as a technique for ensuring that data stored in a computer cannot be read or compromised.[187] This is an essential element of authenticity where the manner of preservation and custody of documents are concerned,[188] and such elements of physical form as seals and special signs, as well as control over the creation and execution of the document and the recognition of certain document forms, such as letters closed, are elements long recognized by diplomatics as implying some measure of security. There is a great difference, however, in the fact that paper records, because their form is physically bounded by their medium, can be secured from all interference simply by locking them away or restricting their circulation. While electronic records can certainly be secured by encryption or restricting access, they are more open to interference by the nature of their medium, which is volatile and dynamic, and by the possibility of remote access on a network. Security must be specifically built in: the author of a document must have permission to access the document, and must be able to prove to the system that they have such a privilege. Security is a question of securing both the system and the document.
Even though system security is an important element of reliability, security of the system cannot be a specific concern of open document exchange standards. However, security of the document through access rights is a vital concern.[189] ODA, for example, defines document architecture on the basis of whether a document can be read, or modified. Access rights are usually assigned on the basis of a privilege to read, modify, delete, create, move, or destroy documents. Access rights therefore also become a way of defining the persons participating in a document. The writer, for instance, as the person who gives the document intellectual articulation, is to some extent defined by the privilege of creating or modifying a document; a countersigner may have access only on the basis of their ability to modify the document with their approval. Access usually takes the form of two attributes: an access list, which includes the name of the person granting access, and an authentication, or proof that the person seeking access is who they claim to be, in the form of a password or an encrypted name. ODA, DFR, and SGML all provide for access by means of such attributes as ODA Access Rights, and Authorization, and DFR Access-List. SGML has an attribute for security classification of the document itself, called Sensitivity.

[186] Ibid., 25-26.
[187] Margolis, Personal Computer Dictionary, 426.
[188] Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 25.

Access is an aspect of document management that is needed throughout the life of a document, from creation to use and disposition. In the sense that it is not necessarily assigned after the creation of a document, access control does not fit the strict definition of an annotation of management or handling. In fact, access control should probably be considered as an extrinsic element in its own right.
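The two attributes described above, an access list and an authentication, can be sketched in a few lines of Python. This is an illustration only and does not reproduce ODA Access Rights or the DFR Access-List: the class `AccessList`, its methods `grant` and `may`, and the sample persons and passwords are hypothetical, and the plain SHA-256 digest stands in for whatever credential scheme a real system would use (a production system would normally prefer a salted key-derivation function).

```python
import hashlib
import hmac
from dataclasses import dataclass, field

# The privileges named in the text: read, modify, delete, create, move, destroy.
PRIVILEGES = frozenset({"read", "modify", "delete", "create", "move", "destroy"})


def digest(secret: str) -> str:
    """Stand-in for an encrypted credential (assumption: plain SHA-256)."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


@dataclass
class AccessList:
    """Hypothetical access list attached to one document."""
    granted_by: str                                   # person granting access
    rights: dict = field(default_factory=dict)        # person -> set of privileges
    credentials: dict = field(default_factory=dict)   # person -> password digest

    def grant(self, person: str, password: str, privileges) -> None:
        unknown = set(privileges) - PRIVILEGES
        if unknown:
            raise ValueError(f"unknown privileges: {sorted(unknown)}")
        self.rights[person] = set(privileges)
        self.credentials[person] = digest(password)

    def may(self, person: str, password: str, privilege: str) -> bool:
        """Authenticate first, then check the requested privilege."""
        stored = self.credentials.get(person)
        if stored is None or not hmac.compare_digest(stored, digest(password)):
            return False
        return privilege in self.rights.get(person, set())


# Illustrative use: the writer may create and modify, the countersigner only modify.
acl = AccessList(granted_by="records office")
acl.grant("writer", "w-secret", {"read", "create", "modify"})
acl.grant("countersigner", "c-secret", {"modify"})
```

In this sketch the set of privileges held by each person is what distinguishes the writer from the countersigner, which mirrors the point that access rights become a way of defining the persons participating in a document.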
The reason is that access control is the ability of a document to be physically opened for inspection. In paper records, this would be accomplished by first granting access to the container in which the document was held, and then opening out the document to the light so that the process of comprehension could begin. Access control is therefore a system surrogate for an element which is integral to the physical format of the document and is not simply an intellectual privilege. Nonetheless, in terms of electronic records management, access control takes the form of a system privilege. It is proposed to define access control as follows:

    The authorization to enter an electronic data store with the specific privilege to either view or process data in some way, or to administer or modify the system itself. The authorization usually takes the form of a list of those who have been granted access together with their specific privileges. The access list may be a part of an application, or attached to an individual object, such as a document, or object class.[190]

[189] Access is defined in traditional archival terms as "the availability of records/archives for consultation as a result of both legal authorization and the existence of finding aids." ACCIS, Management of electronic records, 136. In data processing terms, it is defined as "a privilege to use computer information in some manner." Margolis, Personal Computer Dictionary, 4.

Archival Bond

A third element that needs to be recognized is the relationship that exists between documents, which is called the archival bond.[191] As Duranti points out, diplomatics deals with the archival document itself, whereas archives are concerned with aggregations. But whereas paper records have tended to aggregate themselves because of the physical nature of the paper medium, electronic records must have their links established between them.
Duranti points out that the act of classification is essential to making a record out of a document, because without it, the records do not acquire the essential quality of interrelationship.[192] ODA, SGML and DFR all contain attributes that are designed to capture the archival bond. DFR deals with documents as groups, identifying the relationship between members, as well as permitting the capture of relationships to documents outside the group through such attributes as User-references-to-Other-Objects. ODA also contains an attribute for Reference to Other Documents, as well as document class descriptions that permit the establishment of relationships between documents of a given type. These attributes can also be used to identify virtual series which will, however, be far more flexible than the traditional paper series.[193]

[190] An alternative view is to regard access as an aspect of persons for the reason that access rights are a privilege and privileges may only be granted to juridical persons. In an electronic document management system, it is not possible to define persons without rights of access, which therefore become synonymous with their competence, or ability to act in some fashion. In that case, the access list proposed here might be viewed as a mechanism against which the access rights might be validated.
[191] The archival bond has been defined as "The relationship that, because of the circumstances of their creation, records have with their creator, with the activity in which they participate, and among themselves. The archival bond is originary (it comes into existence when the record is made or received), necessary (it exists for every record), and determined (it is characterised by the purpose of the record)." Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 4.
[192] This insight into the fundamental importance of classification was obtained from a seminar given by Luciana Duranti at the World Bank, Washington DC, August 28, 1995.

Document Management Domains

A final aspect of electronic document exchange that diplomatics does not identify specifically is the concept of document management domains. These are defined as "space defined by the boundaries of an electronic document management system within which records are created, modified, used, and destroyed. The space may be divided into several areas depending on the status of transmission and the access rights."[194] Document management spaces are divided up into general (or institutional) space, group space, and individual space. The general space is "that part of the system that is accessible to all members of the organization, managed according to established record making and record keeping rules by the competent staff, and that contains the central filing system of the organization, including the linkages with related records in other media. The primary characteristic of the general space is that no record that has crossed its boundaries can thereafter be manipulated."[195] The group space is shared by all those who share the same competence and contains many draft versions of records. The individual space is accessible to individual members.

The division of space on the basis of access control and status of transmission means that document management domains must be captured by several attributes. All those used by ODA, DFR and SGML to capture access and the status of transmission may therefore be used for this purpose. A specific annotation of management called Domain should probably be added to the profile that would consist of a code for each part of the system, on the basis of which access control and the status of transmission could be automatically assigned.
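As a rough illustration of the proposed Domain annotation, the following Python sketch assigns a status of transmission and a set of access privileges from a domain code. Everything specific in it is an assumption made for the example: the codes `GEN`, `GRP` and `IND`, the function name `apply_domain`, the choice of "original" as the status for the general space, and the particular privileges given to each space. The only points taken from the text are the three spaces themselves, that the group space holds drafts, and that a record in the general space can no longer be manipulated (modelled here as read-only access).

```python
# Hypothetical domain codes and the defaults they imply (assumptions, see lead-in).
DOMAIN_RULES = {
    "GEN": {"space": "general", "status": "original",
            "privileges": frozenset({"read"})},
    "GRP": {"space": "group", "status": "draft",
            "privileges": frozenset({"read", "modify"})},
    "IND": {"space": "individual", "status": "draft",
            "privileges": frozenset({"read", "modify", "delete"})},
}


def apply_domain(profile: dict, domain_code: str) -> dict:
    """Return a copy of the profile with Domain, Status and Access assigned."""
    try:
        rule = DOMAIN_RULES[domain_code]
    except KeyError:
        raise ValueError(f"unknown domain code: {domain_code!r}") from None
    updated = dict(profile)            # leave the caller's profile untouched
    updated["Domain"] = domain_code
    updated["Status"] = rule["status"]
    updated["Access"] = set(rule["privileges"])
    return updated


# Illustrative use: a memo drafted in the group space, then filed in the general space.
draft = apply_domain({"Title": "Budget memo"}, "GRP")
filed = apply_domain(draft, "GEN")
```

The design choice here is simply a lookup table keyed by the domain code, so that moving a record across a boundary changes its status and access rights in one step rather than requiring each attribute to be set by hand.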
[193] The archival bond should not be confused with the concept of links, which are physical links to the parts of a record.
[194] Duranti & Eastwood, Preservation of the Integrity of Electronic Records, 23.
[195] Ibid., 23.

Conclusion

As we have seen, apart from the juridical system, which is by definition an abstraction, almost all of the main elements of diplomatic analysis of the record - the facts, procedures, persons, and intrinsic and extrinsic elements, together with such key concepts as authenticity and reliability - can be captured in open document exchange standards. Some new elements such as links, security, access control, domain, and the archival bond should be explicitly recognized by diplomatics and added to the document profile in order to ensure that the functionality of the record within the electronic recordskeeping environment is fully realized. While this is conceptually possible, it is necessary to take a closer look at ODA, SGML, and DFR to see just how they function and what their specific limitations and advantages might be.

CHAPTER THREE

INTERNATIONAL OPEN DOCUMENT EXCHANGE STANDARDS

ISO 8879 Standard Generalized Markup Language (SGML)

SGML was adopted by the ISO in 1986. It is "an international standard for the description of marked-up electronic text. More exactly, SGML is a metalanguage, that is, a means of formally describing a language, in this case, mark-up language."[196] Markup has traditionally been associated with publishing where manuscripts had to be marked up for typesetting with indications of typeface, font size, use of bold face, indentations, paragraphing and other formatting requirements. In this traditional sense, markup was the equivalent of encoding a text in one form so that it could be translated into the typeset form.
As the publication process became more automated and integrated, electronic markup languages were extended to include printing and editing.[197] As a form of encoding, markup has now come to mean "any means of making explicit an interpretation of a text."[198] A markup language must specify "what markup is permitted, what markup is required, how markup is to be distinguished from text, and what the markup means."[199]

As a markup language, SGML is a "method of modeling document contents and identifying structural and content elements."[200] It was designed for the publishing environment[201] in that it permits authors to mark up their documents through the use of a generalized language that utilizes a syntax of human-readable (or character) codes.[202] SGML has several advantages over earlier markup languages in that the coding is human-readable and can therefore be embedded by the author, and the markup syntax is rigorously defined so that it can be processed like a program by a compiler.

[196] C.M. Sperberg-McQueen and Lou Burnard, eds., Guidelines for the Encoding and Interchange of Machine-Readable Texts, Draft: Version 1.1. (Chicago and Oxford: Association for Computers and the Humanities (ACH), Association for Computational Linguistics (ACL), Association for Literary and Linguistic Computing (ALLC) - Text Encoding Initiative, 1990): 9.
[197] The Chicago Manual of Style for Electronic Manuscripts is an example.
[198] Sperberg-McQueen & Burnard, Guidelines, 9. The ISO standard defines markup as "text that is added to the data of a document in order to convey information about it." ISO, International Standard ISO 8879 Information Processing: Text and Office Systems: Standard Generalized Markup Language (SGML) (Geneva: International Organization for Standardization, 1986): 14.
[199] Ibid., 10.
[200] Hajagos, Documents and SGML, 38.
Furthermore, the markup syntax is generalized so that the tokens are not related to any particular publishing context, and has a meta-language that can be used to "anticipate new mark-up constructs".[203] Perhaps most important of all from the point of view of standardization, SGML is intended to be "future-proof" or independent of hardware, software, and all applications.[204]

SGML uses several strategies to become future-proof. The metalanguage of descriptive markup means that documents can be processed by many different types of applications. A second data processing strategy that makes SGML data independent is its use of a general purpose mechanism called string substitution that lets a communicating system know a particular string should be replaced by another.[205] Of greatest interest to archivists, however, is a third strategy, document types. A document type defines a document by its constituent parts and associated structure and consists of a "class of documents having similar characteristics."[206] For instance, a report might consist of a title and an author followed by an abstract. This definition of a report may serve for all reports, and anything which does not have these basic constituents would not be a report. The use of document types enables documents of the same type to be processed in a more uniform way without defining them to each system over and over again.[207]

[201] Understood broadly in the sense of a document-producing environment where the author does not require strict control over the appearance of the document.
[202] Authors in this sense means anyone who encodes a text, which could be a scholar encoding an historical manuscript, a writer preparing a report, or an editor.
[203] ACCIS, Strategic Issues, 30.
[204] Lou Burnard, The Text Encoding Initiative: Towards an Extensible Standard for Encoding of Texts in Electronic Information Resources and Historians: European Perspectives, Seamus Ross and Edward Higgs, eds. (St. Katherinen: Max-Planck-Institut für Geschichte In Kommission bei Scripta Mercaturae Verlag, 1993): 106.
[205] Sperberg-McQueen & Burnard, Guidelines, 11.
[206] ISO, SGML, 10.

The purpose of all this is to facilitate the transmission of documents between dissimilar systems and to maximize their informational value by permitting the manipulation of their contents in an automated environment. Since it is mainly a descriptive language, however, SGML cannot be used to process documents even though it can contain processing markup. This is because it "does not contain semantic definitions for controlling further processing steps on documents."[208] For instance, SGML itself cannot be used to edit a document. Applications that are SGML conformant, however, can be used to perform operations on an SGML-encoded text such as editing, linking or displaying texts in hypertext systems, formatting and printing, download to databases, content analysis, or collation, among many potential uses.[209]

SGML is based on two main principles: descriptive markup predominates and is kept strictly separate from processing instructions; and markup is formally defined for each document. Descriptive markup simply uses codes to "provide names to categorize different parts of a document"[210] such as <para> for paragraph or <document date>. By contrast, procedural markup specifies what operations are to be carried out on a document such as "insert em dash one quad right, then skip one line, then indent right margin etc."[211]

[207] Sperberg-McQueen & Burnard, Guidelines, 11.
[208] Bormann & Bormann, Standards, 151. At the time this was written, a separate standard that would permit the specification of style information for controlling processing steps was underway: the Document Style Semantics and Specification Language (DSSSL).
[209] Sperberg-McQueen & Burnard, Guidelines, 3.
[210] Ibid., 10.
[211] Descriptive markup is defined as "Markup that describes the structure and other attributes of a document in a non-system specific manner, independently of any processing that may be performed on it. In particular, it uses tags to express the element structure." SGML actually recognizes four kinds of markup. Apart from descriptive markup (tags), it recognizes references, markup declarations, and processing instructions. ISO, SGML, 2, 14.

In essence, SGML "enables data to move between media by describing documents by their structural elements or textual features rather than their visual format."[212] Thus, in accordance with the basic processing model, SGML would permit several paragraphs of a report, or a glossary in a source document, to be imported as a result document without having to copy the whole document or even a page with its original layout or logical organization. Using SGML, it is possible to mark up the various textual elements of the content of a document, so that these can be exchanged and re-processed without having to worry about the physical layout. It is important to note that elements are intellectually determined or logical in nature, and are not physical or layout divisions such as page breaks imposed by the necessities of presentation (such as limitations on page size).

Textual features divide up into structural and non-structural elements.[213] Structural elements include such components as the parts of a book, such as the front matter (consisting of the table of contents, copyright page etc.) or the body, consisting of chapters and sections.
Non-structural elements might include individual words, paragraphs, passages of text, names and dates, typographically highlighted phrases, basic editorial changes, pre-existing annotations, bibliographic citations, lists (such as glossaries and indexes) and hypertextual features such as simple links and cross-references.[214] A fundamental point to grasp is that although elements are named, SGML provides no way of knowing the meaning of a particular element: its only concern is to define the relationship between one element and other element types. For example, the elements Title and Author may be associated with the element document type, but what Title and Author may specify has to be defined by the encoder.[215] Finally, elements take the form of tags or names.

Elements are only one type of descriptive markup component. SGML also defines entities and attributes. Entities are "a named part of a marked up document, irrespective of any structural considerations".[216] They are essentially free-floating units, or collections of characters, that are not related to the structure of the text and are best managed as a single unit. A typical entity is a photograph, or a book chapter, or a file. Attribute has a more restricted meaning within SGML. An SGML attribute is used to describe "information which is in some sense descriptive of a specific element but is not regarded in itself as an element."[217]

[212] Hajagos, Documents and SGML, 38.
[213] Guidelines, 71.
[214] Burnard, Text Encoding Initiative, 111.
[215] Sperberg-McQueen & Burnard, Guidelines, 12.

An SGML document consists of a prologue and a document instance. The prologue is divided into two parts: the SGML declaration, and the document type definition (DTD).
The declaration "establishes the environment in which the document operates" by permitting the user to specify "basic facts about the dialect of SGML being used"[218] that the system applications will need to read the SGML file. These include such parameters as the character sets (e.g. ASCII, EBCDIC) used to encode data, types of SGML delimiters, length of tag names, use of symbols etc.[219] The SGML declaration is invisible to the reader.

The Document Type Definition (DTD) is at the heart of the way SGML describes texts. The DTD consists of a standard header or type of profile that defines all the textual components used to describe a text in the form of tags, attributes associated with particular tags, and entities and how they are to be interpreted for a particular document type. In order to process an SGML text, an application validates the tags against the DTD. In this way, an SGML document is self-defining and can be processed by any application that complies with the SGML standard.[220] The DTD also defines the relationship between tags by permitting the establishment of hierarchical structures and permitting tags to exist or be "nested" at different levels of the hierarchy. In fact, an SGML DTD is a hierarchy of tags with most tags subordinated to other tags. Tags may also be aggregated into small objects with an internal structure called crystals. For example, an address, containing tags for the street address, postal code, and city etc. is a commonly used crystal.[221]

[216] Ibid., 27.
[217] Ibid., 24.
[218] Ibid., 29.
[219] Hajagos, Documents and SGML, 39.

The DTD should not be confused with a relational database that is independent of the encoded text. For instance, the ODA standard establishes a document profile that is completely separate from the content in that it can be searched separately, independent of content.
But the S G M L D T D is directly tied to the encoding of the text through the process of validation. This is because S G M L is concerned only with what is actually present in the text, and not with context which in the case of records, often consists of information that is not necessarily present in the text but must be inferred. For instance, the author that appears in the S G M L D T D must be the name of the author that appears as part of the text. In diplomatic terms, this author is more likely to be the writer, of the person responsible for giving the text intellectual articulation. If the aim is to capture the author of the act and the author of the document, who may be different from the writer, the T E I S G M L treatment of the tag <author> is inadequate because this is information that will not appear in the text but may have to be inferred from the administrative context of the document. The document instance is "the data and markup for a hierarchy of elements that conforms to a document type definition," 2 2 2 or in other words, it is the actual marked-up text and contains all the data and tags needed to delineate each element. The document instance may exist in the same file as the D T D and the S G M L Declaration, but document instances can also be referents to files elsewhere that contain the D T D and the declaration. A n S G M L document, then, consists of a text, "or any stretch of natural language", marked up by tags, associated with a header, or document type definition. 2 2 3 Although the S G M L standard does not specify a set of tags, unlike O D A or D F R , tags may be bundled into groups or tagsets for specific types of documents, and these may be standardized. This is the aim of the Text Encoding Initiative (TEI), a project sponsored and organized Sperberg-McQueen & Burnard, Guidelines, 88. ISO, SGML, 13. Burnard, Text Encoding Initiative, 108. 
82 Chapter Three International Open Document Exchange Standards by several professional associations in the field of computer-assisted literary and linguistic research.22'1 The aim of TEI is to "deliver a fully specified set of Guidelines which will enable researchers in any discipline to interchange texts and datasets in machine-readable form . . . 1 , 2 2 5 On the theory that many texts are similar in their basic structure but differ in their components or sub-elements, the TEI is identifying standard tagsets, or base tagsets, for a number of different types of texts. That is, certain elements needed to describe their structure are similar, but these must be used in conjunction with tags unique to each type of text. For instance, a dictionary entry consists of tags for form, sense, related entry, and etymology, while the base tag set of a memo would include references to other documents, local filing references, and a subject field. "The resulting document becomes a template for validating existing documents and creating new ones." 2 2 6 Thus far, the TEI has identified base tag sets for prose, verse, drama, transcribed speech, dictionary entries, terminological entries, and letters or memos. 2 2 7 The base tagsets characteristic of particular types o f texts may be combined with core tag sets of elements that research by T E I into different types of texts indicates are common to all texts. These include core structural elements of front matter, body, and back matter, and basic non-structural features which occur freely in the text such as paragraphs, highlighting, lists, bibliographic citations, foreign words or expressions, terms, cited words and glosses, abbreviations, notes, index entries, numbers and dates, and. crystals. 2 2 6 Others include pre-existing annotations. 
229 The core and base tag sets may be combined with user-defined tags in the DTD in an approach that has been characterized as the "Chicago pizza": the user is offered a standardized set of tags to which the user-defined tags may be added as a kind of topping.

224 The TEI is sponsored by the Association for Computational Linguistics (ACL), the Association for Literary and Linguistic Computing (ALLC), and the Association for Computers and the Humanities (ACH). Burnard, Text Encoding Initiative, 105.
225 Ibid., 105.
226 Hajagos, Documents and SGML, 40.
227 Burnard, Text Encoding Initiative, 109.
228 Sperberg-McQueen & Burnard, Guidelines, 71-89.
229 Burnard, Text Encoding Initiative, 111.

The aim of the TEI is to publish guidelines for the DTD of specific types of documents. An early draft of the Guidelines230 contains a TEI DTD for Office Documents. In keeping with SGML's emphasis on structure, document types are not defined by function. "Although office documents can be classified according to a number of criteria and arranged in various classes (e.g. memorandum, business letter, report, minutes etc.), their commonalities rather than their specific features have been stressed in order to devise a single general structure for office documents".231

The TEI DTD divides the office document into three parts: front matter, text body, and back matter. At first it might be thought that this division corresponds roughly to the intrinsic elements of protocol, text, and eschatocol, but the similarity is, in fact, almost nonexistent. The text body consists of the core tag set mentioned above plus a subject line, salutation and "signoff features" characteristic of correspondence. The back matter is held to comprise bibliography, glossary and index features.
On the grounds that these are features common to many other documents, the TEI DTD for office documents focuses mainly on the front matter, which it considers a sort of document profile relating to the "special aspects of office documents", namely, production and storage of documents, document distribution, action requested and deadlines, and status and history of the document as a version or draft.232 The DTD consists of a core tag set of these elements, some of which draw on ODA and the X400 standards.

230 Sperberg-McQueen & Burnard, Guidelines, 289.
231 Ibid., 188.
232 Ibid., 188.

SGML TEI Office Document Type Definition

tag name                          encoding format
Document Type                     <text category>
Title                             <doc.title>
Document Date                     <doc.date>
Author                            <author>
Abstract                          <abstract>
Table of Contents                 <toc>
Language                          <lang>
Revision History                  <revision.history>

In addition to the core tag set, the office [...] elements:

tag name                          encoding format
Document Reference                <doc.reference>
Additional User Specific Codes    <user.codes>
References to Other Documents     <related.doc>
In Reply To                       <in.reply.to>
Local File System Reference       <file.sys.ref>
File Name                         <file.id>
Location                          <directory>
Access Rights                     <access>
User Comments                     <user.comments>
Subject                           <subject.term>
Keywords                          <keywords>
Creation Date                     <creation.date>
Originator                        <originator>
Preparer                          <preparer>
Authorizing Person                <authorized.by>
Primary Recipient                 <recipient>
Secondary Recipient               <recipient.secondary>
Other User Information            <user.info>
Document Status                   <doc.status>
Sensitivity                       <sensitivity>
Number of Pages                   <doc.pages>

Included in the recipient tags are sub-elements called <action> and <action by>, which are interpreted as possibly including such actions as "action", "information", "opinion", "visa", "filing" etc.233
These action tags are merely a mixture of standard filing and routing actions that would take the form of archival or procedural annotations and do not describe the act or the transaction itself.

233 Ibid., 189-190.

Included in the front matter are a number of "crystals" or minor tagsets that can be invoked as a group to encode dates, individual persons, postal addresses, and electronic addresses:

Person name       <prop.name type=person>
Organization      <org>
Address           <address>

Some or all of these tags might be used to encode an office document. A TEI encoded document consists of a DTD, which defines what tags will be used, a header, which provides information needed to describe the process of encoding and for bibliographic citation, and the actual encoded text. An example is provided in Figures 2 and 3 following.

FIGURE 3 Sample of SGML TEI Encoded Document

This letter from the Government of Barbados to the World Bank (SEE facing page, Figure 2) has been coded with SGML tags according to the sample provided in the draft Guidelines, p. 241-243, using tags identified for office documents. The SGML encoding is divided into two parts, the header <TEI.header> and the <text>. It does not include the DTD which would define all the permissible tags to be used in the encoding. Bold has been used to identify the first instance of each tag; tags are encoded in level pairs, e.g. <role></role>. The tags are listed in a hierarchy of elements that nest one within the other, beginning with the most all-encompassing, i.e. <office.[document]>. Square brackets have been used to denote supplied information for purposes of elucidation, e.g. <front.[matter]>. Comments are provided as footnotes in order to avoid cluttering the format of the encoding.

<office.[document]>
  <TEI.header>234
    <file.description>
      <title.statement>
        <title>Letter, L. Erskine Sandiford to Mr.
Yoshiaki Abe, 19 May, 1993: machine readable transcript</title>
        <statement.of.responsibility>
          <role>transcribed by </role>
          <name>Tony Gregson</name>
        </statement.of.responsibility>
      </title.statement>
      <extent.statement>small (ca. 2 Kb)</extent.statement>
      <publication.statement>
        <creation.date>5 January 1995</creation.date>235
        <release>
          <release.place>Washington DC</release.place>
          <release.authority>World Bank Archives</release.authority>
          <release.date>Jan 5 1995</release.date>
        </release>
      </publication.statement>
      <notes>
      <source.description>
        <citn>
          <author>L. Erskine Sandiford</author>
          <title>[letter to Yoshiaki Abe]</title>
          <imprint>
            <publ.city>Barbados</publ.city>
            <publ.date>19 May 1993</publ.date>
          </imprint>
          <citn.detail>1 p.</citn.detail>
        </citn>
      </source.description>
    </file.description>
    <encoding.declarations>
      <aim>Example of transcription of a received document by World Bank.</aim>
    </encoding.declarations>
    <revision.history>
      <change.note>
        <rev.number>1</rev.number>
        <who>T. Gregson</who>
        <when>Jan 5 1995</when>
        <what>encoded SGML format</what>
      </change.note>
    </revision.history>
  </TEI.header>

234 The TEI header is intended to provide information that can be used in bibliographic citation and to document the circumstances of the encoding.
235 Creation date is the date of the encoding or the creation of the transcription.

  <text>236
    <front.[matter]>
      <title.page>
        <doc.author>
          <propname type=person>L. Erskine Sandiford, Prime Minister and Minister of Finance and Economic Affairs</propname>
          <org>
            <org.name>Government of Barbados</org.name>
            <org.dept>Ministry of Finance & Economic Affairs</org.dept>
            <address>
              <street>Government Headquarters, Bay Street</street>
            </address>
          </org>
        </doc.author>
        <doc.date>19 May 1993</doc.date>
        <recipient>
          <propname type=person>Mr.
Yoshiaki Abe, Country Director, Latin America and the Caribbean Region</propname>
          <org>
            <org.name>International Bank for Reconstruction and Development</org.name>
            <address>237
              <street>1818 H Street N.W.</street>
              <city>Washington D.C.</city>
              <post.code>20433</post.code>
              <country>U.S.A.</country>
            </address>
          </org>
        </recipient>
        <subject.term>Human Resources Project</subject.term>
      </title.page>
    </front.[matter]>
    <body>
      <salutation>Dear Mr. Abe</salutation>
      <p>I wish to nominate His Excellency, Dr. Rudi Webster, Barbados' Ambassador to the United States of America, as the Expert to sign on Barbados' behalf the Statutory Committee Report.</p>
      <closing>Yours faithfully</closing>
      <signature>L. Erskine Sandiford</signature>
    </body>
  </text>
</office>

236 The text consists of front matter, the body, and back matter. No back matter has been included here because the letter does not have any of the SGML TEI standard components defined as back matter such as indexes, glossaries etc.

From the point of view of capturing archival documents, there are many problems with the office document DTD as envisaged by the TEI. First and foremost, it is bibliographical in nature, i.e. it treats office documents as self-contained document entities, independent of their interrelationship with other documents for meaning and their role in a transaction. The document is to be cited as if it were a publication with a Title Statement and a Publication Statement when the letter has no title and cannot be said to have been published except in the most general sense of that word in that it has been "formally announced" or "read".238 Secondly, the selection of tags is poor in the contextual information that indicates the relationship of the document to the transaction of which it is a record, and those who are responsible.
In particular, there is no tag to identify the act in which the document took part (the making of a loan to Barbados), or the procedure (the negotiation of terms).

237 The address is an example of a crystal, a small object that has an internal structure consisting of a number of standard tags.
238 Oxford Modern English Dictionary, s.v. "publish."

There is also inadequate identification of those responsible for the document and who participated in the transaction. The header does not use the tag for organization as part of the Source Description, where information may be supplied for contextual purposes or citation that is not found in the document text. The tags for organization (<org> and <org.name>) are confined to the body only, where they consist of a literal transcription, or document instance, of the text. Similarly, the header does not make a distinction between the author of the act (the Government of Barbados) and the author of the document (L. Erskine Sandiford). Again, the addressee is a textual tag to be transcribed literally and there is no distinction made between the addressee of the act (the World Bank) and the addressee of the document (Mr. Yoshiaki Abe). The tags also fail to capture other essential records attributes. There is nothing to capture the status of transmission which would indicate whether the letter was an original or a copy.239

SGML has a reputation for extreme flexibility in tackling problems of the encoding of all types of documents, including archival documents. Writes one researcher, ". . . it has proved remarkably difficult to find problems for which a solution could not be expressed in SGML."240 But SGML has broader limitations if it is ever to be useful for the capture of archival documents.
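The contextual attributes that the critique above finds missing can be sketched as a small data structure. This is only an illustration under stated assumptions: the class RecordContext, its field names, and the helper missing_attributes are hypothetical and are not part of the TEI tagset or of any standard. The values are those given in the discussion of the Barbados letter.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class RecordContext:
    """Contextual attributes discussed above (field names are illustrative)."""
    act: Optional[str] = None
    procedure: Optional[str] = None
    author_of_act: Optional[str] = None
    author_of_document: Optional[str] = None
    addressee_of_act: Optional[str] = None
    addressee_of_document: Optional[str] = None
    status_of_transmission: Optional[str] = None  # e.g. "original" or "copy"


def missing_attributes(context: RecordContext) -> list[str]:
    """Return the names of the attributes that have not been captured."""
    return [f.name for f in fields(context) if getattr(context, f.name) is None]


# What the literal transcription identifies: only the persons tagged in the text.
from_text = RecordContext(
    author_of_document="L. Erskine Sandiford",
    addressee_of_document="Mr. Yoshiaki Abe",
)

# What could be supplied from the administrative context of the letter.
from_context = RecordContext(
    act="making of a loan to Barbados",
    procedure="negotiation of terms",
    author_of_act="Government of Barbados",
    author_of_document="L. Erskine Sandiford",
    addressee_of_act="World Bank",
    addressee_of_document="Mr. Yoshiaki Abe",
)

print(missing_attributes(from_text))
print(missing_attributes(from_context))
```

Even with the contextual values supplied, the status of transmission remains unset in this sketch, because nothing in the discussion above states whether the letter is an original or a copy.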
SGML was designed for the publishing environment in that it permits authors to mark up their documents through the use of a generalized language for formatting by the publisher. As has been pointed out earlier, the publishing environment assumes that the creator of the document has no interest in the eventual appearance or format of the document but is interested only in controlling content. This assumption runs contrary to the definition of the archival document which is "a complete document . . . that contains all the elements it is supposed to contain according to the administrative and legal system."241 Since these elements include the layout and logical structure as intrinsic and extrinsic elements, SGML as it is presently conceived is not suited to the encoding of records.

239 As it is, the SGML transcription is really nothing more than a simple copy or a literal transcription of the contents of a document.
240 Burnard, Text Encoding Initiative, 106.
241 Duranti, Managing Electronic Records, 9.

Because it is so important to avoid embedding format information in structured documents using SGML, the format must be defined externally for particular presentations. These formats are then "dynamically associated" with structural elements to create a view of the information that is useful to the reader.242 Such an approach gives SGML encoded documents great flexibility in presentation, but this is not always desirable if what is wanted is a view of the document as it is going to appear or as it originally existed. To resolve this problem, SGML has to be capable of addressing "both avenues of document creation and maintenance, structure-oriented and format-oriented.
SGML offers a structured approach, but programs such as Pagemaker and Word Perfect are format-oriented which address only the appearance."243 As pointed out earlier, the development of DSSSL (Document Style Semantics and Specification Language) may enable SGML to handle layout by adding DSSSL statements.244

The philosophy of the SGML Text Encoding Initiative itself is problematic because it assumes that all SGML documents are only interpretations. "No claim to absolute authority is made by any encoder, nor should ever be; the TEI scheme merely allows encoders to 'come clean' about what they have perceived in a text, to whatever degree seems appropriate."245 This means that all such transcriptions are nothing more than simple copies with notes and could not be records in themselves since there is no guarantee that they capture the original. While this may be a necessity in a scholarly environment where texts are being transcribed for research purposes, it is inappropriate in a records-creation environment.

242 Hajagos, Documents and SGML, 41.
243 Ibid., 41.
244 B.C. Watson, R. J. Davis, ODA and SGML: An Assessment of Co-existence Possibilities, Computer Standards and Interfaces 11 (1990/91): 174.
245 Burnard, Text Encoding Initiative, 111.

Records are not interpretations but must have the quality of impartiality. Impartiality is "the characteristic of archival documents that they are created for limited, specific and immediate purposes of an administrative legal nature, not in order to instruct posterity."
246

246 Duranti, Managing Electronic Documents, 16.

ISO 8613 Office Document Architecture (ODA)247

ODA is a standard that deals more specifically with the needs of the office environment, where it provides mechanisms for describing structures, standardized semantics for controlling document layout, and syntax definitions for interchanging this information.248 ODA "defines interchange formats, concepts to represent the structure of the information in a document, and the meaning of a set of formatting parameters."249 The purpose of ODA is to provide for the interchange of documents in order to permit presentation either as intended by the creator, or to allow processing, such as editing or reformatting, or both.250 Interchange, in the ODA model, is assumed to follow the basic processing model, an automatic process of transferring a document "from an originating system to a receiving system,"251 but in the case of ODA, this automatic process is known as "blind document interchange" because it is assumed that "revisability and layout are preserved just based on the knowledge that both systems comply to the international standard."252

While ODA is used to encode the structure of a document, it must be used with two other standard interchange formats to actually exchange the information. The Open Document Interchange Format (ODIF) is used to define a machine-readable bit stream representation of the document while SGML is used to define a human-readable representation of the document in a format called Open Document Language (ODL).253

247 ODA has also been published using the title Open Document Architecture by CCITT and ECMA. Bormann & Bormann, Standards, 151.
248 Ibid., 150.
249 Fanderl et al., The Open Document Architecture, 734.
250 ISO, ODA: Part 1, 1.
251 Ibid., 7.
252 Fanderl et al., The Open Document Architecture, 734.
253 Ibid., 736.
ODA is designed to deal with compound documents which may consist of text, geometric graphics, and raster graphics.254 It distinguishes three document architectures, all of which may be interchanged but which lend themselves to different purposes: formatted, processable, and formatted processable. Formatted documents are "read only", or in other words, intended only for presentation. Their interchange is designed to ensure that the same layout is preserved in all systems. Processable documents permit human editing or can be otherwise modified by machine-controlled processes that may change the content or structure. Formatted processable documents not only preserve the original layout in interchange, but may also be edited or restructured. The structure of a processable ODA document is designed to permit the processing of documents in three steps: editing, layout, and presentation (imaging).255

An ODA document consists of four parts: the logical structure (chapters, paragraphs, lists, diagrams etc.) and the relationship between logical elements; the layout structure (pagination and physical location of elements such as paragraphs); the content structure, representing different content types such as text and graphics; and the document profile, which contains descriptive attributes for filing and processing. ODA's use of content types permits other standards, such as graphic standards, to be used to define the contents.

DOCUMENT ARCHITECTURE

The fundamental concept behind ODA is document architecture, a set of rules that can be used to define the physical and intellectual structure of documents. Structure, in this case, means "the division and repeated subdivision of the content of a document into increasingly smaller parts."256 The parts are called objects and are organized into a hierarchy or tree.
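The notion of structure as repeated subdivision into a tree of objects can be illustrated with a minimal sketch. The class DocObject, its helper functions, and the sample report are assumptions made for illustration only; ODA itself defines its objects in the terms of ISO 8613, not in this form.

```python
from dataclasses import dataclass, field


@dataclass
class DocObject:
    """A node in the object hierarchy; a node without children is a basic object."""
    name: str
    children: list["DocObject"] = field(default_factory=list)

    def is_basic(self) -> bool:
        return not self.children


def depth(obj: DocObject) -> int:
    """Number of levels from this object down to its deepest descendant."""
    if obj.is_basic():
        return 1
    return 1 + max(depth(child) for child in obj.children)


def basic_objects(obj: DocObject) -> list[str]:
    """Names of the basic objects (leaves), in document order."""
    if obj.is_basic():
        return [obj.name]
    names: list[str] = []
    for child in obj.children:
        names.extend(basic_objects(child))
    return names


# A hypothetical report subdivided into increasingly smaller parts.
report = DocObject("report", [
    DocObject("heading"),
    DocObject("chapter", [
        DocObject("chapter heading"),
        DocObject("section", [DocObject("paragraph 1"), DocObject("paragraph 2")]),
    ]),
])

print(depth(report))
print(basic_objects(report))
```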
254 Ibid., 736.
255 Ibid., 734.
256 ISO, ODA: Part 1, 14.

The purposes of ODA may therefore be restated in the more specific terms of the document architecture:

• to permit the exchange of documents between heterogeneous environments so that "different types of content, including text, image, graphic, and sound can coexist within a document"; and
• to ensure that "the intentions of the document originator with respect to editing, formatting and presentation257 can be communicated most effectively."258

The rules that constitute the document architecture are based on three fundamental assumptions about documents:

1. THE TWO VIEWS: all documents consist of a layout view, which is how the document is physically organized into pages,259 and a logical view, which is how the document is intellectually subdivided into units of meaning (e.g. paragraphs).260 The logical structure is usually embedded in the document by the author at the time of creation and editing, whereas the layout structure is determined by a formatting process such as a word processing application.

2. GENERIC AND SPECIFIC STRUCTURE: all documents have a "specific" structure which is the one "the user may read", or in other words, is the human perceptible structure, and an underlying "generic structure" which is "the template that guides the creation of the document and that could be re-used for its amendment."261 The concepts of generic and specific are applied to both layout and logical structures. The generic structure represents properties that are common to a number of documents, whereas the specific structure is an instance of the generic structure that applies only to a given document.
Another way of looking at this is to say that the generic structure represents the standard components that might be common to a number of different documents, such as a title or pages, whereas the specific structure would be the particular instance of the title in a given document.262 The generic structure controls the editing process in that only structures conforming to those defined for the generic structure can be generated as specific structures.263

3. DOCUMENT CLASSES: ODA posits the existence of document classes which it defines as "a set of generic features that are common to a category of documents."264 All document classes have both a generic layout and a generic logical structure. ODA's concept of document classes is more complete than the SGML definition of document type because it encompasses a greater range of document features, including both layout and logical structures, enabling documents to be defined with greater specificity.

257 Presentation is the "operation of rendering the content of a document in a form perceptible to a human being." Ibid., 10.
258 Ibid., 13.
259 Ibid., 19.
260 Ibid., 14.
261 Ibid., 13.

The components of the layout and logical structures are managed as different types of objects. Those associated with the generic view are known as object classes; those associated with the specific view are known as objects. Hence, there are logical object classes and logical objects, layout object classes and layout objects. A layout object class might consist of pages associated with headers, a layout object, a page; a logical object class might consist of chapters, a logical object, sections.
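The statement that only structures conforming to the generic structure can be generated as specific structures can be sketched as a simple conformance check. The object class names, the dictionary GENERIC_LOGICAL, and the function conforms are illustrative assumptions; they are not drawn from ISO 8613 and ignore layout entirely.

```python
# Generic logical structure: each object class names the classes that its
# instances may contain. Class names are illustrative, not taken from ISO 8613.
GENERIC_LOGICAL = {
    "report": {"title", "chapter"},
    "chapter": {"heading", "section"},
    "section": {"paragraph"},
    "title": set(),
    "heading": set(),
    "paragraph": set(),
}


def conforms(obj, generic=GENERIC_LOGICAL):
    """Check a specific structure, given as (class_name, [children]), against the generic one."""
    class_name, children = obj
    if class_name not in generic:
        return False
    allowed = generic[class_name]
    return all(child[0] in allowed and conforms(child, generic) for child in children)


# A specific structure that instantiates the generic structure.
valid_report = ("report", [
    ("title", []),
    ("chapter", [("heading", []), ("section", [("paragraph", []), ("paragraph", [])])]),
])

# A paragraph placed directly under the report is not permitted by the template.
invalid_report = ("report", [("paragraph", [])])

print(conforms(valid_report))
print(conforms(invalid_report))
```

In this sketch the generic structure plays the role of the template, and each tuple tree is one specific structure; an editor built on the same idea would refuse to create the second tree.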
The document architecture subdivides layout and logical objects into a further hierarchy of subordinate entities: document roots, composite objects, and basic objects, which are at the simplest level of the architecture in having no subordinate structural entities. In addition to these, ODA also specifically defines as object entities certain standard layout structures in the form of blocks, frames, pages, and page sets. No such standard entities are defined for the logical structure. The purpose of describing elements as objects is to dissociate them from any one particular use.

262 For instance, the generic logical element "title" is common to all reports, but the title, "Drilling for Water in Abidjan" is a specific instance of the logical structure for the document type, report.
263 Fanderl et al., The Open Document Architecture, 735. This is one approach but another is to define the properties required by an editor to generate the document. For instance, instead of defining as a generic structure in the ODA document a map (an example of a geometric graphic), ODA would define how the word processing application is to interpret the requirement for a map.
264 ISO, ODA: Part 1, 13.

Figure 4 ODA Logical and Layout Structure265
[Diagram labels: page, block, content portion, section, heading, paragraph]

Figure 5 ODA Correspondence Between Logical and Layout Objects Including Content Portion266
[Diagram labels: report, heading, table of contents, chapter, chapter heading, chapter number, figure; legend: one or more occurrences of A; A consists of B and C; A consists of B or C]

265 From ISO, ODA: Part 1, 14.
266 From ISO, ODA: Part 1, 16.
FIGURE 6 ODA Content Architecture and Layout for Diplomatic Elements

This figure is intended to illustrate how diplomatic elements would be interpreted as part of an ODA document architecture in terms of logical and layout objects. The protocol is only one part of the document's intrinsic elements. The others would be the text and the eschatocol.
[Diagram labels: layout objects: page, block; logical objects: section (protocol), paragraphs (entitling, date, invocation, superscription, inscription); content portions]

The logical and layout structures of an ODA document are in theory intended to be quite separate. This is important because the separation of layout and logical structures permits another layout to be applied to the same document when it has been exchanged between two different systems. But under some circumstances, layout may be driven by logical requirements, or presentation driven by either logical or layout requirements. For example, each section of a report may start on a new page, in which case the formatting must recognize that a page break will be triggered by the end of each section. This dynamic relationship between logical and layout structures is captured by a document component in the form of directives called a style. ODA recognizes layout styles, defined as "a constituent of the document, referred to from a logical component, that guides the creation of a specific layout structure,"267 and presentation styles, defined as "a constituent . . . referred to from either a logical or a layout component which guides the format or appearance of a document."268 Presentation styles "aggregate information that concerns the formatting of content, such as fonts and line-spacing."
269 ODA also structures the content of a document, defined as "the information conveyed by the document other than the structural information, and that is intended for human perception."270 Content architecture is "the rules for defining the internal structure and representation of the content of basic components in terms of a set of content elements, attributes and control functions, and guidelines for presentation of the content."271 The content architecture is used to determine document size, number of pages, and languages, the basic elements of representation, such as letters, pels, or geometric graphic elements (lines, polygons etc.), and other attributes relating to content, such as access.

267 Ibid., 8.
268 The difference between this and layout is not clear. Formatting is defined as "the carrying out of operations to determine the layout of a document." Ibid., 6.
269 Fanderl et al., The Open Document Architecture, 736.
270 ISO, ODA: Part 1, 4.
271 Ibid., 16.

In other words, ODA content refers not to the subjects that may be treated in the document, but to the way actual instances of information will be represented. For instance, words may be expressed as character text, or diagrams such as arcs or circles as geometric graphics. Each such instance forms a content element that is associated with a basic logical or a basic layout object. For instance, the content element of character text may be associated with the logical element paragraph. A set of content elements that "belongs to" or is subordinated to a logical or layout object is called a content portion.272 Each content portion may therefore have its own architecture or set of rules for defining its internal structure in terms of content elements, their characteristics, and how they may be processed.
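The relationship between content portions and the basic logical and layout objects they belong to can be sketched as follows. The names ContentArchitecture, ContentPortion, and group_by_layout, and the sample portions, are hypothetical; the three content types simply follow the character, raster, and geometric content mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class ContentArchitecture(Enum):
    CHARACTER = "character"
    RASTER_GRAPHICS = "raster graphics"
    GEOMETRIC_GRAPHICS = "geometric graphics"


@dataclass(frozen=True)
class ContentPortion:
    architecture: ContentArchitecture
    elements: tuple          # e.g. characters, pels, or graphic primitives
    logical_object: str      # basic logical object the portion belongs to
    layout_object: str       # basic layout object that presents it


def group_by_layout(portions):
    """Map each layout object to the logical objects whose content it carries."""
    grouped = defaultdict(list)
    for portion in portions:
        grouped[portion.layout_object].append(portion.logical_object)
    return dict(grouped)


# Illustrative portions: two pieces of character text and one diagram.
portions = [
    ContentPortion(ContentArchitecture.CHARACTER, tuple("Dear Mr. Abe"), "salutation", "block 1"),
    ContentPortion(ContentArchitecture.CHARACTER, tuple("I wish to nominate"), "paragraph", "block 1"),
    ContentPortion(ContentArchitecture.GEOMETRIC_GRAPHICS, ("line", "arc"), "diagram", "block 2"),
]

print(group_by_layout(portions))
```

The point of the sketch is that the same portion is reachable from both views: the logical view asks what the content means (a salutation, a paragraph), the layout view asks where it sits (a block on a page).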
As with layout and logical features, it is possible to have common content features or object classes, keeping in mind that content is always associated with, and therefore part of, the layout and logical structure.

DAPs

Just as textual features of documents can be standardized by the use of SGML DTDs, so certain constituents of ODA documents can be standardized by means of document application profiles (DAPs). "Each of the DAPs specifies open document interchange within a certain class of applications [e.g. word-processing or imaging applications], through the definition of a set of features that are to be preserved with regard to document layout and processing behavior and its presentation in terms of ODA constituents."273 So far, DAPs have been issued for revisable teletext messages, the handling of text, raster and geometric graphics by certain word-processing applications, such as Word Perfect, and advanced formatting for sophisticated document applications such as computer-aided publishing.274 The DAPs are defined in terms of ODA functionalities and consist of globally registered identifiers that permit a receiving system to determine if it can handle the constituent.

272 Ibid., 15.
273 Fanderl et al., The Open Document Architecture, 737.
274 Ibid., 737.

An ODA document, then, is defined according to one of three document architectures: processable, formatted, or formatted processable. It is comprised of a hierarchy of constituents consisting of one or more generic layout structures, specific layout structures, layout styles, generic logical structures, specific logical structures, and presentation styles. These metaelements are each associated with various types of objects which instantiate the actual structure and consist of an object description, an object, a presentation style, a layout style, a content portion description, and a document profile.
275

ODA DOCUMENT PROFILE

The profile of an ODA document consists of "a set of attributes associated with the document as a whole."276 An ODA attribute is "a property of a document or of a document constituent (e.g. a logical object, a layout object, a logical object class, a layout object class, a style or a content portion). It expresses a characteristic of the document or document component concerned, or a relationship with one or more documents or document components."277 Each attribute must be broken out according to the following criteria:

• classification (mandatory, non-mandatory, defaultable)
• permissible values divided into basic and non-basic values
• default values, if the value is defaultable.

The profile consists of three clusters of attributes: constituents (i.e. generic logical and layout structures, object classes etc.); processing and imaging attributes, known as characteristics; and document management attributes which apply to the document as a whole (e.g. author's name, title etc.).278 The profile may or may not include all the attributes and may or may not be exchanged with a document, but the profile may be exchanged by itself. [SEE Thesaurus for definitions of each element of the profile.]

275 ISO, ODA: Part 1, 4.
276 Ibid., 18.
277 Ibid., 16.

List of ODA attributes

Document Constituents
  Generic layout structure
  Specific layout structure
  Generic logical structure
  Specific logical structure
  Layout styles
  Presentation styles
  External document class
  Resource document

Document Characteristics
  Content architecture classes
  Interchange format class
  ODA version
  Non-basic document characteristics
    Profile character sets
    Comments character sets
    Alternative representation character sets
  Document constituent attributes
    Page dimensions
    Medium types
    Layout paths
    Protection
    Block alignments
    Fill orders
    Transparencies
    Colours
    Borders
    Page positions
    Types of coding
    Coding attributes
    Presentation features
  Non-basic structure characteristics
    Number of objects per page
  Additional document characteristics
    Unit scaling
    Fonts listing

Document management attributes
  Document description
    Title
    Subject
    Document reference
    Document type
    Abstract
    Keywords
  Dates and times
    Creation date and time
    Local filing date and time
    Expiry date and time
    Start date and time
    Purge date and time
    Release date and time
    Revision history
  Originators
    Organizations
    Preparers
    Owners
    Authors
  Other user information
    Copyright
    Status
    User-specific codes
    Distribution list
    Additional information
  External references
    Reference to other documents
    Superseded documents
    Local file references
  Content attributes
    Document size
    Number of pages
    Languages
  Security information
    Authorization
    Security classification
    Access rights

ODA clearly approaches the ability to capture all the diplomatic attributes of a record more closely than SGML, at least as advanced by the Text Encoding Initiative, because it is designed to capture both the physical and intellectual components of a record, as can be seen from Figure 5: ODA Correspondence Between Layout and Logical Objects, where logical elements can be equated with elements of layout.
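The three criteria by which each profile attribute is broken out (classification, permissible values, default value) lend themselves to a brief sketch. Everything specific here is an assumption made for illustration: the names Classification, AttributeSpec, and resolve are hypothetical, and the classification and default assigned to each attribute in SPECS are invented examples, not the values given in ISO 8613.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class Classification(Enum):
    MANDATORY = "mandatory"
    NON_MANDATORY = "non-mandatory"
    DEFAULTABLE = "defaultable"


@dataclass(frozen=True)
class AttributeSpec:
    name: str
    classification: Classification
    default: Optional[Any] = None  # only meaningful for defaultable attributes


def resolve(profile: dict, specs: list[AttributeSpec]) -> dict:
    """Apply defaults and reject a profile that lacks a mandatory attribute."""
    resolved = {}
    for spec in specs:
        if spec.name in profile:
            resolved[spec.name] = profile[spec.name]
        elif spec.classification is Classification.MANDATORY:
            raise ValueError(f"missing mandatory attribute: {spec.name}")
        elif spec.classification is Classification.DEFAULTABLE:
            resolved[spec.name] = spec.default
        # a missing non-mandatory attribute is simply omitted
    return resolved


# Hypothetical classifications, chosen only to exercise each branch.
SPECS = [
    AttributeSpec("ODA version", Classification.MANDATORY),
    AttributeSpec("Title", Classification.NON_MANDATORY),
    AttributeSpec("Status", Classification.DEFAULTABLE, default="draft"),
]

print(resolve({"ODA version": "ISO 8613"}, SPECS))
```

A profile handled this way can be exchanged on its own, as the text notes, since it carries no reference to the document body.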
ODA is particularly interesting for its set of document management attributes. These map well to diplomatic elements in many respects. Its date attributes are particularly sensitive to the moments of action and documentation as well as to the filing of the record. The attributes for persons, while lacking in precision in some respects (SEE the discussion under persons in Chapter Two), are comprehensive in their range and go well beyond the narrow bibliographical interpretations of persons. ODA is also sensitive to the question of status of transmission, and goes beyond traditional diplomatic requirements in making provisions for security and access. Above all, ODA, unlike the SGML TEI, makes a firm separation between form and content, which translates into elements of content configuration rather than interpreting subjects as unique elements. ODA therefore seeks to standardize documents, which makes it particularly suitable for documents of a prescriptive and formal nature, even though it is possible for these to have a processable status. In this respect, ODA is the diametric opposite of the SGML TEI, whose emphasis is on permitting the widest possible latitude in interpreting text in all its forms. This is just as well because ODA's document architecture is quite complex to handle. Once defined for any given type of document, it would not be practical to make frequent changes.

ODA vs SGML

Bormann and Bormann maintain that the differences between SGML and ODA are not that great. In fact, the two are being driven together by the emergence of desktop publishing, which is making the layout of office documents just as demanding as that of commercial publications.
As the walls between traditional forms of documents (commercial publication in all its variety and office documents) dissolve in the environment of information interchange, and their contents are no longer "frozen" by the document form, the distinction between office and publishing disappears. For a single standard to prevail, it must be equally capable of addressing both environments.

This development has resulted in a number of planned extensions to both SGML and ODA. SGML is being extended so that it can include standardized semantics for controlling document layout. This is being accomplished by the development of a new standard designed to maintain a clear separation between logical document structure (as marked up in SGML) and instructions for the automatic creation of a page layout. The Document Style Semantics and Specification Language (DSSSL)[279] achieves this separation by decoupling layout specifications from the document itself. Extensions planned for SGML include support for direct access to document components, musical notation, and type fonts. ODA is to be extended with support for security, improved layout, colour, different data types and computations, indexing, voice, time synchronization, annotations, hypertext, revision control, distributed editing, backward compatibility and a standard application programming interface that will permit ODA to work easily with the many tools users require, such as spreadsheets, graphics programs, and word processing.[280] Such extensions should bring SGML and ODA into increasing compatibility.

[279] ISO/IEC DP 10179 - Text Communication - Document style semantics and specification language. (Geneva: International Organization for Standardization, 1989). Cited in Bormann & Bormann, Standards, 162, fn.
These extensions interject another stage into document processing that elaborates the basic document processing model by introducing the concept of mapping the source document to a transit document.[281] Neither ODA nor SGML operates in a pure document processing environment of its own; both must be accommodated to existing applications through converters which provide interchangeability of documents between different systems. SGML was premised on the belief that authors would continue to edit their work by means of their usual, local document-processing systems, using them for both information and document markup. This is why SGML is human-readable. But this requires a knowledge of the codes, which can be time-consuming to acquire. Moreover, the codes have to be input with accuracy. Similarly, PODA (Pilot ODA) was developed to transform local application document formats into the ODA format, a conversion which would otherwise be very time-consuming. Such approaches, however, can only be transitional because they sacrifice functionality in order to minimize the possibility of errors (as we recognize the neophyte linguist by his stilted grammar and attempts at commonplace conversation).

[280] Bormann & Bormann, Standards, 154-158.
[281] Ibid., 158.

ISO 10166 Document Filing and Retrieval (DFR)

Document filing and retrieval is the complement to document creation and exchange. DFR and ODA are intended to be complementary standards and share many of the same document management attributes. DFR is intended to provide "a large capacity document store to multiple users in a distributed office environment."[282] It is not "an attempt to generalize all filestores in computing systems, but rather, filestores where clients and servers are on different nodes of a distributed system."[283] DFR provides services between two "atomic" parts - the DFR-Server and the DFR-User.
[284] The DFR-Server provides access to a file store for the user by means of Filing Ports, Retrieval Ports, and Administration Ports. The filing and retrieval services supported by the DFR configured server are called the DFR-Server Abstract Service, which includes information on how the user can make use of the service. Administration is considered a separate type of service.

The DFR server gives access to a DFR-Document store which consists of a number of different types of document objects. These are DFR-Documents, the most basic objects, which could comprise any sort of document; DFR-Groups, or collections of documents; DFR-References, which provide a means to include a document in more than one group without making copies; and DFR-Search-Result-Lists, which merely contain information satisfying some search criteria.[285] The DFR-Abstract-Service can be used to create, delete, modify, copy, move or simply read stored objects. The attributes of each object can also be created, copied, moved, stored, or modified, but the content of an object cannot be changed. In other words, the content of an object can be described by an attribute, and the description (and therefore the content) altered, but the actual instance of the content, such as the text of a document, cannot be changed. To do that, the document must be imported into an application environment where this is permitted, and then returned to the DFR-document store.

[282] ACCIS, Strategic Issues, 51.
[283] Ibid., 51.
[284] The DFR-Server is a file server, which is "a computer and storage device dedicated to storing files." Margolis, Personal Computing Dictionary, 428.
[285] ACCIS, Strategic Issues, 52.

The DFR-document consists of a set of attributes established in a profile, and a content, or actual body of the document. The attributes must establish the location of the document at the very least.
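The object model just described (documents, groups and references held in one store, with attributes that can be modified while the stored content cannot) can be sketched as follows. This is an illustrative sketch only: the class names, method names and the integer identifiers are assumptions made for the example and are not drawn from ISO 10166.

```python
from itertools import count
from typing import Dict, List

_next_id = count(1)  # hypothetical identifier generator


class DFRObject:
    """Any stored object: an identifier plus a set of attributes."""

    def __init__(self, attributes: Dict[str, str]):
        self.upi = next(_next_id)
        self.attributes = dict(attributes)


class DFRDocument(DFRObject):
    """Attributes plus a content that is read-only once stored."""

    def __init__(self, content: bytes, attributes: Dict[str, str]):
        super().__init__(attributes)
        self._content = content

    @property
    def content(self) -> bytes:
        return self._content  # no setter: content cannot be changed in place


class DFRGroup(DFRObject):
    """A collection of objects, recorded as a list of identifiers."""

    def __init__(self, attributes: Dict[str, str]):
        super().__init__(attributes)
        self.members: List[int] = []


class DFRReference(DFRObject):
    """A pointer that lets one object appear in more than one group."""

    def __init__(self, target_upi: int, attributes: Dict[str, str]):
        super().__init__(attributes)
        self.target_upi = target_upi


class DocumentStore:
    def __init__(self) -> None:
        self.objects: Dict[int, DFRObject] = {}

    def create(self, obj: DFRObject) -> int:
        self.objects[obj.upi] = obj
        return obj.upi

    def modify_attributes(self, upi: int, changes: Dict[str, str]) -> None:
        self.objects[upi].attributes.update(changes)

    def add_member(self, group_upi: int, member_upi: int) -> None:
        group = self.objects[group_upi]
        assert isinstance(group, DFRGroup)
        group.members.append(member_upi)
```

In this sketch a document is placed in a second group by creating a `DFRReference` to it and adding the reference to that group, so no copy of the content is made. Changing the text itself would mean creating a new document object outside the store and filing it again.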
DFR is not at all concerned with the content beyond its presence. The definition of document could include part of a document, such as a page or part of a book, or a group of different documents, such as a number of documents pulled together into a single document object.[286] A DFR-Group consists simply of a collection of DFR-objects that have some common characteristic, and again, could comprise different parts of a document, or a number of documents related by some common purpose or function. Whatever they are, they must all be stored in the same server. DFR does not allow for a distributed document store which would be spread out over a number of different servers.[287] The commonality is defined by attributes while content is captured by a list of all the Unique-Permanent-Identifiers (UPIs) for each member of the group.

A DFR-Reference consists of pointers to DFR-objects that enable them to participate in more than one group. The reference is again a set of attributes combined with a pointer to the particular DFR-object, which could be a DFR-Document or DFR-Group or a DFR-Search-Result-List.

The function of DFR attributes is to "give support to the user in understanding DFR".[288] They may come from a wide variety of sources although DFR supports many attributes of ODA. The attributes break down into two groups: those used to manage the document within the DFR file store (the Basic Attribute Set), and those used to identify the object for management purposes outside the DFR store itself (the Extension Attribute Set).

[286] For instance, a World Bank Staff Appraisal Report on a project consists of a number of different field reports and memorandums.
[287] ACCIS, Strategic Issues, 53.
[288] Ibid., 53.
[288] Ibid., 52.
Basic Attribute Set

Attribute-type Name
  DFR-UPI*
  DFR-Object-Class
  DFR-Document-Type
  DFR-Title
  DFR-Pathname*
  DFR-Parent-Identification*
  DFR-Referent-Deleted*
  DFR-Membership-Criteria
  DFR-Ordering
  DFR-Resource-Limit
  DFR-Resource-Used*
  DFR-Number-Of-Group-Members*
  Version-Name
  DFR-Previous-Versions
  DFR-Next-Version*
  DFR-Version-Root*
  DFR-External-Location
  User-Reference
  User-References-to-Other-Objects
  DFR-Attributes-Create-Date-and-Time*
  DFR-Content-Create-Date-and-Time*
  DFR-Created-By*
  DFR-Attributes-Modify-Date-and-Time*
  DFR-Content-Modify-Date-and-Time*
  Document-Date-and-Time
  DFR-Reservation*
  DFR-Reserved-By*
  DFR-Access-List*

*Assigned by the DFR Server. The other attributes may be assigned by the user or the owner.

DFR Extension Attribute Set

The attributes in this group are assigned by the Owner. They are a subset of the ODA-Document-Profile and therefore have definitions similar to those discussed under ODA.

  Other-Titles
  Subject
  Document-Type
  Document-Architecture-Class
  Keywords
  Create-Date-and-Time
  Purge-Date-and-Time
  Revision-Date-and-Time
  Organizations
  Preparers
  Owners
  Authors
  Status
  User-Specific-Codes
  Superseded-Documents
  Number-of-Pages
  Languages

Since DFR is a document store, it is sensitive to the manipulation of documents, and therefore, to the status of transmission. DFR achieves this through version management, where it tracks documents for their derivation from a Version-Root, and as they become members of groups, which may consist of a family of versions of the same document. It is very flexible in its ability to link with other objects which may or may not be documents (document being only one type of object). With ODA it shares the same sensitivity to moments of action and documentation through its different attributes for dates.
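The version management described above, in which each document can be traced back to a Version-Root, might be sketched as follows using the attribute names from the Basic Attribute Set. The functions, the string identifiers and the assumption that each version has a single predecessor are illustrative only; the behaviour actually specified by DFR may differ.

```python
from typing import Dict, List

# Each record is a plain dictionary keyed by attribute name.
Store = Dict[str, Dict[str, object]]


def new_root(store: Store, upi: str, version_name: str) -> None:
    """File the first version; it is its own Version-Root."""
    store[upi] = {
        "Version-Name": version_name,
        "DFR-Version-Root": upi,
        "DFR-Previous-Versions": [],
        "DFR-Next-Version": None,
    }


def derive(store: Store, previous: str, upi: str, version_name: str) -> None:
    """File a new version derived from an existing one."""
    prev = store[previous]
    store[upi] = {
        "Version-Name": version_name,
        "DFR-Version-Root": prev["DFR-Version-Root"],
        "DFR-Previous-Versions": [previous],
        "DFR-Next-Version": None,
    }
    prev["DFR-Next-Version"] = upi  # link the predecessor forward


def lineage(store: Store, upi: str) -> List[str]:
    """Walk back through Previous-Versions to the root, oldest first."""
    chain = [upi]
    while store[chain[-1]]["DFR-Previous-Versions"]:
        chain.append(store[chain[-1]]["DFR-Previous-Versions"][0])
    return list(reversed(chain))
```

Filing a draft with `new_root` and two later versions with `derive` gives a chain that `lineage` can reconstruct from any member, which is the kind of evidence of derivation that bears on status of transmission.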
DFR is also capable of capturing the persons, and the relationships between records, or the archival bond, through attributes such as References-to-Other-Objects. Because it is a document store, however, DFR picks up on the document once it has been created, so it is not concerned with the logical and layout structure and, in this respect, makes assumptions about the completeness of a record that ODA does not.

CHAPTER FOUR

A THESAURUS OF RECORD ATTRIBUTES

The Thesaurus is designed to equate document attributes, tags and structural features as identified in ODA, DFR and SGML with general and special diplomatic concepts. In so doing, the Thesaurus treats diplomatics as a de facto international standard of document management in its own right, that is, one sanctioned by widespread usage rather than any standards-setting body. The Thesaurus relates attributes (or tags) in the document profile as well as structural characteristics of documents (such as the logical structure) of ODA, DFR, and SGML TEI to standard diplomatic concepts. It also proposes new diplomatic elements that are not identified with any open document exchange standard at this time. These proposed elements are envisioned as attributes of the document profile and are indicated separately by underlining in bold, e.g. access control. The Thesaurus also includes general diplomatic concepts (such as the juridical system and the written document) that are too abstract to be directly capturable as tags or attributes in ODA, SGML or DFR but which are nonetheless necessary to an understanding of the theoretical relationship between diplomatics, electronic records management (ERM), and open document exchange.
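The key and rules that follow define four relationship codes (s, n, b, r) linking each entry term to terms in the standards, with exact matches recorded as synonyms and partial matches assigned to narrow or broad terms. A minimal sketch of how one entry might be held as data is given here. The class and method names are hypothetical; the sample relations are those listed under the Thesaurus entry for access control.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A relation target is (source code from the key, term), e.g. ("ODA", "Access Rights").
Relation = Tuple[str, str]


@dataclass
class ThesaurusEntry:
    term: str
    source: str  # e.g. DIP, ODA, DFR
    relations: Dict[str, List[Relation]] = field(default_factory=dict)

    def matches(self, source: str) -> Dict[str, List[str]]:
        """Split matches for one source into exact (s) and partial (n, b)."""
        def pick(code: str) -> List[str]:
            return [term for (src, term) in self.relations.get(code, [])
                    if src == source]
        return {"exact": pick("s"), "partial": pick("n") + pick("b")}


access_control = ThesaurusEntry(
    term="access control",
    source="DIP",
    relations={
        "n": [("DFR", "Access-List"),
              ("ODA", "Access Rights"),
              ("SGML", "Access Rights")],
        "b": [("ERM", "security")],
        "r": [("DFR", "Authentication")],
    },
)
```

Here `access_control.matches("ODA")` reports no exact equivalent and one partial match, which mirrors the way the entry records "No diplomatic equivalent" under s while listing standards attributes under n.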
KEY

s     SYNONYMOUS TERM OR ATTRIBUTE/TAG
n     NARROW TERM OR ATTRIBUTE/TAG
b     BROAD TERM OR ATTRIBUTE/TAG
r     RELATED TERM OR ATTRIBUTE/TAG
=     SEE ALSO
ARCH  ARCHIVAL (INCLUDING RECORDS MANAGEMENT) TERM
DIP   DIPLOMATIC TERM
ERM   ELECTRONIC RECORDS MANAGEMENT TERM
ODA   OPEN DOCUMENT ARCHITECTURE STANDARDS
SGM   STANDARD GENERALIZED MARKUP LANGUAGE
DFR   DOCUMENT FILING AND RETRIEVAL STANDARD
PRP   PROPOSED NEW DIPLOMATIC TERM
AU    DEFINITION SUPPLIED BY AUTHOR

RULES

1. "No standards equivalent" or "No diplomatics equivalent" means that there is no term, attribute, tag, or structural characteristic of documents at present identified within open document exchange standards or diplomatics that is synonymous, or is an exact match, with the entry term. In cases of partial equivalence, the term is assigned to either the Narrow term or the Broad term.
2. The citations used for each definition indicate a numbered source followed by the page reference, e.g. (1)-156 = ACCIS, Management of electronic records, page 156.
3. Where diplomatic terms have no actual attribute/tag or conceptual equivalent in ODA, SGML or DFR, the Thesaurus attempts to capture them at a broader level where there is some equivalency. For instance, constitutive acts have no equivalents except at the broader level captured in the term acts.
4. Both diplomatic terms and standards attributes/tags or document features are given for each level of the Thesaurus hierarchy, but where no standards equivalents or occurrences may be found, then only the diplomatic concept will be given.
5. Because of the lack of precise definitions in the Text Encoding Initiative, all SGML tags have been correlated with diplomatic concepts on an approximate basis.
6. All standards terms, including attribute names, tags, and concepts, are italicized; diplomatic and archival terms are in bold face.
7. Where the names of DFR and ODA attributes are synonymous (but not necessarily the meaning), the ODA spelling (which is unhyphenated) is preferred for the entry heading.

Abstract
ODA - An attribute of Document Description that contains information to summarize a document. (5)-9
SGML TEI - A summary of the content of the document as continuous prose. (7)-189

access control
DIP - Proposed definition: The right of entering an electronic data store with specific privileges to view or process data in some way, or to administer or modify the system itself. The authorization usually takes the form of a list of those who are entitled to access together with their specific privileges. Access is permitted on the basis of authentication. The access list may permit entry to a system, an application, or to individual objects, such as a document, or groups of objects. (AU)
s: No diplomatic equivalent.
n: DFR Access-List
   ODA Access Rights
   SGML Access Rights
b: ERM security
r: DFR Authentication

Access-List
DFR - This attribute identifies the security subjects allowed to access this DFR-Object, specifying for each of them their respective access rights. (8)-88
SEE access; reliability; security

Access Rights
ODA - This attribute specifies the access right(s) to the document relating to its privacy, as defined by the current owner(s) of the document. (4)-14
SGML TEI - An element of the local file system reference in the tag set for the front matter of office documents that is intended to tag electronic access rights. (7)-189
SEE access; reliability; security

acts
DIP - Movements of the will aimed to create, maintain, modify, or extinguish situations. (22)-3
• Among human facts in general, the special type of fact which results from a will determined to produce an act is called an action or act. The operation of will distinguishes an act from any other general fact. Therefore, all acts are also facts, but only facts generated by a determined will are acts. All archival documents express a fact which may be considered the transaction embodied in the record. (3)
  mere act: An act in which the will is limited to the accomplishment of the act, without the intention of producing any other effect than the act itself: effect and act coincide. (11)-7
  simple act: When the power of accomplishing the act is concentrated in one individual or organ we have a simple act. The will to produce the act is one will. (4)-21
  collegial act: When the power of accomplishing the act is concentrated in a number of individuals acting with one will (for example, a circular signed by a number of ministries). (11)-13. A collegial act is a form of simple act.
  collective act: Those acts produced by the identical wills of different individuals or organs, and resulting in one document. (11)-9
  contracts: When the power of accomplishing the act belongs to two or more interacting parties (individuals, public bodies, states, state-and-individuals) we have a contract. Notwithstanding the difference in motivation and interests between the parties, their wills converge in one, aimed at producing one act. (11)-13
  multiple act: Acts produced by the will of the same individual or organ but directed to different individuals or organs, and resulting in one document, e.g. a document giving merit increases to a number of employees. (11)-13
  compound act: Acts composed of many different acts produced by the same individual or organ or by a number of individuals or organs, but all essential to the formation of some final act of which they are partial elements. The partial acts may concern the same or different subjects and may respond to convergent or contrasting interests, but each results in documents which are all necessary to the formation of the final document. The final product of the compound act may be further divided into continuative acts, complex acts and acts on procedure. (11)-14
  acts on procedure: A form of compound act. When the final act derives from a series of different acts (which may be simple, compound, collegial, or collective, in sequence or parallel) produced by a number of different individuals and/or organs, which have equal or different motivation or interests and accomplish different functions. However, all these partial acts have the common aim of making possible the accomplishment of the final act. (11)-14
  complex act: A type of compound act that occurs when individuals or organs which may have different motivation and interests but pursue the same function produce a number of simple acts having the same content, all necessary to the accomplishment of the final act, e.g. all the series of approvals needed for the appointment of a Dean. (11)-14
  continuative act: A type of compound act in which the same individual or organ needs to manifest the same will more than once in order to produce the final and definitive act, so that the partial acts constituting the compound act are all identical, but the documents resulting from them are not (for example, a City Council's three subsequent deliberations of the same by-law). (11)-14
Acts in general:
s: ST No standards equivalent
n: DIP SEE Glossary - acts - for a classification of different types of acts
b: DIP facts; juridical act; juridical fact
r: DIP functions; phases of procedure; procedure; process; transaction
   PRP Business Process linked to ODA Document Type

addressee
DIP - The person(s) to whom the document is directed. (22)-2
• Every document has two theoretical addressees: the person to whom the act is directed, and the person to whom the document itself is directed. These are not necessarily the same. There is no document without an addressee because documents result from actions and any action falls on somebody. An action may be directed to an entire collectivity, and in such a case the addressee of the related document may be all the people, or a social, ethnic, or religious group and so forth. (17)-6
s: ST No standards equivalent
n: DIP addressee of document; addressee of act
   ODA Distribution List
   SGML In Reply To; Primary Recipient; Secondary Recipient
b: DIP persons
r: DIP writer; author

addressee of the act
SEE addressee

addressee of the document
SEE addressee

annotations
DIP - Extrinsic elements consisting of additions to the record after its compilation. They can be distinguished in categories in relation to the procedural moment in the treatment of the affair in which they were added to the record in question.
(22)-6
• For an electronic document, the document profile is the container of all annotations, but also of some elements of intellectual form. (22)-22
SEE annotations of execution; annotations of handling; annotations of management

annotations of execution
DIP - These are added in the execution phase, when the act is put into effect. They comprise annotations of authentication and registration. Authentication is the express, legal recognition that a record or the signature(s) on it is what it purports to be (particular to certain record forms). Registration is the reference to a transcription of the record made in a register by an office different from the one creating the record (particular to certain record forms). (22)-6
s: ST No standards equivalent
n: ODA Authorization; Expiry Date and Time; Release Date and Time; Start Date and Time
   SGML Authorizing Person
b: DIP annotations
r: DIP annotations of handling; annotations of management

annotations of handling
DIP - These are added during the handling of the matter. They comprise instructions, such as the mention of previous or following actions, directions for transmission, disposition, classification etc. They also include dates of hearings or readings, and signs added beside the text such as question marks or checks. (22)-6
s: ST No standards equivalent
n: ODA Additional Information; Revision History
   SGML User Info
b: DIP annotations
r: DIP annotations of execution; annotations of management

annotations of management
DIP - These are added to the document as means of controlling the document itself. They include registry numbers, classification codes identifying its relationship with other documents in the receiving or generating office, cross-references to related files, date of receipt, and name of the recipient, usually the stamp of the receiving office.
(22)-6,7
s: ST No standards equivalent
n: DFR External-Location; Keywords; Local-Filing-Date-and-Time; Membership-Criteria; Number-of-Group-Members; Parent-Identification; Pathname; Purge-Date-and-Time; Referent-Deleted; Unique-Permanent-Identifier (UPI); User-Reference; Resource-Limit; Resource-Used
   ODA Local File References; Authorization; Security Classification; Abstract; interchange format class
   PRP Access Control
   SGML Document Reference (if defined as unique for each document); Local File System: Reference, File name, Location; Front Matter; encoding instructions; file header; revision history; User Comments; Source Description; Abstract; Authorizing Person; Sensitivity; Bibliographic File description
b: DIP annotations
r: DIP annotations of execution; annotations of handling

appreciation
DIP - A sort of prayer for the realization of the content of the document, e.g. "looking forward to, I appreciate etc." (19)-12
No standards occurrence; SEE protocol

archival bond
DIP - The relationship that, because of the circumstances of their creation, records have with their creator, with the activity in which they participate, and among themselves. The archival bond is originary (it comes into existence when the record is made or received), necessary (it exists for every record), and determined (it is characterized by the purpose of the record). (22)-4
s: No standards equivalent
n: DFR Group; group-content; group-interrelationship; group-member; Root-Group; User-References-to-Other-Objects; Create-Date-and-Time; UPI; Document-Type; Parent-Identification; User-Reference
   ODA Reference to Other Documents; Document Type; Creation Date and Time; document description; document class description
   PRP business process
   SGML Creation Date; Document Reference; References to Other Documents; Additional User Specific Codes
b: DIP archival document
r: ARCH provenance

archival document
DIP - A document created or received by a physical or juridical person for the achievement of its purposes or in the exercise of its functions. (22)-4. All archival documents have facts, a purpose, consequences, and are the result of a genetic process. (3)
s: No standards equivalent
n: No standards occurrence
b: ERM record; document; object
r: DFR Document-Type; Object
   DP transaction
   DIP transaction
   ODA Document Type; Object Class
   SGM Document

attestation
DIP - The subscription of those persons who took part in the issuing of the record (i.e. the author, writer, countersigner, and/or witnesses). It might take the form of signatures. The attestation is the substance and core of the eschatocol. (19)-14,15
s: No standards equivalent
n: ODA logical object (where this is a signature); User Specific Codes
b: DIP persons
r: DIP eschatocol

attribute
DP - "A property or characteristic of one or more entities, for example, colour, weight, sex."
In an EDMS, kinds of attributes include data attributes, display attributes and user attributes. (1)-138
• Information (including text and voice) that can be interchanged with a document but which will only be presented to the recipient if particular conditions arise, such as an explicit request. (9)-157
DFR - There are several different types of attributes: those that a server will execute on a mandatory basis (basic) and those that it executes optionally (extension). The various abstract operations also have their own attributes: search, security, produce and consume. (8)-7, 18
• DFR attributes are managed independently of the content. Attributes can be read or changed. If an attribute is changed the content of that object is not changed. Attributes characterize an object, that is, each attribute provides a piece of information about, or derived from, the object to which it corresponds. Attributes affect storage and retrieval of an object and control access to it. (8)-21
• A data item that identifies a DFR-object, describes its DFR-content, helps control access to it, or in some way is associated with the DFR-object. (8)-5
ODA - An element of a constituent of a document that has a name and a value and that expresses a characteristic of this constituent or a relationship with one or more constituents. (2)-3
SGML - Information which is in some sense descriptive of specific element occurrences but not itself regarded as an element. (7)-24
SEE ALSO tag

Attributes-Create-Date-and-Time
DFR - A DFR-specific mandatory attribute, part of the basic attribute set, that contains the date and time when the mandatory attributes of a DFR object were stored in a DFR document store.
A DFR server sets it to the current date and time during the create abstract operation. (8)-85
SEE authentic record; status of transmission

Attributes-Modified-By
DFR - A DFR-specific, mandatory attribute, part of the basic attribute set, that identifies the DFR user who has most recently modified the DFR attributes of a DFR object. It can be read only by a DFR user having at least extended-read access rights. (8)-86
SEE author; status of transmission; writer

Attributes-Modify-Date-and-Time
DFR - A DFR-specific, mandatory attribute, part of the basic attribute set, that contains the date and time when the attributes of a particular DFR object were last modified in a DFR document store. When a DFR object is created, this attribute is set to current time. Subsequently, the DFR server maintains the attribute. It is not updated when attributes that are themselves modified or deleted by the server are changed, i.e. DFR Pathname, DFR Parent Identification, DFR Referent Deleted, DFR Resource Used, and DFR Number-of-Group-Members. (8)-86
SEE authentic copy; status of transmission

authentication
DIP - The legal recognition that a signature is affixed by, and belongs to the person whose name it expresses, that a document is what it purports to be, or that a copy conforms to the original. Authentication may refer to one or more signatures, to an entire document, or to a copy of a document. (3)
s: DFR Authentication
   ODA Authorization
n: DIP authentic copy; copy-in-the-form-of-an-original; countersigner; witness
b: DIP authenticity
   ERM security
r: DIP author; copy; original; writer
   ERM access

authentic copy
DIP - A copy certified by officials authorized to execute such a function, so as to render it legally admissible in evidence.
Also included are inserts - quoted or reported - in subsequent original documents in order to renew their effects, or because they constitute precedents of the legal act attested in the subsequent originals. The perfect form of insert is that called vidimus. An authentic copy in general, and a vidimus in particular, only guarantees the conformity of the copy to the original text. Thus, an authentic copy in the diplomatic sense is also an authentic copy in the legal sense but neither in diplomatics nor in law is it an authentic document. The authentication provides the copy with validity and the effects of the original, not with its forms, and it does not influence diplomatic, legal, or historical genuineness. (4)-21
s: No standards equivalent
n: DFR User-Reference; User-References-to-Other-Objects where it is a link
   DIP vidimus
   ODA References to Other Documents
b: DIP copy
r: DIP status of transmission

authentic record
DIP - A record whose genuineness can be assumed on the basis of one or more of the following: mode, form, and state of transmission, and manner of preservation and custody. (22)-12
s: No standards equivalent
n: DIP form of transmission; mode of transmission; status of transmission
   DFR Attributes-Create-Date-and-Time; Content-Create-Date-and-Time; Create-Date-and-Time; Attributes-Modify-Date-and-Time; Document-Date-and-Time; Next-Version; Previous Version; Revision-Date-and-Time; Status; User-References-to-Other-Objects; Version-Root
   ODA Document Date and Time; Status; interchange format class; ODA Version; Revision History; Security Information
   SGML Authorizing Person
b: DIP authenticity; authentication
r: DIP forgery
   EDMS security

authenticity
DIP - The extent to which a document is what it purports to be. (4)-17
s: No standards equivalent
n: DIP authentication; authenticity - diplomatic; authenticity - historical; authenticity - legal
b: DIP juridical system
r: DIP genuine document
   ERM security

Author
DIP - The person(s) competent for the creation of the document which is issued by them personally, by their command, or in their name. Usually, the author of a document coincides with the author of the act put into being or referred to by the document, because the person whose will has given origin to the act documented tends to be also the person competent for the creation of the related documentation.(17)-5,6
s: ODA Author = author of document; Owners = author of act
   DFR Authors = author of document; Owners = author of document; Created-By
   SGML Author; Authorizing Person
n: DFR Content-Modified-By
   DIP author of the act; author of the document
b: DIP persons
   ODA Originators
r: DIP attestation; countersigner; witness; writer
   DFR Organizations; Preparers
   ODA Organizations; Preparers
   SGML Preparer; Originator

Authors
DFR - A non-DFR-specific, optional attribute, part of the extension attribute set, that defines name(s) of the person(s) and/or organization(s) responsible for the preparation of the intellectual content of the document. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, authors.(8)-91
ODA - An attribute of originators that identifies the name(s) of the person(s) and/or organization(s) responsible for the preparation of the intellectual content of the document. The value of this attribute consists of one or more entries, each with two optional parameters: personal name of author, and author's organization.(5)-12
SGML TEI - Names of those responsible for the intellectual content of the work, as given on the title page.
SEE author; juridical person; persons
SEE ALSO owners; preparers; originators; organizations

Authorization
ODA - An attribute of security information that identifies the person or organization approving or authorizing the document.(5)-14
SEE annotation of handling; deliberation control; security

Authorizing Person
SGML - An attribute of security information that identifies the person or organization approving or authorizing the document.(7)-190
SEE reliability; security
SEE ALSO persons; juridical person

Back Matter
SGML - That part of the core structural features of a document consisting of bibliography, glossary, and index features.(7)-88 Not relevant to records.

basic processing model
DP - A model of open document interchange in which a source document goes through an automatic processing step in order to be received as a result document. But source and result documents are generated from a pre-determined document type which must be mapped to the specifications of the automatic processor in order to be transmitted and interpreted correctly.(9)-152
SEE ALSO source document; Document Type; extended processing model

Bibliographic File Description
SGML TEI - Part of the file header intended to provide a citation to the source document.
SEE annotations of management

business process
DIP - A term that might be used to capture the concept of acts as carried out by any one juridical person in the course of their business by uniting the type of acts with the procedures needed to put them into effect.
SEE acts

chronological date
DIP - The time of the compilation of the document and/or of the action which the document concerns.(22)-5
s: ODA Dates and Times
n: DFR Attributes-Modify-Date-and-Time; Content-Modify-Date-and-Time; Attributes-Create-Date-and-Time; Content-Create-Date-and-Time
   DIP moment of action; moment of documentation
   ODA Creation-Date-and-Time; Local Filing Date and Time; Expiry Date and Time; Start Date and Time; Purge Date and Time; Release Date and Time; Revision History
   SGML Document Date; Revision History
b: ODA Dates and Times
r: DIP topical date

competence
DIP - The authority and capacity of carrying out an act.(17)-8
s: DIP responsibility
n: DIP author of the act
   ODA Authorization; Preparers
   SGML Originator
b: DIP persons
r: DIP acts; reliability

completeness
DIP - A record that has all the elements of form required by the juridical system in which it is created. Completeness is conferred on a record by the presence of all required elements of its intellectual form, specifically the features of content articulation and the annotations.(22)-5 Completeness of electronic records is conferred by the presence of the following elements of intellectual form: chronological date, topical date, entitling, attestation, addressee, receivers, title or subject, and disposition.(22)-21
s: No standards equivalent.
n: DFR Document-Type; document architecture class; document content; UPI; Object-Class; Document-Date-and-Time; Title; Owners; Authors; Status; Distribution-List; Subject
   ODA document body; document profile; content; document architecture; Title; Start Date and Time; Distribution List; Status; Subject; Authors; Owners
   SGML document type declaration (DTD); document instance; Title; Creation Date; Originator; Author; Primary recipient; Document Status; annotations
b: DIP reliability
r: DIP authenticity; archival document or record

complimentary clause
DIP - A brief formula expressing respect, such as "sincerely yours".
SEE text

compound document
DP - A document containing a mixture of content types that may include text, sound, and raster or vector images.(1)-143
SEE ALSO User-References-to-Other-Objects; User-Reference

conceptual document
DFR - A set of DFR-documents, considered to be "different versions" of the same document.(8)-5
SEE draft

constituent
ODA - A set of attributes that is one of the following types: a document profile, an object description, an object class description, a presentation style, a layout style or a content portion description.(2)-4
SEE ALSO content articulation

Content
DFR - The prime information content of a DFR-object.
The nature of the DFR-content depends on the DFR-object class of the DFR-object.(8)-5
ODA - The information conveyed by the document, other than the structural information, and that is intended for human perception.(2)-4
SEE content
SEE ALSO content architecture; content architecture class; content architecture level; content attributes; content articulation; content element; content portion; Content-Create-Date-and-Time; Content-Modified-By; Content-Modify-Date-and-Time

content
DIP - The places, names and dates that may be mentioned apart from the syntactical and objective elements of form (persons, intrinsic and extrinsic elements, facts and procedures) that are intended to give a document meaning and effect.
s: No standards equivalent
n: DFR document content; Content-Create-Date-and-Time; Content-Modified-By; Content-Modify-Date-and-Time
   ODA Content; content architecture class; content portion; Document Size; Number of Pages; Languages
   SGML element declaration
b: DIP intellectual form
   DFR content
r: DFR object class
   DIP intrinsic elements
   ODA document body

content architecture
ODA - Rules for defining the internal structure and representation of the content of basic components in terms of a set of content elements, attributes and control functions, and guidelines for the presentation of the content.(2)-4
• The rules governing the more detailed internal structure of a content portion associated with a logical or a layout object. The rules depend on the type of content and there can be only one content architecture for each object.(2)-16
SEE content articulation
SEE ALSO content architecture class; content architecture level

content architecture class
ODA - The rules for defining the internal structure and representation of the content of basic components in one set of forms defined for each type of content element.
Examples are formatted form, processable form and formatted processable form.(2)-4
SEE content articulation
SEE ALSO content architecture; content architecture level

content architecture level
ODA - An identified subset of the features pertaining to a content architecture class.(2)-5
SEE content articulation
SEE ALSO content architecture; content architecture class

content articulation
DIP - The elements of the writing and their arrangement, that is, what determines the distinction between a letter and a memo, or a chart and a map.(22)-2
s: No standards equivalent
n: DFR document content; Document-Type
   ODA document constituents, generic layout structure; specific layout structure; generic logical structure; specific logical structure; layout styles; presentation styles; external document class; resource document; content architecture; content portion description; content element; document characteristics: content architecture classes; interchange format class; ODA version; non-basic document characteristics: profile character sets; comment character sets; alternative representation character sets; document constituent attributes: page dimensions; medium types; layout paths; protection; block alignments; fill orders; transparencies; colours; borders; page positions; types of coding; non-basic document characteristics: profile character sets; comments character sets; alternative representation; coding attributes; presentation features; non-basic structure characteristics: number of objects per page; additional document characteristics: unit scaling; fonts listing.
   SGML content articulation; document type definition; document type; document type declaration; document type definition; element; element declaration; element definition.
b: DIP script; intellectual form
r: DIP content configuration; annotations

content configuration
DIP - The mode of expression of the content, e.g. graphics, text, images, or a combination.
(22)-2
s: No standards equivalent
n: DFR document content
   ODA content elements; presentation features; presentation styles
   SGML entity
b: DIP intellectual form; script
r: DIP content articulation; annotations

content attributes
ODA - A group of document management attributes comprising Document-Size, Number-of-Pages, and Languages.(5)-13,14
SEE extrinsic elements; languages; content articulation
SEE ALSO layout; content portion; content element; content

Content-Create-Date-and-Time
DFR - A DFR-specific attribute, part of the basic attribute set, that contains the date and time when the DFR content of a DFR object was stored in a DFR document store. The server sets the attribute to the current date and time when the DFR content of the object is created.(8)-86
SEE chronological dates; moment of documentation
SEE ALSO Content-Modified-By; Content-Modify-Date-and-Time

content element
ODA - A basic element of the content of a document.(2)-5
• For content consisting of character text, the content elements are characters. In the case of images or graphics, the content elements are picture elements (also called pels) or geometric graphics elements (lines, arcs, polygons, etc.).(2)-15
SEE content articulation
SEE ALSO layout; content portion; content attributes

Content-Modified-By
DFR - A DFR-specific attribute, part of the basic attribute set, that identifies the DFR user which has been most recently responsible for modifying the DFR content of a DFR object. It can only be read by a DFR user having at least extended-read access rights.(8)-87
SEE author; draft; writer
SEE ALSO Content-Create-Date-and-Time; Content-Modify-Date-and-Time

Content-Modify-Date-and-Time
DFR - A DFR-specific attribute, part of the basic attribute set, that contains the date and time when the DFR content of a DFR object was last modified in a DFR document store.
When a DFR object is created, this attribute is set to current time. Subsequently, the DFR server maintains the attribute.(8)-86
SEE archival annotation; chronological dates; draft
SEE ALSO Content-Create-Date-and-Time; Content-Modified-By

content portion
ODA - The result of partitioning the content of a document according to its logical and/or layout structure.(2)-5
• A set of related content elements that belong to one basic object (if the document has any logical structure) and one basic layout object (if the document has any layout structure). It follows that a basic logical object has associated with it one or more content portions as does a layout object, and that any basic or composite object (logical or layout) has associated with it an integral number of content portions. Logical and layout objects do not always correspond (e.g. the arrangement of the content into sections and paragraphs need not correspond to pages).(2)-15
SEE content articulation
SEE ALSO layout; content element; content attributes

Contents
SGML TEI - A table of contents, specifying the structure of a work and listing its constituents. Part of the front matter.(7)-45
SEE content

copy
DIP - If a document is not an original or a draft, it is a copy.(4)-20
• Diplomatics recognizes several different types of copies:
copy-in-the-form-of-an-original: A type of copy that is created when two originals of the same document, addressed to the same person and having the same date, are sent to the addressee in two subsequent deliveries. The first delivery is considered to be the original, the second delivery is a copy in the form of an original.(4)-19
imitative copy: A form of copy which reproduces, completely or partially, not only the content but also the forms of a document, including extrinsic elements such as layout, script, special signs etc. of the original. A modern example is the photocopy.(4)-20
inserts or insets: SEE authentic copy
pseudo-original:
A copy that attempts to imitate in every respect the extrinsic and intrinsic forms of an original but is not legally authorized and is therefore a forgery.(4)-21
simple copy: A simple copy is constituted by the mere transcription of the content of the original (e.g. notes taken from a report) and cannot have legal effects. This is the most common type of copy and is usually compiled as an aid to memory.(4)-21
Copies in general:
s: No standards equivalent
n: DFR Reference = virtual copy consisting of pointers to an object; Status
   DIP authentic copy; copy-in-the-form-of-an-original; imitative copy; pseudo-original; simple copy; receivers
   ODA Distribution List; Reference to Other Documents; Status
b: DIP status of transmission
   SGML Document Status
r: DIP original; draft
   DP soft copy

core structural features
SGML TEI - Basic structural features which are common to a large number of texts and which may be said to establish their principal gross structure or shape, e.g. the parts of a book, front matter, body, back matter.(7)-72
SEE office documents; header; script; layout; logical structure; text

core tag set
SGML
SEE ALSO office documents

countersigner
DIP - The signature following the subscription of the writer which has the special function of validating the physical and intellectual form of the document and of guaranteeing that the document was created according to the established procedure and signed by the appropriate person.
The countersigner assumes responsibility only for the regularity of formation of the document and for its forms; that is, not for its content, and not for the wording chosen to express content, but for the presence in the document of all the elements required for its effectiveness.(17)-8
s: No standards equivalent
n: DFR Authentication
   DIP attestation
   ODA Authorization
b: DIP persons
r: DIP reliability

Created-By
DFR - A DFR-specific attribute, part of the basic attribute set, that identifies the DFR user which created the DFR object. It is not modified when the object is moved and can only be read by a DFR user having at least extended-read access rights.(8)-86
SEE author; writer; juridical person
SEE ALSO Dates and Times

Creation-Date-and-Time
DFR - An optional attribute, part of the extension attribute set, that specifies the date, and, optionally, the time of day when the document was created. In the case of an ODA document, the value of this attribute can be taken from the document profile, where it is the equivalent of ODA creation date and time.(8)-90
ODA - An attribute of dates and times that specifies the date, and optionally, the time of day when the document was initially created.(5)-10
SEE dates; moment of action; moment of documentation

crystals
SGML - Small objects with internal structure containing semantically constrained sorts of data. Typical crystals for office documents include names, addresses, and organizations. The address crystal consists of sub-elements for city, country, state, street, postbox, telephone etc.(7)-88,89

custody
ARCH - The responsibility for care of archival material based on its physical possession.(21)-6 Custody does not always include legal ownership or the right to control access to records.
(14)-9
SEE reliability
SEE ALSO access; custodial history; security

custodial history
ARCH - The succession of offices or persons who had custody of a body of archival materials from its creation to its acquisition by an archives or manuscript repository.
SEE security; reliability
SEE ALSO provenance

dates
SEE chronological date; topical dates

Dates and Times
ODA - A group of document management attributes that comprise Document Date and Time, Creation Date and Time, Local Filing Date and Time, Expiry Date and Time, Start Date and Time, Purge Date and Time, Release Date and Time, and Revision History.(5)-10,11
SEE dates

declarations
SGML - A formal statement in simple syntax used to define different levels of data in a document. There are several types in SGML.(AU)
SEE script; document type; content
SEE ALSO element type declaration; entity; document type declaration; SGML declaration

descendant
DFR - For a given DFR-group, any of the DFR-group members, and recursively, any descendant thereof.(8)-5
SEE draft

discretionary documents
DIP - Documents which refer to an act where the document is not needed to bring the act into existence (dispositive document) or to prove the existence of an oral act (probative document) (\\)-l
SEE narrative documents; supporting documents

disposition
DIP - That part of the text of a document in which the author expresses their will or judgment. Here, the fact or act is expressly enunciated, usually by means of a verb able to communicate the nature of the action.(19)-12
SEE text

dispositive document
DIP - If the purpose of the written form was to put into existence an act, the effects of which were determined by the writing itself (that is, if the written form was the essence and substance of the act), the document was called dispositive, e.g.
contracts and wills.(29)-9
s: No standards equivalent
n: DFR Document-Type
   ODA Document Type; document architecture class; User Specific Codes
   PRP business process
   SGML Document Type, document type definition
b: DIP juridical relevance
   SGML office documents
r: DIP narrative document; probative document; supporting document

Distribution List
ODA - An attribute of other user information that specifies a list of intended recipients of a document. It has two parameters, "personal name of recipient", and "recipient's organization."(4)-12
SEE addressees; receivers

document
DIP - The expression of ideas in a form both objectified (physical) and syntactic (governed by rules of arrangement). A document's components are: a message, a medium, an intellectual codification of ideas (information configuration: text, image, etc.), and a logical arrangement of the internal elements (intellectual form).(16)-9
s: No standards equivalent
n: DFR document; object; object-class; object-content
   ODA document; document profile; generic document
   SGML document; document type declaration
b: DIP written document
   DFR Group; Group-Member
r: DFR document-type
   ODA document type; document class; document class description
   SGML document type; document class; core tag set

document
DFR - A structured amount of information that can be filed, retrieved, and interchanged consisting of a DFR-object-class of the DFR-object.(8)-5
• A DFR document consists of a DFR document content together with attributes which are associated with the content. A DFR document is contained in one document store. Consistency between copies is outside the scope of the standard and is the responsibility of the user.(8)-14
ODA - A structured amount of information intended for human perception, that can be interchanged as a unit between users and/or systems.(2)-5
SGML - A collection of information that is processed as a unit.
A document is classified as being of a particular document type.(23)-10
• A prologue and a document instance.(5)-29
SEE document

document architecture
ODA - Rules for defining the structure of documents, in terms of a set of components and content portions, and the representation of documents in terms of constituents and attributes.(2)-5
• The structural information of a document consisting of the set of one or more of the following structures: specific logical structure, specific layout structure, generic logical structure and/or generic layout structure.(2)-5
• The document architecture provides for the representation of documents in three forms: formatted form, processable form, and formatted processable form.(2)-13
• The key concept in the document architecture is that of structure. Document structure is the division and repeated subdivision of the content of a document into increasingly small parts. The parts are called objects. The structure has the form of a tree. The document architecture permits two structures to be applied to a document: a logical structure and a layout structure. Either or both structures may be applied to a given document.(2)-14
SEE content articulation
SEE ALSO document architecture attributes; document architecture level; document architecture class

document architecture attributes
ODA - The set of attributes that applies to a logical object or a layout object depends on the type of object: different sets of attributes are defined for basic logical objects, composite logical objects, document logical root, blocks, frames, pages, page sets and document layout root. Document architecture attributes are independent of the type of content of the objects to which they apply.
Examples are "object identifier" for all objects; "subordinates" for composite objects; layout directives such as "indivisibility", "offset", "separation", "position" (of blocks and frames) and "dimensions."(2)-16,17
SEE content articulation
SEE ALSO document architecture; document architecture level; document architecture class

document architecture class
DFR - An optional attribute, part of the DFR extension attribute set, that specifies the document architecture class used in the document. In the case of an ODA document, this can be taken from the document profile and is the equivalent of the ODA attribute, document-architecture-class.(8)-89
ODA - The rules for defining the structure and representation of documents in formatted form, processable form or formatted processable form.(2)-5
SEE content articulation
SEE ALSO document architecture; document architecture level; document architecture attributes

document architecture level
ODA - An identified subset of the features pertaining to a document architecture class.(2)-5
SEE content articulation
SEE ALSO document architecture; document architecture class; document architecture attributes

document body
ODA - The part of a document that may include a generic logical and layout structure, specific logical and layout structure, layout and presentation styles but excludes the document profile.(2)-5
SEE content articulation; intrinsic elements

document characteristics
ODA - Those attributes in the document profile that permit a recipient to determine which capabilities are required for processing or imaging the document.
They include: a specification of the form (formatted, processable or formatted processable); specification of the content architectures used; specification of character sets, fonts, styles, orientations and types of emphasis.(2)-18
SEE content articulation; intrinsic elements; form of transmission

document class
ODA - A set of logical object class descriptions, layout object class descriptions, generic content portion descriptions, styles and a document profile, that specifies a set of documents with common characteristics.(2)-6
• A specification of the set of properties that are common to a group of similar documents. The specification consists of a set of rules to determine the values of the attributes that specify the common properties. These rules can be used to control consistency among the documents making up the class, and to facilitate the creation of additional documents.(2)-17
SEE archival bond; content articulation; formularium

document class description
ODA - The specification of a document class.(2)-6
SEE archival bond; content articulation; formularium

document content
DFR - A body of information actually contained within the document, e.g. an office document, and not interpreted by DFR.(8)-5
• A body of information that has been provided to the DFR server for the purpose of storage. The server transfers the content of a DFR document to the user, and never interprets the content.(8)-14
SEE content
SEE ALSO office document

Document Date
SGML TEI - The date of the text as given on the title page.(7)-74
SEE moment of action; moment of documentation

Document-Date-and-Time
DFR - A non-DFR-specific attribute, part of the basic attribute set, that specifies the date and time that the DFR user associates with the DFR document or with a DFR reference.
In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, Document-Date-and-Time.(8)-87
ODA - An attribute of dates and times which specifies the date and, optionally, the time of day that the originator associates with the document.(5)-10
SEE moment of action; moment of documentation

document declaration
SGML - The part of the SGML prologue which specifies basic facts about the dialect of SGML being used, e.g. the character set; length of identifiers. It is usually held by the SGML processor in the form of compiled tables and is thus invisible to the user.(7)-29
SEE prologue

document description
ODA - A group of document management attributes comprised of title, subject, document reference, document type, abstract, and keywords.(5)-9
SEE subject; annotations of management; title

document instance
SGML - A marked up text.(7)-13
• The content of the document itself. It contains only text, markup and general entity references, and thus may not contain any new declarations.(7)-31
SEE content
SEE ALSO declarations

document interchange
DP - The capability of transmitting documents from one information system and receiving them in another system in a form in which they can be acted upon by the receiving system, often called "revisable" form.(1)-150
SEE form of transmission
SEE ALSO imaging

document interchange architecture
DP - The specification of rules and data streams necessary to interchange information in a consistent, predictable manner.(1)-150
SEE ALSO basic processing model; encoding; document declaration

document management domains
SEE domain

document profile
ODA - A set of attributes which specifies the characteristics of the document as a whole; an identified subset of the features pertaining to the document profile.(2)-16,17
• A set of attributes associated with a document as a whole.
It represents reference information about the document and may repeat information in the document content, for example, the title and name of the author.(2)-16,17
• In addition to reference information such as title, date and author's name, which facilitates storage and retrieval of the document, the document profile contains a summary of the document architecture features that are used in the document, in order that a recipient can easily determine which capabilities are required for processing or imaging the document. The attributes representing the latter type of information are called document characteristics. The document profile may be interchanged alone.(2)-18
SEE document
SEE ALSO document characteristics; header; annotations

Document Reference
ODA - An attribute of document description whose value is used to refer to the document from other documents.(5)-9
SGML TEI - An element of the front matter of a generic office document.(5)-189
SEE archival bond; annotations of management
SEE ALSO classification

Document Size
ODA - An attribute of content that represents the estimated size of the whole document, expressed as a number of 8-bit bytes. The size includes that of the document profile and the document body (if present).(5)-13
SEE annotations of management
SEE ALSO Resource-Used; Resource-Limit

Document Status
SGML TEI - An element of the front matter of a generic office document.(5)-190
SEE status of transmission

Document-Type
DFR - A non-DFR-specific, optional attribute, part of the extension attribute set, that specifies the type of document, e.g. memorandum, letter, report, resource. This attribute specifies only an informal name; it does not specify a relation to a particular document class description.
This attribute can be taken from the ODA document profile and is the equivalent of ODA document type.(8)-89
• A DFR-specific attribute, part of the basic attribute set, that contains an object identifier whose value defines the representation of the document content, for example, ODA or SGML, in the DFR access protocol. For a DFR reference, this attribute will only exist if the referent is a DFR document.(8)-81
ODA - An attribute of document description which specifies an informal name for a document, e.g. memorandum, letter, report, resource. It does not specify a relation to a particular document class description.(5)-9
SGML - A class of documents having similar characteristics (for example, journal, article, technical manual, or memo).(23)-10
SEE title; content configuration

document type declaration (DTD)
SGML - A standard for a header that identifies an agreed upon document type, such as report, article, book, journal, and includes additional information needed to process the specified document. DTD is used as part of Standard Generalized Markup Language (SGML) which defines tags to mark parts of documents. The tags which identify the parts are interpreted in terms of the DTD.(1)-150
• The document type declaration specifies the document type definition against which the document instance is to be validated. Like the SGML declaration it may be held in the form of compiled tables within the SGML processor, or associated with it in some way which is invisible to the user or requires only that the name of the document type be specified before the document is validated.(7)-30
• At its simplest the document type declaration consists simply of a base document type definition (possibly also one or more concurrent document type definitions) which is prefixed to the document instance.
More usually, the document type definition will be held in a separate file and invoked by a reference.(7)-30
• The motivating principle for the design of the DTD has been to allow but not to require structural constraints on documents. An encoded document is seen as comprising a header and a body. The header can contain SGML declarations and additional declarations required to conform to the Text Encoding Initiative. The body contains the encoded text itself.(7)-193
SEE document
SEE ALSO header; document type definition; element type declaration

document type definition (DTD)
SGML - A formal specification for the structure of a document.
SEE content articulation
SEE ALSO document type declaration

documentation requirements
EDMS - A functional aspect of the preservation requirements for electronic recordkeeping systems that requires systems to preserve a number of different aspects of the record:
content, structure and context: preservation of content plus any structure supported by the software in which the document was created, plus context whether assigned by creators (such as key terms or distribution lists) or by the system with reference to the business application in which the record participated.
documentation of processing: preservation of processing rules and schemas controlling views and permissions so that records as output products of specific processes can be understood with respect to the data known to the organization.
functionalities: for records with functionality, documentation of business application procedures as embodied in system scripts, rules, instructions, and routines, with maintenance whenever they change, so that records can be correctly associated with the status of the system at the time of record creation.
Functionality embodied in live links and their representations should be launchable.(12)-21

documentation
ARCH - The organization and processing of documents or data including location, identification, acquisition, analysis, storage, retrieval, presentation and circulation for the information of users.(1)-150
DP - An organized series of descriptive documents explaining the operating system and software necessary to use and maintain a file and the arrangement, content and coding of the data which it contains.(1)-150
SEE ALSO formularium

domain
DIP - Space defined by the boundaries of an electronic document management system within which records are created, modified, used, and destroyed. The space may be divided into several areas depending on the status of transmission and the access rights.
general (or institutional) space: that part of the system that is accessible to all members of the organization, managed according to established record making and record keeping rules by the competent staff, and that contains the central filing system of the organization, including the linkages with related records in other media. The primary characteristic of the general space is that no record that has crossed its boundaries can thereafter be manipulated.(22)-23
group space: that part of the system that is accessible to all the individuals who share the same competence, horizontally or vertically, temporarily or permanently. This is the space containing many draft versions of the same record, comments, notations, etc.(22)-23
individual space: that part of the system that is accessible to individual members of the organization. The individual space within the organization's records system must be distinguished from the personal private space of the individual, which should also have a different electronic address.
(22)-23
private space: that part of the records system in which records of a private nature are created and managed by the creator for their own ends, and which is accessible only to individuals as private persons. It should have a separate address.(22)-23 and (TG)
s: No standards or diplomatics equivalent
n: DFR Version; Status; Version-Root; Next-Version; Previous-Versions
   ODA Status; Revision History
   SGML Revision History; Status
b: No standards or diplomatics equivalent
r: DIP reliability
ERM - The intersection of a class of objects and a common set of rules that govern objects. Domains may also be defined by the action or purpose of the rules, e.g. document management domains, in which the object class is documents and the purpose is management; security classification domains, which are defined by the degree of confidentiality; information access domains, defined by the location, medium and representation of the information; and security management domains, defined as groupings of business processes with a common set of rules.(10)-6

draft
DIP - Temporary version of a record, prepared for purposes of correction.(22)-9
s: DFR Version; Status
   ODA Status
   SGML Revision History
n: DFR Version-Root; Next-Version; Previous-Versions
   ODA Revision History
b: DFR conceptual document
   DIP status of transmission
r: DIP original; copy

element
SGML - A component of the hierarchical structure defined by a document type definition; it is identified in a document instance by descriptive markup, usually a start-tag and end-tag.(23)-10
SGML TEI - A textual unit, viewed as a structural component. Different types of elements are given different names, but SGML provides no way of expressing the meaning of a particular type of element, other than its relationship to other element types.(7)-12
• For instance, the element Preface may or may not occur within a larger element, Front Matter, and be composed of sub-elements such as Title and Text. Elements may be both structural (related to the organization of the document) and non-structural (related to its intellectual content), such as the element Author. Elements are identified by tags. SGML is in no way concerned with the semantics of elements, as these are software-dependent. Element should not be confused with attribute.
SEE content articulation
SEE ALSO attribute

element type declaration
SGML - A markup declaration that contains the formal specification of the part of an element definition that deals with the content and markup minimization.(23)-10
SEE document type

element definition
SGML - Application of specific rules that apply SGML to the markup of elements of a particular type. An element type definition includes a formal specification, expressed in element and attribute definition list declarations, of the content, markup minimization and attributes allowed for a specific element type. An element type definition is normally part of a document type definition.(23)-11
SEE content articulation

encoding instructions
SGML - The part of the file header that contains the information required to interpret an SGML-conformant data file, such as normalization of source text, methods of resolving ambiguous punctuation, editorial comments, reference system, levels of encoding, and normalization of machine-readable text.(7)-55
SEE content articulation

entitling
DIP - That part of the protocol comprising the name, title, capacity and address of the physical or juridical person issuing the document, or of which the author of the document is an agent.
Today it corresponds to letterhead.(3)
s: No standards equivalent
n: ODA generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP protocol
   SGML front matter
r: DIP intrinsic elements

entity
SGML - Together with elements and attributes, part of the descriptive markup of a document, consisting of a named part of a marked-up document, irrespective of any structural considerations.(7)-27
• An entity is a free-floating unit, such as a photograph, that is not part of the structure of the text.
SEE content configuration
SEE ALSO element; attribute

eschatochol
DIP - That part of a document that contains the documentation context of the act, i.e. enunciation of the validation, indication of the responsibilities for documentation of the act, and the final formulae.(19)-11
s: No standards equivalent
n: DIP annotations of management; attestation; complimentary clause; corroboration; final clauses; qualification of signature; secretarial notes; special signs
   DFR document content; Author; Owners; Preparers; User-Specific-Codes; Access-List
   ODA Distribution List; generic logical structure; specific logical structure; content portion; Author; Owners; Preparers; User-Specific-Codes; Security Information
   SGML element; element definition; element declaration; Sensitivity; PRP receivers
b: DIP intrinsic elements
r: DIP extrinsic elements

execution phase
SEE phases of procedure

executive procedures
SEE procedures

extended processing model
DP - An elaboration of the basic processing model for open document interchange which accommodates the needs for additional functionalities, such as the ability to handle graphics, required of both office and publishing environments in the ODA/SGML standards.(9)-158, 159
SEE ALSO basic processing model

Expiry Date and Time
ODA - An attribute of dates and times that specifies the date and, optionally, the time of day
after which the document is considered to be invalid.(5)-10
SEE archival annotations

exposition
DIP - A part of the text in which the substance is expressed, i.e. the narration of the concrete and immediate circumstances generating the act and/or the document. In documents resulting from procedures, whether public or private, the exposition may include the memory of the various procedural phases or be entirely constituted by the mention of one or more of them.(18)-13
s: No standards equivalent
n: DFR document content
   ODA generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP text
   SGML body matter
r: DIP preamble; disposition

external document class
ODA - A document class referred to by the document profile of an interchanged document containing no generic structure.(2)-6
SEE content articulation

external references
ODA - A group of document management attributes comprising references to other documents, superseded documents, and local file references.(5)-13
SEE archival annotations

extrinsic elements
DIP - Those elements of documentary forms which constitute the material make-up of the document and its external appearance.(32)-6
s: DIP physical form
   No standards equivalent
n: DIP annotations; content articulation; content configuration; language; medium; script; seals; special signs
   DFR document content
   ERM layout; logical structure
   ODA document constituents: generic layout structure; specific layout structure; generic logical structure; specific logical structure; layout styles; presentation styles; external document class; resource document; document characteristics: content architecture classes; interchange format class; ODA version; non-basic document characteristics: profile character sets; comment character sets; alternative representation character sets; document constituent attributes: page dimensions; medium types; layout paths; protection; block alignments; fill orders; transparencies; colours; borders; page positions; types of coding; coding attributes; presentation features; non-basic structure characteristics: number of objects per page; additional document characteristics: unit scaling; fonts listing.
   SGML basic non-structural features; entity
b: DIP form
r: DIP intrinsic elements

External-Location
DFR - A mandatory attribute, part of the basic attribute set, that contains a user-specified description of the location of an object stored outside any DFR document store.(8)-85
SEE annotations of management

facsimile
DP - The exact image of a document transmitted electronically to another location.(1)-153 Paper fax machines digitize the image and are equipped to print out the document. Fax modems must send and receive an electronic disk file.
ARCH - A reproduction of a document or item that is similar in appearance to, but not necessarily the same size as, the original.(27)-470
SEE ALSO document interchange; imaging; mode of transmission

facts
DIP - Occurrences of human conduct and natural events that take place within a given juridical system. Facts whose consequences are not anticipated by the juridical system are considered juridically irrelevant; facts which are contemplated by the body of written or unwritten rules on which the juridical system is based, that is, the legal system, are qualified as juridically relevant.(11)-5
SEE acts
SEE ALSO dispositive document; narrative document; probative document; supporting document

false document
DIP - The concept of falsity refers to the presence of elements which do not correspond to reality. They refer to different elements of the document in a legal, diplomatic and historical sense. Legally and diplomatically, to say that a document is false is to say that the facts are untrue.(4)-18
SEE authenticity

fax
SEE facsimile

file header
SGML - "An electronic title page and preface" for SGML text-conformant files consisting of a bibliographic file description, encoding declarations, and revision history, identifying the source text or chief source of information, and providing the basis for citation. This is not to be confused with the SGML prologue or the core structural features.(7)-53
SEE annotations of management; content articulation; status of transmission

File Name
SGML TEI - An element of the front matter of an office document, part of Local File References.(7)-189
SEE annotations of management

Fill Orders
ODA - A document constituent attribute, part of the non-basic document characteristics.(5)-7
SEE content articulation
SEE ALSO filling

filling
ODA - The storage of a document according to some defined method in order to facilitate retrieval.
(2)-6

final clauses
DIP - Formulae, part of the intrinsic elements, found within or following the disposition, the object of which is to ensure the execution of the act, to avoid its violation, to guarantee its validity, to preserve the rights of third parties, to attest the execution of the required formalities, and to indicate the means employed to give the document probative value.(19)-14 They are divided into the following groups:
clauses of injunction: those expressing the obligation of all those concerned to conform to the will of the authority.
clauses of prohibition: those expressing the prohibition to violate the enactment or oppose it.
clauses of derogation: those expressing the obligation to respect the enactment, notwithstanding other orders or decisions contrary to it, opposition, appeals or previous dispositions.
clauses of exception: those expressing situations, conditions or persons which would constitute an exception to the enactment.
clauses of obligation: those expressing the obligation of the parties to respect the act, for themselves and for their successors or descendants.
clauses of renunciation: those expressing consent to give up a right or a claim.
clauses of warning: those expressing a threat of punishment should the enactment be violated. They comprise two categories: 1) spiritual sanctions, comprising threats of malediction or anathema; 2) penal sanctions, comprising the mention of specific penal consequences.
promissory clauses: those expressing the promise of a prize, usually of a spiritual nature, for those who respect the enactment.
clauses of corroboration: those enunciating the means used to validate the document and guarantee its authenticity. The words vary according to the time and place, but the clauses are usually formulaic and fixed.
The final clauses in general:
s: No standards equivalent
n: DFR document content
   ODA generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP text
   SGML body matter
r: DIP disposition; exposition; preamble

font
EDMS - A design for a set of characters. A font is the combination of typeface and other qualities, such as size, pitch, and spacing.(25)-189
ODA - A set of character images normally with a common design and size.(5)-6
SEE content articulation

Fonts List
ODA - An attribute of 'additional document characteristics' that specifies the character fonts used in the document.(5)-9
SEE content articulation

forgery
SEE copy - pseudo-original

form
DIP - The form of a written document is ... the whole of its characteristics which can be separated from the determination of particular subjects, persons, or places it is about. Any written document in the diplomatic sense contains information transmitted or described by means of rules of representation, which are themselves evidence of the intent to convey information: formulas, bureaucratic or literary style, specialized language, interview technique, and so on. These rules, which we call form, reflect political, legal, administrative, and economic structures, culture, habits, myths, and constitute an integral part of the written document because they formulate or condition the ideas or facts which we take to be the content of the documents.(4)-15
SEE extrinsic elements; intrinsic elements; persons; acts; procedures

form of transmission
DIP
- The form that the record has when it is made or received.(22)-12
s: No standards equivalent
n: DFR Document-Architecture-Class
   ODA Document Architecture Class; processes: formatted processable; processable; ODA version; interchange format class
b: DIP transmission; authentic record
   ODA document architecture
   ST source document; result document
r: status of transmission; mode of transmission; reliability

format
DP, ARCH - A predetermined arrangement of characters, fields, lines, punctuation, page numbers, etc.(1)-156
• The display conventions and syntactic rules used to record commonly used data items such as dates, currencies, etc.(10)-Glossary
ERM - The semantics which define the rules for recording information contained in a document. These include electronic encoding schemes (ASCII), image formats (TIFF), presentation of print conversions (Postscript), creation tool conventions (Word Perfect, Excel, etc.), and the differences between documents created by different versions of the same tool.(10)-Glossary
SEE ALSO representation; medium

formatted form
ODA - A form of representation of a document that allows the presentation of the document as intended by the originator and that does not support editing and (re)formatting.(2)-6
SEE form of transmission
SEE ALSO processable; formatted processable

formatted processable
ODA - A form of representation of the document that allows presentation of the document as intended by the originator and also supports editing and (re)formatting.(2)-6
SEE form of transmission
SEE ALSO processable; formatted form

formatting
ODA - The carrying out of operations to determine the layout of a document, i.e. the appearance of its content on a presentation medium.(2)-13
SEE content articulation

formularium
DIP - Models of documents or instructions for their compilation, e.g.
formulary, style guide, codebook.(AU)
SEE documentation

front matter
SGML TEI - The part of office documents that corresponds to a type of document profile. It consists of: a) production and storage information, e.g. local file name; b) document distribution by post or electronic mail, e.g. originator, addressee; c) action request and reply deadline; d) status and history, e.g. draft, confidential, internal.(7)-188
• The essential features of the front matter of office documents are contained in the core tag set comprising the document type, title, document date, author, abstract, table of contents, language, and revision history.(7)-89 Optional features of the front matter in office documents include document reference, additional user specific codes, references to other documents, in reply to, local file system reference, subject field, keywords, creation date, originator, preparer, authorizing person, primary recipient, secondary recipient, other user information, document status, sensitivity, and number of pages.(7)-89, 190
SEE persons; annotations; status of transmission

function
DIP - The whole of the activities aimed at one purpose. When such activities, or part of them, are assigned to a person, they constitute a competence.(22)-4

general inscription
DIP - A form of inscription in which the addressee is a larger, indeterminate entity, e.g. the citizens, the believers, or "To all to whom these presents shall come".(19)-12
SEE ALSO inscription; nominal inscription

general space
SEE document management domains

generic document
ODA - A structured amount of information intended for the interchange of generic structures, and optionally associated styles and content portions, for use in the processing of interchanged documents.(2)-19
• A generic document consisting of a document profile and generic structures may be used to assist in the processing of interchanged documents and may be interchanged itself.
(2)-19
SEE document
SEE ALSO generic document structure; generic identifier; generic layout structure; generic logical structure

generic document structure
ODA - The template that guides the creation of the document and that could be re-used for its amendment.(2)-13, 14
• The set of logical object classes and layout object classes associated with a document, and their relationships.(2)-18
SEE content articulation

generic identifier
SGML - The technical term assigned by the application user for the name of an element type.(7)-12
SEE content articulation
SEE ALSO element

generic layout structure
ODA - The set of all the potential specific layout structures that are applicable to a document class. The generic layout structure comprises a set of rules from which specific logical objects can be derived during the editing process (e.g. a template for a page layout).(2)-18
SEE content articulation
SEE ALSO result document type

generic logical structure
ODA - The set of all specific logical structures that are applicable to a document class. The generic logical structure comprises a set of rules from which specific logical objects can be derived during the editing process (e.g. a style sheet).(2)-18
SEE content articulation
SEE ALSO source document type

genuine document
DIP - The quality of a record that it is truly what it purports to be. ... Genuineness is conferred on a record on the basis of one or more of the following: mode, form and state of transmission, and manner of preservation and custody.(22)-12
SEE authentic record; custody; mode of transmission; form of transmission; preservation; status of transmission

Group
DFR - A collection of DFR-Objects in a DFR-Document-Store which are called DFR-Group-Members of the DFR-Group. A DFR-Group consists of DFR-Attributes which are associated with the DFR-Group as a whole and a DFR-Group-Content which is a sequence of UPIs of all Members of the DFR-Group.
(8)-18
SEE document; archival bond
SEE ALSO Group-Interrelationships; Group-Content; Group-Member; Root-Group; Proper-Group; Parent-Identification

Group-Interrelationships
DFR - A collection of DFR-objects in a DFR-document-store which are called DFR-group-members of the DFR-group. A DFR-group consists of DFR-attributes which are associated with the DFR-group as a whole and a DFR-group-object.(8)-2
• A DFR group can be either a root-group or a proper-group, the difference being that a root-group has no affiliation with any other group, while a proper-group is always a member of some other group (parent).
• Any DFR object in a server can be reached through the root group because the DFR server only ever services one group.(8)-18
• A DFR group can be viewed as the root of a DFR object tree consisting of all the descendants of that DFR group.(8)-12
SEE document; archival bond
SEE ALSO Group; Group-Content; Group-Member; Root-Group; Proper-Group

Group-Content
DFR - A sequence of unique personal identifiers (UPIs) identifying all DFR-members of the DFR-group.(8)-6
SEE document; archival bond
SEE ALSO Group; Group-Interrelationships; Group-Member; Root-Group; Proper-Group

Group-Member
DFR - A DFR-object which is identified in the DFR-content of its parent DFR-group.(8)-6
SEE document; archival bond
SEE ALSO Group; Group-Interrelationships; Group-Content; Root-Group; Proper-Group

group space
SEE document management domain

hardcopy
ARCH - A document or copy, usually on paper, as opposed to a microform or machine-readable record.(1)-157
DP - Printed copy of machine output in a visually readable form, e.g.
printed reports, listings, documents, summaries, etc.(1)-157
SEE medium
SEE ALSO soft copy

handling annotations
SEE annotations

header
ARCH - A word or series of words, and/or page numbers, that appear consistently at the top of the pages of a document, including copyright notices, company logos, and so on.(1)-158
DP - System-defined control information that precedes user data.(1)-158
• That portion of a message that contains control information for the message, such as one or more destination fields, the name of the originating station, an input sequence number, a character string indicating the type of message, and a priority level for the message.(1)-158
SEE document profile

historical authenticity
SEE authenticity

image
ARCH - A reproduction of the subject matter copied, usually by photography.(1)-159
DP - An exact logical duplicate of a data item stored on a different physical medium.(1)-159
• A visually interpreted representation as displayed, plotted or printed.(1)-159
• A representation in storage by means other than the storage code of the computer or device in which it is held. Examples include the bit patterns of an alien code, and bit pattern matrices of such things as the punching pattern of a punched card or of a character to be displayed.(1)-159
SEE content configuration

image
ODA - Representation of a document in a form perceptible to a human, for example, on paper or on a screen.(2)-20
SEE content configuration

imaging
ERM - The document imaging process is concerned with presenting an image of the document in a form perceptible to a human, for example, on paper or on a screen.(2)-20
ODA - Imaging is a locally defined process that depends on the presentation device used.
It is not part of the ODA standard for this reason.(2)-20

impartiality
ARCH - The characteristic of archival documents that they are created for limited, specific and immediate purposes of an administrative-legal nature, not in order to instruct posterity. Therefore, they constitute reliable evidence of facts and events they relate to. Naturally, they contain the biases and idiosyncrasies of their creators, but, because they are not meant for dissemination, they have the capacity to reveal what actually happened.(16)-12
SEE ALSO archives

imitative copy
SEE copy

inauthentic document
DIP - The concept of inauthenticity refers to the absence of the requisites which provide authenticity, i.e. legal or diplomatic but not historical.
SEE authenticity

individual space
SEE document management domain

initiative phase
SEE phases of procedure

inquiry phase
SEE phases of procedure

In Reply To
SGML TEI - An element of the front matter of generic office documents that is not part of the core tag set.(7)-189
SEE addressee

inscription
DIP - Documents in epistolary form usually present in their protocol the name, title and address of the addressee of the document and/or the action. It may be a nominal inscription or a general inscription.(19)-12
s: No standards equivalent
n: DFR document content
   ODA generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP protocol
   SGML front matter
r: DIP entitling; dates; invocation; superscription; salutation; subject

inserts also insets
SEE copy

instrumental procedures
SEE procedures

intellectual control
ARCH - The acquisition and creation of documentation required to access the informational content of records. Contrasted with administrative control.(1)-162

interchange format class
ODA - The form of interchange suitable to a specific application.
(2)-7
SEE annotations of management
SEE ALSO application; document interchange; processing

interrelationship
ARCH - The characteristic of archival documents that they are related among themselves by activities in which they participated and by the procedures and processes from which they have resulted.(16)-12
SEE ALSO archival bond

intellectual form
DIP - The characteristics of the internal composition of the record.(22)-2
SEE annotations; content articulation; content configuration
SEE ALSO extrinsic elements; intrinsic elements

intrinsic elements
DIP - Elements of intellectual form which are considered to be the integral components of its intellectual articulation: the mode of expression of the document's content, or the parts determining the tenor of the whole.(19)-6
SEE eschatochol; protocol; text
SEE ALSO extrinsic elements; intellectual form; physical form

juridical act
DIP - When a juridical system takes into consideration in its body of rules not only the effects of human conduct but also the will determining it, we call that conduct a juridical act.(11)-6
SEE acts; juridical system

juridical fact
DIP - An event, whether intentionally or unintentionally produced, whose results are taken into consideration by the juridical system in which it takes place.(11)-5
SEE acts; juridical system

juridical person
DIP - An entity having the capacity or the potential to act legally and constituted either by a collection or succession of physical persons or a collection of properties.(17)-5
SEE juridical system; persons

juridical relevance
DIP - The degree to which a document participates in a juridical act. Those that are directly involved, without which the act cannot exist (ad substantiam) or be proved to have taken place (probative), are called juridically relevant; those that are ancillary to the act, but still contribute to the act (supporting), are also juridically relevant.
Those documents that have no bearing on the act (narrative) are called irrelevant.(TG)
SEE acts; dispositive document; fact; narrative document; probative document; supporting document

juridical system
DIP - A collectivity governed by rules which may be implicitly understood (e.g. beliefs, or customs), or explicit (e.g. codes of law). The system of rules is called a legal system.(11)-5
SEE acts; facts

language
DIP - The style, wording and composition used in compiling the document.(19)-8
s: DFR Languages
   ODA Languages
   SGML Language; basic non-structural features
n: Not required
b: DIP extrinsic elements
r: DIP script

Languages
DFR - A non-DFR-specific attribute, part of the extension attribute set, that specifies the primary language(s) in which the content of the document is written. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, languages.(8)-92
ODA - An attribute of content that specifies the primary language(s) in which the content of the document is written.(5)-14
SEE language; script

Keywords
ARCH - A word or group of words taken from the title or text of a document characterizing its content and facilitating its retrieval.(14)-19
SGML TEI - A tag that is part of the front matter of a generic office document.(7)-189
SEE annotations of management

layout
ODA - A process whereby a document is organized into pages and all the physical constituents thereof (e.g. running heads, borders).(2)-18
SEE content articulation

layout styles
ODA - A constituent of the document, referred to from a logical component, that guides the creation of a specific logical structure.(2)-8
SEE content articulation

links
DIP - PRP - An extrinsic element providing a physical connection between the parts of a document.
DP - In database management systems, a link is a pointer to another record.
One or more records can be connected by inserting links.(25)-271
• In spreadsheet programs, linking refers to the ability of a worksheet to take its data for particular cells from another worksheet.(25)-270
• In many operating systems (UNIX, for example), a link is a pointer to a file. Links make it possible to reference a file by several different names and to access a file without specifying a full path.(25)-271
s: No diplomatics equivalent
n: DFR User-References-to-Other-Objects; User-Reference
   ODA References to Other Documents
   SGML References to Other Documents
b: No standards or diplomatics equivalent
r: ERM compound document

Local File References
ODA - An attribute of external references that specifies where a copy of the document may be found. It consists of one or more entries, one for each location where a copy of the document may be found.(5)-13
SGML TEI - A group of elements characteristic of generic office documents consisting of file name, location, access rights, user comments.(7)-189
SEE annotations of management

Local Filing Date and Time
ODA - An attribute of dates and times that specifies the date and, optionally, the time of day when the document was filed. When more than one entry occurs, the last entry indicates the most recent local filing date and time.(5)-10
SEE annotations of management

Location (or directory)
SGML TEI - Part of local file references in the standard set of tags that are not part of the core set for generic front office documents. It is intended to specify the directory where the file may be found.(TG)
SEE annotations of management

logical
DP - The way a data structure, hardware or software system, is perceived by an individual that may be different from its actual functioning or form.
(1)-164

logical object
ODA - An element of the specific logical structure of a document which may have a meaning that is significant to the application user, for example, chapter, section, paragraph.
• Logical objects and layout objects, or in other words, the intellectual arrangement of the text and the physical arrangement or format of a document, do not necessarily correspond. (2)-17
SEE content articulation

logical record
DP - A compilation of related data elements referring to one person, place, thing or event that are treated as a unit. (1)-165
SEE ALSO record

logical structure
ODA - The result of dividing and subdividing the content of a document into increasingly smaller parts, on the basis of the human-perceptible meaning of the content, for example, into chapters, sections, paragraphs. (2)-9
• All logical objects and associated content portions representing the logical hierarchy of a document. (2)-9
• The logical structure is independent of the layout structure in principle and is determined by the author and embedded in the document during the editing process. Attributes associated with the logical structure may control the formatting process or the layout of the document. (2)-17
SEE content articulation
SEE ALSO document architecture

management annotations
SEE annotations

medium
DIP - The material carrying the message. (19)-7
• The physical substance to which the message of a document is affixed. The function of a document is to fix the message in a medium so that it can be preserved.
s: ODA medium types
n: No standard equivalents
b: extrinsic elements
r: format

Medium Types
ODA - An attribute specifying non-basic attributes of medium type.
It consists of one or more groups of parameter values for "nominal page size" and/or "side of sheet", and details of one non-basic medium type used in the document. (4)-7
SEE medium
SEE ALSO non-basic

mere act
SEE acts

mode of transmission
DIP - Method by which a document is transmitted through space and time. (22)-12
s: No standards equivalent
n: No standards equivalent
b: DIP transmission
r: form of transmission; status of transmission; authenticity

moment of action
DIP - That point in time when the decision to act is taken and the iussio or command to prepare a document is given. (AU)
s: No standards equivalent
n: DFR Content-Create-Date-and-Time; Creation-Date-and-Time; Contents-Modified-Date-and-Time; Attributes-Modified-Date-and-Time
   ODA Creation Date and Time; Release Date and Time; Expiry Date and Time; Start Date and Time
   SGML Document Date
b: DIP status of transmission; phases of procedure - execution phase
r: DIP moment of documentation

moment of documentation
DIP - That point in time when the action is documented. In probative documents, this will always follow on the moment of action. In substantive documents, the moment of action and documentation are always the same. (AU)
s: No standards equivalent
n: DFR Content-Create-Date-and-Time; Creation-Date-and-Time; Contents-Modified-Date-and-Time; Attributes-Modified-Date-and-Time
   ODA Creation Date and Time; Release Date and Time; Expiry Date and Time; Start Date and Time
   SGML Document Date
b: DIP status of transmission; probative document; dispositive document
r: DIP moment of action

multiple acts
SEE acts

narrative documents
DIP - Written evidence of an activity that is juridically irrelevant.
(29)-9
s: No standards equivalent
n: DFR Document-Type
   ODA Document Type; document architecture class; User Specific Codes
   PRP business process
   SGML Document Type; document type definition
b: DIP juridical relevance
   SGML office documents
r: DIP dispositive document; probative document; supporting document

Next Versions
DFR - A mandatory attribute, part of the basic attribute set, that is a multi-valued attribute. It is defined only for DFR documents and is updated by the DFR server each time a new version is declared having this DFR document as its previous version (in a create or modify abstract operation), or when an existing version is discarded (by a delete or modify abstract operation). The DFR user is prohibited from modifying this attribute explicitly. When the value of this attribute is read by the DFR user, only those documents to which the user has at least read access rights are included in the result of the DFR abstract operation. (8)-84
SEE draft
SEE ALSO Version; Version-Root; Previous-Versions; Version-Management; Superseded Documents

nominal inscription
A form of general inscription which refers to one or more specific persons by name. (19)-12
SEE inscription

non-basic
ODA - A qualifier for attribute values, control function parameter values and other capabilities that are only allowed in document interchange in the context of a given document application profile (DAP) if their use is declared in the document profile. (2)-9

non-basic document characteristics
ODA - A set of attributes that must be declared in the document profile to be exchanged. They comprise profile character sets, comment character sets, and alternative representation character sets.
SEE ALSO document characteristics; non-basic

non-basic structure characteristics
ODA - A set of attributes regarding the structure of the document that must be declared in the document profile to be exchanged.
It comprises only one attribute, number of objects per page.
SEE ALSO document structure; non-basic

notification
DIP - The publication of the purport of a document, whose purpose is to express that the act consigned to the document is communicated to all who may be affected by it as well as those who are directly concerned. It usually follows the preamble in a dispositive document and is recognized by such formulas as "Be it known" or "notum sit". (19)-13
s: No standards equivalent
n: DFR document content
   ODA generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP protocol
   SGML front matter
r: DIP entitling; dates; invocation; superscription; salutation; subject

non-processable document
DP - A format for document representation, such as a bit-map pattern of a page image or a page image defined by a proprietary page definition language, that prevents the document from being edited or manipulated in another computer system unless that system operates the same software. (1)-169
SEE form of transmission

Number-of-Group-Members
DFR - A DFR-specific attribute, part of the basic attribute set, that specifies the number of members in a DFR group. (8)-83
SEE archival bond; compound document; document
SEE ALSO group

Number of Objects per Page
ODA - An attribute of non-basic structural characteristics that specifies the number of specific layout objects per page used in the document. This attribute is only specified if the number of objects per page exceeds the value specified by the document application profile. (5)-8
SEE content articulation

Number of Pages
DFR - A non-DFR specific attribute, part of the extension attribute set, that specifies the number of pages in the specific layout structure (if any) of the document.
In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, number of pages. (8)-92
ODA - An attribute of content that specifies the number of pages in a specific layout structure (if any) of the document. (5)-14
SEE content articulation

object
DM - A data element that includes both data and the methods or processes that act on that data. (26)-G-10
DFR - One of a set of information entities managed by a DFR-server. DFR-objects defined are DFR-documents, DFR-groups, DFR-references, and DFR-search-result lists. (8)-6
• A DFR object consists of attributes and content and is introduced into a document store by creating a DFR entry. (8)-13
• A DFR object is immediately contained by a DFR group known as the parent. There can be only one parent per object. Both the parent and the object itself are each members of the DFR group. (8)-13
• There can be only one object per DFR reference, but an object can have several different references. (8)-12
ODA - An element of a generic structure from which objects with common characteristics may be derived. (2)-9
SEE document; content articulation

object-class
DFR - A DFR-specific attribute, part of the basic attribute set, that indicates the class of a DFR object and is associated with every DFR object. (8)-81
• A DFR-attribute indicating the class of a DFR-object, i.e. whether it is a DFR-document, group, reference or search-result list. (8)-6
ODA - Groups of similar logical objects (e.g. chapter or section hierarchy), layout objects (e.g. size or style), or content objects (e.g. page headers or footers). Object classes may include groups of entire documents such as memoranda or reports, in which case they may be called document classes. (2)-17
• An element of a generic structure from which objects with common characteristics may be derived. (2)-9
SEE document; content articulation
SEE ALSO document class

object class description
ODA - A set of attributes that specify the properties of an object, including its relationships, if any, with other components. (2)-9
SEE document; content articulation
SEE ALSO object

object content
DFR - The actual information stored with the DFR object. The nature of the content depends on the object class. For a document group, content consists of a string of unique permanent identifiers (UPIs) for all its members. For a DFR document, the content is a body of information, e.g. an office document. The DFR content of a DFR Reference is a pointer to some other DFR object (Group, Document or Search Result List) called a referent. (8)-13
SEE content; document

object identifier
DFR - The direct reference component of the DFR document content which is equivalent to the DFR Document-type attribute. (8)-14

object tree
DFR - The DFR-object tree is the tree formed by a DFR-group and its descendants. (8)-6
SEE descendants

object type
ODA - A property of every component that specifies which attributes are permitted in the description to which it applies and indicates the role of the component in the document architecture. (2)-9
SEE document architecture

ODIF - Office Document Interchange Format
ODIF (ISO 8613-5) is a data stream defined in terms of a set of data structures, called "interchange data elements", which represent the constituents (document profile, object descriptions, object class descriptions, presentation styles, layout styles and content portion descriptions) of a document. ODIF uses the Office Document Language (ODL) to represent and process documents. ODL uses SGML names and markup conventions for representing the constituents and attributes of a document. (2)-21

office documents
SGML TEI - Various classes of documents (e.g. reports, correspondence, memoranda) sharing common features.
Office documents are divided into front matter, text and back matter. The essential features are contained in the front matter and are defined in a core tag set which corresponds to a header. Optional tags and crystals define other features. (7)-188,189
SEE ALSO front matter; text; backmatter; crystals; document type; document profile

official record
ARCH - An original record, or an authentic copy, whose written form is required administratively and/or legally (but not necessarily beyond the time necessary for its consequences to take place). (16)-10
• A record in law, having the legally recognized and judicially enforceable quality establishing some fact. (1)-170
SEE dispositive document; authentic copy

organizational procedure
SEE procedures

Organizations
DFR - A non-DFR specific attribute, part of the extension attribute set, that identifies the originating organization(s) associated with the document. In the case of an ODA document, the value of this attribute can be taken by the DFR user from the ODA document profile, where it is the equivalent of the ODA attribute, organizations. (8)-90
ODA - An attribute of originators that identifies the originating organization(s) associated with the document. (5)-11
SGML TEI - A crystal identifying organizations as part of office documents. It includes tags for the name of the organization, division or department, and the address. (7)-191
SEE persons; juridical persons
SEE ALSO Originators

original
Diplomatics examines the concept of originality and points out the common denominators of all originals, independently of time and place of creation. The first element of originality is that indicated by the English legal definition, which derives from its etymology: the Latin word originalis means primitive or first in order. The second necessary element is perfection.
To be original, a document must be perfect, a term which both legally and diplomatically means complete, finished, without defect and enforceable. A perfect document is a document that is able to produce the consequences wanted by its author, and perfection is conferred on a document by its form. (4)-19
s: No standard equivalent
n: DFR Status; version; Next-Version; Revision-Date-and-Time
   ODA Status; Start Date and Time; Security Information; Revision History; document architecture
   SGML Document Status; document type declaration
b: DIP status of transmission
r: DIP form of transmission; mode of transmission

Originators
ODA - A group of document management attributes consisting of organizations, preparers, owners, and authors. (5)-11,12
SGML TEI - A tag that is part of the non-core tag set for the front matter of generic office documents. (7)-189
SEE persons; juridical persons

Other-Titles
DFR - A non-specific DFR attribute that contains alternative titles for a DFR-object. In DFR, this attribute is taken from the ODA document profile. (8)-89
SEE title

other user information
ODA - A group of document management attributes comprising copyright, status, user-specific codes, distribution list, and additional information. (5)-12
SEE annotations of management

Owner
DFR - A security-subject with owner access rights to a specific DFR-object. (8)-7
• A non-specific DFR attribute, part of the extension attribute set, that identifies the name(s) of the person(s) and/or organization(s) responsible for the content of the document. In the case of an ODA document, the value of this attribute can be taken by the DFR user from the ODA document profile, where it is the equivalent of the attribute, owner. (8)-90
• The ability to modify the Access-List attribute and apply a committed reservation. Includes read-modify-delete access.
A DFR-user creating a DFR object by a create or copy abstract operation is automatically included in the DFR Access List attribute as the owner. (8)-24
ODA - An attribute of originators that identifies the name(s) of the person(s) and/or organization(s) responsible for the content of the document. This attribute consists of one or more entries, each with two optional parameters: personal name of owner and owner's organization. (5)-12
SEE persons; juridical persons
SEE ALSO Originators

Page
ODA - A layout component that corresponds to a rectangular area used for presenting the content of the document. (2)-10
SEE content articulation
SEE ALSO non-basic; Page Dimensions; Page Positions

Page Dimensions
ODA - An attribute that specifies the non-basic values of the attribute "dimensions" of layout objects of type "page" used in the document. The value consists of one or more pairs of page dimensions. (4)-6
SEE content articulation
SEE ALSO non-basic; Page; Page Positions

Page Positions
ODA - An attribute that specifies the non-basic values of the attribute "page position" used in the document. (4)-8
SEE content articulation
SEE ALSO non-basic; Page; Page Dimensions

parent
DFR - Each DFR-object, except the DFR-root-group, is a DFR-group member of a DFR-group, which is termed its parent. (8)-7
SEE ALSO Parent-identification

Parent-identification
DFR - A mandatory attribute, part of the basic attribute set, that identifies the DFR group of which the object is a member. Its value is equal to the unique permanent identifier of the DFR group that is the object's parent. (8)-82
SEE document; archival bond
SEE ALSO Group

Personal Name
SGML TEI - A crystal that contains a set of tags intended to encode the personal names in generic office documents. It consists of tags for title, personal name (forename, first name, and Christian name), family name (surname and last name) and a generational qualifier or other suffix (Jr., Sr.
etc.). (7)-191
SEE persons

persons
DIP - Entities who are the subject of rights and duties and as such are recognized by the juridical system as capable of, or having the potential for, acting legally. Persons may be a collection or a succession or a private individual. (17)-5
s: No standards equivalent
n: DFR Authors; Organizations; Owners; Preparers; Access List
   DIP author of document; author of act; addressee of act; addressee of document; writer; witness; countersigner
   ODA Originators: Authors; Organizations; Owners; Preparers; Authorization; Distribution List; Access Rights
   SGML Author; Preparer; Originator; Authorizing Person; Primary Recipient; Secondary Recipient; crystal - Personal Name
b: DIP juridical persons
r: DIP acts; attestation

phases of procedure
Ideal or decontextualized series of formal steps common to every procedure by which the procedure is realized. These phases are, in order of their occurrence, the initiative, inquiry, consultation, deliberation, deliberation control, and execution.
initiation phase: Phase of procedure constituted by those acts, written and/or oral, which start the mechanism of the procedure. Examples of documents created in this phase are petitions, applications, claims, drafts, or bills. (18)-14
inquiry phase: Phase of procedure constituted by the collection of the elements necessary to evaluate the situation. Examples of documents created in this phase include surveys, estimates, curricula, technical reports, reference letters. (18)-14
consultation phase: A phase of procedure constituted by the collection of opinions and advice after all the relevant data have been assembled. Examples of documents created in this phase are agendas, minutes, memoranda, discussion papers. (18)-14
deliberation phase: A phase of procedure constituted by the final decision-making. Examples of documents created in this phase are appointment notices, contracts and laws.
(18)-14
deliberation control phase: A phase of procedure constituted by the control exercised by a physical or juridical person different from the author of the document embodying the transaction, on the substance of the deliberation and/or on its forms. Sometimes, some form of control is necessary to ensure the effectiveness of the deliberation and its enforceability. Examples of documents created in this phase are letters of transmission, memoranda, and definitive compilations of the documents embodying the transactions. (18)-14
execution phase: A phase of procedure constituted by all the actions which give formal character to the transaction (i.e. validation, communication, notification, publication). The documents created in this phase are the originals of those embodying the transactions. Examples are registrations, letters of transmission, letters to newspapers. (18)-15
Phases of procedure in general:
s: No standards equivalent
n: ODA Start Date and Time; Expiry Date and Time (for execution phase)
   SGML Authorized By (for approvals at execution and deliberation control)
b: DIP procedures
r: DIP acts

physical form
SEE extrinsic elements

preamble
That part of a document that expresses the ideal motivation of the action. In modern legal documents, the preamble contains a citation of the laws, regulations, decrees or opinions on which the act rests. Today, just as in the past, it is possible to notice that some types of documentary form have their own specific, and often stereotyped, preamble.
(19)-12,13
s: No standards equivalent
n: DFR document content
   ODA generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP text
   SGML front matter
r: DIP entitling; dates; invocation; superscription; salutation; subject

Preparers
DFR - A non-DFR specific attribute, part of the extension attribute set, that identifies the name(s) of the person(s) and/or organization(s) responsible for the physical preparation of the document. In the case of an ODA document, this attribute can be taken from the document profile, where it is the equivalent of the ODA attribute, preparers. (8)-90
ODA - An attribute of originators that identifies the name(s) of the person(s) and/or organization(s) responsible for the physical preparation of the document. The attribute consists of two or more entries, each with optional parameters. These are: a) personal name of preparer; b) preparer organization. (5)-11
SGML TEI - A tag identifying the preparer of a document as part of the non-core tag set for generic front office documents. (7)-189
SEE writer; persons
SEE ALSO originators

presentation
ODA - The operation of rendering a document in a form perceptible to a human being. Typical presentation media are paper and video screens. (2)-13
SEE content articulation; medium; script
SEE ALSO presentation features; presentation styles

presentation features
ODA - An attribute that consists of one or more sets of presentation features used in the document. Each set pertains to a single content type and consists of presentation features that are specified as non-basic by the document profile. Presentation features consist of presentation attribute values, control function parameter values, sets of content elements, and their parameter values.
The names of the sets of presentation features are: character presentation features, raster-graphics presentation features, and geometric-graphics presentation features. (4)-8
SEE content configuration
SEE ALSO graphics; basic; basic layout object; basic layout component

presentation styles
ODA - A constituent of the document, referred to from a basic logical or layout component, which guides the format and appearance of the document content. (2)-10
SEE content articulation
SEE ALSO basic; basic layout object; basic layout component

preservation
ARCH - The actions which enable the materials in archives to be retained for as long as they are needed, i.e. the basic functions of storing, protecting and maintaining records and archives in archival custody. (24)-476
SEE reliability
SEE ALSO read-modify-delete; formatted; processing

Previous-Versions
DFR - A DFR-specific attribute, part of the basic attribute set, that is multi-valued. It is defined only for DFR documents and is assigned by the DFR user when the document is declared a new version (in a create or modify abstract operation). It can then be modified by the DFR server provided that the document has not yet become a previous version for some other new version. After that, it cannot be modified. It is automatically updated if any specific previous version disappears (by means of a delete or modify abstract operation). When the value of this attribute is read by the DFR user, only those DFR documents to which the user has at least read access rights are included in the result of the abstract operation. (8)-84
SEE draft
SEE ALSO Version; Version-Root; Next-Version; Version-Management; Superseded Documents

Primary Recipient
SGML TEI - A tag, part of the non-basic core tag set for generic office documents. (7)-190
• This tag may correspond to the addressee of the document.
(TG)
SEE addressee

principle of provenance
ARCH - Also known as respect des fonds, it is "the principle of the arrangement of archival material that fonds of different provenance should not be intermingled." (21)-17
SEE ALSO archival bond; provenance

private document
DIP - A document is private if it is created by a private person or by their command or in their name, that is, by a person performing functions considered to be private by the juridical system in which the person acts. (17)-16
SEE document management domain
SEE ALSO public document

private space
SEE document management domain; private document
SEE ALSO public document

privilege
DP - An indication of the access rights of a user or user program to the data of a computer system. If given a numeric value, it may be termed an "access control level." (1)-173
SEE ALSO access

probative document
DIP - If the purpose of the written form of the document was rather to produce evidence of an act which came into existence and was complete before being manifested in writing, the document was called probative, e.g. certificates and receipts. (29)-9
s: No standards equivalent
n: DFR Document-Type
   ODA Document Type; document architecture class; User Specific Codes
   ERM business process
   SGML Document Type; document type definition
b: DIP juridical relevance
   SGML office documents
r: DIP dispositive document; narrative document; supporting document

procedure
DIP - A body of written and unwritten rules whereby a transaction is effectuated; it comprises the formal steps to be undertaken in carrying out a transaction. (3)
executive procedures: Those which allow for the regular transaction of affairs within limits and according to norms already established by a different authority. (18)-19
instrumental procedures: A type of procedure in which expressions of opinion are given or advice is sought.
(18)-19
organizational procedures: Those aimed at the establishment of organizational structure and internal procedures, and their creation, modification, preservation, or extinction. (18)-19
constitutive procedures: Those procedures which create, extinguish, modify or preserve the exercise of power. They comprise several types:
   procedures of authorization: those which consent to the exercise of powers already held by a physical or juridical person. They do not create powers but remove limits to their exercise.
   procedures of limitation: those which deprive physical or juridical persons of powers or faculties.
   procedures of concession: those which create new situations and new powers for the addressee. (18)-12
Procedures in general:
s: No standards equivalent
n: DFR User-Specific Codes
   ODA User Specific Codes
   SGML Action
   PRP business process
b: DIP acts
r: phases of procedure; process

procedure of authorization
SEE procedure

procedure of concession
SEE procedure

procedure of limitation
SEE procedure

process
DIP - A series of motions or activities in general, carried out to set oneself to work and go towards each formal step of the procedure. (22)-10
• Processes do not create reliable records because of their spontaneity and lack of rules. (22)-10
SEE ALSO phases of procedure
DP - An operating system concept that refers to the combination of a program being executed and bookkeeping information used by the operating system. Whenever a program is executed, the operating system creates a new process for it. The process is like an envelope for the program: it identifies the program with a process number and attaches other bookkeeping information to it. Multiprocessing systems can run several processes at the same time. There is usually a one-to-one match between a process and a program. Multitasking systems allow a single process to run one or more programs at the same time.
(25)-382,383
• Typical processing assignments given records are filing, reorganizing files, updating, printing and so forth.

processes
ODA - The editing, layout, and imaging of documents. (2)-20
SEE ALSO processable; formatted processable; formatted

processable
DP - A format for document representation that supports the capacity to communicate a document between two computing systems so that the transferred document can be edited by the recipient. (1)-173
ODA - A document that has been edited and is suitable for interchange for purposes of either further editing or formatting in layout. (2)-18
SEE narrative documents; supporting documents; phases of procedure

processing
ODA - The carrying out of operations on a document, including editing, reformatting, presentation, filing, and retrieval. (2)-10
SEE ALSO processable

Profile Character Sets
ODA - This attribute specifies the graphic character sets, other than the character set specified [for values of document profile attributes], used in those document profile attributes that consist of character strings. (4)-6
SEE content articulation

Proper-Group
DFR - Any DFR-group other than the DFR-root-group. (8)-6
SEE document; archival bond
SEE ALSO Group; Group-Interrelationships; Group-Content; Group-Member; Root-Group

protocol
ARCH - A formal document embodying the terms of a legal transaction. (1)-174
• A diplomatic document, especially the final text of a treaty or compact, signed by the negotiators and subject to subsequent ratification. (1)-174
DP - A formal set of conventions governing the orderly exchange of information between communicating devices by defining such things as connection establishment, security provision, data sequencing, error control, etc. Protocols achieve efficient line utilization by reducing the amount of information transferred by distinguishing between device control information and data.
(1)-174
DIP - That part of a document which "sets the scene" or contains the administrative context of the action, consisting of an indication of the persons involved, time and place, and subject, and initial formulae. (19)-11
Document in general.
s: No standards equivalent
n: DIP inscription; invocation; superscription; date; entitling; salutation; subject
   DFR document content; Author; Owners; Preparers; User-Specific-Codes; Access-List
   ODA Distribution List; generic logical structure; specific logical structure; content portion; Author; Owners; Preparers; User-Specific-Codes; Security Information
   SGML element; element definition; element declaration
b: DIP intrinsic elements
   SGML front matter
r: DIP extrinsic elements

provenance
ARCH - The organization or person creating a fonds. (21)-15
SEE archival bond
SEE ALSO principle of provenance

pseudo-original
SEE copy
SEE ALSO original

public document
DIP - A document is public if it is created by a public person, at their command or in their name, that is, if the will determining the creation of the document is public in nature. (17)-16
SEE document management domain

Purge Date and Time
DFR - An optional attribute, part of the extension attribute set, that specifies the date and, optionally, the time of day after which the DFR-document can be purged from the DFR document store. In the case of an ODA document, the value of this attribute can be taken by the DFR user from the ODA document profile, where it is the equivalent of the ODA attribute, purge date and time. (8)-90
ODA - An attribute of dates and times that specifies the date and, optionally, the time of day after which the document can be purged from wherever it is stored. (5)-10
SEE annotations of management; chronological date
SEE ALSO Dates and Times

qualification of signature
The mention of the title and capacity of the signer that accompanies the signatures of attestation.
(19)-15
s: No standards equivalent
n: DFR document content
   ODA generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP eschatocol
r: DIP entitling; dates; invocation; superscription; salutation; subject

query
DP - A request for information from a database. (25)-392
SEE ALSO transaction
DIP - A question, esp. expressing doubt or objection. (27)-879
SEE annotations of handling

read-modify-delete
DFR - Access to a DFR-object that permits the user to delete or move that object. This includes read-modify status. (8)-24
SEE access control

receivers
DIP - A proposed term indicating those persons who are copied on a distribution list as opposed to the addressees. The two groups must be distinguished. In traditional textual records, the receivers are usually listed at the end of the document.
SEE ALSO addressee
s: No standards equivalent; no diplomatic equivalent
n: SGML Secondary Recipient
b: DIP addressee of the document
   ODA Distribution List
r: DIP copy; secretarial notes

record
ARCH - Recorded information, regardless of form or medium, created, received and maintained by an agency, institution, organization or individual in pursuance of its legal obligations or in the transaction of business. (1)-176
DIP - A complete and effective archival document. Completeness and effectiveness are provided by form. A complete document is one that contains all the elements it is supposed to contain according to the administrative and legal system. An effective document is a document capable of achieving its purposes. (16)-9
• Recorded transactions communicated to other people in the course of business via a store of information available to them. (11)-12
• Not all documents are records. Only those that fulfill the necessary requirements of form can be considered records; their content is irrelevant for diplomatic purposes.
Documents are the genus, records the species.
• Records arise from administrative activities which manifest themselves in series of acts. These acts, and their documentation, are governed by written or unwritten rules of procedure which are revealed in the forms of the records.(11)-10
• The necessary components of records are medium, content, form, persons, acts.(22)-3
SEE archival document; completeness
SEE ALSO form; document; written document.
DP - A set of related data or words, treated as a unit.(1)-176

recordskeeping system
ARCH - An architecture of competent records creators, together with their equipment, and support mechanisms, governed by policies and procedures for the management of documents made and received in the course of business, designed to ensure the reliability, authenticity and completeness of archival documents (or records) in the course of their creation, maintenance, use, and disposition.
SEE ALSO records

Reference
DFR - A DFR object which acts as a link to another DFR object which is called the referent of the DFR-reference.(8)-6
• A DFR reference consists of a DFR content containing a pointer to the referenced DFR object (the referent) and, to DFR attributes.(8)-15
• A DFR referent allows an object to participate in more than one DFR Group without requiring distinct copies of the object to be created. The content consists of a pointer to a referenced object.(8)-15
• The attributes of the reference are associated only with the reference alone while those of the referent are associated only with it alone.(8)-16
SEE links; document
SEE ALSO compound document; reference content; referent

reference content
DFR - The information stored in a DFR-reference for the purposes of identifying the referent.
(8)-6
SEE ALSO Reference; links; compound document; document

References to Other Documents
ODA - An attribute of external references that specifies references to any other associated documents and consisting of one or more entries.(5)-13.
SGML TEI - A tag, part of the non-core tag set of generic office documents.
SEE annotations of management; archival bond; links
SEE ALSO compound documents

Referent
DFR - That DFR-Object to which a DFR-Reference refers.(8)-7
SEE ALSO Object; Reference; links; document

register
ARCH - A list of events, letters sent and received, actions taken, etc. usually in simple sequence, as by date or number, and then often serving as a finding aid to the records, such as a register of letters sent or a register of visitors.(1)-177
DIP - Among the various types of copies are registers in which documents are reported in extenso.(3)
DP - A storage device having a specified storage capacity such as a bit, a byte, or a computer word and usually intended for a special purpose.(1)-177
SEE ALSO copy

Release Date and Time
ODA - This attribute specifies the date, and optionally, the time of day after which the document can be released from any restrictions specified in the attribute, Security Classification.^)-10
SEE security

reliability
DIP - A record endowed with trustworthiness. Specifically, trustworthiness is conferred on a record by its degree of completeness and the degree of control on its creation procedure and/or its author's reliability.(22)-9
• Where electronic records are concerned, in addition to the elements of intellectual form, the profile of every record to be reliable must include date, time, author, addressee, subject.
(22)-22
• If received from outside, it must include date of receipt, time of receipt, date of further transmission, time of further transmission, author, addressee, classification code, and registry number (if applicable).(22)-22
• Control of access to document management domains is an important constituent of the reliability of electronic documents.(22)-23
SEE ALSO access; authenticity; completeness; document management domains; security
s: No standard equivalent
n: DFR Authors; Create-Date-and-Time; Created-By; Subject; Revision-Date-and-Time; Reserved-By; Reservation; Access-List; UPI; Document-Type; Document-Architecture-Class; Document Content
   ODA document architecture class; interchange format class; Subject; Document Type; Creation Date and Time; Revision History; Authors; Distribution List; Reference to Other Documents; Authorization; Access Rights; Security Classification; Local Filing Date and Time
   SGML document type declaration (DTD); document instance; Title; Creation Date; Originator; Author; Primary recipient; Document Status; annotations
b: No broader term
r: DIP authenticity; completeness

representation
ARCH - The intellectual form in which information is presented for consumption by humans.
EDMS - The form in which information is presented for consumption. These forms include image, text, voice, video, tables and graphics.(10)-Glossary
SEE ALSO format; medium

Reservation
DFR - A DFR-specific, mandatory attribute, part of the basic attribute set, that indicates whether a DFR object is reserved or not. This attribute is associated with each DFR object.(8)-87
SEE security

Reserved-by
DFR - A DFR-specific, mandatory attribute, part of the basic attribute set, that identifies the security subject on whose behalf the DFR user has reserved this DFR object. It is absent when the DFR object is not reserved and can be read only by a DFR user having at least extended read-access rights.
(8)-87
SEE security

Resource Document
ODA - A generic-document containing one or more object class descriptions referred to by one or more object class descriptions of another document.(2)-11
SEE Document

Resource-Limit
DFR - A mandatory attribute, part of the basic attribute set, that specifies the maximum resource to be used for a DFR object based upon accounting information. The resource limit includes the space required to store content (if a document), the DFR object tree (if a DFR group) and any associated attributes.(8)-83
SEE annotations of management; Resource-Limit

Resource-Used
DFR - A mandatory attribute, part of the basic attribute set, that contains information for accounting purposes based on resources used during some period of time, for example, the actual amount of storage used for the DFR object in the document store. The resource used includes both the space required to store content (the DFR object tree in the case of a DFR Group) and any associated attributes.(8)-83
SEE annotations of management; Resource-Limit

responsibility
DIP - The obligation to answer for an act.(17)-8
SEE ALSO competence

result document
N SEE source document

Revision-Date-and-Time
DFR - A non-specific DFR attribute, part of the extension attribute set, that specifies the date and optionally, the time of day on which a revision of the DFR object occurred. In the case of an ODA document, the value of this attribute can be taken by the DFR user from the ODA document profile where it is the equivalent of the attribute, revision date and time.(8)-90
SEE draft; Revision History; status of transmission

Revision History
ODA - An attribute of dates and times that specifies the history of the document, indicating when, where and by whom the document was created and revised. The value of this attribute consists of a sequence of groups of parameters. Each group forms an entry in the history.
The first group in the sequence provides information on the creation of the document. The last group in the sequence provides information on the current version of the document. Each group consists of the following optional parameters: a) revision date and time; b) version number; c) reviser(s); d) version reference; e) user comments.(5)-10-11.
SGML TEI - Part of the file header of an SGML conformant data file comprising a description of the processes and interpretations that took place during the transfer of the text from the source to the data file, and a description of any subsequent editorial or other modifications made to the data file.(7)-55
SEE draft

roboratio
DIP - The most solemn phase of procedure in which the document is validated.(18)-13

Root-group
DFR - The distinguished DFR-group within a DFR-document store having no ancestor and whose DFR-object tree encompasses all DFR objects in the DFR document store.(8)-6
SEE document; archival bond
SEE ALSO Group; Group-Interrelationships; Group-Content; Group-Member; Proper-Group; Object-Tree; Document-Store

rogatio
DIP - A phase of procedure in which the request is made to compile a document which has been presented orally by the parties to the notary. The rogatio corresponds to the iussio expressed by public authorities, even if it has the diplomatic configuration of a contract.(18)-12

script
DIP - An extrinsic element concerned with the way the content of the document is physically articulated or presented by such means as handwriting, fonts, page layout, use of paragraphs etc. "Computer software may be considered part of the extrinsic element "script" because it determines the layout and articulation of the discourse, and can provide information about provenance, procedures, processes, uses, modes of transmission, and last, but not least, authenticity."(19)-7
SEE content articulation; content configuration
SEE ALSO language; medium; seals; special signs; annotations; specific document structure; software

secretarial notes
DIP - The qualification of signature may be followed by the secretarial notes (e.g. initials of the typist, mention of enclosures etc.) but usually it constitutes the last intrinsic element of documentary form.(19)-15
s: No standards equivalent
n: DFR Preparers
   ODA generic logical structure; specific logical structure; content portion; Prepared By
   SGML element; element definition; element declaration; Preparer
b: DFR document content
   DIP eschatocol
r: DIP entitling; dates; invocation; superscription; salutation; subject

Secondary Recipient
SGML TEI - A tag, part of the non core tag set for generic office documents.(7)-190
• Use of secondary recipient could correspond to receivers. (TG)
SEE addressee; recipients
SEE ALSO Distribution List

security
ERM - Techniques for ensuring that data stored in a computer cannot be read or compromised. Most security measures involve data encryption and passwords.
Where mode of transmission is concerned, articulation of the circumstances and manner of transmitting records from one space to another either automatically or manually, and of receiving records from outside in any of the spaces.
s: DIP No diplomatic equivalent
   ODA Security Information
n: DFR Access-List; Reserved-By; User; Reservation
   ODA Access Rights; Authorization; Security Classification
   SGML Authorizing Person; Sensitivity
b: ARCH custody; preservation
r: DIP mode of transmission

security classification
ODA - "This attribute specifies the security classification assigned by the document owner(s) relating to such aspects as the visibility, reproduction, storage, audit, and destruction requirements."(4)-14
SEE annotations of management
SEE ALSO security

simple act
SEE acts

simple copy
SEE copy

soft copy
DP - Data temporarily displayed on a video screen, in contrast to hardcopy, which is printed output from a computer.(1)-180
SEE ALSO view

software
DP - "Computer instructions or data. Anything that can be stored electronically or displayed on paper is software. The storage devices and display devices are hardware."(25)-435,436
SEE script

Source Description
SGML TEI - A part of the bibliographic file description intended to provide a usable bibliographic reference to the copy text used in preparing a machine-readable text, not necessarily a detailed description with the level of detail found in a library catalogue.(7)-55
SEE annotations of management
SEE ALSO back matter

source document
DP - A document containing information entered into a computer.(1)-181
ODA/SGML - In an open document processing system, such as ODA or SGML, the document that is transmitted. The source document is defined by a pre-determined description in processable terms and is the equivalent of the result document.
(9)-149
• The attributes of the source and the result document are mapped to the automatic processor and must agree in their document profile.
SEE ALSO basic processing model; extended processing model; mapping; source document instance - SGML; specific logical structure

special signs
DIP - Extrinsic elements of documents that consist of the signs of the writer and subscribers and the signs of the chancery or records office. Examples of the signs of the writers include the symbols used by notaries as personal marks in the medieval period, corresponding to the modern notarial stamp, and crosses used by some subscribers in place of their name. Examples of the signs of the registry and chancery include the rota and bene used by the papal chancery, and office and archival stamps.(19)-8,9
s: No standards equivalent
n: DFR User-Specific-Codes; document content
   ODA User Specific Codes; generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP extrinsic elements
r: DIP attestation

specific document structure
ODA - The structure that the user may read.(2)-13,14
SEE script; content articulation

specific layout structure
ODA - A set of layout objects and associated content portions.(2)-11
SEE content articulation
SEE ALSO source document; generic logical structure; Number of Pages

specific logical structure
ODA - A set of logical objects and associated content portions.(2)-11
SEE content articulation
SEE ALSO source document; generic logical structure

Start Date and Time
ODA - An attribute of dates and times that specifies the date and, optionally, the time of day after which the document is considered to be valid.(5)-10
SEE phases of procedure - execution phase; dates; completeness

Status
DFR - A non-DFR specific attribute, part of the extension attribute set, that specifies the document status, e.g. working paper, draft proposal etc.
In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, status.(8)-91
ODA - An attribute of other user information that specifies the document status, i.e. whether it is a draft, working paper, or original.(5)-12
SEE status of transmission

status of tradition
SEE status of transmission.

status of transmission
The primitiveness, completeness, and effectiveness of a record when it is initially set aside after being made or received.(22)-13
SEE ALSO completeness; copy; draft; original; dates

subject
DIP - An intrinsic element of documents following the inscription consisting of a statement that signifies what the document is about. The subject has been stated in some court records since the last century, but has generally been introduced into records of governmental bureaucracies and, by extension, into business records during this century.(19)-12
s: ODA Subject
   DFR Subject
   SGML Subject Field
n: DFR document content
   ODA generic logical structure; specific logical structure; content portion
   SGML element; element declaration; element definition
b: DIP protocol
r: DIP intrinsic elements

Subject
ODA - An attribute of document description that contains information to indicate the subject of the document.(5)-9
DFR - A non-specific DFR attribute, part of the extension attribute set, that contains information to indicate the subject of a DFR-object. In the case of an ODA document, the value of this attribute can be taken from the ODA document profile.(8)-89
SGML TEI - A tag, part of the non core tag set for generic office documents.(7)-189
SEE subject

subscription
SEE attestation

Superseded Documents
ODA - "This attribute specifies reference(s) to document(s) superseded by the current document.
It consists of one or more entries."(4)-13
SEE draft
SEE ALSO Version-Root; Previous-Version; Version-Management; Next-Version

superscription
DIP - A typical element of the protocol used to be the superscription, that is, the mention of the name of the author of the document and/or the action. Today the superscription tends to take the form of an entitling: sometimes, however, it coexists with the entitling. It still appears by itself in all contractual documents where it includes the mention of the first party, in declarative documents (those beginning with the first person pronoun followed by the name of the subscriber) and in holographic documents, such as wills, e.g. "This is the last will and testament of..."(19)-12
s: No standards equivalent.
n: DFR Authors; Owners
   ODA Authors; Owners; generic logical structure; specific logical structure; content portion
   SGML Author; element; element declaration; element definition
b: DIP protocol
r: DIP intrinsic elements; entitling

supporting documents
DIP - Documents constituting written evidence of an activity which does not result in a juridical act but is itself juridically relevant. Examples would include working papers.(11)-19
s: No standards equivalent
n: DFR Document-Type
   ODA Document Type; document architecture class; User Specific Codes
   ERM business process
   SGML Document Type; document type definition
b: DIP juridical relevance
   SGML office documents
r: DIP dispositive document; narrative document; probative document

tag
SGML - Descriptive markup.(23)-19

text
DP - Text is words, sentences and paragraphs. The content of a word processing document is called text.
Contrast with data, which is a precisely defined unit of information, such as name and address.(1)-184
DIP - An intrinsic element that is the part of the document that contains the action, including the considerations and circumstances that gave origin to it, and the conditions related to its accomplishment.
s: no standards equivalent
n: DIP preamble; notification; exposition; disposition; final clauses
   DFR document content
   ODA generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP intrinsic elements
   SGML body matter
r: DIP extrinsic elements

text unit
ODA - A data structure representing a content portion description.(2)-12
SEE content articulation

title
The title of a document e.g. indenture, agreement, last will and testament, report.(3)
s: DFR Title; Version-Name
   ODA Title
   SGML Title
n: ODA Document Type; generic logical structure; specific logical structure; content portion
   SGML element; element definition; element declaration
b: DIP intrinsic elements
   SGML front matter
r: DIP extrinsic elements
   SGML bibliographic file description

Title
DFR - A mandatory attribute, part of the basic attribute set, that gives the name of the DFR object as specified by the DFR user.(8)-81
ODA - An attribute of document description that gives the name of a document as specified by the author.(4)-9
SGML TEI - The title of the work, part of the front matter.
(7)-74
SEE title
SEE ALSO front matter

topical date
The place as it appears in the document.(19)-11
s: No standards equivalent
n: DFR User-Specific-Codes; Pathname
   ODA Local File References(?); User Specific Codes; generic logical structure; specific logical structure; content portion
   SGML Address; element; element definition; element declaration
b: DIP protocol
r: DIP dates

transaction
ARCH - Information, communicated to other people in the course of business, via a store of information available to them.(1)-185
DIP - Juridical acts directed to the obtainment of effects recognized and guaranteed by the system.(4)-12.
• In a transaction, a person administers their own interests with other persons. Therefore, a transaction is an expression of autonomy of a physical or juridical person who self-disciplines their own conduct in a binding way.(11)-7
• Documents which are reliable and complete, that is, able to convey information, capable of being used in a transaction, and of reaching the purposes for which they were produced, are transactions.(11)-12
DP - Any business activity or request that is entered into a computer system. Orders, purchases, changes, additions and deletions are examples of transactions in an information system. Queries and other requests are also transactions to the computer, but are usually just acted upon and not recorded in the system. Transaction volume is a major factor in determining the size and speed of a computer system.(1)-184
SEE record
SEE ALSO transaction processing; query

transaction log
DP - A record of transactions performed.(1)-185
SEE ALSO transaction; transaction processing; record

transaction processing
DP - A type of computer processing in which the computer responds immediately to user requests. Each request is considered to be a transaction.
Automatic teller machines for banks are an example of transaction processing.(25)-472
SEE ALSO query; transaction log; transaction

Transparencies
ODA - This attribute specifies the non-basic values of the attribute "transparency" used in the document.(4)-7
SEE content articulation; script

Types of Coding
ODA - This attribute specifies the non-basic values of the attribute "types of coding" used in the document.(4)-8
SEE content articulation; script

uniqueness
ERM - A functional aspect of the capture requirements of electronic recordskeeping systems that requires the system to assign a unique identifier to all business records upon their creation or receipt consistent with organizationally established naming conventions and classification schemes.(12)-20
SEE ALSO accuracy; authenticity; completeness; evidential context; reliability

Unique-Permanent-Identifier
DFR - A DFR-specific attribute assigned to every DFR object by the DFR server to identify unambiguously a DFR-object within the DFR-document store.(8)-7
• A DFR-specific attribute, part of the basic attribute set, that is used by the DFR server to uniquely identify a given DFR document, DFR group, DFR reference, or DFR search-result list within the document store. Once assigned by the server, the UPI can never be changed or reassigned regardless of whether the object exists or is deleted.(8)-81

User
DFR - The consumer of services supplied by a DFR-server. At any time the user is acting as a security subject and takes on the privileges of that security subject.
(8)-7
SEE security

User Comments
SGML TEI - A tag that is part of the Local File References in the non-core tag set for generic office documents.(7)-189
SEE archival bond; links; compound documents
SEE ALSO User-Reference-to-Other-Objects; User-Reference; User-Specific-Codes

User-Reference
DFR - A non-DFR specific, mandatory attribute, part of the basic attribute set, that contains an identifier for a particular DFR object. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, document reference. The attributes User Reference and User Reference to Other Objects can be used to establish DFR user references between objects stored in a document store. That is, the value of the attribute User Reference is a DFR-user-specific identifier for a DFR object. This identifier can be stored in the attribute User References to Other Objects. User Reference is managed by the user and has a single value.(8)-85
SEE archival bond; links; compound documents
SEE ALSO User Comments; User-Reference; User-Specific-Codes

User-Reference-to-Other-Objects
DFR - "A non-DFR specific attribute that is part of the basic attribute set and contains references to other DFR objects. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, User-References-to-Other-Objects. The attributes User-Reference and User-Reference-to-Other-Objects can be used to establish DFR user references between objects stored in a document store. That is, the value of the attribute User-Reference is a DFR-user-specific identifier for a DFR object. If later a value of the attribute User-References-to-Other-Objects is used (for example in a Search abstract operation), the referent will be identified.
This attribute can contain one or many references to other object."(8)-85
ODA - SEE DFR
SEE business process; acts; annotations; archival bond; links; compound documents
SEE ALSO User Comments; User-Reference; User-Specific-Codes

User-Specific-Codes
DFR - A non-DFR specific attribute, part of the extension attribute set, that specifies additional user-specific code(s) for a DFR object, e.g. contract number, project number, budget. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, user-specific-codes.(8)-92
ODA - An attribute of other user information that specifies additional user-specific codes, e.g. contract number, project number, budget code.(5)-12
SEE business process; acts; annotations; archival bond
SEE ALSO User Comments; User-Reference; User-Reference-to-Other-Objects

Version
DFR - The DFR-document specified by the user as a derivation of one or more other DFR-documents by means of specific DFR-attributes.
• Each version of a DFR document is itself a document (an individual entry in document store). A set of all documents considered to be versions of the same document is a conceptual document.(8)-20
SEE status of transmission; draft
SEE ALSO conceptual document; Unique-Permanent-Identifier; Next-Versions; Version-Name; Previous-Versions; Version-Root; Superseded Documents

Version-Management
DFR - A set of DFR-specific attributes, Next Versions, Previous Versions, Unique Permanent Identifier, and Version Root, managed by the DFR server in a predefined way, in order to make it possible for the user to have a very flexible user-defined version structure, and to provide the user with DFR-server assistance while navigating through this structure or attempting modifications.
Version management may be qualified as user-defined and server-assisted.(8)-20
• A DFR-reference to a multi-version document always points to a particular version (that is, to one specific DFR document in the document store).(8)-21
• Versioning follows three patterns: linear ordering, with only one previous and one next version; tree model, where a conceptual document has several next versions; and directed graph model, where a version can be declared following more than one previous version.(8)-21
• It is always the DFR-user who declares some existing or newly-created document to be a version.
• Versions of DFR groups, references or search-result lists are not defined.(8)-21
SEE status of transmission; draft
SEE ALSO conceptual document; Unique-Permanent-Identifier; Next-Versions; Version-Name; Previous-Versions; Version-Root; Superseded Documents

Version-Name
DFR - A DFR-specific attribute, part of the basic attribute set, that is a free-form attribute intended for the DFR user's use and management. It is defined primarily for those DFR documents that are declared to be versions (in the sense of DFR version management), but it can also be used for any other DFR document. It can also appear in a DFR reference to a DFR document, normally as a copy of the corresponding attribute of the referent. In the case of an ODA document, this attribute may be taken from the document profile, where it is the equivalent of the ODA attribute, version name.(8)-83
• Version-name is only of marginal value in reinforcing the uniqueness of version values.
(8)-21
SEE status of transmission; draft
SEE ALSO conceptual document; Unique-Permanent-Identifier; Next-Versions; Previous-Versions; Version-Management; Version-Root; Superseded Documents

Version-Root
DFR - A DFR-specific attribute, part of the basic attribute set, that is defined and has the same value for all DFR documents which are declared versions of the same conceptual document, and, optionally, for DFR references to these documents. It is assigned for the first time by the DFR server when a DFR document is declared to be a version. The Unique Permanent Identifier (UPI) attribute of the version then becomes the value of the attribute, Version Root for both the old and the new version. The value of the DFR Version Root attribute is then systematically copied by the DFR server into the DFR Version Root attribute of each new version of the same conceptual document. The DFR Version Root attribute remains valid even when the "original version", bearing the UPI which is the value, has been deleted.(8)-84
• Version-root is the only attribute in common verified by the server.
SEE status of transmission; draft
SEE ALSO conceptual document; Unique-Permanent-Identifier; Next-Versions; Previous-Versions; Version-Management; Version-Name; Superseded Documents

view
DP - The data which a user with a given permission set is permitted to see in a database.(1)-186
• A capability to see, but not to add or change, data in a system.(1)-186
SEE content articulation

virtual file store
DP - A file that appears to be a single file but is actually two or more linked files.(1)-186,187

virtual record
ARCH - A set of instructions for the creation of a record.(4)
DIP - Pointers needed to create documents.(22)-15
• Instructions for creating documents.
DP - The characteristics of an entity as perceived by the user, regardless of how they have been physically represented in a database.
Thus an employee would have one virtual record, but may have numerous physical records linked together to accommodate repeating addresses, jobs held, benefits received, etc.(1)-186

writer
The person who is responsible for articulating the intellectual form of the document.(17)-7
s: No standards equivalent
n: DFR Created-By; Preparers; Authors
   ODA Preparers
   SGML Author; Preparers
b: DIP Persons
r: DIP Author

witness
DIP - One of the persons responsible for a document whose signature may serve to confer solemnity on a document, or to authenticate the signature of an author (either of the act or of the document, or both), or to validate the content of the document, or its compilation, or to affirm that an act for which both oral and written form is required took place.(17)-8
s: No standards equivalent.
n: No standards equivalent.
b: DIP persons
r: attestation; countersigner

written document
DIP - Evidence which is produced on a medium by means of a writing instrument or by an apparatus for fixing the data, images and/or voices. The attribute "written" is not used in diplomatics in its meaning of an act per se (e.g. drawn, scored, traced) but rather in the meaning that refers to the purpose and intellectual result of the action of writing. That is, a written document is the expression of ideas in a form that is objectified (i.e. removed from the writer) and syntactic (i.e. governed by rules of arrangement).(3)
SEE document; archival document

GLOSSARY SOURCES
(1) Advisory Committee for the Coordination of Information Systems (ACCIS). Management of electronic records: issues and guidelines. New York: United Nations, 1990. Glossary.
(2) Canadian Standards Association. CAN/CSA-Z243.221-90 (ISO 8613-1: 1989) Information Processing - Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 1: Introduction and General Principles. Rexdale, Ontario: Canadian Standards Association, 1990. 3.0 Definitions.
(3) Schaeffer, Roy. Diplomatic Definitions. Unpublished notes. December 1992. In possession of author.
(4) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part I). Archivaria 28 (Summer 1989), 7-27.
(5) Canadian Standards Association. CAN/CSA-Z243.224-90 (ISO 8613-4: 1989) Information Processing - Text and Office Systems - Office Document Architecture (ODA) and Interchange Format - Part 4: Document Profile. Rexdale, Ontario: Canadian Standards Association, 1990.
(6) International Organization for Standardization (ISO). Implementation of ISO/IEC 10027: 1990 : Information technology - Information Resource Dictionary System (IRDS) framework. Geneva: ISO/IEC, 1990.
(7) Association for Computers and the Humanities (ACH), Association for Computational Linguistics (ACL), Association for Literary and Linguistic Computing (ALLC). Guidelines for the Encoding and Interchange of Machine-Readable Texts. Draft. Version 1.1. C. M. Sperberg-McQueen and Lou Burnard, eds. Chicago and Oxford: Text Encoding Initiative, 1990.
(8) ISO/IEC JTC 1/SC 18 Text and Office Systems Secretariat: USA (ANSI). Revised Text of DIS 10166-1, Information Technology - Text and Office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures. New York: ANSI, 1991.
(9) Bormann, Ute, and Carsten Bormann. Standards for open document processing: current state and future developments, in Computer Networks and ISDN Systems 21. North Holland: Elsevier Science Publishers BV, 1991, 149-163.
(10) World Bank, Information Management Architecture (IMA). May 24 1993 internal document.
(11) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part II). Archivaria 29 (Winter 1989-90), 4-17.
(12) Bearman, David. Records Systems as the Locus of Provenance: Implications for Automation of Archival Control and Management of Electronic Records. Paper submitted to Ontario Association of Archivists, May 1993.
(13) Delphi Consulting Corp.
Handbook of Document Management Systems Evaluation and Design. Thomas K Kolopolous, ed. Boston, Mass: Delphi Consulting Group, 1991.
(14) Bellardo, Lewis J. and Lyn Lady Bellardo. A Glossary for Archivists, Manuscript Curators, and Records Managers. Chicago: Society of American Archivists, 1990.
(15) Law, Margaret Henderson. Guide to Information Resource Dictionary Systems Applications: General Concepts and Strategic Systems Planning. Washington, DC: US Department of Commerce, 1988.
(16) Duranti, Luciana. Managing Electronic Documents: Making Sense out of Chaos or "Records management is Dead! Long Live Records Management!". Presentation to World Bank, April 27 1993, Washington, DC.
(17) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part III). Archivaria 30 (Summer 1990), 4-20.
(18) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part IV). Archivaria 31 (Winter 1990-91), 10-25.
(19) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part V). Archivaria 32 (Summer 1991), 6-24.
(20) Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part VI). Archivaria 33 (Winter 1991-92), 6-24.
(21) School of Library, Archival and Information Studies, University of British Columbia. Select List of Archival Terminology. 1990.
(22) Duranti, Luciana and Terry Eastwood. The Preservation of the Integrity of Electronic Records. Unpublished draft of research. School of Library, Archival and Information Studies, UBC, 1995.
(23) ISO. International Standard ISO 8879 Information Processing: Text and Office Systems: Standard Generalized Markup Language (SGML). Geneva: International Organization for Standardization, 1986.
(24) Australian Association of Archivists. Keeping Archives. 2nd edition. Judith Mills, ed. Melbourne, Australia: Thorpe with Australian Society of Archivists Ltd., 1993.
(25) Margolis, Philip E. The Random House Personal Computer Dictionary. New York: Random House, 1991.
(26) O'Brien, James A.
Introduction to Information Systems in Business Management, sixth ed. Homewood, IL and Boston MA: Richard D. Irwin Inc., 1991.

CONCLUSION

The thesis has explored the application of diplomatics to the control of electronic records through the document profile, or its equivalent, in certain open document exchange standards. Central to this proposition is the idea that diplomatics offers a set of decontextualized metadata that can be applied to the document profile in the same manner that international document exchange standards can be used to establish requirements, without prescribing an actual implementation. In effect, this is to suggest that diplomatics has the character of a standard. After examining the standards, it now seems more accurate to say that diplomatics is a sort of conceptual metastandard - a standard upon which standards can be based. Just as there is a direct relationship between general and specific diplomatics, or between the theory of diplomatics and its application, so the same may be said of the relationship between the open document exchange standards discussed here, and the implementation, through specific applications, of their descriptive requirements. There is, however, a fundamental difference, in that open document exchange standards are not, in themselves, a general theory - they are expressions of a theory of document exchange, namely, that documents have a structure that permits them to be mapped to an electronic system in such a way that they can be exchanged with accuracy, completeness, and manipulability. In the same way, diplomatics offers a theory of the archival document, or record, that can be used to model the descriptive attributes used by open document exchange standards such as ODA or DFR, but diplomatics itself is not a standard in the sense of something designed to be accessible to, and implemented by, a community of electronic recordskeeping system designers.
Standards that are specifically designed to capture the archival document remain to be written, although there are attempts in progress such as the Metadata Encapsulated Object (MEO) proposed by David Bearman, and the work which Luciana Duranti and Terry Eastwood are now doing with the US Dept. of Defense.

What of the nature of the document profile itself? Central to the attempt to examine the relationship between diplomatics and electronic records management through the medium of the document profile is the proposition that the conceptual, idealized document posited by diplomatics can be realized in the document profile of open communication standards. The document profile then becomes a surrogate, conceptual document, or, in other words, a conceptual profile. The analysis of the standards indicates that diplomatics, in this respect, is filling a vacuum. Those attributes that exist seem to be there out of custom. They are there because the standards are designed to capture documents whose context and structure are taken for granted by the standards designers, just as one does not bother to explain what is assumed to be obvious. The truth seems to be that there is no rhyme or reason behind the selection of these attributes apart from broad assumptions about the nature of the office and publishing environments in which documents are created or received. That ODA and DFR do not do a bad job of capturing characteristics of archival documents is perhaps due to their specific design as document management standards, in contrast to SGML, rather than to any theoretical definition of the document. This is evident from the self-referential definitions of document used by each of these standards, definitions which are designed to serve the purpose of the standard, and are not conceived within the context of a broad and rigorous theoretical conception of the document, let alone the archival document or record.
The criticism can be extended to the document profile which, when compared against the characteristics of a document, turns out not to be defined so much in terms of a document as of a data object, a far broader concept. A profile is actually a means of describing and manipulating whatever object may be defined to its structure. In the absence of any rigorous theoretical construct of the document, the association of "document" with profile is therefore something of a misnomer. It would be truer to refer to an "object profile" and to define a family of profiles, one of which would be a document profile, and even a record profile. This is more than a mere taxonomic convenience. Just as the paper record, and its papyrus and parchment predecessors, came to displace the oral record as the predominant record form, perhaps we should see in the profile not simply a surrogate of the paper-defined conceptual document (an idea that the concept of the "virtual" document does not really escape) but an altogether new paradigm of record form, separating the model of the paper document from the idea of the record, which can be so many other things in the digital world. Thus, just as the medieval scribe scraped clean the parchment and ruled lines, so authors and writers will now define a profile.

The thesis has proven to be as much about finding a way for archivists to approach the territory of electronic recordskeeping system design as it is about the nature of open document exchange standards as a tool of records management. Fundamental to its purposes has been to place the record within the context of the document profile as an instrument of document management, which is why the thesis takes pains to explain the nature and construction of electronic recordskeeping systems and the problem of document structure in data processing.
This is to ask what the archivist really needs to know about electronic recordskeeping system design, or to put it another way, in the design process, what special knowledge or viewpoint can the archivist contribute? The thesis assumes that the archivist needs above all to understand the record, and that it is this grasp of its nature and characteristics which provides a yardstick against which to measure the usefulness of open document exchange standards. But, in fact, it proves to be more than that: by positing the existence of an "open" document profile, the thesis demonstrates that diplomatics offers not only a critique of the profile, but an actual design tool, something that archivists and system designers can share in the form of requirements.

But if the general theory and methodology of diplomatics is relevant to the design of electronic recordskeeping systems, how effective is diplomatics as a design tool? Underlying the concern of this thesis with diplomatics and open document exchange standards has been the interface between recordskeeping and the design of electronic recordskeeping systems. In effect, this is a problem of object design. There are two approaches to the design of an object. One is by trial and error; the other is to proceed from a theoretical model that may be applied through trial and error but nonetheless regards reality not as the be-all and end-all, but as only one of many possible manifestations. Archivists have tended to take an organic approach to theory - that what theory there may be is inherent in the archival object, as a fonds, or as a series, which is therefore permitted to define itself. For instance, the basic qualities of archives - naturalness, interrelationship, uniqueness, paternity - are all qualities that must be present in records before they can be considered an archives. In other words, these are found - as opposed to designed - qualities.
This highlights a profound difference with the design of electronic recordskeeping systems, because these are based upon a given set of requirements. There is no way an archive can be translated into a set of system requirements in the sense of found qualities. A designed archive is, by its very nature, a false archive. Electronic recordskeeping systems, by contrast, are based upon pre-defined requirements.

How does diplomatics change this picture? Diplomatics is a way of defining the record-object. It is able to do so because, unlike archival theory, it does not assume that the object defines itself. The archival document consists of constituents some or all of which may be required by the juridical systems in which it is generated in order to manifest completeness, reliability and authenticity. These requirements arise from a conceptual document or model of a document identified and tested over centuries of research and study. The point is that these requirements are not organic in that they permit the archival document to define itself - they are not "found" qualities, but form a set of established characteristics that are manifested in reality in the variety of records forms and through the recordskeeping system in which they arise. Assuming the validity of the conceptual document model, diplomatics thus provides an invaluable bridge between electronic recordskeeping design methodology and archival theory.

This is not to say that diplomatics, as it now exists, is perfectly suited to describe electronic records. Some concepts, such as the juridical system, are apparently too abstract in their definition to be captured in concrete terms in the document profile, even if the system itself nonetheless exists within the context of a juridical system. Others, particularly the persons, and the intrinsic and extrinsic elements, translate readily into attributes and constituents of electronic records.
The same may be said of diplomatic concepts of reliability, authenticity and completeness which underscore qualities that records must have if records are to have any value at all, regardless of how or where they are created, and that can be readily deduced by the presence or absence of identifiable attributes and recordskeeping processes. In some respects, particularly in regards to the logical structure of electronic records, diplomatic concepts such as content articulation have proven to be too generalized to handle all the elements, but this may be as much a criticism of the open document exchange standards in question (ODA in particular, which has been criticized on this score as being far too complex) as it is of diplomatic terminology. Then there are features of electronic systems, notably, access control, domain, and security, and of electronic documents, such as links, that cannot be anticipated in paper records, and others, such as the archival bond, which must be explicitly articulated in the profile where this might have arisen naturally amongst paper records as a result of their physical association in a file. The new elements should be added to the corpus of diplomatic elements of form and procedure.

On a conceptual level, diplomatics passes the test of the document profile rather easily. The main difficulty, in fact, would appear to lie in the lack of familiarity, particularly in North America, of archivists and system designers with diplomatics, rather than in diplomatics itself.

There is, however, another factor hidden in the heart of diplomatics, and that is the actual process of defining the record. Records, by their very nature, are unique, that is, they are peculiar to their context. In the process of the design of electronic recordskeeping systems, it is always necessary to define the document types as they are found in the particular recordskeeping systems in which they are created or received.
Thus, even though diplomatics posits an idealized conceptual record, it is not a formula to be blindly translated into the reality of the document profile. The archivists and system designers working with the conceptual profile cannot escape developing a thorough understanding of the uniqueness of the records they are endeavouring to map, and of the procedures and circumstances which determine their character and validity. In itself, this is further proof of the extent to which general diplomatics shares the decontextualized nature of open document exchange standards: the application of special diplomatics to the study of particular documents is the same problem as applying any standard, for instance SGML to the marking up of a particular type of document, or a particular implementation of DFR. There is a great deal of work to be done in designing the system, and it is all too easy to let go of fundamental concepts in an effort to accommodate the peculiarities of reality.

What of the standards themselves? Of the three ISO document exchange standards examined here, ODA and DFR reveal themselves to reflect the characteristics of the archival document most closely in the attributes and structure of the profile. Their attributes for authors and the various types of dates map readily to diplomatic conceptions. ODA is particularly strong in regards to the ability to capture the logical and layout structure of the document. But on the whole, there is much room for improvement. The document management attributes could be considerably expanded to capture diplomatic elements for persons (where they fail to identify an addressee or the various participants such as witnesses) and to identify acts, procedures and phases of procedure.
As it now stands, these elements can be inferred by adapting existing attributes, but if a true standard is to emerge that permits description of an archival document, there will have to be both more and explicitly defined attributes. While ODA and DFR do include certain elements, such as access control and security, that are specifically characteristic of electronic documents, other aspects, such as the archival bond, are not easily recognized, and there is nothing to capture the concept of domain.

SGML is another case altogether, at least as manifested in the Text Encoding Initiative. Since SGML is a mark-up language, it is not intended to define any specific type of document. That is the function of specific encoding initiatives. The concept of the Document Type Definition, which is here treated as a conceptual profile, is a structure readily adapted to the description of any number of different types of documents, including records, and could presumably be adapted to capture logical and layout features by means of extensions. Since SGML permits any and all types of elements to be defined, there is no theoretical reason why a complete set of diplomatic attributes could not be defined to a DTD. This very flexibility, however, is in itself problematic, because it means that agreement on records attributes must be wrung from amongst disparate implementations whose proliferation SGML is specifically designed to encourage. In other words, SGML leaves archivists and system designers pretty well where they started, with agreement on a basic language, but no agreement on what to say. The philosophy of text encoding itself is not suited to the capture of documents because every encoding is presumed to be an interpretation - which violates the very nature of the archival record.

Finally, the thesis has used the mapping technique of the thesaurus to demonstrate the relevance of diplomatic concepts to open document exchange standards.
In effect, the thesaurus is an attempt to turn the tables on system design by forcing data processing terminology to conform to a terminology of the archival document or record. The thesaurus demonstrates that this works quite well, not only as a means of establishing an authority file, but also as a means of revealing the limitations of a terminology where there are no equivalents at the synonym, narrow, broad, or related levels. It is to be hoped that such a thesaurus can be extended to electronic records management terminology as a whole, taking in the most widespread standards and including other archival terms than diplomatic concepts where these are relevant to system design.

BIBLIOGRAPHY

BOOKS, ARTICLES AND REPORTS

Advisory Committee for the Co-ordination of Information Systems (ACCIS). Management of electronic records: issues and guidelines. New York: United Nations, 1990.
Association for Computers and Humanities (ACH), Association for Computational Linguistics (ACL), and the Association for Literary and Linguistic Computing (ALLC), Guidelines for the Encoding and Interchange of Machine-Readable Texts, draft: version 1.1, C. M. Sperberg-McQueen and Lou Burnard eds., (Chicago & Oxford: Text Encoding Initiative, 1990) 289 pp.
Barry, Richard E. "Best Practices" for Establishing Good, Defendable Practices and Procedures for Digital Document Management. Draft report submitted to World Bank. Arlington, VA: Barry Associates, 1993.
Barry, Richard E. Electronic Document and Records Management Systems: Towards a Methodology for Requirements Definition. Draft report submitted to World Bank. Arlington, VA: Barry Associates, 1993.
Barry, Richard E. Document Filing and Retrieval: ISO Standard 10166. Assessment report prepared for World Bank. Arlington, VA: Barry Associates, March 2 1993. 6 p.
Barry, Richard E. Open Systems Standards: Assessing Product Availability.
Draft report circulated as part of World Bank presentation to Society for Worldwide Interbank Financial Telecommunication (SIBOS), Brussels, September 1992. Washington, D.C., September 1992.
Bearman, David. Issues Involved in Using SGML for Data Interchange. Archives and Informatics 8 No. 1 (Spring 1994), 74-79.
Bearman, David. Record Systems as the Locus of Provenance: Implications for Automation of Archival Control and Management of Electronic Records. Paper presented at Ontario Association of Archivists Conference on Archives and Automation, May 13, 1993, Toronto, Ontario. 28 pp.
Bellardo, Lewis J. and Lyn Lady Bellardo. A Glossary for Archivists, Manuscript Curators, and Records Managers. Chicago: Society of American Archivists, 1990.
Bormann, Ute, and Carsten Bormann. Standards for open document processing: current state and future developments, in Computer Networks and ISDN Systems. North Holland: Elsevier Science Publishers B.V., 1991, 149-163.
Braid, Andrew, From Babel to EDIL: the evolution of a standard for document delivery, Computer Networks and ISDN Systems, 27 (1994), 367-374.
Burnard, Lou, The Text Encoding Initiative: Towards an Extensible Standard for Encoding of Texts, in Electronic Information Resources and Historians: European Perspectives, Seamus Ross and Edward Higgs, eds. St. Katherinen: Max-Planck-Institut für Geschichte In Kommission bei Scripta Mercaturae Verlag, 1993, 105-118.
Cronk, Randall D., Unlocking Data's Content, Byte (Sept. 1993), 111-120.
Du Rea, Mary V., and J. Michael Pemberton, Electronic Mail and Electronic Data Interchange: Challenges to Records Management. Records Management Quarterly (Oct. 1994), 3-12.
Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part I). Archivaria 28 (Summer 1989), 7-27.
Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part II). Archivaria 29 (Winter 1989-90), 4-17.
Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part III).
Archivaria 30 (Summer 1990), 4-20.
Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part IV). Archivaria 31 (Winter 1990-91), 10-25.
Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part V). Archivaria 32 (Summer 1991), 6-24.
Duranti, Luciana. Diplomatics: New Uses for an Old Science (Part VI). Archivaria 33 (Winter 1991-92), 6-24.
Duranti, Luciana. Managing Electronic Documents: Making Sense Out of Chaos or "Records Management is Dead! Long Live Records Management!". Presentation to World Bank, April 27 1993, Washington, D.C.
Duranti, Luciana. Reliability and Authenticity: the Concepts and their Implications (unpublished paper, 1995).
Fanderl, H., K. Fischer, and J. Kamper, The Open Document Architecture: From standardization to the market. IBM Systems Journal 31, No. 4, 1992, 728-754.
Hajagos, Lani. Documents and SGML (the Standard Generalized Markup Language Standard for document processing), UNIX Review 11, No. 3 (Mar. 1993), 4 pp.
Hayes, Frank, SGML Comes of Age. UnixWorld (Nov. 1992), 99-100.
Jacobs, Paul S., ed. Text-Based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval. Hillsdale, N.J.: Lawrence Erlbaum Associates, 1992.
Kay, Russell, Objects in Use. Byte 19, No. 4 (April 1994), 99-104.
Law, Margaret Henderson. Guide to Information Resource Dictionary System Applications: General Concepts and Strategic Information Systems Planning. Washington, D.C.: US Department of Commerce, 1988.
Margolis, Philip E. The Random House Personal Computer Dictionary. New York: Random House, 1991.
Miller, Michael, The Next Software Revolution, PC Magazine 12, No. 6 (Mar 1993), 2 pp.
Moore, James W., David Emery and Roy Rada, Language-Independent Standards. Communications of the ACM, 37 No. 12 (Dec. 1994), 17-35.
Morell, Jonathan, Standards and the market acceptance of information technology: An exploration of relationships, Computer Standards and Interfaces Vol. 16 (1994), 321-329.
Kilov, Haim, Information Modelling: a path to document analysis, MRE-2F049, Bellcore, internal research paper, April 1994, 13 pp.
Library of Congress, Workshop on Electronic Texts - Proceedings. James Daly, ed. Washington, DC: Library of Congress, 1992.
Mullins, Craig S., The Great Debate. Byte (April 1994), 85-96.
Murray, Philip C., Documentation Goes Digital. Byte (Sept. 1993), 121-129.
Nordin, Brent, David T. Barnard, and Ian A. Macleod, A review of the Standard Generalized Markup Language (SGML), Computer Standards and Interfaces 15 (1993), 5-19.
O'Brien, James A. Introduction to Information Systems in Business Management, sixth ed. Homewood, IL and Boston MA: Richard D. Irwin Inc., 1991.
Phillips, John T. Organizing and Archiving Files and Records on Microcomputers. Prairie Village, K.S.: Association of Records Managers and Administrators, 1992.
Piersoll, Kurt, A Close-Up of OpenDoc. Byte 19, No. 3 (March 1994), 183-188.
Reinhardt, Andy, Managing the New Document. Byte 19, No. 7 (August 1994), 91-104.
Rooney, Paula, Versatile electronic data delivery fuels corporate interest in SGML. PC Week 10 No. 8, 2 pp.
Saffady, William. Managing Electronic Records. Prairie Village, K.S.: Association of Records Managers and Administrators, 1992.
School of Library, Archival and Information Studies, University of British Columbia. Select List of Archival Terminology. Unpublished glossary for Master of Archival Studies Program. 1990.
Sparck Jones, Karen. Assumptions and Issues in Text-based Retrieval, in Text-based Intelligent Systems: Current Research and Practice in Information Extraction and Retrieval. Paul S. Jacobs, ed. (Hillsdale, New Jersey: Lawrence Erlbaum Associates, 1992), 157-177.
Stein, Richard Marlon, Object Databases, Byte 19 No. 4 (April 1994), 74-84.
Taylor, Calvin J., Object-oriented concepts for distributed systems. Computer Standards and Interfaces 15 (1993), 167-170.
Thompson, Craig, A reference model for object data management.
Computer Standards and Interfaces 15 (1993), 121-147.
Vecchione, Anthony, How SGML Bridges Format Differences. Information Week (Mar. 29 1993), 22-23.
Walker, David M. The Oxford Companion to Law. Oxford: The Clarendon Press, 1980.
Watson, Bradley C., and Robert J. Davis. ODA and SGML: An Assessment of Co-existence Possibilities, Computer Standards and Interfaces 11 (1990/91), 169-176.
Wilmott, Sam, Distinguishing Intelligence from Formatting, <TAG> The SGML Newsletter. No. 20 (Dec. 1991), 6-10.

STANDARDS

ISO/IEC JTC 1/SC 18 Text and Office Systems Secretariat: USA (ANSI). Revised Text of DIS 10166-1, Information Technology - Text and Office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures. New York: American National Standards Institute (ANSI), 1991.
ISO/IEC JTC 1. Revised Text of DIS 10166-1, Information Technology - Text and office Systems - Document Filing and Retrieval (DFR) - Part 1: Abstract Service Definition and Procedures. Draft. New York: International Standards Organization, 1991.
ISO. International Standard ISO 8613: Information Processing - Text and office systems - Office Document Architecture (ODA) and interchange format - Part 1: General. Geneva, Switzerland: International Standards Organization, 1989.
ISO. International Standard ISO 8613: Information Processing - Text and office systems - Office Document Architecture (ODA) and interchange format - Part 2: Document Structures. Geneva, Switzerland: International Standards Organization, 1989.
ISO. International Standard ISO 8613: Information Processing - Text and office systems - Office Document Architecture (ODA) and interchange format - Part 4: Document Profile. Geneva, Switzerland: International Standards Organization, 1989.
ISO. International Standard ISO 8613: Information Processing - Text and office systems - Office Document Architecture (ODA) and interchange format - Part 5: Office Document Interchange Format (ODIF).
Geneva, Switzerland: International Standards Organization, 1989.
ISO. International Standard ISO 8879 Information Processing: Text and Office Systems: Standard Generalized Markup Language (SGML). Geneva: International Organization for Standardization, 1986.
ISO/IEC JTC1/SC21 Information Retrieval, Transfer and Management for OSI. Draft Recommendation X.903: Basic Reference Model of Open Distributed Processing - Part 3: Prescriptive Model. Reprinted in Computer Standards and Interfaces Vol. 15 (1993), 191-274.
ISO/IEC JTC 1/SC 21 Information Retrieval, Transfer and Management for OSI. Information Technology - Basic Reference Model for Open Distributed Processing - Part 2: Descriptive Model, Committee Draft ISO/IEC CD 10746-2.2. New York: ANSI, 1993, reprinted in Computer Standards and Interfaces Vol. 15 (1993), 171-190.

VENDOR PRESENTATIONS

Digital Equipment Corporation. The Open Software Foundation Distributed Computing Environment: An Introduction. Slide print presentation to the World Bank prepared by Terry Tvirdik. Littleton, M.A.: Digital Equipment Corporation, March 31 1993.
Digital Equipment Corporation. The World Bank Open Systems Forum. Slide print presentation to World Bank on January 7, 1993 in Alexandria, VA.

WORLD BANK INTERNAL DOCUMENTS

Information, Technology and Facilities Department. Open Systems at the World Bank. Presentation by Hywel Davies, Director, to Society for Worldwide Interbank Financial Telecommunications (SIBOS), Brussels, September 1992. Washington, D.C.: World Bank, August 27 1992. 17 p.
Information, Technology and Facilities Department. Information Management: Vision and Objectives based on User Needs. Internal report by Karl O. Lawrence. Washington, D.C.: World Bank, June 11 1993. 20 p.
Information, Technology and Facilities Department: Information Engineering. ITF Staff Paper No. 12: Information Management Architecture: FY 94. Harold Steyer, ed.
Internal report prepared by the World Bank. Washington, D.C.: World Bank, 1993.
Information, Technology and Facilities Department: Information Engineering. Document Management System: Requirements Analysis. Internal report. Washington, D.C.: World Bank, December 23, 1992.
Information, Technology and Facilities Department. Final Report of the Electronic Text/Image Review. Draft internal report prepared by Irene Travis et al. Washington, D.C.: World Bank, March 22 1993.
Information, Technology and Facilities Department (ITF). Excalibur Project: Analysis and Recommendations. Internal report. Washington, D.C.: World Bank, October 1992.
Information, Technology and Facilities Department (ITF). Document Management Technology Architecture. ITF Staff Paper prepared by Irene L. Travis. Washington, D.C.: World Bank, December 1989.
Information, Technology and Facilities Department (ITF). Towards an Enterprise Document Management System Strategy and an Institutional Document Management System for the World Bank. Draft internal report prepared by Clifford A. Lynch. Washington, D.C.: World Bank, September 25 1992.
Information, Technology and Facilities Department: Information Services Division. Seminar on Appraisal of Electronic Records: Report and Recommendations. Internal report by Irene Travis et al. Washington, D.C.: World Bank, October 21, 1993.
Information, Technology and Facilities Department (ITF). Developing Guidelines for Electronic Records: Report of a Project to Test the ACCIS TP/REM in Electronic Records Guidelines: A Manual for Policy Development and Implementation (ACCIS 89/018(b) 1989-07-17). Prepared by the Task Force on Electronic Records Management Information. Washington, D.C.: World Bank, September 1989.
