UBC Theses and Dissertations

Sharing and privacy using untrusted storage Ofir, Jacob 2000

Sharing and Privacy Using Untrusted Storage

by

Jacob Ofir
B.Sc., York University, 1998

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE STUDIES (Department of Computer Science)

We accept this thesis as conforming to the required standard.

The University of British Columbia
August 2000
© Jacob Ofir, 2000

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Computer Science
The University of British Columbia
Vancouver, Canada

Abstract

Broadband connections to the Internet are enabling a new set of applications and services. Of interest is the impact of this additional bandwidth on current file system models. These models are being challenged as the Internet is enabling global file access, cross-domain sharing, and the use of Internet-based storage services. Various network file systems [3, 15, 8] offer ubiquitous file access, research systems [9] have offered solutions to cross-domain sharing, and cryptographic file systems [2, 5] have addressed concerns regarding the trust of system administrators and data security. The Internet model requires that all these ideas be integrated into a single system.

This thesis describes a new file system called bFS that addresses the challenges of this new model by eliminating the assumption that servers (specifically, their administrators) are trusted.
Instead, agents are trusted to manage data, meta-data, and authentication with storage providers, and to enforce access control. This enables global access and cross-domain sharing using untrusted storage servers.

Contents

Abstract
Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1 Introduction
2 Design
  2.1 Architectural Overview
  2.2 Trust Model
    2.2.1 Storage Providers
    2.2.2 Agents
    2.2.3 Users
  2.3 Meta Data
  2.4 Certificates
    2.4.1 Revocation
  2.5 Agents
    2.5.1 Protocol
    2.5.2 Sharing
    2.5.3 An Example
    2.5.4 Multiple Agents
  2.6 Encryption
    2.6.1 Naming
  2.7 Summary
3 Implementation
  3.1 Overview
  3.2 Client
  3.3 Agent
  3.4 Performance Optimizations
  3.5 Summary
4 Performance
  4.1 Overview
  4.2 Test Environment
  4.3 Micro-Benchmarks
  4.4 Andrew Benchmark
  4.5 Summary
5 Related Work
  5.1 Cryptographic File Systems
  5.2 Cross-Domain Sharing
6 Conclusions
  6.1 Future Work
Bibliography

List of Tables

2.1 bFS Certificate
2.2 File access operations
2.3 Cryptography operations
2.4 Sharing operations
4.1 bFS micro-benchmarks in milliseconds
4.2 Andrew benchmark: bFS vs. NFS
4.3 Andrew benchmark using 32k read/write sizes: bFS vs. NFS

List of Figures

2.1 Architectural Overview
2.2 Sharing using public read-only
3.1 bFS agent overview

Acknowledgements

Too many people ensured that I completed this thesis. Their support ranged from encouragement to threats, from brainstorming to just laughs. You would not be reading this today without the contributions listed below, so blame them, not me! To all the clowns in the DSG, thanks for the fantastic environment.
In particular, Ross Carton, Douglas Santry, Alex Brodsky, Joon Suan Ong, Yvonne Coady, and Dima Brodsky have been great to bounce ideas and frustrations off of. There are many friends outside the DSG whose presence made my experience more enjoyable. Paul Kry, Joshua Richmond, and Derek DiFilippo have been, and continue to be, great friends. Thank you.

Michael Feeley and Norm Hutchinson signed their names on the front page. But they did more than just that. They let me roam the expanse of my thesis, but pushed me in the right direction when I strayed too far. They asked me hard questions and forced me to think in new ways. And they also made sure I got a good laugh once in a while.

My parents, brother, sister, and Emeline have also been amazing. They let me talk their ears off about stuff they did not really understand. And Emeline somehow managed to deal with me while I was thinking, coding, and writing. Wow! Lastly, the rest of the faculty, staff, and fellow students made the department feel like home. Oh, and this thesis would not have been possible without financial contributions from NSERC and the BC Advanced Systems Institute.

Jacob Ofir
The University of British Columbia
August 2000

To my Family.

Chapter 1

Introduction

The traditional approach to building secure file systems has been to assume that servers are trusted, but that clients need not be. This assumption of server trust has been critically important. Only a trusted server can implement the access control and data protection policies and mechanisms necessary to prevent unauthorized access to user data. This assumption of server trust has also been a natural one. File systems are typically restricted to a single administrative domain and thus server trust is provided by that domain's system administrator. This administrator controls the software that runs on the server as well as the users that can access it.
File access control is based on a user authentication system under the control of this administrator. Before a user can access a server's files, the administrator must have added that user to an authentication database used by the server. Furthermore, the transfer of file data over the network from server to client has typically not been viewed as a significant security hole, because administrators also controlled access to the network.

This basic approach has been followed by all traditional file systems, even global file systems such as AFS. In AFS [8], files from remote domains can be named, but access to those files requires that a user authenticate themselves with the domain that stores the files. For instance, if a user in the MIT Computer Science Department wishes to share with a colleague outside her department, she must first ask her system administrator to add that user to the local user database.

The Internet is changing this model of file system access in two ways. First, users are increasingly accessing files from a different security domain than the file server that stores those files. Second, users are increasingly interested in being able to share their files with other users outside of the server's security domain; for example, a user may want to share files with colleagues, friends, or family members. As a consequence, Internet-based storage services have begun to appear in the past year or so.

Internet-based storage services have several benefits. First, they facilitate global access because, unlike many secure file systems, they can be accessed from anywhere on the Internet. Second, they may provide a useful alternative to local disks for home computer users, thus freeing the typical user from fear that valuable data might be lost due to the loss of the local disk that stores it. Finally, automatic software maintenance and perceived unlimited storage capacity are also possible.
This new model of file system access challenges the traditional file system model because the network is not trusted, the storage provider might not be trusted, and sharing is likely to occur between users from different administrative domains. The first challenge is clearly unavoidable as the network is the Internet, but the latter two require further exploration.

In a model in which users buy storage, they may be unwilling to trust storage providers to prevent the accidental or intentional disclosure of their valuable data. The criteria for determining what is valuable will vary from user to user, but the underlying privacy theme cannot be challenged, as users must retain that right. This untrusted storage assumption is further reinforced when examining corporate use, as it is unlikely that corporate users would trust their data to Acme File Storage, Inc.

Internet-based storage could enable simplified sharing. Existing storage providers facilitate sharing by providing each user with a public read-only folder. Other users can access all files in a user's public folder simply by knowing the user name. These sharing semantics are insufficient and will eventually have to be replaced with semantics that allow a user to place restrictions on who can access their files. The traditional approach to placing such access restrictions is to use access control lists (ACLs). However, server-managed ACLs that support sharing between administrative domains lead to the following problem. Server-managed global ACLs require a server to authenticate all possible users. Maintaining such a database will prove tedious, if not impossible, and even if possible, would force that server's security policy onto users outside of its administrative domain. As an alternative, the server could delegate the authentication of foreign users to an outside server, in which case the server would have to trust the authenticating server.
The argument against this approach is that it requires a global authentication scheme and key management policy. Such an infrastructure would require a global name for all users in the world, and even if it were possible, the idea itself is scary due to its Orwellian nature.

This thesis describes a new file system, called bFS, that leaps into this new model by eliminating the requirement that storage providers be trusted. Instead, bFS agents are trusted to manage data, meta-data, and authentication with storage providers, and to enforce access control. There are two core arguments for this approach. The first argument is an end-to-end [13] argument for cryptographic security. All data sent between server and client must be encrypted to provide a secure communication channel. If data must also be stored encrypted on disk, as servers are not trusted, it is argued that clients should encrypt data, without the storage provider's knowledge, and send this ciphertext over unsecured channels. The second argument is for client control over user authentication. In bFS, users can control access to their files without involving their storage provider (neither the server nor its administrator). In effect, each user maintains their own private authentication database, thus avoiding the global authentication required if servers were to implement access control.

Chapter 2

Design

Two core ideas motivate the design of bFS. The first is the use of a remote file system, referred to as the storage provider. The storage provider is not trusted to prevent the accidental or malicious disclosure of user data. The second core idea is a decentralized sharing mechanism designed to span multiple administrative domains. Together, these ideas allow a user to place their trust in a software agent that manages encrypted data on the storage provider and enforces access control. An additional requirement is to ensure that the use of bFS is transparent to storage providers, and hence easily deployable.
To fulfill this requirement the design has to support existing network file system protocols and require no involvement from the storage providers. This means that it should not be necessary for a storage provider to install any additional software or support any additional protocols in order to use bFS. Finally, if storage providers offer public read-only access, as most current systems do, this feature can be used to optimize access to files. The performance optimization occurs when data is read directly from the storage provider while fetching the decryption information from the agent.

This chapter begins with an architectural overview of the system. This is followed by a detailed account of the trust relationships amongst the different entities and components that comprise bFS. The discussion then moves to meta-data and a description of the bFS certificate and its management and revocation issues. After establishing sufficient background information, technical issues relating to agents and encryption techniques are explored.

2.1 Architectural Overview

A bFS file system is composed of a storage provider, agents, and clients. The storage provider stores data. The data includes the encrypted form of user data and the meta-data necessary to manage it. Clients access their data through agents that manage both data and the additional meta-data. The bFS certificate is a file that is used to authenticate agents and clients, and to locate a user's storage provider and agents. Storage providers are accessed using the protocol they require. For instance, some storage providers require CIFS, while others require some other protocol. It is the agent's responsibility to support the required protocol. Clients access agents using the bFS protocol. Both agents and users have their own private-public key pairs which are used for authentication using well-known authentication techniques [16].
The public keys for a user and all their agents are stored within that user's bFS certificate. Figure 2.1 contains a high-level overview of the components. It illustrates two users, Bob and Alice, their agents, storage providers, and certificates. Bob authenticates with his agent and uses his agent to access his bFS file system. He also uses the public read-only access to his storage provider to increase performance. His agent uses the credentials Bob provides during his logon with the agent to authenticate with the storage provider. Alice's storage provider does not support public read-only access, so she always accesses her file system through the agent. Bob and Alice exchanged certificates in the past and Alice is accessing Bob's files through his agent.

Figure 2.1: Architectural Overview

2.2 Trust Model

The trust model describes the different entities in a system. This includes their roles, relationships, functional assumptions, and any other aspect of their presence within the system. There are three entities that comprise bFS: storage providers, agents, and users. Storage providers provide remote storage facilities to users. Users access this storage through agents. Agents maintain all data and meta-data, and authenticate users, other agents, and, when possible, storage providers.

2.2.1 Storage Providers

A storage provider is an entity providing remote storage facilities to a user. The storage provider defines the communication protocol used to access the storage. These protocols, such as NFS [3, 17, 18], CIFS [4, 15], and the recently appearing network storage provider protocols (X:drive [19], netdrive [11], and driveway [6]), have an impact on the trust model.
For instance, NFS using UID authentication is not secure. Proposed extensions to the NFS protocol [7] address this issue but are not widely deployed. On the other hand, newer versions of CIFS support secure authentication. Secure authentication with the remote storage does not guarantee privacy, as the protocol may specify that after the authentication stage all data is sent in the clear. Since bFS uses the specified protocol and cannot require changes to the storage provider, it suffers from the weaknesses of the underlying protocol. For instance, if the underlying protocol provides very weak authentication then the overall authentication strength of bFS is weak, regardless of the strength of other authentication stages.

Authentication of storage providers can occur only if it is supported by the storage provider protocol, and most existing protocols do not support it. This lack of server authentication is not surprising, because existing protocols rely on the assumption that servers are trusted. The lack of server authentication does not pose a challenge to bFS, as all data stored on a server is encrypted, and hence a rogue server must still attempt to crack encryption keys. However, if an agent communicates with a rogue server, all data committed to that server is permanently lost, as it is not committed to the real server. A non-technical way to deal with a rogue server is to assume that the rogue server will not be able to produce the file-system view that the user expects, giving the user the opportunity to detect the rogue server.

A more serious attack is the man-in-the-middle attack. In this situation a rogue server can let the real server present the expected file-system view. It can later modify packets as they travel between the two parties. If the remote storage protocol includes message signatures then modified packets will be detected; otherwise, bFS will, with very high probability, detect these modifications when decrypting the modified data.
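The idea that modified data is caught at decryption time can be made concrete with a keyed integrity check. The following is a minimal sketch, not bFS's actual mechanism (the thesis relies on decrypting padding information, discussed in section 2.6); the SHA-256 counter-mode keystream is a toy stand-in for a real cipher, and all names are illustrative.

```python
import hashlib
import hmac
import os

def _keystream(key: bytes, length: int) -> bytes:
    # Toy counter-mode keystream from SHA-256; not a real cipher.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(key: bytes, plaintext: bytes) -> bytes:
    # Encrypt, then append a 32-byte MAC over the ciphertext.
    ct = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, len(plaintext))))
    tag = hmac.new(key, ct, hashlib.sha256).digest()
    return ct + tag

def open_sealed(key: bytes, sealed: bytes) -> bytes:
    # Verify the MAC before decrypting; a modified packet fails here.
    ct, tag = sealed[:-32], sealed[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, ct, hashlib.sha256).digest()):
        raise ValueError("data modified in transit")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, len(ct))))

key = os.urandom(32)
sealed = seal(key, b"file block")
assert open_sealed(key, sealed) == b"file block"

# A man-in-the-middle flips one bit; the modification is detected.
tampered = bytes([sealed[0] ^ 1]) + sealed[1:]
try:
    open_sealed(key, tampered)
    detected = False
except ValueError:
    detected = True
assert detected
```

An explicit MAC gives detection with certainty rather than "very high probability"; the trade-off is the extra 32 bytes per block.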
Authentication of agents can occur only if it is supported by the storage provider protocol. The agent obtains the credentials to authenticate as the user from the user, when the user establishes a session with an agent.

Users access their data using a software client that communicates with agents and storage providers. In most cases users will access all of their data through an agent. However, public read-only access allows clients to read data directly from the storage provider and decrypt it using a key retrieved from the agent. The use of public read-only access reduces load on agents at the cost of load on the clients.

2.2.2 Agents

Each user owns at least one bFS agent. Agents are trusted software entities that provide a user with access to their bFS file system. Recall that agents can be easily authenticated using their key pair. Once authenticated, an agent is trusted to hold a user's storage provider credentials. The credentials vary depending on the protocol used to access the storage provider. For instance, SMB requires a host name, user name, and password. For an agent to perform its access control role properly, it must also authenticate clients. Since the agent is a trusted software entity, it must run within an environment that is trusted by the agent's owning user. Using trusted software agents allows bFS to remove trust from storage providers.

These agents can be deployed in various configurations. For instance, some users may decide to have a single agent residing on their home computer. They can use this agent when at home or away from home, but when communicating with the agent over an untrusted network, a secure channel must be established. This use of a secure channel leads to double encryption, since the agent will decrypt remote data and then have to encrypt it for transmission over the secure channel. To avoid this unnecessary step the client has two options.
The first option is for the client to read data directly from the storage provider and obtain decryption keys from the agent. The second option is for the client to read the raw encrypted data through the agent and obtain decryption keys from it. The difference between the two options is that the first requires that the client support both the bFS protocol and the storage provider protocol, while the second requires client support only for the bFS protocol.

There are two other options when writing data. The first is to use a secure channel and suffer from double encryption. The second is to perform encryption in the client and send the agent encrypted data blocks. The first approach is simple, but the second requires moving agent functionality such as block alignment into the client. The second approach is much slower than the first when the client is very lightweight and has very limited bandwidth to the agent.

2.2.3 Users

Users access a bFS file system the same way as other distributed file systems: through a trusted client layer. The client communicates with agents using the bFS protocol, discussed in section 2.5.1.

The decentralized nature of bFS requires an assumption that each user can manage their own sharing relationships. For instance, if Bob and Alice wish to share files they must first exchange certificates. They do so using any method they feel is sufficiently secure for their purposes. Since bFS does not enforce any key management policy, different users can enforce their own policies. For example, some users may require that the public key specified in a bFS certificate be an X.509 certificate signed by VeriSign, while others may require PGP certificates. Notice that a sharing relationship is a mutual relationship. If Alice were to pass Bob's certificate to Eve and Bob's agent did not possess Eve's certificate, then Eve's agent would be unable to authenticate with Bob's agent.
This property is a key feature of the decentralized key management system, as it puts users in control of their sharing relationships by empowering them to determine which certificates should be trusted.

2.3 Meta Data

Agents maintain meta-data in special files stored at the storage provider. Meta-data in bFS is divided into five categories: file-system key, user database, directory map, directory meta-data, and file meta-data. The special meta-data files are given names that contain the illegal character null. These names are transformed into remote names using the same naming technique that applies to all other files. These naming techniques are discussed in section 2.6.1, and ensure that these illegal names become legal names on the storage provider.

Core to both meta-data and the client access protocol is the object identifier, or OID. The OID uniquely identifies every object in the file system. If all storage provider protocols provided a mechanism to access their internal object identifiers, and to gain access to an object using an object identifier, then bFS would not have to maintain its own set of OIDs. However, even NFS3 [3] does not provide a mechanism to use a fileid retrieved from a GETATTR to access the actual object. Since bFS has to manage its own set of unique object identifiers, and does not inherit a mechanism to deterministically map these OIDs to files on the storage provider, bFS must provide its own mapping from an OID to an object's meta-data.

The file-system key is a symmetric key used to encrypt the user database and the directory map. It is stored in a file residing in the file-system root. The file is encrypted using the user's public key and stored in a file named {null 'fskey'}.

The user database holds certificates for all sharing partners. It is stored in a file residing in the file-system root. The file is encrypted using the file-system key and stored in a file named {null 'udb'}.
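The key layering just described (the user's public key seals the file-system key, which in turn encrypts the user database and directory map) can be sketched as follows. The `pk_encrypt` and `sym_encrypt` functions are toy stand-ins for real cryptography, and all names and values are illustrative assumptions, not taken from the bFS implementation.

```python
# Special meta-data file names contain the illegal null character;
# the naming technique of section 2.6.1 later maps them to legal
# remote names.
SPECIAL = {
    "fskey": "\0fskey",   # file-system key, sealed with the public key
    "udb":   "\0udb",     # user database, under the file-system key
    "dmap":  "\0dmap",    # directory map, under the file-system key
}

def pk_encrypt(pubkey, data):
    # Stand-in for public-key sealing (e.g., RSA in a real system).
    return ("pk", pubkey, data)

def sym_encrypt(key, data):
    # Stand-in for the symmetric cipher used for bulk meta-data.
    return ("sym", key, data)

fs_key = "K_fs"
store = {
    SPECIAL["fskey"]: pk_encrypt("bob-public-key", fs_key),
    SPECIAL["udb"]:   sym_encrypt(fs_key, {"alice": "<alice's cert>"}),
    SPECIAL["dmap"]:  sym_encrypt(fs_key, {"next_oid": 7,
                                           "dirmd_key": "K_dir"}),
}

# Only the holder of the private key can recover fs_key, and only
# fs_key opens the user database and directory map.
assert store[SPECIAL["udb"]][1] == fs_key
assert store[SPECIAL["fskey"]][0] == "pk"
```

The point of the layering is that a re-key of the file system (FSREKEY in the protocol) only requires re-encrypting these small root files, not the bulk data.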
The directory map contains a mapping, for directories, from OIDs to their remote name and parent's OID. It also contains the OID that will be assigned to the next object, and a symmetric key used to encrypt directory meta-data. The OID mapping ensures that, given a directory's OID, an agent can rapidly construct the fully-qualified path of the file containing the directory's meta-data. The agent constructs this path recursively in reverse order by looking up an OID, finding its remote name and its parent's OID, and then doing the same for the parent OID until it reaches the root OID. The root OID is identified in the directory map as it is the only entry whose parent OID is identical to the entry's OID. The bFS protocol, discussed in section 2.5.1, specifies that a bFS file handle is composed of the object's OID and its parent's OID. This technique allows any file handle to be mapped to its meta-data. The directory map is stored in a file residing in the file-system root. The file is encrypted using the file-system key and stored in a file named {null 'dmap'}.

Both file and directory meta-data contain the object's key, real and remote names, OID, and an ACL. Both real and remote names are stored because the naming technique may not be reversible. In addition to the common meta-data, directory meta-data contains a map from OID to file meta-data for files, or to directory names for directories. File meta-data contains the file's real size. Each directory stores its meta-data in a file residing within that directory on the remote storage. The file is encrypted using the key specified in the directory map and stored in a file named {null 'dirmd'}.

2.4 Certificates

A certificate is a file that contains information used to authenticate a user and all their agents, and to access the user's storage provider. Certificates are not centrally managed, and can therefore be created by anyone and managed using any policy.
A user can even partition their storage by issuing certificates for different directories on their storage provider. For instance, Bob has Acme File Storage, Inc. as his storage provider. His home directory is /users/bob, but he creates two certificates, one for /users/bob/personal and one for /users/bob/business. In effect, Bob has established two separate accounts. bFS certificates contain all the information required for sharing, authentication, and bootstrapping, and their content is described in table 2.1.

The certificate contains some form of the user's public key and is signed using the corresponding private key. The public key can take many forms. It may be an X.509 [16] certificate, a PGP [21] certificate, or a raw public key. In addition to the public key, the bFS certificate contains the following information:

• Storage Protocol and Location: This is used by agents to communicate with the storage provider, and by sharing parties for public read-only access. For instance, Bob has a storage account with Acme File Storage, Inc. The storage provider uses SMB, their server is located at acmefilestore.com, and Bob's root on the storage is /users/bob.

• Agent Locations: A list of the location and hash of the public key (using the self-certifying technique from SFS [9]) for the user's agents. More aggressive caching can take place if the user has only a single agent.

• bFS Bootstrap Information: This information is used by agents and public read-only clients. This includes configuration information such as the encryption and naming techniques.

Element                          Description
Preamble                         BFSCERT
Version                          bFS certificate version
Username length                  8-bit length
Username                         The user's alias
Storage provider type            NFS, SMB, etc.
Storage provider details length  Size of following segment
Storage provider details         Protocol-specific details; for instance, SMB requires host name and user name
Storage root length              Size of following segment
Storage root                     For instance, /mark
bFS block size                   16-bit block size
Encryption type                  What encryption technique is used: type, key size, and initialization vector
Naming type                      What naming technique is used
Public access                    Is public read-only access available
Certificate type                 PGP, X.509, etc.
Certificate length               Size of following segment
Certificate                      Certificate that contains the user's public key
Number of agents                 Number of agents
For each agent:
  Agent location length          Size of following segment
  Agent location                 The host where the agent is running
  Agent port                     The port where the agent is running
  SFS hash                       Hash of agent's public key, hostname, and port

Table 2.1: bFS Certificate

2.4.1 Revocation

Certificate revocation is a major challenge in a centralized system. Such systems have certificate authorities (CAs) that can be queried to determine whether a certificate has been revoked. Querying certificate authorities when presented with a certificate causes a performance bottleneck. Modern CAs use certificate revocation lists (CRLs) to propagate new revocations. However, strict reliance on CRLs leads to a window, between a certificate's revocation and the acquisition of a new CRL, in which revoked certificates are mistaken as valid. Certificate revocation in bFS requires users to distribute their new certificate to their sharing partners, and for these partners to update their local databases.

2.5 Agents

The bFS agent controls communication with the storage provider, performs all authentication, maintains meta-data, and controls sharing.
Earlier sections discussed the relationship between agents and storage providers, how standard authentication techniques are used in bFS, and the additional meta-data maintained in bFS. What follows is a discussion of the protocol used to access agents, how sharing occurs in bFS, and issues related to the use of multiple agents per user.

2.5.1 Protocol

The bFS protocol is very similar to NFS. It has operations on names (such as resolve a name into a file handle, or remove the named object), operations on file handles (such as read and write), and administrative operations (such as add certificate and grant access). Each object in bFS is assigned a unique, persistent object identifier (OID) which is part of that object's meta-data. The root of the file system contains the directory map, which is used to rapidly locate any directory's meta-data. Directories use OIDs to locate the meta-data for objects within the given directory, and OIDs are used within bFS file handles. A bFS file handle is composed of a user ID (UID), the parent's OID, and the object's OID. The UID is a unique number assigned to each user in the user database, with the UID of the main user being zero. Since the root does not have a parent, the OID used for that field is the root's OID. When an operation happens on a file handle, the directory map can be examined to determine either the object's remote name or, if the object is a file, the parent's remote name. If the object is a file, the parent's remote name is used to retrieve the directory's meta-data and access the file's meta-data.

Communication between an agent and the user may take several forms. In some circumstances they may communicate over an untrusted network (i.e., the Internet), a secure network, or even share the same address space.
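The file-handle layout and the directory-map walk described above can be sketched as follows. The field widths, the dictionary shape, and the remote names are illustrative assumptions; the thesis does not specify a wire format.

```python
import struct
from collections import namedtuple

# A bFS file handle: user ID, the parent's OID, and the object's own
# OID. The root uses its own OID in the parent field. The assumed wire
# layout is a 32-bit UID followed by two 64-bit OIDs.
Handle = namedtuple("Handle", "uid parent_oid oid")
FMT = "!IQQ"

def pack(h: Handle) -> bytes:
    return struct.pack(FMT, h.uid, h.parent_oid, h.oid)

def unpack(data: bytes) -> Handle:
    return Handle(*struct.unpack(FMT, data))

# The directory map: directory OID -> (remote name, parent OID).
# Walking it upward until an entry's parent equals its own OID (the
# root) yields the remote path of a directory's meta-data file.
def remote_path(dmap, oid):
    parts = []
    while True:
        remote_name, parent = dmap[oid]
        if parent == oid:               # the root entry points at itself
            return "/" + "/".join(reversed(parts))
        parts.append(remote_name)
        oid = parent

dmap = {
    1: ("", 1),        # root
    2: ("x9f2", 1),    # made-up encrypted remote name for a directory
    3: ("qq71", 2),    # a subdirectory of OID 2
}
h = Handle(uid=0, parent_oid=2, oid=3)  # UID 0 is the file system's owner
assert unpack(pack(h)) == h
assert remote_path(dmap, h.oid) == "/x9f2/qq71"
```

Because the handle carries the parent OID, a file's meta-data can be reached by resolving the parent directory's path and reading that directory's meta-data file, as the text describes.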
Part of the user-agent handshaking phase is to negotiate protocol parameters. These parameters determine how data will be transferred between the two. For example, if using the same address space there may be no reason to establish a secure communication channel between the two co-located modules. When using a trusted network we would expect the agent and user to use plaintext after authenticating one another. However, when using an untrusted network the two must communicate over a secure channel, raising the double encryption problem discussed in section 2.2.2.

Tables 2.2, 2.3, and 2.4 describe the operations exposed by agents and available using the bFS protocol. The tables list operations on files and directories, cryptographic operations, and sharing operations respectively.

LOGON     Tunnel authentication information, receive root file handle
GUEST     Logon as a guest to a file system
LOOKUP    Resolve parent file handle and object name into a file handle
READ      Read from the specified file
READRAW   Read encrypted data from the specified file
WRITE     Write to the specified file
WRITERAW  Write already-encrypted data to the specified file
CREATE    Create a file in the directory specified by a file handle
MKDIR     Create a directory in the directory specified by a file handle
REMOVE    Remove a named file from a directory
REMDIR    Remove a named directory from a directory
RENAME    Rename an object within one directory to an object within another directory
GETATTR   Retrieve attributes for the specified file handle
SETSIZE   Set the size of a file specified by a file handle
READDIR   Retrieve the object listing of a directory

Table 2.2: File access operations
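A client session over these file access operations might look like the following sketch. The ToyAgent below is an in-process stand-in, not the real bFS agent; its method names mirror the table, and the credentials and file contents are made up.

```python
class ToyAgent:
    """Hypothetical in-process agent exposing a few Table 2.2 operations."""

    def __init__(self):
        self.root = {"type": "dir", "entries": {}}
        self.handles = {1: self.root}    # toy handle -> object
        self.next_handle = 2

    def logon(self, credentials):
        # LOGON: tunnel authentication information, receive root handle.
        return 1

    def create(self, dir_handle, name, data=b""):
        # CREATE: make a file in the directory given by a handle.
        node = {"type": "file", "data": data}
        self.handles[dir_handle]["entries"][name] = node
        h = self.next_handle
        self.next_handle += 1
        self.handles[h] = node
        return h

    def lookup(self, dir_handle, name):
        # LOOKUP: resolve (parent handle, name) into a handle.
        node = self.handles[dir_handle]["entries"][name]
        for h, obj in self.handles.items():
            if obj is node:
                return h
        raise KeyError(name)

    def read(self, handle):
        # READ: return the plaintext of the file behind a handle.
        return self.handles[handle]["data"]

agent = ToyAgent()
root = agent.logon(("bob", "secret"))
fh = agent.create(root, "notes.txt", b"meet at noon")
assert agent.lookup(root, "notes.txt") == fh
assert agent.read(fh) == b"meet at noon"
```

A real client would issue these operations over the network; the flow (LOGON for a root handle, LOOKUP to resolve names, READ on the resulting handle) is the same.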
REKEY          Force a re-key of a file
FSREKEY        Force generation of a new file-system key
GETREMOTEINFO  Retrieve remote name and key

Table 2.3: Cryptography operations

GETUSERS       List users
GETUSER        Retrieve a user's certificate given an alias
ADDUSER        Add a new user (with their bFS certificate)
REMOVEUSER     Remove a bFS user
UPDATEUSER     Update a user's bFS certificate
SETRIGHTS      Set rights for a user on a file handle
GETUSERRIGHTS  Get rights for a user for a file handle
GETALLRIGHTS   Get rights for all users for a file handle
RESETACL       Set the ACL for the specified file handle to be the same as the parent's

Table 2.4: Sharing operations

Users access their data using a software client that communicates with agents and storage providers. Modern operating systems support various techniques for introducing new file systems. In Unix, new file systems can be introduced within the kernel or at user level. In Windows, new file systems can be introduced using the Installable File Systems Kit [10]. Another method of accessing files is through a web interface. Regardless of which technique is used, the client that interfaces with the bFS agent must communicate using the bFS protocol.

2.5.2 Sharing

Existing network file systems accomplish sharing by using access control lists (ACLs) or capabilities, based on the assumption that all users are known and authenticated to the file server. Whenever sharing has to cross an administrative domain, different methods, such as FTP or e-mail, are used. The disadvantages of such methods are that sharing is obtrusive, does not fit the typical file-sharing model, and that write-sharing is complicated by the fact that files must be FTPed or e-mailed back to the owner. Furthermore, FTP and e-mail are usually not used in a secure manner. The sharing semantics of bFS are taken from Multics [14]. Newly created objects inherit the ACL of their parent directory. Subsequent changes to the parent's ACL do not affect any objects within the directory.
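The Multics-style inheritance just described, a copy of the parent's ACL taken at creation time with no later propagation, can be sketched as follows. The class and method names are illustrative, not the actual bFS classes.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of Multics-style ACL inheritance as described above: a new object
// copies its parent's ACL at creation time, and later changes to the
// parent's ACL do not propagate. Names are illustrative.
public class AclSketch {
    enum Right { READ, WRITE, ADMINISTER }

    static final class Acl {
        final Map<Integer, Right> entries = new HashMap<>(); // UID -> right

        // Copy-on-create: the child gets a snapshot, not a live reference.
        Acl snapshot() {
            Acl copy = new Acl();
            copy.entries.putAll(entries);
            return copy;
        }
    }

    public static void main(String[] args) {
        Acl parent = new Acl();
        parent.entries.put(1, Right.READ);

        Acl child = parent.snapshot();      // object created under parent
        parent.entries.put(2, Right.WRITE); // later change to parent's ACL

        System.out.println(child.entries.containsKey(2)); // false: no propagation
    }
}
```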
There exists a user database in bFS where the certificates of all sharing parties are stored. This database assigns a unique user ID to each certificate, and when a certificate is loaded into the database it must be accompanied by an alias assigned by the user. The access control list contains a user ID and a privilege for each user granted access to an object. When a user's access to an object is revoked, the object's contents may have to be re-encrypted. In some scenarios clients may retrieve an object's remote name and key from an agent, and then read the data using public read-only access. bFS therefore maintains information in the ACL about which users have ever had the key to a file in their possession. If a user's access is later revoked and they never acquired the key, the system need take no action: since the user never had the key they must go through the agent to obtain the cleartext data, and the agent will deny them access. However, if the user's access to a file is revoked and they acquired the file key at some point in the past, then the file is re-encrypted with a new key. This means that keys previously handed out to other users are now also invalid. These keys, previously retrieved and cached by clients, will with very high probability cause decryption errors when used to read the padding information (discussed in Section 2.6), indicating to the client that the key has changed.

2.5.3 An Example

The following describes a way for users to share files. When Bob and Alice wish to share files, they must first exchange bFS certificates. Once exchanged, the certificates are loaded into their corresponding bFS file systems via their agents. Bob can now grant Alice various levels of access to different files. The mechanism by which this happens depends on the client Bob is using to access bFS. For instance, some clients may provide a graphical tool to load certificates and grant access, while others may only provide a command line interface.
The client will use various bFS protocol commands to accomplish the necessary task. When Alice wishes to read one of Bob's files, her agent (or client) detects an attempt to use a shared object. It determines that the object is owned by Bob, retrieves Bob's certificate, and initiates communication with one of Bob's agents. Once an authenticated session has been established, Alice performs all operations on Bob's files through Bob's agent. Bob's agent is aware that the operations are being performed by Alice and can therefore enforce proper access control. Alice can improve performance on some operations by taking advantage of public read-only access to files. She does so by requesting the key and remote name for the required file. Bob's agent verifies that access should be granted and responds with the key and remote name. Alice now accesses the file using public read-only access. After fetching the required data she uses the key to decrypt it. This is illustrated in Figure 2.2.

[Figure 2.2: Sharing using public read-only]

2.5.4 Multiple Agents

A user may have several agents, each of which may be designated public or private. Public agents are published in the user's certificate and are hence accessible for sharing. Private agents are used to ensure that sharing does not compromise the user's performance. The local agent is the agent that the user is currently using; all others are remote. Since only a single agent is used at the user-storage authentication stage, all agents are required to broadcast, in a secure manner, a user's credentials to the user's remaining public agents. Once an agent receives a user's credentials it can proceed to establish secure sessions with the user's remaining public agents. Agents could be rendered useless if a user changes his password with the storage provider but does not re-authenticate through an agent; however, users will eventually authenticate with an agent to get at their data.
To disable a public agent, the user must generate a new certificate with that agent removed, and change credentials (the password) on the remote storage. To disable a private agent, the user need only change credentials.

2.6 Encryption

Recall that one of the bFS design requirements is to optimize performance in public read-only environments. Since such an environment allows access to all files for the specified user, bFS must ensure that a given key is only used for objects with the same permissions. To avoid any book-keeping associated with maintaining these sub-groups of file system objects, bFS uses a different key for every object. Files are encrypted using symmetric cryptography, since asymmetric cryptography is too expensive. Symmetric ciphers [16] come in two primary flavors: stream and block. Stream ciphers process data one bit at a time, and the ciphertext of a given bit depends on all the preceding bits. This dependency shows itself in the decryption process as well, and is an unacceptable performance barrier as bFS requires uniform-access-time random access to files. The second family of ciphers, block ciphers, processes data one block (typically 64 bits) at a time. Block ciphers can operate in different modes. In ECB (Electronic Codebook) mode, the ciphertext of each block is completely independent of any other block. In CBC (Cipher Block Chaining) mode, the plaintext of block n is xor'ed with the ciphertext of block n-1 before it is encrypted. To decrypt block n, it is decrypted in the normal manner, and then the result is xor'ed with the ciphertext of block n-1. Notice that decrypting block n does not require block n-1 to be decrypted, which is required for random access reads. However, a change in the ciphertext of block n will propagate changes to all the remaining blocks, which is an unacceptable performance penalty for writes.
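The random-access property of CBC decryption described above can be demonstrated directly: block n is recovered from its own ciphertext plus the ciphertext of block n-1, with no earlier block ever decrypted. The sketch below uses Blowfish because its 64-bit cipher block matches the block size in the discussion; the cipher choice and helper names are illustrative, not part of the bFS design.

```java
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Demonstration of CBC random-access reads: to decrypt block n, the
// ciphertext of block n-1 is supplied as the IV of a fresh CBC run.
public class CbcRandomAccess {
    static byte[] encrypt(byte[] key, byte[] iv, byte[] plaintext) throws Exception {
        Cipher c = Cipher.getInstance("Blowfish/CBC/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "Blowfish"),
               new IvParameterSpec(iv));
        return c.doFinal(plaintext);
    }

    // Random-access read: the previous ciphertext block serves as the IV.
    static byte[] decryptOneBlock(byte[] key, byte[] prevCipher, byte[] cipher) throws Exception {
        Cipher c = Cipher.getInstance("Blowfish/CBC/NoPadding");
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "Blowfish"),
               new IvParameterSpec(prevCipher));
        return c.doFinal(cipher);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes();                // 128-bit key
        byte[] pt = "eight8s.eight8s.eight8s.eight8s.".getBytes(); // four 8-byte blocks
        byte[] ct = encrypt(key, new byte[8], pt);

        // Decrypt only the third block (bytes 16..23) of the file.
        byte[] block = decryptOneBlock(key,
                Arrays.copyOfRange(ct, 8, 16),   // ciphertext of block n-1
                Arrays.copyOfRange(ct, 16, 24)); // ciphertext of block n
        System.out.println(Arrays.equals(block, Arrays.copyOfRange(pt, 16, 24))); // true
    }
}
```

Note that writes do not share this property: changing one plaintext block alters every later ciphertext block in the chain, which is what motivates confining each CBC run to a fixed-size bFS block, as described below.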
The remaining block cipher modes combine the flexibility of stream ciphers with the strength of block ciphers. At first, it appears as if bFS has to use ECB. This is not satisfactory, as ECB has some cryptographic weaknesses. These weaknesses arise because two identical plaintext blocks have identical ciphertext. The duplication offers an attacker a starting point in attempting to uncover the plaintext. The encryption technique in CFS [2] uses a combination of ECB and OFB (Output Feedback). But since the CFS author states some concerns with his approach, an alternative that balances ECB and CBC is used. bFS uses CBC on bFS blocks. Each bFS block contains a fixed number of ciphertext blocks, and hence a write anywhere in a file will only affect the bFS block it resides within and not the remainder of the file. This use of CBC on bFS blocks offers very strong security, and like all the bFS components, the cryptographic module could easily be told to use different block ciphers with different key lengths. With slightly more effort, it could be converted to use a completely different encryption technique, such as the one used in CFS. To help avoid the duplicate-block syndrome of ECB on the first CBC block in a bFS block, the cipher uses an initialization vector. This vector is xor'ed with the first block before it is encrypted; it acts as the ciphertext of block -1. Like other bFS modules, the initialization vector can easily be customized. The default initialization vector of block n is the 64-bit block number. Block ciphers require padding if the length of the plaintext is not guaranteed to be a multiple of the cipher block size. Therefore, both bFS and traditional file systems with built-in encrypting capabilities, such as NTFS 5.0 or CFS, face a problem when a file's size is not a multiple of the cipher block size.
To solve this problem, bFS uses the same padding technique as the Unix utility bdes [1]. The very last byte in a file, once decrypted, specifies how many bytes should be ignored. bFS ensures that the encrypted file size is always a multiple of the cipher block size, hence requiring only null padding. This technique allows public read-only clients to determine the actual file size by first reading the last bFS block. Existing file systems optimize disk usage of sparse files by not allocating null blocks. Doing so in bFS would mean that public read-only access to files would require not only the current key, but also the currently valid non-null ranges in the file. This could be avoided by assuming that a bFS block composed of only null bytes is, in fact, a null plaintext block. Both approaches are unacceptable. In the first approach bFS would have to maintain a large amount of meta-data. The second approach is based on the assumption that the symmetric algorithm will not produce a block of nulls as ciphertext. This assumption is flawed, as a block cipher maps one 64-bit quantity onto another. This mapping must be a function that is one-to-one and onto, which means that some plaintext, using some key, must encrypt into eight null bytes. Furthermore, by using the storage provider's sparse file feature we would be revealing information about the structure of the encrypted file. For simplicity and symmetry bFS does not optimize storage of sparse files.

2.6.1 Naming

In order to provide an acceptable level of security, bFS encrypts file names as well as file contents. bFS currently supports four different transformations which map an object's real name to the name used on the remote storage, and can easily support more. The first is simply the real name encrypted with the directory's key, using the file identifier as the initialization vector.
This results in relatively long file names, as the encrypted binary data must be normalized to a set of legal characters for the remote name. The second transformation is the hash of the object's real name and the directory's key. This results in fixed-length names. The third is a function of the OID, and the fourth is a randomly generated name, unique within the directory. The advantage of the last two approaches is that a very compact namespace can be created, which could offer a longer real path length than supported by the remote storage. To support all these transformations, bFS stores both the object's real and remote names in its meta-data.

2.7 Summary

The core ideas behind the design of bFS allow users to store their data with various storage providers using various network file-system protocols. Users no longer need to trust their storage providers to prevent the accidental or malicious disclosure of their data. Instead, users trust a software agent to control communication with the storage provider, perform all authentication, maintain meta-data, and control sharing. The agent can support different network file-system protocols and different encryption and naming techniques, and can be deployed in multiple locations. Sharing relies on a user's ability to establish relationships with sharing partners, and to exchange certificates in any manner sufficiently secure for one's purposes. This removes any centralized sharing authority and does not enforce any key management policies. Furthermore, sharing can span multiple administrative domains, as system administrators are never involved in the sharing process.

Chapter 3

Implementation

The bFS prototype implementation consists of two entities: the client and the agent. The client provides access to bFS by integrating with the local operating system and communicating with agents. The agent communicates with the storage providers, authenticates users, maintains meta-data, and enforces access control.
This chapter discusses these two entities in further detail.

3.1 Overview

Figure 3.1 illustrates the relationship between clients, agents, and storage providers. Clients provide access to a bFS file system by communicating with either remote or co-located agents. These agents communicate with the user's remaining agents to ensure that all of the user's agents have the user's credentials for access to the remote storage. These credentials are used by the agents to gain access to the remote storage. For instance, SMB agents require the username, password, and domain (host) to gain access to the remote storage. The agent also communicates with foreign agents to facilitate sharing. All I/O to the remote storage is done through the remote storage facade. This facade provides a consistent interface within the agent to storage providers using various protocols. All cryptographic routines are encapsulated in a cryptographic library that conforms to the Java Cryptography Extension specification.

[Figure 3.1: bFS agent overview]

The following sections discuss the implementation of the client and agent, and what optimizations were added to increase performance.

3.2 Client

Access to bFS is provided by a user-level NFS server running on FreeBSD. The server acts as a gateway between the NFS protocol and the bFS protocol. It was developed by implementing the server stubs generated by the RPC compiler. The server communicates with the kernel's NFS client using Unix domain sockets, and with agents using TCP/IP over a trusted network. The user-level NFS server, on startup, mounts /bfs. The mounted subtree contains two subdirectories: home, where the user's files are located, and friends, where sharing parties are accessible. The /bfs/friends directory contains a directory entry for each entry in the owner's user database.
For instance, if a user has Bob, Alice, and Mike as sharing partners, /bfs/friends/Bob, /bfs/friends/Alice, and /bfs/friends/Mike would be present. When the user accesses one of these directories, the NFS server retrieves the user's certificate using the GETUSER command. It uses the certificate to locate one of the sharing party's public agents, and communicates with that agent when accessing files. When the user accesses /bfs/home, the NFS server communicates with the user's agent.

3.3 Agent

The agent is implemented as a Windows Service written in Java. The Windows platform was chosen because it provides a very simple and secure mechanism to authenticate users and acquire their security tokens. A single process can obtain security tokens for many users and switch between these identities as required. This allows the agent to run as a system service (similar to a Unix daemon), authenticate users to a domain, and impersonate them on subsequent accesses to the remote storage. Java was chosen for rapid development purposes. Two major functional components exist in the agent: the remote storage facade and the agent core. The remote storage facade is an interface that must be implemented for any network file-system protocol that is to be supported. The current implementation provides an SMB facade. This allows the agent to authenticate against SMB domain controllers and use SMB file servers as the storage provider. Since newer versions of SMB support secure authentication with a domain controller, the agent can securely authenticate users. The facade is the only component requiring change if a different network file-system protocol were to be used. Due to the nature of the facade, stacking multiple facades achieves richer functionality. In the implementation the SMB facade sits below a caching facade. The caching facade uses the SMB facade when it cannot fulfill a request.
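The stacked-facade arrangement just described can be sketched as a decorator: the caching facade implements the same interface as the SMB facade and falls through to it only on a miss. The single-method interface below is a deliberate simplification of the real facade, and all names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of facade stacking as described above: a caching facade wraps a
// backend facade (e.g. the SMB facade) behind the same interface.
public class FacadeSketch {
    interface RemoteStorage {
        byte[] read(String remoteName);
    }

    static final class CachingFacade implements RemoteStorage {
        private final RemoteStorage backend; // e.g. the SMB facade
        private final Map<String, byte[]> cache = new HashMap<>();

        CachingFacade(RemoteStorage backend) { this.backend = backend; }

        public byte[] read(String remoteName) {
            // Serve from the cache; consult the backend only on a miss.
            return cache.computeIfAbsent(remoteName, backend::read);
        }
    }

    public static void main(String[] args) {
        final int[] backendReads = {0};
        RemoteStorage smb = name -> { backendReads[0]++; return new byte[8]; };

        RemoteStorage stacked = new CachingFacade(smb);
        stacked.read("obj-1");
        stacked.read("obj-1"); // hit: the SMB facade is not consulted again
        System.out.println(backendReads[0]); // 1
    }
}
```

Because both layers expose the same interface, a different backend (or an additional layer) can be substituted without touching the agent core, which is the separation the next paragraph relies on.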
This separation enables the use of different caching strategies (for instance, if support for multiple agents were included) without affecting the backend facade. The agent core communicates with a facade to perform all I/O on the storage provider. It manages meta-data, determines what needs to be read or written, manages keys, handles authentication, and understands the bFS protocol. Its functionality was discussed in section 2.5. The current implementation supports a single agent. In order to enable multiple agents, the remote storage facade would have to be modified to utilise a cache consistency protocol. The sharing privileges supported by the agent are: read, write, and administer. Only users with administrative privileges may modify ACLs and the user database, and administrative privileges may not be revoked from the owner. These semantics were chosen for their simplicity and have proven sufficient for the prototype.

3.4 Performance Optimizations

Performance optimizations can be introduced to both the client and the agent. The client contains no optimizations, but the agent contains two: a simple block-caching algorithm, and caching of the Java objects representing meta-data. The cache uses a hashtable to map an object's fully qualified name to an entry that contains the file's remote file descriptor, a dirty flag, a clock (used for n-th chance block replacement), the Java meta-data object associated with the file (used for encryption and decryption), and a hashtable of blocks. The blocks hashtable maps a block number to a dirty flag, a clock, and the actual data. A syncher thread running at some user-defined interval performs lazy writes, clears dirty flags, and removes cache entries when the cache grows too large (the size is configurable). Caching meta-data Java objects avoids the cost of re-creating a Java object from its persistent state.
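The cache structure just described, per-file entries holding per-block dirty flags and clocks swept by a background thread, can be sketched as follows. The sweep here combines the syncher's lazy write-back with n-th chance replacement for clean blocks; field names and the value of N are illustrative, not taken from the bFS sources.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the agent's block cache described above: a table of per-file
// entries, each holding a table of blocks with a dirty flag and an n-th
// chance clock. Names are illustrative.
public class BlockCacheSketch {
    static final int N = 2; // chances a clean block gets before eviction

    static final class CachedBlock {
        byte[] data;
        boolean dirty;
        int clock; // incremented by the sweeper, reset on access
    }

    static final class CacheEntry {
        final Map<Long, CachedBlock> blocks = new HashMap<>(); // block# -> block
    }

    // One sweep: dirty blocks are written back (here just marked clean);
    // clean blocks are evicted once they have survived N sweeps untouched.
    static void sweep(CacheEntry e) {
        e.blocks.values().removeIf(b -> !b.dirty && ++b.clock >= N);
        e.blocks.values().forEach(b -> b.dirty = false); // lazy write-back
    }

    public static void main(String[] args) {
        CacheEntry e = new CacheEntry();
        CachedBlock b = new CachedBlock();
        b.dirty = true;
        e.blocks.put(0L, b);

        sweep(e); // written back, now clean
        sweep(e); // first chance
        sweep(e); // second chance: evicted
        System.out.println(e.blocks.isEmpty()); // true
    }
}
```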
When data or meta-data change, the changes are written to the cache and the associated blocks are flagged as dirty. Additional optimizations could improve performance further. Attribute caching on the client would reduce the number of GETATTR operations performed, and the agent could improve performance by implementing prefetching.

3.5 Summary

The bFS prototype implementation concentrated mainly on the new functionality offered by bFS. Performance optimization was not a major priority, as it was known which optimizations would improve performance and that they could be made at a later date.

Chapter 4

Performance

This chapter begins with a discussion of the performance expectations for the current bFS prototype. This is followed by a set of micro-benchmarks and the Andrew benchmarks.

4.1 Overview

The system's performance is impacted by three items: Java, additional network communication, and the nature of the user-level NFS server. The agent is implemented in Java using the Microsoft virtual machine, running as a Windows Service. The cryptography library used by the agent is pure Java, containing no native calls. Java was selected because it enabled rapid development; the primary concern was exploring the new functionality offered by bFS, with performance issues to be tackled at a later stage. The user-level NFS server employs no performance enhancements and can be thought of as a simple gateway that translates the NFS protocol to the bFS protocol. Another system that used a user-level NFS server to mount its file system
For reads and writes, bFS performs only one half of the encryption that NFS over a secure channel must perform and therefore should be faster. This is so because secure NFS would require data to be encrypted by the client, transmitted, and then decrypted by the server. bFS does not require server decryption. On the other hand, the bFS agent introduces over-heads that will make bFS slower. In other performance respects, the performance of the two systems should be comparable assuming that a similar optimization effort is performed on both. The following sections describe the test environment, microbenchmarks, and performance observations. 4.2 Test Environment The test environment consists of four machines: • Client: Pentium III 550MHz with 512MByte RAM running FreeBSD .4.0. • Agent: Pentium II 350MHz with 128MByte RAM running Windows 2000 Professional. • Samba: Sun Ultra 10 300MHz with 64MByte RAM running Solaris 5.7. SMB services are offered by Samba [15] version 2.0.0alphaX . • NFS: Sun Ultra 10, 300MHz with 192MByte RAM running Solaris 5.6. 32 When measuring N F S performance, Client communicates directly with NFS. W h e n measuring bFS performance, Client communicates with a user-level N F S server, (running on the same machine) using U n i x domain sockets. Tha t server communicates with Agent who accesses a S M B share offered by Samba, who needs to communicate with NFS since the share is actually an N F S mount. It is expected that b F S suffer a performance penalty due to the fact that the test environment requires addit ional network hops. Ideally, Client and Agent should be co-located, as should NFS and Samba. 4.3 Micro-Benchmarks M i c r o benchmarks are used to analyze the cost of file system operations. The user-level N F S server was instrumented to record each N F S request it received and all b F S operations it performed to satify that request. 
Each event was time stamped using the Intel Pentium cycle counter, via the 'read time stamp counter' (rdtsc) instruction. The test involved making a directory, copying a 64KB file into it, and then reading the file. To determine the Read cost during a cache miss, the user-level NFS server was stopped, re-started, and the file was read again. This process ensured that the block cache would be empty. Table 4.1 summarises the end-to-end performance of key functionality. The performance numbers are the mean of five test runs. The first column describes the operation. The second is the total time of the operation in milliseconds. This time is broken down into the amount of time spent in the agent and the amount of time spent in the user-level NFS server (client). The last column indicates how much time could have been saved if the user-level NFS server performed some attribute caching.

Operation                             Total time    Agent   Client   Client overhead
Read 32K (cache miss)                      49.25    47.40     1.85              0.00
Read 32K (cache hit)                       15.58    13.85     1.73              0.00
Write 32K (cache entry not present)        33.09    32.11     0.98              2.64
Write 32K (cache entry present)            19.80    19.00     0.80              2.85
Create                                     44.67    44.49     0.18              6.83
MkDir                                     102.24   101.82     0.42             18.12
Lookup cache hit                            6.51     6.21     0.30              5.62
GetAttr cache hit                           2.90     2.71     0.19              0.00

Table 4.1: bFS micro-benchmarks in milliseconds

Notice that Create and MkDir have to perform remote file-system operations synchronously, and hence are relatively expensive. A Create requires a synchronous file creation on the storage provider; after creating the remote file the agent updates the directory meta-data. The MkDir operation requires a synchronous directory creation on the storage provider, followed by a synchronous file creation for the meta-data file of the newly created directory. After creating the two objects the agent updates the parent directory's meta-data and populates the new directory's meta-data.
The Read cache miss requires both synchronous access to the storage provider and the cost of data decryption. The decryption step alone takes 15.24 milliseconds; the same step completes in 0.62 milliseconds using a native C routine. The difference between the cache hit and miss numbers for Read indicates that reading the data from the storage provider and adding it to the cache (excluding decryption) requires almost 20 milliseconds. The cost of adding data to the cache when a cache entry (for the file) does not exist can be determined by examining the difference between the two Write figures. The two writes are generated when the 64KB file is copied and do not differ in any way, aside from the fact that the first write creates a cache entry for the file while the second simply adds a block to the cache entry's block list. The difference between the two writes, and hence the cost of adding data to the cache when a cache entry for the file does not exist, is 13.11 milliseconds. The difference in the agent time of the two reads is 33.55 milliseconds, of which 15.24 is cryptographic cost, leaving 18.31 milliseconds to access the storage provider and insert the data into the cache. Since the cost of adding data to the cache is 13.11 milliseconds, the cost of network access to the remote storage is about 5.2 milliseconds. There are two reasons for the performance of bFS: Java and the user-level NFS server. The agent is implemented in Java because it allowed rapid prototype development and portability. At the outset, Java was expected to have a negative impact on performance, but it was also expected that ongoing efforts by the Java community would improve Java performance. To quantify the effect of the decision to use Java for the agent, a modified user-level NFS server tracked agent performance as the file system grew. Surprisingly, performance deteriorated rapidly.
For instance, recall that the GetAttr micro-benchmark took approximately 3 milliseconds; as the file system grew to several hundred files in a few dozen directories, the same operation required almost 80 milliseconds. Further investigation was inconclusive, but suggested that the Microsoft Virtual Machine and the Java garbage collector may have been responsible for the performance deterioration. Unfortunately, the Microsoft Virtual Machine cannot be configured to disable garbage collection, so this theory could not be validated. To quantify the effect of the user-level NFS server, the performance of a C program that uses the NFS server to access the bFS file system was compared to that of a Java program that accesses the agent directly. Where the C program took 25 seconds to write a one megabyte file, the Java program took just as long as the cp utility under NFS: 280 milliseconds. There is a single encompassing reason for the performance difference between the two methods of accessing bFS: the NFS client in the kernel causes a one megabyte write request to be broken into 128 eight-kilobyte write requests. The bFS agent and the NFS server both have to process these additional requests. To determine exactly where these extra cycles are consumed, the Java program was modified to perform 128 eight-kilobyte writes instead of the single one megabyte write. The modified program now required 450 milliseconds. This pointed at the user-level NFS server as the cause of the poor performance. By modifying the NFS read and write sizes to 16K (from 8K), performance for the C program improved to 11.7 seconds. Moving to 24K resulted in the one megabyte write requiring 2 seconds. Finally, with 32K read and write sizes the one megabyte write required a mere 370 milliseconds.

4.4 Andrew Benchmark

The current implementation falls short of the goal to match the performance of NFS. Table 4.2 compares the elapsed time in seconds of each phase of the Modified Andrew Benchmark [12] running on bFS and regular NFS.

Phase    bFS    NFS
I          1    0.4
II        14    0.8
III        5    0.4
IV         6    1.6
V         77    4.4

Table 4.2: Andrew benchmark: bFS vs. NFS

The first phase creates many subdirectories. The second copies many files and directories. The third recursively retrieves the status of every file in a subdirectory that contains the source files for a program. Every byte of every file in that source tree is examined during the fourth phase. The last phase compiles the project in the source directory. The compiler and linker used during the last phase are themselves located in the file system being tested, and temporary files are also created within that file system. The bFS block size used was 4KByte, data was encrypted using 128-bit Blowfish, and names were transformed using a function of the OID. The NFS read and write sizes were set to 8KByte. Running the same benchmark using NFS read and write sizes of 32KByte yields the results outlined in Table 4.3.

Phase    bFS    NFS
I        0.7    0.4
II         6    0.8
III        3    0.4
IV         4    1.6
V         38    4.4

Table 4.3: Andrew benchmark using 32K read/write sizes: bFS vs. NFS

Although the performance is not on par with NFS, the numbers indicate that an enhanced user-level NFS server would be able to achieve performance on par with NFS, especially if it were integrated with a bFS agent.

4.5 Summary

bFS offers functionality not available in any other network file system. The performance of the initial prototype falls short of the goal of matching NFS performance; however, analysis of the performance numbers has shown that the performance of bFS could be on par with NFS given sufficient optimization.

Chapter 5

Related Work

bFS is a unique file system, as it addresses two issues that have not previously been addressed together.
Cryptographic file systems protect users from data exposure due to physical media theft or malicious administrators. Such systems can even sit on top of existing network file system protocols, offering network-based, cryptographically secure file systems. However, the use of a network file system does not imply cross-domain sharing. Current network file systems do not include support for cross-domain sharing, as the historical deployment of these systems was within closed user groups. As stated earlier, the Internet is changing traditional file-system access and requires a more flexible model. That model is the one supported by bFS. Other file systems, such as SFS [9], CFS [2], and TCFS [5], have taken steps toward supporting this new model. SFS still requires server trust and does not solve the cross-administrative-domain sharing problem since, for authenticated users, it simply maps the remote user to a local UID. If Alice wants to grant Bob access to her files but Bob does not have an account on Alice's system, Alice must either create Bob an account, or map Bob to her own UID, hence granting him more privileges than initially intended. SFS also requires software to be installed on the servers. CFS and TCFS solve the server trust issue by encrypting all data, but they do not attempt to solve the sharing problem. bFS offers a practical solution to both problems.

5.1 Cryptographic File Systems

A number of researchers have written about adding cryptography to a file system, but surprisingly few have implemented it. CFS [2] is a cryptographic file system for Unix. An encrypted directory is mounted under /crypt by a user providing the password to the system. The Transparent Cryptographic File System [5] extends CFS by integrating it seamlessly with the file system so that files appearing anywhere in the file system can be transparently encrypted and decrypted upon access.
Encryption is triggered by turning on a new secure bit in the file protection bits, and keys are managed by a separate server process. Another system, called Cryptfs [20], offers functionality identical to the previous two, except that it is implemented as a stackable file-system at the vnode layer. In all three systems, clients trust servers to be secure. Further, all sharing must occur within a single administrative domain.

5.2 Cross-Domain Sharing

SFS [9] implements a secure file system by separating key management from the rest of the file system security apparatus. In SFS, as in CFS and TCFS, the server is assumed to be secure. The key to their design is self-certifying path names: file names that effectively contain the appropriate remote server's public key. This allows a user who is authorized to access a file to do so from anywhere in the Internet without the intervention of any system administrator. However, each incoming file system request is made with the credentials of a user who is known to, and capable of authenticating himself to, the local file server. SFS does not address either of the main goals of bFS: security in the face of untrusted storage providers and sharing across administrative domains. The closest that SFS can come to sharing across administrative domains is via its use of an authserv process which maps remote users into "a set of Unix credentials—a user ID and list of group IDs". By doing so, the owner of a file accessible using SFS can specify access permissions based on the set of users and groups defined by the Unix system administrator.

CIFS [4] presents a global file system image, but does not address either of the main goals of bFS. It assumes that servers are trusted, and does not facilitate sharing across administrative domains. Each security domain in CIFS manages its own user authentication.
Hence Alice can only grant Bob access to her files if she adds him to the local user database. If she does not have the ability to do so, then she must provide him with her credentials.

AFS [8] restricts clients to a set of trusted servers maintained by an administrator. Those properties are opposite to the main goals of bFS.

There are a few other ways for users to securely share data in the Internet; however, these do not involve file systems. Examples include web (HTTP) and FTP access to files using the Secure Socket Layer (SSL), and e-mail using PGP or other cryptographic tools. In these schemes, security is assured via appropriate use of cryptography, but at the cost of stepping outside of any shared file system. One could imagine constructing a file system interface to secure HTTP or secure FTP using the same sort of user-level NFS server that bFS uses. Once one takes care of key management and authentication of the primary user and guests, the resulting system would look just like bFS.

Chapter 6

Conclusions

The Internet is changing the traditional model of file system access by making it easier for users to access their files from a variety of security domains, to share their files with colleagues in other administrative domains, and to use Internet-based storage services which the user may not wish to trust to keep their data private. Existing cryptographic file systems solved the problem of untrusted storage by ensuring that data is stored in encrypted form. There are also other file systems that attempt to provide cross-domain sharing, but they require either a centralized user authentication mechanism or modifications to the storage providers. bFS is unique in that it addresses the issues of server trust and cross-domain sharing together. Server trust is handled by using a cryptographic file-system.
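To make this concrete, the following is a minimal sketch, in Java (the language of the bFS prototype), of per-block encryption as an agent might apply it before data reaches an untrusted server. The class and method names are hypothetical; the parameters (4 KByte blocks, a 128-bit Blowfish key) are taken from the benchmark configuration in Chapter 4. This is an illustration of the approach, not the actual bFS implementation.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.util.Arrays;

// Hypothetical sketch: encrypt file blocks before they leave the agent.
// ECB mode keeps the example short; a deployed system would use a mode
// with per-block IVs so that identical plaintext blocks do not produce
// identical ciphertext on the server.
public class BlockCrypt {
    static final int BLOCK_SIZE = 4096;   // 4 KByte blocks, as in Chapter 4

    private final SecretKeySpec key;

    public BlockCrypt(byte[] keyBytes) {
        // 16 bytes = the 128-bit Blowfish key size used in the benchmarks
        this.key = new SecretKeySpec(keyBytes, "Blowfish");
    }

    public byte[] encryptBlock(byte[] plain) throws Exception {
        Cipher c = Cipher.getInstance("Blowfish/ECB/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, key);
        return c.doFinal(plain);
    }

    public byte[] decryptBlock(byte[] stored) throws Exception {
        Cipher c = Cipher.getInstance("Blowfish/ECB/PKCS5Padding");
        c.init(Cipher.DECRYPT_MODE, key);
        return c.doFinal(stored);
    }

    public static void main(String[] args) throws Exception {
        BlockCrypt bc = new BlockCrypt("0123456789abcdef".getBytes());
        byte[] block = new byte[BLOCK_SIZE];
        Arrays.fill(block, (byte) 'x');
        byte[] stored = bc.encryptBlock(block);   // what the server sees
        System.out.println(Arrays.equals(block, bc.decryptBlock(stored)));
    }
}
```

Only the output of encryptBlock is ever stored, so a malicious administrator sees ciphertext; the key never leaves the agent.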
Other cryptographic file-systems do not require their own set of meta-data, as a single key is used for the entire file-system and cross-domain sharing is not supported. Cross-domain sharing is supported in bFS by using a decentralized authentication scheme. Each user is responsible for managing their own private authentication database that contains the certificates of all users with whom a sharing relationship is established.

The bFS prototype implementation illustrates that the model argued for in this thesis can be deployed using existing network file system infrastructure. Analysis of the prototype's performance revealed that NFS performance can be attained by optimizing the user-level NFS server and by taking some action to overcome the Java performance issues. bFS provides sharing and privacy using untrusted storage by allowing users to place their trust in a simple agent. This agent is the only locus of trust in the system, and the bFS structure of interacting agents provides a user complete control over when, how, and with whom their files are shared.

6.1 Future Work

Of immediate importance is a detailed examination of the Microsoft Virtual Machine, as it is believed to be the primary contributor to performance penalties. Aside from the performance issue, two projects come to mind. The first is the addition of a web interface to the system. The second is the use of decentralized sharing to enable a capabilities-based file-system.

The largest challenge in adding a web interface is defining the trust model. A standard web authentication interface requires that a user enter their username and password. However, bFS requires a certificate and a matching private key for a user to authenticate with an agent. The naive solution is for a user to store their certificate and encrypted private key with the web server, their username and password acting as the decryption key.
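A minimal sketch of this naive scheme follows, assuming a password-derived key is used to wrap the stored private key. The class and method names, and the choice of PBKDF2 and Blowfish, are illustrative assumptions rather than anything prescribed by bFS.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of the naive web-interface scheme: the web server
// stores only the wrapped (encrypted) private key; the wrapping key is
// derived from the user's login password and is never stored.
public class PasswordWrap {

    // Derive a 128-bit key from the password (PBKDF2 is an assumption here).
    static byte[] deriveKey(char[] password, byte[] salt) throws Exception {
        SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
        return f.generateSecret(new PBEKeySpec(password, salt, 10000, 128))
                .getEncoded();
    }

    public static byte[] wrap(byte[] privateKey, char[] password,
                              byte[] salt, byte[] iv) throws Exception {
        Cipher c = Cipher.getInstance("Blowfish/CBC/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE,
               new SecretKeySpec(deriveKey(password, salt), "Blowfish"),
               new IvParameterSpec(iv));   // Blowfish uses an 8-byte IV
        return c.doFinal(privateKey);
    }

    public static byte[] unwrap(byte[] blob, char[] password,
                                byte[] salt, byte[] iv) throws Exception {
        Cipher c = Cipher.getInstance("Blowfish/CBC/PKCS5Padding");
        c.init(Cipher.DECRYPT_MODE,
               new SecretKeySpec(deriveKey(password, salt), "Blowfish"),
               new IvParameterSpec(iv));
        return c.doFinal(blob);
    }
}
```

On login, the web server would call unwrap with the submitted password, which means the plaintext private key must momentarily exist on the server.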
This solution leaves a user's private key with the web server, and is hence unappealing. A much more appealing solution is to store a user's certificate and private key on a smart card. The web interface communicates with a smart card reader on the user's machine to retrieve the user's certificate and to sign data using the user's private key.

Using bFS to create a capabilities-based file-system would require application software to execute in an environment where file-system access is provided by an agent acting as a user with limited access privileges. The existing user-level NFS server would have to be modified to present different file system images for different user processes.

Bibliography

[1] Matt Bishop. Implementation notes on bdes(1). Technical Report PCS-TR-91-158, Department of Mathematics and Computer Science, Dartmouth College, Hanover, NH 03755, April 1991.

[2] Matt Blaze. A cryptographic file system for Unix. In Proceedings of the 1st ACM Conference on Communications and Computing Security, November 1993.

[3] B. Callaghan, B. Pawlowski, and P. Staubach. NFS version 3 protocol specification. RFC 1813, Network Working Group, June 1995.

[4] Microsoft Corporation. Microsoft Networks SMB File Sharing Protocol (Document Version 6.0p). Redmond, Washington.

[5] Dipartimento di Informatica ed Applicazioni of the Università di Salerno. Transparent Cryptographic File System. http://tcfs.dia.unisa.it/.

[6] Driveway. http://www.driveway.com.

[7] M. Eisler. NFS version 2 and version 3 security issues and the NFS protocol's use of RPCSEC_GSS and Kerberos V5. RFC 2623, Network Working Group, June 1999.

[8] John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, and Michael J. West. Scale and performance of a distributed file system. ACM Transactions on Computer Systems, 6(1):51-81, February 1988.
[9] David Mazières, Michael Kaminsky, M. Frans Kaashoek, and Emmett Witchel. Separating key management from file system security. In Proceedings of the 17th ACM Symposium on Operating Systems Principles (SOSP'99), pages 124-139, December 1999.

[10] Microsoft. Windows 2000 IFS Kit. http://www.microsoft.com/HWDEV/ntifskit/.

[11] NetDrive. http://www.netdrive.com.

[12] John K. Ousterhout. Why aren't operating systems getting faster as fast as hardware? Summer USENIX, pages 247-256, June 1990.

[13] J. H. Saltzer, D. P. Reed, and D. D. Clark. End-to-end arguments in system design. ACM Transactions on Computer Systems, 2(4):277-288, November 1984.

[14] Jerome H. Saltzer. Protection and the control of information sharing in Multics. Communications of the ACM, 17(7):388-402, July 1974.

[15] Samba. http://www.samba.org.

[16] Bruce Schneier. Applied Cryptography. John Wiley & Sons, Inc., second edition, 1996.

[17] R. Srinivasan. RPC: Remote procedure call protocol specification version 2. RFC 1831, Network Working Group, August 1995.

[18] R. Srinivasan. XDR: External data representation standard. RFC 1832, Network Working Group, August 1995.

[19] X:drive. http://www.xdrive.com.

[20] E. Zadok, L. Badulescu, and A. Shender. Cryptfs: A stackable vnode level encryption file system. Technical Report CUCS-021-98, Computer Science Department, Columbia University, 1998.

[21] Phil Zimmermann. Pretty Good Privacy. http://www.pgp.com.

