UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Doxpects : XML transformation aspects Minevskiy, Ivan 2006

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-ubc_2006-0587.pdf [ 3.21MB ]
Metadata
JSON: 831-1.0051719.json
JSON-LD: 831-1.0051719-ld.json
RDF/XML (Pretty): 831-1.0051719-rdf.xml
RDF/JSON: 831-1.0051719-rdf.json
Turtle: 831-1.0051719-turtle.txt
N-Triples: 831-1.0051719-rdf-ntriples.txt
Original Record: 831-1.0051719-source.json
Full Text
831-1.0051719-fulltext.txt
Citation
831-1.0051719.ris

Full Text

'I  Doxpects: X M L Transformation Aspects by Ivan Minevskiy B.Sc. Hon., University of British Columbia, 2004  A THESIS SUBMITTED IN P A R T I A L F U L F I L L M E N T OF THE REQUIREMENTS FOR T H E D E G R E E OF M a s t e r of Science  in The Faculty of Graduate Studies (Computer Science)  The University of British Columbia August 2006 © Ivan Minevskiy, 2006  11  Abstract Web services in the context of an Enterprise Service Bus provide an architectural approach to decomposing IT solutions into reusable pieces readily responsive to changing business demands. Developing a web service involves working with two different primary decompositions: the object-oriented design and the X M L document structure. X M L message transformations, such as encryption and adaptation, often manifest  as crosscuttihg concerns. Some existing  middleware standards address such document-oriented concerns by allowing custom Handlers to intercept and transform X M L messages at various times during a service invocation. However, handlers do not support the sound software-engineering principle of "programming to an interface". Their coarse-grained interface makes it difficult to map the design of concerns to the implementation and prevents useful static checking by a language compiler (e.g. checking whether two handlers may perform conflicting actions). To address this and similar design challenges, we have developed a new programming model, called Doxpects, which simplifies the development and maintenance of Handlers. The main benefit of Doxpects is the ability to perform X M L document transformation tasks (insertion, deletion, and replacement) in a familiar object-oriented way. Doxpects enable fine-grained statically-typed handler interfaces and take ' advantage of them.  iii  Contents Abstract  ii  Contents  iii  List of Tables  v  List of Figures  vi  Acknowledgements 1 Introduction  vii 1  1.1 Contributions of this Research  3  1.2 Outline  4  2 Background  6  2.1 Messaging Protocols  6  2.2 J A X - R P C  7  .2.3 X M L Schema  8  2.4XPath  11  2.5XMLBeans  11  3 Message Processing  12  3.1 Message Filters  12  3.2 Content Capture..  12  3.2.1 Automatic Iteration and Recursion  14  3.3 Content Modification  16  3.3.1 Replacement  16  3.3.2 Insertion  19  3.3.3 References to the Input Element Name  22  Contents  3.3.4 Deletion 3.3.5 Advice Interaction 3.4 Response Initiation 4 Type Checking and Dynamic Types  4.1 Modification Type Checking  .  iv  :  24 24 27 29  29  4.1.1 Simple Type Comparison  32  4.1.2 Complex Type Comparison  33  4.1.3 Tests of the Algorithm  38  4.2 Dynamic Types  39  5 Implementation Details  42  5.1 Compilation Process  42  5.2 Doxpect Handler  43  5.3 Doxpect Compilers  44  5.3.1 Reflection API Compiler  44  5.3.2 Command-Line Interface  46  5.4 Connected XMLBeans and the DoxpectBase Class  46  6 Related Work  49  7 Conclusion  51  Bibliography  53  Appendix A: Filtering Messages by Service Endpoints  56  Appendix B: Graph Comparison Algorithm  58  List of Tables Table 3.1: Sample Pointcut annotations  13  Table 3.2: The type restrictions for input and output parameters  17  Table 3.3: The insertion actions taken for different types of parameters.  19  Table 4.1: A summary of the facet support levels  33  Table 4.2: An overview of X M L Schema language features used in eBay, FedEx, Amazon and Google Search schemas  37  Table 4.3: A summary of comparison results of several eBay schema versions  38  Table 5.1: The possible options for the command-line Doxpects compiler  45  vi  List of Figures Figure 1.1: The Enterprise Service Bus architecture  2  Figure 2.1: SOAP intermediaries  7  Figure 2.2: The J A X - R P C Handler interface  8  Figure 2.3: The containment hierarchy of essential X M L Schema elements  9  Figure 2.4: Sample X M L Schema  10  Figure 2.5: A sample instance document of the schema from Figure 2.4  10  Figure 3.1: A decryption advice  15  Figure 3.2: A dimension units validation handler  18  Figure 3.3: A directions insertion handler  21  Figure 3.4: An address validation advice  23  Figure 3.5: An example of two request advices which are deemed conflicting by the interaction analysis in our previous work.  25  Figure 3.6: An X M L Schema used by advices in Figure 3.5  26  Figure 3.7: A response initiation example  28  Figure 4.1: An interoperability handler with type checking annotations  31  Figure 4.2: A recursive type definition found in an eBay schema  34  Figure 4.3: The effective occurrence interval  36  Figure 4.4: A case where our algorithm computes a wrong occurrence interval  36  Figure 4.5: An advice with a DynPointcut annotation and a dynamic type  41  Figure 4.6: An advice with a DynunionPointcut annotation and a dynamic type  41  Figure 5.1: A sample deployment descriptor of the DoxpectHandier for an Apache Axis server Figure 5.2: A sample X M L instance not conforming to its X M L Schema , Figure 5.3: An encryption advice  43 47 47  Acknowledgements I would like to thank Eric Wohlstadter for being my supervisor. His support, guidance, and encouragement  made this work possible. Working with him was an amazing learning  experience. I would also like to.thank Kris De Voider for being my second reader. Finally, thanks to my family for all of their love and support.  IVAN  The University of British August 2006  Columbia  MINEVSKIY  1  Chapter 1  Introduction The Enterprise Service Bus (ESB) is an architectural pattern that is increasingly gaining the attention of architects and developers, as it provides an effective approach to optimize the distribution of information between different types of applications across multiple locations [23]. The architecture, depicted in Figure 1.1, is centered on a bus which provides a common backbone through which applications can interoperate. The bus provides message delivery services, based on standards such as HTTP and SOAP. Many enterprises find ESB attractive because it combines features from previous technologies with new services, such as message validation, transformation, content-based routing, security, load balancing, etc [11,21]. ESB makes applications oblivious of the routing services and transport protocols details, and allows substitution of service implementations as needed. The ESB architecture can be seen as a distributed version of the mediator pattern: the bus serves as a mediator that has detailed knowledge of all applications; applications send messages to the bus when needed and the bus passes them on to any other applications that need to be informed. Web services in an ESB context provide an architectural approach to decomposing IT solutions into reusable pieces readily responsive to changing business demands. Developing a web service involves working with two different primary decompositions: the object-oriented design and the X M L document structure. As pointed out in the literature, developers often get frustrated when mitigating conceptual mismatches between the decompositions [2,26]. The existing approach to modularizing document-oriented concerns in Web Services is through the use of specialized APIs. Some existing middleware standards, such as the Java X M L Remote Procedure Call (JAX-RPC), allow custom Handlers to intercept and transform a SOAP message at various times during a service invocation [17]. Handlers can manage various  Chapter 1. Introduction  2  Service Clients Client  V  Client  Client  Client  1  (  Routing  )  Enterprise S e r v i c e B u S  Security  )  . (^WessagefVa lid a tip n )  'Serviceviii  Service  Service  ^Transformation) Load Balancing)  Service  Service Providers  Figure 1.1: The Enterprise Service Bus architecture.  crosscutting concerns such as logging and auditing, encryption and decryption, and so on. However, they do not support the sound software-engineering principle of "programming to an interface". Their coarse-grained interface makes it difficult to map the design of concerns to the implementation and prevents useful static checking by a language compiler (e.g. checking whether two handlers may perform conflicting actions). Aspect-Oriented Programming (AOP) helps to localize code related to a crosscutting concern (aspect) in a single logical unit (aspect definition), and to define points (joinpoints) in the dynamic call graph of a program where aspect-related functionality should be inserted (woven) and executed [24]. With aspects, concerns can be kept separate and be woven together with other concerns during compilation. Moreover, aspects improve modularization and make the system oblivious of the new concerns added into it. In our previous work, we have introduced a new aspect-oriented programming model, called Doxpects, which simplifies the development and maintenance of Handlers [30]. The main benefit of Doxpects is the ability to perform X M L document transformation tasks (insertion, deletion, replacement) in a familiar object-oriented way. We expand upon the standard AspectJ joinpoint model by including first class support for document content properties. A content-  Chapter 1.  Introduction  3  based pointcut provides the programmer a convenient way to express properties of joinpoints where X M L messages are being processed. A property of a message is expressed in the pointcut using XPath expressions which may be joined together using the standard pointcut logical operators. The document content matched by a pointcut is parsed into object-oriented entities with XMLBeans [32] to provide a natural Java-based programming model for document-oriented aspects. An advice does not have to perform the actual transformation, but its signature must contain Java annotations specifying where and how the transformation should be performed. Doxpects can perform useful compile-time checking of pointcuts to detect situations where a parameter type is incompatible with the content matched by a pointcut. The compiler can analyze advices to detect ones with conflicting transformations and to verify that the transformation output will be valid at runtime or that the transformations will succeed without runtime errors. For situations when a pointcut matches elements with different types, but with similar content, Doxpects provide a simple way to access and modify content common to the matched elements. The doxpect compiler can create a new, dynamic, X M L type containing the common child elements of the matched elements. The schema of the new type is compiled into XMLBeans which provide get and set methods. Alternatively, the compiler can be instructed to generate a dynamic type corresponding to a union of child elements.  1.1 Contributions of this Research This thesis contributes a further step towards providing a clean mapping from design to implementation of handlers with a useful static checking which takes advantage of a well specified interface. We provide a detailed description of old and new Doxpects features and show how they can help to develop and customize web-services. In particular, we offer the following contributions: >  •> Reimplemented the Doxpects framework to be able to write handlers in Java 5 language using annotations (Chapter 3).  Chapter 1.  •  Introduction  4  Enabled handlers to filter messages by. service endpoint URIs (Section 8).  •:• Developed a compile-time checking of pointcuts to detect incompatibilities between a parameter type and the content matched by a pointcut (Section 3.2). •  Added insertion and deletion transformations on top of the previously introduced replacement transformation (Sections 3.3.2 and 3.3.4).  •  Improved the compile-time analysis of advice interactions by taking advice pointcuts into account. Before, two advices could be deemed conflicting even if the elements they matched were in different and non-overlapping parts of a message (Section 3.3.5).  •  Enabled any request advice to terminate a request flow and initiate a response (Section 3.4).  •:• Improved the compile-time analysis of message content transformations by using pointcuts to get an approximate context in which the transformations might happen (Section 4.1). •  Introduced dynamic types, which provide a simple way to access and mutate content common to matched elements of different types (Section 4.2).  1.2 Outline We begin by presenting the background in Chapter 2. In Chapter 3, we delve into details and explain how a handler in the Doxpects framework can intercept, filter, and process messages. First, in Section 3.1, we present message filtering by service endpoint U R I and by message flow. We then describe how message content can be captured and transformed (Sections 3.2 and 3.3) and present our compile-time analysis of advice interactions (Section 3.3.5). At the end of Chapter 3, we show how a request advice can initiate a response (Section 3.4). In the first section of Chapter 4, we describe the compile-time analysis of message content transformations. In Section 4.2, we present dynamic types. In Chapter 5 we discuss the implementation details. In Section 5.1 we outline the compilation process and present the two available compilers: one based on Java Mirror A P I and the Annotation Processing Tool, and one based on the Java  Chapter 1.  Introduction  5  Reflection API. Then, in Section 5.2, we explain how to write a deployment descriptor for a doxpect handler. We conclude Chapter 5 with more details on the Reflection API compiler and a summary of the command-line interface. The related work in the areas of X M L transformation languages is presented in Chapter 6. Finally, in Chapter 7, we conclude and offer suggestions for future work. Most of the examples throughout the thesis are based on the X M L Schemas of the FedEx web-service. The service is used worldwide to ship and track items like letters and parcels.  FedEx: http://www.fedex.com/  1  6  Chapter 2  Background 2.1 Messaging Protocols Many web services use Simple Object Access Protocol (SOAP) messages to communicate with clients [28]. SOAP is an XML-based protocol that enables a completely interoperable exchange between clients and web services. SOAP messages are transmitted over HTTP, which is a commonly used request-and-response standard for sending messages over the Internet. Remote Method Invocation (RMI) is the biggest competitor of SOAP for communication in the Java space [20]. Under RMI, entire objects can be passed and returned as parameters. R M I has been available since J D K 1.02, and it is still popular with its use in technologies such as Enterprise JavaBeans (EJB). However, R M I is limited to Java Virtual Machines and cannot interface with other languages. Common Object Request Broker Architecture (CORBA) is a competing distributed systems technology that offers greater portability than R M I [9]. Unlike RMI, C O R B A is not tied to one language. Its only limitation is that a language must have a C O R B A implementation written for it. However, for Java developers, C O R B A offers less flexibility, because it does not allow executable code to be sent to remote systems. We choose to base the Doxpects framework on SOAP because it offers a much easier means of messaging and communication than do Java R M I and C O R B A . First, it is completely platform-independent, which enables greater interoperability between the client and server. Unlike C O R B A and RMI, it is human readable, very lightweight, and it is not necessary to have stub objects on client machines. It is great for asynchronous communication and for loosely  Chapter 2.  Background  Service .Client>t^4 > (jntermediary^ 4 » • • • 4 »(^Intermediary^ 4 »(^ervice Provider  Figure 2.1: SOAP intermediaries.  coupled clients and servers. Also, unlike XML-based SOAP messages, R M I and C O R B A messages have binary content,  which makes it hard to perform low-level  message  transformations. Finally, SOAP is designed with intermediaries in mind An intermediary is an entity that provides additional functionality and value-added services by processing parts of a SOAP message as it travels between a client and service provider. There can be many intermediaries between a client and service, and each of them can both accept and forward messages (Figure 2.1). An intermediary can be dynamically added and removed, providing a flexible way to extend the functionality of the client, service, and other intermediaries. SOAP intermediaries are widely used and provide functions like message filtering, transformation, and caching.  2.2  JAX-RPC  J A X - R P C handlers define an architectural means for an application to access a raw SOAP message of a web service request or response [17]. The Handler API defines three methods for message handling, as shown in Figure 2.2. They handle SOAP requests, responses and faults, respectively. On the client side, the handler's handleRequest method is invoked right after the proxy marshals the message. The handieResponse or h a n d l e F a u i t method is invoked just before the proxy demarshalls the message and returns it to the client. On a service endpoint, the handleRequest method is called before the message is.dispatched to the target service endpoint; the handieResponse or h a n d l e F a u i t method is called just before the container marshalls the message and returns it to the client. Generally, the h a n d l e F a u i t method may be called in the event of errors in endpoints, containers, or handlers themselves.  Chapter 2.  8  Background  package j a v a x . x m l . r p c . h a n d l e r ; p u b l i c i n t e r f a c e Handler { b o o l e a n handleRequest(MessageContext b o o l e a n handleResponse(MessageContext  context); context);  boolean handleFault(MessageContext c o n t e x t ) ; }  Figure 2.2: The J A X - R P C Handler interface.  On both client and server, a handler may be configured either programmatically or in a deployment descriptor. A deployment descriptor associates a handler with a particular web service endpoint and may specify any handler-specific configuration information. Because deployment descriptor information is declarative, it can be changed without the need to modify the source code. Handlers are permitted to modify the messages passed to them. If multiple handlers that are involved in one service invocation need to share information, they can do so by adding properties to the message context as it is passed from handler to handler.  2.3 X M L Schema The syntax of an XML-based language is defined using a dictionary of elements and attributes together with rules constraining their use. There exist several schema languages, such as D T D [3], X M L Schema [31], or DSD2 [27], which allow the syntax to be formalized. An X M L document is valid relative to a given schema if all the syntactic requirements specified by the schema are satisfied in the document. The W3C X M L Schema Definition Language is an X M L language for describing and constraining the content of X M L documents. The purpose of a schema is to define a class of X M L documents, and so the term "instance document" is often used to describe an X M L document that conforms to a particular schema.  Chapter 2. Background  9  <schema> ^ <secjuence>  —•{  <any> '.—  /  f  /  <choics>'  <element> 1\ <complexType>j^—<all>  y  x  \\  \\ <attribute> J) Y<anyAUribule>^ V /  -W '<element> |  1  <restriction> j  ^  <facet>  -»(<sirTipleType>^—<union>" <list> *{ <attribute>  Figure 2.3: The containment hierarchy of essential X M L Schema elements.  An X M L Schema is composed of the top-level schema element. The schema element contains type definitions (simpieType  and complexType  elements) and attribute  and  element declarations. In addition to its built-in data types (such as integer, string, and so on), X M L Schema also allows for the definition of new data types using the simpieType and complexType elements. Figure 2.3 depicts the containment hierarchy of essential Schema elements. For example, the schema in Figure 2.4 has one top-level element book of type bookType, which has three child elements: t i t l e , pages, and character. A possible instance document of the schema is shown in Figure 2.5. Unlike other schema definition languages, W3C X M L Schema lets us define the cardinality of an element (i.e. the number of its possible occurrences). We can specify both minOccurs (the minimum number of occurences) and maxOccurs (the maximum number of occurrences).  Chapter 2.  10  Background  <schema xmlns="http://www.w3.org/2001/XMLSchema"> <element name="book" type="bookType"/> <complexType name="bookType"> <seguence>  ^  <element n a m e = " t i t l e " t y p e = " s t r i n g " / > <element name="pages" type="decimal"/> <element name="character" t y p e = " s t r i n g " minOccurs="0" maxOccurs="unbounded"/> </sequence> </complexType> </schema>  Figure 2.4: Sample X M L Schema.  <book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <title>Romeo and J u l i e t < / t i t l e > <pages>100</pages> <character>Romeo</character> <character>Juliet</character> </book>'  •  Figure 2.5: A sample instance document of the schema from Figure 2.4.  For example, element c h a r a c t e r has maxOccurs set to unbounded, which means it may occur  any number of times. Both attributes have a default value of one. In an instance document, elements within a sequence structure must appear in the same order as in the schema. On the other hand, elements within an a l l structure may appear in any order. The choice structure allows only one of its child elements to appear in an instance document. However, note that sequence and c h o i c e structures may have minOccurs and/or  maxOccurs attributes which effectively allows their content to occur multiple times in an instance document. The any element allows any elements from specific namespaces to appear in an instance document in place of it. Similarly, a n y A t t r i b u t e has the same meaning for attributes.  Chapter 2.  11  Background  2.4 XPath XPath is a domain specific language used primarily to address portions of an X M L document [5]. An XPath expression is written as a sequence of slash-separated steps to get from one X M L node (the current 'context node') to another node or set of nodes. The simplest XPath takes a form such as / A / B / C , which selects C elements that are children of B elements that are children of the top-level A element in a X M L document. Two special operators are available to quantify over multiple possible elements: "*" and "//". A "*" fills in as a wildcard for any possible document element while the "//" fills in as a wildcard for any possible document sub-tree (essentially a closure over the "*" operator). For example, the expression / A / / B / * selects all child elements of a B element that itself is a descendant ('//') of a top-level A element. Also, predicate expressions can be specified in square brackets after any step. For example, / / x [ @id= ' F o o ] will match all x elements with an i d attribute which value is 1  FOO.  2.5 X M L B e a n s XMLBeans is a Java-to-XML binding framework which is part of the Apache Software Foundation X M L project [32]. XMLBeans map X M L and X M L Schema features as naturally as possible to the equivalent Java language and typing constructs. From a schema written in X M L Schema language it can generate a collection of Java interfaces and classes representing an object model of the corresponding X M L , documents. X M L data may then be processed as Java objects using the standard "get" and "set" JavaBeans pattern. Methods for marshalling and unmarshalling are automatically generated, and the mapping between X M L and Java can be controlled by specifying explicit bindings. In our programming model, document elements which are captured in a content-based pointcut are automatically made available to programmers using a statically typed XMLBeans representation. Another bean, called the replacement, is made automatically available to advice so that transformation is achieved simply by filling in the values of the new replacement bean. This feature provides the bridge necessary where pointcuts are expressed in one language (XPath) and behaviour is expressed in another language (Java).  12  Chapter 3  Message Processing In this chapter we describe how a handler in the Doxpects framework (a doxpect for short) can intercept, filter, and. process messages. We first present message filtering by service endpoint URI and by message flow (request, response, and fault). We then describe operations with message content, which can be grouped into two categories: capture and modification. We continue by describing our compile-time analysis of advice interactions. In conclusion, we show how a request advice can terminate a request flow and initiate a response.  3.1 Message Filters Doxpects can filter messages by a flow direction and by a service endpoint. The flow direction filters are implemented by the annotations  Request, Response,  and  Fault.  One of these three  annotations must be defined on every method that should be considered as an advice. Advices with Request are applied to request messages, with Response - to response messages, and with Fault - to fault messages. A doxpect may have more than one advice of each type. Advices of the same type are executed in the order of their appearance in the doxpect source code. To filter messages by a service endpoint, one has to use the class-level annotation Target. This annotation is described in details in Appendix A .  3.2 Content Capture All operations with message content can be grouped into two categories: capture and modification. Correspondingly, there are two groups of annotations designed to help with these tasks. No parameter can have annotations from both groups and, therefore, for simplicity, we  Chapter 3. Message Processing  Matched items  Pointcut  all f d x : A d d r e s s elements in  ©Pointcut("//fdx:Address")  the SOAP Body element the x e n c : E n c r y p t e d K e y element from  ©Pointcut(root = RootType.HEADER, value  13  = "ws'se : S e c u r i t y / x e n c : E n c r y p t e d K e y " )  the w s s e : S e c u r i t y header element in  the SOAP Header element the SOAP Body element  ©Pointcut(".") .  Table 3.1: Sample P o i n t c u t annotations.  will call a parameter with a capture annotation an input parameter and a parameter with a modification annotation - an output parameter. The only required annotation for an input parameter is  Pointcut,  which matches content  within a message. It has two parameters: v a l u e and r o o t . The v a l u e parameter defines an  XPath expression; the r o o t parameter defines a context for the XPath expression. The context can be the SOAP Body (default) or Header element of the current message, or an element matched by the P o i n t c u t of another parameter in the same advice. Table 3.1 has some examples of the P o i n t c u t annotation. You may have noticed that XPath expressions use prefixes, but do not bind them to namespaces. The binding is done in a class-level  include  annotation. Its value is a list of  URI-to-prefix mappings of the form " U R I a s p r e f i x " , where themselves. Both  URI  URI  and p r e f i x stand for  and p r e f i x must have no spaces inside them. For example:  @Include({"http://www.w3.org/2001/04/xmlenc# a s xenc", " h t t p : / / w e b s e r v i c e . f e d e x . c o m / v 2 006 0404/ a s f d x " } )  The content matched by a pointcut is made directly available to a doxpect advice through the value of the corresponding advice parameter, providing a fine-grained interface between documents and doxpects. An input parameter must have a type which is either a subtype of the xmiobject  class, which is the root of the XMLBeans type hierarchy, or a subtype of the  XmlObject  [ ]. In the former case, the parameter captures only the first matched element. If the  compiler detects that the pointcut may match multiple element, it gives a warning. A n array  Chapter 3. Message  14  Processing  parameter captures all matched elements. In both cases, the content to be captured is required to have the same type as the X M L Schema type from which the parameter type was generated by an XMLBeans compiler. If the compiler detect that the parameter type is incompatible with the content matched by the pointcut, it gives a warning. To get an approximate content matched by a pointcut, the compiler evaluates the pointcut in the context of schema documents. The approximations we employ in evaluating a pointcut at compile-time are described in Section 4.1. Consider a situation when one pointcut has to be evaluated in the context of elements matched by another pointcut. For example, one might want to have separate input parameters for several descendants of an element captured by another input parameter. To avoid repeating the XPath to the parent element in pointcuts of its descendants, one could use the annotation  Mark.  This pointcut can make the content captured by an advice parameter to be the context for a pointcut of another parameter. This annotation has one integer argument which value may range from 1 (default) to 9 inclusive. The value of a M a r k annotation becomes a numeric index of its advice parameter, which can be referred from a P o i n t c u t annotation of another parameter in the same advice. In a method signature, a context parameter does not have to be defined before the parameters which use it. Mutual or looped context dependencies are not allowed. For example, to match all f d x : A d d r e s s elements in the context of a parameter with an annotation ©Mark ( 3 ) ,  one would have to write an advice with a signature similar to the following one:  public void  checkAddr(  ©Pointcut("/fdx:ShipRequest") ©Mark(3) ShipRequestXmlBean  request,  ©Pointcut(root=Pointcut.RootType.MARK3, AddressXmlBean[]  value="//fdx:Address")  addresses)  3.2.1 Automatic Iteration and Recursion Often, a pointcut might match more than one element. To get access to all matches through an advice parameter, the parameter must have an array type. The most usual use case in such situations is to iterate over the array and do something with each matched element on its own. The method-level  Each  annotation provides a simple means of iterating over all such matches  automatically. When an advice has this annotation, it is executed once for each element matched by the pointcut of the first parameter. Moreover, a pointcut query, which has the value of the first  Chapter 3. Message  15  Processing  ©Request ©Each ©Recursive public void  decrypt( ©Pointcut(value = " / / x e n c : E n c r y p t e d D a t a " ) ©In  EncryptedDataTypeXmlBean  ©Out V e c t o r < X m l O b j e c t >  encData,  decData,  ©Pointcut(root = RootType.HEADER, v a l u e = "./wsse:Security[©soapenv:actor= wss4j']") 1  SecurityHeaderTypeXmlBean  secHeader)  {  WSHandlerResult try  decResult  = null;  {  ,  DoxpectSecurity  w s s 4 j = new D o x p e c t S e c u r i t y ( s e c P r o p s ) ;  wss4j.prepareReceiver(getMessageContext(), decResult } catch /*  = wss4j.decrypt(encData);  (WSSecurityException  Elided,  secHeader);  s e ) { /* Details  elided  */ }  copy data from decResult to decData */  }  Figure 3.1: A decryption advice recursively decrypts all x e n c : E n c r y p t e d D a t a body elements.  parameter as its context, is re-executed at each step of an automatic iteration and, therefore, usually has results which vary between steps. A pointcut of a parameter other than the first one, which does not have the first parameter as a context, is executed once and, therefore, has the same value at each step of an automatic iteration. Consider a situation where an advice modifies a message in such a way that this same advice has to be applied to the message again. For example, this happens in cryptography when some data is encrypted twice (super-encryption) [15]. In this case, a decryption advice replaces an encrypted element with a decrypted content, which itself turns out to be an encrypted element. Since the decryption advice most likely does not know how many layers of encryption an element has, it is useful to be able to automatically apply the decryption advice to the element until it is no longer encrypted. The method-level . R e c u r s i v e annotation provides just that (Figure 3.1). It continues to execute an advice until the pointcut of one of the parameters matches nothing.  Chapter 3. Message Processing  16.  3.3 Content Modification 3.3.1  Replacement  Although a doxpect advice may behave just like a standard AspectJ advice by executing extra code, doxpect advice are specially suited to perform transformation over captured document content. In an advice signature, any input parameter can be paired with an output parameter with the help of parameter annotations in and Out. The annotation i n must be declared on every input parameter to be replaced. The annotation has one parameter - pair index, which is equal to one by default. The corresponding output parameter must have an Out annotation with the same pair index. Initially, on advice invocation, an output parameter is an empty placeholder bean (or an empty vector of beans) of the type specified in the advice signature. Upon advice termination, the content captured by a parameter with the corresponding i n annotation is replaced by the content stored in the output parameter. The annotation Out, in addition to the pair index parameter, takes an X M L name of the replacement element. This name is required because the Java type of a replacement parameter only indicates the X M L type of the replacement element, which is often different from the desired name of the X M L element. A replacement specifies a contract that an element matched by a pointcut will be replaced by another element of the same or a different type in the advice. The advice interface along with the pointcut tells us not only how the types will be converted but also, from the pointcut, where in the document this will take place. Programmers can fill in the data of the replacement variable and the transformation is automatically executed on the underlying document when the advice exits. The contract information is used for checking pointcut types and advice conflicts. A replacement can be either one-to-one or one-to-many. In the former case, each captured element is replaced with exactly one output element. When an input parameter is capturing a single element and, therefore, is a subtype of xmiobject, the corresponding output parameter must store exactly one output element and, therefore, must be a subtype of xmiobject as well. On advice invocation, the output parameter contains an empty bean. Similarly, an input parameter which is a subtype of xmiobject [ ] requires the corresponding output parameter to be  Chapter 3. Message  Processing  17  Possible types Input parameter Output parameter  a subtype of XmlObject  a subtype of XmlObj e c t [ ]  a subtype of XmlObject,  a subtype of XmlOb j e c t [ ],  Vector<? extends XmlObject>  V e c t o r [ ] < ? e x t e n d s XmlObject>  Table 3.2: The type restrictions for input and output parameters.  an array of an XmlObject subtype as well. On advice invocation, the output array contains empty beans and has the same size as the input array. In the one-to-many case, each captured element is replaced with any number of output elements (zero output elements is equivalent to the removal of the input element). When an input parameter is capturing a single element, the corresponding output parameter must have a vector<x extends xmlobject> type. On advice invocation, the output parameter contains an empty v e c t o r with a component type x. Similarly, an input parameter which is a subtype of XmlObject [] requires the corresponding output parameter to be a v e c t o r []<x  extends  xmlobject>. On advice invocation, the output parameter contains an empty array of empty v e c t o r objects with component type x, and the output array has the same size as the input array. Table 3.2 summarizes the type restrictions for replacement parameter pairs. The annotation Each can be used to automatically iterate over an array of replacement parameter pairs. As with simple iteration in Section 2.1.1, the input parameter must be the first advice argument. The corresponding output parameter is assigned an empty value at the beginning of each iteration, and, therefore, must be either a subtype of x m l o b j e c t or a Vector<? extends XmlObject>.  For example, the FedEx web-service only supports centimetres (CM) and inches (FN) as dimension units. The v a l i d a t i o n H a n d l e r in Figure 3.2 has the v a l i d a t e u n i t s advice which converts dimensions in other units, such as millimeters (MM), to C M or FN. The handler intercepts all messages having the FedEx service as an endpoint. The advice further filters out the request messages. The annotation Each instructs the advice to iteratively apply itself to each fdx d i m e n s i o n s element matched by the pointcut of the first advice parameter. At the i iteration step, the advice is invoked with the following parameter values: the parameter inDims  t h  Chapter 3. Message  @Doxpect  // This annotation identifies  18  Processing  the class  as a doxpect handler  ©Include({"http://webservice.fedex.com/v20060404/ a s f d x " } ) ©Target({"http://["/]*/axis/services/fedex"}) public  c l a s s V a l i d a t i o n H a n d l e r extends DoxpectBase  {  ©Request ©Each p u b l i c v o i d v a l i d a t e U n i t s (©Pointcut ('"//fdx: D i m e n s i o n s " ) ©In D i m e n s i o n s X m l B e a n  inDims,  ©Out D i m e n s i o n s X m l B e a n  outDims,  ©Pointcut("//fdx:RequestHeader") RequestHeaderXmlBean  reqHeader)  {  String  inUnits  = inDims.getUnits();  //  CM and IN units  do not need to be converted  if  (inUnits.equals("CM")  || i n U n i t s . e q u a l s ( " I N " ) )  {  outDims.set(inDims); return; //  Convert MM to CM  } else  i f (inUnits.equals("MM")) {  outDims.setLength(inDims.getLength()/10); outDims.setWidth(inDims.getWidth()/10); outDims.setHeight(inDims.getHeight()/10); outDims.setUnits("CM"); }  /* O t h e r  II  conversions are elided */  Respond with an error  if  respondWithError(reqHeader,  units  are not supported  "ER-0125",  "Dimension u n i t s  "+ i n U n i t s  +" a r e n o t s u p p o r t e d "  } }  •Figure 3.2: A dimension units validation handler.  Chapter 3. Message Processing  Possible output type  Action  a subtype of X m l O b j e c t  The output element is inserted before or after the first matched element.  a subtype of X m l O b j e c t [ ] Vector<? extends  XmlObject>  Vector[]<? extends  XmlObject>  19  One output element is inserted before or after each matched element. All output elements are inserted before or after the first matched element. All output elements in a Vector are inserted before or after each matched element.  Table 3.3: The insertion actions taken for different types of parameters.  contains the i  t h  fdx:Dimensions  element found in the message body; the parameter o u t D i m s is  an empty placeholder; and the parameter r e q H e a d e r contains the f d x : R e q u e s t H e a d e r element, which is unique in a message. When the advice is invoked, it uses g e t methods of the i n D i m s parameter and s e t methods of the o u t D i m s parameter to get and convert dimension units. If the units cannot be converted, the advice uses the information in the r e q H e a d e r to create and send back to the requestor a response with an error message (more details on response initiation can be found in Section 3.4). 3.3.2  Insertion  Doxpects provide two parameter annotations for new element insertions. A parameter annotation  insertBefore  indicates that, upon advice termination, the value of the parameter  will be inserted before the content matched by the pointcut. Similarly, an annotation JnsertAfter  indicates that the value of the parameter will be inserted after the matched  content. In both cases, the content captured by the pointcut is not.available to the advice. The annotations i n s e r t B e f o r e and i n s e r t A f t e r have one parameter which, similar to the name  parameter of the annotation out, defines the X M L name of the element to be inserted.  A parameter with one of the insertion annotations is an output parameter and, therefore, can have any type suitable for an output parameter as defined in Table 3.2. The specific insertion action taken for different parameter types are outlined in Table 3.3.  Chapter 3. Message Processing  J  20  The annotation Each can be used to automatically iterate over all elements matched by a pointcut of an insertion parameter. As with simple iteration in Section 3.2.1, the parameter must be the first advice argument. For example, the handler in Figure 3.3 has two advices which insert FedEx outlet addresses and directions to response messages. The first advice, i n s e r t D i r s P e r L o c a t i o n , inserts a f d x : D i r e c t i o n s element with driving directions into each f d x : L o c a t i o n element found in fdx:FDXFedExLocatorRepiy messages. The first parameter, l o c a t i o n , provides access to the content of fdx: L o c a t i o n element from which we obtain the ID of the FedEx outlet. The second parameter provides a placeholder for the driving directions; its pointcut defines the element after which the f d x : D i r e c t i o n s element should be inserted. In the advice body, the ID is mapped to the driving directions which are saved in the d i r e c t i o n s object. Upon advice termination, the value of the d i r e c t i o n s parameter is inserted as the last child of the f d x : L o c a t i o n element (the f d x : L o c a t i o n schema allows any elements at the end of its child sequence). The Mark annotation is used to correlate the pointcut values: the pointcut of the parameter d i r e c t i o n s should be evaluated in the context of the element matched by the pointcut of the parameter location.  Similarly, the second advice inserts fdx: s t a t i o n A d d r e s s and f d x : D i r e c t i o n s elements into the parent element of each f d x : D e s t i n a t i o n S t a t i o n i D found in any FedEx response message. The first parameter, s t a t i o n i D , provides access to the ID of the FedEx station. The second parameter, address, provides a placeholder for the address. The pointcut of address defines the element after which the fdx: s t a t i o n A d d r e s s element should be inserted - after the last child of the s t a t i o n i D ' s parent element. The last parameter, d i r e c t i o n s , provides a placeholder for the driving directions. Its pointcut says that the f d x : D i r e c t i o n s element should be inserted at the same place as the fdx: s t a t i o n A d d r e s s element.  Chapter 3. Message  21  Processing  ©Doxpect ©Target({"http://[ /]*/axis/services/fedex"}) A  ©Include({"http://webservice.fedex.com/v20060404/ as f d x " } ) public class DirectionsHandler  extends  DoxpectBase  {  ©Response ©Each public void insertDirsPerLocation( ©Pointcut("fdx:FDXFedExLocatorReplyVfdx:Location") ©Mark LocationXmlBean l o c a t i o n , ©Pointcut(root=RootType.MARK1, value=" . / * [ l a s t ( ) ] " ) ©InsertAfter("fdx:Directions") XmlString {  directions)  '  directions.setStringValue( Locations.getLocationDirections(location.getBusinessID())); }  ©Response ©Each public void'insertDirsPerlD( ©Pointcut("//fdx:DestinationStationID") ©Mark XmlString  stationID,  ©Pointcut ( r o o t = P o i n t c u t . RootType . MARK1, value= " . . / * [ . l a s t () ] ") ©Mark(2) . ©InsertAfter("fdx:StationAddress") XmlString  address,  ©Pointcut(root=Pointcut.RootType.MARK2,  value=".")  ©InsertAfter("fdx:Directions") XmlString  directions)  {  address.setStringValue( Locations.getLocationAddress(stationID.getStringValue())); directions.setStringValue( Locations.getLocationDirections(stationID.getStringValue())); }  }  Figure 3.3: A directions insertion handler.  22  Chapter 3. Message Processing 3.3.3  R e f e r e n c e s to the I n p u t E l e m e n t N a m e  Often the name of the output X M L element should be the same as or similar to the name of the matched element. A special construct {$in} can be used to refer to the name of the original element from the name parameter of the Out, I n s e r t B e f o r e , or i n s e r t A f t e r annotation. For example, the following out annotation defines the name of the replacement element as the name of the original element with suffix "2": ©Out(name="{$ln}2")  It is also possible to get a substring of the original name. For example, to get a substring [3, length-2], one would write ${In[3,L-2] },  where L identifies the length of the string captured by $in. For instance, consider the advice in Figure 3.4. The advice validates addresses present in a FedEx • request  message.  Depending on  the  message  type  (e.g.  FDXShipRequest,  FDXRateRequest, etc), an address element might have one the following names: fdx:Address, f d x : O r i g i n A d d r e s s , or f d x : D e s t i n a t i o n A d d r e s s . The advice cleans up the address fields (e.g. removes unnecessary spaces, resolves country name abbreviations, etc) and ensures they have the valid syntax. If the validation fails, the advice creates and sends back to the requestor a response with an error message (see Section 3.4 for more details on response initiation). Otherwise, if all fields are valid, we want to replace the original address element by a cleaned address element with the same name. However, since we do not know the name of the address element at compile-time, we cannot specify the output element name in the Out annotation. In this case, the {$in} construct comes in handy: setting the name parameter of the out annotation to " fdx: $ { i n } " makes the name of the replacement element equal to the name of the matched element.  Chapter 3. Message  23  Processing  ©Request ©Each public void validateAddress( ©Pointcut("//fdx:Address | " + "//fdx:OriginAddress  | " + •  "//fdx:DestinationAddress") ©In AddressXmlBean  inAddr,  ©Out(name="fdx:${In}") AddressXmlBean outAddr, ©Pointcut("//fdx:RequestHeader") RequestHeaderXmlBean reqHeader) {  S t r i n g B u f f e r e r r o r M s g = new S t r i n g B u f f e r ( 1 2 8 ) ; String l i n e l = inAddr.getLinel().trim().replaceAll(" if  ", " " ) ;  ( l i n e l . l e n g t h ( ) == 0) errorMsg.append("Address/Linel  i s a r e q u i r e d element and " +  "cannot be empty."); e l s e i f ( l i n e l . l e n g t h ( ) > MAX_LINE1) errorMsg.append("Address/Linel  cannot be l o n g e r t h a n " + MAX_LINE1 +  " characters.\n"); /* *  if  The validation is  similar  of Line2,  City,  to the above and is  State,  Country, and Postal Code  elided */  ( e r r o r M s g . l e n g t h ( ) > 0) r e s p o n d W i t h E r r o r ( r e q H e a d e r , "ER-0057", " V a l i d a t i o n e r r o r s : \n"+ e r r o r M s g . t o S t r i n g ( ) ) ;  else { outAddr.setLinel(linel); /*  Elided:  copy other address components */  } }  Figure 3.4: An address validation advice.  Chapter 3. Message 3.3.4  24  Processing  Deletion  To remove the content matched by a pointcut from the underlying message, one has to annotate the corresponding parameter with a  Remove  annotation. Similarly to input parameters discussed  before, the parameter captures the content matched by the pointcut, but this content is deleted from the message upon advice termination. 3.3.5 A d v i c e  Interaction  When several advices mutate the same part of the message, they could interfere with each other. In such situations, the order in which the advices are executed may be very important. In our earlier work, three types of content-based advice interactions were identified:  corruption,  inhibition, and activation [30].  Consider a situation where a dominating advice replaces an element A with an element B of a different type. The dominated advice matches an element C which contains a child element of type A. Now, when the dominated advice executes, C is no longer properly typed because it should not contain an element of type B. Such a conflict is called an advice An advice inhibition  corruption.  occurs when a dominating advice replaces an element A with an  element B of a different type, so that a dominated advice is prevented from matching A or one of A's children. A missed inter-advice activation occurs when an advice does not match what it should have since it executed before certain content appeared. For example, a dominating advice matches an element B, but that element only appears in the document when another dominated advice transforms A to B subsequently.. A missed intra-advice activation occurs when a single advice performs a transformation which causes the same advice to again become active. For example, this situation arises when the decryption advice from Section 3.2.1 decrypts a super-encrypted element. The interaction analysis in our previous work was overly conservative because it used only type information in detection. Since advice pointcuts were not taken into account, two advices could be deemed conflicting even if the elements they matched were in different and nonoverlapping parts of a message.  Chapter 3. Message  25  Processing  ©Request ©Each public void  transformClientAddr(©Pointcut("/del:Client/del:Address") ©In A d d r e s s T y p e X m l B e a n  inAddr,  ©Out A d d r e s s 2 T y p e X m l B e a n {  /* Details  elided  */  outAddr)  }  ©Request ©Each p u b l i c v o i d validateDeliveryAddr(©Pointcut("/del:Delivery/del:Address") AddressTypeXmlBean {  /* Details  elided  */  addr)  }  Figure 3.5: An example of two request advices which are deemed conflicting by the interaction analysis in our previous work.  For example, consider an X M L Schema in Figure 3.6. The A d d r e s s element occurs in three places within the schema: as a top-level element, as a child of the element D e l i v e r y and a child of the element c l i e n t . Now observe the two request advices in Figure 3.5. The first advice, transf ormClientAddr, Address  performs an interoperability transformation of all c l i e n t elements: the  child element is replaced with its newer version - A d d r e s s 2 . The second advice,  validateDeliveryAddr,  validates the A d d r e s s elements within all D e l i v e r y elements. Note  that the two advices operate on different parts of a message and, therefore, cannot perform any conflicting actions with respect to each other. However, the interaction analysis in our previous work did not take pointcuts into account, and, in this case, it would conclude that both advices operate on A d d r e s s elements in all three locations shown in Figure 3.6. Then, since validateDeliveryAddr  is executed after t r a n s f o r m C l i e n t A d d r , the analysis would report an  inhibition conflict. To avoid such false positives, we have improved the analysis by using pointcuts to get an approximate context in which the interactions might happen. To get the context at compile-time, we evaluate advice pointcuts in the context of schema documents. If an advice interaction is detected, it is reported as a compile-time warning. The approximations we employ in computing the context are described in Section 4.1. Note that we have currently only implemented interaction checking in the context of a single doxpect and not yet across doxpects, although we feel that an extension will be straightforward.  Chapter 3. Message Processing <schema  26  elementFormDefault="qualified" targetNamespace="http://delivery" x m l n s = " h t t p : / / w w w . w 3 . o r g / 2 001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:del="http://delivery">  < e l e m e n t name="Address" <complexType <!— Details  type="del:AddressType"/>  name="AddressType"> elided  -->  </complexType> <element name="Address2" <complexType <!— Details  type="del:Address2Type"/>  name="Address2Type"> elided  -->  </complexType> <element name="Delivery"> <complexType> <sequence> < e l e m e n t name="Date" t y p e = " x s d : d a t e " / > < e l e m e n t name="Time"  type="xsd:time"/>  < e l e m e n t name="Address"  type="del:AddressType"/>  </sequence> </complexType> </element> <element name="Client"> <complexType> <sequence> < e l e m e n t name="Name" t y p e = " x s d : s t r i n g " /> < e l e m e n t name="Address" t y p e = " d e l : A d d r e s s T y p e " /> </sequence> </complexType> </element> </schema>  Figure 3.6: An X M L Schema used by advices in Figure 3.5. The A d d r e s s element is in bold and occurs in three places within the schema.  Chapter 3. Message  27  Processing  3.4 Response Initiation Consider a situation where a request advice performs some validation of message content. If the content is invalid the advice should report the error to the client right away without propagating the message further down the request flow. In other words, the advice should construct a response message with some customized error description and send it back to the client. Doxpects may help to achieve this behaviour when executed in the Apache Axis environment. A utility class AxisUtils has methods which help to initiate a response. The methods appendToResponseHeader(QName,XmlObj ect) and appendToResponseBody(QName,XmlObj ect)  append a new element with the given name and content to the response message SOAP Header and Body correspondingly. If the response message does not exist yet, it is created. Note that these methods can be called from different advices and even different doxpects. The method s t a r t R e s p o n s e makes the cuirent advice the pivot point. This effectively starts the response flow right after the current advice terminates. Note that the pivot point has the granularity of an advice and not of a doxpect. The response message constructed with the help of the AxisUtils methods becomes the initial response message which can be altered by response advices. If no content is added to the response message, it remains empty unless response advices modify it. This method has no effect if called from a response advice or if the current doxpect is not deployed in a H a n d i e r i n f oChain. For example, Figure 3.7 shows the r e s p o n d w i t h E r r o r method used by several request advices in previous sections. The method receives three parameters: request message header, error code, and error message. The top-level element of a request or response FedEx message corresponds to an operation, and its name starts with the operation name followed by the word "Request" or "Reply" correspondingly. To construct the name of the response top-level element, the method gets the name of the request top-level element and replaces its suffix "Request" with "Reply". Then, the top-level element is created with the help of the appendToResponseBody method. Note that the  second argument  of the call  is n u l l  -  this instructs  the  appendToResponseBody method to create an empty element in the message body. If a valid XmlObject was given rather than null., the content of the provided XmlObject would be copied to the new top-level element. Next, a fdx:ReplyHeader element is inserted into the empty toplevel element. The reply header contains a transaction identifier, which is set to the value stored  Chapter 3. Message  28  Processing  p r i v a t e v o i d respondWithError(RequestHeaderXmlBean  reqHeader,  String errorCode, String {.  errorMsg)  .  //  G e t the name of  the operation  Xmiobject reqTopElem  = DoxpectUtils.getParent(reqHeader);  S t r i n g reqName = r e q T o p E l e m . n e w C u r s o r ( ) . g e t N a m e ( ) . g e t L o c a l P a r t ( ) ; //  Replace the name suffix  String  "Request" with "Reply"  respName = r e q N a m e . s u b s t r i n g ^ 0 , r e q N a m e . l e n g t h ( ) - 7 )  QName name. = new QName(FDX_NS_URI, r e s p N a m e , //  Create  "fdx");  the root message element  Xmiobject r e p l y = AxisUtils.appendToResponseBody(name, //  Insert  II  + "Reply";  a fdx:ReplyHeader element and set its  the same as in  TransactionID  "ReplyHeader",  = DoxpectUtils.appendChild(reply,  ReplyHeaderXmlBean  to be  the reqHeader parameter  QName r e p l y H e a d e r N a m e = new QName(FDX_NS_URI, Xmiobject c h i l d  null);  replyHeader =  "fdx");  replyHeaderName);  (ReplyHeaderXmlBean)child;  replyHeader.setCustomerTransactionldentifier( reqHeader.getCustomerTransactionldentifier()); //  Insert  a fdx:Error  element and set its  code and message  QName e r r o r N a m e = new QName(FDX_NS_URI,  "Error",  child  errorName);  = DoxpectUtils.appendChild(reply,  "fdx");  ErrorXmlBean e r r o r = (ErrorXmlBean)child; error.setCode(errorCode); error.setMessage(errorMsg) ; 11 Make this  advice the pivot  point  AxisUtils.startResponse() ; }  Figure 3.7: A response initiation example.  in the request header. Next, a f d x : E r r o r element is appended after the reply header, and its code and message are set. Finally, the s t a r t R e s p o n s e method is invoked to indicate that the current advice should be the last advice in the request flow.  29  Chapter 4  Type Checking and Dynamic Types In the first section of this chapter we describe the compile-time analysis of message content transformations which verifies that the transformation output will be valid at runtime and that the transformations will succeed without runtime errors. In the second section of the chapter, we present dynamic types, which provide a simple way to access and mutate content common to matched elements of different types.  4.1 Modification Type Checking Consider a situation when a service has updated its schemas, but a client is still using old schemas. To remain backwards-compatible, the service must be able to receive and send messages in the old format. As shown in previous sections, doxpects can help to automatically transform both request and response messages to the new format. However, how can we check that the set of modifications performed by advices is enough to completely convert any message into the new format? We could do that by generating all possible message instances, transforming them according to advice signatures, and validating them against the new set of schemas. However, this is impossible because the set of instances described by a schema is potentially infinite. Instead, we apply the modifications to schema documents. Obviously, the modifications are emulated and may be incomplete. For example, a schema might have particles a n y which, in an instance document, allow any number of any elements in place of them. We choose to assume that none of the XPath steps match particles a n y . Also, it is hard and often even impossible to evaluate XPath predicates on values known only at runtime. For emulation purposes, we choose to ignore predicates altogether. However, the set of schemas altered by emulated modifications can still represent well the set of transformed messages.  30  Chapter 4. Type Checking and Dynamic Types  It might seem that the modification analysis could be based on XMLBeans classes generated from schemas rather than the schemas themselves. However, it cannot be done because an XMLBeans class lacks a lot of important information about the X M L type it corresponds to. First of all, the class lacks information about the order in which the child elements appear in the type. Secondly, the class does not know the structure of the type. For example, it does not model how the child elements are organized by <sequence>, <choice>, and < a i i > tags. Also, the class does not store values of minOccurs and maxOccurs attributes. Only indirectly, by analysing field types and method signatures, the attribute value can be inferred to be zero, one, or greater than one. Finally, we perform the modification type checking to improve the external service interface, which is defined in X M L Schema language. Therefore, to achieve the best results, we have to base our analysis on schemas and cannot use shortcuts like XMLBeans classes. To automatically perform a modification type checking of request messages, one should use the class-level annotation  TypeCheckRequest.  Similarly,  transformations, one should use the class-level annotation  to check response  TypeCheckRespon.se.  message  Each of these  annotations takes a paired list of schemas as a parameter. Each pair defines a schema to which the transformed message should conform (a reference schema) and a schema to which the original message conforms. The doxpect compiler transforms the later schema into a schema to be compared against the former schema. We call the transformed schema the validated schema. The exact syntax of a schema pair is "OrigSchema v s Ref Schema", where OrigSchema is the target namespace URI of the original schema to which the message conforms, and RefSchema is the URI of the reference schema. URIs of server-side schemas defined in an i n c l u d e annotation (Section 3.2) may be replaced by the corresponding prefixes. For example, an interoperability handler in Figure 4.1 bridges syntactic mismatches between two otherwise semantically compatible versions of the eBay X M L Schema: an old one used by a client  (http://webservice.ebay.com/v4i9/) and a new one used  by the service  ( h t t p : //webservice.ebay.com/v42l/). To ensure that the set of modifications performed by advices is enough to completely convert any message between the old and new format, the handler performs a modification type checking. The TypeCheckRequest annotation instructs the  31  Chapter 4. Type Checking and Dynamic Types ©Doxpect ©Include({"http://webservice.ebay.com/v421/ a s e b a y " } ) ©Target({"http://["V]*/axis/services/ebay "}) ©TypeCheckRequest({"http://webservice.ebay.com/v419/ v s e b a y " } ) ©TypeCheckResponse({"ebay v s h t t p : / / w e b s e r v i c e . e b a y . c o m / v 4 1 9 / " } ) public 7*  class  InteropHandler extends DoxpectBase {  E l i d e d */  }  Figure 4.1: An interoperability handler with type checking annotations.  handler to check that, after the set of transformations is applied to a client request, the message conforms to the new schema. Vice versa, the T y p e C h e c k R e s p o n s e  checks that, after the  transformations, any response message is compatible with the old schema. During a request type checking, a comparison is performed to ensure that, after all request advices of a doxpect are executed, all instances of a remote schema are valid instances of a local schema. Hence, in a request comparison, the reference schemas are the local ones and the validated schemas are the remote ones. Conversely, during a response type checking, a comparison ensures that, upon execution of all response advices within a doxpect, all instances of a local, validated, schema are valid instances of a remote, reference, schema. This naming convention recursively applies to elements, attributes and types defined in a schema. To compare two schemas, the set of global elements in the validated schema is compared to the set of global elements in the reference schema. A comparison succeeds if the reference schema has all global elements of the validated schema. Two elements or two attributes are equal if they have the same name and equivalent types. A pair of any or  anyAttribute  wildcards are compared by their namespace constraints. To  compare an element to an a n y wildcard, we check whether the wildcard namespace constraint allows the element namespace. Similarly, to compare an attribute to an a n y A t t r i b u t e wildcard, we check whether the wildcard namespace constraint allows the attribute namespace. When an element definition has no t y p e attribute, the element type is effectively a wildcard allowing the element to have any type. If a reference element has the wildcard type then any validated element with the same name matches it. On the contrary, a validated element with the  Chapter 4. Type Checking and Dynamic Types  32  wildcard type may only match a reference element with the wildcard type. When a type schema cannot be loaded, the type remains unresolved. Unresolved types can only be compared by names and we choose to do so when both compared types are unresolved. We report a mismatch when only one type is unresolved. When present, the value of the type attribute of an element resolves either to a complex type or a simple type. While a simple type can have neither element children nor attributes, a complex type, may have both. Complex types are further divided into two groups: those with simple content and those with complex content. While both forms of complex types allow attributes, only those with complex content allow child elements; those with simple content only allow character content. A complex type with a simple content is an extension or restriction of either a simple type or another complex type with a simple content. Therefore, we view a complex type with a simple content as a simple type with attributes and handle it together with simple types. The comparison algorithm of simple types and complex types with simple contents is outlined in Section 4.1.1. In Section 4.1.2, we describe the comparison algorithm of complex types with complex contents.  4.1.1 Simple Type Comparison In this section, we outline the comparison process of simple types and complex types with simple contents. To compare two types, we compare their primitive base types and facets that constrain the. range of their values. For complex types, we also compare attributes. A 1  comparison succeeds if the validated type is a supertype of the reference type or, in other words, if the set. of possible element instances of the validated type is a subset of the one of the reference type. To meet this condition, the types must have the same primitive base type and the validated type must be constrained no less than the reference type. For complex types, the set of attributes of the validated type must be a subset of attributes of the reference type. For a simple type restriction and a complex type restriction, the constraints of the base type are taken into account as well. For a complex type extension, we take into account attributes of its base type. The comparison algorithm supports atomic, list and union simple types. It supports all facets except for pattern and whiteSpace. Both numeric and date or time related values of the facets minlnclusive, maxlnclusive, minExclusive, and maxExclusive are supported.  33  Chapter 4. Type Checking and Dynamic Types  Facet  Level of support  *  Facet  Level of support  enumeration  Full  mininclusive  Full  length  Full  maxlnclusive  Full  minLength  Full  minExclusive  Full  maxLength  Full  maxExclusive  Full  .totalDigits  Full  pattern  None  fractionDigits  Full  whiteSpace  None  Table 4.1: A summary  the facet support levels.  The p a t t e r n facet constraints the lexical space of an attribute value to literals which match a specific pattern. The pattern value must be a regular expression. The p a t t e r n facets could be compared using the standard finite state machine technique. The whiteSpace facet specifies how occurrences of space, tab, line feed, and carriage return characters should be handled: preserved, replaced by space characters, or, for contiguous sequences, collapsed into a single space character. We see little benefit in comparing values of whiteSpace facets and therefore choose to ignore it. Table 4.1 provides a summary of the facet support levels.  4.1.2 Complex Type Comparison In this section, we focus on the comparison of two complex types with complex contents. A comparison of a validated complex type and a reference complex type is done in two steps: attribute comparison and element comparison. To pass the attribute comparison, the attribute set of the validated type must be a subset of the attribute set of the reference type. To simplify the element comparison, the elements of each complex type are converted into a graph which models their relative ordering. If two elements may directly follow each other in a message, their graph nodes are connected. A type extension is handled by appending the elements defined in the extension to the graph of the base type. A type restriction is handled like a complete redefinition of the base type and, therefore, the base type is ignored. A validated graph is a graph corresponding to a type from a validated schema, and, similarly, a reference graph corresponds to a type from a reference schema.  Chapter 4. Type Checking and Dynamic Types  34  <xs:complexType name ="StoreCustomCategoryType"> <xs:sequence> <xs:element name= 'CategorylD" t y p e = " x s : i n t " /> <xs:element name= 'Name" t y p e = " x s : s t r i n g " /> <xs:element name= 'Order" t y p e = " x s : i n t " /> <xs:element name= ' C h i l d r e n C a t e g o r i e s " type= 'ns:StoreCustomCategoryType" minOccurs="0"  />  </xs:sequence> </xs:complexType>  Figure 4.2: A recursive type definition found in an eBay schema. The type describes a configuration of a store custom category: category ID and name, order in which the custom category appears in the list of store categories, and an optional sub-category.  A graph wildcard node stores namespace constraints and occurrence parameters of an any wildcard. A graph element node stores the element itself and its two occurrence  constraints,  which indicate how many times the element must and may occur in an instance document. As the comparison progresses and nodes from two graphs get matched to each other, their occurrence constraint values decrease, reflecting the remaining number of unmatched element instances. Two element nodes are equal if the schema elements they correspond to have the same names and types. If the types are not simple, they are themselves compared. Recursive and mutually recursive type definitions are rarely used in the real life schemas that we have analysed: Amazon, eBay, FedEx, and Google Search. A sample of a recursive type definition from an eBay schema is shown in Figure 4.2. However, the algorithm is able to detect such definitions by maintaining a stack of type pairs that are currently being compared. When a pair of types, which is already in the process of being compared, is attempted to be compared again, the types in the last pair are assumed to be equal. The result of a comparison of a pair of types is cached to improve performance. Along with nodes corresponding to elements and wildcards, a graph has nodes which model the structural tags in a complex type: <sequence>, <choice>, <group>, <ail>, and corresponding terminal tags. Similarly to the element nodes, the nodes corresponding to the  35  Chapter 4. Type Checking and Dynamic Types  above start tags store occurrence parameters which indicate how many times the whole structure must and may appear in an instance document. A node that must occur in the graph at least once is called required; a node that may not appear in the graph - optional. A comparison is successful if all instances of the validated graph are also instances of the reference graph. An instance of a graph is a spanning path of its nodes. A spanning path covers all required nodes and any number of optional nodes in the order they are defined in the type. Note that instances of the reference graph do not have to be instances of the validated graph. Different paths can be created by forks like alternative children of a  choice  structure,  different permutations of elements within an a l l structure, or different final values of node occurrence parameters. The number of different paths is exponential to the number of forks, and, therefore, it is infeasible to analyse all of them. While each child element of a  choice  structure results in a new graph instance (a new spanning path), there are several other fork cases when it is very expensive to construct all instances of a graph and we choose to construct only a subset of them. For example, we do not compute permutations of elements in an a l l structure, but we plan to address this in the future work. Also, we do not create a separate graph instance for each possible value of an occurrence parameter. We only ensure that the validated element can occur no less and no more than the reference element. We take into account that a schema may have two or more elements with the same name and type directly following each other in a graph. For example, in Figure 4.3, the effective occurrence interval of any of the three <foo>  elements is [2,4]+ [0,l]+ [0,3] = [2,8]. Due to the simplified way the occurrence interval is  computed, the interval is always considered to be continuous, which is not true in practice. For example, Figure 4.4 shows a case where our algorithm computes a wrong occurrence interval: it reports [0,4] for  <foo>  elements, but the correct interval is [0,l]u[3,4]. However, for our  purposes, we find this approximation acceptable. The number of spanning paths in our analysis is never infinite. The two potential infinite sources of spanning paths are occurrence parameters and type recursion. We choose to create only one spanning path per element regardless of its occurrence parameters. Also, we restrict the maximal occurrence values of structural tags. Type recursion is guarded by element labels and, therefore, cannot result in additional spanning paths. If recursion could be used without enclosing the type reference in an element label, the schema language would be context-free.  Chapter 4. Type Checking and Dynamic Types .  36  <sequence> <sequence> <element name="bar"  type="nsl:barType">  <element name="foo" t y p e = " n s l : f o o T y p e " minOccurs = "2" maxOccurs="4"> </sequence> <element name="foo" t y p e = " n s l : f o o T y p e " minOccurs=" 0" maxOccurs="1"> <choice> <element name="bar"  type="nsl:barType">  <element name="foo" t y p e = " n s l : f o o T y p e " minOccurs ="1" maxOccurs="3"> </choice> </sequence>  Figure 4.3: The effective occurrence interval of any of the three <foo> elements is [2,8].  <sequence> <element name="foo" t y p e = " n s l : f o o T y p e " minOccurs="0" maxOccurs="1"> <choice> <element name="bar"  type="nsl:barType">  <element name="foo" type="nsl:fooType".minOccurs="3"  maxOccurs="3">  </choice> </sequence>  Figure 4.4: A case where our algorithm computes a wrong occurrence interval.  For context-free languages, the sub-type problem is undecidable, so it would not be possible to build an algorithm for that. Fortunately, most of the hard cases do not occur often in practice: according to our study of Amazon, eBay, Google Search, and FedEx X M L Schemas, the structure a l l and occurrence parameter values other than zero, one, and unbounded are very rarely used. In fact, no minOccurs attribute had a value other than zero or one, and only about two percents of maxOccurs attributes had a value different from one or unbounded. Table 4.2 provides a summary of extents to which major X M L Schema language features are used in the above four schemas.  37  Chapter 4. Type Checking and Dynamic Types  Feature  Number of occurrences in a schema eBay  FedEx  Amazon  Google  2193  531  1053  22  any  0  86  0  attribute  25  0  12  2  anyAttribute  0  0  0  0  461  96  160  0  choice  0  14  0  0  group  0  0  0  0  all  0  0  0  3  minOccurs  1721  445  876  0  minOccurs="0"  1718  444  876  0  0  0  0  0  maxOccurs  194  445  134  0  maxOccurs="unbounded"  187  94  129  0  4  8  0  0  496  86  167  5  0  26  142  0  190  144  14  0  0  120  14  0  60000  6700  2000  200  element  sequence  minOccurs (excl. "0", "1")  maxOccurs (excl. "1", "unbounded") complexType anonymous complexType simpieType anonymous simpieType schema size (lines)  ,  0  Table 4.2: A n overview of X M L Schema language features used in eBay, FedEx, Amazon and Google Search schemas.  Also, it has been shown that the problem of numerical occurrence indicator comparison in regular expressions is NP-hard [25]. Since the X M L Schema language subsumes regular expressions, this fact also applies to the comparison of its  minOccurs  and  maxOccurs  occurrence constraints. The graph comparison algorithm is implemented as iteration with backtracking. It concurrently iterates over the nodes in both graphs trying to match them against each other. The fork points in the validated graph, which occur during a comparison, are saved together with the associated states of node iterators. A join path of matched nodes is complete when the ends of both graphs are reached. Then, the state of the most recent fork point is restored and the iterators  38  Chapter 4. Type Checking and Dynamic Types Reference schema version  All X  415 X  413 X  411 X  409 V  407 409  X  X  X  X  X  X  411  V  y  •/  S  V  /  413  s  V  J  V  V  o> a  417  Valid;  c 1 schem a versh  419 X  421 X  s  415 419  407 V  s •/  /  V  421  Table 4.3: A summary of comparison results of several eBay schema versions. Pairs with a subtype relationship are marked by ^ . The versions 407 and 409 are not compatible with the rest of the versions because the child element order in several types of the version 411 has been changed and some child elements were removed altogether.  proceed from it. If, at some step of a comparison, no match can be found due to a wrong subpath previously taken at a fork in the reference graph, the algorithm tries to step back by discarding the latest pair of matched nodes. It continues to step back until either a fork with a yet unmatched alternative subpath is found in the reference graph or all matches of the current join path are discarded. In the later case, the comparison terminates reporting that an instance of the validated graph is not an instance of the reference graph. The pseudo-code of the comparison algorithm is provided in Appendix B. The algorithm is similar to the one in the XDuce language [13], which is described in the Related Work chapter. 4.1.3 T e s t s of the  Algorithm  We have tested the schema comparison algorithm on eight versions of the eBay schema: 407, 409, 411, 413, 415, 417, 419 and 421. For any pair of versions, the newer one has all global elements of the older one, but a couple of types are slightly changed in each version. All eBay types are based on the  <sequence>  structure, and, therefore, a newer type is a subtype of an  older one when it just inserts an optional element into its sequence or adds a required element to the end of the sequence. But when an element is removed, or the element order is changed, the  Chapter 4. Type Checking and Dynamic Types  39  newer version is no longer a subtype of the older version. The algorithm had correctly identified all pairs of schemas that have a subtype relationship (Table 4.3). Recall that a reference schema is said to be a subtype of a validated schema if all possible instance documents of the validated schema are valid instances of the reference schema.  4.2 Dynamic Types Consider a situation when a pointcut matches elements with different types, but with similar content. The author of the pointcut is only interested in a subset of the content common to all matched elements. One way to define the corresponding advice parameter is with the generic XmlObject type. However, in this case, the content of matched elements would not be accessible through get and set methods, and the developer would have to use the generic s e l e c t c h i l d r e n (QName) method and/or the XmlCursor interface. Doxpects provide a simple way to access and modify content common to matched elements of different types. The doxpect compiler can create a new, dynamic, X M L type containing the common content of the matched elements. Effectively, the schema of the new type contains an intersection of the child element definitions according to the X M L schemas of the matched elements. The schema of the new type is compiled into XMLBeans classes which have get and set methods for all common child elements of the matched elements. The child elements are compared by names and types. We ignore the relative-order of child elements and, therefore, provide an object-oriented subtyping for schema types. We choose to do it this way because dynamic types are only used in the object-oriented environment, where the ordering is not important. Moreover, enforcing the relative order would significantly restrict the applicability of dynamic types. In fact, rather than using schemas to compute the list of elements to be included in a dynamic type, we could be using XMLBeans classes corresponding to the X M L types of the matched elements. This is different from the modification type checking discussed in Section 4.1, where we cannot rely on the schema representation provided by XMLBeans classes because it is not sufficiently detailed to deal with an external service interface defined in X M L Schema language. A dynamic type is an object-oriented paradigm used internally by a service, and, thus, we could take a shortcut by using XMLBeans classes and the Java Reflection API to compute a  40  Chapter 4. Type Checking and Dynamic Types  list of dynamic type elements. However, it seems that this shortcut would not improve the performance. Therefore, we choose to reuse parts of the modification type checking framework in dynamic type generation. To define a parameter with a dynamic type, the annotation D y n P o i n t c u t must be used in place of the usual P o i n t c u t annotation. The DynPointcut annotation has the same parameters and behaves similarly to the P o i n t c u t annotation, but it also gives a hint to the compiler that a dynamic type should be created. The name of the dynamic type is arbitrary. With the DynPointcut annotation, the dynamic type is regenerated at each doxpect compilation. If one wishes to keep the current version of the generated type, the D y n P o i n t c u t annotation should be replaced back with the P o i n t c u t annotation. For example, the advice c h e c k P a c k a g e R e s t r i c t i o n s in Figure 4.5 intercepts three types of request  messages:  f d x : F D X R a t e A v a i l a b l e S e r v i c e s R e q u e s t , fdx:FDXRateRequest  and  fdx:FDXshipRequest. At compile-time, the common elements of the three types are combined into a dynamic type shipOrRateRequestxmiBean. At runtime, the content matched by the pointcut is parsed into a ShipOrRateRequestxmiBean  object and passed to the advice as the  request parameter. The advice uses the request object to check whether a package of the requested type may have the requested weight and dimensions. The advice initiates a response with an error if some restriction is not satisfied. For situation when a dynamic type must have a union of child elements, doxpects provide the DynUnionPointcut annotation. It is similar to the D y n P o i n t c u t annotations, but the schema of the new type contains a union of the child element definitions. For instance, Figure 4.6 has an advice with a DynUnionPointcut annotation. This advice is similar to the advice in Figure 4.5, but its ShipOrRateRequestxmiBean  class will provide get  and s e t methods for all elements of the three types matched by the pointcut. At runtime, if the advice calls a get method of an element not present in the matched message, n u l l is returned.  41  Chapter 4. Type Checking and Dynamic Types ©Request public void  checkPackageRestrictions( ©DynPointcut("fdx:FDXRateAvailableServicesRequest "fdx:FDXRateRequest  |" +  |" +  ."fdx:FDXShipRequest") ShipOrRateRequest.XmlBean  request)  {  i n t weight = request.getWeight().intValue(); int  length = request.getDimensions().getLength().intValue();  int width  = request.getDimensions().getWidth().intValue();  int height  = request.getDimensions().getHeight().intValue();  if  (!checkPackageWeight(request.getPackaging(), respondWithError(  if  / * Details  elided  */ ) ;  (!checkPackageDims(request.getPackaging(), respondWithError(  / * Details  elided  weight))  length, height,  width))  */ );  }  Figure 4.5: An advice with a D y n P o i n t c u t annotation and a dynamic type.  ©Request•  .  .  public void  checkPackageRestrictions( ©DynUnionPointcut("fdx:FDXRateAvailableServicesRequest "fdx:FDXRateRequest  |" +  |" +  "fdx:FDXShipRequest") ShipOrRateRequestXmlBean  request)  { /*  Details  elided */  }  Figure 4.6: An advice with a D y n U n i o n P o i n t c u t  annotation and a dynamic type.  42  Chapter 5  Implementation Details In this chapter we discuss the implementation details. We start by presenting the two available compilers: one based on Java Mirror API and the Annotation Processing Tool and one based on the Java Reflection API. We then outline the compilation process and explain how to write a deployment descriptor for a doxpect handler. We follow with more details on the Reflection API compiler and a summary of a command-line interface. Finally, we conclude with some noteworthy details of the XMLBeans parser and describe the DoxpectBase class, which is a superclass of every doxpect.  5.1 Compilation Process Doxpects have two compilers: one based on Java Mirror A P I and the Annotation Processing Tool and one based on the Java Reflection'API. A l l prior discussions were in the context of the Mirror API compiler because it does not have many of the limitations of the Reflection API compiler, which will be described in Section 5.3.1. The Mirror API compiler takes a doxpect source file and generates a wrapper class source file. Both source files are then compiled. The wrapper class does all the dirty work: it parses messages into XMLBeans, evaluates pointcuts, iterates over matched content (if the annotation Each is present), invokes doxpect methods, and performs message modifications. The Reflection API compiler takes a doxpect class file and generates a wrapper class source file, which is then compiled. Both compilers may be invoked either automatically by the DoxpectHandler class (Section 5.2) or manually from a command line (Section 5.3.2).  Chapter 5. Implementation  43  Details  •chandler I n f o C h a i n > <handlerIn£o c l a s s n a m i e = " d o x p e c t . h a n d l e r . D o x p e c t H a n d l e r " > <parameter  name="requestClassNames"  value="com.fedex.webservice. v2 0060404 d o x p e c t . E n c r y p t i o n H a n d l e r , c o m . f e d e x . w e b s e r v i c e . v 2 0060404 d o x p e c t . V a l i d a t i o n H a n d l e r " / > <paramster  name="responseClassNames"  value="com.fedex.webservice.v2 0060404 d o x p e c t . P o s t p r o c e s s H a n d l e r , doxpect.EncryptionHandler"/>  com.fedex.webservice.v20060404 <parameter  name="faultClassNames"  value="com.fedex.webservice.v2 0060404 d o x p e c t . F a u l t H a n d l e r " <parameter  name="outPackage"  value="com.fedex.webservice.v20060404 <parameter  doxpect"  />  name="srcInputDir"  ,value="/apache/Tomcat5.5/webapps/axis/src/" <parameter  />  />  name="binOutputDir"  value="/apache/Tomcat5.5/webapps/axis/WEB-INF/classes/"  />  </handlerlnfo> </handlerlnfoChain>  Figure 5.1: A sample deployment descriptor of the  DoxpectHandler  for an Apache Axis server.  A doxpect compiler needs schema definition files in order to perform type checking, detect advice conflicts, generate dynamic types, etc. A doxpect specifies schema namespaces in its include  annotation, but a schema is not always accessible at the location indicated in its  namespace  URI. The  locations .properties  actual  schema  locations  should  be  given  file stored in a directory specified in the system  in  a  CLASSPATH  schema-  variable.  The file should be formatted as a Java Properties file [19].  5.2 Doxpect Handler The doxpect library provides a special J A X - R P C handler, Doxpect Handler, which may wrap any number of doxpect classes. Certain compilation parameters can be configured when Doxpect Handler is installed, or deployed, to the web container. The configuration information  Chapter 5. Implementation  Details  44  is maintained in an X M L file called a deployment descriptor. A sample descriptor of the DoxpectHandler for an Apache Axis server is shown in Figure 5.1. The fully-qualified names of the doxpects to be executed for request, response, and fault messages are given in the handler parameters requestciassNames, responseCiassNames, and f a u i t c i a s s N a m e s correspondingly. If a doxpect wrapper class is older than the corresponding doxpect source file or does not exist, it is compiled dynamically by the DoxpectHandler class. The package to which the generated doxpect wrapper classes should belong must be given in the outPackage handler parameter. If the outPackage parameter is missing, the wrappers are generated in the same package as the doxpect classes. The parameter s r c i n p u t D i r specifies where to find input source files. This parameter is only used by the Mirror API compiler (to find doxpect source file), and, therefore, its presence indicates that this compiler should be used rather than the Reflection A P I one. The parameters s r c O u t p u t D i r and b i n O u t p u t D i r specify where to place generated source and binary files correspondingly. Source files include Java source files and dynamic type schema definition files. The possible binary files are Java class files and schema binary files (.xsb) generated by the XMLBeans compiler. For the Mirror API compiler, both of these parameters are optional and default to the value of the s r c i n p u t D i r parameter. For the Reflection API compiler, only the s r c O u t p u t D i r is optional - it defaults to the value of the b i n O u t p u t D i r parameter.  5.3 Doxpect Compilers 5.3.1  Reflection A P I  Compiler  The Reflection API compiler takes a compiled doxpect class as input, which misses some information about the original doxpect source. For example, the order of advices inside a doxpect is unknown. Therefore, to enable a relative ordering of advices of the same type, the annotations Request, Response and F a u l t should have a positive integer parameter which indicates the ordinal value of their advices. By default, this value is equal to one. Advices are invoked in the ascending order of their ordinal values. If two or more advices have the same number, their relative invocation order is unspecified.  Chapter 5. Implementation  -reflect  -srcInDir  <directory>  45  Meaning  Option -r,  Details  Use the compiler based on Reflection API Specify where to find input source files (required without - r e f l e c t , otherwise ignored) Specify where to place generated class files (required with  -binOutDir <directory>  - r e f l e c t , otherwise defaults to the value of the - s r c i n D i r parameter) Specify where to place generated source files. An optional  -srcOutDir <directory>  parameter. With - r e f l e c t , defaults to the value of -binOutDir; otherwise defaults to the value of - s r c i n D i r .  -p <package name> -package <package name> -f,  -forceRecompile  -noXA,  -noXPathAnalysis  Specify package for generated files Force recompilation even when the wrapper class files are up-to-date Do not compare the types of potential XPath matches against the actual advice parameter type  -noCA, - n o C o n f A n a l y s i s  Do not perform advice conflict checking  -noRA, - n o R e p l a c e A n a l y s i s  Do not perform replacement/modification type checking  -h, - h e l p  Print a synopsis of standard options  Table 5.1: The possible options for the command-line Doxpects compiler.  Values of type parameters are also lost during a compilation [16]. Without this information, the compiler does not know the component type of advice parameters with v e c t o r types. Therefore, the component types should be specified in the parameter-level  Component  annotation. Finally, it is impossible to have dynamic types with the Reflection API compiler because its input doxpect class file cannot be created by the Java compiler in the first place due to the missing definition of the dynamic class.  Chapter 5. Implementation  Details  46  5.3.2 Command-Line Interface The command-line interface can be invoked with the following command Java doxpect.Compiler'<options>  < q u a l i f i e d c l a s s names>  where possible options are described in Table 5.1.  5.4 Connected X M L B e a n s and the D o x p e c t B a s e Class When an X M L element is parsed by an XMLBeans factory as part of a SOAP message, the created bean object is typed in the context of its parent: if the element is improperly placed or is not allowed by the schema of the parent element, it is represented by a generic XmlObject bean regardless of its X M L name and type. However, if an element is parsed on its own and not as part of a larger element, it is always typed according to its X M L name and type. There are many situations when we want to process X M L elements which do not conform strictly to schemas. For  example, the X M L content encryption is rarely foreseen in schemas, which prevents  encrypted messages from conformance. For example, in Figure 5.2, when the foo element in the instance document is parsed, the bar element is parsed into an XmlObject object because it is not a valid child of foo according to the foo's definition. However, parsing the b a r element on its own would result in a specialized BarTypeXmlBean object. A doxpect wrapper parses improperly placed or disallowed elements separately to get them properly typed. However, such beans have no information about their parents, which is often useful information. The DoxpectBase class, which is a superclass of every doxpect, provides methods to retrieve beans parsed in the context of the whole message and, therefore, connected to their parents. To make things easier, both untyped and typed beans are accessible in this way. If one wants to make sure that a parameter passed to an advice is a bean (or an array of beans) connected to the rest of the message then a getConnected() method has to be called. The method takes an XmlObject (or an XmlObject [] for arrays) and returns a connected version of its parameter. The DoxpectBase class also provides access to the doxpect handler parameters and to the current message context.  .  Chapter 5. Implementation  Al  Details XML Instance  XML Schema <schema e l e m e n t F o r m D e f a u l t = " q u a l i f i e d " targetNamespace="http://domain.com" xmlns="http://www.w3.org/2001/XMLSchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:nsl="http://domain.com">  <nsl:f o o  <element name="foo">  <nsl:bar>  <complexType>  </nsl: f o o  <sequence> <element name="abc"  type="nsl:abcType">  </sequence> </complexType> </element> <element name="bar" type= " n s l : b a r T y p e <!--  Other details  are elided  11  >  -->  </schema>  Figure 5.2: A sample X M L instance not conforming to its X M L Schema.  ©Response p u b l i c v o i d encrypt(©Pointcut("//fdx:Address . | / / f d x : C r e d i t C a r d " ) XmlObject[]  toEncrypt)  { //  Make sure the beans are connected to the rest  XmlObject[] //  of  the document  toEncrypt =getConnected(toEncrypt);  Encrypt  try { D o x p e c t S e c u r i t y w s s 4 j = new D o x p e c t S e c u r i t y ( s e c u r i t y P r o p s ) ; wss4j.prepareSender(getMessageContext()) ; wss4j.encrypt(toEncrypt); } c a t c h . ( W S S e c u r i t y E x c e p t i o n s e ) { /* Details  elided  */ }  Figure 5.3: An encryption advice encrypts all f d x : A d d r e s s and f d x : C r e d i t C a r d body elements.  Chapter 5. Implementation  Details  48  For example, the advice in Figure 5.3 encrypts all f d x : A d d r e s s and f d x : C r e d i t C a r d body elements. The encryption library cannot encrypt disconnected element and, therefore, we call the g e t c o n n e c t e d () method to get the connected version of the advice parameter t o E n c r y p t .  49  Chapter 6 Related Work The extensible Stylesheet Language Transformation (XSLT) is the predominant domainspecific language for X M L transformation [6,22]. X S L T is a purely functional language based on pattern matching and template instantiation. X S L T supports modular transformations through the delegation of sub-tasks between templates. Similarly to Doxpects, it uses XPath for content matching. Unlike Doxpects, X S L T can transform X M L documents into any content type (e.g. UTrnX, RTF, or plain text). However, this makes it impossible to express the interface of a template in terms of its output. Another important limitation of X S L T is the inability to transform the original document; rather, a new document is created based on the content of the original one. XDuce is a statically typed functional X M L processing language that takes X M L documents as primitive values and represents them internally as sequences of nodes [13,14]. The document schema, represented by DTD, is described statically by regular expression types against which instance documents are matched. It supports a local form of type inference where types are specified explicitly for function arguments but inferred for pattern matching. In its current version, the XDuce type system does not model element attributes or unordered data. JWIG is an extension of Java that, among other features, provides a mechanism for construction of X M L documents using XML templates and plug operations [7,8]. It uses a technique - X A C T - that provides an integration of X M L values and operations for X M L transformation into Java [26]. X A C T guarantees type safety of the transformations: the approach is based on dataflow analysis rather than traditional type systems. The validity analysis uses the DSD2 schema language. The X M L values are represented as X M L templates, which are wellformed X M L fragments containing named gaps where other templates or strings may be  Chapter 6. Related Work  50  plugged. The gaps may appear in place of element or attribute values. XPath is used for selecting fragments of X M L values. Both XDuce and JWIG perform a more sophisticated type safety analysis than does the Doxpects framework. However, unlike Doxpects, they do not have an explicit interface for transformations. Also, they make no contribution to the design, traceability, and maintenance of crosscutting document-oriented concerns. Tozawa and Hagiya compare algorithms for X M L schema subtype checking and show how their algorithm outperforms XDuce [29]. X J proposes mechanisms for integration of X M L as a first-class construct into Java [12]. It uses the X M L Schema and XPath standards, and supports in-place updates of X M L data. The language syntax is based on Java, but has special syntactic constructs to represent XPath expressions and access X M L data. At compile-time, an XPath expression is abstractly evaluated on a X M L Schema to infer the type such that an evaluation of the XPath expression on any document conforming to the X M L Schema results is an instance of this type. The X J compiler emits the standard Java code that accesses X M L data using D O M .  51  Chapter 7 Conclusion We have described and demonstrated by examples an approach to modularizing crosscutting document-oriented concerns in the web service setting. The previous approach using handlers inadequately addressed important software engineering practices such as "programming to an interface". This thesis contributes a further step towards providing a clean mapping from design to implementation of handlers with a useful static checking which takes advantage of a well specified interface. The development of the Doxpects language has been primarily example driven. We have focused our attention on implementing those features of the language which addressed the programming difficulties encountered in those examples. We demonstrated how a handler in the Doxpects framework can intercept, filter, and process messages. We explained message filtering, but could not provide a complete example where filtering by service endpoint URI would be used. We feel that this kind of message filtering would be more useful in Doxpects applied to more lightweight standards that use X M L over HTTP without an additional messaging layer such as SOAP. We are looking into adapting the Doxpects framework to simple interfaces such as REST. Due to the rare use of attributes in the commercial schemas that we used in our study, the current version of the language cannot bind X M L attributes to advice parameters. In other words, pointcuts can only match X M L elements. We may add this feature into the language later. In our language, the insertion transformation is represented by a single advice parameter which is bound to the inserted content. Therefore, the matched content is not made directly available to the advice. It could be made available if the transformation was represented by a  Chapter 7.  Conclusion  52  pair of parameters as it is done in the replacement transformation. However, we did not have enough motivation to implement it this way because we could not find enough examples where this would be necessary. We believe that our compile-time analysis of advice interactions will be quite useful for systems with large collections of handlers applied to messages. Even though the analysis reasons about the transformations on the schemas symbolically rather than manipulating specific document instances, we believe that it performs well. The current implementation can perform interaction checking in the context of a single doxpect and not yet across doxpects, although we feel that an extension will be straightforward. We showed how a request advice can terminate a request flow and initiate a response. However, this feature could not be implemented without Axis-specific API and, therefore, it might useful only for this platform. We believe that our compile-time transformation sufficiency check will be useful for services which must be kept backwards-compatible with all previous versions of their X M L Schemas. Our schema comparison algorithm has a very limited support of the structure a l l and facet pattern because they are hard to compare and are not very common in practice. Specifically, they were not used in our main sources of examples: FedEx and eBay X M L Schemas. In the future, the algorithm could be improved to better support them. We introduced dynamic types, which provide a simple way to access and mutate content common to matched elements. We believe that this feature will be very useful for handlers implementing crosscutting concerns and intercepting message elements of different types. We described the two available compilers: one based on Java Mirror API and the Annotation Processing Tool and one based on the Java Reflection API. The first one takes doxpect source files, the second - doxpect class files. The second compiler does not support all features of the language and we expect it to be used only when the doxpect source files are unavailable.  53  Bibliography [1] Berners-Lee, T., R. Fielding, and L . Masinter, RFC 3986: Uniform Resource Identifiers (URI): Generic Syntax, http://rfc.net/rfc3986.html, 2005. [2] Benzaken, V . , Castagna, G., and Frisch, A . CDuce: An XML-Centric General Purpose Language.  In  Proceedings  of  the ACM  International  Conference  on  Functional  Programming, 2003. [3] Boag, S., et al., XQuery 1.0: An X M L Query Language. W3C Candidate Recommendation, http://www.w3.org/TR/xquery/, 2006. [4] Bray, T., et al., Extensible Markup Language ( X M L ) 1.0 (Third Edition). W3C Recommendation, http://www.w3.org/TR/REC-xml/, 2004. [5] Clark,  J., X M L Path  Language  (XPath) Version  1.0.  W3C Recommendation,  http://www.w3.org/TR/xpath, 1999. [6] Clark,  J., X S L Transformations  (XSLT)  Specification. W3C Recommendation,  http://www.w3.org/TR/xslt/, 1999. [7] Christensen A . S., C. Kirkegaard, and A . M0ller, A Runtime System for X M L Transformations in Java. In Proceedings  of the Second International  XML  Database  Symposium, 2004. [8] Christensen, A . S A . M0ller, and M . I. Schwartzbach, Extending Java for high-level Web M  service construction. ACM Transactions on Programming  Languages and Systems, vol. 25,  no. 6, pages. 814-875, 2003. [9] Common Object Request Broker Architecture: Core Specification, Version  3.0.3.  http://www.omg.org/cgi-bin/apps/doc7formal/04-03-12.pdf, 2004. [10] Fielding, R. T., Architectural Styles and the Design of Network-based Architectures.  PhD  Dissertation,  University  of  http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm, 2000.  California,  Software Irvine,  54  Bibliography  [11] Fontanella, J. and R. Saia, SOA in IT Benchmark Report. Aberdeen Group, 2005. [12] Harren, M . , et al. X J : Integration of X M L processing into Java. Technical Report rc23007, I B M Research, 2003. [13] Hosoya H. and B . C. Pierce. XDuce: A Typed X M L Processing Language. In Proceedings of International Workshop on the Web and Databases, 2000  [14] Hosoya, H . and B . C. Pierce. XDuce: A Typed X M L Processing Language. Transactions on Internet Technology, 3(2), 2003.  [15] Imamura T., B. Dillaway, and E. Simon, X M L Encryption Syntax and Processing. W3C Recommendation, http://www.w3.org/TR/xmlenc-core/, 2002. [16] Sun Microsystems, Generics in the Java Programming Language. http://java.sun.eom/j2se/l.5/pdf/generics-tutorial.pdf, 2005. [17] Sun Microsystems, Java API for XML-Based RPC. http://java.sun.com/webservices/jaxrpc/index.jsp. [18] Sun Microsystems, Java Architecture for X M L Binding (JAXB). http://java.sun.com/xml/jaxb/, 2002. [19] Sun Microsystems, Java Properties Files. http://java.sun.com/docs/books/tutoriaI/il8n/resbundIe/propfiIe.html [20] Sun Microsystems, Java Remote Method Invocation (Java RMI). http://java.sun.com/products/jdk/rmi/, 2003. [21] Kastner, P., Enterprise Service Bus: The Foundation of Successful Service-Oriented Architecture. Aberdeen Group, 2006. [22] Kay, M . , X S L Transformations (XSLT) Version 2.0, W3C Candidate Recommendation, http://www.w3.org/TR/xslt20/, 2005. [23] Keen, M . et al., Patterns: Implementing an SOA using an Enterprise Service Bus. I B M Redbooks, 2004. [24] Kiczales, European  G.,  et  Conference  Springer, 2001.  al., on  An  Overview  Object-Oriented  of  AspectJ.  Programming  In  Proceedings  (ECOOP),  of  the  pages 327-353.  55  Bibliography  [25] Kilpelainen, P. and R. Tuhkanen. Regular Expressions with Numerical Occurrence Indicators  -  Programming  Preliminary Results. In Proceedings  of the Eighth  Symposium on  Languages and Software Tools, pages 163-173. University of Kuopio,  Department of Computer Science, 2003. [26] Kirkegaard, C , Moller, A . and Scwartzback, M . Static Analysis of X M L Transformations in Java. IEEE Transactions on Software Engineering,  2003.  [27] M0ller, A., Document Structure Description 2.0. BRICS, Department of Computer Science, University of Aarhus, Notes Series NS-02-7, http://www.brics.dk/DSD/, 2002. [28] SOAP Version 1.2. W3C Recommendation, http://www.w3.org/TR/soap/, 2003. [29] Tozawa, A., and Hagiya, M . X M L schema containment checking based on semi-implicit techniques. In Proceedings  of the Conference on Implementation and Application  of  Automata, 2003. [30] Wohlstadter E. and K . De Voider, Doxpects: Aspects supporting X M L transformation . interfaces. In Proceedings  of the International  Conference  on Aspect-Oriented  Development, 2006. [31] X M L Schema. W3C Recommendation, http://www.w3.org/XML/Schema, 2004. [32] XMLBeans. Version 2.1. http://xmlbeans.apache.org/, 2005.  Software  56  Appendix A  Filtering Messages by Service Endpoints This class-level annotation defines a list of URI regular expressions. Only messages targeted to 2  or originating from the services matched by one of the regular expressions are processed by the doxpect to which the Target annotation belongs. To define a regular expression for services to which the doxpect does not apply, one should prepend the expression with a ' ! ' sign. For example, the following expression allows a doxpect to be applied to messages targeted to or originating from  s h i p p i n g . com  ©Target ( { " s h i p p i n g W  and  s h i p p i n g . net,  . ( (com)  I  (net)  )" ,  but not from  shipping.  org.  "lshippingW.org"})  Note, that some common URI symbols, such as ' . ' and ' ? ' , have special meaning in regular expressions and must be escaped. The syntax of a URI is  [scheme:] [ / / a u t h o r i t y ] [path] [?query] [#fragment].  If a path,  query, or fragment part is not specified in a pattern then any string is accepted in its place. Hence, for example, the above pattern also matches messages to and from shipping.com/services/track.  While the query and fragment parts are rarely used in SOAP-based web-services, they are extensively used in simpler H T M L - or XML-based architectures like REST [10]. Therefore, we chose to support these URI.parts in Target patterns. A pattern of a query part may contain any number of parameter name-value pairs joined by '&' signs. In a pattern, a parameter value is considered to be a regular expression, but a 2  Alternatively, could be implemented at the method level  Appendix A. Filtering Messages by Service Endpoints  57  parameter name is always a pure string and must comply with the generic syntax defined in RFC 3986 [1]. For example, the following pattern matches a URI shipping.com/getpage.htmi which query path has the parameter i d set to any string with prefix "user" and the parameter a c t i o n set to either "shipRequest" or "shipReply". "shippingW.com/getpage\\.html\\?id=user.*&action-ship((Request)|(Reply))"  The order of parameters in a pattern does not matter. Therefore, the above pattern matches both "shipping.com/getpage.html?id=user325&action=shipRequest" "shipping.com/getpage.html?action=shipRequest&id=user32 5".  and  58  Appendix B Graph Comparison Algorithm What follows is a pseudo-code of the graph comparison algorithm outlined in Section 4.1.2. // A stack // either // stepped  item  is a state  of a comparison.  two element/wildcard over  in either  A new state  is created  nodes are matched or a structural  when node is  graph.  S t a c k matchStack = new S t a c k ( ) ; // Stack  of fork points.  // The comparison  state  A point with  is identified  the given  index  by an index corresponds  in the  matchStack.  to the fork  point.  S t a c k f o r k P o i n t s = new S t a c k ( ) ; // True if a step  was performed  and no backtracking  is  necessary  b o o l e a n madeStep = f a l s e ; // True if the algorithm  has just  backtracked  boolean j u s t B a c k t r a c k e d = f a l s e ; Node vNode = vGraph. g e t F i r s t N o d e () ,- // Current  node in the validated  graph  Node rNode = r G r a p h . g e t F i r s t N o d e ( ) ;  node in the reference  graph  // When true forces // to be matched  the current  // Current  node in the validated  graph  again  // (e.g. when the node was matched less  times  than its "minOccurs"  b o o l e a n mustMatchVNode = f a l s e ; // Same for the current  node in the reference  b o o l e a n mustMatchRNode = f a l s e ;  graph  value)  Appendix B. Graph Comparison boolean  59  Algorithm  compare()  {  .  '  while (true) { madeStep = f a l s e ; if  (!justBacktracked) { .// Repeat for each graph: // If the current  node in the graph is a all/sequence/choice  // and was already // child  nodes  /* code elided // Repeat  used, then reset  to their  occurrences  head  of its  "maxOccurs."  */  for each graph:  // If the current  node in the graph is a sequence/choice  // and wasn't used enough times // the corresponding /* code elided  flag  (less  member  than its "minOccurs")  (mustMatchVNode  or  then set  mustMatchRNode).  */  // Repeat for each graph: // If the current // ensure // If this  that  node in the graph is a choice/all  all children  condition  // the latest  item  were used enough  is not satisfied,  step  end,  times. back by  discarding  on the matchStack.  f o r e a c h ( n o d e : vNode, rNode) { if  (node.isStructureEnd()) { Node head = n o d e . g e t S t r u c t u r e H e a d ( ) ; if  ( ( h e a d . i s A l l H e a d ( ) || h e a d . i s C h o i c e H e a d ( ) ) && ! e n s u r e C h i l d r e n M i n O c c u r s ( h e a d ) ) // The backtrackOneStepBack // if the matchStack if  method returns  false  is empty  (!backtrackOneStepBack(matchStack))  return  false;  justBacktracked = true; }  } '  }  // Get iterator  over nodes that  follow  the current  v l t e r = vNode.getlteratorOverNextNodes();  validated  node  {  Appendix B. Graph Comparison // if vNode is a "choice"  or "all" structure  •// remember to return  to it later  if  || v N o d e . i s A l l  ((vNode.isChoiceO  60  Algorithm with more than one  to try other  child,  paths  ()) && v l t e r . s i z e ( ) > 1)  f o r k P o i n t s .push (new Fo'rkPoint (matchStack. s i z e () ) ) ; while  (vlter.hasNext()) {  vNext = c l t e r . next (•) ; // If mustMatchVNode  is true,  force  reuse  // If vNode is the end of a structure, // to be reused  necessary.  the whole structure  and we want- vNext to point  // If vNode is not a structural if  of vNode if  has  to the head of the  node, then make vNext point  structure.  to vNode.  (mustMatchCCur && ( ( v N o d e . i s S t r u c t u r e E n d ( ) && vNext ( ! v N o d e . i s S t r u c t u r a l ( ) && vNext vNext = /* code elided  != v N o d e . g e t S t r u c t u r e H e a d ( ) ) || != vNode))) {  */  }  // If vNext wasn't used, more times if  than its "maxOccurs"  (vNext.canUse()) { // If a structural // to if  node, skip  it and push the current  matchStack  ( v N e x t . i s S t r u c t u r a l ( ) && !vNext.isEndOfGraph()) { // If vNext is a head node of a structure // occurrences  of child  /* code elided  */  // However, to be able // backtracking), // the state  nodes are restored  to restore  the current  pushed  to the  already to their  the current  occurrence  used then the  state  values  matchStack.  justBacktracked = false; v N e x t . u s e ( ) ; // increase  occurrence  count by one  vNode = vNext; v l t e r = vNode.getlteratorOverNextNodes(); continue;  original  later  values.  (during  are saved as part of  matchStack.push(getCurrentState()) ;  }  state  '  Appendix B. Graph Comparison // If no backtracking // the iterator // we continue  occurred,  over  construct  a fresh  the next nodes of rNode.  to use the popped  61  Algorithm copy of  Otherwise,  iterator.  i f (!justBacktracked) r l t e r = rNode.getlteratorOverNextNodes(); justBacktracked = false; while  (true) {  // If no more elements optional  in rlter,  //  (skip  nodes) or  if  (irlter.hasNext()) { // If in sequence if  try to go  forward  backtrack  and the next node is optional,  (rNode.isSequenceMember()) // Upon a successful  skip,  try to skip it  { rNode and r l t e r are changed  skipOptionalNodelnSequence(rNode,rlter); }  // If couldn't  skip,  // the previous if  try to backtrack  unpaired  (structural)  to rNode  (!rlter.hasNext()) { // Peek from the matchStack. // either  pairs  // structural  Recall  that items  of matched element/wildcard nodes, which were skipped.  // Continue // with  structural  stack are unpaired item has  node from  graph.  to pop from the stack  // to skipped  nodes or If the picked  // no vNode component then it 's a skipped // the reference  of this  structural  exhausted  while  popped items  node from the reference  correspond  graph  iterators.  C o m p a r i s o n S t a t e peek = m a t c h S t a c k . p e e k ( ) ; while if  (peek.vNode()  == n u l l && ! p e e k . r l t e r ( ) . h a s N e x t ( ) ) {  (!backtrackOneStepBack(matchStack)) r e t u r n  false;  peek = m a t c h S t a c k . p e e k ( ) ; }  // If we found a stack  }  a structural  rNode and a  // non-exhausted  rlter,  if  == n u l l && p e e k . r l t e r ( ) . h a s N e x t ( ) )  (peek.vNode() if  }  item with pop it  (!backtrackOneStepBack(matchStack)) r e t u r n  false;  Appendix B. Graph Comparison // If rlter if  has no more  62  Algorithm  items  ( ! r l t e r . h a s N e x t ( ) ) break;  rNext = r l t e r . n e x t ( ) ; // If mustMatchRNode // (similar  is true,  to forcing  force  a reuse  reuse  //  necessary  of vNode).  // If rNode is the end of a structure, // to be reused  of rNode if  and we want rNext  the whole structure  to point  has  to the head of the  structure.  // If rNode is not a structural //• to point if  node, then we want rNext  to rNode. .  (mustMatchRNode && ( ( r N o d e . i s S t r u c t u r e E n d ( ) && rNext != r N o d e . g e t S t r u c t u r e H e a d ( ) ) | ( ! r N o d e . i s S t r u c t u r a l ( ) && rNext != rNode))) { rNext = /* code elided  */  }  // If we've reached if  the ends of both  (vNext.isEndOfGraph() // See if there if  && r N e x t . i s E n d O f G r a p h ( ) ) {  are fork points  // we have to backtrack  graphs  in the candidate  graph  that  to  (forkPoints.empty()) { return true;  // comparison  succeeded  } else { ForkPoint fork = backtrackPoints.peek(); // Cut matchStack  so that  the state  // the fork point  is at the top  corresponding  to  matchStack.setSize(fork.stacklndex + 1); // Now restore if  the state  (!backtrackOneStepBack(matchStack))  madeStep = t r u e ; break; } }  correspnding  to the fork  point  return false;  63  Appendix B. Graph Comparison Algorithm //  If rNext  can  be  used  if  (rNext.canUse())  (was  used  If rNext  is a structural  //  the  //  (similar  if  (rNext.isStructural()  graph,  to  //  If rNext  //  occurrences  //  values.  the  elided  //  However,  //  backtracking),  //  of  the  skipping  of  code  node,  push  is a head  /*  than  its  not  the  maxOccurs)  {  //  reference  less  it  but to  the  stack  and  of a structural  &&  node  vNext)  of a structure  nodes  are  of  skip  !rNext.isEndOfGraph())  node  child  end  {  already  restored  to  used,  their  then  the  original  */  to be  able the  state  to restore current  pushed  to  the  current  occurrence  the  state  values  later  are  saved  (during as  part'  matchStack.  matchStack.push(getCurrentState()); justBacktracked rNext.use();  =  false;  / / Increase  occurrence  count  by  rNode  =  rNext;  rlter  =  rNode.getlteratorOverNextNodes();  one  continue; }  // A t t h i s are  point now  we have  //  we  going  //  The  //  nodes  //  The  if  (compareNodes(vNext,  method and  non-  to  method  compares  namespace  If  //  by  int  useCount  equal  the maximal  //  Push  rNext)  //  comparison  we  is  the  types  of  pair  of  wildcard  described  rNext))  their  common  element nodes.  below  along  with  stack  matchStack.push(getCurrentState());  {  remaining  occurrences  amount.  = adjustOccurrences(vNext,  to  and  nodes,  &&  decrease  possible  the matched  rNext  names  checkPossibleMatchCounts  nodes, are  and  containment  checkPossibleMatchCounts(vNext,  //  vNext  compare.  compareNodes analyses  structural  the  rNext);  current  state  of  the  that  Appendix B. Graph Comparison // Advance  in both  64  Algorithm  graphs  vNode = vNext; rNode = r N e x t ; madeStep = t r u e ; break; // Break  from the inner  infinite  loop  >.  } } } }  // If could // if  match the current  then break  vNext  from the outer  to something  infinite  loop  in the reference  and prepare  graph  for the next  (madeStep) break;  }  // If could //  match the current  continue  if  to the next  vNext  to something  in the reference  graph,  step.  (madeStep) c o n t i n u e ;  // else if  try to backtrack  one step  (!backtrackOneStepBack(matchStack))  return  false;  } boolean  checkPossibleMatchCounts(Node vNode, Node rNode)  {  *  foreach(Node node: vNode, rNode) { //  S e e if node has neighbour  // and which elements/wildcards // the node's  element/wildcard  /* code elided // Constructs  name-type  /* code elided  the same name and type  may appear d i r e c t l y before in an instance  or  after  document.  */ an interval  // the given  nodes which have  of possible combination  cumulative  might  occurrences  that  have.  */  }  // Ensure  that  /* code elided  the interval */  of vNode lies  within  the rNode  interval.  step.  

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051719/manifest

Comment

Related Items