Open Collections

UBC Theses and Dissertations

Reducing remodularization complexity through modular-objective decoupling Chern, Rick 2008

Reducing Remodularization Complexity Through Modular-Objective Decoupling

by Rick Chern

B.Sc. (Hons.), The University of British Columbia, 2006

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in The Faculty of Graduate Studies (Computer Science)

The University of British Columbia (Vancouver)
August, 2008
© Rick Chern 2008

Abstract

This dissertation defines “modular-objective coupling”, and shows that programming language designs which imply reduced modular-objective coupling reduce the complexity of remodularizations: behaviour-preserving restructurings for which the only intended goal is to change program source code structure. We explicitly distinguish between two points of view on program structure: modular structure (the structure of a program as a set of static text documents) and objective structure (the structure of a program as a dynamic computational model during execution). We define modular-objective coupling as the degree to which changes in modular structure imply changes to objective structure, for a given programming language. We use the term remodularization to refer to any behaviour-preserving source code restructuring for which the only intended goal is to change modular structure. We argue that programming languages with strong modular-objective coupling introduce accidental complexity into remodularizations, by requiring complex objective structure changes to achieve intended modular structure changes. Our claim is that a programming language design which implies reduced modular-objective coupling reduces remodularization complexity in the language. To validate this claim, we first present SubjectJ, a subject-oriented programming system that extends Java. The design of Java implies strong modular-objective coupling, while SubjectJ is designed for reduced modular-objective coupling. We then perform a series of remodularization case studies comparing Java and SubjectJ.
Our results suggest that remodularizations are less complex in SubjectJ.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
  1.1 Concepts and Terminology
    1.1.1 Modular and Objective Structure
    1.1.2 Modular-Objective Coupling
  1.2 Complexity in Remodularizations
    1.2.1 Illustrative Example
  1.3 Central Claim and Contributions
  1.4 Overview of Dissertation
2 Related Work
  2.1 Historical Perspective
  2.2 Programming Systems with Loose Modular-Objective Coupling
    2.2.1 Hyper/J and Multidimensional Separation of Concerns
    2.2.2 Other Programming Systems
    2.2.3 Comparison to Our Research
  2.3 Refactoring
    2.3.1 Language Design and Refactoring
    2.3.2 External Behaviour Preservation
    2.3.3 Tool Support
  2.4 Summary
3 SubjectJ
  3.1 Design Goals
  3.2 The SubjectJ Programming System
    3.2.1 Information Atoms
    3.2.2 SubjectJ Information Atoms
    3.2.3 Subjects: Modules in SubjectJ
    3.2.4 Refactoring Tools
    3.2.5 Prototype Implementation
  3.3 Remodularizing Java Code in SubjectJ
  3.4 Summary
4 Experiment
  4.1 Experimental Setup
  4.2 Quantitative Results
  4.3 Qualitative Results
    4.3.1 Separation Strategies
    4.3.2 Deciding on the “Right” Strategy
    4.3.3 Automation of Transformations
    4.3.4 Introduction of Bugs
  4.4 Summary of Experiment Conclusions
5 Conclusions
  5.1 Future Directions
  5.2 Concluding Statements
Bibliography

List of Tables

3.1 Java 1.5 annotations specifying the interfaces of subjects.
3.2 Tracking annotations for decomposing composed programs.
4.1 Overview of code bases and refactoring tasks for case studies.
4.2 Overview of quantitative case study results.
4.3 Overview of bugs introduced during case studies.

List of Figures

1.1 Visualization of the modular structure of a Java program.
1.2 Partial objective structure visualization of a Java program.
1.3 GUI and model code tangled in the Counter.java file.
1.4 A separate model with a listener infrastructure.
1.5 GUI code as a client of the model.
3.1 Java information atoms and their dependencies.
3.2 SubjectJ information atom granularity.
3.3 SubjectJ information atoms and their dependencies.
3.4 Subjects in SubjectJ.
3.5 Overview of the SubjectJ refactoring tools.
3.6 Composed SubjectJ program with tracking annotations.
4.1 “Dual object” implementation of saving XML bug reports.
4.2 Instance method getStoragePath() before separation.
4.3 “Static method” implementation of getStoragePath() method.
4.4 Buggy remodularized Piece and Soldier classes.

Acknowledgements

I would like to thank my supervisor, Dr. Kris De Volder, for his crucial guidance, as well as his exceptional patience and support throughout the course of this research. This work would not have taken shape without the many insightful discussions we had, however relevant or tangential they were to the research problems we were tackling at hand. I also want to thank Dr. Eric Wohlstadler for taking on the task of being my second reader. I would like to thank him and my labmates in the Software Practices Lab for providing helpful feedback and suggestions during this research, and for contributing to a fun and motivating work environment. Finally, I would like to thank my family for supporting and encouraging my educational pursuits.
Chapter 1  Introduction

This dissertation defines “modular-objective coupling”, and shows that programming language designs which imply reduced modular-objective coupling reduce the complexity of remodularizations: behaviour-preserving restructurings for which the only intended goal is to change program source code structure.

We explicitly distinguish between two points of view on program structure: modular structure (the structure of a program as a set of static text documents) and objective structure (the structure of a program as a dynamic computational model during execution). We define modular-objective coupling as the degree to which changes in modular structure imply changes to objective structure, for a given programming language. We use the term remodularization to refer to any behaviour-preserving source code restructuring for which the only intended goal is to change modular structure. We argue that programming languages with strong modular-objective coupling introduce accidental complexity into remodularizations, by requiring complex objective structure changes to achieve intended modular structure changes.

Our claim is that a programming language design which implies reduced modular-objective coupling reduces remodularization complexity in the language. To validate this claim, we first present SubjectJ, a subject-oriented programming system that extends Java. The design of Java implies strong modular-objective coupling, while SubjectJ is designed for reduced modular-objective coupling. We then perform a series of remodularization case studies comparing Java and SubjectJ. Our results suggest that remodularizations are less complex in SubjectJ.

1.1  Concepts and Terminology

In this section we describe two points of view on program structure, and then define “modular-objective coupling” in terms of these two points of view.

1.1.1  Modular and Objective Structure

A program is essentially a set of static text documents.
These text documents describe some dynamic computational model of program execution. We can consider two points of view on program structure, embodied by the following definitions:

Definition: Modular structure is the structure of a program as a set of static text documents.

Definition: Objective structure is the structure of the dynamic computational model when a program is executed.

We illustrate these concepts using a simple object-oriented Java program. Figure 1.1 is a visualization of the modular structure of the program. As can be seen, modular structure refers to the modularization of source code using Java packages and .java files, as well as the modularization of code (e.g. into method bodies) within .java files. Module interfaces are specified using visibility keywords in the code, in combination with the module types. For example, the public keyword attached to the Circle class declaration indicates that the class declaration is visible to any module, while the private keyword for the Circle.radius field indicates that the field declaration is visible only within Circle.java.

Objective structure in Java refers to the creation and deletion of objects, and their behaviour and interactions at runtime. Objective structure spans time: Figure 1.2 contains a partial visualization of our example program’s objective structure, with a grey “slice” depicting the structure of objects at the point in execution marked on the call graph. At this point, two instances of Circle exist, referenced by two variables (smallCircle and bigCircle) in DrawApp.main(). Note that static fields and methods of a class are modelled by a single object that is created at the beginning of execution. Also, the separation of superclass and subclass code into individual .java files in the modular structure is not mirrored by separate objects in the objective structure.
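The two viewpoints can be sketched in code. The following is a hypothetical reconstruction of the running example, since Figures 1.1 and 1.2 are not reproduced in this text: the names Circle, radius, DrawApp.main(), smallCircle and bigCircle come from the description above, while the Shape superclass fields, the constructor arguments and the describe() and centre() helpers are assumptions made only for illustration. Circle is left package-private here so that the sketch fits in one file.

```java
// Hypothetical sketch of the running example; method bodies and the
// coordinate values are assumptions, not taken from Figures 1.1 or 1.2.
class DrawApp {
    public static void main(String[] args) {
        // Objective structure: at this point two Circle objects exist,
        // referenced by the local variables smallCircle and bigCircle.
        Circle smallCircle = new Circle(0, 0, 1);
        Circle bigCircle = new Circle(5, 5, 10);
        System.out.println(smallCircle.describe());
        System.out.println(bigCircle.describe());
    }
}

// Modular structure: in a multi-file layout this would be Shape.java.
abstract class Shape {
    private final int x;
    private final int y;

    Shape(int x, int y) {
        this.x = x;
        this.y = y;
    }

    String centre() {
        return "(" + x + ", " + y + ")";
    }
}

// Modular structure: in a multi-file layout this would be Circle.java.
class Circle extends Shape {
    private final int radius; // private: visible only inside Circle

    Circle(int x, int y, int radius) {
        super(x, y);
        this.radius = radius;
    }

    String describe() {
        return "Circle r=" + radius + " at " + centre();
    }
}
```

Although Shape and Circle are separate units of text, each `new Circle(...)` yields one runtime object that carries the state declared in both classes, which is the mismatch between the two structures noted above.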
1.1.2  Modular-Objective Coupling

Some changes to modular structure do not alter objective structure, while others do. We illustrate this relationship between modular and objective structure using three scenarios based on the example Java program presented in Section 1.1.1:

Figure 1.1: Visualization of the modular structure of a simple Java program.

Figure 1.2: Partial objective structure visualization at point shown in call graph, for a simple Java program.

• Moving the Shape.printCentre() method declaration and body code from the Shape.java file to Circle.java would not result in objective structure changes in our program: the instances of Circle in Figure 1.2 would remain the same.
• Moving the Circle.printType() static method declaration and body code to a new Printing.java file would make changes to objective structure (essentially, the method would be contained in a new object replacing the original object, and runtime references would be changed to refer to the new object).
• Moving the code for Circle.printRadius() to Printing.java would require objective structure changes. To begin with, both corresponding methods would be removed from the instances of Circle shown in Figure 1.2. Further objective structure changes would depend on the specific modular structure changes performed.

The above scenarios are based on a specific program in the Java language. However, modular and objective structure, and the relationship between them, vary among programs written in different programming languages. We thus define a type of coupling that is relative to each programming language:

Definition: Modular-objective coupling is the degree to which modular structure changes imply changes to objective structure, for a particular programming language.
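The first scenario can be sketched in plain Java. The code below is an illustration under stated assumptions, not the thesis program: describeCentre() is a hypothetical stand-in for printCentre() that returns its text instead of printing it, and the Before and After wrappers exist only so that both versions of the program fit in one file. It also assumes Circle is the only subclass that uses the method.

```java
// Sketch: moving an inherited method from the superclass to the subclass
// changes where the text lives, but a Circle object behaves the same.
class MoveMethodSketch {
    public static void main(String[] args) {
        Before.Circle before = new Before.Circle(2, 3);
        After.Circle after = new After.Circle(2, 3);
        // Both versions report the same centre for the same Circle state.
        System.out.println(before.describeCentre());
        System.out.println(after.describeCentre());
    }
}

// Version 1: the method is declared in the superclass.
class Before {
    static class Shape {
        protected final int x;
        protected final int y;

        Shape(int x, int y) {
            this.x = x;
            this.y = y;
        }

        String describeCentre() {
            return "centre=(" + x + "," + y + ")";
        }
    }

    static class Circle extends Shape {
        Circle(int x, int y) {
            super(x, y);
        }
    }
}

// Version 2: the same declaration has been moved into the subclass.
class After {
    static class Shape {
        protected final int x;
        protected final int y;

        Shape(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static class Circle extends Shape {
        Circle(int x, int y) {
            super(x, y);
        }

        String describeCentre() {
            return "centre=(" + x + "," + y + ")";
        }
    }
}
```

By contrast, moving an instance method such as printRadius() into an unrelated class cannot be written this way in Java without creating a new kind of object or passing the Circle as a parameter, which is the sort of objective structure change the third scenario describes.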
To clarify, modular-objective coupling is a programming language property, although it can be exhibited by the degree to which modular structure changes imply objective structure changes for an individual program written in a language.

1.2  Complexity in Remodularizations

In this dissertation, we investigate a particular type of source code restructuring:

Definition: A remodularization is a behaviour-preserving restructuring that has a stated intent of changing a program’s modular structure, but does not have a stated intent of changing a program’s objective structure.

    import java.awt.event.ActionEvent;
    import java.awt.event.ActionListener;
    import javax.swing.JButton;
    import javax.swing.JLabel;
    import javax.swing.JPanel;

    public class Counter extends JPanel {
        private int count = 0;    // model state
        private JButton button;   // GUI widget
        private JLabel label;     // GUI widget

        public Counter() {
            label = new JLabel(""+getCount());
            add(label);
            button = new JButton("Increment");
            add(button);
            button.addActionListener(new ActionListener() {
                public void actionPerformed(ActionEvent e) { increment(); }
            });
        }
        public int getCount() { return count; }
        public void increment() { count++; updateDisplay(); }
        private void updateDisplay() { label.setText(""+getCount()); }
    }

Figure 1.3: GUI and model code tangled in the Counter.java file.

The problem we examine in this dissertation is how programming languages with strong modular-objective coupling, especially object-oriented languages, introduce accidental complexity into remodularizations, making them difficult to perform. We now present an example to illustrate this problem.

1.2.1  Illustrative Example

The source code for a simple counter program written in Java is shown in Figure 1.3. When executed, the program displays a single “Increment” button, along with the number of times this button has been pressed. The Counter class contains code for handling GUI functionality (e.g. the button and label fields), as well as code for handling the counter model functionality.
Suppose we want to perform a remodularization with a primary goal of separating GUI-related code from the rest of the code, and developing a clear interface between the two code modules. As with all remodularizations, external behaviour of the program must be preserved. We believe this remodularization of Java source code is difficult to accomplish. In particular, we believe the required separation of code cannot be performed using the small behaviour-preserving transformations supported by refactoring tools in a typical IDE (see footnote 1). Instead, the remodularization is typically accomplished by manually implementing a model-view-controller (MVC) architecture, resulting in the code shown in Figures 1.4 and 1.5.

    // CounterListener.java
    public interface CounterListener {
        public void valueChanged();
    }

    // CounterModel.java
    import java.util.ArrayList;
    import java.util.List;
    import javax.swing.JPanel;

    public class CounterModel extends JPanel {
        List<CounterListener> listeners = new ArrayList<CounterListener>();
        private int count = 0;

        public CounterModel() {}
        public void increment() { count++; notifyListeners(); }
        private void notifyListeners() {
            for (CounterListener l : listeners) l.valueChanged();
        }
        public void addListener(CounterListener l) { listeners.add(l); }
        public int getCount() { return count; }
    }

Figure 1.4: A separate model with a listener infrastructure.

We argue that the accidental complexity in this remodularization is due to Java having strong modular-objective coupling. The stated problem is only to modularize source code, and thus complexities arising from changes to objective structure are non-essential to the problem. However, a property of Java is that the code for a class declaration and body must be contained within one .java file. This implies that moving any code from one .java file to another requires making changes to the class structure. In turn, instances of those classes at runtime are altered as well, changing objective structure. To preserve external behaviour, further changes to the computational model must be made (e.g.
to update references between objects), introducing additional complexity to the remodularization.

Footnote 1: http://eclipse.org, verified March 2008.

    import java.awt.event.ActionEvent;
    import java.awt.event.ActionListener;
    import javax.swing.JButton;
    import javax.swing.JLabel;
    import javax.swing.JPanel;

    public class Counter extends JPanel implements CounterListener {
        private CounterModel count;
        private JButton button;
        private JLabel label;

        public Counter() {
            count = new CounterModel();
            label = new JLabel(""+count.getCount());
            add(label);
            button = new JButton("Increment");
            add(button);
            button.addActionListener(new ActionListener() {
                public void actionPerformed(ActionEvent e) { count.increment(); }
            });
            count.addListener(this);
        }
        // Called by the model whenever the count changes.
        public void valueChanged() { label.setText(""+count.getCount()); }
    }

Figure 1.5: GUI code as a client of the model.

In terms of our illustrative example, moving the GUI code out of Counter.java implies moving the code out of the Counter class. Thus, GUI functionality (i.e. updating the counter display, and listening for “Increment” button clicks) is moved into a new object in the computational model. To preserve external behaviour, the new object must be notified when the display needs updating, and the original object must be notified when the counter should be incremented. The need to establish the appropriate communication between the new and old objects leads to the implementation of an MVC architecture.

Implementation of the MVC architecture itself is a source of complexity. For example, lacking suitable IDE tool support, manually implementing an MVC architecture makes guaranteeing preservation of external behaviour more difficult than applying a composition of small behaviour-preserving transformations supported by refactoring tools. Implementing an MVC architecture also introduces complexities related to poor incrementality, as an MVC architecture may not be desired at this point in development.

1.3  Central Claim and Contributions

There are two main contributions of this dissertation.
The first main contribution is the definition of terminology for understanding and working with program structure, i.e. “modular structure”, “objective structure”, and “modular-objective coupling”. Using this terminology, we formulate the central claim of this dissertation:

    Programming language designs which imply reduced modular-objective coupling reduce remodularization complexity in the language.

The second main contribution is an experiment validating our central claim. The experiment consists of a series of remodularization case studies in two languages, Java [15] and SubjectJ. SubjectJ was designed to be similar to Java, but with reduced modular-objective coupling. To compare remodularization complexity between the two languages, we collected quantitative data (the time taken to perform each remodularization) and also performed a qualitative analysis of the remodularization process.

The SubjectJ programming system is not intended to be a novel contribution in itself. Its design is presented as an example of applying modular-objective coupling concepts to programming language design, and it is used in our experiment as a means for examining the effects of modular-objective coupling on remodularization complexity.

1.4  Overview of Dissertation

The remainder of this dissertation is structured as follows. In Chapter 2, we provide an overview of programming systems that can be seen as having loose modular-objective coupling, and discuss related research in refactoring. Chapter 3 presents SubjectJ, a programming system designed for loose modular-objective coupling. Chapter 4 describes our case study experiment comparing SubjectJ and Java. We provide concluding statements and possible directions for future work in Chapter 5.
Chapter 2  Related Work

This dissertation claims that programming language designs which imply reduced modular-objective coupling reduce remodularization complexity in the language, and we validate this claim by comparing Java with SubjectJ, a language we designed for loose modular-objective coupling. In this chapter we discuss related work in three broad categories. In Section 2.1 we provide a historical perspective on the concept of modular-objective coupling. In Section 2.2, we discuss related work on programming systems that, like SubjectJ, can be considered to have loose modular-objective coupling. Finally, since we can regard remodularization as a specific type of refactoring (both being program transformations intended to preserve program behaviour), we provide an overview of complementary research on handling refactoring complexity in Section 2.3.

2.1  Historical Perspective

As early as 1969, Dijkstra discussed language design in terms of coupling between static and dynamic program structure, and its potential impact on the understandability of programs [9]. Dijkstra motivates structured programming by arguing for the need to easily map between a program’s textual structure and the structure of its execution. For the sake of achieving this straightforward mapping, he proposes restricting the freedom with which program text can be structured. Essentially, Dijkstra argues for strong modular-objective coupling with sequenced statements, conditionals and loops at the intra-procedural level.

While Dijkstra argues for strong modular-objective coupling at the intra-procedural level, we have arguably seen a trend in programming language evolution that moves away from strong modular-objective coupling at the coarser level. Examples include the introduction of object-oriented inheritance and dynamic method dispatch [14], later followed by the introduction of inter-type declarations in aspect-oriented programming [25].
The pointcut model of aspect-oriented programming even contributes to loose modular-objective coupling at the sub-method level, in contrast to Dijkstra’s arguments for strong intra-procedural modular-objective coupling.

Our experimental results support the apparent trend toward reduced modular-objective coupling. At the same time, we believe our results do not inherently contradict Dijkstra’s view, considering that we mainly investigate coupling at the coarse-grained modular level, not at the sub-method granularity.

2.2  Programming Systems with Loose Modular-Objective Coupling

Applying the terminology and concepts from Section 1.1, Java has particularly strong modular-objective coupling because the member declarations for a class must occur within the context of a single class declaration. On the other hand, many programming systems can be considered to have loose modular-objective coupling. We first describe these programming systems, and then explain the role of our research in relation to these languages.

2.2.1  Hyper/J and Multidimensional Separation of Concerns

In order to validate our central claim in this dissertation, we develop a programming system we call SubjectJ. The design of SubjectJ was inspired by work on subject-oriented programming [19, 34, 35] and multi-dimensional separation of concerns [40] by Ossher et al.

Hyper/J is a Java-compatible implementation of hyperspaces [39], based on the idea of multi-dimensional separation of concerns. In Hyper/J, software units (e.g. individual members of a class) can belong to more than one dimension of concern; each type of concern is represented as a dimension in the global hyperspace. Software units are represented as points within the hyperspace, thus mapping each unit to exactly one concern from each dimension. The “class” dimension is simply a default dimension with concerns corresponding to Java classes.
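The idea of units as points in a hyperspace can be pictured as a small data structure. The sketch below is only a model of the mapping described above, not Hyper/J’s API or concern-mapping syntax: the class HyperspaceSketch, its methods, the “feature” dimension and its “gui” and “model” concerns are all hypothetical, and the unit names are borrowed from the Counter example in Section 1.2.1.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a hyperspace: each software unit maps to exactly
// one concern per dimension. This is not Hyper/J's actual interface.
class HyperspaceSketch {
    // unit name -> (dimension name -> concern name)
    private final Map<String, Map<String, String>> points = new LinkedHashMap<>();

    // Placing a unit twice in the same dimension overwrites the earlier
    // concern, so a unit never has two concerns in one dimension.
    void place(String unit, String dimension, String concern) {
        points.computeIfAbsent(unit, u -> new LinkedHashMap<>()).put(dimension, concern);
    }

    // All units that belong to the given concern in the given dimension,
    // in the order in which the units were first placed.
    List<String> unitsIn(String dimension, String concern) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : points.entrySet()) {
            if (concern.equals(entry.getValue().get(dimension))) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    // Sample data: members of the Counter class, classified along the
    // default "class" dimension and an assumed "feature" dimension.
    static HyperspaceSketch counterExample() {
        HyperspaceSketch space = new HyperspaceSketch();
        String[][] units = {
            {"Counter.count", "model"},
            {"Counter.button", "gui"},
            {"Counter.label", "gui"},
            {"Counter.increment()", "model"},
        };
        for (String[] unit : units) {
            space.place(unit[0], "class", "Counter");
            space.place(unit[0], "feature", unit[1]);
        }
        return space;
    }

    public static void main(String[] args) {
        HyperspaceSketch space = counterExample();
        System.out.println(space.unitsIn("class", "Counter"));
        System.out.println(space.unitsIn("feature", "gui"));
    }
}
```

Selecting along the “class” dimension returns all four members, while selecting along the assumed “feature” dimension groups the same units by a different criterion without moving any of them.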
A set of software units pertaining to a particular concern in a dimension can be made into a declaratively complete hyperslice. A hyperslice is a set of software units that includes the abstract declarations of any other units required. Thus, each hyperslice explicitly declares the dependencies it has on other software units. Hyperslices can be composed to form partial or complete systems, based on various composition rules. Dependencies of a hyperslice have an opportunity to be satisfied via composition.

SubjectJ can be considered a simple variant of Hyper/J, supporting only two dimensions of concern (“class” and “subject”) and a single built-in composition rule. However, SubjectJ modules encapsulate implementation details through explicitly defined interfaces. In contrast, Hyper/J hyperslices do not distinguish between exported and non-exported declarations, and thus cannot truly encapsulate implementation details by hiding declarations from other hyperslices.

2.2.2  Other Programming Systems

The main difference between SubjectJ and Java is that SubjectJ allows declarations that belong to a single class to be split across multiple source files. There are many other object-oriented programming systems that provide similar functionality.

For example, C# [20] supports partial classes. A partial class is essentially an incomplete class definition, which is merged prior to compilation with other partial classes defining the same class. Together, the partial classes of a class must form a complete class definition; partial classes cannot be separately compiled. A partial class is not a module, in the sense that it does not encapsulate its contents and have a well-defined interface to other code. Partial classes are typically used under specific circumstances, such as when adding code to automatically generated source files [20]. Similarly, open classes allow Ruby [11] class members to be declared outside of a class.
Multimethod and open languages (e.g. [5, 7]) also decouple the declaration of methods from the declaration of classes. For example, MultiJava [6] supports both open classes and multimethods. Class members can be added to existing classes, even without access to the source code of the existing classes. Multimethods allow for dynamic dispatch on arguments in addition to the receiver object. Together with open classes, multimethods encourage grouping of methods according to criteria other than strictly by receiver object type.

Other examples are systems which provide mixin layers as units of modularity that crosscut classes [3, 37]. A mixin layer module implements a particular software feature. Difference-based modules in MixJuice [22] are based on mixin layers. Each difference-based module implements a particular software feature by extending other difference-based modules. Each module can either define additional classes, interfaces, or members that do not exist in the extended modules, or can add additional code to existing classes, interfaces, or members defined in the extended modules.

Aspect-oriented programming languages (e.g. AspectJ [25], JBoss AOP [2], and AspectC++ [38]) provide inter-type declarations, pointcuts, and advice. These features make it easier to move code out of classes and into aspects. Inter-type declarations allow complete member declarations to be moved out of classes and into aspects, reducing modular-objective coupling at the granularity of member declarations, similar to SubjectJ. Pointcuts allow for finer-grained selection of points in execution (e.g. access of a particular field), and code to be executed at these points can be modularized into aspects that are attached to the pointcuts. In this sense, pointcuts and advice extend the reduction of modular-objective coupling to sub-method granularity.
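To make the multimethod idea mentioned earlier in this section concrete, the following sketch contrasts it with plain Java, which selects among overloads at compile time from the static types of the arguments. This is not MultiJava syntax: DispatchSketch, overlap() and overlapDynamic() are hypothetical names, and overlapDynamic() is a hand-written emulation of dispatch on the runtime types of both arguments.

```java
// Sketch: static overload resolution in Java versus an emulation of
// multimethod-style dispatch on the runtime types of the arguments.
class DispatchSketch {
    static class Shape {}
    static class Circle extends Shape {}

    // Overloads: the compiler picks one using the static argument types.
    static String overlap(Shape a, Shape b) {
        return "shape/shape";
    }

    static String overlap(Circle a, Circle b) {
        return "circle/circle";
    }

    // Emulated multiple dispatch: inspect the runtime type of both
    // arguments and forward to the most specific overload by hand.
    static String overlapDynamic(Shape a, Shape b) {
        if (a instanceof Circle && b instanceof Circle) {
            return overlap((Circle) a, (Circle) b);
        }
        return overlap(a, b);
    }

    public static void main(String[] args) {
        Shape a = new Circle();
        Shape b = new Circle();
        System.out.println(overlap(a, b));        // prints "shape/shape"
        System.out.println(overlapDynamic(a, b)); // prints "circle/circle"
    }
}
```

A language with built-in multimethods would make the second behaviour the default, which is why methods in such languages need not be grouped under a single receiver class.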
2.2.3  Comparison to Our Research

Recall that SubjectJ itself is not intended to be a novel contribution. In fact, the technologies described above differ in mechanism, but they all allow for easier movement of code without changing objective structure, thus reducing modular-objective coupling. To support our experimental comparison, we did not design SubjectJ to add novel language features, but to be as similar to Java as possible, while removing one of Java’s greatest sources of strong modular-objective coupling. For our experiment, additional language features in SubjectJ would make it harder to attribute differences in remodularization complexity to differences in modular-objective coupling.

We claim that the value of our contribution is in the validation of our central claim: that programming language designs which imply reduced modular-objective coupling reduce remodularization complexity in the language. We believe many language designers intuitively understand this claim. For example, the authors of MixJuice explicitly mention that moving code between MixJuice super-modules and sub-modules retains the semantics of the code [22], and suggest that this could make some refactorings easier. Our contribution is validation of this intuition: we provide tangible evidence that existing language features which reduce modular-objective coupling may reduce accidental complexity for remodularization.

We should note that our experimental results cannot be generalized to all language features that reduce modular-objective coupling. For example, consider the aspect-oriented programming language AspectJ [25], discussed in the previous section. Our results suggest that inter-type declarations substantially reduce remodularization complexity. However, we cannot draw similar conclusions about pointcuts and advice, because these features operate at a finer level of granularity and have no counterpart in SubjectJ.

2.3  Refactoring

This dissertation specifically examines how a programming language with loose modular-objective coupling reduces the complexity of remodularization. As a behaviour-preserving transformation, remodularization can be considered a particular kind of refactoring, where the goal of the transformation is to modularize source code. Thus, our work falls into the broader research area on easing refactoring. We provide a brief overview of this research area, describing how it complements our research.

The term “refactoring” was first introduced by Opdyke [32], where he presented it as an approach to restructuring object-oriented software in a way that is automatic and behaviour-preserving. Johnson and Opdyke present further refactorings and analyze their application in [24] and [33]. Fowler et al. describe and catalogue over 70 refactorings in [12]. Mens et al. [28] provide an extensive survey of research in refactoring, discussing a broad range of issues. The discussions below refer to this survey.

2.3.1  Language Design and Refactoring

Our dissertation is concerned with the impact of language design on the refactorability of programs. The survey by Mens et al. includes a brief summary of refactoring support (e.g. tools, techniques, and formalisms) provided for different programming languages. In comparing the support, the authors observed similar problems to those described in this dissertation. In particular, they comment on how object-oriented principles can introduce accidental complexity to refactorings. For example, Najjar et al. observed that inconsistent uses of the “super” construct in Java (such as embedding conditional statements and declaring anonymous classes within “super” calls) made refactoring of Java class constructors into static “creation methods” difficult to perform consistently [31].

Garrido et al.
mention that specific features of the C language such as “pointers” and “structs” introduce complexity into refactorings [13], and specifically examine the challenges of refactoring preprocessor directives in C source code. They recognize that performing refactorings on preprocessed C code has limitations, thus necessitating refactorings that restructure preprocessing directives in conjunction with standard C code. They examine the consequent difficulties in building automated refactoring tools for C, suggesting that such difficulties are partially responsible for the relatively low level of automated refactoring support in C and C++ (compared to Java).

As with our work, the above research recognizes that difficulties in refactoring may be rooted in programming language properties. However, the above research focusses on refactoring techniques that address such difficulties. In contrast, our work focusses on formalizing and analyzing programming language properties (e.g. modular-objective coupling) that contribute to such difficulties.

2.3.2  External Behaviour Preservation

Much complexity in refactoring comes from “external behaviour” being a subjective notion. By definition, refactorings must preserve external behaviour, but the meaning of external behaviour depends on what behaviour one considers to be observable from a particular point of view. Different techniques have been proposed to deal with this complexity.

Unit Testing

A practical technique for specifying and verifying behaviour preservation adapts unit tests. Pipka [36] and Deursen et al. [8] both acknowledge that it is often not possible to use the same set of unit tests to verify program behaviour both before and after a refactoring; in fact, many structural refactorings are intended to change source code interfaces, thus potentially invalidating unit tests. Deursen et al.
propose “test-first refactoring” as an extension to “test-first design” in agile software development. They introduce the idea of a “refactoring session”, where unit test modification and application refactoring are performed in alternating steps, and unit tests are kept valid after each step. Pipka proposes “Test First Refactoring”, where unit tests are changed before refactorings on application code are performed. The modified unit tests are used to verify behaviour preservation after the refactorings are performed.

In both approaches, the onus of ensuring behaviour preservation is partially transferred from the task of application refactoring to the task of unit test modification. Thus, the issue of language design impact on refactoring remains relevant, as complexities due to modular-objective coupling can affect the ease with which unit tests can be modified.

Avoiding Change

Other techniques for preserving external behaviour are based on identifying program transformations that provide strong guarantees of semantic equivalence, so that both external and internal behaviour are preserved. For example, Bergstein [1] presents a set of primitive class transformations for object-oriented programs that are formally proved to be “object-preserving”, i.e. they do not change runtime objects and their behaviour.

Hürsch et al. [21] present an automated framework for evolving object-oriented systems, based on “change avoidance” techniques in adaptive object-oriented programming [27]. Adaptive object-oriented programming actively prepares for change in class structure, by specifying only the essential method and class constraints in the program text, without specifying a particular class structure that meets these constraints. “Customizations” to different class structures that meet the constraints can be made while avoiding changes to the program text (and thus external behaviour).
Many languages with loose modular-objective coupling essentially apply a form of “change avoidance”, by allowing for changes in modular structure while avoiding changes to the class structure. Also, by avoiding changes to class structure, languages with loose modular-objective coupling effectively increase the number of “object-preserving” transformations that can be performed. We can consider our experimental results to be encouraging for change-avoiding refactoring strategies in general, as our results specifically validate designing languages for loose modular-objective coupling.

2.3.3  Tool Support

Refactoring tools can provide substantial support for achieving remodularization goals. There is a large amount of work directed towards providing automated refactoring support for complex refactorings [4, 16, 26, 30, 41, 42]. For example, Tip presents a set of automatable refactorings based on type constraints in [41]. The refactorings presented by Tip form the basis for many automated refactorings in the Eclipse IDE (the IDE used in our case study experiments), such as “extract interface” and “generalize declared type”.

As described by Tip, the use of type constraints can automate precondition checking and execution of certain refactorings. For any particular program, a set of type constraints can be automatically derived based on subtype relationships. Certain refactorings then become automated transformations from that particular program, to another program that obeys the same type constraints. For instance, one of the many type constraints for a method call to Type.method() is that a method with the same signature as method() must be declared in either Type, or one of the supertypes of Type. When automatically moving methods from a subtype to a supertype, this type constraint (along with several others) is obeyed to determine which types the method can be moved to.
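The single type constraint quoted above can be illustrated with a toy class table. This is only a sketch under stated assumptions: the class TypeConstraintSketch, its methods, and the type names Shape, Point, Circle and draw() are hypothetical and are not taken from Tip's work or from Eclipse. It models just the one constraint mentioned in the text (a called method must be declared in the receiver type or one of its supertypes), not the full constraint system.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy class table for checking one type constraint on method calls (hypothetical model). */
final class TypeConstraintSketch {
    // Direct supertype of each declared type (null for a root type).
    private final Map<String, String> supertypeOf = new HashMap<>();
    // Method signatures declared directly in each type.
    private final Map<String, Set<String>> declaredIn = new HashMap<>();

    void declareType(String type, String supertype, String... methods) {
        supertypeOf.put(type, supertype);
        declaredIn.put(type, new HashSet<>(Arrays.asList(methods)));
    }

    /** True if a call type.method() finds the method in the type or one of its supertypes. */
    boolean callSatisfiesConstraint(String type, String method) {
        for (String t = type; t != null; t = supertypeOf.get(t)) {
            Set<String> declared = declaredIn.get(t);
            if (declared != null && declared.contains(method)) {
                return true;
            }
        }
        return false;
    }

    /** Moves a method declaration between two declared types without any precondition check. */
    void move(String method, String from, String to) {
        declaredIn.get(from).remove(method);
        declaredIn.get(to).add(method);
    }

    public static void main(String[] args) {
        TypeConstraintSketch table = new TypeConstraintSketch();
        table.declareType("Shape", null);
        table.declareType("Point", "Shape", "draw()");
        table.declareType("Circle", "Shape");

        table.move("draw()", "Point", "Shape");   // move to a supertype
        System.out.println(table.callSatisfiesConstraint("Point", "draw()"));

        table.move("draw()", "Shape", "Circle");  // move to an unrelated sibling type
        System.out.println(table.callSatisfiesConstraint("Point", "draw()"));
    }
}
```

In this model, moving a method to a supertype keeps existing calls on the subtype valid, whereas moving it to a sibling type violates the constraint. A tool would use a check of this kind as a precondition when deciding which target types to offer.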
However, we believe that the level of support that can be provided by automated refactoring tools is dependent on the complexity of transformations performed by the tools, and thus indirectly affected by modular-objective coupling. Indeed, we found in our experiment that SubjectJ and Java remodularizations differed not so much in the extent of changes performed, but in the level of reliable tool support for the changes. In particular, the Eclipse IDE does not provide a reliable refactoring for general movement of instance methods between classes, particularly for cases where the type constraint described above would be violated; however, the less complex refactoring of moving static methods is reliably supported. In contrast, moving methods between SubjectJ subjects only affects the static interfaces between subjects, and by definition, preserves all type constraints. Thus, SubjectJ can reliably support this type of transformation through tools that are of comparable complexity to Java tools for moving static methods, or moving instance methods to supertypes.

We believe that developing reliable and sophisticated refactoring tools is complementary to investigation of programming language features that reduce modular-objective coupling. For example, aspect-oriented programming languages which reduce modular-objective coupling create the opportunity to develop new kinds of refactorings, both for remodularizing crosscutting concerns into aspects (e.g. extracting method fragments into pointcuts and advice), and for refactoring of aspects themselves (e.g. replacing inter-type declarations with aspect methods) [29]. Work on developing tool support for such refactorings ([10, 17, 18]) remains relevant, and must consider new challenges. For example, Hannemann et al. describe an interactive approach for refactoring code into aspects that addresses complexities introduced by the aspect-oriented features of AspectJ [17].
2.4  Summary

There have been arguments in the past for strong modular-objective coupling at the sub-method granularity. However, we have arguably seen a trend towards looser modular-objective coupling; examples include the introduction of object-oriented inheritance and dynamic method dispatch. In fact, there already exist many languages that provide loose modular-objective coupling. In this dissertation, we present one such language, SubjectJ, and explore the effects of modular-objective coupling on remodularization complexity. Our contribution is validation of the intuition that these languages reduce remodularization complexity.

Remodularization can be considered a type of refactoring; as such, our work falls into the broader research area concerned with refactoring ease. Existing research in this area is generally complementary to our work; the main issues of relevance are: the effect of language properties on refactoring, preserving external behaviour, and tool support.

Some research recognizes that refactoring difficulties may be rooted in programming language properties, but the research generally focusses on refactoring techniques to address the difficulties present in existing languages. In contrast, our work focusses on analyzing how programming language properties contribute to such difficulties.

Techniques for addressing the problem of preserving external behaviour during refactoring involve adapting unit tests, and avoiding change of program semantics. Adapting unit tests to specify and verify behaviour can help reduce accidental external behaviour changes during refactoring, but complexities in implementing the refactorings themselves remain. Techniques for avoiding change of program semantics are based on identification of program transformations that provide strong guarantees of semantic equivalence, and providing frameworks for applying such transformations.
We can consider our experimental results to support a particular type of change avoidance technique for reducing refactoring complexity: designing languages with loose modular-objective coupling essentially applies the strategy of avoiding change, by allowing for changes to modular structure while avoiding changes to the computational model.

Finally, we believe that developing reliable and sophisticated refactoring tools is complementary to investigation of programming language features that reduce modular-objective coupling. Refactoring tools can provide substantial support for achieving remodularization goals, but we believe that the level of tool support that can be provided is dependent on the complexity of transformations performed by the tools, and thus constrained by strong modular-objective coupling. Even in languages with reduced modular-objective coupling, work on developing tool support for refactoring remains relevant, and must consider challenges introduced by new language features.

Chapter 3  SubjectJ

To validate our central claim (that programming language designs which imply reduced modular-objective coupling reduce remodularization complexity in the language), we first present SubjectJ, a programming language that is explicitly designed to reduce modular-objective coupling, particularly in comparison to Java. As described in Chapter 2, SubjectJ builds heavily on the ideas of subject-oriented programming [19, 35] and multidimensional separation of concerns [40], and is essentially a simple variant of Hyper/J [39]. In this chapter, we describe the design of SubjectJ from the perspective of modular-objective coupling.

3.1  Design Goals

SubjectJ is designed for use in our experimental comparison with Java, which we describe in Chapter 4. The purpose of the experiment is to compare how modular-objective coupling affects complexity of remodularization. The design goals for SubjectJ are derived from its intended use in our experiment.
The primary design goal of SubjectJ is that it should have reduced modular-objective coupling compared to Java. In particular, SubjectJ should provide more support than Java for the separation of source code into modules that have well-defined and explicit interfaces, without changing objective structure. This design goal ensures that Java is compared with a language that has strictly looser modular-objective coupling.

Additionally, there are two pragmatic design goals. The first goal is that any Java program should be a valid SubjectJ program. This design goal allows realistic, existing open source Java code bases to be used with both Java and SubjectJ in our experiment. The second goal is that SubjectJ syntax should be mostly compatible with existing Java tools in the Eclipse 3 IDE. This allows Eclipse tools to be used with both Java and SubjectJ programs, so that SubjectJ and Java are comparable in terms of realistic IDE support. Together, the pragmatic design goals reduce differences between Java and SubjectJ that are beyond modular-objective coupling, making interpretation of our comparison results more straightforward.

3.2  The SubjectJ Programming System

We now describe the design of the SubjectJ programming system. The design is based on the design goals described in Section 3.1.

Recall that the primary design goal of SubjectJ is to provide better support for modularization of source code without changing objective structure. One approach to this goal is to increase the freedom with which parts of source code in a program can be moved around, without changing the described computational model. To use this approach, we need to first answer two important questions:

1. What information is contained in a given piece of source code, and where is the source code that expresses a given piece of information?
2. What constraints apply to moving this information around?
Modelling program source code in terms of “information atoms” allows us to precisely answer these questions. We can then approach the design of SubjectJ language syntax by defining SubjectJ information atoms to both comply with our pragmatic design goals, and reduce constraints on source code movement.

Modelling source code as information atoms also allows us to see that information can be repeated for the purposes of subjective structuring, without changing the model described by the information. Along with a precise understanding of dependencies between information atoms, this allows for the separation of source code into declaratively complete “subject” modules that have well-defined and explicit interfaces. Checking for declarative completeness of subjects, and the actual separation of code into subjects, is accomplished through a set of Eclipse refactoring tools designed to manipulate source code abstractly as information atoms.

3.2.1  Information Atoms

We can view the source code of a program as expressing a set of assertions that describe a computational model. We define a concept that describes how these assertions relate to each other and to the source code:

Definition. An information atom is an indivisible packet of information, consisting of one or more assertions about program structure, that can be mapped to a range of text in the source code where those assertions are expressed, and that has clearly defined static dependencies on other information atoms its assertions depend on.

The above definition only strictly applies to languages with static type systems, such as Java. In languages without static type systems, statically determining dependencies amongst information atoms may not be possible. It is also important to note that the granularity of information atoms is a language design choice.
In other words, an information atom is “indivisible” because the programming language prevents separation of the atom’s assertions from each other (and movement of corresponding source code text to different locations). Further division of the information would not necessarily be meaningless.

Explanatory Example

To illustrate the concept of information atoms, consider the Java program fragments shown in Figure 3.1. The piece of source code in lines 1-2 of Point.java expresses the following four assertions:

• There is a class called Point.
• The Point class is public.
• The Point class extends the Shape class.
• The Point class implements the Movable interface.

Figure 3.1: Fragments of a simple Java program, depicted in terms of information atoms and their dependencies amongst each other.

These particular four assertions are indivisible: in Java, they can only be expressed together. The assertions can also be mapped to a range of source code text (lines 1-2) where they are expressed. Thus, this is an example of a Java information atom. We illustrate information atoms in Figure 3.1 using dashed boxes.

Other examples of information atoms are the individual field, constructor and method declarations. For example, the setDotSize method is declared in an information atom at lines 6-9 of Point.java. It contains assertions such as “The Point class has a method called setDotSize”, “This method has one int parameter”, etc.

Information atoms have a static dependency structure, determined by the dependencies amongst the assertions which compose the atoms. We distinguish between two kinds of dependencies: referential and contextual.

As an example of a referential dependency, consider the information atom at lines 1-2 of Point.java. One of its assertions is that the Point class extends the Shape class. This assertion depends on the existence of Shape, which is asserted in the information atom at line 1 of Shape.java.
There is therefore a referential dependency from the former atom to the latter. We illustrate referential dependencies in Figure 3.1 using solid arrows.

As an example of a contextual dependency, consider the information atom at lines 6-9 of Point.java. One of its assertions is that the Point class has a setDotSize method. However, the fact that the assertion is about the Point class is implied by the context of the declaration, rather than explicitly stated in the atom’s text region. Therefore, we say that the information atom on lines 6-9 has a contextual dependency on the atom at lines 1-2. We illustrate contextual dependencies in Figure 3.1 using dashed arrows.

The concept of information atoms can be used to answer the questions listed in the beginning of this section. Take the source code text at lines 6-9 of Point.java.

What information is contained in this piece of source code? The information contained in this piece of source code is the collection of assertions in the information atom that is mapped to lines 6-9, e.g. “The Point class has a method called setDotSize”. This information atom has dependencies, both contextual and referential, on other information atoms. It has a contextual dependency on the existence of a Point class, and a referential dependency on the existence of the field int dotSize.

Where is the source code that expresses this information? The information atom that asserts the existence of a Point class is mapped to lines 1-2 of Point.java, and the information atom that asserts the existence of int dotSize is mapped to line 4 of Point.java.

What constraints apply to moving the information expressed by lines 6-9 of Point.java? The constraints that apply to moving the information atom at lines 6-9 of Point.java are that contextual and referential dependencies cannot be broken. The atom’s referential dependencies require that int dotSize remains accessible.
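The dependency structure described for the Point example can be written down as a small graph model. The following is a minimal sketch, not the data structure used by the SubjectJ tools: the class InformationAtomSketch and its methods are hypothetical, atoms are identified simply by the file and line ranges quoted in the text, and only the dependencies the text mentions are recorded (the Movable interface atom is left out because its location is not given).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of information atoms and their static dependencies. */
final class InformationAtomSketch {
    enum Kind { REFERENTIAL, CONTEXTUAL }

    // atom id -> (id of the atom it depends on -> kind of dependency)
    private final Map<String, Map<String, Kind>> dependencies = new LinkedHashMap<>();

    void atom(String id) {
        dependencies.putIfAbsent(id, new LinkedHashMap<>());
    }

    void depend(String from, String to, Kind kind) {
        atom(from);
        atom(to);
        dependencies.get(from).put(to, kind);
    }

    /** Every atom that must stay reachable for the given atom, directly or transitively. */
    Set<String> transitiveDependencies(String id) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        if (dependencies.containsKey(id)) {
            work.addAll(dependencies.get(id).keySet());
        }
        while (!work.isEmpty()) {
            String next = work.pop();
            if (seen.add(next)) {
                work.addAll(dependencies.get(next).keySet());
            }
        }
        return seen;
    }

    /** The atoms and dependencies named in the text for Point.java and Shape.java. */
    static InformationAtomSketch pointExample() {
        InformationAtomSketch model = new InformationAtomSketch();
        // Class header of Point refers to Shape in its "extends" clause.
        model.depend("Point.java:1-2", "Shape.java:1", Kind.REFERENTIAL);
        // The field and the method are declared in the context of the Point class.
        model.depend("Point.java:4", "Point.java:1-2", Kind.CONTEXTUAL);
        model.depend("Point.java:6-9", "Point.java:1-2", Kind.CONTEXTUAL);
        // setDotSize refers to the dotSize field.
        model.depend("Point.java:6-9", "Point.java:4", Kind.REFERENTIAL);
        return model;
    }

    public static void main(String[] args) {
        System.out.println(pointExample().transitiveDependencies("Point.java:6-9"));
    }
}
```

Under these assumptions, asking for the transitive dependencies of the setDotSize atom yields the class header of Point, the dotSize field, and the class header of Shape, which is the set of atoms that constrains where the method can be moved.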
In Java, the atom’s contextual dependency on the atom at lines 1-2 of Point.java effectively means that the piece of source code at lines 6-9 cannot be moved outside of Point.java, as the context implying that setDotSize belongs to the Point class can only exist within Point.java.

An important note is that the granularity of information atoms in a language affects the constraints that apply to moving the atoms around. A dependency between any two assertions implies a dependency between the entire information atoms containing those assertions. The more coarse-grained those information atoms are, the more dependencies those atoms have on other information atoms, and thus the more constrained the movement of atoms is.

3.2.2  SubjectJ Information Atoms

To design SubjectJ syntax in terms of information atoms, we need to provide a full definition of SubjectJ information atoms. In particular, we need to define the following:

1. The SubjectJ assertions, and their corresponding syntax.
2. The static dependencies between SubjectJ assertions.
3. The indivisible groups of SubjectJ assertions that make up SubjectJ information atoms.

Our goal is to provide a definition of SubjectJ information atoms that reduces constraints on source code movement, while meeting our pragmatic design goals.

First of all, we define SubjectJ assertions (and their corresponding syntax) to be the same as those in Java. Our first pragmatic goal states that any Java program should be a valid SubjectJ program; this implies that all Java assertions should be SubjectJ assertions. At the same time, our second pragmatic goal states that SubjectJ syntax should be mostly compatible with Java-based IDE tools. We thus refrain from introducing additional SubjectJ assertions, as such assertions would not be supported by Java-based IDE tools.

Since we have defined SubjectJ assertions to be the same as Java assertions, the static dependencies between SubjectJ assertions are the same as those in Java: referential and contextual dependencies are as explained in the previous section.

Finally, we define the groups of SubjectJ assertions that make up SubjectJ information atoms. We want to increase the freedom of source code movement over Java. So, we define each SubjectJ information atom to be either the same as a Java information atom, or a further decomposition of a Java information atom. Defining SubjectJ information atoms to not combine assertions from different Java atoms ensures that any separation between assertions in Java is also possible in SubjectJ. The finer-grained SubjectJ atoms allow for potential separation between SubjectJ assertions that would be inseparable in Java.

To decide on the granularity of SubjectJ information atoms, we began with atoms identical to Java information atoms, and decomposed only those atoms which we observed to most constrain the desired movement of code in preliminary remodularization tasks that we performed. For example, we decided to keep the class header and its “extends” clause as a single information atom, because we observed that we usually moved superclass declarations along with classes. On the other hand, we decided to separate each “implements” clause into individual information atoms, because we observed that entire interfaces would often be irrelevant to many of the modules we wanted to create.

The SubjectJ information atoms are shown in Figure 3.2. The atoms are similar to Java information atoms, with the following exceptions: each element of an “implements” clause corresponds to a separate atom; each element of an “extends” clause for an interface corresponds to a separate atom; constructor/method signatures and their bodies correspond to separate atoms; and field signatures and their initializers correspond to separate atoms.

We now illustrate how the finer-grained SubjectJ information atoms increase freedom of source code movement.
Compare the SubjectJ information atoms depicted in Figure 3.3 to those previously depicted in Figure 3.1 for an identical Java program. Suppose we had a region of code elsewhere that contained a call to Point.setDotSize. This call would depend on the assertion of the method’s signature, corresponding to line 6 of Point.java. In Java, this would imply a referential dependency on the information atom mapped to lines 6-9 of Point.java in Figure 3.1, which would transitively imply dependencies on all the other information atoms depicted in the figure. In SubjectJ however, the same call to Point.setDotSize would imply a referential dependency on the information atom at line 6, which would only transitively imply dependencies on the atoms at the first lines of Point.java and Shape.java. Thus, movement of the call to Point.setDotSize in SubjectJ would be constrained by fewer dependencies on other code than in Java.

Figure 3.2: The granularity of SubjectJ information atoms.

Figure 3.3: Fragments of a simple SubjectJ program, depicted in terms of SubjectJ information atoms and their dependencies amongst each other.

3.2.3  Subjects: Modules in SubjectJ

Part of our primary design goal is for SubjectJ to have source code modules that support more flexible encapsulations of code. We implement such modules as “subjects”, which are collections of information atoms. As with other facets of SubjectJ design, the design of subjects is guided by the design goals of Section 3.1. In particular, a SubjectJ subject should be self-contained, have a well-defined and explicit interface to other subjects, and be mostly compatible with existing Java IDE tools.

In order for a subject to be self-contained, the source code of a subject should be declaratively complete [40], i.e., a subject needs to include at least the declarations of anything it depends on.
In other words, the referential and contextual dependencies of each information atom in a subject must be satisfied within the subject itself. An important insight helps facilitate this: information atoms can be repeated, particularly amongst different subjects, without changing the program that is described. Each contextual and referential dependency in a subject needs only to be satisfied by one copy of an information atom; this copy can be contained within the subject.

Figure 3.4 shows an example of two SubjectJ subjects, derived from the code in the motivating example. In the figure, repeated information atoms are connected using double lines. Notice how the arrows illustrating referential and contextual dependencies do not cross the boundaries between subjects. In particular, observe that repeated class declaration atoms allow member declarations for a class to be in separate subjects, removing one of the main Java limitations discussed in Section 1.2.

The repeated information atoms of a SubjectJ program must remain consistent, and thus form the interfaces between subjects. As per our primary design goal, the repeated atoms that are part of a subject’s interface must be explicitly specified. Since our pragmatic design goals require SubjectJ subjects to be mostly compatible with existing Java IDE tools, we choose to specify the interface for each subject using Java 1.5 annotations. Each repeated, or “shared”, information atom has an annotation attached in close proximity to the source code segment corresponding to the atom. See Table 3.1 for descriptions of these annotations. These annotations are part of SubjectJ syntax, and their meaning as a specification for subject interfaces is only defined for SubjectJ source code. However, they do not pose compatibility problems with Java IDE tools, which can ignore the SubjectJ-specific meanings of the annotations.
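The idea that repeated atoms form the interfaces between subjects can be sketched by treating each subject as a set of atom identifiers. This is an illustrative model only: the class SubjectInterfaceSketch, the method sharedAtoms, and the atom names are hypothetical, and the example uses just a subset of the Counter atoms discussed for Figure 3.4.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: the interface between subjects is the set of repeated atoms. */
final class SubjectInterfaceSketch {

    /** Returns the atoms that occur in more than one subject. */
    static Set<String> sharedAtoms(Map<String, Set<String>> atomsBySubject) {
        Map<String, Integer> occurrences = new TreeMap<>();
        for (Set<String> atoms : atomsBySubject.values()) {
            for (String atom : atoms) {
                occurrences.merge(atom, 1, Integer::sum);
            }
        }
        Set<String> shared = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : occurrences.entrySet()) {
            if (entry.getValue() > 1) {
                shared.add(entry.getKey());
            }
        }
        return shared;
    }

    /** An assumed, partial assignment of Counter atoms to two subjects. */
    static Map<String, Set<String>> counterExample() {
        Map<String, Set<String>> subjects = new LinkedHashMap<>();
        subjects.put("GUI", Set.of(
                "Counter header", "Counter.increment() signature", "Counter.button"));
        subjects.put("OTHER", Set.of(
                "Counter header", "Counter.increment() signature",
                "Counter.increment() body", "Counter.count"));
        return subjects;
    }

    public static void main(String[] args) {
        System.out.println(sharedAtoms(counterExample()));
    }
}
```

In this model the class header and the increment() signature are the repeated atoms, so they are the ones that would have to be marked explicitly with the annotations of Table 3.1, while atoms that occur in a single subject stay private to it.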
Note that the interface of a subject is independent from Java’s public, private, and protected modifiers, and is scoped relative to subjects rather than classes. For example, the Counter.increment() method signature is repeated amongst the GUI and OTHER subjects in Figure 3.4, and thus part of each subject’s interface. This is specified using an @Import annotation in GUI, and an @Export annotation in OTHER. Note that even though the @Export annotation is attached to the entire method declaration and body of Counter.increment(), only the method signature is part of the interface. Both subjects need to agree on the semantics of the Counter.increment() method signature (e.g. the “pre” and “post” conditions of the method). On the other hand, the Counter.button field in GUI is not part of the subject’s interface, and thus a 26  Chapter 3. SubjectJ  Figure 3.4: SubjectJ subjects, with information atoms depicted. Repeated atoms are connected with double lines, referential dependencies are shown with solid arrows, and contextual dependencies are shown with dashed arrows.  Counter.button field could be introduced in the OTHER subject with different semantics from those of Counter.button in the GUI subject.  Each subject itself is represented as an Eclipse Java project. Only the SubjectJ programming system defines a set of Eclipse projects as subjects of a single SubjectJ program, but the Eclipse Java IDE tools still provide useful support for working within each subject. The simple example in Figure 3.4 illustrates two Java projects with one .java file each, but in general, a SubjectJ subject can contain multiple .java files in multiple Java packages.  3.2.4  Refactoring Tools  Checking for declarative completeness of subjects, separating code into subjects, and merging subjects is supported through a set of Eclipse refactoring tools. By modelling source code as information atoms, these tools system27  Chapter 3. 
atically analyze and make changes to modular structure, while preserving objective structure. There are three SubjectJ-specific refactoring tools: Compose, Decompose, and Checker. Figure 3.5 illustrates how the SubjectJ refactoring tools interrelate with the Java development tools in Eclipse. We describe each SubjectJ tool in more detail below.

Annotation  Attached to                 Meaning
@Export     Field, Method, Constructor  The signature atom of this declaration can be shared by other subjects.
@Import     Field, Method, Constructor  The signature atom of this declaration is shared with another subject
                                        which exports it. This annotation should only be attached to declarations
                                        that do not have a body atom or field initializer atom.
@Shared     Class                       The class header atom can be shared by other subjects. “Implements” clause
                                        atoms are shared automatically if the corresponding interface is shared.
@Shared     Interface                   The interface header can be shared by other subjects. “Extends” clause
                                        atoms are shared automatically if the corresponding interface is shared.

Table 3.1: Java 1.5 annotations specifying the interfaces of subjects.

Figure 3.5: Overview of the SubjectJ refactoring tools.

The Compose Tool

The Compose tool composes a set of SubjectJ subjects into a single Eclipse Java project. As shown in Figure 3.5, the resulting Java program can be compiled and run using the standard JDT compiler. The Compose tool ensures that the information expressed by the source code is not changed; only the structuring of the information is changed.

The Compose tool accomplishes this by respecting subject interfaces during composition. In most cases, this consists of merging copies of information atoms according to @Export, @Import, and @Shared annotations. In other cases, the Compose tool needs to ensure that unique information atoms remain separate after composition, to avoid losing the information expressed by the atoms.

For example, suppose that the UI and OTHER subjects in Figure 3.4 were modified by independent developers. Suppose each developer added a helper method to the Counter class, and that these methods happened to have the same signature. Without appropriate @Import and @Export declarations, the method signature atoms for these methods would not be copies of each other, according to SubjectJ semantics. Incorrectly merging the signature atoms into a single atom would result in a loss of information. The current implementation of the Compose tool checks for such naming conflicts and issues error messages, requiring developer intervention. A more sophisticated implementation could resolve some conflicts by automatic renaming.

The Decompose Tool

The Decompose tool is the inverse of the Compose tool. It is designed to automate the process of copying composed code into new Eclipse Java projects, and inserting annotations into the decomposed source code describing each subject’s interface.

The Decompose tool requires its input files to be annotated with the “tracking” annotations shown in Table 3.2. These annotations are produced by the Compose tool during composition, and specify, relatively concisely, the subject each information atom belongs to and the interface of each subject. Note that when a declaration does not have a @Subject annotation attached, it is treated as having a @Subject("OTHER") annotation.

Attached to                 Meaning

Annotation: @Subject(S1, ...)
Class                       The class header atom (includes “extends” clause but not “implements” clauses)
                            belongs to subjects S1, ...
Interface                   The interface header atom (doesn’t include “extends” clauses) belongs to
                            subjects S1, ...

Annotation: @Subject(S)
Field                       The field (signature plus optional initializer atom) belongs to subject S.
Method, Constructor         The signature and optional body atom belong to subject S.

Annotation: @Export(S1, ...)
Field, Method, Constructor  The declaration’s signature atom belongs to subjects S1, ...

Annotation: @Implement(@Mapping(key=S1, values = I11, I12, ...), ...)
Class                       For each subject Si, only the “implements” clauses for interfaces Ii1, Ii2, ...
                            belong to Si. For subjects not explicitly mapped, implicitly include all
                            “implements” clause atoms.
Interface                   For each subject Si, only the “extends” clauses for interfaces Ii1, Ii2, ...
                            belong to Si. For subjects not explicitly mapped, implicitly include all
                            “extends” clause atoms.

Table 3.2: “Tracking” annotations that allow the Decompose tool to decompose composed programs into subjects.

As with the Compose tool, the Decompose tool ensures that only the structure, but not the content, of the information expressed by the source code is changed. Given a Java project of composed code, the Decompose tool iterates through all information atoms in the code, copying each atom to one or more subjects, according to the tracking annotations in the code. No information is lost or created (only copied) during this process.

For example, consider the composed program in Figure 3.6. The @Export("OTHER") and @Subject("UI") tracking annotations attached to Counter.updateDisplay() specify that the signature atom of Counter.updateDisplay() should be copied to subjects UI and OTHER, while the body atom of Counter.updateDisplay() should be copied to subject UI. Furthermore, the tracking annotations imply that Counter.updateDisplay() should be annotated with @Export in subject UI, and @Import in subject OTHER. Running the Decompose tool on this program results in the UI and OTHER subjects previously presented in Figure 3.4.
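The copying rule described for Counter.updateDisplay() can be sketched in plain Java. This is an illustrative model only, not the Decompose tool's implementation: the class DecomposeSketch, the method route, and the string descriptions of atoms are all assumptions made for this sketch. It models a single method declaration that carries an optional @Subject(S) tracking annotation and an optional @Export(S1, ...) tracking annotation, and reports what each subject would receive.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of how tracking annotations route one method's atoms to subjects. */
class DecomposeSketch {

    /**
     * @param owner      subject named by @Subject, or null when the annotation is absent
     * @param exportedTo subjects named by @Export (empty when the annotation is absent)
     * @return for each subject, a description of what is copied into it
     */
    static Map<String, String> route(String owner, List<String> exportedTo) {
        // A declaration without @Subject is treated as @Subject("OTHER").
        String home = (owner == null) ? "OTHER" : owner;

        Map<String, String> copies = new LinkedHashMap<>();
        // The owning subject receives the signature and the body; it exports
        // the signature if any other subject also needs it.
        copies.put(home, exportedTo.isEmpty() ? "signature+body" : "signature+body @Export");
        // Every other listed subject receives only the signature, marked @Import.
        for (String subject : exportedTo) {
            if (!subject.equals(home)) {
                copies.put(subject, "signature @Import");
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        // updateDisplay(): @Subject("UI") @Export("OTHER")
        System.out.println(route("UI", Arrays.asList("OTHER")));
        // getCount(): @Export("UI") only, so it is owned by OTHER
        System.out.println(route(null, Arrays.asList("UI")));
        // Counter(): @Subject("UI") only
        System.out.println(route("UI", Collections.<String>emptyList()));
    }
}
```

Fields and class headers follow analogous rules in Table 3.2 but are left out here to keep the sketch short.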
    @Subject({"UI","OTHER"})
    public class Counter extends JPanel {
        private int count = 0;
        @Subject("UI") private JButton button;
        @Subject("UI") private JLabel label;

        @Subject("UI")
        public Counter() {
            label = new JLabel(""+getCount());
            add(label);
            button = new JButton("Increment");
            add(button);
            button.addActionListener(
                new ActionListener() {
                    public void actionPerformed(ActionEvent e) {
                        increment();
                    }
                });
        }

        @Export("UI")
        private int getCount() { return count; }

        @Export("UI")
        public void increment() {
            count++;
            updateDisplay();
        }

        @Subject("UI") @Export("OTHER")
        private void updateDisplay() {
            label.setText(""+getCount());
        }
    }

Figure 3.6: Composed SubjectJ program with “Tracking” annotations.

The Checker Tool

The Checker tool can be run on SubjectJ files with tracking annotations, to check proposed SubjectJ subjects for declarative completeness, before decomposition is performed to create the subjects. It can also optionally insert tracking annotations to include the necessary information atoms for declarative completeness in the proposed subjects.

Given source code with tracking annotations, the Checker tool performs the following:

1. Create all information atoms to model the source code, labelling each atom with the subjects it belongs to. By using static binding information provided by the Eclipse JDT parser (http://help.eclipse.org/help32/index.jsp, verified April 2008), each information atom is created with links to all other information atoms needed to satisfy its contextual and referential dependencies.
2. For each information atom, compare the subjects that the atom is labelled with, to the subjects that each dependency is labelled with. Label a dependency atom with a “missing subject” if the dependent is labelled with the subject, but the dependency atom is not.
3. Optionally (if desired by user), modify tracking annotations in original source to include “missing subjects”.
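The labelling pass in Steps 2 and 3 can be sketched as a small graph computation. The sketch below is an assumption-laden model, not the Checker tool's code: the class CheckerSketch, its methods, and the three atom names (a simplified slice of the Counter example) are hypothetical, and real atoms are built from JDT binding information rather than registered by hand. One call to findMissing corresponds to Step 2; applying its result corresponds to Step 3; the surrounding loop simply reruns the pass until nothing is missing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of the Checker's "missing subject" labelling. */
class CheckerSketch {
    /** Atom name -> subjects the atom is labelled with. */
    final Map<String, Set<String>> labels = new LinkedHashMap<>();
    /** Atom name -> atoms it depends on (every dependency must also be registered). */
    final Map<String, List<String>> deps = new LinkedHashMap<>();

    void atom(String name, List<String> subjects, String... dependsOn) {
        labels.put(name, new TreeSet<>(subjects));
        deps.put(name, Arrays.asList(dependsOn));
    }

    /** Step 2: find subjects that a dependent has but its direct dependency lacks. */
    Map<String, Set<String>> findMissing() {
        Map<String, Set<String>> missing = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : deps.entrySet()) {
            Set<String> dependentSubjects = labels.get(entry.getKey());
            for (String dependency : entry.getValue()) {
                for (String subject : dependentSubjects) {
                    if (!labels.get(dependency).contains(subject)) {
                        missing.computeIfAbsent(dependency, k -> new TreeSet<>()).add(subject);
                    }
                }
            }
        }
        return missing;
    }

    /** Step 3, repeated: add missing subjects until a pass finds none. Returns the number of passes that changed labels. */
    int completeLabels() {
        int passes = 0;
        Map<String, Set<String>> missing = findMissing();
        while (!missing.isEmpty()) {
            for (Map.Entry<String, Set<String>> entry : missing.entrySet()) {
                labels.get(entry.getKey()).addAll(entry.getValue());
            }
            passes++;
            missing = findMissing();
        }
        return passes;
    }

    public static void main(String[] args) {
        CheckerSketch checker = new CheckerSketch();
        checker.atom("Counter.header", Arrays.asList("OTHER"));
        checker.atom("getCount.signature", Arrays.asList("OTHER"), "Counter.header");
        checker.atom("updateDisplay.body", Arrays.asList("UI"), "getCount.signature");
        System.out.println("passes: " + checker.completeLabels());
        System.out.println(checker.labels);
    }
}
```

In this three-atom chain the first pass only labels getCount.signature with UI, because each pass looks at direct dependencies; a second pass is needed before Counter.header also receives the UI label.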
For example, if a method signature atom is labelled with a “missing subject”, then that subject would be included in an @Export annotation attached to the method declaration.

We can see that when tracking annotations are automatically modified in Step 3, only direct (not transitive) missing dependencies are satisfied. Typically, the above procedure is repeated until no missing subject labellings are found, to ensure declarative completeness of all proposed subjects.

For example, suppose the developer only marked each of the following in Counter.java with a @Subject("UI") tracking annotation: the constructor, the label field, the button field, and the updateDisplay method. Running the Checker tool would infer missing tracking annotations required for declarative completeness of both subjects. The code after the missing annotations are added is shown in Figure 3.6.

3.2.5  Prototype Implementation

The SubjectJ Decompose, Compose and Checker tools are implemented as a single Eclipse plugin. Its implementation consists of approximately 4000 lines of code, of which approximately 300 lines implement a rudimentary SWT UI that lets the user launch the SubjectJ tools from within Eclipse.

We note some performance details for our prototype implementation of the SubjectJ programming system, as our case study experiment will involve time measurements. The running times of the refactoring tools are approximately linear in the size of the code base to which the tools are applied. Using a PC with a 2.13 GHz Intel Core 2 processor and 1GB of memory, we ran the tools on the largest code base in our case studies, having 70833 lines of code (see Section 4.1). The Decompose and Compose tools each required approximately 1 minute to run, while the Checker tool required approximately 2 minutes to run (for each iteration).

3.3  Remodularizing Java Code in SubjectJ

We now describe how remodularization of source code in SubjectJ can be performed.
Suppose we wanted to move a particular method m in some SubjectJ program P from subject S to subject T. The following general process may be used:

1. Produce CP, the composed version of P, with the Compose tool.
2. Find the declaration of m in CP.
3. Edit the tracking annotations attached to m, replacing S with T.
4. Run the Checker tool to infer and insert missing annotations. This has the effect of updating the interface of each subject as required.
5. Produce the refactored version of P by running the Decompose tool.

Remodularizations can involve variations of the above situation. The typical SubjectJ use case in our case study experiments, remodularizing an existing Java code base into two subjects, is one variant.

As an example, suppose we wanted to remodularize the Java program from Figure 1.3 to separate all UI-related code into its own UI subject, and keep the remaining code in the OTHER subject. To accomplish this, we first skip Step 1, as we already have a complete Java program. We then find UI-related code and insert tracking annotations as in Steps 2 and 3, placing a @Subject("UI") annotation on each of the constructor, label field, button field, and updateDisplay method. Recall that all code not explicitly annotated is treated as belonging to the OTHER subject. To make this code a valid decomposition, we need more annotations to describe the interface between the UI and OTHER subjects. Following Step 4, we run the Checker tool to infer the needed annotations, producing the marked up code shown in Figure 3.6. With Step 5, running the Decompose tool on this marked up code produces the remodularized program shown in Figure 3.4.

3.4  Summary

In this chapter, we presented SubjectJ, a programming language explicitly designed for reduced modular-objective coupling compared to Java. SubjectJ’s design goals are derived from its use in our case study experiment.
Its main design goal is to provide more support than Java for the separation of source code into modules (with well-defined and explicit interfaces), without changing objective structure. Two pragmatic design goals, backwards compatibility with Java programs and compatibility of SubjectJ syntax with existing Java tools, allow for more straightforward interpretation of experimental results.

To meet the design goals, SubjectJ is designed around the concept of information atoms. An information atom is an indivisible packet of information mapped to a source code fragment, with clearly defined static dependencies on other information atoms. The main idea behind SubjectJ is to allow breaking of any Java program into individual information atoms, and rearrangement of those atoms into an alternative modular structure consisting of “subjects”. Each SubjectJ subject encapsulates a declaratively complete set of information atoms, and has a well-defined interface specified by Java 1.5 annotations embedded in its source code.

Three tools in SubjectJ manipulate information atoms: Compose, which merges SubjectJ subjects into a single Java program; Decompose, which performs the inverse operation of Compose; and Checker, which infers (and optionally inserts) annotations missing for declarative completeness of subjects. These tools support the typical SubjectJ use case in our case study experiments: remodularizing an existing Java code base into two subjects.

Chapter 4

Experiment

We proceed with validating our central claim: that programming language designs which imply reduced modular-objective coupling reduce remodularization complexity in the language. Having presented SubjectJ, a programming language designed for looser modular-objective coupling than Java, we now describe an experiment comparing remodularizations in Java and SubjectJ, and present quantitative and qualitative results suggesting that remodularizations are less complex in SubjectJ.
To compare remodularization effort between Java and SubjectJ, we performed a series of 8 case studies on 8 different open source Java software packages. Each case study involved performing a remodularization task twice: once in Java and once in SubjectJ. We then compared the results to each other. To quantify effort, we measured the time taken to perform each remodularization. Our qualitative analysis was focused on finding anecdotal evidence of accidental complexity.

We first present our experimental setup in Section 4.1. Then we present the quantitative results in Section 4.2, followed by the results of the anecdotal analysis in Section 4.3. Conclusions of the experiment are summarized in Section 4.4.

4.1  Experimental Setup

For each of the 8 case studies in the experiment, we selected a different open source Java code base. Table 4.1 has an overview of the selected code bases. We also defined, for each case study, a remodularization task requiring the identification and modularization of a certain subset of the existing code base. The remodularization tasks were characterized by the high-level description shown under “Code to Separate” in Table 4.1. We defined an acceptable remodularization to be the creation of either a Java package or a SubjectJ subject containing all and only the identified source code (with any required interface code).

Application and Description†                         LOC    Code to Separate
Tetris: game of Tetris                               1036   GUI handling.
  http://cslibrary.stanford.edu/112/
TyRuBa: logic programming language                   22116  Storing and persisting “facts” used by the language.
  http://tyruba.sourceforge.net/
JHotDraw: simple drawing application                 14611  All functionality for creating and modifying text figures.
  http://www.jhotdraw.org/
Chinese Chess: game of Chinese chess                 3073   Logic for AI opponent.
  https://chinese-chess-xiang-qi.dev.java.net/
MineRay: game of minesweeper                         3478   Logic for populating map with mines.
  https://mineray.dev.java.net/
DrawSWF: simple drawing and animation application    7540   All functionality for creating and modifying text figures.
  http://drawswf.sourceforge.net/
FindBugs: Java source code bug finder                70833  Saving of bug analysis results.
  http://findbugs.sourceforge.net/
JChessBoard: game of chess                           6190   GUI handling.
  http://jchessboard.sourceforge.net/
† All website references verified February 2008.

Table 4.1: Overview of selected code bases and corresponding refactoring tasks for case studies. LOC is the total number of non-blank and non-comment lines in the code base.

The 8 case studies were performed by the author of this dissertation (whom we call “the programmer”), in the order in which they are shown in Table 4.1. With the exception of the JHotDraw code base, he was unfamiliar with the selected code bases prior to performing the experiment.

For each case study, the remodularization was performed twice consecutively, once using Java and once using SubjectJ. For the first 4 case studies, the SubjectJ remodularization was performed first; for the last 4, the Java remodularization was performed first. The programmer was not allowed access to any data or results produced during the first remodularization process while performing the second.

The programmer was otherwise allowed to use all available tools, including automated Java refactoring tools in an installation of Eclipse 3.2.2. Since Eclipse tools are not aware of SubjectJ annotation semantics, we provided two additional resources for comparable browsing and refactoring support between SubjectJ and Java. First, we installed version 3.1.13 of the JQuery [23] plugin into Eclipse, and customized the plugin to provide rudimentary support for browsing Java elements marked with @Subject and @Export tracking annotations. Second, we allowed the SubjectJ tools to be used as described in Section 3.3.
This provides functionality for moving members to and from subjects, analogous to the Eclipse “Move Method” refactoring for Java.

For each remodularization, we measured the total time needed to complete the task, made detailed notes of the steps performed during the remodularization process, and saved a copy of the code base upon completion of the task. The time data is presented and analyzed in Section 4.2. The notes and saved code were used as the basis for the more qualitative results presented in Section 4.3.

4.2  Quantitative Results

Table 4.2 provides an overview of the time data, listed in the order the case studies were performed. The “SubjectJ” and “Java” columns show the time taken to perform the refactoring task in the respective language; the times shown do not include any intermediate breaks taken by the programmer. The “Difference” column shows the difference between SubjectJ and Java times, given as a percentage of the SubjectJ time. A positive difference indicates that the Java refactoring took more time. A negative difference indicates that the SubjectJ refactoring took more time.

The table is divided in two sections based on which remodularization was performed first, implying a different learning bias caused by performing the same task twice. Note however, that the SubjectJ time is always placed in the first column. Similarly, the difference column is always computed relative to the SubjectJ time. This is not to suggest both sections should be interpreted in the same way, but to facilitate contrasting the numbers in both sections to each other.

The top half of the table shows a positive trend in time differences: two cases yield mildly positive differences, one case a strong positive difference, and one case a mildly negative difference. These time differences are expected to be biased negatively, due to performing the SubjectJ remodularization before the Java remodularization for each of these case studies.
Considering this negative bias, we believe the nonetheless positive trend suggests to some extent that remodularizations take less time to perform in SubjectJ than in Java.

Code Base        SubjectJ (hours)   Java (hours)   Difference
SubjectJ refactoring performed first (total hours = 63.4)
Tetris           3.0                3.5            +17%
TyRuBa           18.0               20.3           +13%
JHotDraw         4.2                4.0            -5%
Chinese Chess    3.8                6.6            +74%
Sum(1)           29.0               34.4           +19%
Java refactoring performed first (total hours = 66.9)
MineRay          0.7                2.3            +252%
DrawSWF          2.0                5.8            +183%
FindBugs         9.9                30.7           +210%
JChessBoard      3.0                12.5           +325%
Sum(2)           15.6               51.3           +229%
Aggregated results (total hours = 130.3)
Sum(1)+Sum(2)    44.6               85.7           +92%

Table 4.2: Overview of quantitative case study results.

The time differences in the bottom half of the table are expected to be biased positively, so a positive trend in this half is not surprising. However, the differences are consistently and significantly greater than in the top half of the table. This suggests that the learning advantage for the second iteration of a task is substantial, and that the negative bias on the top half of the table is considerable. Interpreting the positive trend in the top half of the table under a strong negative bias suggests that remodularizations take considerably less time to perform in SubjectJ than in Java.

The two halves of the table have different biases, but time spent remodularizing was almost equal in the two halves (63.4 hours vs. 66.9 hours). So, we can compute an aggregate score that is not particularly weighted towards either half. Combining the 8 case studies, a total of 44.6 hours were spent using SubjectJ, and 85.7 hours were spent using Java. Thus, over the entire course of our experiment, about 92% more time was spent remodularizing in Java than in SubjectJ.

However, this score does not allow us to draw a general conclusion that SubjectJ cuts remodularization time in half.
First, in light of our qualitative results (see Section 4.3), we believe that there is a stronger learning bias when the Java remodularization is performed first, as we observed that remodularizing in Java requires a more in-depth understanding of the code base. We also believe that the programmer may have become more fluent in performing case studies during the course of the experiment, as he became more familiar with remodularization techniques (e.g. “separation strategies” for Java).

Second, the code bases in the bottom half of the table have more total lines of code than the code bases in the top half. It is possible this could introduce bias, but at least for our experiment, we do not see a significant correlation between lines of code and percentage differences in remodularization time. For instance, the 4 differences in the bottom half of the table range from 183% to 325%, but the case studies in this half with the largest code base (FindBugs, with 70833 LOC) and smallest code base (MineRay, with 3478 LOC) have differences that fall inside this range, at 210% and 252% respectively.

In summary, we believe our aggregate score of 92% should not be interpreted as anything more precise than an indication of a substantial effect. In any case, even if we consider only the first four case studies, where performing remodularizations in SubjectJ before Java negatively biased the time differences, we still see a positive 19% trend.

4.3  Qualitative Results

After performing each case study, we analyzed our notes and the resulting code bases. The main focus of this analysis was to find anecdotal evidence explaining why performing the same remodularization in Java was more complex than in SubjectJ.

Overall, the Java remodularization process seemed cognitively harder. In both cases, the programmer needed to explore the code to find sections relevant to the concern of interest.
However, in SubjectJ the general remodularization process itself was centered almost exclusively around this type of activity. This was facilitated by the use of SubjectJ annotations to mark code of interest, and tracking static dependencies using the Checker tool. In contrast, in standard Java, the programmer needed to not just identify code of interest, but also decide on a strategy for changing objective structure to allow for separation of the code into its own Java package.

From here on, we will use the term separation strategy to refer to any strategy for changing objective structure solely to allow code separation. The listener infrastructure from our motivating example is one such strategy. Four additional separation strategies were used in the case studies.

Separation strategies added complexity to the Java remodularization process. The complexity manifested itself in multiple ways: difficulties deciding which strategy to apply; complex transformations not well supported by available automated refactoring tools; and introduction of bugs.

We will begin by describing the different separation strategies used in the case studies. Then we will discuss the above problems related to using separation strategies in more detail, illustrating each problem with anecdotes from our case studies.

4.3.1  Separation Strategies

Over the course of the 8 case studies, changes to objective program structure were frequently needed to achieve the desired code separation. The need to change objective structure was much more prevalent in the Java remodularizations than in the SubjectJ remodularizations; this was partially reflected in the number of different separation strategies used. In the SubjectJ refactorings, only the “split method” strategy was used. In the Java refactorings, four strategies in addition to “split method” were used: “listener”, “dual object”, “static method”, and “subclassing”.
Split Method

In both Java and SubjectJ remodularizations, the “split method” strategy was used when a method contained both code that was related to a concern of interest, and code that was not. Essentially, the “split method” strategy extracts the code of interest into a new method. In the original method, the extracted code is replaced with a call to the new method, with variables required by the extracted code passed as method arguments. This strategy is implemented in the Eclipse IDE as the “Extract Method” refactoring.

Listener

The “listener” separation strategy involves applying the listener infrastructure, as described in the illustrative example of Section 1.2.1. The strategy separates code of interest into one or more listener classes, creating new corresponding objects at runtime. Calls to code in the listener are turned into update notifications to listener objects. The strategy was used once, in the Java refactoring of the Tetris case study.

Dual Object

One strategy, which we call “dual object”, is generally applicable and was used in every case study except the first. It splits objects of a given class into two objects: one object containing the methods and fields we want to separate, and one containing the rest. For example, the code in Figure 4.1 is the result of applying dual object to separate the writeXML method from a class called BugInstance.

    public class BugInstance {
        private BugProperty propertyListHead;
        private SAVE_BugInstance saveBugInstance;
        ...
        public BugInstance(...) {
            ...
            saveBugInstance = new SAVE_BugInstance(this);
            ...
        }
        public BugProperty getPropertyListHead() {
            return propertyListHead;
        }
    }

    public class SAVE_BugInstance {
        private BugInstance bugInstance;
        public SAVE_BugInstance(BugInstance bugInstance) {
            this.bugInstance = bugInstance;
        }
        public void writeXML() {
            ...
            BugProperty prop = bugInstance.getPropertyListHead();
            ...
        }
        ...
    }

Figure 4.1: “Dual object” implementation of saving XML bug reports.
Each BugInstance object becomes two objects, with the SAVE_BugInstance objects containing the writeXML() method. The dual objects have mutual references to each other, and constructor code needs to be added to initialize these references. Accesses from one object to the other are accesses to this in the original code, and need to be updated.

Static Method

Another strategy, which we call “static method”, was used only in the TyRuBa case study. This strategy converts a group of instance methods from several classes into a single static method. As an example from the TyRuBa case study, we have in Figure 4.2 an abstract getStoragePath() method in QueryEngine, and two subclasses with concrete implementations of the method. Applying the “static method” strategy moves the code for these methods into a single static getStoragePath(QueryEngine) method, shown in Figure 4.3. The receiver object (which is of type QueryEngine) becomes one of the parameters of the static method, and method dispatch is converted to an “if instanceof” test for each class originally providing a concrete implementation of the method. This static method can then be moved around relatively freely, because moving static method source code amongst classes requires relatively simple changes to objective structure.

    public abstract class QueryEngine {
        ...
        public abstract String getStoragePath();
        ...
    }

    public class FrontEnd extends QueryEngine {
        private File path;
        ...
        public String getStoragePath() {
            return path.getPath();
        }
        ...
    }

    public abstract class RuleBaseBucket extends QueryEngine {
        FrontEnd frontend;
        String identifier;
        ...
        public String getStoragePath() {
            return frontend.getStoragePath() + "/" + identifier;
        }
        ...
    }

    public class SimpleRuleBaseBucket extends RuleBaseBucket {
        // inherits RuleBaseBucket.getStoragePath() implementation
    }

Figure 4.2: Instance method getStoragePath() before separation.
Subclassing

A final strategy, which we refer to as “subclassing”, was used only in the DrawSWF case study. This strategy splits a class into a superclass and a subclass. The subclass contains the field, method, and inner class declarations that need to be separated, while the remaining member declarations stay in the superclass. References are updated to create and use objects of the subclass.

    public class FactBaseQueryEngineUtils {
        synchronized static public String getStoragePath(QueryEngine qe) {
            String storagePath = null; // null if can’t generate
            if (qe instanceof FrontEnd) {
                FrontEnd fe = (FrontEnd)qe;
                FactBaseFrontEnd fbfe = fe.getFactBaseFrontEnd();
                storagePath = fbfe.getStoragePath();
            } else if (qe instanceof RuleBaseBucket) {
                storagePath = qe.frontend().getFactBaseFrontEnd().getStoragePath()
                    + "/" + qe.getIdentifier();
            }
            return storagePath;
        }
        ...
    }

Figure 4.3: “Static method” implementation of getStoragePath() method.

4.3.2  Deciding on the “Right” Strategy

When separating source code from classes in Java, it was often difficult to decide what separation strategy to use. In most cases, the programmer decided to apply the dual object strategy, because he found it hard to predict whether all code of interest could be captured by other (possibly more elegant, but less general) strategies. However, there were situations where experimentation with dual object, and careful consideration of strategy benefits and limitations, led to the use of a different separation strategy.

For example, in the “TyRuBa” case study, the abstract QueryEngine.getStoragePath() method (and every implementation of it, as shown in Figure 4.2) was identified as one of the methods to be separated from the rest of the code.
While starting to apply dual object, the programmer realized that code for initializing dual objects could not be placed in the abstract QueryEngine and RuleBaseBucket classes, as the objects requiring dual object references would be instances of concrete subclasses. So, the programmer attempted to add code for initializing a dual object reference in SimpleRuleBaseBucket (a subclass inheriting the getStoragePath() implementation from RuleBaseBucket). This was complicated because this code could only be placed after super() constructor calls, causing potential problems if the dual object needed to be accessed during the execution of super().

To avoid potential problems, the programmer observed that no fields needed to be moved, and decided to use the static method strategy instead, as shown earlier in Figure 4.3. The static method strategy avoids the complexity of the dual object strategy’s mutual references, but as seen here, it is less generally applicable: it can only be applied to move methods, but not instance fields. It introduces complexity of its own by using “if instanceof” tests and typecasts, but in this case the complexity seemed more predictable.

In contrast, remodularizations in SubjectJ did not require making these kinds of decisions because only one separation strategy (split method) was ever used.

4.3.3  Automation of Transformations

The more complex separation strategies used in the Java refactorings usually involved manual transformations not supported well by the available automated refactoring tools. In contrast, the only strategy used in SubjectJ refactorings was the “split method” strategy, which is relatively well-supported by Eclipse. Other program structure changes in SubjectJ refactorings were subjective in nature and well-supported by the SubjectJ tools.

We provide an anecdote from the “FindBugs” case study.
While refactoring using Java, the programmer applied the dual object strategy on 38 classes to separate their writeXML() method declarations. An example of the resulting code from one of the classes, BugInstance, is shown in Figure 4.1. Lack of automated refactoring support for this strategy required the programmer to manually change a large number of different places in the code in a coordinated fashion.

While refactoring using SubjectJ, the programmer also decided to separate the writeXML method declaration from the same 38 classes. However, manual code modifications were limited to adding @Subject annotations to each writeXML method. Tracking annotations required for declarative completeness of subjects were automatically determined and added using the Checker tool. Decomposition of source code into subjects was performed by the Decompose tool. While the changes were similar in extent to the Java refactoring, the programmer did not need to spend much effort manually performing changes.

This anecdote illustrates that the difference between the SubjectJ and Java refactorings is less in the extent of the changes than it is in how well the changes could be supported by reliable automated tools. Although Eclipse has refactorings to move methods and fields, the programmer did not use them here because he felt they were unreliable. Indeed, he did try to use the Eclipse “move” refactoring tools in the first case study, but found they often introduced errors into the code. We later verified that the “Move Method” tool in Eclipse passes the receiver object as an argument to the method, essentially using the method as a static method. Thus, if the method is used in a polymorphic way, then references to the method are not updated. The “Move Field” tool appears to never update references.
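The language-level reason a receiver-as-argument move interacts badly with polymorphism can be shown with a small sketch. It does not reproduce the output of the Eclipse tool; the classes Shape, Circle, MovedDescribe and DispatchSketch are hypothetical and exist only for this illustration. It relies on one fact about Java: an instance method call is dispatched on the runtime class of the receiver, while a static method call is bound to a single body at compile time.

```java
/** Hypothetical base class with an overridable instance method. */
class Shape {
    String describe() { return "shape"; }
}

/** Hypothetical subclass that overrides the method. */
class Circle extends Shape {
    @Override
    String describe() { return "circle"; }
}

/** Assumed result of moving Shape.describe() elsewhere, with the receiver as a parameter. */
class MovedDescribe {
    static String describe(Shape receiver) {
        // Only the body of Shape.describe() was moved; the override in Circle is not consulted.
        return "shape";
    }
}

class DispatchSketch {
    /** Call site before the move: dynamic dispatch selects Circle.describe(). */
    static String before(Shape s) { return s.describe(); }

    /** Call site rewritten to the moved method: the static body is used for every receiver. */
    static String after(Shape s) { return MovedDescribe.describe(s); }

    public static void main(String[] args) {
        Shape s = new Circle();
        System.out.println(before(s)); // prints "circle"
        System.out.println(after(s));  // prints "shape"
    }
}
```

Preserving behaviour for such a call would require restoring the dispatch by hand, for example with the “if instanceof” tests used by the static method strategy in Figure 4.3.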
We do not believe these shortcomings are because of a lack of effort on the part of the Eclipse developers, but rather because strong modular-objective coupling makes moving instance members in Java a complex problem. In comparison, moving declarations between SubjectJ subjects is relatively uncomplicated because it only affects the (static) interfaces between the subjects.

4.3.4  Introduction of Bugs

Mistakes made while manually applying separation strategies caused bugs in a number of cases. The fact that application of such strategies was more prevalent while remodularizing in Java manifested itself in the number of bugs introduced (see footnote 3): eight bugs were introduced in Java remodularizations, and only one bug was introduced in a SubjectJ remodularization.

Table 4.3 provides a brief overview of the bugs introduced during the case studies. All 9 bugs were introduced while making manual code changes to apply separation strategies. More specifically, Bugs #1 to #6 were related to applying the dual object strategy. Bugs #7 and #9 were caused by mistakes made when manually extracting method code. Finally, Bug #8 was related to applying the “subclassing” strategy. We now describe two bugs in more detail: Bug #2, introduced during a Java refactoring; and Bug #9, the only bug introduced during a SubjectJ refactoring.

The programmer introduced Bug #2 while applying dual object, to separate “brain” code from the abstract class Piece into BRAIN_Piece. Figure 4.4 contains fragments of the buggy decomposed code. As can be seen, the bug resulted from forgetting to properly initialize the references from BRAIN_Piece objects to their dual objects; comments mark the locations of missing initialization code in the figure.
A possible cause for this forgetfulness is the added complexity of applying dual object to an abstract class: the Piece class does not (and cannot) call a constructor of the abstract BRAIN_Piece class, so the programmer was not alerted to the lack of an explicit BRAIN_Piece constructor by compiler errors.

Table 4.3: Overview of bugs introduced during case studies (arbitrarily numbered for reference convenience).

#   Code Base       Behaviour and Cause

Bugs introduced during Java remodularization tasks
1   JHotDraw        NullPointerException; Dual is referenced before it is created.
2   Chinese Chess   NullPointerException; Reference to dual is not initialized.
3   FindBugs        NullPointerException; “Getter” method returns null instead of reference.
4   FindBugs        Program freeze; “Getter” method calls itself infinitely.
5   JChessBoard     NullPointerException; Dual of JChessBoard used before it is fully initialized.
6   JChessBoard     NullPointerException; Dual of History used before it is fully initialized.
7   DrawSWF         NullPointerException; Manually extracted code from method erroneously sets local variable to null.
8   DrawSWF         IllegalArgumentException; Reference to original class instead of new subclass.

Bugs introduced during SubjectJ remodularization tasks
9   DrawSWF         NullPointerException; Manually extracted code from method erroneously returns null.

Footnote 3: We considered bugs to be distinct if they were detected and fixed between different test runs of the application being refactored.

Bug #9 was introduced by the programmer when manually extracting method code in a situation that was not supported by the Eclipse “Extract Method” tool. A careless mistake resulted in the extracted method shown below:
    @Export("OTHER")
    @Subject("Text")
    private static DrawObject createObject2_TEXT(int drawing_mode) {
        if (drawing_mode == TEXT)
            new Text();
        return null;
    }

The programmer wrote “new Text();” when he intended to write “return new Text();”. He was not alerted by a compilation error because the method still ended with a return statement. As a result, the method always returned null, which led to a NullPointerException.

    public abstract class Piece {
        private BRAIN_Piece brainPiece;
        public int column() { ... }
        ...
    }

    public abstract class BRAIN_Piece {
        private Piece piece;
        // Missing constructor to initialize "piece"
        public boolean checkFacingGenerals(...) { ...piece.column()... }
        ...
    }

    public class Soldier extends Piece {
        public Soldier(...) {
            ...
            brainPiece = new BRAIN_Soldier(this);
            ...
        }
        ...
    }

    public class BRAIN_Soldier extends BRAIN_Piece {
        private Soldier soldier;
        public BRAIN_Soldier(Soldier soldier) {
            // Missing super() call to BRAIN_Piece constructor
            this.soldier = soldier;
        }
        ...
    }

Figure 4.4: Remodularized implementation of the Piece and Soldier classes, containing the bug.

4.4  Summary of Experiment Conclusions

In this experiment, we compared remodularization effort between Java and SubjectJ, a programming language designed for looser modular-objective coupling than Java. We performed a series of 8 case studies, each requiring a remodularization task to be performed once using Java and once using SubjectJ. The time data in Section 4.2 suggest that remodularization tasks take longer to perform in standard Java than in SubjectJ. Qualitative observations presented in Section 4.3 provide insight into this time difference, explaining how the use of complex “separation strategies” in the Java remodularizations led to accidental complexity not encountered in the SubjectJ remodularizations. In particular, decisions on which strategy to use were trivial in SubjectJ, since only one strategy was used, but were complex in Java, where multiple strategies were used.
Since the separation strategies involve changes to objective structure, a more in-depth understanding of the code was required. We observed that the extent of the changes in the SubjectJ and Java remodularizations was similar, but the more complex separation strategies used in Java remodularizations are difficult to support with refactoring tools, and were typically performed in our case studies using error-prone manual approaches. Together, we believe these results suggest that remodularizations are less complex in SubjectJ than in standard Java.

Chapter 5  Conclusions

In this chapter, we describe possible future directions, and provide concluding statements for this dissertation.

5.1  Future Directions

In this dissertation, we investigated the effect of modular-objective coupling on remodularization complexity by comparing Java with SubjectJ, a language designed to remove Java’s main source of modular-objective coupling. In our experimental comparison, we noticed that, while remodularizing in SubjectJ involved fewer changes in objective structure compared to Java, one common separation strategy for changing objective structure was frequently used in both languages: “split method”, used when sub-method modular changes were required. However, there do exist languages that provide modular-objective decoupling at the sub-method granularity (e.g. AspectJ). Remodularization experiments that compare languages with differing degrees of modular-objective coupling, including languages with sub-method modular-objective decoupling, could complement the results presented in this dissertation.

Also, we believe that reducing the accidental complexity of refactorings likely has an impact on refactoring as a whole. However, the refactoring tasks in our experiment were intentionally selected to be of a specific type: remodularizations to separate a certain subset of code from an existing code base.
As such, determining the impact of modular-objective coupling on refactoring in general would require further investigation.

5.2  Concluding Statements

In this dissertation, we defined the concepts of modular program structure, objective program structure, and modular-objective coupling. We argued that modular-objective coupling introduces accidental complexity into remodularizations: behaviour-preserving restructurings that have a stated intent of changing a program’s modular structure, but do not have a stated intent of changing a program’s objective structure. We claimed that a programming language design which implies reduced modular-objective coupling reduces remodularization complexity in the language. To validate our claim, we presented SubjectJ, a prototype language designed for loose modular-objective coupling compared to Java, and performed an experiment consisting of 8 remodularization case studies comparing SubjectJ with Java. Our results demonstrate that remodularizations are less complex in SubjectJ than in Java, supporting our central claim that programming languages designed for reduced modular-objective coupling reduce remodularization complexity in the languages.
