Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Time-travel programming: programming language support for interacting with past executions Salkeld, Robin 2018

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
24-ubc_2018_september_salkeld_robin.pdf [ 823.96kB ]
Metadata
JSON: 24-1.0367029.json
JSON-LD: 24-1.0367029-ld.json
RDF/XML (Pretty): 24-1.0367029-rdf.xml
RDF/JSON: 24-1.0367029-rdf.json
Turtle: 24-1.0367029-turtle.txt
N-Triples: 24-1.0367029-rdf-ntriples.txt
Original Record: 24-1.0367029-source.json
Full Text
24-1.0367029-fulltext.txt
Citation
24-1.0367029.ris

Full Text

Time-travel ProgrammingProgramming Language Support for Interacting with Past ExecutionsbyRobin SalkeldBMath, University of Waterloo, 2005A THESIS SUBMITTED IN PARTIAL FULFILLMENTOF THE REQUIREMENTS FOR THE DEGREE OFDoctor of PhilosophyinTHE FACULTY OF GRADUATE AND POSTDOCTORALSTUDIES(Computer Science)The University Of British Columbia(Vancouver)May 2018© Robin Salkeld, 2018The	following	individuals	certify	that	they	have	read,	and	recommend	to	the	Faculty	of	Graduate	and	Postdoctoral	Studies	for	acceptance,	the	dissertation	entitled:		Time-travel	Programming:	Programming	Language	Support	for	Interacting	with	Past	Executions		submitted	by	 Robin	Salkeld	 	 in	partial	fulfillment	of	the	requirements	for	the	degree	of	 Doctor	of	Philosophy	in	 Computer	Science		Examining	Committee:	Gregor	Kiczales	Supervisor		Andrew	Warfield	Supervisory	Committee	Member		Ronald	Garcia	Supervisory	Committee	Member	Norm	Hutchinson		University	Examiner	Sathish	Gopalakrishnan		University	Examiner			Additional	Supervisory	Committee	Members:		Supervisory	Committee	Member		Supervisory	Committee	Member		AbstractBecause software so often behaves unexpectedly or fails only in production envi-ronments, several recent tools from both industry and academia record data aboutexecution for the benefit of post-hoc analysis. Debugging on these data instead ofa live program is much more difficult, however, because the semantic abstractionsprovided by the programming language are no longer available. Many post-hocanalysis tools process this data through additional reflection-based code or domain-specific query languages, but do not recover the expressive power of the originalprogramming language.This thesis proposes the concept of time-travel programming, which we defineas simulating the execution of additional code in the same programming languageas if it were present in the past environment of recorded data. Furthermore, weshow that the aspect-oriented programming (AOP) paradigm provides a naturalmechanism for specifying this additional execution, and allows us to reuse estab-lished semantics and implementations. We provide evidence of this technique’sflexibility, feasibility and effectiveness through two implementations: one an inter-preter for an extremely simple AOP language in the style of a core calculus, andone for the AspectJ programming language. We evaluate flexibility via applyingthe implementations to multiple execution recording formats, feasibility by show-ing the AspectJ implementation is performant enough for post-hoc analysis, andeffectiveness by demonstrating that evaluating new and existing aspects retroac-tively can be used to address common post-hoc analysis tasks.iiiLay SummaryJust as personal video recorders have enabled recording and replaying live televi-sion, several tools support the equivalent of taking a picture of computer softwareas it runs, or continuously recording it so it can be accurately replayed later. This ispotentially invaluable for understanding issues after they happen, but these tools donot yet reproduce the original experience of a developer observing and interactingwith live software.This thesis proposes “time-travel programming” as the concept of evaluatingadditional code as if it was sent back in time to interact with the original program.We provide the theoretical foundation for this concept as well as a concrete pro-totype of such a time-travel machine for Java code using three different recordingtechnologies. We demonstrate that this enables powerful after-the-fact analysis bytaking existing programming techniques from the present and applying them in thepast.ivPrefaceA version of Chapter 3 has been published as [61]. I was the lead investigatorfor this research project, responsible for conceptualization, the implementation ofthe ACC source transformer and runtime based on the Tralfamadore system, andthe majority of the manuscript composition. B. Cully, G. Lefebvre and W. Xucontributed various extensions of the Tralfamadore system for the purpose of theruntime and a subset of the text in Section 3.5.2. A. Warfield and G. Kiczalesprovided high-level guidance and manuscript edits.A version of Chapter 4 has been published as [59]. I was the lead investigatorfor this research project, and performed the majority of the design of the RAPLlanguage and the abstract framework for retroactive weaving, the entirety of theRAPL interpreter implementation, and the majority of the manuscript composition.R. Garcia contributed substantially to the design of several language features, andprovided manuscript edits.A version of Section 3.1 and Chapter 5 has been published as [60]. I was thelead investigator for this research project, responsible for the architectural design,implementation, experimental evaluation, and manuscript composition. G. Kicza-les provided guidance on methodology and manuscript edits.The remainder of the dissertation is my own original and unpublished work.vTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiLay Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiList of Listings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiiGlossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiiiAcknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Analyzing Past Executions . . . . . . . . . . . . . . . . . . . . . 11.2 Time-travel Programming . . . . . . . . . . . . . . . . . . . . . . 31.3 Thesis Statement . . . . . . . . . . . . . . . . . . . . . . . . . . 41.4 Claims and Contributions . . . . . . . . . . . . . . . . . . . . . . 51.5 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82.1 Complexity and Abstraction . . . . . . . . . . . . . . . . . . . . 8vi2.2 Reflection and Metaprogramming . . . . . . . . . . . . . . . . . 102.3 Debugging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.4 Aspect-oriented Programming . . . . . . . . . . . . . . . . . . . 132.5 AspectJ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.6 Execution Recording . . . . . . . . . . . . . . . . . . . . . . . . 172.6.1 Snapshots . . . . . . . . . . . . . . . . . . . . . . . . . . 172.6.2 Traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172.6.3 Deterministic Replay . . . . . . . . . . . . . . . . . . . . 183 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193.1 Compatibility with Program Source . . . . . . . . . . . . . . . . 193.2 Coordination with Past Execution . . . . . . . . . . . . . . . . . 233.3 Access to Program State . . . . . . . . . . . . . . . . . . . . . . 253.4 Avoidance of Side-Effects . . . . . . . . . . . . . . . . . . . . . 273.5 Exploratory Implementation . . . . . . . . . . . . . . . . . . . . 293.5.1 Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . 293.5.2 Runtime . . . . . . . . . . . . . . . . . . . . . . . . . . . 323.5.3 Lessons Learned . . . . . . . . . . . . . . . . . . . . . . 353.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364 Essential Retroactive Weaving . . . . . . . . . . . . . . . . . . . . . 374.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374.2 Base Language . . . . . . . . . . . . . . . . . . . . . . . . . . . 374.3 Adding Aspects . . . . . . . . . . . . . . . . . . . . . . . . . . . 404.4 Aspect Weaving . . . . . . . . . . . . . . . . . . . . . . . . . . . 424.5 Defining Retroactive Aspects . . . . . . . . . . . . . . . . . . . . 444.6 Retroactive Weaving . . . . . . . . . . . . . . . . . . . . . . . . 454.6.1 Recording and Reading Traces . . . . . . . . . . . . . . . 464.6.2 Retroactive State . . . . . . . . . . . . . . . . . . . . . . 484.6.3 Retroactive Control . . . . . . . . . . . . . . . . . . . . . 494.7 Ensuring Soundness . . . . . . . . . . . . . . . . . . . . . . . . . 514.7.1 Deterministic Replay . . . . . . . . . . . . . . . . . . . . 534.8 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53vii4.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545 Retroactive Execution on the JVM . . . . . . . . . . . . . . . . . . . 555.1 Holographic Virtual Machines . . . . . . . . . . . . . . . . . . . 565.1.1 Mirrors . . . . . . . . . . . . . . . . . . . . . . . . . . . 595.1.2 Mutations . . . . . . . . . . . . . . . . . . . . . . . . . . 595.1.3 Translating Code . . . . . . . . . . . . . . . . . . . . . . 605.2 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675.2.1 Missing Bytecode . . . . . . . . . . . . . . . . . . . . . 675.2.2 Native Methods . . . . . . . . . . . . . . . . . . . . . . . 685.2.3 Class Initialization . . . . . . . . . . . . . . . . . . . . . 705.2.4 Concurrency . . . . . . . . . . . . . . . . . . . . . . . . 715.3 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725.3.1 Case Study: Diagnosing a Memory Leak . . . . . . . . . 725.3.2 Performance . . . . . . . . . . . . . . . . . . . . . . . . 755.3.3 Completeness . . . . . . . . . . . . . . . . . . . . . . . . 775.4 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 785.4.1 Mirror-based Behavioural Intercession . . . . . . . . . . . 785.4.2 Reproducing Past State and Behaviour . . . . . . . . . . . 795.4.3 Heap Dump Analysis . . . . . . . . . . . . . . . . . . . . 805.4.4 Static Code Analysis . . . . . . . . . . . . . . . . . . . . 805.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 806 Retroactive Weaving for AspectJ . . . . . . . . . . . . . . . . . . . . 826.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 826.2 Events and Intercession . . . . . . . . . . . . . . . . . . . . . . . 836.3 Reflective AspectJ Weaver . . . . . . . . . . . . . . . . . . . . . 846.3.1 Events as Join Point Shadows . . . . . . . . . . . . . . . 866.3.2 Efficient Event Requests . . . . . . . . . . . . . . . . . . 866.4 Execution Recordings . . . . . . . . . . . . . . . . . . . . . . . . 896.4.1 Events Database . . . . . . . . . . . . . . . . . . . . . . 896.4.2 Deterministic Replay . . . . . . . . . . . . . . . . . . . . 906.5 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91viii6.6 Case Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 926.6.1 Contract Verification . . . . . . . . . . . . . . . . . . . . 936.6.2 Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . 946.6.3 Race Detection . . . . . . . . . . . . . . . . . . . . . . . 966.6.4 Memory Leak Detection . . . . . . . . . . . . . . . . . . 976.6.5 Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . 986.7 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 986.7.1 Adaptation Effort . . . . . . . . . . . . . . . . . . . . . . 996.7.2 Results and Runtime . . . . . . . . . . . . . . . . . . . . 1016.8 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1036.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1057 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1067.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1067.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1077.3 Threats to Validity . . . . . . . . . . . . . . . . . . . . . . . . . . 1087.4 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111A Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119A.1 Illegal Native Methods in the JRE . . . . . . . . . . . . . . . . . 119A.2 Illegal Side-effects in AspectJ Case Studies . . . . . . . . . . . . 123A.3 RAPL Interpreter Source Code . . . . . . . . . . . . . . . . . . . 125ixList of TablesTable 2.1 A possible memory layout of the sample binary search tree . . 9Table 5.1 Results of executing Object.toString on every object in a VM,comparing performance on a holographic VM versus a live VMvia the JDI . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76Table 6.1 Summary of case studies and the AspectJ features they use . . 92Table 6.2 Execution time comparison of retroactive weaving with load-time weaving . . . . . . . . . . . . . . . . . . . . . . . . . . . 102Table A.1 Categorization of forbidden methods in the Java Runtime Envi-ronment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122Table A.2 Illegal side-effects encountered by case studies . . . . . . . . . 124xList of FiguresFigure 1.1 Traditional aspect weaving versus retroactive aspect weaving . 7Figure 2.1 A sample binary search tree, representing the set {1,4,5,7,9} 9Figure 2.2 An example of the Eclipse user interface while debugging aJava program . . . . . . . . . . . . . . . . . . . . . . . . . . 12Figure 3.1 An example of working with an associative mapping using ap-plication code. . . . . . . . . . . . . . . . . . . . . . . . . . 21Figure 3.2 The same example as in Figure 3.1, but using the meta-levelinterface of a heap dump model . . . . . . . . . . . . . . . . 21Figure 3.3 A simple ACC example inspired by a Linux kernel module . . 27Figure 3.4 An example of supressing unwanted side-effects using aroundadvice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29Figure 4.1 Grammar for terms in the base version of RAPL . . . . . . . 38Figure 4.2 Factorial function in RAPL . . . . . . . . . . . . . . . . . . . 38Figure 5.1 The overall holographic objects architecture . . . . . . . . . . 58Figure 5.2 Analysis code used to diagnose the Eclipse CDT memory leakbug . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73Figure 6.1 A contract verification aspect . . . . . . . . . . . . . . . . . . 94Figure 6.2 Suppressing illegal side-effects with aspects . . . . . . . . . . 100Figure 6.3 Relocating illegal side-effects with aspects . . . . . . . . . . 101xiList of Listings4.1 RAPL code for an interactive factorial loop . . . . . . . . . . . . 394.2 RAPL code for an aspect to trace the factorial function . . . . . . 414.3 Possible input and output for applying Listing 4.2 to Listing 4.1 . 425.1 Original Java code for the sample Employee class . . . . . . . . . 615.2 Original JVM bytecode for the sample Employee class . . . . . . 615.3 Translated JVM hologram bytecode for the sample Employee class 636.1 An excerpt of output from the AspectJ tracing aspect . . . . . . . 95A.1 Complete RAPL interpreter source code . . . . . . . . . . . . . . 126xiiGlossaryACC AspeCtCAOP Aspect-oriented ProgrammingAPI Application Programming InterfaceAST Abstract Syntax TreeCDT Eclipse C and C++ Development ToolsCIL C Intermediate LanguageCPU Central Processing UnitDNF Disjunctive Normal FormDR Deterministic ReplayJDI Java Debugging InterfaceJRE Java Runtime EnvironmentJVM Java Virtual MachineMAT Eclipse Memory Analyzer ToolMNM Mirror-based Native MethodPC Program CounterQEMU Quick EmulatorxiiiRAPL Retroactive Aspect Programming LanguageTOD Trace-oriented DebuggerTTP Time-travel ProgrammingVM Virtual MachineVMM VirtualMachineMirrorXML Extended Markup LanguagexivAcknowledgementsI would like to thank many people for joining me in what has been a long, exhaust-ing, and ultimately rewarding journey.Andrew Warfield, for encouraging me down this path and providing many en-tertaining discussions around my crazy ideas.Ronald Garcia, for providing the fresh perspective I was hoping for.Gregor Kizcales, for supporting me, for putting up with me, and for teachingme so much more than I ever expected to learn.Adrian Kuhn, Thomas Fritz, and C. Albert Thompson, for your badly neededdoses of encouragement.My parents, for ultimately making this possible and inspiring me to find outjust how much I could accomplish.My son, for inspiring me to finish what I’d started while simultaneously makingit that much harder to.My wife, for leading by example in every way, for taking it in stride whenthings were tough, and for believing in me when I couldn’t. Above all, this is foryou.A portion of this work was funded by NSERC PGS M and D awards and anNSERC Discovery Grant.xvChapter 1IntroductionThis chapter provides the motivation for the contents of this thesis. It examines howthe power of abstraction can be lost when analyzing past executions of software viarecorded information. It then introduces the concept of Time-travel Programmingas a generalized solution to this shortcoming.1.1 Analyzing Past ExecutionsOne of the most common and basic approaches programmers use to understandexecution and diagnose issues is inserting statements that output the state of theprogram as it executes. Knowledge is built incrementally, as each output may pro-vide clues as to where next to look in the code for more insight and hence where toadd the next output statement. Programmers spend many hours in this loop gainingunderstanding about how code behaves by writing and evaluating more code. Asthis repetition can be tedious and inefficient, many programming language toolsetsinclude interactive debuggers that support connecting to a running program andinspecting its state. The dynamic evaluation of code snippets in a precise con-text within a running program is a common feature of such tools, as they mimicthe power of modifying and rerunning a program. Thus a programming language isnot only the means for humans to tell computers what to do but also a sophisticatedmeans for asking questions about what computers are doing.An unfortunate side-effect of the ever-increasing complexity of software, how-1ever, is that programs often behave unpredictably when used by real consumers inproduction environments. Inevitably programmers must figure out why softwaredid not behave as expected in the field, where unexpected behaviour can mani-fest in many forms: rare or obscure application errors, fatal crashes, exhaustingmemory or other resources, or simply taking far too long to perform a task.Ideally developers would like to apply the same interactive, iterative approachabove when diagnosing issues that appear in the field, but this is often ineffectivesince reproducing such issues can be challenging. The same input to a programcan produce the correct answer in a development environment and yet fail in aproduction environment because of subtle differences in contexts. Moreover, someissues will not occur consistently because of the non-determinism due to factorssuch as concurrency.In practice, nearly all software will be configured to capture some degree of in-formation about how a program executed so that programmers might have a chanceof inferring the cause of unexpected behaviour post-hoc. This commonly takesthe form of logging statements that capture some of the human-readable infor-mation described above. Programmers can diagnose some bugs and unexpectedbehaviours simply by reading the high-level, vastly simplified timeline of a pro-gram’s execution contained within its logs. They allow programmers to becomecomputing archaeologists, sifting through artifacts from the past in order to gleaninsight.For deeper and more sophisticated debugging and runtime analysis the man-ual approach does not scale, so it becomes necessary to automate the analysis.Thus the recorded information needs to be machine-readable rather than human-readable. Many mainstream programming environments can produce a snapshotof a program’s state at a single instant in time for later analysis, either on re-quest or as an automatic reaction to encountering fatal errors. Moreover, tech-nology for recording the continuous execution of a program is an active area ofresearch [18, 49, 51, 62], and is becoming efficient enough to find its way intomainstream computing environments [63].However, the information in any such execution recording is encoded as low-level data and events in the recorded programming language runtime, or even in theoperating system or hardware beneath it. The sophisticated layers of abstraction in2the program have been lost, leaving a substantial semantic gap. Related tools andresearch take the approach of building libraries of operations to analyze or visu-alize common datatypes from the recorded data in order to enable insight. Thedatatypes and operations will be represented by a user-level application model: ei-ther expressed within some other programming language, sometimes a specializedquery language, or in the original language but at the metaprogramming level. Be-cause these techniques do not support evaluating code in the original programminglanguage, they are forced to manually reproduce the behaviour of a live runtime,which is difficult and time-consuming to accomplish for even a small subset ofcommonly-used code and impossible to generalize to all possible programs.1.2 Time-travel ProgrammingThe key inquiry motivating this thesis is to what extent it is possible to reuse boththe semantics of and the code written in a programming language to interact withits own execution recordings. For this to be useful for software developers, the se-mantics of the relevant code must be maintained: any retroactive computation mustbe consistent with how the execution would have behaved if it had been presentduring the original computation. We define Time-travel Programming (TTP) as theparadigm of programming in the context of a recorded prior execution, as if newcomputation is being sent through a time machine to occur in the past executioncontext.We refer to this simulated execution as Retroactive Execution. Its definingcharacteristic is consistency with the preexisting semantics of the programminglanguage. In the simplest case, the semantics of evaluating an expression such asprint(tree) requires a great deal of context. Providing a sound interpretationof the programming language’s semantics is equivalent to recreating the behaviourof a debugging tool connected to a live program. Therefore, in general, implement-ing retroactive execution involves simulating the entire language runtime.Execution recordings such as those described in Section 1.1 are always incom-plete to some degree for any programming language that supports interacting withanything outside the program’s internal state. In any realistic execution environ-ment there will be factors that are implicitly omitted from recordings, such as the3exact timing of events, external resources such as file systems and networks, non-determinism in the presence of concurrency, and so on. In addition, recordings willoften be explicitly scoped in order to keep performance and recording size manage-able. If retroactive execution attempts to access any of this missing data, there isno guarantee that the simulation will behave consistently as TTP requires. In thesecases the simulation must fail, and hence retroactive execution implementations bydefinition will always be partial.Many forms of analysis require coordinating multiple such additional computa-tions over time as it is represented in an execution recording. For example, if a bugin a program’s implementation caused a binary search tree to mysteriously losean element the tree could be inspected repeatedly over time to pin down exactlywhen it happened. This amounts to simulating the behaviour of adding additionalcode at multiple locations in the source. Specifying this additional, potentiallycomplex and stateful computation as it relates to the original program is a natu-ral programming language design problem, and is best solved by extending theoriginal programming language while maintaining source-level compatibility.We observe that this additional computation will by necessity crosscut theoriginal computation, and hence that this problem aligns precisely with Aspect-oriented Programming (AOP). Casting time-travel programming into the AOPparadigm allows us to leverage not only the established semantics of AOP lan-guages but also their implementations. We define Retroactive Weaving as the pro-cess of evaluating one aspect of a program with respect to the prior execution ofanother aspect, in contrast to conventional weaving as illustrated in Figure 1.1.Retroactive weaving then becomes an alternate implementation strategy for provid-ing the established semantics of the AOP language, changing not what the aspectmeans but when it is actually executed.1.3 Thesis StatementThis thesis demonstrates the advantages of time-travel programming: reusing aprogram’s source code and its programming language, or an extended version ofit, to analyze and interpret its own prior computation. We show that this general-ized technique is a flexible, feasible and effective paradigm for providing post-hoc4runtime analysis.1.4 Claims and ContributionsWe support our thesis statement with three key claims:• TTP is effective, in that it supports the natural and succinct expression ofmany post-hoc analyses;• TTP is flexible, as it is applicable to any programming language and anytechnology for recording that programming language; and• TTP is feasible for a mature mainstream programming language, in that:– it can be implemented by reusing the language’s original implementa-tion;– it can achieve performance adequate for the purposes of post-hoc anal-ysis; and– it can succeed frequently enough to be useful despite incomplete infor-mation about the original execution.We provide evidence to support these claims through the following contribu-tions:• An abstract formal framework for retroactive weaving as it relates to AOP,thus defining the core semantics and requirements of TTP;• A discussion of the design space for a valid TTP implementation via aninterpreter for a simple core language with TTP features, demonstrating theflexibility and generality of TTP;• An architecture for and a library implementation of TTP for the Java VirtualMachine (JVM) using the AspectJ AOP language and reusing its implemen-tation, further demonstrating the flexibility of TTP via pluggable implemen-tations for three distinct execution recording technologies; and5• An evaluation of the effectiveness of the JVM implementation at addressingcommon post-hoc runtime analysis problems in terms of its runtime perfor-mance, ability to reuse existing code, and success rate of producing usefulresults despite incomplete information, thus providing evidence of:– The effectiveness of TTP via useful examples in AspectJ; and– The practical feasibility of TTP for a mature mainstream programminglanguage.1.5 OrganizationThe remainder of this dissertation is organized as follows. Chapter 2 provides in-formation on existing concepts and prior work that this thesis builds on. Section 1.2discusses the motivation for the concepts this dissertation proposes and a prototypeimplementation of those concepts. Chapter 4 provides a foundation for the se-mantics of retroactive aspects by applying the concepts to a simple core language.Chapter 5 then focusses on an implementation and evaluation of retroactive exe-cution, a related concept necessary to support retroactive weaving but with a largeamount of complexity and independent utility of its own. Chapter 6 then presentsand evaluates a retroactive weaver for AspectJ based on extending this implemen-tation. Finally, Chapter 7 summarizes the dissertation and examines the limitationsof this research and potential avenues for future work.6Retroactive WeavingDecoupled analysis applied to execution or traceConventional AOP WeavingAnalysis included in compiled program binaryRetroactiveAnalysisOriginalExecutionProgram CodeAnalysis CodeJoin PointFigure 1.1: Traditional aspect weaving versus retroactive aspect weaving7Chapter 2BackgroundThe concepts we introduce in this thesis are synthetic and borrow from multiplemature fields of research. This chapter provides important background informa-tion on those fields and relates prior work to the work described in the followingchapters.2.1 Complexity and AbstractionAs computing becomes more and more omnipresent in our lives, software contin-ues to grow more and more complex. This complexity is made manageable bylayers of abstraction. Programming languages abstract the low-level details of thehardware they run on, and datatypes abstract the implementation of the high-levelconcepts they represent.Consider the binary search tree, one of the simplest common data structures incomputer science. Semantically, this abstract datatype represents an unordered setof elements. It is implemented as a binary tree where the element in each node isgreater than all elements in its left subtree and lesser than all elements in its rightsubtree. Figure 2.1 presents a sample binary tree containing five integers. Typicallythe binary search tree implementation would be provided as a reusable library andinclude a collection of functions for querying and modifying binary search trees.Applications are composed of many such abstractions, often layered on top of eachother.87941 5Figure 2.1: A sample binary search tree, representing the set {1,4,5,7,9}... 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 ...... 7 17 11 5 0 0 9 0 0 1 0 0 4 14 8 ...Table 2.1: A possible memory layout of the sample binary search treeAt a lower level, the nodes of the tree structure itself are usually implementedas non-contiguous chunks of computer memory. Each cell contains three pieces ofdata: the addresses of its two subtrees, left and right; and the actual value storedin that cell. Table 2.1 illustrates a simplified view of a possible memory layout forthe sample binary search tree in Figure 2.1. The upper row contains the addressof each memory location and the lower row contains the actual value stored at thatlocation. Manually reading the semantic contents of this data structure based onthis memory is challenging, especially when in practice the size of such a binarysearch tree is frequently on the order of hundreds or thousands of elements.It is straightforward, though not completely trivial, for the binary search tree toprovide a function that produces the human-readable representation of its contents.The code in such a function would traverse through the nodes of the tree in or-der, most likely using recursion, incrementally constructing a string of characters.Given the tree in Figure 2.1, it would produce the string ”{1,4,5,7,9}”.This kind of functionality is extremely important to programmers: it enablesthem to reason about code using the abstraction it represents instead of the com-plexity it hides. When attempting to understand a large software project or de-termine the root cause of a subtle bug, one of the most common and effective9approaches is to add more code that outputs such human-readable encodings of aprogram’s state at various points as it runs, or to connect to a running program witha debugging tool that supports interactive evaluation of code snippets.Thus programming is not a fire-and-forget exercise. A significant portion ofthe process of writing software is interactive and incremental. Programmers spendmany hours in the code-evaluate-repeat loop, gaining understanding about howcode behaves by writing and evaluating more code. A programming language isnot only the means for humans to tell computers what to do, but also a means forasking questions about what computers are doing.2.2 Reflection and MetaprogrammingThe term metaprogramming refers to manipulating programs or their runtimes.Compilers and translators are hence a kind of metaprogram, since they refer tocode as data. Reflection refers to elements of a programming language that ref-erence its own code or data, and hence is a specialization of metaprogrammingtargeted at the same program [64]. Reflection adds another level of expressivepower and flexibility to a programming language, allowing programs to generalizeover otherwise unrelated elements such as fields of an object.Java’s reflective interface consists of several closely-related classes in the corelibrary such as Class and Field. These classes provide several methods whichreflect on the state of the runtime, including classes, objects and threads. Toread the value of the field f on the object o, for example, the straightforwardor “base-level” expression would be o.f. The same value can be retrieved us-ing reflection, assuming o is an instance of the class c, using the expressionc.getDeclaredField("f").get(o). A reflective interface adds a level ofindirection and provides greater flexibility, but also requires a more verbose syntaxthan the base level code.Kiczales et. al. [33], in describing metaobject protocols, provide terms for threedistinct levels of self-reference of increasing power, which we use in reference tometaprogramming in general: introspection refers to read-only access to the stateand structure of a program, modification (originally self-modification in the re-flective case) refers to modifying this state reflectively, and intercession refers to10modifying the behaviour of elements of the program. Java’s reflective interface of-fers a substantial set of reflective operations for introspection but relatively few forself-modification or intercession: it is possible to change field values, for example,but not to change the definition of already-defined methods.2.3 DebuggingThe term debugging in a broad sense is the process of software developers ana-lyzing the cause of undesired behaviour in their code. For the purposes of thisthesis, we use the term debugging to refer more specifically to interacting with aprogram as it runs, in order to extract additional information about it or even tomake changes on the fly in order to test potential bug fixes.Debugging tools are built on another example of metaprogramming, as theyconnect to a running program in order to examine and in some cases alter the stateof that program. The Java Debugging Interface (JDI) is one example of an Ap-plication Programming Interface (API) for interacting with the state of a runningprogram from within a debugger tool. The JDI includes interfaces that resemble thenative Java reflective classes superficially: ReferenceType, for example, is theequivalent of Class, and both define a Field type. The JDI offers a metapro-gramming interface that is strictly more powerful than native Java reflection. Itis possible, for example, to force the currently active method invocation to returnearly with a given value.To support human interaction in debugging tools, the JDI also includes a richevent-driven API. In particular, it is possible to describe events of interest, such aswhen a particular method is invoked, and receive notification when those eventsoccur. While clients of the JDI are handling such events the thread responsible forraising the requested event is suspended, or even the entire target program pausedif the clients request it. This capability enables a limited form of intercession, asthe JDI client can intercept the behaviour of certain program elements and modifyprogram state before resuming.Figure 2.2 shows an example of the Eclipse user interface, which is built usingthe JDI, while debugging a Java program. The lower pane displays the output ofthe program so far, and the middle pane shows the line of code currently executing.11Figure 2.2: An example of the Eclipse user interface while debugging a JavaprogramThe small circle on the left indicates the user has set a breakpoint at the currentline, which caused the program to pause when it reached this line. The upper-left pane displays active threads and their state, in particular their current stack ofmethod invocations. The upper-right pane lists the symbols that are in scope in thecurrent context and their values. Note that because the variable s is selected in thispane, the result of invoking s.toString() is displayed just beneath the list ofsymbols.In general, the trade-off for offering the connectivity and metaprogrammingfeatures for debugging described above, in particular monitoring the program forrequested events, is to significantly slow down the programming language runtime.Therefore, such runtimes typically provide a so-called debugging mode which must12be explicitly turned on and is usually left off for production.2.4 Aspect-oriented ProgrammingThe paradigm known as aspect-oriented programming (AOP) was inspired by theobservation that programs contain many concerns that are orthogonal to the tra-ditional decomposition of functions or class methods. Software design typicallyincludes several such concerns, such as logging, verification, security, persistence,and event handling. Concerns such as these are said to crosscut each other if theirareas of effect in a program intersect but are not strictly contained in each other. Innon-AOP languages this orthogonality forces the code that implements each con-cern to be scattered throughout the code, leading to duplication, increased mainte-nance cost, and decreased developer understanding of the system’s design. AOPlanguages include one or more features designed to allow crosscutting concerns tobe implemented as independent modules. See Section 2.5 for a concrete example.Early presentations of the aspect-oriented programming paradigm focussed ondecomposing a program into a base program and one or more aspects, where thebase program was always modularized along the traditional dimension of functionsor class methods. Masuhara and Kiczales [43] later provided one of the more gen-eralized definitions of cross-cutting modules and of aspect-oriented programming,and we adopt their terminology here. Their model defines weaving as the pro-cess of coordinating the cross-cutting specifications of two independent programsto produce the behaviour of the combined program. At a high-level, weaving is afunction with signature1 A×B→ X , where A, B and X are all languages, not nec-essarily distinct from each other. The term join point refers to individual elementsof the combined program where the two input programs intersect. A and B bothinclude mechanisms to identify relevant join points in the X program and how theyeach affect the semantics of those join points. There are several broad classes ofweaving implementations and many strategies for efficient dispatch, but this modeldescribes the semantic requirements on such implementations.1Masuhara and Kiczales’ version also includes a third optional META meta-language used toallow external configuration of how the two programs are combined. This is not relevant for thecontent of this thesis and we omit it here.132.5 AspectJAspectJ is an aspect-oriented extension of Java, and is one of the most well-known AOP languages. It adds aspect definitions, which are somewhat similarto classes but are designed to support modular implementations of concerns thattypically crosscut class and method decompositions. Aspects contain advice dec-larations, which are similar to methods but are executed in response to relevantprogram events rather than explicitly invoked. Pointcuts are declarative expres-sions which pick out these sets of relevant program events. Aspects can also con-tain intertype definitions, which semantically add additional static type structureto one or more unrelated classes, such as declaring additional fields, methods, orimplemented interfaces.A frequently-cited application of aspect-oriented programming is tracing pro-gram execution, which in practical terms means outputting relevant informationwhen key events occur. Below is a simplified version of this aspect as packagedwith the Eclipse-based distribution of AspectJ and described in the AspectJ Pro-gramming Guide [2]:public class Trace {protected s ta t i c Pr intSt ream stream = System . e r r ;protected s ta t i c in t ca l lDep th = 0;public s ta t i c void t r aceEn t ry ( S t r i ng s t r , Object o ) {ca l lDep th ++;p r i n t I n den t ( ) ;stream . p r i n t l n ( ” - -> ” + s t r + ” : ” + o . t oS t r i n g ( ) ) ;}public s ta t i c void t r a ceEx i t ( S t r i ng s t r , Object o ) {p r i n t I n den t ( ) ;stream . p r i n t l n ( ”<- - ” + s t r + ” : ” + o . t oS t r i n g ( ) ) ;ca l lDep th - - ;}private s ta t i c void p r i n t I n den t ( ) {for ( i n t i = 0 ; i < ca l lDepth ; i ++) {stream . p r i n t ( ” ” ) ;}14}}aspect TraceMyClasses {pointcut myClass ( Object ob j ) :th is ( ob j ) && within (TwoDShape) | | within ( C i r c l e ) | | within ( Square ) ;pointcut myConstructor ( Object ob j ) : myClass ( ob j ) && execution (new ( . . ) ) ;pointcut myMethod ( Object ob j ) : myClass ( ob j ) &&execution ( * * ( . . ) ) && ! cflow ( execution ( S t r i ng t oS t r i n g ( ) ) ) ;before ( Object ob j ) : myConstructor ( ob j ) {Trace . t raceEn t ry ( ” ” + t h i s J o i nPo i n t S t a t i cPa r t . ge tS ignature ( ) , ob j ) ;}a f te r ( Object ob j ) : myConstructor ( ob j ) {Trace . t r a ceEx i t ( ” ” + t h i s J o i nPo i n t S t a t i cPa r t . ge tS ignature ( ) , ob j ) ;}before ( Object ob j ) : myMethod ( ob j ) {Trace . t raceEn t ry ( ” ” + t h i s J o i nPo i n t S t a t i cPa r t . ge tS ignature ( ) , ob j ) ;}a f te r ( Object ob j ) : myMethod ( ob j ) {Trace . t r a ceEx i t ( ” ” + t h i s J o i nPo i n t S t a t i cPa r t . ge tS ignature ( ) , ob j ) ;}}The myClass pointcut matches all join points within the code that definesthe three listed classes. The myConstructor pointcut intersects this pointcutwith one that matches the execution of any constructor, and hence matches allconstructor executions in these classes. This aspect therefore causes the executionof the tracing statements before and after every constructor or method execution inthe listed classes. In a non-AOP implementation, these calls to the tracing modulewould have to be duplicated in every constructor and method in these classes. Notethat the cflow clause in the myMethod pointcut is a common idiom used byadvice to exclude the join points in the control flow of their own code and henceavoid infinite regress.The AspectJ distribution includes the de facto standard implementation of As-pectJ, which we will refer to as ajc. This toolset implements the semantics of as-pects by modifying the code that is run, as described in [30]. The AspectJ compilerand runtime define a common binary format for aspects. Each aspect is compiled15into a Java class, and each advice definition becomes a method (which we refer toas an advice method) with a generated internal name and annotations containinginformation relevant for weaving, such as the associated pointcut definition. Otherauxiliary data such as cflow state data structures are inserted into these classes asneeded.The elements of code that are causally linked to join points at runtime, andhence need to be augmented in this way to implement the cross-cutting nature ofAOP programs, are referred to as join point shadows. Each join point shadow ismatched statically with each pointcut, and wherever there is a potential dynamicmatch, the shadow is altered to implement the advice semantics as well. The exactimplementation details vary, and may involve inserting calls to generated methodscontaining advice bodies, or inlining advice bodies directly with shadows. Simi-larly, intertype declarations are generally implemented by inlining the declarationsinto their target type.ajc supports two approaches to weaving, both of which use this binary as-pect format at runtime. Compile-time weaving locates the weaving process in thecompilation stage. All code, including that contained by either methods or advicedefinitions, is augmented as it is compiled to JVM bytecode to include calls to themunged advice methods as needed. Load-time weaving, by contrast, applies trans-formations to JVM bytecode as it is loaded into the JVM and before it is executed.Because of their very cross-cutting nature, the scope of code that aspects affectis not manifest in aspect source or binaries themselves. This is instead determinedby external configuration: compile-time weaving is scoped by the build environ-ment, and load-time weaving by configuring the Java runtime to affect how classesare loaded, generally by creating custom class loaders.Other AOP languages support what is called runtime weaving, which is apply-ing the semantics of aspects dynamically. The definition of the AspectJ languagedoes not support runtime weaving: among other reasons, the semantics of disablingan aspect containing intertype definitions that are referenced by running code areunclear. However, dynamic weaving is a well-established approach in the litera-ture [9] [54] as well the preferred method for dynamic AOP languages [26].From the early days of research on aspect-oriented programming, many haveobserved when a program is factored into multiple aspects, it is natural to classify16those aspects as either interfering or non-interferring. The latter style of aspects arein some sense safe to omit as needed, such as to increase performance in productionenvironments.One of the often-cited benefits of AOP is flexibility. An aspect that traces alarge volume of debugging information may be easily turned off when deployingthough a small amount of compiler or runtime configuration. Any such aspectthat is not strictly required for program correctness is certainly optional and non-interferring in some sense. Dantas and Walker [22] offer a formal definition ofharmless advice which prevents such advice from altering a computation’s finalvalue, while still allowing the advice to read state of interest and perform inputfrom and output to the outside world.2.6 Execution RecordingThis section describes a number of variations on the theme of recording some sub-set of the state of a program as it executions.2.6.1 SnapshotsWe use the term snapshot to refer to any persisted record of the complete state of arunning program to a high degree of detail. Many operating systems automaticallyproduce a core dump when a program crashes, which is one form of snapshot thatrecords the low-level state of a program’s raw memory. It may also include thecontents of Central Processing Unit (CPU) registers.A JVM will produce a so-called heap dump when it exhausts available mem-ory. A heap dump records the state of a Java program in terms of Java semantics,as objects and primitive values and the references between them rather than rawmemory state. It does not, however, include the Program Counter (PC) counter ofeach thread of execution.2.6.2 TracesThe idea of recording snapshots can be semantically extended to record the stateof a program over time as it executes, rather than at a single instant in time. Ratherthan recording multiple independent snapshots, however, this can be done much17more efficiently by only recording the incremental changes between timestamps.Indexing these deltas enables efficient handling of requests for specific runtimevalues at specific times.Omniscient debugging [38] describes narrowing the gap between live debug-ging and execution trace analysis by creating trace browsers that look like livedebuggers. Developers can examine variable values and control flow state usingthe same idioms as live debugging, but also travel backwards and forwards throughtime to events of interest. Research has largely focused on improving the executiontracing and trace querying efficiency [55, 56] of omniscient debugging.2.6.3 Deterministic ReplayDeterministic Replay (DR) [24, 25, 31, 68] is an active area of research that enablesrecreating the actual execution of a program exactly as it previously happened.This is accomplished by recording sources of non-determinism in the program asit executes. This includes external sources such as user input, but also internalsources such as the results of concurrent access to shared data. A later replay stagereads this recording and forces the same outcome of non-deterministic operationsas before.18Chapter 3MotivationThis chapter identifies some of the shortcomings of the various systems availablefor post-hoc dynamic analysis, and argues for the position that time-travel pro-gramming is a natural and effective solution. It also describes an initial implemen-tation of retroactive weaving for AspeCtC (ACC), an aspect-oriented extension ofC, using static analysis and code transformation techniques. This implementationprovided some evidence for the effectiveness of our position although it was notdeveloped further. We then discuss the positive and negative observations made inour experience with this prototype.3.1 Compatibility with Program SourceWe first describe the approach many post-hoc analysis tools use to work around theinability to execute code, which is to force developers to program analysis usinga reflective model, and examine the problems we intend to solve with retroactiveexecution.For the purposes of concrete discussion we will first focus on Java heap dumpsas a particular example of recording execution; as outlined in Section 2.6 suchsnapshots of execution state are a key part of many approaches to recording execu-tion. There are many types of retroactive analyses that can be applied to a snapshot:counting the number of objects of each unique type and how much memory theyoccupy, for example, to determine the root cause of exhausting available mem-19ory. The Eclipse Memory Analyzer Tool (MAT) [3], which we use to represent thecurrent state-of-the-art, includes a rich UI for browsing the object graph, severaldozen actions for navigating it, and several pre-defined analysis commands such asidentifying the set of objects that are most likely leaking memory.Consider a Java developer browsing the object graph in a heap dump whohas identified an object that implements the built-in java.util.Map interface.This implies that the object implements an associative mapping from keys to val-ues. Suppose the developer needs to know what value a certain key is mappedto. When debugging a live process this is as simple as evaluating the expressionmap.get(key), but the task is surprisingly involved with a dead process. Evenif the concrete type of the map was java.util.HashMap, a well-known imple-mentation backed by a straightforward hash table, finding the right key-value pairinvolves manually searching through the internal representation of a potentiallymassive hash table to find the matching entry.Not being able to execute code also means there is also no way to invoke themethod Object.toString. This method is defined at the top of the Java classhierarchy with a default implementation and therefore callable on any object, andis used to create human-readable text describing the object. The Eclipse debuggingperspective, for example, includes a window dedicated to displaying the result ofcalling toString on objects in other windows (as shown in Figure 2.2), and it isexactly this basic functionality that is missing in a heap dump browser.The tools used to interact with heap dumps attempt to address this issue by pro-viding utilities to automate these tasks: in particular, the MAT includes a commandnamed “Extract Hash Entries”. To implement this utility, however, the tool de-veloper must include handler code for every single subclass of java.util.Mapthat might be encountered in the object graph.1 Since the tool cannot possibly han-dle all possible types in arbitrary application code, the utility must be pluggableso that developers wishing to analyze their heap dumps can contribute handlers forthe types of objects in them. Ultimately, this means that those developers are facedwith the job of re-implementing the semantics of a non-trivial portion of applica-tion code using the meta-level represented by the heap dump model.1The standard library alone includes several dozen, such as ConcurrentHashMap,LinkedHashMap, TreeMap, and so on.201 pub l i c S t r i ng getProper ty ( Serv i ceProper t i es map, S t r i ng key ) {2 r e t u rn ( S t r i ng )map. get ( key ) ;3 }Figure 3.1: An example of working with an associative mapping using appli-cation code.1 pub l i c S t r i ng getPropertyMeta ( IOb jec t map, S t r i ng key ) {2 St r i ng [ ] keys = n u l l ;3 St r i ng [ ] values = n u l l ;45 IOb jec t ob jec t = map. resolveValue ( ” headers ” ) ;6 IOb jec tAr ray ar ray = ( IOb jec tAr ray ) headers ;7 i f ( a r ray != n u l l ) {8 long [ ] keyAddrs =9 ar ray . getReferenceArray ( ) ;10 i f ( keyAddrs != n u l l ) {11 / / Ca l l he lper method to dereference12 / / addresses to S t r i ng ob jec ts13 keys = ge tServ i ceProper t i es ( keyAddrs ) ;14 }15 }1617 / * S im i l a r l y f o r ” values ” f i e l d * /1819 i f ( keys == n u l l | | values == n u l l )20 r e t u rn n u l l ;21 f o r ( i n t i = 0 ; i < keys . leng th ; i ++) {22 i f ( keys [ i ] . equals ( key ) ) {23 r e t u rn values [ i ] ;24 }25 }26 r e t u rn n u l l ;27 }Figure 3.2: The same example as in Figure 3.1, but using the meta-level in-terface of a heap dump model21Figure 3.1 contains an example of base code that retrieves the value a key ismapped to in one particular type of map, and Figure 3.2 contains an example ofMAT source code that achieves same thing using the reflective heap dump API.This snippet is part of a large submodule responsible for reconstructing the state ofthe Equinox OSGi Java module system [29] from a heap dump; this code is onlyconcerned with the configuration properties, a mapping from strings to strings,for a particular service. The configuration is stored as a pair of parallel arrays,and this code first resolves the object references contained in each, then iteratesthrough the list of keys until it finds a match. This example illustrates a number ofdisadvantages inherent to this approach.Complex: The complexity inherent in reproducing the behaviour of a poly-morphic method such as Map.get is high. As shown in Section 2.2, meta-levelcode is inevitably much more verbose than the equivalent base-level code, andreproducing higher-level language features such as polymorphic method dispatchand field shadowing correctly at this level is challenging.Type-unsafe: If the analysis code makes a type error, it will not cause an erroruntil further downstream reads fail to find the fields they are expecting, if an erroroccurs at all. This makes tracking down coding errors much more difficult. InFigure 3.2 the meta-level code assumes that the property values are strings whenin fact they can be other types, and in the base-level code a cast is necessary tomake the code compile. Guarding against such errors through explicit meta-leveltype checks requires programmer discipline, contributes greatly to code bulk, andas shown in the example are often omitted.Brittle: In Figure 3.2, the code expects the properties to be stored in two paral-lel arrays. In more recent versions of Equinox, however, these fields no longer ex-ist, and the properties are instead stored in a single special-purpose object. Becauseof the various null checks in this code, likely present to prevent this exact problemfrom crashing the overall analysis, no error is raised, and the service object appearsto have no properties. The fact that meta-level analysis accesses datatype internalsdirectly means that the analysis can very easily break or, even worse, silently pro-duce incorrect results when run against a dump produced by a newer version of thetarget code. Since this approach involves a second, redundant implementation ofthe functionality in the application, it introduces an ongoing maintenance burden.22Insecure: The access control mechanisms built into a language, such as declar-ing types or fields private or protected in Java, are generally more easily cir-cumvented within its own reflection facilities. Access control depends on knowingthe accessing context, which for Java means the class the referencing code residesin. The dynamic nature of reflection means that this is not statically available, andnot even dynamically available unless explicitly provided at great inconvenience.When traversing a heap dump model these protections are lost entirely: in the ex-ample above the code reads from two private fields with no extra difficulty. Thiscontributes towards the brittleness of analysis code as above, since it is easy to re-fer to internal details that are likely to change in the future, but it also means thatsensitive data is easy to read, whether by accident or on purpose.Unfamiliar: Developers familiar with base-level (ordinary) programming in alanguage must learn an additional paradigm to understand and effectively workwith the meta-level object model. In addition, because application values fre-quently refer to datatypes from libraries, reimplementing application-level coderequires understanding and reimplementing library code as well, since the encap-sulation that normally hides those details from clients has been lost. Tool develop-ers can alleviate some of this burden by handling the most common functionality,but handling even a small percentage of the code a user is likely to encounter isimpractical.Most debugging and analysis tools that are capable of reading snapshots or coredumps share these limitations. The GNU Project Debugger (GDB), for example,supports automated debugging through Python scripts, but these scripts operate ona similar meta-level representation of program state.The primary benefit of retroactive execution as a tool for retroactive analysis isto provide the same analytic power as the original programming language, enablingthe developer to simply evaluate map.get(key) in the context of the snapshotand obtain the needed answer.3.2 Coordination with Past ExecutionThe previous section provided motivation for the utility of retroactive execution.Automated post-hoc analysis code, however, also needs to specify individual points23in time of interest in the program’s execution to trigger additional code, and haveaccess to local values that aren’t available from the global state. Retroactive exe-cution restores the expressive power of code for interacting with a single point intime, but does not by itself provide an equally expressive mechanism for interact-ing with the flow of state over time. Without such a mechanism, locating relevantpoints in time and collating or comparing values over time is just as complicatedand error-prone as the metaprogramming-based alternative to retroactive executionoutlined above.Ideally, offline analysis should be just as intuitive and familiar as adding printf()statements within a program’s source code: the analysis’ interaction with the orig-inal program should be type-safe; consistent with its value, name binding and con-trol flow semantics; and able to reuse its datatypes and algorithms. In addition, itshould be possible to author analysis code with consistent semantics that do not de-pend on which execution capture and replay technology is used, or even on whetherthe analysis is evaluated during the program execution or after it.Offering the same flexibility as inserting additional code at arbitrary locationsin the source code is challenging, and ensuring the instrumentation is run exactlywhen needed often involves complicated and flow-sensitive specifications. Thisproblem is addressed in various ad-hoc ways by existing dynamic analysis tools,both online or offline. Pin [41] and the ATOM language it is based on [65] arequite aspect-like at the instruction level, but each instruction must be individuallyinspected as a metadata object to determine where to insert additional instructions.The Program Trace Query Language (PTQL) [28] is a reduced dialect of SQLcustomized to query program executions, which is described as being equally ap-plicable to post hoc evaluation. PTQL treats individual execution events such asfunction calls as relational tuples, with time as an explicit dimension that mustbe manually manipulated, which makes analysis of control flow awkward to spec-ify. VMWare’s VAssert [4] system supports replay-only statements in source code,which requires analysis foresight and is less convenient for testing bug theoriesafter the fact.Coordinating the interaction of the cross-cutting analysis with the original pro-gram execution lies precisely in the domain of aspect-oriented programming. Inparticular, pointcuts provide a rich specification language for solving exactly this24problem, and are equally applicable to matching joinpoints that occurred in thepast. They include binding mechanisms such as args(x, y) and return(z)which allow values to be extracted from the original program and used in analy-sis code. Using a declarative sub-language to describe analysis trigger points alsopresents opportunities for optimization in the context of execution tracing and re-play, since irrelevant events and even entire periods of execution can be ignored orskipped over when reading a trace or instrumenting a dynamic replay process.In some cases an AOP language’s pointcut types may not express all of thedesired concepts for post hoc analysis, but as proposed previously [12, 20] it isreasonable to extend pointcut languages for greater expressiveness or precision; wehave extended ACC ourselves, in fact, in order to address its shortcomings whenapplied to kernel-level code, as described in Section 3.5.2. Implementing AOPconcepts beyond pointcuts and advice in the context of retroactive evaluation couldalso enable more sophisticated analysis. Intertype declarations, in particular, offeran attractive solution for attaching additional analysis state to program structuresafter the fact, and are flagged as future work in our prototype.3.3 Access to Program StateRetroactive aspects will read values from the original program execution wher-ever such values are referenced through pointcut arguments or global variables.Retroactive weaving therefore needs to extract those values from the original exe-cution.Simply applying traditional dynamic advice weaving to a deterministic replayprocess would appear to be an ideal implementation, but unfortunately this willinevitably interfere with the fragile replay process. In our case of kernel-level pro-cess replay, for example, it is necessary to record all system calls made to the kernelso they they can be reproduced in replay. Any system calls made by the analysiscode will be incorrectly trapped by the replay process and produce the wrong re-sults, and in addition throw off future results, so even an apparently harmless callto printf() can be disastrous. Similar issues exist for any replay mechanism: ingeneral, efficient replay relies on being able to assume all deterministic behaviourremains exactly the same between record and replay. Post hoc analysis must there-25fore be somehow isolated from the DR process it analyzes.Recently, Devecsery et. al. showed in their work on eidetic systems [24] thatit was feasible to record all user processes in a system with minimal overhead andstorage on the order of a few terabytes over several years. They also leveragedprior techniques [7] for ensuring their Pin-based analysis does not perturb theirDR processes: ensuring the memory allocated by analysis does not overlap withthe original memory referenced by the recorded process, and distinguishing sys-tem calls made as part of replay versus those made during analysis code. Thisdoes mean, however, that analysis must be written to explicitly copy values intothis isolated memory space in order to analyze them, and if analysis code invokesexisting code in the process it runs the risk of leaking references to this space andinvalidating the isolation.Because retroactive aspects are compatible with the program source and canmaintain their own state, references to both the original program’s state and theanalysis state must be disambiguated. In our C implementation, for example, thismeans that pointer values may actually refer to either the original program’s mem-ory or the new analysis runtime’s memory. We refer to this as the dual addressspace problem.In other decoupled analysis frameworks such as Speck [49], data must be man-ually marshalled between processes in order to synchronize them, which does notscale well to arbitrary source-level analysis. Choi and Alpern use remote reflec-tion [48] to debug a Jalapeo JVM [17] from another JVM. Remote reflection solvesthe dual address space problem in Java dynamically by customizing the debuggingJVM to track whether each object is remote or local; references to remote objectmembers are handled by communicating with the JVM being debugged, and theflag is propagated through such references.Our approach in our retroactive weaver for ACC is to solve the problem stati-cally instead. At compile time, variables declared by the program source or boundby pointcuts are marked by our aspect compiler as program state. Other variablesand fields are inferred based on dataflow as either program or analysis state withinthe source code, and references to program state are rewritten to instead call aruntime API to recreate that state.This concept can be illustrated with a simple example: Figure 3.3 contains a261 before ( s t r u c t mutex * lock ) :2 c a l l ( $ mutex lock ( . . . ) ) && args ( lock ) {3 i f ( ! check order ( cur ren t , lock ) ) {4 p r i n t f ( ” Lock order reve rsa l : %p (%s ) ” , lock , lock ->name ) ;5 }6 }7 before ( s t r u c t mutex * lock ) :8 c a l l ( $ mutex unlock ( . . . ) ) && args ( lock ) {9 remove from order ( cur ren t , lock ) ;10 }Figure 3.3: A simple ACC example inspired by a Linux kernel modulesample ACC aspect inspired by the witness kernel module, which validates thatlock acquisition and release code is structured correctly to avoid deadlocks by veri-fying that lock ordering is consistent. The check order and remove from orderfunctions manipulate a lock order relationship tree; their implementation is omittedfor brevity. This is one of several units of conditionally-enabled kernel code thatprovides valuable debugging information but is normally too slow or intrusive toenable for production builds.In this aspect, the lock pointer is bound though the args pointcut, and ishence marked as referring to base state, as is lock->name. So is the result ofcurrent, a Linux kernel macro used to get a pointer to the currently running taskfrom a CPU local variable. The string literal in the call to printf, on the otherhand, is a new value created by the aspect code and hence its address is marked asreferring to aspect state. Our implementation of this inference process is describedin Section 3.5.1.3.4 Avoidance of Side-EffectsFor source-level post hoc analysis to have consistent semantics whether it is evalu-ated online or offline, the analysis code must not have side-effects that would havechanged the program behaviour; the analysis framework should reject any suchanalysis.In other systems where analysis is performed outside the original program,such as Speck or remote debugging, any side-effects occur externally as well and27are not in danger of perturbing the replay. However, allowing them can cause lateranalysis code to observe these changes and behave inconsistently, and these devia-tions can be buried deep within application code called from the analysis and hencehave very subtle consequences. Other approaches based on virtualization such asIntrovirt [32] and VAssert [4] allow mutations to occur within the replay process,but then revert to a prior checkpoint to undo their effects before continuing. Thisalso prevents the analysis from maintaining any state between triggers, however,such as the depth counter in the tracing example of Section 2.5 or the lock orderingdata structure in Figure 3.3, and hence restricts its power.The same static analysis that addresses the dual address space issue can alsobe used to detect and reject attempts to write to the program’s memory space,which is the most common way analysis code could affect the program’s execution;see Section 4.7 for a more general and precise discussion of this issue. A largepercentage of useful application code, however, will include side-effects such ascaching even if their primary use is to read state, which would seem to imply theycannot be used in retroactive aspects.One solution is to use adviceexecution(), a generic pointcut introducedin AspectJ which matches all joinpoints inside advice code. This is often used toexclude aspects from applying to other aspects when the interaction is undesirable.Combining it with the cflow pointcut operator creates a pointcut that covers alladvice execution. In the case of retroactive aspects, it therefore covers all code inthe decoupled analysis, and so can be used to only suppress or redirect side-effectsin application code when called retroactively. This achieves the goal of advice codethat behaves identically whether woven directly or retroactively. See Figure 3.4 foran example. Here around advice is used to avoid undesired side effects by replacingcertain calls with no-ops.Note that this simple definition will not have the desired effect if the originalprogram also contains advice execution. However, this can easily be replaced witha more nuanced pointcut that picks out only the retroactive execution, possiblyjust by adding additional conjunctive clauses that name the aspects being wovenretroactively.281 / / Program code2 vo id fetchFoo ( char * foo ) {3 increaseFooRefCount ( ) ;4 computeFoo ( foo ) ;5 }67 / / Ana lys is advice8 a f t e r ( ) : c a l l ( bar ( ) ) {9 char foo [ 1 00 ] ;10 fetchFoo ( foo ) ;11 p r i n t f ( ” Foo i s : %s ” , foo ) ;12 }13 around ( ) : c f low ( adv iceexecut ion ( ) )14 && c a l l ( increaseFooRefCount ( ) ) {}Figure 3.4: An example of supressing unwanted side-effects using around ad-vice3.5 Exploratory ImplementationThis section defines our retroactive weaving prototype for ACC. The system con-sists of a compiler that compiles ACC aspect files into a C library and two differentruntimes for evaluating the compiled aspects against a particular program execu-tion recording. In combination with the existing ACC weaver, this allows the sameACC code to be evaluated either inline or post hoc.3.5.1 CompilerOur compilation process consists of a chain of several distinct source transforma-tion phases taking ACC code as input and producing C code as output, followedby a call to an underlying C compiler (gcc in our case) to produce object files.The source transformation phases are implemented by extending the C Intermedi-ate Language (CIL) OCaml library [47], which is designed to support the analysisand transformation of C source.We have modified the CIL distribution to support parsing ACC code and repre-senting pointcuts and advice in its abstract syntax tree. The compiler also annotatesvariables and types to track which memory space they refer to by leveraging CIL’ssupport for gcc attributes, which are declarations with the syntax “$attribute”29that can be attached to nearly all elements of C syntax. In our case these annota-tions are either resolved $base or $aspect annotations, or annotation variablessuch as $$a that represent unknowns. Since the C type system includes valuetypes with multiple layers of indirection (e.g. pointers to pointers), the annota-tions are attached to all levels of types; as an example, the type char $base *$base * $aspect is interpreted to mean “pointer in the aspect space to a loca-tion storing a pointer in the base program space containing an immutable characterderived from program values.” This would be exactly the correct type to ascribe toa variable holding the beginning of a sequence of strings extracted from the origi-nal execution: the array itself would be a consecutive region of space in the aspectexecution, but the pointer values inside would refer to the memory of the originalexecution.The source transformation phases are outlined below.Annotation InferenceThis phase walks the ACC source tree in a bottom-up pattern, inserting annota-tions and variables as needed. The $base and $aspect annotations are firstintroduced as base cases into the Abstract Syntax Tree (AST) according to the se-mantics of weaving: values that are bound from the original execution throughpointcut parameters (e.g. args(a, b, c) and this->args) are assignedto the $base space at all levels of indirection, whereas values that are allocatedby the weaving runtime (i.e. targetName(x) and this->targetName aresimilarly assigned to the $aspect space.This phase also builds up a list of type constraints of the form T1 = T2 basedon nodes in the AST that require two address spaces to be the same: assignments,function arguments and return values, arithmetic, etc. Annotation variables arethen resolved by solving the constraints using straightforward unification [58].In general, values from different address spaces cannot be used together. Thismeans that even arithmetic combining such values is forbidden, although explicitcasts can always be inserted if necessary. An important exception is pointer arith-metic, where offset values can be from any address space, and the base pointerdetermines the address space of the result. This is necessary to support reading30values from arrays in aspect code, for example. Adding an annotation to repre-sent values from either address space would allow more valid aspects to be typedwithout additional source modification, at the cost of added implementation com-plexity.Annotation PropagationAs an additional aid to determining the address spaces of values in the source, thisphase takes advantage of the observation that addresses in the original program ad-dress space cannot point to values derived from the aspect space by construction;to arrive in such a state requires an assignment to a target lvalue, which is prohib-ited as an illegal side-effect. Therefore, the occurrence of any type variable as anaddress space annotation in a type at a deeper level of indirection than a $base an-notation can be replaced by $base as well. For example, char $$a * $base* $aspect will become char $base * $base * $aspect.This is an important rule for C code, and kernel code in particular, in which itis common to calculate addresses through raw numeric calculations followed by acast to a pointer. Ideally this rule could be incorporated into the general inferencingphase, but this rule cannot be encoded as a simple equality of types and hencecannot be included in a sound way. As future work, we plan to use a less simplisticinferencing algorithm in order to address this lack of power.Program State AccessIf the previous phases have not rejected the input program, the AST now con-tains only language constructs whose interactions with the program state are fullytractable. The next phase then replaces all instructions that read the program’smemory with calls to reconstruct the values from the retroactive weaving runtime.This includes the reads of any lvalue derived from the program execution, or tak-ing the address of any such lvalue. Addresses of symbols and data structure offsetsare calculated from information extracted from a debug build of the program. Thefunctions used to produce addresses of global variables and resolve register andmemory accesses are declared in a header file added to the source at this stage, tobe implemented by the particular backend weaving runtime.31Some special cases are implemented directly in the compiler backend since Cis not expressive enough to redefine code using around advice as described earlier.For example, printf takes a variable number of arguments, and the operations itperforms on those arguments depend on the format string; a pointer value matchedwith a %s pattern will be dereferenced, but one matched with a %p will not. Oursolution is to match the formats to the arguments and to copy values like stringsinto aspect space if printf will dereference them, and then pass the modifiedarguments to the original implementation. This supports calls that pass a mix ofprogram and analysis pointers, which is useful given that some joinpoint data con-sists of analysis pointers (e.g. function names as char * values).Transformation from ACC to CThe final component is responsible for splitting the aspect bodies from their asso-ciated pointcuts. For a single given retroactive aspect, this component produces aC source file with three separate artifacts:1. A set of advice bodies transformed into regular functions by discarding theirpointcuts and advice kind, along with stub methods for invoking the advicebody functions with a unified interface;2. A function table for these stub methods; and3. A string constant containing the set of aspects from the input in ACC syntax,with their bodies discarded, in the same order as in the function table.This enables parsing the aspect stubs and dynamically invoking their bodyfunctions without recompilation of the target weaving runtime.3.5.2 RuntimeThis section describes the interface between the aspect runtime and the target exe-cution environment and a brief description of the two target environments we cur-rently support: one instantiates state on demand from an instruction level traceproduced by the Tralfamadore [37] dynamic analysis framework, and one pro-vides hooks into the Quick Emulator (QEMU) virtual machine monitor [11], which32we have modified to perform deterministic recording and extract virtual machinestate during replay. For these backends, the retroactive aspect compiler producesa shared library containing the aspects in the original ACC source as regular Cfunctions, with metadata consisting of the advice kind and pointcuts attached.This shared object runtime exports a common retroactive weaving interface, whichserves two purposes: first, to provide the target environment with event notifica-tion callbacks on events such as function calls and returns and context switchesto produce join points, and second, to let it register functions for inspecting targetregister and memory state.Understanding Kernel ExecutionRunning aspects against kernel execution is made somewhat more complicated bycontext-sensitive pointcuts such as cflow(pc), which matches any join pointthat occurs inside, or beneath, the pointcut pc. User-level aspects can refer tothe threading abstraction provided by the operating system to uniquely identifyindividual flows of execution (i.e. pthread self()) and track joinpoint stackscorrectly.Kernel code is more challenging because although the notion of thread exists,it is not a good abstraction for tracking individual flows of execution due to in-terrupts and exceptions. Interrupt handlers execute in the context of the currentlyrunning thread but are conceptually different flows of execution. Identifiers suchas current, which maps to the currently executing task, cannot be used to dis-tinguish between threaded (system call, kernel threads) and interrupt handler exe-cution. Interrupts can also be nested, making this problem worse.Because our trace-based framework is designed to analyze kernel execution,it provides an abstraction to demultiplex individual flows of kernel execution. Ituses platform-specific rules to isolate system calls, interrupts, exceptions and ker-nel threads and label them with a unique opaque identifier. These rules are bothhardware and operating system specific and require tracking stack switching, inter-rupts, exceptions and instruction such as iret and sysexit.We have extended the implicit structure encoding the current join point to sup-port the expression this->cflowID. This evaluates to the flow identifier pro-33vided by the backend runtime that can be used to index per-control-flow infor-mation for later retrieval, while still encapsulating the details of how independentcontrol flows are tracked. We believe this to be a logical, generic extension toAOP and suggest that it could be used in any pointcut and advice language to avoiddependencies on specific threading libraries.Trace-based RuntimeThe Tralfamadore backend is completely offline in that running aspects involves nore-execution of guest code at all. All updates to register and memory that occurredduring execution recording are present in the trace. Because Tralfamadore uses amodified version of QEMU to capture traces, it can be used to record the executionof an unmodified operating system kernel.To add support for retroactive advice execution, we implemented a new toplevel operator stacked on top of the existing kernel flow and function call operators.This new operator implements a driver loop that pulls on these events and calls intothe weaving runtime whenever an event of interest is recognized. Tralfamadorealso supports tracking the state of registers and provides a memory index whichsupports efficiently finding the last update to a given memory address range. Weuse both of these features to support callbacks from the weaving runtime to inspectguest state.Deterministic Replay RuntimeWe have also implemented a retroactive weaving runtime directly into a virtualmachine monitor (QEMU), which we have enhanced to perform deterministic ex-ecution recording and replay [25]. It records all non-deterministic events (such asexternal interrupts or reads of the CPU timestamp counter) occurring during exe-cution so that they may be injected at the same point in execution during replay.We have also added hooks into QEMU to register callbacks that can be in-voked before and after the execution of any basic block. We use these hooks totrack individual flows of kernel execution similarly to Tralfamadore and to trackfunction calls and return. Whenever such an event occurs, these hooks call intothe retroactive weaving library, potentially calling aspect code. Accessing guest34state is much more efficient than in the trace-based approach: it is simply a matterof mapping pointers into QEMU’s data structure representing the memory of thetarget virtual machine. Compiled aspects and the retroactive weaving runtime areloaded and unloaded into QEMU during replay using the standard dynamic loadedlibrary mechanism.Deterministic recording and replay can be made very efficient through the useof CPU performance counters [25], costing as little as 5% overhead in VMWare [63].Our prototype is much more expensive (approximately 20x) due to its pure soft-ware implementation, which makes it easier to hook instrumentation into replayedexecution. As AfterSight [19] demonstrated, it is possible to use low-overheadhardware deterministic recording as the source for software replay, and we hope todo this ourselves in future work.3.5.3 Lessons LearnedAlthough we we successfully authored and executed several retroactive ACC as-pects using this prototype, we found that the need to provide explicit type annota-tions when the type inference could not determine them was a substantial burden.The second implementation described in Chapter 5 instead uses the equivalent ofruntime type dispatch to address the dual address space problem, including dy-namic type checks to detect incorrect usage. The overhead of this approach wasnot found to be significant compared to the overhead of retroactive execution ingeneral.In addition, we discovered that providing a language runtime that executes codeusing recorded state was a substantial undertaking, made especially challenging bythe lack of a managed environment and adequate abstraction. Raw pointer manipu-lation often prevented retroactive execution from completing successfully becauseit was not possible to determine which address space was being dereferenced.Finally, the presence of inline assembly in much of the Linux kernel greatlyincreased the engineering effort that would be necessary to support enough of thetarget code base to evaluate larger and more useful examples of retroactive aspects.Although time-travel programming as a paradigm is theoretically applicable to anyprogramming language, applying it to assembly would likely face more severe35issues with easily ensuring isolation and avoiding side-effects.3.6 SummaryWe have presented the concept of retroactive advice as a unified, source-level ap-proach to post hoc dynamic analysis. We have also described our prototype imple-mentation of a retroactive weaver for ACC. We assert that evaluating aspects onprior executions of a program opens up a wide range of analysis applications thatmight otherwise be infeasible.36Chapter 4Essential Retroactive Weaving4.1 IntroductionOur experience with retroactive weaving motivates us to express the concept inabstract, simpler, and more precise terms. Our contribution in this chapter isa more essential and general expression of the idea, which should be applica-ble to any aspect-oriented language with varying effectiveness. Our presentationcentres around a simple functional language called Retroactive Aspect Program-ming Language (RAPL), which serves as a minimally-defined example to illustratehow retroactive weaving interacts with core programming language features. Wepresent the language and its definitional interpreter in three layers: we first presentthe base functional language, then extend it to support aspects, and then introducesupport for retroactive weaving. Finally, we explore a general notion of soundnessfor retroactive weaving and explain how our example language and interpreter re-alize it.4.2 Base LanguageWe first describe the functional core of RAPL, which we intend to reflect theessence of many modern programming language features as simply and unsurpris-ingly as possible. RAPL has integers, Booleans, and atomic symbols as primitivedatatypes. It provides symbols for marking join points and expressing pointcuts37v ∈ IDENT, n ∈ Z, s ∈ STRING, t ∈ TERMt ::= true | false | (equal? t t) | (if t t t)| n | (+ t t) | (× t t)| v | (lambda (v∗) t) | (t t∗)| (rec t) | (let ([v t]) t) | ′v| (box t) | (unbox t) | (set-box! t t)| (seq t t) | (void) | (read s) | (write s t)Figure 4.1: Grammar for terms in the base version of RAPL1 (rec (lambda (fact)2 (lambda (x)3 (if (equal? x 0)4 15 (* x (fact (+ x -1))))))))Figure 4.2: Factorial function in RAPL(see Section 4.3). It also includes a void value, which is returned from effectful op-erations. It supports first-class functions and mutable boxes, using a call-by-valueevaluation strategy. Finally, it includes two operations named read and writefor external input and output of integers, respectively, as these interact non-triviallywith retroactive weaving. Figure 4.1 contains the RAPL expression grammar; ourreference interpreter, which is in the PLAI Scheme dialect of Krishnamurthi [36],contains an equivalent ExprC datatype. The complete interpreter implementationis included as Section A.3.We use the canonical factorial function as a running example; its implementa-tion in RAPL is shown in Figure 4.2. The rec operator is used to create a recursiveunary function.Because retroactive weaving interacts significantly with external side-effects,we also need an example that interacts with its external environment, and hence wedefine a program that repeatedly prompts for a number in Figure 4.1. Again apply-ing the rec operator creates a recursive function that takes a single argument, butin this case the argument is ignored and the result is a recursive thunk. If the inputis zero, the program terminates. Otherwise it displays the result of applying the38Listing 4.1: RAPL code for an interactive factorial loop(rec (lambda (fact_prompt)(lambda (v)(let ([in (read "x")])(if (equal? in 0)in(seq (write "fact(x)" (fact in))(fact_prompt (void)))))))))factorial function from Figure 4.2 (bound to the identifier fact here) and promptsfor the next number.Here is an example of interacting with this program.x> 3f a c t ( x ) : 6x> 4f a c t ( x ) : 24x> 0Program r e s u l t : 0Our reference interpreter also uses the following internal definitions, which arereferenced in later excerpts:(define-type Value[numV (n number?)][boolV (b boolean?)][closV (arg symbol?) (body ExprC?) (env Env?)][boxV (l Location?)][symbolV (s symbol?)])(define-type Binding[bind (name symbol?) (value Value?)])(define Env? (listof Binding?))(define mt-env empty)(define Location? number?)(define-type Storage[cell (location Location?) (val Value?)])(define Store? (listof Storage?))(define mt-store empty)(define-type Result[v*s (v Value?) (s Store?)])39; Top-level evaluation function; ExprC -> Value(define (interp-exp expr)(v*s-v (interp expr mt-env mt-store))); Recursive interpretation function; ExprC Env Store -> Result(define (interp expr env sto)(type-case ExprC expr...))4.3 Adding AspectsWe now extend RAPL to include a single aspect-oriented feature: a simple form ofpointcuts and advice for function application. AspectJ [34] provides the quintessen-tial example of pointcuts and advice, in which pointcuts quantify sets of join pointsto affect and advice methods define how the join points are modified. Advice maybe declared to run before, after, or around (i.e. in place of) a quantified join point;the latter flavour is the most expressive, and supports a distinguished proceedexpression that represents resuming the original join point, or, in the case of over-lapping advice, the next advice method in the chain.In RAPL, the only type of join point supported is the application of functions.Since RAPL has first-class function values, around-style advice can be expressedas higher-order functions that accept and return functions. Within the body of suchan advice function, invoking the passed-in function is analogous to the proceedexpression in AspectJ, as shown in previous work [26].Adding aspects to the base version of RAPL requires two new expression cases:t ∈ TERM, t ::= ...| (tag t t)| (aroundapps t t)We first add a mechanism for tagging values with arbitrary metadata to thelanguage: the expression (tag t e) dynamically attaches the value of the texpression to the value of the e expression. Tagged values behave identically tountagged values, except that computation involving tagged values can be identifiedand modified by advice. The tagging construct provides a means of identifying40join points, since otherwise function definitions have no external identity. Thesub-expressions that are explicitly tagged in an expression represent the modularinterface that it exports as subject to advice. In practice the associated tags wouldlikely be derived from higher-level language features.RAPL programs dynamically register advice using expressions of the form(aroundapps a e): while evaluating the expression e, the advice a is used topotentially wrap every tagged abstraction value before it is applied to an argument.Advice takes the form of a function that accepts two arguments: the tag attachedto the function being applied and the untagged form of that function. The result ofthe advice function is then applied to the argument in place of the original.Here is an example of an aspect that applies to the factorial function as definedabove. This aspect traces the argument and result of each recursive call to thefactorial function:Listing 4.2: RAPL code for an aspect to trace the factorial function(lambda (thunk)(aroundapps(lambda (t original)(if (equal? t 'fact)(lambda (y)(seq (write "y" y)(let ([result (original y)])(seq (write "result" result)result))))original))(thunk))Since RAPL does not have modules, we encapsulate this aspect as a functionthat takes a computation which is frozen in a thunk and advises the result of acti-vating it. Common higher-level language features such as top-level definitions andglobal namespaces would make composing these modules less awkward.We tag the factorial function so the advice will apply:(rec (lambda (fact)(tag 'fact(lambda (x)(if (equal? x 0)1(* x (fact (+ x -1))))))))41We may then combine the modules by applying the aspect function to the fac-torial prompt thunk:Listing 4.3: Possible input and output for applying Listing 4.2 to Listing 4.1x> 3y : 0r e s u l t : 1y : 1r e s u l t : 1y : 2r e s u l t : 2y : 3r e s u l t : 6f a c t ( x ) : 6x> 0Program r e s u l t : 0Both advice and tagging may be nested. The next section provides a concreteimplementation of aspect weaving that precisely defines the semantics for bothforms of nesting.4.4 Aspect WeavingWe now modify the interpreter to implement the semantics of aspects via dynamicweaving; coordinating the cross-cutting concerns as expressions are evaluated. Thescope of an aroundapps declaration is the dynamic extent of its expression ar-gument and hence must be tracked dynamically, rather than bound to closures asthe environment is. The stack of active advice is therefore maintained in a separateargument to the interpretation function, and is generally passed down through re-cursive invocations of interp. Other forms of advice, such as advising operationson boxes, would be defined as additional type cases in the Advice datatype. Theimplementation of tag and aroundapps, represented by the ExprC cases tagC(tag v) and aroundappsC (advice extent) respectively, is trivial:(define-type Advice[aroundappsA (advice Value?)])(define AdvStack? (listof Advice?)); ExprC Env AdvStack Store -> Result42(define (interp expr env adv sto)(type-case ExprC expr...[tagC (tag v)(interp-tag tag v env adv sto)][aroundappsC (advice extent)(interp-aroundapps advice extentenv adv sto)]...)); ExprC ExprC Env AdvStack Store -> Result(define (interp-tag tag v env adv sto)(type-case Result (interp tag env adv sto)[v*s (v-tag s-tag)(type-case Result (interp v env adv s-tag)[v*s (v-v s-v)(v*s (taggedV v-tag v-v) s-v)])])); ExprC ExprC Env AdvStack Store -> Result(define (interp-aroundapps advice extent env adv sto)(type-case Result (interp advice env adv sto)[v*s (v-a s-a)(let ([new-adv (cons (aroundappsA v-a) adv)])(interp extent env new-adv s-a)]))The only other case directly affected is applications. The base implementationfirst reduces the function and the arguments to values, then invokes the applyroutine to evaluate the function body with the augmented environment:; Value (listof Value) AdvStack Store -> Result(define (apply f args adv sto)(type-case Value f[closV (params body env)(let ([bs (map bind params args)])(interp body (append bs env) adv sto))][else (error "only functions can be applied")]))To handle advice, we replace this with a new version that first applies all ad-vice in scope to the function before applying the result to the given argumentsusing the original apply routine. The core new operation is applying a singleadvice definition to a function, implemented by weave-advice. Because theactual identification of join points occurs in the source language, the interpreterimplementation is relatively simple:43; Value (listof Value) AdvStack Store -> Result(define (apply-with-weaving f args adv sto)(type-case Result (weave adv f sto)(v*s (v-w s-w)(apply v-w arg adv s-w))))); Applies all advice in scope for all tags on f; AdvStack Value Store -> Result(define (weave adv f sto)(type-case Value f[taggedV (tag tagged)(type-case Result (weave adv tagged sto)[v*s (v-w s-w)(weave-for-tag adv tag v-w s-w)])][else (v*s f sto)])); Applies all advice in scope for a single tag on f; AdvStack Value Value Store -> Result(define (weave-for-tag adv tag f sto)(if (empty? adv)(v*s f sto)(type-case Result (weave-advice adv tag(first adv) f sto)[v*s (v-w s-w)(weave-for-tag (rest adv) tag v-w s-w)]))); Apply a single advice function to f; AdvStack Value Advice Value Store -> Result(define (weave-advice adv tag advice f sto)(type-case Advice advice[aroundappsA (g)(apply g (list tag f) adv sto)]))Nested advice declarations and tags are handled with a double iteration: firstover nested tags from innermost to outermost, and for each tag over nested advicedeclarations from inner to outer. Any tags attached to values do not affect theirsemantics outside of advice; for every other operation in the language the tags oneach operand are stripped before performing the original logic.4.5 Defining Retroactive AspectsTo define the semantics of retroactive aspects precisely we relate retroactive weav-ing, which is the process of executing them, to execution in the presence of con-44ventional weaving for AOP languages in general. We use mathematical notation toexpress the key requirement.Let E be the partial function that represents evaluating a program to produceobservable results, which is undefined for evaluations that do not terminate. Forimpure languages with external input/output, E depends not only on a term but itscontext, which we denote with c ∈ CTXT. The range of E is OBS; for RAPL thisis the result value and any output. Therefore E : PGM×CTXT→ OBS.As outlined in Section 2.4, Masuhara and Kiczales [43] model AOP in generalvia a weaving process with signature W : A×B→ X. Retroactive weaving appliesto those instances of AOP frameworks where at least one of the input languagesis executable independently and its semantics subsumed by X. This is true ofAspectJ, for example, but not of DemeterJ [39]. WLOG, we assume the executablelanguage is B, and relabel the weaving signature as W : ASPECT× PGM→ PGM.Therefore executing an aspect a together with a program p in context c is modelledas E(W (a, p),c).We augment E to also produce a trace, an abstract notion of intermediate com-putation information: Etraced : PGM×CTXT→ OBS×TRACE.Retroactive weaving is another evaluation function RW : ASPECT×TRACE→OBS. Its behaviour is defined by an invariant: for any program p, aspect a and ex-ecution context c, if Etraced(p,c) = (o, t), and RW (a, t) is defined, then RW (a, t) =E(W (a, p),c).The intuition behind this model is that a trace encodes partial information aboutthe past execution context. Retroactive weaving produces the observable behaviourthe augmented program would have produced in that context. If the augmentedprogram depends on missing information, the process must signal an inconsistencyerror, represented here by undefinedness in RW.4.6 Retroactive WeavingWe now describe how our definitional interpreter provides both tracing and retroac-tive weaving.454.6.1 Recording and Reading TracesThe augmented interpreter must be able to record the relevant information duringone execution and read this information during a future execution. Thus we firstrequire interpretation to record a subset of the states it reaches while interpretingexternally, so that retroactive weaving can identify join points post-hoc and applyadvice as required. Within this context we use trace to refer to an ordered list ofrecorded interpretation states.To consume traces one state at a time during retroactive execution, we addanother parameter to the interpretation function for the remaining trace to read.The head of this trace represents the current state of the original execution; undernormal evaluation this list will be empty. The retroactive weaving process reacts toeach recorded state, potentially performing additional execution, before moving tothe next recorded state by popping the head of the trace list.; ExprC Env AdvStack Store Trace -> Result(define (interp expr env adv sto tin)...)(define Trace? (listof State?))(define mt-trace empty); Trace -> State(define (trace-state tin)(first tin)); Trace -> Trace(define (next-trace-state tin)(rest tin))To produce traces, we extend the Result data type so that every computationcan also provide the trace for that computation. In addition, during retroactiveweaving the input trace must be threaded through the interpreter much as the storeis, and hence we also extend Result to include the remaining input trace:(define-type Result[v*s*t*t (v Value?) (s Store?)(tin Trace?) (tout Trace?)])The datatypes that define our version of tracing are as follows:(define-type Control[app-call (f Value?) (args (listof Value?))]46[app-result (r Value?)])(define-type State[state (c Control?) (adv AdvEnv?) (sto Store?)(tin Trace?)])The Control datatype enumerates the different kinds of interpretation controlflow we record, in particular just before applying a function value to an argumentvalue, and just after such a call produces a result value. The State datatypecombines the interpreter control point with the interpreter arguments. It does notinclude the environment because advice in RAPL is modelled with function calls,and hence cannot access the lexical scope of join points, but this would not neces-sarily be true in AOP languages that make use of dynamic binding in aspects.Since our traces only carry information about function applications, only themain application routine extends the current trace:; Value (listof Value) AdvEnv Store Trace? -> Result(define (apply-with-weaving f args adv sto tin)(type-case Result (weave adv f sto)(v*s*t*t (v-w s-w tin-w tout-w)(type-case Result (apply v-w args adv s-w tin-w)(v*s*t*t (v-r s-r tin-r tout-r)(let ([c (state (app-call f args)adv sto tin)][r (state (app-result v-r)adv s-r tin-r)])(v*s*t*t v-r s-r tin-r(append (list c) tout-w tout-r(list r)))))))))Other operations in RAPL could be recorded in the same style; in general thesetraces are produced by appending the sub-traces for individual sub-computationstogether in evaluation order.We omit the details of serializing these traces to and from persistent storage, inour case one file per trace, as they do not affect the semantics of retroactive weav-ing. In practice, however, there is ample opportunity for optimizing the writingand reading of such traces, and scalable implementations can be quite sophisti-cated [55]. In particular, recording a full copy of the store at every event canbe expensive, and more practical implementations will instead record individualchanges to the store incrementally.In all but the simplest of programming languages and their environments, it will47not be possible or feasible to record all of the information a program could havequeried during its execution. This is especially true of more mainstream languagesthat have access to file systems, networks, and more unpredictable sources of val-ues such as the current time. The particular instance of tracing we present here ischosen to be simple and sufficient to support a reasonable number of retroactiveaspects, and we intentionally omit many other interpreter states as well as the ex-ternal input accessed via read during execution. For a more complete discussionof how this affects our implementation’s completeness, see Section 4.7<>.4.6.2 Retroactive StateThe state of the retroactive interpretation can build on the original interpretationstate. In particular, retroactive execution can use references to values, includingstore locations and their contents, from the original execution. Consider an al-ternative version of the factorial function which uses a box internally to track itscounter:(let ([fact_helper(rec (lambda (fact)(tag 'fact_helper(lambda (bx)(let ([x (unbox bx)])(if (equal? x 0)1(seq (set-box! bx (+ x -1))(* x (fact bx)))))))))])(lambda (x) (fact_helper (box x))))To create an equivalent version of Figure 4.2 for this implementation, the ad-vice needs to dereference the box passed to the helper function in order to obtainthe actual value of x. Therefore, the retroactive weaving interpretation must dealwith a mix of locations: new locations created during the retroactive evaluationand old locations from the trace. In addition, the values stored in old locations maychange as the trace is traversed, so references to old locations must somehow bekept current. Finally, it is necessary to distinguish old and new locations to detectinconsistent executions (see Section 4.7).Our approach is to add another case to the Value datatype to implement alayer of indirection on values obtained from the trace. This aligns closely with48how production implementations are likely to be implemented, as it allows theunderlying trace and its store to proceed independently of the retroactive state [60].(define-type Value...[traceValueV (v Value?)])When a value from prior state is bound by a retroactive aspect (such as the boxpassed to advice for the fact helper function above), it must be lifted to theretroactive context, so that new and old store locations can be distinguished. Thevalue may be a box itself, or it may be a compound value such as a closure whichmay transitively refer to store locations. We define a lift-trace-value func-tion in our interpreter for this purpose. The omitted Value cases are handled bystraightforward structural recursion: primitive values are untouched, and taggedvalues and the value bound by the environments stored in closures are lifted piece-wise.; Value -> Value(define (lift-trace-value v)(type-case Value v[boxV (trace-loc)(traceValueV v)]...))We then augment fetching from the store to handle boxes from the trace:; Store Trace Value -> Value(define (fetch sto tin b)(type-case Value b[boxV (loc) ...] ; As before[traceValueV (v)(type-case State (trace-state tin)[state (c adv s-t tin-t)(fetch s-t tin-t v)])][else (error 'interp "attempt to unbox a non-box")]))4.6.3 Retroactive ControlImplementing retroactive weaving involves producing the extra execution that anaspect specifies at various positions in the trace. When an application callback isapplied retroactively to an application in the trace, we need to use a placeholder49to resume the original execution - that is, reading the rest of the trace - instead ofevaluating the application. To achieve this we add another case for values:(define-type Value...[resumeV])Any tags on the original function must be carried over to the stub value so thatapplication advice will behave identically:; Value -> (listof Value)(define (all-tags v)(type-case Value v[taggedV (tag tagged)(cons tag (all-tags tagged))][else empty])); (listof Value) Value -> Value(define (deep-tag tags v)(foldr taggedV v tags)); Value -> Value(define (rw-resume-value v)(deep-tag (all-tags v) (resumeV)))Applying this value as if it were a function instead resumes the process ofweaving the trace:; Value (listof Value) AdvEnv Store Trace -> Result(define (apply-without-weaving f args adv sto tin)(type-case Value f[closV (params body env)(let ([bs (map bind params args)])(interp body (append bs env) adv sto tin))][resumeV ()(rw-call f args adv sto tin)][else (error "only abstractions can be applied")]))The core of the retroactive weaving implementation are these three mutuallyrecursive functions:; Value (list of Value) AdvEnv Store Trace -> Result(define (rw-call f args adv sto tin)(rw-result adv sto (next-trace-state tin))); Value (listof Value) AdvStack Store Trace -> Result(define (rw-replay-call f args adv sto tin)50(let ([resume (rw-resume-value f)][lifted-args (map lift-trace-value args))(apply-with-weaving resume lifted-args adv sto tin))); AdvStack Store Trace -> Result(define (rw-result adv sto tin)(type-case State (state-c (trace-state tin))[app-call (f args)(type-case Result (rw-replay-call f argsadv sto tin)(v*s*t*t (v-r s-r tin-r tout-r)(rw-result adv s-r(next-trace-state tin-r))))][app-result (r)(v*s*t*t (lift-trace-value r)sto tin mt-trace)]))rw-replay-call consumes the next sub-sequence of the trace from a func-tion application up to its corresponding result, and rw-result continues to con-sume such sub-sequences until it reaches the result for the current application.These routines essentially reconstruct the original tree of recursive calls to theinterpretation function. Note that these versions of the core retroactive weavingroutines do not produce a trace for retroactive weaving itself in order to simplifypresentation, but adding this tracing using the implementation strategy shown inSection 4.6.1 is straightforward.The top-level entry point to retroactive weaving is interp-rw, a separate butrelated function that corresponds to RW in the abstract model in Section 4.5. Asdemonstrated above, in RAPL advice can be declared in discrete modules if thecomputation they advise is provided as a parameter, delayed within a thunk. TheresumeV value is also used to represent the trace as such a thunk.4.7 Ensuring SoundnessThe implementation so far will behave correctly for many retroactive aspects. How-ever, not all retroactive aspects are sound according to the semantics defined inSection 4.5; if the augmented program attempts to access information that wasnot recorded, retroactive weaving is required to terminate in an error, whereas theinterpreter thus far may instead produce inconsistent observable behaviour.51The implementation above assumes that when a recorded state has been pro-cessed and the paused original execution resumed, that original execution wouldhave reached the same next recorded state in the trace. If a retroactive aspect per-turbs the program state in some way, the program may have continued to makean unsupported operation as above, so we cannot assume this is safe. Therefore,for this particular implementation of tracing and weaving, ensuring soundness isequivalent to ensuring that retroactive advice would not have perturbed the originalexecution.Since aspects in RAPL are quite general and expressive, there are several waysthat retroactive weaving can fail:New external side-effects: Advice itself might attempt to add additional inter-action with the original context. In RAPL this means extra calls to read, whichconceptually consume values from the program input prematurely and shift thevalues read by the original program, leaving the later inputs uncertain. This is pre-ventable by replacing the source drawn by the read expression with a stub thatraises an error during retroactive weaving.Modifying arguments: Advice may pass a different list of arguments to thewrapped function than was originally provided. To prevent this, we add withinthe rw-call function a comparison of the arguments the stub resumeV value isapplied to against the arguments provided in the original join point.Modifying results: Similarly, advice may return a different result than a joinpoint of the original computation. Another check must be inserted before returningfrom rw-replay-call to compare the value produced by the advice stack tothe original.Modifying the original store: Advice could also perturb the original exe-cution more indirectly by mutating boxes, so we modify the implementation ofset-box! to raise an error if the given box is a reference to the original store(i.e. a traceValueV as described in Section 4.6.2).Modifying control flow: More deviously, advice may fail to invoke advisedfunctions, or invoke them more than once. Because the construct that representsproceeding in advice is a first-class value (i.e. a function), it could also be boundand applied later, outside the scope of the advice. All these cases can be pre-vented by attaching the length of the remaining trace to the stub resumeV value52at the time it is created, and comparing this to the current length of the remainingtrace in the store value whenever it is applied. This ensures each stub is appliedin order and no more than once. An additional check after the top-level call torw-replay-call to verify that the entire trace has been consumed ensures thateach stub is applied at least once.4.7.1 Deterministic ReplayThe restrictions above depend heavily on the exact information recorded duringthe original execution. Rather than recording the full state of interpretation at rel-evant points, which can be very expensive, the runtime could instead only recordnon-deterministic events, so that the state can be reconstructed by replaying theinterpretation. For RAPL, recording would become storing only the sequence ofinput integers, and replaying would involve assigning the read source to be thatsequence.The straightforward approach to retroactive weaving using replay is to trace thereplay process and then use retroactive weaving on the trace as above. This couldbe made more efficient by having the replay process produce a stream of stateswhich the weaver consumes. It is tempting to optimize this further by directlyweaving the retroactive aspects against the original program during the replay in-terpretation instead. This is not sound in general, though, without again modifyingthe interpreter to guard against new retroactive external side-effects as above.4.8 Related WorkRAPL bears a strong resemblance to AspectScheme [26], another aspect-orientedlanguage with first-class function values. The key place they differ is that As-pectScheme is an AOP extension to an existing full-featured programming lan-guage, whereas RAPL is intended to be a core language, with the minimum fea-tures required to support retroactive weaving. Some of AspectScheme’s features,such as statically scoped advice and equality of functions via source location, canbe expressed via desugaring to RAPL.De Fraine et. al. provide a core calculus for AOP with their A calculus [23].The A calculus is object-oriented, but like RAPL also models proceed as binding a53closure-like value, and supports passing said closures as first-class values. Becausethe RAPL interpreter is more focused on modelling two alternative strategies of as-pect weaving, it avoids object-orientation and types to keep the semantics simpler.4.9 SummaryWe have presented the concept of retroactive weaving as an abstract concept di-rectly related to the semantics of conventional aspect weaving for aspect-orientedprogramming languages. We provided a definitional interpreter that implementsretroactive weaving for a simple core language, illustrating the interaction of retroac-tive weaving with common core language features. Finally, we discussed thesoundness requirement for such an implementation and its consequences.54Chapter 5Retroactive Execution on theJVMThis chapter focusses first on the retroactive execution facet of time-travel pro-gramming, as it provides significant utility independent of retroactive weaving,and its efficient implementation for a full-featured modern language is non-trivial.We present an architecture that enables additional execution in the context of aprogram snapshot, as if a live, debuggable process had been restored from thesnapshot. This allows developers to invoke any code present in the original pro-cess, or even to load new analysis code into the emulated process, with no needfor metaprogramming. This functionality is implemented as an ordinary libraryand does not require a custom language runtime. The execution is made soundby forbidding the recorded objects from accessing state external to the snapshot,since the original environment has been lost and cannot be accurately emulated.We hence refer to these objects as holographic objects: accurate recreations thatcannot interact directly with the outside world.The abstract model used to represent the state in a program snapshot for holo-graphic objects can also represent any instantaneous state within an executionrecording over time. Chapter 6 hence expands on this architecture to implementretroactive weaving.The main contributions of this chapter are:55• an architecture for holographic objects, which enable restricted executionstarting from the state captured in a heap dump;• evidence that this architecture can be efficiently implemented in a statically-typed language on an unmodified commodity language runtime; and• evidence for the utility of holographic objects by using them to diagnose anunsolved memory leak in a mature mainstream application.The implementation discussed here supports the Java programming languageand runs on the JVM. The general approach based on emulating language seman-tics, however, is applicable to other language runtimes as well; see Section 5.1.3for a discussion of the requirements to support holographic execution efficiently.5.1 Holographic Virtual MachinesBecause it offers high fidelity, a snapshot of a process at the time of a failure is anearly omnipresent feature of modern programming language runtimes, and oftentranslates into a heap dump file that contains the state of every live object and itsconnections to other objects. A number of tools can parse this file and present thedeveloper with an interactive, browsable tree of values. This interface is familiarand useful for developers, as it parallels how a debugger models the state of a livesystem. Unlike live debugging, however, the developer cannot execute any of theirown code in the context of the failure, which is a critical piece of functionality asillustrated in Section 3.1.Ultimately all of these difficulties would be resolved if a heap dump analysistool could execute code on the objects in the snapshot, as if the execution occurredon the live process immediately after the snapshot was taken. We aim to providethis functionality through holographic objects, which are virtual objects that re-flect the state and behaviour of the objects recorded in the snapshot. This sectiondescribes the high-level architecture we have used to make this possible.To implement holographic objects, we require a reflective API for accessingthe state of the recorded objects, and an execution environment that implementsthe semantics of normal execution with respect to the reflective API instead ofin-memory native objects. For example, an instruction that accesses a field of an56object should have the effect of accessing that field from the holographic object inthe heap dump via our reflective API.Holographic objects should behave like the recorded objects they imitate, orelse any analysis performed on them may produce incorrect results. Any code inthe control flow of holographic execution that cannot be exactly reproduced basedon the information in the heap dump must result in an explicit exception. Thisincludes any attempt to access or mutate the external environment of the originalprocess, such as other processes, the file system, the network, and so on. This alsoimplies that holographic objects must be completely sandboxed: it must be impos-sible for them to obtain references to any objects in the host Virtual Machine (VM),or vice versa. Holographic objects are hence encapsulated inside a holographicvirtual machine. Values may only be passed between the guest and host VMs us-ing explicit reflective methods, and only primitive values1 are permitted to avoidleaking references.Figure 5.1 contrasts an ordinary VM with a holographic VM running insideanother ordinary VM. On the left is an ordinary virtual machine and its interactionswith the file system and external environment. On the right, a holographic VMsimulates the behaviour of a VM restored from a heap dump. A holographic VMis only permitted to read class files from disk and interact with the VM emulatingit through reflective methods. It is otherwise forbidden from interacting with itsenvironment.1In the case of the JVM, String values are permitted despite being objects, as they are a coretype whose implementation must be immutable.57VMCodeHeapExternalEnvironmentFileSystemTranslatorAdapterHolographicVMHeap DumpClass FilesHolographic File SystemVMCodeHeapExternalEnvironmentFileSystemClass FilesReflectiveMethodsFigure 5.1: The overall holographic objects architecture585.1.1 MirrorsTo support creating holographic objects on top of multiple snapshot formats, orindeed to other sources of object state, we define our reflective API using an inde-pendent set of reflective interfaces. The core functionality of the reflective accesswe need is apparent if we compare the heap dump model to existing reflective APIs.The Java platform includes two such APIs: a set of built-in reflective methods suchas Class.getFields, and the Java Debugging Interface (JDI) provided by theJava Platform Debugger Architecture [50], on which remote Java debuggers arebuilt. Each API provides similar functionality backed by different state: the built-in Java reflection methods reflect on the state of the current VM, the JDI reflectson the state of a separate VM being debugged, and the heap dump model reflectson the past instantaneous state of another VM.Bracha and Ungar [13] label such pluggable, independent reflection interfacesas mirrors, and we adopt their terminology here. We define a central interfacenamed VirtualMachineMirror, which encapsulates an entire object graphincluding all loaded classes and hence all executable code in the system. Otherinterfaces represent objects, classes, fields, methods, arrays, and so on.A holographic VM is then represented as a VirtualMachineHolographwrapper that refers to an underlying VirtualMachineMirror instance andimplements that interface itself. This achieves our goal of making holographic ob-jects a general-purpose library, as the wrapper can be applied to any representationof VM state that can be used to implement the generic mirrors API. The libraryprovides similar holographic wrappers for the other related mirror interfaces. Mostimportantly, the holographic implementation of MethodMirror.invoke doesnot delegate to the wrapped method mirror, but instead emulates the semantics ofthat method’s definition as described above.5.1.2 MutationsAlthough holographic execution is not allowed to read from or write to the stateof the outside world, many useful expressions that are semantically functional willhave internal side-effects that attempt to modify the internal object graph. Forexample, looking up a string key in a HashMap as described earlier requires cal-59culating the string’s hash code, and the implementation of string hashing caches theresult in an instance field of the string the first time it is calculated. This means ifthe method was not previously called on a string in the original VM, or if the stringwas newly-created as part of holographic execution, the method will attempt to seta new value on the mirror. The heap dump model is read-only, as many potentialmirror implementations will be, and hence this will trigger an exception.To support mutation in holographic execution, the holographic adapters super-impose a mutable mirror graph over the wrapped, potentially immutable graph.Each wrapper is initially empty, exposing state identical to that of the wrappedobject. As values are written to the holographic objects they are stored in the wrap-ping object, and future reads will return those values. The same approach appliesfor other more subtle side-effects on the mirrors API, such as expanding the setof classes loaded by a class loader by defining a new class. This polymorphismbetween old and new state in the model is analogous to the polymorphism in theRAPL interpreter’s Value type between the traceValueV data constructor andthe remaining data constructors, as described in Section 4.6.2.Maintaining mutations on the object graph independently in this way also of-fers flexibility in the semantics of multiple successive sessions of holographic ex-ecution. If the same holographic VM instance is used for each, the side-effects ofprior executions will potentially affect future executions. This is consistent with adeveloper’s experience when evaluating expressions in a normal debugging client.Alternatively, a new holographic VM can be instantiated for each evaluation andthen discarded, so each successive evaluation proceeds from the same pristine ini-tial snapshot state. This is equivalent to experimenting with a live process by re-peatedly forking a new sandboxed process that can be perturbed in arbitrary waysand then discarded.5.1.3 Translating CodeThe most obvious approach to implementing holographic execution would be cus-tomizing a language runtime to support it, but this would be non-portable. Weinstead chose the implementation strategy of translating programs into lifted ver-sions equivalent to holographic object semantics. Each instance of a core language60operation is mapped to methods of the mirrors API, and the implementors of thoseinterfaces thus determine the runtime behaviour of the language.This general approach applies to any programming language in which it is pos-sible to express a large subset of the language’s semantics in a reflective interface.Object-orientation is not a hard requirement, as illustrated by the implementationfor C described in Section 3.5; in that context a collection of methods such as bytereadMemory(void * addr) provide the equivalent interface to mirrors. Inaddition, holographic execution will be more performant and less complicated toimplement if at least some of the operations in the language have copy-by-value se-mantics, since those operations can be left untranslated and hence operate at nativespeed. Holographic execution works well for JVM bytecode since it is not purelyobject-oriented: primitive values are passed by value and primitive operations arenot customizable by user programs.Listing 5.1: Original Java code for the sample Employee class1 public class Employee {23 private in t age ;45 public s ta t i c Set<Employee>6 over40 ( Employee [ ] i npu t ) {78 Set<Employee> r e s u l t =9 new HashSet<Employee> ( ) ;10 for ( Employee e : i npu t ) {11 i f ( e . age > 40) {12 r e s u l t . add ( e ) ;13 }14 }15 return r e s u l t ;16 }17 }Listing 5.2: Original JVM bytecode for the sample Employee class1 public class Employee {23 private I age6145 public s ta t i c over40 ( [ LEmployee ; ) LSet ;6 L07 NEW HashSet8 DUP9 1 ) | INVOKESPECIAL HashSet.< i n i t> ( )V10 ASTORE 111 L112 ALOAD 013 DUP14 ASTORE 515 2 ) | ARRAYLENGTH16 ISTORE 417 ICONST 018 ISTORE 319 GOTO L220 L321 ALOAD 522 ILOAD 323 3 ) | AALOAD24 ASTORE 225 L426 ALOAD 227 4 ) | GETFIELD Employee . age : I28 BIPUSH 4029 IF ICMPLE L530 L631 ALOAD 132 ALOAD 233 5 ) | INVOKEINTERFACE Set . add ( LObject ; ) Z34 POP35 L536 IINC 3 137 L238 ILOAD 339 ILOAD 440 IF ICMPLT L341 L742 ALOAD 143 ARETURN6244 }Listing 5.3: Translated JVM hologram bytecode for the sample Employeeclass1 public class hologram / Employee2 extends ObjectHologram {34 / / I n he r i t e d from ObjectHologram :5 / / pub l i c LOb jec tM i r ro r ; m i r r o r6 public f i n a l s ta t i c LClassMi r ror ; c l a ssM i r r o r78 public s ta t i c over40 ( Lhologramarray1 / Employee ; )9 Lhologram / Set ;10 L011 LINENUMBER 17 L012 NEW hologram / HashSet13 DUP14 1 ) | GETSTATIC hologram / HashSet . c l a ssM i r r o r15 | : LClassMi r ror ;16 | INVOKEINTERFACE ClassMi r ro r . newRawInstance17 | ( ) L Ins tanceMi r ro r ;18 | INVOKESPECIAL hologram / HashSet.< i n i t>19 | ( L Ins tanceMi r ro r ; ) V20 ASTORE 121 L122 ALOAD 023 DUP24 ASTORE 525 2 ) | INVOKEINTERFACE Ar rayM i r ro r . leng th ( ) I26 ISTORE 427 ICONST 028 ISTORE 329 GOTO L230 L331 ALOAD 532 ILOAD 333 3 ) | INVOKESTATIC ObjectArrayHologram . getHologram34 | ( LOb jec tAr rayMi r ro r ; I ) LHologram ;35 | CHECKCAST hologram / Employee36 ASTORE 26337 L438 ALOAD 239 4 ) | GETSTATIC hologram / Employee . c l assM i r r o r40 | : LClassMi r ror ;41 | LDC ” age ”42 | INVOKESTATIC InstanceHologram . ge t I n t F i e l d43 | ( LHologram ; LClassMi r ror ; LS t r i ng ; ) I44 BIPUSH 4045 IF ICMPLE L546 L647 ALOAD 148 ALOAD 249 5 ) | INVOKEINTERFACE hologram / Set . add50 | ( LHologram ; ) Z51 POP52 L553 IINC 3 154 L255 ILOAD 356 ILOAD 457 IF ICMPLT L358 L759 ALOAD 160 ARETURN61 }Our particular implementation targets the JVM, and hence the source languageit translates is JVM bytecode using the ASM bytecode processing framework [14].We label the translated versions as hologram classes. These classes are encap-sulated within the holographic VM API and never directly exposed to developersor tools using holographic objects. All instance field declarations in the originalclasses are replaced with a single ObjectMirror instance field, and the indi-vidual bytecode instructions are lifted to operate on those instances. The transfor-mations are all local and context free, although a single instruction will frequentlybe translated into multiple instructions. Figures 5.1, 5.2 and 5.3 contain a smallexample of how bytecode is transformed. Modified instructions are numbered inboth the original and translated bytecode listings. Only object and array instruc-tions are modified; control flow and primitive instructions are left untouched. Note64that downcasts such as the one in Figure 5.3 are often necessary to ensure typesafety.The semantics of holographic execution imply that holographic object refer-ences require two orthogonal dimensions of polymorphism: the original class hier-archy for virtual and overloaded method invocations, and the virtual mirror inter-face methods for object state access. We have chosen to map the original hierarchyinto an isomorphic hierarchy of hologram types which preserve the subtype rela-tion, allowing method invocation to operate as in the original bytecode. Our tech-niques are similar in several respects to those used by Factor et. al. [27] to transpar-ently rename classes in order to support instrumentation of core Java classes. Notethat another approach would be to replace object references with direct mirror ref-erences and implement dynamic method dispatch manually instead. We suspectthat the overhead of handling method dispatch is likely equal to or worse than theoverhead of wrapping mirrors with hologram class instances.Each source type is usually mapped to exactly one internal type, but in somecases maintaining the subtyping relationship in user-level code requires splittingthe type into a concrete class and an interface. The mapping function betweenhierarchies is therefore actually defined by two functions: HC, which is guaranteedto be a concrete, instantiable class, and HT , which may be an interface. Thesefunctions obey the following properties:• For all types C in the original bytecode, HC(C)<: HT (C)• For all types C and D in the original bytecode, if C <: D, then HT (C) <:HT (D)In general, HC(C) is used wherever new instances of C are created, or whenC is used as the superclass of another class, whereas HT (C) is used wherever Cis used as a reference type for local variable, method parameters, and so on. Thecases where the two functions differ are outlined below.Interfaces: Object is both the base class of all concrete classes and the topof the subtyping lattice, and hence a supertype of interfaces as well. References oftype Object are mapped to a Hologram interface, which all hologram classesand interfaces implement and extend, and which has a single getMirror()65method. Where Object appears as a superclass, however, an ObjectHologramclass is used instead, which actually declares the mirror field and implementsgetMirror().Arrays: Each distinct array type, which is normally created automatically inthe JVM without requiring explicit class definitions in source, is mapped to a dis-tinct class type; although there is no virtual method dispatch on arrays, array typescan still create valid method overrides when used as parameter types. These mustalso be split, since they must be concrete and instantiable but also support multi-ple inheritance because of covariance; for each interface A that B implements, thehologram type for B[] must be a subtype of the hologram type for A[].If T is an array type with reference element type E and n dimensions (i.e. thetype E[][](n)[]), we use HAC(E,n) and HAT (E,n) to refer to HC(T ) andHT (T ), which will be a class and an interface, respectively. The extends andimplements clauses for these types are defined according to the following rules:• HAC(E,n) implements HAT (E,n)• If E extendsC, then HAT (E,n) implements HAT (C,n)• If E implements I, then HAT (E,n) implements HAT (I,n)• For all n > 0, HAT (Object,n) implements HAT (Object,n−1)The last rule above is necessary because of array subtyping covariance andthe fact that Object[] <: Object. Note that HAT (Object,0) is simplyhologramType(Object), which is the Hologram interface.Since the results of translating bytecode will be the same for successive holo-graphic VMs on the same heap dump, our system caches translated bytecode ondisk to improve performance. The cache is a fast associative mapping keyed byclass name with sequential separate chaining to handle multiple classes with thesame name. This approach is effective since the name of a class is by far its mostspecific characteristic, but still handles multiple classes with the same name occur-ring in a single VM.Holographic JVMs also provide an optional prepare operation that iteratesthrough all currently loaded classes and eagerly generates the translated bytecode66for each, which will pre-populate the cache. This will often be the preferred work-flow: a holographic JVM could be prepared in advance and the cached bytecodedistributed along with the heap dump.5.2 ScopeThere are several obstacles that may prevent holographic execution from emulat-ing live execution soundly. All are direct results of missing information in thesnapshot, although in most cases these can be solved by additional configurationprovided by the user. This section outlines the factors that limit the completenessof this technique and the extent to which we are able to overcome them in ourimplementation.5.2.1 Missing BytecodeThe most immediate obstacle to holographic execution on the JVM is the fact thatheap dumps generally do not contain any bytecode, as most JVM implementationsmaintain class definitions in a separate area of memory. We must somehow re-cover the definitions of the classes in the heap dump in order to execute any code.Class definition on the JVM is dynamic: the core ClassLoader class and itssubclasses are used to locate the bytecode for a requested class name at runtime.The implementation of these class loaders can be arbitrarily complex and is oftennon-trivial in popular application containers such as OSGi, and so providing themissing bytecode in a holographic VM though manual configuration is not feasi-ble.Our solution is to leverage the fact that nearly all class loaders eventually loadbytecode from a class file on the file system, and more specifically one that matchesthe requested class name. We use holographic execution itself to call the appropri-ate method on the class’ loader to read the contents of the matching class file. Thisapproach is valid for the vast majority of Java code, but for full generality thispiece of the architecture is pluggable so that more unusual class loaders that dy-namically modify or generate bytecode can be handled when the generic solutionfails to locate a class file.Since the state of the original file system at execution time is also not captured67in the heap dump, an exception would normally be thrown when this mechanismattempts to access the file system. However, the configuration of a holographicVM includes a simple finite mapping from paths in the original file system to pathsin the file system of the host machine, creating what we call a holographic filesystem. Whenever holographic execution attempts to access the original file sys-tem, the path is remapped onto the host file system. This approach also works forpaths inside compressed class file archives (“jar” files), and could theoretically beextended to support more atypical sources such as URLs.This system, in combination with other hooks for external input and outputdescribed in the following section, resembles how the KLEE symbolic executiontool [15] simulates the external environment. Where KLEE represents values andcontrol flow affected by the external environment symbolically, the holographicVM requires an exact simulation of the past environment and does not tolerateuncertainty, instead raising an error to indicate missing information.Assuming that the developer using this system can provide a copy of the sameversion of the compiled class files, which should not be difficult given the preva-lence of source version control systems in software development, the holographicfile system allows any class loading logic to read the correct bytecode from disk.This solution also allows the holographic VM to load classes that were availableto the original VM but not yet loaded, which is often necessary when holographicexecution hits code that hadn’t yet been executed in the original VM. Further-more, another workaround for the problem of unusual class loaders above is topre-generate the relevant bytecode as class files in the mapped file system.5.2.2 Native MethodsMany programming language platforms feature standard runtime libraries that areimpure, in that some of the provided features are not implemented in the languagebut instead handled by the runtime itself with no corresponding source code. Inthe case of the JVM, many low-level methods in the core Java Runtime Environ-ment (JRE) library are native methods, which means they have no bytecode but areinstead handled directly in the JVM. Such native methods cannot be called directlyon holographic objects, but holographic execution will inevitably encounter them.68Even calling a toString method as described earlier is almost guaranteed to hitthe native System.arraycopy method somewhere in its control flow.Native methods may have arbitrary effects on the external environment of theVM, and hence some cannot be called in holographic execution. Many are purelyfunctional in behaviour, however, and are only implemented in native code for ef-ficiency. The holographic VM architecture includes another pluggable mechanismfor providing semantically equivalent, Mirror-based Native Methods (MNMS), andincludes such implementations for the most commonly encountered native methodsin the JRE.In most cases the implementations of MNMs can be quite naive and unopti-mized. In the context of supporting post-hoc debugging and analysis, the raw ef-ficiency of the implementation is not the primary concern so long as the method’ssemantics can be accurately reproduced. See Section 5.3.2 for a discussion of theefficiency of our architecture.Native methods can be left unimplemented, or they can be expressly markedas forbidden because their semantics require accessing their external environment.In either case, the unsupported native method is replaced with an MNM stub thatthrows an exception. This means that classes with unimplemented or forbiddennative methods can still be loaded and used in holographic execution so long asthose native methods are not actually called. This is critical since the classes in theJRE include over 1000 native methods, many of which involve some form of inputfrom or output to the external environment.Note that application classes outside of the standard language runtime can alsoinclude native methods, and so if a developer wishes to execute holograph code thatwill hit those methods they must provide the required MNMs themselves. Nativemethods are much less prevalent in application code than in the core JRE, how-ever, and the burden of providing these alternate implementations is far less thanthe burden of re-implementing everyday code as in the reflection-based analysisapproach.695.2.3 Class InitializationClass initialization occurs in Java when a class is first used, and involves invokinga special static method in the class’ bytecode called an initializer. This can havearbitrary effects on the object graph, and holographic execution must preserve thisbehaviour by invoking the initializer of any uninitialized class before accessing it.Like a class’ bytecode, however, the initialization flag is not present in mostheap dumps, so there is no direct way to tell if a class was defined but not yet ini-tialized at the time of the dump. Since failing to initialize an uninitialized class canlead to inconsistent, unsound errors, holographic execution must raise an exceptionif it attempts to load a class whose initialization status is indeterminate.We observe that in almost all cases the initialization status of a class can beautomatically inferred from other data, based on the rules for when initializationmust occur. Before class initialization, every non-constant static field has a defaultvalue: false for boolean fields, null for object references, and so on. Set-ting a value on a static field forces initialization, so a non-constant static field inthe heap dump with a non-default value implies that the class must be initialized.Conversely, if the execution of a class initializer has the definite effect of setting anon-default value on a field but that field has the default value in the heap dump,the class must not be initialized.In addition, given the definition of each class’ initializer, we can define a pre-ordering A.B to mean “the initialization of class A forces the initialization of classB.” If we use initialized(A) to symbolize that class A is initialized, we have twoadditional rules we can use to infer whether a class is initialized:• If initialized(A)∧A.B, then initialized(B)• If ¬initialized(B)∧A.B, then ¬initialized(A)To take advantage of these rules, the holographic VM architecture performsa conservative analysis of the effects of each class initializer method encounteredwhile translating bytecode. We use an abstract interpretation [21] similar to thetype inferencing algorithm used by JVMs to verify bytecode, where the abstractvalues are three-valued booleans indicating whether a value is a default value, isnot, or could be either. The output of this data-flow analysis is both a three-valued70boolean for each static field and the set of classes the method’s execution is guar-anteed to force the initialization of. When it is necessary to check if a holographicclass A is initialized, the class’ static field values are compared with the static anal-ysis results as described above. All classes B for which A .B are also checkedrecursively, and if any are definitely uninitialized A is determined to be uninitial-ized as well.This analysis is sound but not complete: classes may still be encountered forwhich the rules above are not enough to infer whether it is initialized. We furtherobserve, however, that many class initializers are idempotent, in that they may beexecuted more than once without any additional side-effects. This means they cansafely be run on classes that may already be initialized. The architecture thus in-cludes another pluggable mechanism for users to mark specific classes as havingidempotent class initialization. In Section 5.3.3 we provide evidence that this ne-cessity should be relatively rare. It is also possible for a class to have non-inferableinitialization status and a non-idempotent initializer, but we have yet to encountersuch a case.5.2.4 ConcurrencyOur holographic VM implementation is currently limited to single-threaded execu-tion, but there are no assumptions in the architecture that would prevent concurrentholographic execution. Like the JDI model, executing code in a holographic VMhappens in the context of a specific thread mirror from the heap dump, and usesa dedicated native thread in the host VM to execute the translated bytecode. Thesemantics are identical to invoking a method on a paused thread while debugginga live process.In order to support multiple native threads simulating multiple holographicthread executions, the data structures used in the mutable object graph layer de-scribed in Section 5.1.2 simply need to be replaced with their appropriately syn-chronized equivalents: replacing HashMap instances with ConcurrentHashMapinstances, for example. The synchronization overhead will have a negative impacton performance, which should be the subject of future evaluations, but this willenable more complex post-hoc application simulation.715.3 EvaluationThis section evaluates two primary research questions:1. To what extent does the holographic VM architecture improve on the reflection-based approach to heap dump analysis?2. Is holographic execution responsive enough for a typical heap dump analysisscenario?5.3.1 Case Study: Diagnosing a Memory LeakTo evaluate the feasibility and utility of object holographs, we augmented theEclipse MAT to leverage them as much as possible and then used the modifiedtool to diagnose a real world memory leak contributed by an end user.Extending the Eclipse MATA large portion of the Eclipse MAT user interface centres around navigating andsummarizing the object graph through predefined parameterized queries, some ofwhich are directly analogous to source-level operations; “Extract List Values,” forexample, iterates through a list’s entries in the same way as list iterator objects do.Our primary augmentation of the tool was to define two additional generic querieswhose implementation used holographic execution.The first is “Evaluate Expression,” which parses and evaluates a given codesnippet in the context of the objects selected in the tool. This is accomplishedby adapting a holographic VM to the JDI and reusing the implementation of theEclipse debugging UI. It supports either evaluating the expression once for each se-lected object or collecting all selected objects into a single Collection througha boolean-valued “aggregate” parameter.The second is “Load and Run Code,” which evaluates the contents of a spec-ified method from a given class file on disk. This is accomplished by using holo-graphic execution to create a new class loader instance, pass the class bytecodeinto the appropriate method to make the class loader define the new class, and thenactually invoke the target method. This allows users to define more complicated721 pub l i c s t a t i c Map<St r ing , In teger>2 f i ndDup l i ca t es ( Co l l ec t i on<CPPASTName> names) {34 SortedMap<St r ing , In teger> counts =5 new TreeMap<St r ing , In teger > ( ) ;67 f o r (CPPASTName n : names) {8 St r i ng nKey = n + ” - ”9 + Arrays . t oS t r i n g ( n . getNodeLocations ( ) ) ;1011 I n t ege r count = counts . get ( nKey ) ;12 i f ( count == n u l l ) {13 count = 0 ;14 }15 counts . put ( nKey , count + 1 ) ;16 }1718 r e t u rn counts ;19 }Figure 5.2: Analysis code used to diagnose the Eclipse CDT memory leakbugqueries via additional code compiled against the original application binaries. Thisquery also supports the same “aggregate” parameter.We also replaced several existing queries with equivalent versions that usedholographic execution. The “Extract List Values” query, for example, was reimple-mented to invoke the iterator() method on any collection and use the result toiterate over the collection’s elements. This not only increased the generality of theresulting queries, in this case allowing it to work on any Collection rather thanonly specific List subtypes, but also enabled them to accept newly created holo-graphic objects as well as existing heap dump objects. Replacing these reflection-based query implementations also eliminated thousands of lines of code, showingthat holographic execution also simplifies tool development.73Debugging ExperienceThe Juno release of the Eclipse C and C++ Development Tools (CDT) containeda memory leak2: indexing a large project caused the Eclipse runtime to exhaustall available memory, where the same project was successfully indexed in previousversions of the CDT. The user reporting the bug was able to upload a 1 gigabyteheap dump from the time of failure, but because the project that caused the errorcontained proprietary code they were not allowed to provide the actual projectsource. This hindered attempts by the CDT contributors in the following monthsto reproduce the problem, despite multiple other users reporting the same bug.The CDT contributors were able to determine that approximately 80% of theheap was retained by over 1.8 million instances of the class CPPASTName andtheir related child objects. This class is used to represent unique occurrences ofsymbols in C++ source code after preprocessing, and the bug reporter’s estimateof the actual number of such symbols in their code was smaller by a factor of six.Our initial theory for the memory leak was that the indexing process was creatingmultiple duplicate name objects representing the same locations in the source code.A straightforward way to investigate this theory is to iterate over all of the nameobjects and group them by their locations, in order to detect multiple names fromthe same location.Obtaining the necessary bytecode for the relevant classes in the uploaded heapdump was not difficult in this case: we only required the appropriate versions ofthe Java 6 JRE and the Juno Eclipse distribution. We then authored a small helpermethod, built against the matching version of the CDT source code, which iteratesover a sequence of CPPASTName objects and populates a map keyed by a stringrepresentation of their locations. See Figure 5.2 for the relevant source code. Thisanalysis would be very time-consuming to implement using reflection: althoughthe CPPASTName class has a field for storing its location, it is lazily calculated onrequest using several related datatypes, and so for the majority of the objects in theheap dump this field contains null.We executed this code using the “Load and Run Code” query described aboveon the first 100,000 CPPASTName objects in the heap dump, resulting in a new2https://bugs.eclipse.org/bugs/show bug.cgi?id=40007374holographic HashMap object. To examine its contents in the Eclipse MAT UI,we executed a holographic query to extract the key and value pairs from any Mapinstance. The results confirmed our theory that there were many sets of duplicates,in many cases over a dozen symbols with the same name and location.Given that many of the most duplicated symbols were from a common library,our next theory was that the indexer was creating a separate symbol instance everytime a header file was included. We selected one of the most duplicated sym-bols and began to test this new theory by writing code to print out the path ofinclude declarations for each. The first step was traversing the parse tree to find thecompilation unit containing each name, which we achieved using the “EvaluateExpression” query on the string "getTranslationUnit()".We were surprised to find that each symbol came from separate compilationunits. Executing "getFilePath()" on each revealed that they were all for thesame source file. From this point it was relatively simple to use existing MATqueries to find the references keeping the extra parse trees from being collected bythe garbage collector, in particular a thread local that was not cleared after use. Thisanalysis was presented on the online bug report, and a fix was submitted shortlyafter by one of the project contributors.5.3.2 PerformanceTo determine whether holographic execution is performant enough for its intendeduse, we created a test harness that executes the toString method on every ob-ject in a VirtualMachineMirror, measuring the time taken to return from theinvocation. This benchmark was chosen because it is easy to implement and ap-plicable to any Java codebase, and yet exercises a surprising amount of code; evenvery simple implementations of toString are often only the tip of the icebergwhen all of the methods that are ultimately used in their control flow are included.We ran our benchmark against three sample applications. jre only is a stubapplication including only an empty main class, for the purpose of benchmarkingonly the contents of the JRE. tomcat is the Apache Tomcat web server, version7.0.37, after serving the initial welcome page. eclipse is the Eclipse IDE, build20130614–0229, with a minimum of plugins installed in order to keep the total75Application jre only tomcat eclipseClasses 456 2657 7610Objects 2249 46387 99452Live VMAvg. toString time (ms) 15.9 25.9 33.2Max. toString time (ms) 1748 22041 80234Std. Dev. toString time (ms) 74.4 279.3 512.6Holographic VMPrepare time (s) 44 171 340Avg. toString time (ms) 5.4 2.5 7.4Max. toString time (ms) 1279 8804 55867Std. Dev. toString time (ms) 38.7 78.7 325.6Table 5.1: Results of executing Object.toString on every object in a VM,comparing performance on a holographic VM versus a live VM via theJDIclass and object count manageable.For each sample application, we used the JDI to connect to and pause the liveprocess, captured a single heap dump, and then ran the benchmark against boththe live process and the snapshot. We used the performance of remote executionon a live process as the baseline, measuring the performance of holographic exe-cution as a kind of overhead compared to this baseline. These experiments wereperformed on a MacBook Pro laptop with a 2.4 GHz Intel Core 2 Duo CPU and 8GB of RAM, running Mac OS X version 10.7.4. Table 5.1 presents our results.Although the time to translate the bytecode for all classes in the VMs is signifi-cant, once loaded the local holographic VM actually executes these methods fasterthan the remote process. In all cases the minimum toString() time was 0ms, asseveral classes define the method to simply return a constant or recalculated value,and hence return essentially instantaneously.Since there is no convenient method for uniquely identifying objects in Java,and hence no convenient way to correlate the object mirrors in the two different VMmirrors, the data are analyzed as two independent sets. It would be instructive infuture work to develop an algorithm for matching objects between VMs, possiblyusing structural comparison or raw memory addresses, in order to match times andanalyze the average overhead.76We were surprised to discover that holographic execution is actually faster inall cases than remote execution using the JDI. We suspect this is due to the fact thatthe JDI relies on inter-process communication to pass values between the targetand source VMs, whereas a holographic VM’s state resides in memory with thecaller, and for relatively simple objects this invocation overhead is greater than thetime needed for the execution itself. A future evaluation could connect the JDIto a remote holographic VM instead of a local one to normalize this difference,but this would require additional engineering to accomplish. In addition, since aholographic VM does not have to reside in a separate process, the lower latency ofreflective calls is in fact observed in tooling, although traded off by the overheadof keeping the object graph in memory locally.The most expensive aspects of holograph execution are fetching the bytecodefor the original classes as described in Section 5.2.1 and translating that bytecodeto produce hologram classes as described in Section 5.1.3. When the extra stepof loading a heap dump and preparing the holographic VM are included, the totaltimes to run the benchmark on either a live or dead process are similar. Since theresults of this process for each class in the heap dump are cached on disk, however,successive analysis runs can avoid this processing time. Our experience showsthat the translation time is consistently about 5 seconds per megabyte of class filecontent. Preprocessing the entire JRE, which consists of over 20,000 classes andover 60 MB of bytecode, can be done in just under five minutes.5.3.3 CompletenessThe major limitation on completeness in this system is the possibility that the exe-cution of useful code could encounter unsupported or illegal native methods. Ourexperimentation has initially targeted the Mac OS X distribution of Java 7 release5, and for every native method in that JVM’s runtime we have either provided amirror-based alternate implementation or explicitly determined that it requires il-legal access to the external environment and marked it as forbidden. Section A.1lists our categorization of these native methods.Our implementation currently includes 99 alternative implementations of na-tive methods, with a total of 1443 source lines of code. This serves as a rough eval-77uation of the effort involved in supporting a particular VM implementation. Onlythose methods in the sun.misc.Unsafe class are specific to the exact JVM im-plementation we used, as they involve raw memory addressing that depends on theexact memory layout of its objects. There are also a handful of platform-specificclasses such as UNIXFileSystem that contain native methods, but this is onlya superficial platform dependency since the actual MNMs for such methods areonly trivially different. 187 legal native methods are not yet implemented in ourprototype, as they were not encountered in our experiments, but these are all eithertrivial variations on other implemented methods, such as alternatives for differentprimitive types, or alternative ways of accessing the reflective properties alreadysupported by the mirrors API.The other limitation on completeness is the possibility of encountering classesin the heap dump with indeterminate initialization status. In the process of support-ing our experiments we encountered only two3 such classes. We explicitly markedthese as having class initializers that were safe to re-run, implying they could safelybe executed even if they may have already run.5.4 Related Work5.4.1 Mirror-based Behavioural IntercessionPrior work has examined the idea of customizing the behaviour of a language’sobjects via implicit mirror implementations. Such objects have been called mi-rages [46] or virtual values [8], and have largely been studied in the context of be-havioural intercession, or augmenting or replacing behaviours on existing objects.Holographic objects are similar in implementation, but focus instead on reproduc-ing the behaviour of base objects with no actual base object available in the runtimeto provide the base behaviour. In our case, behavioural intercession is limited toreplacing native methods, with the specific requirement of not deviating from basebehaviour, and is not exposed to developers that use the library.In addition, prior work has presented implementations of such objects in dy-namic languages, and doing the same in a statically-typed language such as JVM3java.lang.reflect.Modifier and java.security.KeyFactory78bytecode without requiring a custom language runtime presents fundamental chal-lenges that affect the design of the interfaces to those objects. The Jikes ResearchVirtual Machine [5] also implements the behaviour of a JVM on another JVM, andthe majority of its code is ordinary Java. Its object model is not intended to bepluggable, however, and although it could be made so this would not necessarilybe any easier than customizing any other JVM implementation.Lorenz and Vlissides [40] describe how pluggable reflection enables more flex-ible language tools, using a documentation generator and an object-to-componentgenerator as examples. A holographic language runtime represents another clientof a language’s pluggable reflective API, and allows the language’s implementa-tion itself to be decoupled from the representation of its runtime state. UnlikeLorenz and Vlissides’ examples, holographic execution requires a reflective APIthat represent computation and not just code, a distinction clarified by Bracha andUngar [13].5.4.2 Reproducing Past State and BehaviourThe idea of checkpointing and resuming execution recurs in several contexts, no-tably in operating system or hardware virtualization. Some language runtimes alsosupport resuming from a snapshot [1], and Java itself includes a small amount ofthis behaviour in its shared memory class file cache feature. In all of these cases,the restored process is a normal, unrestricted instance, and care must be taken toensure that invalid references to the outside world are not created. By contrastholographic execution as described in this work is intended to support diagnosisand analysis of the state of a system in the past. It requires no foresight, but at thecost of restricting additional execution and hence not truly restoring a dead pro-cess. It also does not currently support resuming execution from the exact time ofthe snapshot, although this is conceivably possible if a more precise snapshot suchas a core dump were used, along with techniques for recreating the captured callstack via program slicing [67].795.4.3 Heap Dump AnalysisSeveral other approaches have been used to analyse heap dumps, usually in thecontext of identifying the source of memory leaks. Maxwell et. al. [44] use graphmining to locate potential leak candidates. As we have illustrated in Section 5.3.1,holographic execution is a complementary technique which does not preclude theuse of the reflective API on a heap dump. For example, the aforementioned workincludes a case study of a memory leak in a scripting language parser. The graphmining technique identifies a non-standard linked-list implementation containinga long series of regular expression matches, which is helpful but does not fullydiagnose the root cause. We postulate that applying holographic execution to printout the semantic contents of this list could be extremely useful in diagnosing theactual source of the leak.5.4.4 Static Code AnalysisKozen and Stillerman [35] use a static analysis of class initializers similar to ours toinitialize classes eagerly, in order to improve startup performance and catch errorsearlier. Their algorithm ignores initializer effects with respect to static fields, butis flow-sensitive and hence calculates a more precise definition of initializationdependencies than our current implementation. Integrating their approach in thefuture may improve the success rate of our algorithm and hence reduce the numberof initializers that must be marked safely repeatable.5.5 SummaryWe have presented an architecture for holographic objects, which enables restrictedexecution starting from the state captured in a heap dump. We have shown that thisarchitecture supports analysis that is simple, type-safe, robust, secure, and familiar.It provides a good fit for analysis that depends on application-specific semantics,but also complements meta-level analysis by supporting a hybrid approach. Our ex-perience with our implementation for Java bytecode suggests that this architecturecan be effectively realized without having to customize a JVM. We have presentedevidence that holographic execution is competitive with live execution and henceis feasible for heap dump debugging and analysis. We have also applied this proto-80type implementation to diagnose a real-world memory leak bug that went unsolvedfor several months.81Chapter 6Retroactive Weaving for AspectJThis chapter describes the architecture for our retroactive weaver for the AspectJAOP language, which builds on the architecture for holographic virtual machinespresented in the previous chapter, and provides evidence for the effectiveness ofthis architecture through case studies and an evaluation of performance. This inturn provides evidence for the feasibility and effectiveness of the time-travel pro-gramming paradigm for mature, full-featured programming languages.6.1 ArchitectureAt a high-level, we implement the process of retroactive weaving for AspectJ bycombining three large and orthogonal components. The first component is theinterface to the underlying execution recording, adapted to implement the genericVirtualMachineMirror (VMM) interface described in Section 5.1.1. This hides thedetails of whichever recording system is used, and indeed our evaluation appliesthis architecture to two different systems. The second component is a holographicvirtual machine that wraps the first component and implements the same interface,providing the ability to perform additional execution on top of the recording asdescribed throughout Section 5.1. The last component, the primary focus of thischapter, is a reflective aspect weaver, which uses the metaprogramming facilitiesof the VMM interface to trap dynamic join points as they occur in the recording andexecute matching aspects accordingly.82This architecture allows the weaver to focus only on the concern of dynami-cally weaving aspects against a VMM, whether that program represents live execu-tion or a recording, and only the holographic VM is concerned with emulating theretroactive execution specified by aspects. The modular approach helps to reducethe complexity of this system, by separating the execution recording technologyand the AOP language semantics as independent concerns. The key to achievingthis modularity is the VMM interface, which is a reflective definition of the targetprogramming language.6.2 Events and IntercessionIn Chapter 5, a VMM was only used to represent a snapshot of a running program atone instance in time. To implement retroactive weaving for an execution recording,this model must be extended to represent the behaviour of a program as it executesover time. The necessary interface is somewhat similar to the reflective interfacea JVM provides for implementing debuggers, and indeed one of the scenarios weuse to evaluate retroactive weaving in Section 6.7 below involves adapting the JavaDebugging Interface described in Section 2.3 to the VMM interface.The key new operation is the ability to register callbacks for runtime events,such as method calls and field accesses, so that when events occur they are firstpassed to those callbacks before they actually occur in the runtime. While the VMMinvokes such callbacks, the state of the runtime is frozen so that callback bodiesmay read any necessary contextual state. Moreover, the callbacks are allowed tomodify the events, effectively changing runtime behaviour. This augments the run-time with powerful metaprogramming facilities.VMM instances are initially created in a paused state. Clients add initial call-backs as above, and then invoke a resume method that conceptually continuesexecution. For an execution recording, there may be no actual resumed execution.The interface only requires that each requested event is delivered to the registeredcallbacks in the correct order, with the VMM state emulating the context in whicheach event occurred.The VMM interface also includes a handful of operations for modifying thestructure of classes, such as adding new fields to an existing class, or declaring that83a class implements additional interfaces. These are used by the reflective weaverto implement intertype declarations, as well as to attach extra state for the internalrepresentation of features such as cflow pointcuts.Note that this interface is strictly more powerful that the reflective capabili-ties of the Java or AspectJ programming languages themselves. As observed inSection 2.2, it is not possible for base-level Java code to intercept and modifymethod invocations using the reflective classes included in the Java standard run-time libraries, such as java.lang.Class and java.lang.Method. It ispossible, however, to provide at least some of these capabilities when adaptingvarious execution recordings to the VMM interface. The wrapping holographic VMthen provides some of the missing metaprogramming functionality as well, justas it already fills in missing state as described in Section 5.2. The holographicVM implementation also raises events for holographic execution itself via hooksin its generated hologram class bytecode, which ensures consistent semantics inline with the definition of retroactive execution.6.3 Reflective AspectJ WeaverOur reflective weaver uses the metaprogramming facilities of the VMM interface de-scribed above to achieve the semantics of AspectJ aspects. The weaving processregisters callbacks that perform additional execution or modify the events them-selves, such that the observed behaviour is equivalent to static code modification.Note that the reflective weaver is not a runtime weaver for AspectJ. Althoughthe flexibility and power of metaprogramming potentially supports operations suchas dynamically enabling and disabling aspects, those semantics are not includedin the definition of AspectJ, as noted in Section 2.5. Since the reflective weaveris an alternative implementation of the same semantics of AspectJ, the weaver’suse of metaprogramming is carefully controlled to avoid exposing any additionalbehaviour.The event requests which trigger the execution of aspect code are all installedon the recorded VMM before actually resuming the recorded execution. Other re-quests may be dynamically enabled or disabled as the recorded execution proceeds,but only as needed to optimize the weaving process or implement late binding. For84example, when a new class is defined in the recording it may be necessary to createnew event requests to trap events in that class’ code. The holographic VM alsoregisters its own internal callbacks for specific events to collect data necessary forcorrect holographic execution.Reflective weaving borrows much of the load-time configuration implementa-tion described in Section 2.5 as well, in particular support for Extended MarkupLanguage (XML) files that explicitly name aspects that should be woven. Thesemay be hand-written by developers, but are also automatically generated by theAspectJ compiler when compiling aspects to their binary form for load-time weav-ing.The high-level process for retroactive weaving is:1. Load an execution recording, adapted to implement a VMM;2. Instantiate a holographic VM around it;3. Read the list of names of aspects to weave from the relevant XML files;4. Load the requested aspects by name in the holographic VM;5. For each aspect:(a) modify classes in the VM according to any intertype definitions, and(b) register callbacks to execute advice at matching join points;6. Resume the holographic VMNote that the reflective weaver supports a -XweaveCoreClasses option inthe XML configuration which ajc does not. This is necessary because the reflec-tive weaver, unlike ajc, is able to implement the semantics of weaving aspectsagainst classes in standard Java class libraries. Many existing AspectJ aspects as-sume the weaver only affects user-provided classes, or those loaded by a specificclass loader. If these are woven against core classes using the reflective weaver,they often become extremely slow or even incorrect. However, some of the aspectsin our evaluation case studies (see Section 6.6) do require access to core classes.85In addition, avoiding retroactive side-effects in the core libraries using the tech-nique described in Section 3.4 requires weaving against core classes. Providingthis option allows clients to scope their aspects on a per-library basis.6.3.1 Events as Join Point ShadowsThe core of the ajc weaver already operates on an abstracted representation ofthe AspectJ language in order to support both compile-time or binary weaving.Creating support for reflective weaving largely reduces to creating another imple-mentation of this abstraction backed by the mirrors API. Recall from Section 2.5that the ajc weaver is architected around the idea of join point shadows, which arelocations in AspectJ code (source or binary) that give rise to dynamic join points.Because the ajc weaver materializes the semantics of AspectJ by transformingcode into Java byte code with equivalent behaviour, the primary function of theweaver is to iterate through all join point shadows in the code and test for advicewhose pointcuts match those shadows. When a match is found, shadow mungersare applied that transform the bytecode as described above.The reflective weaver repurposes this mechanism and reuses much of the ajcimplementation by adapting dynamic runtime events as join point shadows. It isstraightforward for a runtime event from a VMM instance to expose at least as muchmetadata as the static area of code that gave rise to the event. For a method call,for example, this includes the declaring class, method name and argument types,as well as less fundamental properties such as the source file name and line numberof the method call.Shadows backed by a runtime event also support actually evaluating the shadow.This is exposed as a closure which can be replaced on the underlying event. Fora runtime event, a shadow munger will modify the event by wrapping the exposedclosure to add additional computation before, after, or around the shadow.6.3.2 Efficient Event RequestsGiven a pointcut, the weaver must create a minimal but sufficient set of event re-quests to cover all matching dynamic join points. A valid but extremely inefficientimplementation of reflective weaving would be to register callbacks for all possible86events, since the generic pointcut matching logic will filter out irrelevant events. Itis possible to optimize the weaving process considerably, however, if the point-cuts are partially and conservatively translated into event filters. The underlyingexecution recording mechanism will frequently support some of these filters na-tively and hence avoid some of the substantial overhead of detecting, instantiatingand handling unnecessary events. The details of implementing these filters in theunderlying VMM are examined in Section 6.4.All relevant shadow evaluations are therefore handled in a two-phase approach:filters are used to trap an over-approximation of all relevant events in the recording,and all resulting events are checked against the relevant pointcuts before actuallyapplying their shadow mungers. This two-phase approach mirrors the implementa-tion of compile-time or load-time weaving, where shadows are matched staticallyto determine locations in code that may produce matching join points, but dynamictesting logic often has to be inserted to match a portion of the pointcut at runtime.AspectJ pointcuts are formed from a combination of several possible patterns.They can be divided into four categories:1. Kinded, such as call and set;2. Scoping, such as within;3. Contextual, such as this; and4. The pointcut combinators &&, ||, ! and cflowAn early stage of pointcut concretization replaces cflow pointcuts with acombination of explicit control flow tracking and pointcuts that refer to that state.The remaining pointcut combinators resemble logical operators: &&, || and ! for“and”, “or” and “not” respectively. Pointcuts thus have an equivalent concept ofDisjunctive Normal Form (DNF). Rewriting a pointcut in DNF is already imple-mented and used by the ajc weaver to optimize the shadow matching process. Inour implementation it is helpful to lift all uses of || to the top-level, since eventrequests for each conjunctive clause can be extracted independently.The pseudocode for extracting a sufficient set of event requests for an advicedeclaration is as follows:87Reduce the po i n t cu t to d i s j u n c t i v e normal formr e s u l t ← ∅For each con junc t i ve clause :Separate the kinded des ignators from the othersCa lcu la te the set o f k inds t ha t w i l l match those des ignatorsI f there are no kinded designators , match a l l j o i n po in t k indsFor each matching k ind :request ← a new mi r r o r event request o f t ha t k indFor example , a ‘ set ‘ corresponds to a ‘ F ie ldMir rorSetRequest ‘For each other des ignator i n the clause :and s igna tu re pa t t e rn i n each kinded des ignator :I f the k ind does not suppor t t h i s pa t t e rn( e . g . ‘ * MyClass . foo ( i n t ) ‘ f o r a ‘ set ‘ ) :Continue wi th the next k indOtherwise :I f the m i r r o r event request supports t h i s pa t t e rnAdd a corresponding f i l t e r to the ‘ request ‘Add request to r e s u l tr e t u rn r e s u l tNote that if a conjunctive clause contains two kinded designators of differentkinds (e.g. set and call), the pointcut will have no matches, since kinds definea partition on join points and hence no join point will ever match more than onekind simultaneously.It is important to optimize the implementation of cflow pointcuts to drasti-cally reduce the number of events raised by the underlying execution recording.For all pointcuts that contain cflow expressions, the event requests created tomatch their join points are initially disabled. They are only enabled when a joinpoint that matches the argument to the cflow expression is entered, and then dis-abled when the join point exits. This is especially critical given the idiom of usingcflow(adviceexecution()) to avoid retroactive side-effects, since manyof the relevant join points will occur very frequently in the base program. SeeSection 6.7.1 for examples of such aspects.886.4 Execution RecordingsExecution recording technologies vary in terms of how much information theyexplicitly record and how much is implicit and inferred or reconstructed at con-sumption time. To evaluate the effectiveness of our implementation of retroac-tive weaving, we also adapted two implementations of execution recording to theVMM interface: one based on a database of events and snapshots, and one basedon deterministic replay. The implementations we chose represent two contrast-ing approaches to execution recording, and hence are intended to help measurethe behaviour of retroactive weaving over the range of all possible approaches toexecution recording.Many recent approaches to execution recording use a combination of these twotechniques, including Pothier’s more recent work on omniscient debugging [55].Such systems will often record multiple snapshots of system state, for example,and then use partial replay to determine state in between them. By choosing highlycontrasting backends, our evaluation is intended to provide the maximum evidencefor how the effectiveness of retroactive weaving varies with the underlying ap-proach. Implementing two very different backends also provides evidence that theVMM interface is sufficiently generalized, and hence that reflective weaving will beapplicable to other possible execution recording technologies.6.4.1 Events DatabaseOur first execution recording adapter is based on the tracing and database backendof the Trace-oriented Debugger (TOD) presented by Pothier et al. [56]. At its heart,this approach closely resembles the tracing implementation shown in Section 4.6.1,since at a high level the events database is simply a sequence of instantaneousevents. Each event contains a synthetic timestamp, where events with the sametimestamp semantically occurred simultaneously in the original execution.The TOD backend exposes a querying interface similar to that of a relationaldatabase. The trace database provides an interface for creating a cursor over a sub-set of events from the trace sequence. Filters can be applied to pick out only thoseevents that occurred within the source of a particular class, on a specific object,and so on, and filters can be combined with conjunction and disjunction operators89to create compound filters. Events are aggressively indexed as they are recordedto make these filters efficient to implement. The trace database also provides vari-ous interfaces for querying the state of objects and arrays at any given timestamp.These operations are used by the omniscient debugger UI to step forwards andbackwards in time, and to present a debugger-like view on the state of the recordedprogram at any given point in time.In the TOD VMM adaptor, each callback request is translated to an event cursor,and each supported filtering operation translates almost directly to cursor filters. Asingle stream of matching events is easy to produce using a merge sort process sim-ilar to that described in [56]. The state of the adaptor includes a current timestamp,effectively slicing the data to the state at this timestamp, and the VMM interfaces forquerying state are implemented by passing this fixed timestamp to the databasequery methods. The resume operation semantically replays the recorded pro-gram by iterating through the result of merging the cursors for all requests. Eachtime the cursor moves to the next event, the VMM timestamp is updated to that ofthe event. Then each applicable callback is invoked on the event. The adaptor onlyreads state in a forwards direction, and hence does not take advantage of TOD’sability to debug backwards in time, but it may skip over large periods of time as ittravels forwards and queries only relevant events.6.4.2 Deterministic ReplayOur approach to retroactive weaving based on deterministic replay is to enable de-bugging on the DR process, attach a debugging connection to that process within asecond JVM using the Java Debugging Interface (JDI), and adapt that JDI connec-tion as a VMM. This ensures isolation from whatever DR mechanism is used, at thecost of decreased performance due to the inter-process communication. This costis mitigated somewhat by caching data within the holographic VM that wraps theJDI adaptor.The callback API in the VMM model is straightforward to implement using theJDI’s similar event request mechanism. Although the VMM model supports con-currency via threads, the JDI adaptor pauses all threads within the debugged VMwhenever an event is raised. Assuming asynchronous threads rather than concur-90rent threads in the VMM model greatly simplifies the holographic VM implementa-tion, especially with respect to ensuring consistency in the aforementioned caching.We assume that implementing the API for event callbacks does not cause per-turbation of the underlying process. This is true as long as the underlying replaymechanism is robust enough to be unaffected by the pauses in execution and addi-tional reads of program state caused by event callbacks.For the purposes of our evaluation we use the LEAP system for deterministicreplay of multi-processor systems [31]. Like many software-based approaches toDR, LEAP works via static analysis and bytecode instrumentation of the sourceprogram. It produces both a record and replay version of the source program;the former records thread ordering data which the latter takes as input in order toreproduce the recorded execution. Our experimental setup is therefore to connectto the running replay process as the remote source of program state as above.6.5 SoundnessPerforming some operations may interfere with execution recording replay, be-cause they would perturb the underlying VM such that there is not enough infor-mation to continue the simulation. It is the responsibility of the execution recordingadaptor to guard against operations that would invalidate the recording, since theprecise definition of which operations are illegal will vary depending on the exactrecording mechanism.The execution recoding adaptors are configured to disallow operations thatmodify the VM’s state, such as setting fields and invoking methods. If such op-erations occur in holographic execution, the adaptors will throw an instance of anexception class named IllegalSideEffectError. This class is declared asa subclass of VirtualMachineError, a base class used internally by JVMimplementations to indicate low-level failure conditions client code should notbe expected to catch and reasonably handle, such as exhausting memory or en-countering internal VM bugs. We intend clients of the VMM interface to catchIllegalSideEffectErrors instead, at the meta-level.916.6 Case StudiesThis section outlines the background and motivation behind the examples of As-pectJ aspects we used to evaluate retroactive weaving for AspectJ. Our evaluationtargets both the effectiveness of the technique for the AspectJ language in particularand our weaver implementation. The former is demonstrated by reusing existingaspects, or by presenting alternate versions of dynamic analysis expressed as clearand concise aspects. The latter is evaluated by measuring the performance penaltyof evaluating aspects retroactively, and by approximating the additional develop-ment effort necessary to avoid illegal retroactive side-effects.We have endeavoured to identify several classes of aspects that are well-suitedto retroactive weaving, and chosen one concrete aspect to represent each class. Wehave in general preferred generic aspects, meaning those that are applicable to anyarbitrary Java or AspectJ program, so that we can use the same set of base programsto record and retroactively weave. This helps to keep a consistent baseline forcomparing these different classes of aspects.We present these case studies in roughly increasing order of complexity. Thefollowing table summarizes which AspectJ and holographic execution featureseach case study makes use of.Table 6.1: Summary of case studies and the AspectJ features they usecontract tracing racerj leaks heapstateful y y y y ycflow y y y y yaroundtoString y yrecorded threads y y yretroactive threads y ytype reflection ycontrol reflection yThe features referenced by the table are as follows:stateful: Whether the analysis code maintains its own state between join points.cflow: Use of the cflow pointcut combinator.around: The inclusion of around advice. Note that while around advice does92not appear in the source of any of the case studies, it is used heavily in the aspectsused to avoid illegal retroactive side-effects as illustrated in Section 6.7.1.toString: Use of the Object.toString method to output objects in ahuman-readable form.recorded threads: Supporting the analysis of multiple threads in the originalexecution.retroactive threads: Analysis that spawns its own additional threads.type reflection: Analysis that uses reflective methods to analyze the structureof types that appear in the recording (e.g. Class.getDeclaredFields).control reflection: Analysis that uses reflective methods to analyze the controlflow of the recorded program (e.g. Thread.getStackTrace).6.6.1 Contract VerificationA commonly cited application for aspects is modularizing contracts when using theDesign by Contract (DbC) paradigm [45]. Some argue that using aspects in thisway fails to maintain the advantages of DbC [10], although there is recent workon AOP languages intended to mitigate this [57]. However, contracts expressed asaspects are also good candidates for retroactive weaving because of the pluggabilityof aspects. They are often too expensive to leave enabled in production code,but may offer valuable insight into the root cause of unexpected behaviour whenevaluated against an execution recording, where raw performance is less important.The example we chose is a simple, lightweight aspect that detects resourceleaks. It leverages the semantics of the built-in java.io.Closeable interface:any class which implements this interface represents a resource that should be re-leased when the object is no longer needed. Newer versions of the Eclipse IDE willprovide compiler warnings for objects that are never closed, but the static analysisthe warning is based on cannot detect resource leaks for objects that escape their lo-cal scope. The aspect in Figure 6.1, however, will detect such leaks dynamically bycapturing all instances of classes that implement java.io.Closeable, track-ing when they are closed, and reporting on those that were never closed when theprogram terminates.931 pub l i c aspect UnclosedCloseables {23 p r i v a t e f i n a l Set<Closeable> unclosed = new HashSet<Closeable > ( ) ;45 a f t e r ( ) r e t u r n i ng ( Closeable c loseab le ) : c a l l ( Closeable +.new ( . . ) ) {6 unclosed . add ( c loseab le ) ;7 }89 a f t e r ( Closeable c loseab le ) : c a l l ( * Closeable . c lose ( ) )10 && t h i s ( c loseab le ) {11 unclosed . remove ( c loseab le ) ;12 }1314 pub l i c UnclosedCloseables ( ) {15 Runtime . getRuntime ( ) . addShutdownHook (new Thread ( ) {16 pub l i c vo id run ( ) {17 System . out . p r i n t ( ” Unclosed c losab les : ” + unclosed ) ;18 }19 } ) ;20 }21 }Figure 6.1: A contract verification aspect6.6.2 TracingThe next case study is the quintessential tracing example described in Section 2.5.While relatively simple, the most sophisticated version of this aspect makes use ofseveral AspectJ features that exercise the overall implementation well:• Invoking the toStringmethod within advice, implying a non-trivial amountof retroactive execution;• A cflow pointcut to avoid infinite recursion when calling toString;• References to the special form thisJoinPointStaticPart, which ex-poses metadata about the join point; and• The use of persistent state between advice invocations to implement tracingindentationSee Listing 6.1 for an excerpt of the output of this aspect.94Listing 6.1: An excerpt of output from the AspectJ tracing aspect1 - -> double t r a c i ng . C i r c l e . per imeter ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )2 <- - double t r a c i ng . C i r c l e . per imeter ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )3 c1 . per imeter ( ) = 12.5663706143591724 - -> double t r a c i ng . C i r c l e . area ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )5 <- - double t r a c i ng . C i r c l e . area ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )6 c1 . area ( ) = 12.5663706143591727 - -> double t r a c i ng . Square . per imeter ( ) : Square s ide = 1.0 @ (1 .0 , 2 .0 )8 <- - double t r a c i ng . Square . per imeter ( ) : Square s ide = 1.0 @ (1 .0 , 2 .0 )9 s1 . per imeter ( ) = 4.010 - -> double t r a c i ng . Square . area ( ) : Square s ide = 1.0 @ (1 .0 , 2 .0 )11 <- - double t r a c i ng . Square . area ( ) : Square s ide = 1.0 @ (1 .0 , 2 .0 )12 s1 . area ( ) = 1.013 - -> double t r a c i ng .TwoDShape . d is tance (TwoDShape ) : C i r c l e rad ius = 4.0 @ (0 .0 , 0 .0 )14 - -> double t r a c i ng .TwoDShape . getX ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )15 <- - double t r a c i ng .TwoDShape . getX ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )16 - -> double t r a c i ng .TwoDShape . getY ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )17 <- - double t r a c i ng .TwoDShape . getY ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )18 <- - double t r a c i ng .TwoDShape . d is tance (TwoDShape ) : C i r c l e rad ius = 4.0 @ (0 .0 , 0 .0 )19 c2 . d is tance ( c1 ) = 4.24264068711928520 - -> double t r a c i ng .TwoDShape . d is tance (TwoDShape ) : Square s ide = 1.0 @ (1 .0 , 2 .0 )21 - -> double t r a c i ng .TwoDShape . getX ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )22 <- - double t r a c i ng .TwoDShape . getX ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )23 - -> double t r a c i ng .TwoDShape . getY ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )24 <- - double t r a c i ng .TwoDShape . getY ( ) : C i r c l e rad ius = 2.0 @ (3 .0 , 3 .0 )25 <- - double t r a c i ng .TwoDShape . d is tance (TwoDShape ) : Square s ide = 1.0 @ (1 .0 , 2 .0 )26 s1 . d is tance ( c1 ) = 2.2360679774997927 s1 . t oS t r i n g ( ) : Square s ide = 1.0 @ (1 .0 , 2 .0 )956.6.3 Race DetectionBodden and Havelund developed RACER [12] as an example implementation of analgorithm for detecting data races. They proposed three novel pointcuts to enablea whole class of aspects for analyzing concurrency bugs: lock and unlock,for picking out join points that acquire and release locks, and maybeShared, ascoping pointcut used to restrict other pointcuts to those involving data possiblyaccessed by multiple threads.RACER consists of two aspects. Locking uses the lock and unlock point-cuts to track the locks held by a given thread and how many times each has beenlocked. Racer picks out possibly shared field accesses and drives the RACERalgorithm’s state machine. While the algorithm itself is relatively simple, its com-plete implementation in AspectJ is still over one thousand lines of code, and exe-cuting it retroactively is a reasonably heavy stress test of the architecture.RACER is particularly compelling as a case study for retroactive weaving be-cause its overhead is relatively high. The more that this high runtime cost can bepushed to the post-hoc processing environment, where raw performance is less ofa concern, the more practical such analysis becomes. With similar goals in mind,Ansaloni et. al. [6] investigated the advantages of using buffered advice to offloadsome of this overhead to a separate thread. Unlike buffered advice, retroactiveweaving does not require altering the specification and hence the semantics of theoriginal advice, although it instead introduces the possibility of runtime errors dueto illegal retroactive side effects as described in Section 6.5.The maybeShared pointcut is used as an optimization to reduce the numberof join points matched by the Racer aspect and hence the overall runtime of thealgorithm. For the sake of simplicity, the retroactive weaver’s implementation ofthis pointcut matches all join points; Bodden and Havelund note in [12] that this isa valid implementation, and moreover that the observed improvement in runtimespeed due to their implementation based on a static thread-local objects analysiswas negligible.966.6.4 Memory Leak DetectionSection 5.3.1 provided evidence for the benefits of retroactive execution in diag-nosing memory leaks based on snapshots. It is logical to investigate whether thesebenefits extend to execution recordings via retroactive weaving. Villazo´n et al. [66]and Chen and Chen [16] both experimented with using aspects to detect memoryleaks, and illustrate that one of the advantages of this approach is the ability totrack the dynamic location of each object construction, namely via stack traces.We use Villazo´n’s version for this evaluation since the full aspect’s source isavailable for experimentation. This aspect maintains PhantomReferences toconstructed objects to ensure they can still be collected. Before collecting suchobjects, the JVM adds the corresponding phantom references to a reference queue,which the aspect monitors. As references appear in the queue the aspect removesthem from the set of potential leaked objects. This aspect therefore relies on the ef-fectiveness of the JVM’s garbage collection to eliminate false positives, and hencewhen woven retroactively stresses the fidelity of the holographic virtual machine.To support this case study, we implemented a naive but semantically correctversion of retroactive garbage collection. The key requirement is being able to de-termine when an object is not reachable either in the original execution or throughadditional retroactive references. Our approach is to rely on the host JVM’s garbagecollection to collect holographic wrapper objects when they become unreachable.We define finalize methods on a few key classes such that when a holographicobject is collected the holographic VM can check to see if the corresponding ob-ject in the execution recording is reachable. The holographic implementation ofthe native Runtime.gc() method can then enqueue holographic references toany such objects.By necessity, this split approach will not identify as many collectable objectsas the native garbage collection. In the worst case, where an object is minimallyreachable by a chain of n pairs of original and retroactive references, it will take niterations to clear the chain of garbage. This is a valid limitation of a JVM garbagecollector, however, as they are permitted to be non-deterministic and unpredictable.Villazo´n’s aspect already accounts for the general imprecision of garbage collec-tion by invoking Runtime.gc() and related methods multiple times until col-97lection reaches an approximate fixed point.6.6.5 ProfilingPearce et al. [52] investigated the effectiveness of implementing a general-purposeJava profiler using AspectJ. Their conclusions were that AspectJ was sufficientlyexpressive and efficient for reasonable implementations of several profiling usecases. They also named several limitations of the standard implementation of As-pectJ at the time that impacted its suitability, primarily the lack of support forarray construction and synchronization join points, and the lack of support forweaving classes in the standard libraries. To date, however, these limitations havemostly since been addressed: the requested join points have since been added, andseveral alternate weaving approaches have been proposed for weaving standardclasses [66]. Reflective weaving is another such approach since it does not modifythe original bytecode.Their implementation djprof includes several aspects for gathering statisticson object lifetimes, heap usage, time spent, and time wasted. Profiling use casesthat involve measuring actual wall time for execution are not ideal for retroactiveweaving; modelling the current time during retroactive execution in a semanticallyconsistent manner is challenging, although an interesting avenue for future work.However, analyzing execution in terms of operation frequencies or the size of datais just as effectively measured post-hoc. We chose to reuse the HeapAspect inparticular from djprof, which estimates the total heap space allocated by eachmethod. Our primary reason was to demonstrate the advantage of source compat-ibility in retroactive analysis: even though the aspect is generic and refers onlyto types within the standard libraries, it makes heavy use of the reflective meth-ods in Java (and hence AspectJ) to estimate object sizes. These calculations couldbe implemented using other APIs for executions recordings, but using retroactiveweaving means the same existing code can be used as-is.6.7 EvaluationThis section evaluates two primary research questions:1. What amount of effort, in terms of orders of magnitude, is necessary to adapt98an aspect to make it valid for retroactive weaving?2. Is retroactive weaving responsive enough for a typical execution recordinganalysis scenario?6.7.1 Adaptation EffortThis section quantifies the amount of additional programming necessary to adaptthe selected aspects to retroactive weaving, using the idiom described in Sec-tion 3.4. One of our claimed benefits of retroactive weaving is the reuse of exist-ing aspects as-is for post-hoc analysis. If substantial effort is necessary to avoidretroactive side-effects and hence ensure sound analysis results, it reduces thestrength of this claim.Many such effects occur within code that caches frequently requested datawithin the standard runtime libraries. Avoiding this class of effects is usually assimple as eliding the caching mechanism entirely, since retroactive weaving is nota performance-critical environment. Figure 6.2 contains an example of such codeand advice for avoiding the retroactive side-effect. In this particular case, there isonly a single, highly-localized operation to omit, since the hash field is private andonly assigned to within this method, but in other cases it is necessary to use thecflow pointcut to apply a more precise scope.Other illegal side-effects occur because aspects legitimately need to alter be-haviour in ways that do not semantically interfere with the original execution. Forexample, three of the case study aspects we evaluated need to summarize and re-port analysis at the end of execution. They achieve this by registering threads to berun when the JVM is shutting down using the Runtime.addShutdownHookmethod, which adds threads to a static list. If called retroactively, adding to this listis an illegal side-effect.The solution here is to declare an additional list for retroactive hooks on anaspect, and to define advice that augments the shutdown hook mechanism to alsorun hooks added to this second list. Figure 6.3 illustrates how this is done. Thisversion advises the low-level implementation of the shutdown hook mechanism,at the cost of violating encapsulation. An alternative version could instead advisehigher-level code, at the cost of some code duplication.991 public class St r i ng {2 . . .3 public in t hashCode ( ) {4 i n t h = hash ;5 i f ( h == 0 && value . leng th > 0) {6 char va l [ ] = value ;78 for ( i n t i = 0 ; i < value . leng th ; i ++) {9 h = 31 * h + va l [ i ] ;10 }11 hash = h ;12 }13 return h ;1415 }16 . . .17 }1819 pub l i c aspect JREAroundFieldSets {20 . . .21 void around ( ) : set ( * S t r i ng . hash ) {22 / / Don ’ t proceed ( ) , j u s t l e t i t be reca l cu l a t ed every t ime23 }24 . . .25 }Figure 6.2: Suppressing illegal side-effects with aspectsA key unanticipated technical challenge is that the process of loading the as-pects themselves may also have illegal side-effects. These operations occur in theimplementation of class loading within the standard libraries and not in the controlflow of the base code it is loading. This creates a catch–22 situation where load-ing the aspects to avoid retroactive side-effects can cause those effects themselves.To avoid these side-effects, the holographic JVM implementation itself includes asmall amount of low-level metaprogramming to install callbacks that relocate stateand augment behaviour in much the same ways as described above. As discussedat length earlier, this code is more difficult to write and debug, but is limited to thebootstrapping phase of the holographic JVM.In total, we encountered 21 instances of state the aspects attempted to retroac-tively modify in their control flow, 18 of which had to be elided through metapro-gramming instead of advice. It is likely that this set of bootstrap workarounds1001 pub l i c aspect ShutdownHooks {23 / / Augments ApplicationShutdownHooks . hooks4 private s ta t i c IdentityHashMap<Thread , Thread> moreHooks5 = new IdentityHashMap<Thread , Thread> ( ) ;67 / / Add r e t r o a c t i v e hooks to moreHooks ins tead8 void around ( Thread hook ) :9 execution ( void Runtime . addShutdownHook ( Thread ) )10 && args ( hook ) {11 moreHooks . put ( hook , hook ) ;12 }1314 a f te r ( ) : execution ( void Shutdown . runHooks ( ) ) {15 / / Copy of ApplicationShutdownHooks . runHooks ( ) ,16 / / but r e f e r r i n g to moreHooks ins tead .17 Co l l ec t i on<Thread> threads ;18 synchronized ( ShutdownHooks . c lass ) {19 threads = moreHooks . keySet ( ) ;20 moreHooks = nul l ;21 }2223 for ( Thread hook : threads ) {24 hook . s t a r t ( ) ;25 }26 for ( Thread hook : threads ) {27 t ry {28 hook . j o i n ( ) ;29 } catch ( I n te r rup tedExcep t i on x ) { }30 }31 }32 }Figure 6.3: Relocating illegal side-effects with aspectsare adequate for many other retroactive aspects as well, since they are enough tosupport safely loading any additional aspects necessary to avoid other side-effects.The complete list of workarounds can be found in Section A.2.6.7.2 Results and RuntimeWe now present a basic evaluation of the retroactive weaver as applied to each ofthe five case studies. We compared the output and runtime of the ajc load-timeweaver against the retroactive weaver using either TOD or LEAP execution record-101ings. The results are presented in Table 6.2. All run times are reported in seconds,and are the average of five iterations with one warmup iteration (which has the side-effect of populating the holographic JVM bytecode caches as in Section 5.3.2).Table 6.2: Execution time comparison of retroactive weaving with load-timeweavingcontract tracing racerj leaks heapBase program 0.29 0.26 0.26 0.43 0.44ajc load-time weave 3.68 3.60 8.80 3.95 3.37TOD record 11.33 13.30 N/A 11.98 11.34TOD retroactive weave 12.98 44.35 N/A 107.65 34.58LEAP record 0.40 0.39 0.25 0.25 0.45LEAP retroactive weave 16.52 28.10 151.73 104.30 26.01From these results we can see that retroactive weaving typically performs be-tween 3X to 20X slower than runtime weaving. The TOD-based retroactive weaverout-performed the LEAP-based weaver for the contract validation aspect becausethe aspect matched very few joinpoints in the base program. This is the situationwhere using an indexed execution trace offers an advantage, since the remainingexecution events that do not match can be skipped over quickly.The leak detector aspect caused the worst overhead for both execution record-ing adaptors at approximately 26X, but for very different reasons. As documentedin Villazo´n et al. [66], the aspect is often more useful if woven with a weaver thatsupports core Java classes, and its corresponding base program contains a leakthat is only detected in that case. Hence the performance comparison is not com-pletely fair as the LEAP-based retroactive weaver detects the intended leak and 28other objects that are not reclaimed and the load-time weaver does not. On theother hand, because the TOD system records events by augmenting the originalprogram’s bytecode, it is also not able to record events that occur in the code Javaclasses, and hence retroactively weaving the leak detector aspect on a TOD record-ing produces the same, less complete results as the load-time weaver. However,the aspect also depends on garbage collection for completeness. The prototypeimplementation of garbage collection in the TOD-based VMM is based on a naivemark-and-sweep algorithm implemented at the metaprogramming level, and is un-102surprisingly very slow. By contrast the LEAP-based VMM is based on a live JVMand can defer to its highly-optimized and asynchronous garbage collection.Note that the RacerJ aspect cannot currently be used with the TOD-based back-end because the TOD recorder and database does not record synchronization eventsand hence does not support the lock() and unlock() pointcuts. It is likelyfeasible to modify the TOD architecture to add this event type, however. An alter-native scenario that only considers synchronized methods is possible, but ignoringthe many synchronized blocks commonly used in concurrent code would likelyproduce many false positives.It is important to put the runtime of retroactive weaving in perspective. Theretroactive weaver is not used in the same way as the load-time weaver, and notintended to be an alternative to it nor competitive with it in runtime. The overheadreported above is somewhat comparable to the overhead of running a program indebug mode, and there is plenty of room for further optimization in future imple-mentations. The advantage of time-travel programming is allowing the resourcesgiven to and context of a program’s execution to vary independently from that ofthe execution analysis. Retroactive execution does not require human interaction toproceed, and is an ideal candidate for scheduling on idle computing resources. Inaddition, multiple analysis aspects can be woven retroactively against the same ex-ecution recording in parallel. Therefore, even though executing an aspect retroac-tively may be an order of magnitude slower that live execution, the ability to doso also saves the time otherwise needed to implement or reimplement the desiredanalysis at the metaprogramming level or within an execution query language.6.8 Related WorkSeveral query languages have been proposed that treat program execution eventsas data, and define domain-specific constructs for selecting and aggregating thatdata. The Program Trace Query Language [28] is a SQL-like relational query lan-guage for Java based on several Java-specific relations for events such as objectallocation and method invocation. Our work is complimentary to such query lan-guages, which can extract values from multiple points in time at once, but cannotuse the original source language to interpret those values. By contrast the Program103Query Language [42] uses flow-sensitive boolean conditions to identify applica-tion errors. PQL queries can be evaluated dynamically to catch errors at runtime,but can also be used to find sound approximations statically. Since the dynamicinstrumentation application can invoke or replace application code, it constitutes apoincuts and advice language, one with more sophisticated pointcuts than AspectJ.The PQL implementation supports identifying query matches post-hoc, which isto say identifying matching join points given a program event stream, but does notsupport executing the bodies of queries.The execution metaprogramming approach described earlier is closely relatedto the scriptable debugging paradigm, which involves driving debugging activi-ties via a secondary language that operates on the debugging API as a first-classdatatype. The secondary language may actually be a subset of the target language,as is the case with the Eclipse MAT utilities, but in that case the execution valuesof the debugged program are represented at the metaprogramming level and not asnormal values within the debugging script or program. The Gnu debugger (GDB)features a Python scripting interface for automating debugging, and the recent Ex-positor [53] language supports time-travel debugging by allowing scripts to usetraces as first-class objects, including moving backwards and forwards in time ar-bitrarily and retrieving, for example, the set of all points in time where a breakpointwould have been hit. Note that our mirror-based implementation of AspectJ onlyrefers to the generic VM mirror interfaces, and do not require a holographic VM.If applied to an adapted JDI mirror, the process will dynamically load the aspectclasses into the live process and execute advice bodies directly. This therefore canbe seen as another scriptable debugging implementation1 on a normal live processbased on AOP.Ansaloni et. al. [6] propose optimizing non-interfering aspects by executingthem in parallel on multi-core architectures. To ensure consistency, however, theaspects in question have to be rewritten to ensure that the necessary state is ex-plicitly copied into the aspect space so that base execution can continue withoutblocking. A possible improvement on this approach would be to use holographic1Note that the completeness of this AspectJ implementation is limited by the JDI implementation;in particular general around advice is not supported for the version of JDI we used as its support formethod invocation was not reentrant.104execution (in read-only mode) in combination with a copy-on-write layer to allowparallel execution of unaltered aspects. Their primary example of a non-interferingaspect is the Racer [21] algorithm that we included as a case study, which is com-pelling since it is a useful but costly aspect.6.9 SummaryIn this chapter we have shown that combining a reflective aspect weaver for As-pectJ with the holographic JVM implementation from Chapter 5 creates an AspectJretroactive weaver. We have shown that such a weaver can be used to evaluate awide variety of aspects against execution recordings instead of inline with the origi-nal base program, with a high but not prohibitive runtime cost. We have also shownthat adapting aspects to be free of illegal side-effects in their control flow only re-quires a handful of common auxiliary aspects and metaprogramming workarounds.While these results are very preliminary, they demonstrate the basic feasibility ofthis approach and lay the foundation for future implementations.105Chapter 7ConclusionWe conclude with a review of the contributions of this thesis and how they supportour thesis statement. We also outline the limitations and threats to validity of thiswork, and suggest several potential avenues of future research.7.1 SummaryThis thesis has presented Time-travel Programming, a novel programming paradigmfor interacting with prior executions of software. Its defining characteristic is sim-ulating the behaviour code would have produced if evaluated during such past exe-cutions. Its building blocks are Retroactive Execution, the simulation of evaluatingcode at some past time and context, and Retroactive Weaving, the retroactive exe-cution of a post-hoc aspect across an execution recording. Our thesis statement isthat TTP is effective, feasible, and flexible, which we have demonstrated as followsbelow.TTP is effective: In Section 3.1, Section 5.3.1, and Section 6.6, we have out-lined several examples of program analysis that are straightforward and logical toexpress using TTP, including debugging a long-standing memory leak in Eclipse.TTP is feasible: By outlining a prototype for both retroactive execution andretroactive weaving for the Java Virtual Machine and the AspectJ AOP program-ming language in Chapter 5 and Chapter 6, we have provided evidence that TTPcan be implemented for a full-featured programming language. We have shown106that the implementation effort required is reasonable by reusing substantial amountsof existing programming language implementation, and that it is feasible to workaround missing information in execution recordings through moderate adaption ef-fort of the post-hoc code.TTP is flexible: Chapter 4 examined the concepts behind TTP in the contextof a simple core language, and demonstrates how they interact with fundamentalprogramming language concepts. This provides evidence that TTP is applicableto many programming languages that are built on these concepts. Orthogonally,by evaluating our prototype TTP implementation against three different sources ofexecution state (JVM heap dumps, execution traces and DR processes), we havedemonstrated that the execution recording mechanism may vary independentlyfrom the TTP implementation for a particular programming language.7.2 LimitationsIncomplete implementation of AspectJ: The reflective AspectJ weaver does notyet support the complete AspectJ language, although supporting the evaluationrequired a much more complete implementation that initially anticipated. Mostomissions are from a lack of need for evaluation rather than any architectural ob-stacles.Lack of core class weaving in ajc: For the most part the reflective AspectJweaver supports join points that occur in the code of all classes, but the ajc toolsetassumes an implementation based on bytecode rewriting and hence excludes somepackages1 from weaving. This artificially restricts valid aspects: advice on coremethods only produce a compile warning, but intertype definitions on core classesproduce compile errors even if load-time weaving is targeted. This means it is notpossible to use ajc to produce the binary form (see Section 2.5) used by retroactiveweaving of some valid retroactive aspects, which means developers must avoidAspectJ features that would otherwise provide the most elegant expression of theiranalysis.1java.*, javax.*, sun.reflect.*, and org.aspectj.*. The latter package is ex-cluded to avoid infinite recursion if the classes involved in the weaving implementation itself arewoven.1077.3 Threats to ValidityReflection may introduce unsound results: It is possible for the rich reflectioninterface in AspectJ to observe changes to type definitions or control flow, and thispotential source of retroactive side-effects is not currently guarded against by theholographic JVM. This is unlikely to be a serious concern since the AspectJ lan-guage does not specify precisely how weaving affects these features of the runtime.For example, weaving often introduces extra stack frames between the caller andcallee frames, and the exact nature of these changes depends on internal optimiza-tion decisions. Therefore, any code that depends on this data would be extremelyfragile.Double weaving of joint points in aspects: When ajc compiles aspects intotheir binary class form, it also applies compile-time weaving to the join points inthose aspects. When such aspects are then loaded and woven by the reflectiveweaver against all events that occur in the runtime, advice may be applied twiceto those join points. For the case studies in this thesis the only advice that appliesto other advice code involves cflow tracking, whose application happens to beidempotent and hence does not affect execution semantics. Unfortunately, ajcdoes not expose any options for separating weaving from compiling, although itwould be a relatively small change to support this in the tool’s architecture.Lack of generalization to other language implementations: The abstractconcepts defined by TTP theoretically apply to any programming language. Thegeneral model described in Section 4.5 for retroactive weaving, for example, cantheoretically be applied to many AOP languages, and provides a definition ofthe semantics of a retroactive weaver for any such language. However, the ac-tual implementation techniques presented here may not be applicable for otherAOP language runtimes, in particular for fundamentally different joinpoint models.The fact that our implementations overlapped significantly with existing languageimplementations is promising, but is no guarantee that other languages runtimeswould be equally compatible with the TTP paradigm.1087.4 Future WorkAlthough we have shown that the implementation of our architecture is performantenough to be useful, there is still plenty of room for optimization that would greatlyimprove the scope of scenarios these techniques are feasible for. An important di-rection to pursue is to improve the bytecode loading and translating workflow inthe holographic JVM implementation. Ideally, it should be possible to producehologram bytecode once for each version of a class as it is developed, rather thanrepeatedly for each heap dump it occurs in. The translation process cannot cur-rently be applied to a single class file in isolation, however, as the standard JVMbytecode type inferencing algorithm requires access to the context of the classhierarchy. Additional engineering effort should make it possible to remove this de-pendency, possibly by abstracting the type inferencing to leave placeholder typesand replacing them with actual types only after caching the translated bytecode.With regard to inferring class initialization state in the holographic JVM, Kozenand Stillerman [35] use a static analysis of class initializers similar to ours to ini-tialize classes eagerly, in order to improve startup performance and catch errorsearlier. Their algorithm ignores initializer effects with respect to static fields, butis flow-sensitive and hence calculates a more precise definition of initializationdependencies than our current implementation. Integrating their approach in thefuture may improve the success rate of our algorithm and hence reduce the numberof initializers that must be marked safely repeatable.It should be possible to apply the general model defined in Section 4.5 to AOPlanguages beyond those based on pointcuts and advice, such as crosscutting con-tracts as in AspectJML [57]. There are also several more radical paths other thanretroactive weaving to explore in the intersection of AOP and execution record-ings. It could be useful to expose future knowledge about the original executionthrough specialized pointcuts valid only in retroactive weaving; a garbage(x)pointcut, for example, that only matches join points where x is an object value thatis garbage collectable at the end of execution. Such pointcuts would need to be de-signed carefully to only bind and clone immutable, self-contained values from thefuture environment, since retroactive execution of advice bodies cannot sensiblyproceed in the context of more than one environment at once.109Similarly, given that execution recordings are often very large, it is intrigu-ing to consider how time-travel programming can be parallelized without entirelylosing the benefits of source code compatibility. For example, the workload of ap-plying toString to every object in a heap dump as described in Section 5.3.2could be parallelized by splitting the set of objects into multiple subsets, underthe assumption that the effects of invoking toString on some objects do notaffect the results of others. The portability of execution recordings and of holo-graphic JVMs means that all state can be safely replicated in a distributed farm.Retroactive weaving of certain stateless or mostly stateless aspects could then beparallelized by partitioning the timeline of execution.Finally, this work to date has focussed exclusively on analyzing a single pro-cess at once, but computer systems are becoming increasingly multi-process anddistributed. The scope of time-travel programming as a concept could easily be ex-tended to evaluate additional code in the context of multiple independently recordedprocesses. This could be applied very naturally to actor-based systems, whereretroactive aspects in the applicable AOP language could create additional actorsthemselves in order to communicate analysis data across processes, again restoringthe advantages of a programming paradigm even when that programming happensafter the fact.110Bibliography[1] Clozure cl documentation - 4.9. saving applications. URLhttp://ccl.clozure.com/manual/chapter4.9.html#Saving-Applications.Accessed October 2013. → pages 79[2] The AspectJ programming guide. URLhttps://eclipse.org/aspectj/doc/released/progguide/index.html. AccessedMay 2017. → pages 14[3] Eclipse memory analyzer open source project. URLhttp://www.eclipse.org/mat/. Accessed October 2013. → pages 20[4] VAssert programming guide, 2008. URLhttps://www.vmware.com/pdf/ws65 vassert programming.pdf. AccessedJune 2011. → pages 24, 28[5] B. Alpern, S. Augart, S. Blackburn, M. Butrico, A. Cocchi, P. Cheng,J. Dolby, S. Fink, D. Grove, M. Hind, et al. The jikes research virtualmachine project: building an open-source research community. IBM SystemsJournal, 44(2):399–417, 2005. → pages 79[6] D. Ansaloni, W. Binder, A. Villazo´n, and P. Moret. Parallel dynamicanalysis on multicores with aspect-oriented programming. In Proceedings ofthe 9th International Conference on Aspect-Oriented Software Development,AOSD ’10, pages 1–12, New York, NY, USA, 2010. ACM. → pages 96, 104[7] M. Attariyan, M. Chow, and J. Flinn. X-ray: Automating root-causediagnosis of performance anomalies in production software. In Proceedingsof the 10th USENIX Conference on Operating Systems Design andImplementation, OSDI’12, pages 307–320. USENIX Association, 2012. →pages 26[8] T. Austin, T. Disney, and C. Flanagan. Virtual values for language extension.In Proceedings of the 2011 ACM international conference on Object111oriented programming systems languages and applications, pages 921–938.ACM, 2011. → pages 78[9] J. Baker and W. Hsieh. Runtime Aspect Weaving ThroughMetaprogramming. In Proceedings of the 1st International Conference onAspect-oriented Software Development, AOSD ’02, pages 86–95, New York,NY, USA, 2002. ACM. → pages 16[10] S. Balzer, P. T. Eugster, and B. Meyer. Can Aspects Implement Contracts?In N. Guelfi and A. Savidis, editors, Rapid Integration of SoftwareEngineering Techniques, number 3943 in Lecture Notes in ComputerScience, pages 145–157. Springer Berlin Heidelberg, 2006. → pages 93[11] F. Bellard. QEMU, a fast and portable dynamic translator. In USENIXAnnual Technical Conference, 2005. → pages 32[12] E. Bodden and K. Havelund. Racer: effective race detection using aspectj.In Proceedings of the 2008 international symposium on Software testing andanalysis, ISSTA ’08, pages 155–166, New York, NY, USA, 2008. ACM.ACM ID: 1390650. → pages 25, 96[13] G. Bracha and D. Ungar. Mirrors: design principles for meta-level facilitiesof object-oriented programming languages. In Proceedings of the 19thannual ACM SIGPLAN conference on Object-oriented programming,systems, languages, and applications, OOPSLA ’04, pages 331–344. ACM,2004. → pages 59, 79[14] E. Bruneton, R. Lenglet, and T. Coupaye. ASM: a code manipulation tool toimplement adaptable systems. In Adaptable and extensible componentsystems, 2002. → pages 64[15] C. Cadar, D. Dunbar, and D. Engler. Klee: Unassisted and automaticgeneration of high-coverage tests for complex systems programs. InProceedings of the 8th USENIX Conference on Operating Systems Designand Implementation, OSDI’08, pages 209–224, Berkeley, CA, USA, 2008.USENIX Association. → pages 68[16] K. Chen and J.-B. Chen. Aspect-Based Instrumentation for LocatingMemory Leaks in Java Programs. In Computer Software and ApplicationsConference, 2007. COMPSAC 2007. 31st Annual International, volume 2,pages 23–28, July 2007. → pages 97112[17] J. Choi and B. Alpern. DejaVu: Deterministic Java Replay Debugger forJalapeno Java Virtual Machine. OOPSLA 2000 Companion, 2000. → pages26[18] J. D. Choi, B. Alpern, T. Ngo, M. Sridharan, and J. Vlissides. Aperturbation-free replay platform for cross-optimized multithreadedapplications. In Parallel and Distributed Processing Symposium.,Proceedings 15th International, 2001. → pages 2[19] J. Chow, T. Garfinkel, and P. M. Chen. Decoupling dynamic programanalysis from execution in virtual environments. In USENIX AnnualTechnical Conference, 2008. → pages 35[20] J. Cook and A. Nusayr. Using AOP for Detailed Runtime MonitoringInstrumentation. In WODA 2008: the sixth international workshop ondynamic analysis. → pages 25[21] P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model forstatic analysis of programs by construction or approximation of fixpoints. InProceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles ofprogramming languages, POPL ’77, pages 238–252. ACM, 1977. → pages70[22] D. S. Dantas and D. Walker. Harmless advice. In Conference record of the33rd ACM SIGPLAN-SIGACT symposium on Principles of programminglanguages, POPL ’06, pages 383–396, New York, NY, USA, 2006. ACM.→ pages 17[23] B. De Fraine, E. Ernst, and M. Su¨dholt. Essential AOP: The A Calculus. InECOOP 2010 - Object-Oriented Programming, volume 6183 of LectureNotes in Computer Science, pages 101–125. Springer Berlin Heidelberg,2010. → pages 53[24] D. Devecsery, M. Chow, X. Dou, J. Flinn, and P. M. Chen. Eidetic systems.In Proceedings of the 11th USENIX Conference on Operating SystemsDesign and Implementation, OSDI’14, pages 525–540. USENIXAssociation, 2014. → pages 18, 26[25] G. W. Dunlap, S. T. King, S. Cinar, M. A. Basrai, and P. M. Chen. ReVirt:enabling intrusion analysis through virtual-machine logging and replay.Operating Systems Design and Implementation, 2002. → pages 18, 34, 35113[26] C. Dutchyn, D. B. Tucker, and S. Krishnamurthi. Semantics and scoping ofaspects in higher-order languages. Science of Computer Programming, 63(3), Dec. 2006. ISSN 0167-6423. → pages 16, 40, 53[27] M. Factor, A. Schuster, and K. Shagin. Instrumentation of standard librariesin object-oriented languages: the twin class hierarchy approach. InProceedings of the 19th annual ACM SIGPLAN conference onObject-oriented programming, systems, languages, and applications,OOPSLA ’04, page 288300. ACM, 2004. → pages 65[28] S. F. Goldsmith, R. O’Callahan, and A. Aiken. Relational queries overprogram traces. In Object-Oriented Programming, Systems, Languages, andApplications, page 402, 2005. → pages 24, 103[29] O. Gruber, B. Hargrave, J. McAffer, P. Rapicault, and T. Watson. Theeclipse 3.0 platform: adopting osgi technology. IBM Systems Journal, 44(2):289–299, 2005. → pages 22[30] E. Hilsdale and J. Hugunin. Advice Weaving in AspectJ. In Proceedings ofthe 3rd International Conference on Aspect-oriented Software Development,AOSD ’04, pages 26–35, New York, NY, USA, 2004. ACM. → pages 15[31] J. Huang, P. Liu, and C. Zhang. LEAP: lightweight deterministicmulti-processor replay of concurrent java programs. In Proceedings of theeighteenth ACM SIGSOFT international symposium on Foundations ofsoftware engineering, pages 207–216, 2010. → pages 18, 91[32] A. Joshi, S. T. King, G. W. Dunlap, and P. M. Chen. Detecting past andpresent intrusions through vulnerability-specific predicates. In Proceedingsof the twentieth ACM symposium on Operating systems principles, SOSP’05, pages 91–104, New York, NY, USA, 2005. ACM. → pages 28[33] G. Kiczales and J. D. Rivieres. The Art of the Metaobject Protocol. MITPress, Cambridge, MA, USA, 1991. ISBN 0262111586. → pages 10[34] G. Kiczales, E. Hilsdale, J. Hugunin, M. Kersten, J. Palm, and W. G.Griswold. An overview of AspectJ. In Proceedings of the 15th EuropeanConference on Object-Oriented Programming, ECOOP ’01, pages 327–353,London, UK, UK, 2001. Springer-Verlag. → pages 40[35] D. Kozen and M. Stillerman. Eager class initialization for java. InProceedings of the 7th International Symposium on Formal Techniques inReal-Time and Fault-Tolerant Systems: Co-sponsored by IFIP WG 2.2,FTRTFT ’02, pages 71–80. Springer-Verlag, 2002. → pages 80, 109114[36] S. Krishnamurthi. Programming Languages: Application and Interpretation.2007. URL http://www.plai.org. Accessed March 2015. → pages 38[37] G. Lefebvre, B. Cully, M. J. Feeley, N. C. Hutchinson, and A. Warfield.Tralfamadore: unifying source code and execution experience. In EuroSys,2009. → pages 32[38] B. Lewis. Debugging backwards in time. In Automated and Analysis-DrivenDebugging, 2003. → pages 18[39] K. J. Lieberherr and D. Orleans. Preventive program maintenance inDemeter/Java (research demonstration). In International Conference onSoftware Engineering, pages 604–605, Boston, MA, 1997. ACM Press. →pages 45[40] D. H. Lorenz and J. Vlissides. Pluggable reflection: decouplingmeta-interface and implementation. In 25th International Conference onSoftware Engineering, 2003. Proceedings, pages 3– 13. IEEE, May 2003. →pages 79[41] C. K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace,V. J. Reddi, and K. Hazelwood. Pin: Building customized program analysistools with dynamic instrumentation. In Programming Language Design andImplementation, 2005. → pages 24[42] M. Martin, B. Livshits, and M. S. Lam. Finding application errors andsecurity flaws using PQL: a program query language. Object-OrientedProgramming, Systems, Languages and Applications, 2005. → pages 104[43] H. Masuhara and G. Kiczales. Modeling crosscutting in aspect-orientedmechanisms. In ECOOP 2003 - Object-Oriented Programming, volume2743 of Lecture Notes in Computer Science, pages 2–28. Springer BerlinHeidelberg, 2003. → pages 13, 45[44] E. K. Maxwell, G. Back, and N. Ramakrishnan. Diagnosing memory leaksusing graph mining on heap dumps. In Proceedings of the 16th ACMSIGKDD international conference on Knowledge discovery and datamining, KDD ’10, pages 115–124. ACM, 2010. → pages 80[45] B. Meyer. Applying ’design by contract’. Computer, 25(10):40–51, Oct.1992. → pages 93115[46] S. Mostinckx, T. Van Cutsem, S. Timbermont, and E. Tanter. Mirages:Behavioral intercession in a mirror-based architecture. In Proceedings of the2007 symposium on Dynamic languages, pages 89–100. ACM, 2007. →pages 78[47] G. Necula, S. McPeak, S. Rahul, and W. Weimer. CIL: intermediatelanguage and tools for analysis and transformation of c programs. InCompiler Construction. 2002. → pages 29[48] T. Ngo and J. Barton. Debugging by remote reflection. In Euro-Par 2000Parallel Processing, pages 1031–1038, 2000. → pages 26[49] E. B. Nightingale, D. Peek, P. M. Chen, and J. Flinn. Parallelizing securitychecks on commodity hardware. In Architectural Support for ProgrammingLanguages and Operating Systems, 2008. → pages 2, 26[50] Oracle. Java platform debugger architecture. URLhttp://docs.oracle.com/javase/7/docs/technotes/guides/jpda/index.html.Accessed October 2013. → pages 59[51] A. Orso and B. Kennedy. Selective capture and replay of programexecutions. In ACM SIGSOFT Software Engineering Notes, volume 30,pages 1–7, New York, NY, USA, May 2005. ACM. ACM ID: 1083251. →pages 2[52] D. J. Pearce, M. Webster, R. Berry, and P. H. J. Kelly. Profiling with AspectJ.Software: Practice and Experience, 37(7):747–777, 2007. → pages 98[53] K. Y. Phang, J. S. Foster, and M. Hicks. EXPOSITOR: ScriptableTime-Travel Debugging with First Class Traces. In Proceedings of the 2013International Conference on Software Engineering, 2013. → pages 104[54] A. Popovici, G. Alonso, and T. Gross. Just-in-time aspects: efficientdynamic weaving for java. In Aspect-Oriented Software Development, 2003.→ pages 16[55] G. Pothier and E. Tanter. Summarized trace indexing and querying forscalable back-in-time debugging. In Proceedings of the 25th EuropeanConference on Object-oriented Programming, ECOOP’11, pages 558–582,Berlin, Heidelberg, 2011. Springer-Verlag. → pages 18, 47, 89[56] G. Pothier, E´. Tanter, and J. Piquer. Scalable omniscient debugging. ACMSIGPLAN Notices, 42(10), 2007. → pages 18, 89, 90116[57] H. Rebeˆlo, G. T. Leavens, M. Bagherzadeh, H. Rajan, R. Lima, D. M.Zimmerman, M. Corne´lio, and T. Thu¨m. AspectJML: Modular specificationand runtime checking for crosscutting contracts. In Proceedings of the 13thInternational Conference on Modularity, MODULARITY ’14, pages157–168, New York, NY, USA, 2014. ACM. → pages 93, 109[58] J. A. Robinson. A machine-oriented logic based on the resolution principle.J. ACM, 12(1):23–41, Jan. 1965. ISSN 0004-5411. → pages 30[59] R. Salkeld and R. Garcia. Essential Retroactive Weaving. In CompanionProceedings of the 14th International Conference on Modularity,MODULARITY Companion 2015, pages 52–57, New York, NY, USA,2015. ACM. → pages v[60] R. Salkeld and G. Kiczales. Interacting with dead objects. In Proceedings ofthe 2013 ACM SIGPLAN International Conference on Object OrientedProgramming Systems Languages and Applications, OOPSLA ’13, pages203–216, New York, NY, USA, 2013. ACM. → pages v, 49[61] R. Salkeld, W. Xu, B. Cully, G. Lefebvre, A. Warfield, and G. Kiczales.Retroactive aspects: Programming in the past. In Proceedings of the NinthInternational Workshop on Dynamic Analysis, WODA ’11, pages 29–34,New York, NY, USA, 2011. ACM. → pages v[62] V. Schuppan, M. Baur, and A. Biere. JVM independent replay in java.Electronic Notes in Theoretical Computer Science, 113:85–104, Jan. 2005.→ pages 2[63] M. X. Sheldon, G. V. Weissman, and V. M. Inc. Retrace: Collectingexecution trace with virtual machine deterministic replay. In Modeling,Benchmarking and Simulation, 2007. → pages 2, 35[64] B. C. Smith. Reflection and semantics in lisp. In Proceedings of the 11thACM SIGACT-SIGPLAN Symposium on Principles of ProgrammingLanguages, POPL ’84, pages 23–35, New York, NY, USA, 1984. ACM. →pages 10[65] A. Srivastava and A. Eustace. ATOM: a system for building customizedprogram analysis tools. In Proceedings of the ACM SIGPLAN 1994conference on Programming language design and implementation, pages196–205, Orlando, Florida, United States, 1994. ACM. → pages 24117[66] A. Villazo´n, W. Binder, and P. Moret. Aspect Weaving in Standard JavaClass Libraries. In Proceedings of the 6th International Symposium onPrinciples and Practice of Programming in Java, PPPJ ’08, pages 159–167,New York, NY, USA, 2008. ACM. → pages 97, 98, 102[67] G. Xu, A. Rountev, Y. Tang, and F. Qin. Efficient checkpointing of javasoftware using context-sensitive capture and replay. In Proceedings of thethe 6th joint meeting of the European software engineering conference andthe ACM SIGSOFT symposium on The foundations of software engineering,ESEC-FSE ’07, pages 85–94. ACM, 2007. → pages 79[68] Z. Yang, M. Yang, L. Xu, H. Chen, and B. Zang. ORDER: object centricdeterministic replay for java. In Proceedings of the 2011 USENIXconference on USENIX annual technical conference, USENIXATC’11, page3030. USENIX Association, 2011. → pages 18118Appendix AAppendicesA.1 Illegal Native Methods in the JREThis appendix contains our classification of all illegal native methods in the JRE.See Table A.1 for the complete list.119Category Class MethodCore Class Initialization* registerNativesjava.lang.System initPropertiesjava.util.concurrent.atomic.AtomicLong VMSupportsCS8sun.misc.VM initializeDriverssun.print.* *sun.security.smartcardio.* *GUIapple.laf.* *com.apple.eawt.* *com.apple.laf.* *com.sun.java.swing.plaf.gtk.* *java.awt.*, sun.awt.* *sun.lwawt.* *Graphicscom.sun.imageio.plugins.jpeg.* *sun.dc.pr.* *sun.font.* *sun.java2d.* *120Category Class MethodIOcom.apple.eio.* *com.sun.java.util.jar.pack.NativeUnpack *java.io.*, java.nio.*, sun.nio.* *java.util.logging.FileHandler *java.util.prefs.* *sun.misc.MessageUtils *JIT Compilation java.lang.Compiler *Java 7 Method Handles java.lang.invoke.* *Managementcom.sun.demo.jvmti.hprof.* *oracle.jrockit.jfr.* *sun.management.* *sun.misc.Perf *sun.tracing.dtrace.JVM *Media com.sun.media.sound.* *Native Librariesjava.lang.ClassLoader$NativeLibrary *java.lang.System mapLibraryNameNetworkjava.net.*, sun.net.* *sun.rmi.* *Security apple.security.*, sun.security.* *121Category Class MethodShared superclassjava.lang.Object *java.lang.Throwable *Systemapple.applescript.* *apple.launcher.* *com.apple.concurrent.* *com.apple.jobjc.* *com.apple.resources.LoadNativeBundleAction *com.sun.management.* *java.lang.ProcessEnvironment environjava.lang.Runtime *java.lang.UNIXProcess *java.util.TimeZone *sun.misc.GC *sun.misc.NativeSignalHandler *sun.misc.Signal *Table A.1: Categorization of forbidden methods in the Java Runtime Environment122A.2 Illegal Side-effects in AspectJ Case StudiesThis appendix contains the complete list of illegal side-effects encountered in thecontrol flow of the retroactive aspects evaluated in Chapter 6. Table A.2 summa-rizes the state that the execution the aspects attempted to modify, and how theseeffects were elided with either additional aspects or (in the case of effects en-countered when booting the holographic JVM) low-level metaprogramming. Thesecond column indicates whether the side-effects occurred when loading aspectsthemselves, making it necessary to implement the strategy using metaprogram-ming instead. The strategies named in the third column are as follows:Skip method: Simply removing invocations of methods that are not necessaryfor correctness.Avoiding cache: Eliding the side-effect, nullifying the effect of the relevantcaching.Secondary cache: Duplicating the referenced storage so that retroactive exe-cution reads and writes independently but equivalently.Secondary counter: Duplicating a numerical counter that is incremented togenerate distinct values. This strategy is valid as long as overlapping retroactivevalues with original values does not interfere with soundness.Augmenting collection: Allocating a secondary collection of values that se-mantically augments the original and advising interactions with that collection tomaintain consistency, as illustrated in Figure 6.3.Secondary I/O: Replacing standard output streams with independent streamsthat can be explicitly accessed by TTP clients.Reduced concurrency: A specific case during class loading to avoid updatingstorage that tracks locks per class, wherein the class loader itself is synchronizedon instead. This leads to decreased concurrency in retroactive weaving but doesnot interfere with correctness.123Table A.2: Illegal side-effects encountered by case studiesState Bootstrap Strategyjava.io.UnixFileSystem.cache y Secondary cachejava.io.UnixFileSystem.javaHomePrefixCache y Secondary cachejava.lang.ApplicationShutdownHooks.hooks Augmented collectionjava.lang.Class.declaredFields y Secondary cachejava.lang.ClassLoader.getClassLoadingLock y Reduced concurrencyjava.lang.String.hash y Avoided cachejava.lang.System.stdout Secondary I/Ojava.lang.System.stderr Secondary I/Ojava.lang.Thread.threadInitNumber y Secondary counterjava.lang.Thread.threadSeqNumber y Secondary counterjava.lang.ThreadLocal.nextHashCode y Secondary counterjava.lang.ThreadLocal values y Augmented collectionjava.net.URLClassLoader.closeables y Augmented collectionjava.nio.charset.Charset.cache1 y Secondary cachejava.nio.charset.Charset.cache2 y Secondary cachejava.util.zip.ZipCoder.dec y Secondary cachejava.util.zip.ZipCoder.enc y Secondary cachesun.reflect.NativeConstructorAccessorImpl.numInvocations y Secondary countersun.misc.Hashing.randomHashSeed y Secondary countersun.misc.MetaIndex.registerDirectory y Skip methodsun.nio.cs.ThreadLocalCoders$Cache.cache y Secondary cache124A.3 RAPL Interpreter Source CodeThis appendix contains the complete source code for the RAPL interpreter de-scribed in Chapter 4.125Listing A.1: Complete RAPL interpreter source code1 #lang plai23 (require "rapl.rkt")4 (require "rapl_parser.rkt")5 (require "rapl_serialization.rkt")67 ;;8 ;; Rapl interpreter9 ;;1011 (define-type Value12 (numV (n number?))13 (boolV (b boolean?))14 (symbolV (s symbol?))15 (closV (params (listof symbol?)) (body ExprC?) (env Env?))16 (boxV (l Location?))17 (voidV)18 (taggedV (tag Value?) (value Value?))19 (traceValueV (v Value?))20 (resumeV (label string?) (pos number?)))2122 (define-type Binding23 [bind (name symbol?) (value Value?)])24 (define Location? number?)25 (define Env? (listof Binding?))26 (define mt-env empty)2728 (define-type Storage29 [cell (location Location?) (val Value?)])12630 (define Store? (listof Storage?))31 (define mt-store empty)3233 (define-type Advice34 [aroundappsA (advice Value?)])35 (define AdvStack? (listof Advice?))36 (define mt-adv empty)3738 (define-type Control39 [interp-init]40 [app-call (abs Value?) (args (listof Value?))]41 [app-result (r Value?)])4243 (define-type State44 [state (c Control?) (adv AdvStack?) (sto Store?) (tin TraceIn?)])4546 (define-type TraceOut47 [traceout (states (listof State?))])48 (define mt-traceout (traceout empty))4950 (define/contract (append-traceout . ts) (->* () () #:rest (listof TraceOut?) TraceOut?)51 (traceout (foldl append '() (map traceout-states (reverse ts)))))5253 (define-type TraceIn54 [tracein (states (listof State?))])55 (define mt-tracein (tracein empty))5657 (define/contract (trace-state tin) (-> TraceIn? State?)58 (first (tracein-states tin)))5960 (define/contract (next-trace-state tin) (-> TraceIn? TraceIn?)12761 (tracein (rest (tracein-states tin))))6263 (define-type Result64 [v*s*t*t (v Value?) (s Store?) (tin TraceIn?) (tout TraceOut?)])6566 ;; Numbers and arithmetic6768 (define (num+ l r)69 (cond70 [(and (numV? l) (numV? r))71 (numV (+ (numV-n l) (numV-n r)))]72 [else73 (error 'num+ "one argument was not a number")]))7475 (define (num* l r)76 (cond77 [(and (numV? l) (numV? r))78 (numV (* (numV-n l) (numV-n r)))]79 [else80 (error 'num* "one argument was not a number")]))8182 (define (numWrite v)83 (cond84 [(numV? v)85 (write (numV-n v))]86 [else87 (error 'numWrite "argument was not a number")]))8889 ;; Booleans and conditionals9091 (define/contract (deep-untag v) (-> Value? Value?)12892 (type-case Value v93 [taggedV (tag tagged)94 (deep-untag tagged)]95 [else v]))9697 (define/contract (equal-values l r) (-> Value? Value? boolean?)98 (equal? l r))99100 ;; Identifiers and functions101102 (define/contract (lookup for env) (-> symbol? Env? Value?)103 (cond104 [(empty? env) (error 'lookup (string-append "name not found: " (symbol->string for)))]105 [else106 (type-case Binding (first env)107 [bind (name value)108 (cond109 [(symbol=? for name) value]110 [else (lookup for (rest env))])])]))111112 (define/contract (apply f args adv sto tin) (-> Value? (listof Value?) AdvStack? Store? TraceIn? Result?)113 (type-case Value (deep-untag f)114 [closV (params body env)115 (let ([bs (map bind params args)])116 (interp body (append bs env) adv sto tin))]117 [resumeV (label pos)118 (if (unbox retroactive-error-checking)119 (rw-call pos args adv sto tin)120 (rw-call-no-error args adv sto tin))]121 [traceValueV (tf)122 (apply (lift-trace-value tf) args adv sto tin)]129123 [else (error (string-append "only functions can be applied: " (value->string f)))]))124125 (define z-combinator126 (parse-string "(lambda (f) ((lambda (x) (f (lambda (y) ((x x) y))))127 (lambda (x) (f (lambda (y) ((x x) y))))))"))128129 ;; Mutations and side-effects130131 (define/contract (storage-at storage loc) (-> (listof Storage?) Location? (or/c Storage? #f))132 (cond133 [(empty? storage) #f]134 [else135 (let ([s (first storage)])136 (type-case Storage s137 [cell (l val)138 (cond139 [(= loc l) s]140 [else (storage-at (rest storage) loc)])]))]))141142 (define/contract (fetch sto tin b) (-> Store? TraceIn? Value? Value?)143 (type-case Value (deep-untag b)144 [boxV (loc)145 (let ([storage (storage-at sto loc)])146 (if storage147 (type-case Storage storage148 [cell (l val) val])149 (error "location not found")))]150 [traceValueV (v)151 (type-case State (trace-state tin)152 [state (c adv sto-t tin-t)153 (fetch sto-t tin-t v)])]130154 [else (error "attempt to unbox a non-box")]))155156 (define/contract (new-loc sto) (-> Store? Location?)157 (length sto))158159 (define override-store cons)160161 (define (list-box-push! b x)162 (set-box! b (cons x (unbox b))))163 (define (list-box-pop! b)164 (let* ([next (first (unbox b))]165 [_ (set-box! b (rest (unbox b)))])166 next))167168 (define read-source (box (lambda (prompt)169 (display prompt)170 (display "> ")171 (string->number (read-line)))))172 (define write-sink (box (lambda (s) (begin (display s) (newline)))))173174 ;; Lifting trace values175176 (define/contract (lift-binding b) (-> Binding? Binding?)177 (type-case Binding b178 [bind (name value)179 (bind name (lift-trace-value value))]))180181 (define/contract (lift-trace-value v) (-> Value? Value?)182 (type-case Value v183 [numV (_) v]184 [boolV (_) v]131185 [symbolV (_) v]186 [closV (params body env)187 (closV params body (map lift-binding env))]188 [boxV (l) (traceValueV v)]189 [traceValueV (tv) (traceValueV v)]190 [voidV () v]191 [taggedV (tag tagged)192 (taggedV (lift-trace-value tag) (lift-trace-value tagged))]193 [resumeV (label pos) v]))194195 (define/contract (prepend-trace t r) (-> TraceOut? Result? Result?)196 (type-case Result r197 [v*s*t*t (v-r s-r tin-r tout-r)198 (v*s*t*t v-r s-r tin-r (append-traceout t tout-r))]))199200 ;; Advice201202 (define/contract (apply-with-weaving f args adv adv2 sto tin) (-> Value? (listof Value?) AdvStack?AdvStack? Store? TraceIn? Result?)203 (type-case Result (weave adv adv2 f sto tin)204 (v*s*t*t (woven-f s-w tin-w tout-w)205 (type-case Result (apply woven-f args adv2 s-w tin-w)206 (v*s*t*t (r s-r tin-r tout-r)207 (let ([call-state (state (app-call f args) adv sto tin)]208 [return-state (state (app-result r) adv s-r tin-r)])209 (v*s*t*t r s-r tin-r (append-traceout (traceout (list call-state))210 tout-w211 tout-r212 (traceout (list return-state))))))))))213214 ; Applies all advice in scope for all tags on f132215 (define/contract (weave adv adv2 f sto tin) (-> AdvStack? AdvStack? Value? Store? TraceIn? Result?)216 (type-case Value f217 [taggedV (tag tagged)218 (type-case Result (weave adv adv2 tagged sto tin)219 [v*s*t*t (v-w s-w tin-w tout-w)220 (prepend-trace tout-w (weave-for-tag adv adv2 tag v-w s-w tin-w))])]221 [else (v*s*t*t f sto tin mt-traceout)]))222223 ; Applies all advice in scope for a single tag on f224 (define/contract (weave-for-tag adv adv2 tag f sto tin) (-> AdvStack? AdvStack? Value? Value? Store?TraceIn? Result?)225 (if (empty? adv)226 (v*s*t*t f sto tin mt-traceout)227 (type-case Result (weave-advice adv2 tag (first adv) f sto tin)228 [v*s*t*t (v-w s-w tin-w tout-w)229 (prepend-trace tout-w (weave-for-tag (rest adv) adv2 tag v-w s-w tin-w))])))230231 ; Apply a single advice function to f232 (define/contract (weave-advice adv tag advice f sto tin) (-> AdvStack? Value? Advice? Value? Store?TraceIn? Result?)233 (type-case Advice advice234 [aroundappsA (g)235 (apply g (list tag f) adv sto tin)]))236237 ;; Debugging238239 (define/contract (display-value v out) (-> Value? output-port? void?)240 (type-case Value v241 [numV (n) (display n out)]242 [boolV (b) (display b out)]243 [symbolV (s) (write s out)]133244 [closV (params body env)245 (begin (display params out) (display " -> " out) (display (exp-syntax body) out) (newlineout) (display-env env out))]246 [boxV (l)247 (begin (display "box(" out) (display l out) (display ")" out))]248 [traceValueV (v)249 (begin (display "tracevalue(" out) (display-value v out) (display ")" out))]250 [voidV ()251 (display "(void)" out)]252 [taggedV (t v)253 (begin (display "(tag " out) (display-value t out) (display " " out) (display-value v out)(display ")" out))]254 [resumeV (label f) (display label out)]))255256 (define/contract (value->string v) (-> Value? string?)257 (letrec ([out (open-output-string)]258 [_ (display-value v out)])259 (get-output-string out)))260261 (define/contract (display-context env sto t out) (-> Env? Store? TraceIn? output-port? void?)262 (begin (display "=======================================\n" out)263 (display "Environment: \n" out)264 (display-env env out)265 (display "Store: \n" out)266 (display-store sto out)267 (if (empty? (tracein-states t))268 (void)269 (begin (display "Trace Store: \n" out)270 (display-store (state-sto (trace-state t)) out)))))271272 (define/contract (display-env env out) (-> Env? output-port? void?)134273 (for ([def env])274 (type-case Binding def275 [bind (n v)276 (begin (display "\t\t" out) (display n out) (display " -> " out) (display-value v out)(display "\n" out))])))277278 (define/contract (display-store sto out) (-> Store? output-port? void?)279 (for ([c sto])280 (type-case Storage c281 [cell (l v)282 (begin (display "\t" out) (display l out) (display " -> " out) (display-value v out)(display "\n" out))])))283284 (define/contract (display-state s out) (-> State? output-port? void?)285 (type-case State s286 [state (c adv sto tin)287 (type-case Control c288 [interp-init ()289 (begin (display "(interp-init)" out))]290 [app-call (f args)291 (begin (display "(app-call " out) (display-value f out) (display ")" out))]292 [app-result (result)293 (begin (display "(app-return " out) (display-value result out) (display ")"out))])]))294295 (define/contract (display-advice a out) (-> Advice? output-port? void?)296 (display-value (aroundappsA-advice a) out))297298 (define/contract (display-advice-stack adv out) (-> AdvStack? output-port? void?)299 (begin (display "[" out) (display "\n" out)300 (map (lambda (a) (begin (display " " out) (display-advice a out) (display "," out) (display135"\n" out))) adv)301 (display "]")))302303 (define/contract (display-with-label label val out) (-> string? Value? output-port? void?)304 (begin (display label out) (display ": " out) (display-value val out) (newline out)))305306 ;; Main interpretation function307308 (define verbose-interp (box false))309 (define retroactive-error-checking (box true))310311 (define/contract (interp expr env adv sto tin) (-> ExprC? Env? AdvStack? Store? TraceIn? Result?)312 (begin313 (if (unbox verbose-interp)314 (begin315 (display "Expression: ") (display (exp-syntax expr)) (newline)316 (display-context env sto tin (current-output-port))317 (newline))318 '())319320 (type-case ExprC expr321322 ;; Numbers and arithmetic323324 [numC (n) (v*s*t*t (numV n) sto tin mt-traceout)]325326 [plusC (l r) (type-case Result (interp l env adv sto tin)327 [v*s*t*t (v-l s-l tin-l tout-l)328 (type-case Result (interp r env adv s-l tin-l)329 [v*s*t*t (v-r s-r tin-r tout-r)330 (v*s*t*t (num+ (deep-untag v-l) (deep-untag v-r)) s-r tin-r136331 (append-traceout tout-l tout-r))])])]332333 [multC (l r) (type-case Result (interp l env adv sto tin)334 [v*s*t*t (v-l s-l tin-l tout-l)335 (type-case Result (interp r env adv s-l tin-l)336 [v*s*t*t (v-r s-r tin-r tout-r)337 (v*s*t*t (num* (deep-untag v-l) (deep-untag v-r)) s-r tin-r338 (append-traceout tout-l tout-r))])])]339340 ;; Booleans and conditionals341342 [boolC (b) (v*s*t*t (boolV b) sto tin mt-traceout)]343344 [equalC (l r) (type-case Result (interp l env adv sto tin)345 [v*s*t*t (v-l s-l tin-l tout-l)346 (type-case Result (interp r env adv s-l tin-l)347 [v*s*t*t (v-r s-r tin-r tout-r)348 (v*s*t*t (boolV (equal-values (deep-untag v-l) (deep-untagv-r))) s-r tin-r349 (append-traceout tout-l tout-r))])])]350351 [ifC (c t f) (type-case Result (interp c env adv sto tin)352 [v*s*t*t (v-c s-c tin-c tout-c)353 (type-case Result (if (boolV-b (deep-untag v-c))354 (interp t env adv s-c tin-c)355 (interp f env adv s-c tin-c))356 [v*s*t*t (v-b s-b tin-b tout-b)357 (v*s*t*t v-b s-b tin-b (append-traceout tout-c tout-b))])])]358359 ;; Identifiers and abstractions360137361 [idC (n) (v*s*t*t (lookup n env) sto tin mt-traceout)]362363 [lamC (params b) (v*s*t*t (closV params b env) sto tin mt-traceout)]364365 [appC (f args) (type-case Result (interp f env adv sto tin)366 [v*s*t*t (v-f s-f tin-f tout-f)367 (type-case ResultList (map-expr-list (lambda (e s t) (interp e env adv s t))args s-f tin-f)368 [vs*s*t*t (v-args s-args tin-args tout-args)369 (prepend-trace (append-traceout tout-f tout-args)370 (apply-with-weaving v-f v-args adv adv s-argstin-args))])])]371372 [recC (f) (interp (appC z-combinator (list f)) env adv sto tin)]373374 [letC (s v in) (type-case Result (interp v env adv sto tin)375 [v*s*t*t (v-v s-v tin-v tout-v)376 (prepend-trace tout-v (interp in (cons (bind s v-v) env) adv s-vtin-v))])]377378 ;; Boxes and sequencing379380 [boxC (a) (type-case Result (interp a env adv sto tin)381 [v*s*t*t (v-a s-a tin-a tout-a)382 (let ([where (new-loc sto)])383 (v*s*t*t (boxV where)384 (override-store (cell where v-a) sto)385 tin-a386 tout-a))])]387388 [unboxC (a) (type-case Result (interp a env adv sto tin)138389 [v*s*t*t (v-a s-a tin-a tout-a)390 (v*s*t*t (fetch s-a tin-a v-a) s-a tin-a tout-a)])]391392 [setboxC (b val) (type-case Result (interp b env adv sto tin)393 [v*s*t*t (v-b s-b tin-b tout-b)394 (type-case Result (interp val env adv s-b tin-b)395 [v*s*t*t (v-v s-v tin-v tout-v)396 (type-case Value (deep-untag v-b)397 [boxV (l) (v*s*t*t (voidV)398 (override-store (cell l v-v) s-v)399 tin-v400 (append-traceout tout-b tout-v))]401 [traceValueV (v) (error 'retroactive-side-effect "attemptto retroactively set box")]402 [else (error 'interp "attempt to set-box! on anon-box")])])])]403404 [seqC (b1 b2) (type-case Result (interp b1 env adv sto tin)405 [v*s*t*t (v-b1 s-b1 tin-b1 tout-b1)406 (prepend-trace tout-b1 (interp b2 env adv s-b1 tin-b1))])]407408 [voidC () (v*s*t*t (voidV) sto tin mt-traceout)]409410 ;; Advice411412 [symbolC (s) (v*s*t*t (symbolV s) sto tin mt-traceout)]413414 [tagC (tag v)415 (interp-tag tag v env adv sto tin)]416417 [aroundappsC (advice extent)139418 (interp-aroundapps advice extent env adv sto tin)]419420 ;; Input/Output421422 [fileC (path) (interp (parse-file path) mt-env adv sto tin)]423424 [writeC (l a) (type-case Result (interp a env adv sto tin)425 [v*s*t*t (v-a s-a tin-a tout-a)426 (begin ((unbox write-sink) (string-append l ": " (value->string v-a)))427 (v*s*t*t (voidV) s-a tin-a tout-a))])]428429 [readC (l) (let* ([val ((unbox read-source) l)]430 [_ (record-interp-input val)])431 (v*s*t*t (numV val) sto tin mt-traceout))]))432 )433434 (define/contract (interp-tag tag v env adv sto tin) (-> ExprC? ExprC? Env? AdvStack? Store? TraceIn?Result?)435 (type-case Result (interp tag env adv sto tin)436 [v*s*t*t (v-tag s-tag tin-tag tout-tag)437 (type-case Result (interp v env adv s-tag tin-tag)438 [v*s*t*t (v-v s-v tin-v tout-v)439 (v*s*t*t (taggedV v-tag v-v) s-v tin-v (append-traceout tout-tag tout-v))])]))440441 (define/contract (interp-aroundapps advice extent env adv sto tin) (-> ExprC? ExprC? Env? AdvStack?Store? TraceIn? Result?)442 (type-case Result (interp advice env adv sto tin)443 [v*s*t*t (v-a s-a tin-a tout-a)444 (let ([new-adv (cons (aroundappsA v-a) adv)])445 (prepend-trace tout-a (interp extent env new-adv s-a tin-a)))]))446140447 (define-type ResultList448 [vs*s*t*t (vs (listof Value?)) (s Store?) (tin TraceIn?) (tout TraceOut?)])449 (define/contract (append-result rl r) (-> ResultList? Result? ResultList?)450 (type-case ResultList rl451 (vs*s*t*t (vs old-s old-tin old-tout)452 (type-case Result r453 (v*s*t*t (v s tin tout)454 (vs*s*t*t (append vs (list v)) s tin (append-traceout old-tout tout)))))))455456 (define/contract (map-expr-list f exprs sto tin) (-> (-> ExprC? Store? TraceIn? Result?) (listof ExprC?)Store? TraceIn? ResultList?)457 (let ([helper (lambda (e rl)458 (type-case ResultList rl459 [vs*s*t*t (vs s tin-rl tout-rl)460 (append-result rl (f e s tin-rl))]))])461 (foldl helper (vs*s*t*t '() sto tin mt-traceout) exprs)))462463 (define/contract (interp-exp exp) (-> ExprC? Value?)464 (v*s*t*t-v (interp exp mt-env mt-adv mt-store mt-tracein)))465466 (define/contract (app-chain exps) (-> (listof ExprC?) ExprC?)467 (foldl (lambda (next chained) (appC chained next)) (first exps) (rest exps)))468469 ;; Replay470471 (define interp-input (box '()))472473 (define (record-interp-input (x number?))474 (list-box-push! interp-input x))475 (define get-interp-input476 (lambda () (reverse (unbox interp-input))))141477478 (define-type RaplRecording479 [raplRecForReplay (program list?) (input list?)])480481 (define/contract (interp-with-recording exps recording-path) (-> (listof ExprC?) path-string? Result?)482 (let* ([result (interp-exp (app-chain exps))]483 [input (get-interp-input)]484 [recording (raplRecForReplay exps input)]485 [_ (write-struct-to-file recording recording-path)])486 result))487488 (define/contract (replay-interp recording-path) (-> path-string? Result?)489 (let* ([recording (read-struct-from-file recording-path)]490 [remaining-input (box (raplRecForReplay-input recording))]491 [_ (set-box! read-source (lambda (prompt) (list-box-pop! remaining-input)))])492 ((interp-exp (app-chain (raplRecForReplay-program recording))))))493494 ;; Tracing495496 (define/contract (interp-with-tracing exprs trace-path) (-> (listof ExprC?) path-string? Value?)497 (type-case Result (interp (app-chain exprs) mt-env mt-adv mt-store mt-tracein)498 [v*s*t*t (v s tin tout)499 (let ([trace (append (list (state (interp-init) mt-adv mt-store mt-tracein))500 (traceout-states tout)501 (list (state (app-result v) mt-adv s tin)))])502 (begin503 (if (file-exists? trace-path)504 (delete-file trace-path)505 (void))506 (write-struct-to-file trace trace-path)507 v))]))142508509 ;; TODO-RS: Gah, can't figure out how to get a hold of the current module510 (define rapl-ns (module->namespace (string->path"/Users/robinsalkeld/Documents/UBC/Code/rapl/rapl_interpreter.rkt")))511512 (define/contract (interp-query trace-path exprs) (-> path-string? (listof ExprC?) Value?)513 (let* ([_ (set-box! read-source (lambda (prompt) (error 'retroactive-side-effect "attempt toretroactively read input")))]514 [tin (tracein (read-struct-from-file rapl-ns trace-path))]515 [resume (resumeV "top-level thunk" (length (tracein-states tin)))])516 (type-case Result (interp (app-chain exprs) mt-env mt-adv mt-store mt-tracein)517 [v*s*t*t (v-a s-a tin-a tout-a)518 (type-case Result (apply-with-weaving v-a (list resume) mt-adv mt-adv s-a tin)519 [v*s*t*t (v-t s-t tin-t tout-t)520 (type-case Result (apply v-t (list) mt-adv s-t tin-t)521 [v*s*t*t (v-r s-r tin-r tout-r)522 (if (or #t (= (length (tracein-states tin-r)) 0))523 v-r524 (error 'interp-query "Trace not fully read ˜s" tin-r))])])])))525526 (define/contract (all-tags v) (-> Value? (listof Value?))527 (type-case Value v528 [taggedV (tag tagged)529 (cons tag (all-tags tagged))]530 [else empty]))531532 (define/contract (deep-tag tags v) (-> (listof Value?) Value? Value?)533 (foldr taggedV v tags))534535 (define/contract (is-trace-advice? a) (-> Advice? boolean?)536 (type-case Advice a143537 [aroundappsA (advice)538 (traceValueV? advice)]))539540 (define/contract (lift-advice a) (-> Advice? Advice?)541 (type-case Advice a542 [aroundappsA (advice)543 (aroundappsA (traceValueV advice))]))544545 (define/contract (new-trace-advice trace-adv adv) (-> AdvStack? AdvStack? AdvStack?)546 (if (empty? adv)547 (map lift-advice trace-adv)548 (if (is-trace-advice? (first adv))549 (new-trace-advice (rest trace-adv) (rest adv))550 (new-trace-advice trace-adv (rest adv)))))551552 (define/contract (without-trace-advice adv) (-> AdvStack? AdvStack?)553 (let ([result (filter (lambda (a) (not (is-trace-advice? a))) adv)])554 (begin ;(display "with trace advice: ") (display-list adv) (newline)555 ;(display "without trace advice: ") (display-list result) (newline)556 result)))557558 (define (display-list l)559 (begin (display "[") (newline)560 (map (lambda (e) (begin (display " ") (display e) (display ",") (newline))) l)561 (display "]")))562563 (define/contract (merge-advice-stacks trace-adv adv) (-> AdvStack? AdvStack? AdvStack?)564 (let ([result (append (reverse (new-trace-advice (reverse trace-adv) (reverse adv))) adv)])565 (begin ;(display "trace-adv: ") (display-list trace-adv) (newline)566 ;(display "adv: ") (display-list adv) (newline)567 ;(display "result: ") (display-list result) (newline)144568 result)))569570 ;; Without error checking571572 (define/contract (rw-resume-value-no-error v) (-> Value? Value?)573 (deep-tag (all-tags v) (resumeV "dummy" 0)))574575 (define/contract (rw-replay-call-no-error f args adv sto tin) (-> Value? (listof Value?) AdvStack?Store? TraceIn? Result?)576 (apply-with-weaving (rw-resume-value-no-error f) (map lift-trace-value args) (without-trace-adviceadv) adv sto tin))577578 (define/contract (rw-call-no-error args adv sto tin) (-> (listof Value?) AdvStack? Store? TraceIn?Result?)579 (rw-result-no-error adv sto (next-trace-state tin)))580581 (define/contract (rw-result-no-error adv sto tin) (-> AdvStack? Store? TraceIn? Result?)582 (type-case State (trace-state tin)583 [state (c adv-t sto-t tin-t)584 (type-case Control c585 [interp-init ()586 (error 'rw-result-no-error "Unexpected state")]587 [app-call (f args)588 (type-case Result (rw-replay-call-no-error f args (merge-advice-stacks adv-tadv) sto tin)589 (v*s*t*t (v-r s-r tin-r tout-r)590 (rw-result-no-error adv s-r (next-trace-state tin-r))))]591 [app-result (r)592 (v*s*t*t (lift-trace-value r) sto tin mt-traceout)])]))593594 ;; With error checking145595596 (define/contract (rw-replay-call f args adv sto tin) (-> Value? (listof Value?) AdvStack? Store?TraceIn? Result?)597 (type-case Result (rw-check-result598 (apply-with-weaving (rw-resume-value f tin)599 (map lift-trace-value args)600 (without-trace-advice adv) adv sto tin)601 tin)602 [v*s*t*t (v-r s-r tin-r tout-r)603 (v*s*t*t v-r s-r (next-trace-state tin-r) tout-r)]))604605 (define/contract (rw-check-result result tin-before) (-> Result? TraceIn? Result?)606 (type-case Result result607 [v*s*t*t (v-r s-r tin-r tout-r)608 (if (< (length (tracein-states tin-r)) (length (tracein-states tin-before)))609 (let ([r (app-result-r (state-c (trace-state tin-r)))])610 (if (equal-values v-r r)611 result612 (error 'retroactive-side-effect613 (format "incorrect retroactive result: expected\n ˜a but got\n ˜a" rv-r))))614 (error 'retroactive-side-effect "retroactive advice did not proceed"))]))615616 (define/contract (rw-resume-value v t) (-> Value? TraceIn? Value?)617 (let ([r (resumeV (value->string v) (length (tracein-states t)))])618 (deep-tag (all-tags v) r)))619620 (define/contract (rw-call pos passed adv sto tin)621 (-> number? (listof Value?) AdvStack? Store? TraceIn? Result?)622 (cond [(= pos (length (tracein-states tin)))623 (type-case Control (state-c (trace-state tin))146624 [interp-init ()625 (rw-result adv sto (next-trace-state tin))]626 [app-call (abs args)627 (if (andmap equal-values passed (map lift-trace-value args))628 (rw-result adv sto (next-trace-state tin))629 (error 'retroactive-side-effect630 (format "incorrect argument passed retroactively: expected\n ˜a butgot\n ˜a" args passed)))]631 [else (error 'rw-call "Unexpected state")])]632 [else (error 'retroactive-side-effect "retroactive advice proceeded out of order")]))633634 (define/contract (rw-result adv sto tin) (-> AdvStack? Store? TraceIn? Result?)635 (let* ([t-state (trace-state tin)]636 [_ (if (unbox verbose-interp)637 (begin638 (display "Weaving state: ") (display-state t-state (current-output-port)) (newline))639 '())])640 (type-case State t-state641 [state (c adv-t sto-t tin-t)642 (type-case Control c643 [interp-init () (error 'rw-result "Unexpected state")]644 [app-call (f args)645 (type-case Result (rw-replay-call f args (merge-advice-stacks adv-t adv) stotin)646 (v*s*t*t (v-r s-r tin-r tout-r)647 (prepend-trace tout-r (rw-result adv s-r tin-r))))]648 [app-result (r)649 (v*s*t*t (lift-trace-value r) sto tin mt-traceout)])])))147

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0367029/manifest

Comment

Related Items