Open Collections

UBC Theses and Dissertations

Characterizing and refactoring asynchronous JavaScript callbacks Gallaba, Keheliya 2015

Characterizing and refactoring asynchronous JavaScript callbacks

by

Keheliya Gallaba

B.Sc. Eng. (Hons), University of Moratuwa, 2011

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Master of Applied Science

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Electrical and Computer Engineering)

The University of British Columbia
(Vancouver)

December 2015

© Keheliya Gallaba, 2015

Abstract

Modern web applications make extensive use of JavaScript, which is now estimated to be one of the most widely used languages in the world. Callbacks are a popular language feature in JavaScript. However, they are also a source of comprehension and maintainability issues. We studied several features of callback usage across a large number of JavaScript applications and found that over 43% of all callback-accepting function call sites are anonymous, the majority of callbacks are nested, and more than half of all callbacks are invoked asynchronously.

Promises have been introduced as an alternative to callbacks for composing complex asynchronous execution flow and as a robust mechanism for error checking in JavaScript. We use our observations of callback usage to build a developer tool that refactors asynchronous callbacks into Promises. We show that our technique and tool are broadly applicable to a wide range of JavaScript applications.

Preface

The work presented in this thesis was conducted by the author under the supervision of Professor Ali Mesbah and Professor Ivan Beschastnikh.

I was responsible for implementing the tools, running the experiments, evaluating and analyzing the results, and writing the manuscript. My supervisors guided me in the creation of the experimental methodology and the analysis of results, and also edited and wrote portions of the manuscript.
Quinn Hanam and Amin Milani Fard also helped me by editing and writing portions of the manuscript.

The results of the empirical study of JavaScript callbacks were published in the Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) [26] in 2015. We are also proud to announce that our paper received the Best Full Paper Award at ESEM 2015.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
  1.1 Objectives
  1.2 Contributions
  1.3 Thesis Organization
2 Background
3 Related Work
4 Empirical Study
  4.1 Methodology
  4.2 Subject Systems
  4.3 Analysis
  4.4 Results
    4.4.1 Prevalence of Callbacks (RQ1)
    4.4.2 Callback Usage (RQ2)
    4.4.3 Solutions (RQ3)
  4.5 Threats to Validity
  4.6 Conclusions
5 Refactoring
  5.1 Background
  5.2 Exploratory Study: Refactoring Callbacks to Promises
    5.2.1 Finding Issues Related to Callbacks and Promises
    5.2.2 Exploring Refactoring Pull Requests
    5.2.3 Mining Commits
  5.3 Approach
    5.3.1 Identifying Functions with Asynchronous Callbacks
    5.3.2 Choosing a Refactoring Strategy
    5.3.3 Transforming the Asynchronous Function
    5.3.4 Transforming the Callback Function
    5.3.5 Transforming the Call Site
    5.3.6 Flattening Promise Consumers
  5.4 Semantic Equivalence of Transformations
    5.4.1 Scheduling Order Equivalence
    5.4.2 Function Scope Equivalence
    5.4.3 Intra-Procedural Control Flow Equivalence
    5.4.4 Inter-Procedural Control Flow Equivalence
    5.4.5 Data Flow Equivalence
    5.4.6 Equivalence of Nested Asynchronous Callbacks
  5.5 Implementation: PROMISESLAND
  5.6 Evaluation
    5.6.1 Detection Accuracy (RQ1)
    5.6.2 Refactoring Correctness (RQ2)
    5.6.3 Performance (RQ3)
  5.7 Discussion
  5.8 Conclusion
6 Conclusion and Future Work
  6.1 Future Work
Bibliography

List of Tables

Table 4.1 JavaScript subject systems in our study
Table 4.2 Example Asynchronous APIs available to JavaScript programs
Table 4.3 Top 10 Async.js invoked methods in JavaScript web applications (left) and NPM modules (right). The ∗ symbol denotes calls that do not appear in both tables.
Table 4.4 Percentage of subject systems creating and using Promises
Table 5.1 Refactoring strategies.
Table 5.2 Detection accuracy of the tool.
Table 5.3 Performance measurements of PROMISESLAND (in seconds)

List of Figures

Figure 2.1 The JavaScript event loop model.
Figure 4.1 Boxplots for percentage of callback-accepting function definitions and callsites per category, across client/server, and in total.
Figure 4.2 Boxplots for percentage of asynchronous callback-accepting function callsites.
Figure 4.3 Boxplots for percentage of anonymous callback-accepting function callsites per category, across client/server, and in total.
Figure 4.4 Instances of nested callbacks for a particular nesting level.
Figure 4.5 The distribution of total Promise usage and creation instances by category, across client/server, and in total.
Figure 5.1 Count of nested callbacks with at least 1 error-first callback in web applications.
Figure 5.2 Overview of our approach.
Figure 5.3 The sequence diagram on the left (a) shows the sequence for an asynchronous function that accepts a callback. The sequence diagram on the right (b) shows the sequence for an asynchronous function that returns a promise.
Figure 5.4 The bar chart on the left (a) shows the number of asynchronous callbacks converted into Dues by using the tool from [17] versus the number of asynchronous callbacks converted into Promises with PROMISESLAND. The bar chart on the right (b) shows the number of subject systems in which the tool from [17] was able to detect asynchronous callbacks for converting into Dues versus the number of subject systems in which PROMISESLAND was able to detect asynchronous callbacks for converting into Promises.

Acknowledgments

I would like to thank my advisors Ivan Beschastnikh and Ali Mesbah for their invaluable support. This work would not have been possible without the guidance, motivation, inspiration, and support provided by them.

I would also like to thank my family, especially my wife, Thamali, my sister, Sathya, and my parents, for their unlimited support and love. Thank you for encouraging me in all of my endeavors and inspiring me to follow my dreams. Knowing that you always wanted the best for me helped me to stay motivated.

To my friends and colleagues in SALT Lab, thank you for listening, offering me advice, and supporting me in various ways through this entire program.

Last but not least, I would also like to thank Prof. Karthik Pattabiraman for agreeing to be a part of my defence committee.

Chapter 1
Introduction

Callbacks, or higher-order functions, are available in many programming languages, for instance, as function-valued arguments (e.g., Python), function pointers (e.g., C++), and lambda expressions (e.g., Lisp). In this thesis we study JavaScript callbacks since JavaScript is the dominant language for building web applications.
For example, a recent survey of more than 26K developers conducted by Stack Overflow found that JavaScript is the most-used programming language [56]. The callback language feature in JavaScript is an important factor in its success. For instance, JavaScript callbacks are used to responsively handle events on the client-side by executing functions asynchronously. And, in Node.js (https://nodejs.org), a popular JavaScript-based framework, callbacks are used on the server-side to service multiple concurrent client requests.

Listing 1.1 illustrates three common challenges with callbacks. First, the three callbacks in this example (on lines 2, 3, and 5) are anonymous. This makes the callbacks difficult to reuse and to understand as they are not descriptive. Second, the callbacks in the example are nested to three levels, making it challenging to reason about the flow of control in the code. Finally, in line 5 there is a call to conn.query, which invokes the second callback parameter asynchronously. That is, the execution of the inner-most anonymous function (lines 6–7) is deferred until some later time. As a result, most of the complexity in this small example rests on extensive use of callbacks; in particular, the use of anonymous, nested, and asynchronous callbacks.

     1  var db = require('somedatabaseprovider');
     2  http.get('/recentposts', function(req, res){
     3    db.openConnection('host', creds, function(err, conn){
     4      res.param['posts'].forEach(function(post) {
     5        conn.query('select * from users where id=' + post['user'], function(err, results){
     6          conn.close();
     7          res.send(results[0]);
     8        });
     9      });
    10    });
    11  });

Listing 1.1: A representative JavaScript snippet illustrating the comprehension and maintainability challenges associated with nested, anonymous callbacks and asynchronous callback scheduling.

Though the issues outlined above have not been studied in detail, they are well-known to developers.
Searching for “callback hell” on the web brings up many articles with best practices on callback usage in JavaScript. For example, a prominent problem with asynchronous callbacks in JavaScript is error-handling. In JavaScript, an error that occurs during the execution of an asynchronous task cannot be handled with the traditional try/catch mechanism because the asynchronous task is run outside the existing call stack. For example, consider the code in Listing 1.2.

    try {
      setTimeout(function() {
        throw new Error("Uh oh!");
      }, 2000);
    } catch (e) {
      // never reached: the callback runs after the try block has exited
      console.log("Caught the error: " + e.message);
    }

Listing 1.2: A JavaScript snippet illustrating that try/catch statements are ineffective for handling errors in asynchronous callbacks.

An exception thrown by the callback passed to setTimeout will not be caught inside the catch block. Therefore, an error generated by an asynchronous function, such as setTimeout, can only be handled by passing it as a parameter to the callback function. The JavaScript community has come up with a convention for error propagation in asynchronous contexts called the error-first protocol. In this idiom-based protocol, the first parameter of the callback is reserved for communicating errors and the other parameters are used for passing data.

    var fs = require('fs');

    fs.readFile('/foo.txt', function(err, result) {
      if (err) {
        console.log('Unknown Error');
        return;
      }
      console.log(result);
    });

Listing 1.3: A JavaScript snippet illustrating the error-first protocol.

Consider Listing 1.3: an error generated by the asynchronous function fs.readFile is passed to the callback as an argument (err) and the callback must include appropriate error-handling code. A key limitation of the error-first protocol is that it is merely a convention and developers are not obligated to use it. As a result, developers must manually check whether a function follows the protocol, which is error-prone.
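To make the protocol concrete, the following is a minimal sketch of a function that follows the error-first convention, in the style of Listing 1.3. It is an illustration only: parseJsonLater is a hypothetical helper invented for this example, and setTimeout with a zero delay merely stands in for a real asynchronous operation.

```javascript
// Hypothetical helper (not from the thesis): parses JSON "later" and
// reports the outcome through an error-first callback.
function parseJsonLater(text, callback) {
  setTimeout(function () {
    var value;
    try {
      value = JSON.parse(text);
    } catch (err) {
      callback(err); // failure: the first argument carries the error
      return;
    }
    callback(null, value); // success: first argument is null, data follows
  }, 0);
}

// The caller is responsible for checking the first argument.
parseJsonLater('{"id": 7}', function (err, result) {
  if (err) {
    console.log('failed: ' + err.message);
    return;
  }
  console.log('id = ' + result.id);
});

parseJsonLater('not json', function (err, result) {
  if (err) {
    console.log('failed to parse input');
    return;
  }
  console.log(result);
});
```

The success call is deliberately placed outside the try block, so that an exception thrown by the caller's own callback is not mistaken for a parse error and the callback is never invoked twice. Nothing in the language enforces this discipline, which is the limitation noted above.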
As this is a best practice and adhering to this protocol is optional, it is also not clear to what extent developers use it in practice.

Promises are a new feature of ECMAScript 6 that are designed to help with the error handling and nesting problems associated with asynchronous callbacks. The ECMAScript 6 specification was approved in 2015 [7] and all the major JavaScript runtimes support promises [10]. Promises explicitly register handlers for executions that are successful and executions that produce errors. This removes the need for the error-first convention and separates the success handler from the error handler. Promises can also be chained together, which flattens nested callbacks and makes them easier to understand.

1.1 Objectives

The main objective of our research is twofold:

• Gaining an understanding of JavaScript callback usage in practice.
  Although callbacks are a key JavaScript feature, they have not received much attention in the research community. We think that this is a critical omission as the usage of callbacks is an important factor in developer comprehension and maintenance of JavaScript code.

• Devising automated JavaScript refactoring techniques to mitigate callback-related challenges.
  New JavaScript language features, such as Promises, are being proposed to help with problems associated with asynchronous callbacks. But there is no mention or use of refactoring tools in the developer community to transform existing callbacks to Promises. We investigate the feasibility of a novel automated JavaScript refactoring technique for this purpose.

1.2 Contributions

This thesis makes the following main contributions:

• A systematic methodology and tool for analyzing JavaScript code statically to identify callbacks and to measure their various features (Section 4.3).

• An empirical study to characterize JavaScript callback usage across 138 large JavaScript projects. These include 86 Node.js modules from the NPM public package registry used in server-side code and 52 subject systems from a broad spectrum of categories, such as JavaScript MVC frameworks, games, and data visualization libraries. We found that on average, every 10th function definition takes a callback argument, and that over 43% of all callback-accepting function callsites are anonymous. Furthermore, the majority of callbacks are nested, more than half of all callbacks are asynchronous, and asynchronous callbacks, on average, appear more frequently in client-side code (72%) than server-side (55%) (Section 4.4).

• A discussion of the implications of our empirical findings.

• An exploratory study in which we search for and examine several GitHub issues and pull requests containing terms related to refactoring of asynchronous callbacks into promises. We found that developers frequently want to refactor existing code that uses asynchronous callbacks into code that uses promises (GitHub search returned over 4K issues related to this topic). By contrast, GitHub search returned only 451 pull requests related to this topic, a small number of actual transformations compared to the number of requests. This study provides support for the utility of an automated refactoring tool (Section 5.2).

• A set of static analysis techniques and a tool that support automated refactoring by: (1) discovering instances of asynchronous callbacks and (2) transforming instances of asynchronous callbacks into promises (Section 5.3).

• An evaluation of the refactoring tool pointing to the real-world relevance and efficacy of our techniques (Section 5.6).

1.3 Thesis Organization

In Chapter 2 of this thesis, we provide background information regarding JavaScript applications, particularly with respect to the use of callbacks in JavaScript, along with the motivation to conduct this research. Chapter 3 discusses the related work in this area of study.
Chapter 4 describes in detail the empirical study we carried out to characterize the real-world use of callbacks in different types of JavaScript programs, the results found from this study, and the implications these results can have with respect to web developers, tool developers and the research community. In Chapter 5, we present the tool we developed to detect instances of asynchronous callbacks and to refactor such callbacks. The evaluation results of the tool on a wide range of JavaScript applications are also presented in this chapter. Finally, Chapter 6 concludes our work and presents future research directions.

Chapter 2
Background

A callback is a function that is passed as an argument to another function, which is expected to invoke it either immediately or at some point in the future. Callbacks can be seen as a form of the continuation-passing style (CPS) [57], in which control is passed explicitly in the form of a continuation; in this case the callback passed as an argument represents a continuation.

Synchronous and asynchronous callbacks. There are two types of callbacks. A callback passed to a function f can be invoked synchronously before f returns, or it can be deferred to execute asynchronously some time after f returns.

JavaScript uses an event-driven model with a single thread of execution. Programming with callbacks is especially useful when a caller does not want to wait until the callee completes. To this end, the desired non-blocking operation is scheduled as a callback and the main thread continues its synchronous execution. When the operation completes, a message is enqueued into a task queue along with the provided callback. The event loop gives priority to the call stack: the single thread first runs the code on the stack to completion, and only when the stack is empty does the event loop dequeue a message from the task queue and execute the corresponding callback function.
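The scheduling order described above can be mimicked with a toy model. The sketch below is an illustration under simplifying assumptions, not how a real runtime is implemented: the task queue is a plain array, runEventLoop is a hypothetical stand-in for the runtime's event loop, and getLater is an invented function that enqueues its callback instead of performing real I/O.

```javascript
// Toy model of the call stack / task queue interplay (illustrative only).
const taskQueue = []; // pending callbacks, served first-in first-out
const log = [];       // records the order in which things happen

// Synchronous callback: cb runs before forEachSync returns.
function forEachSync(items, cb) {
  for (const item of items) cb(item);
}

// "Asynchronous" callback: cb is only enqueued here; it runs later,
// when the event loop dequeues it.
function getLater(value, cb) {
  taskQueue.push(() => cb(value));
}

// Stand-in for the event loop: runs queued tasks once the stack is empty.
function runEventLoop() {
  while (taskQueue.length > 0) {
    const task = taskQueue.shift();
    task();
  }
}

function main() {
  log.push("main: start");
  getLater("response", (v) => log.push("async callback: " + v));
  forEachSync([1, 2], (n) => log.push("sync callback: " + n));
  log.push("main: end");
}

main();         // the call stack runs to completion first
runEventLoop(); // the stack is now empty, so queued callbacks run
console.log(log.join("\n"));
// main: start
// sync callback: 1
// sync callback: 2
// main: end
// async callback: response
```

Although getLater is called before forEachSync, its callback is the last thing to run, because it has to wait in the queue until main has returned.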
Figure 2.1 illustrates the event loop model of JavaScript for a non-blocking HTTP get call with a callback named cb.

[Figure 2.1: The JavaScript event loop model. The diagram shows the call stack (main(), getOrders(), get(…, cb)), the system (browser/Node.js), the task queue, and the event loop, with three steps: 1. dequeue first task and execute; 2. register cb; 3. poll for next task when the stack is empty. When the HTTP call returns, a message with cb is enqueued.]

Named and anonymous callbacks. JavaScript callbacks can be named functions or anonymous functions (e.g., lines 2, 3, or 5 of Listing 1.1). Each approach has its tradeoffs. Named callbacks can be reused and are easily identified in stack traces or breakpoints during debugging activities. However, naming causes the callback function to persist in memory and prevents it from being garbage collected. On the other hand, anonymous callbacks are more resource-friendly because they are marked for garbage collection immediately after being executed. However, anonymous callbacks are not reusable and may be difficult to maintain, test, or debug.

Nested callbacks. Developers often need to combine several callback-accepting functions together to achieve a certain task. For example, two callbacks have to be nested if the result of the first callback needs to be passed into the second callback in a non-blocking way (see lines 2-8 in Listing 2.2). This structure becomes more complex when the callbacks need to be conditionally nested. Control-flow composition with nested callbacks increases the complexity of the code. The term ‘callback hell’ [49] has been coined by developers to voice their common frustration with this complexity.

Error-first callbacks. In synchronous JavaScript code the throw keyword can be used to signal an error and try/catch can be used to handle the error. When there is asynchrony, however, it may not be possible to handle an error in the context it is thrown. Instead, the error must be propagated asynchronously to an error handler in a different context.
Callbacks are the basic mechanism for delivering errors asynchronously in JavaScript. Because there is no explicit language support for asynchronous error signaling, the developer community has proposed a convention: dedicate the first argument in the callback to be a permanent place-holder for error signaling.

More exactly, the error-first callback protocol specifies that during a callback invocation, either the error argument is non-null (indicating an error), or the first argument is null and the other arguments contain data (indicating success), but not both [1]. Listing 2.1 shows an example of this protocol. If an error occurs while reading the file (line 4), the anonymous function will be called with the error as the first argument. For this error to be handled, it will be propagated by passing it as the first argument of the callback (in line 5). The error can then be handled at a more appropriate location (lines 17–21). When there is no error, the program continues (line 8) and invokes the callback with the first (error) argument set to null.

     1  var fs = require('fs');
     2  // read a file
     3  function read_the_file(filename, callback) {
     4    fs.readFile(filename, function(err, contents) {
     5      if (err) return callback(err);
     6
     7      // if no error, continue
     8      read_data_from_db(null, contents, callback);
     9    });
    10  }
    11
    12  function read_data_from_db(err, contents, callback) {
    13    //some long running task
    14  }
    15
    16  read_the_file('/some/file', function(err, result) {
    17    if (err) {
    18      //handle the error
    19      console.log(err);
    20      return;
    21    }
    22    // do something with the result
    23  });

Listing 2.1: Error-first callback protocol.

The error-first callback protocol is intended to simplify exception handling for developers; if there is an error, it will always propagate as the first argument through the API, and API clients can always check the first argument for errors. But there is no automatic checking of the error-first protocol in JavaScript. It is therefore unclear how frequently developers adhere to this protocol in practice.

Handling callbacks. A quick search in the NPM repository (https://www.npmjs.com/) reveals that there are over 4,500 modules to help developers with asynchronous JavaScript programming. Two solutions that are gaining traction in helping developers handle callbacks are libraries, such as Async.js [40], and new language features, such as Promises [12]. We now briefly detail these two approaches.

The Async.js library exposes an API to help developers manage callbacks. For example, Listing 2.2 shows how nested callbacks in vanilla JavaScript (lines 1–8) can be expressed using the waterfall method available in Async.js (lines 11–18).

Promises are a JavaScript language extension. A Promise-based function takes some input and returns a promise object representing the result of an asynchronous operation. A promise object can be queried by the developer to answer questions like “were there any errors while executing the async call?” or “has the data from the async call arrived yet?” A promise object, once fulfilled, can notify any function that depends on its data. Listing 2.2 illustrates how nested callbacks in vanilla JavaScript (lines 1–8) can be re-expressed using Promises (lines 20–24). Promises are described in further detail in Section 5.1.

     1  // Before: nested callbacks
     2  $("#button").click(function() {
     3    promptUserForTwitterHandle(function(handle) {
     4      twitter.getTweetsFor(handle, function(tweets) {
     5        ui.show(tweets);
     6      });
     7    });
     8  });
     9
    10  // After: Using Async.js waterfall method
    11  $("#button").click(function() {
    12    async.waterfall([
    13      promptUserForTwitterHandle,
    14      twitter.getTweetsFor,
    15      ui.show
    16    ]
    17    , handleError);
    18  });
    19
    20  // After: sequential join of callbacks with Promises
    21  $("#button").clickPromise()
    22    .then(promptUserForTwitterHandle)
    23    .then(twitter.getTweetsFor)
    24    .then(ui.show);

Listing 2.2: Rewriting nested callbacks using Async.js or Promises.

Chapter 3
Related Work

Callback-related issues are a recurrent discussion topic among developers [49]. However, to the best of our knowledge, there have been no empirical studies of callback usage in practice.

JavaScript applications. The dynamic behaviour of JavaScript applications was studied by Richards et al. [52]. They found that commonly made assumptions about dynamism in JavaScript are violated in at least some real-world code. A similar study was conducted by Martinsen et al. [39]. Richards et al. [53] studied the prevalence of eval. They found eval to be pervasive, and argued that in most usage scenarios, it could be replaced with equivalent and safer code or language extensions.

Ocariza et al. [47] conducted an empirical study to characterize root causes of client-side JavaScript bugs. Since this study, server-side JavaScript, on top of Node.js, has gained traction among developers. Our study considers callback usage in both client- and server-side JavaScript code.

Security vulnerabilities in JavaScript code have also been studied. Examples include studies on remote JavaScript inclusions [46, 62], cross-site scripting (XSS) [61], and privacy-violating information flows [33]. Parallelism in JavaScript code was studied by Fortuna et al. [25].

Milani Fard et al. [43] studied code smells in JavaScript code. In their list of JavaScript smells, they included nested callbacks, but only focus on callbacks in client-side code.
Decofun [20] is a JavaScript function de-anonymizer. It parses the code and names any detected anonymous function according to its context.

Although JavaScript is a challenging language for software engineering, recent research advances have made the use of static analysis on JavaScript more practical [15, 23, 34, 35, 38, 45, 48, 55]. Other techniques mitigate the analysis challenges by using a dynamic or hybrid approach [14, 28, 60]. Others have considered improving the core language through abstraction layers [59].

Asynchronous programming. Okur et al. [50] recently conducted a large-scale study on the usage of asynchronous programming in C# applications. They found that callback-based asynchronous idioms are heavily used, but new idioms that can take advantage of the async/await keywords are rarely used in practice. They also studied how developers (mis)use some of these new language constructs. Our study similarly covers the usage of callbacks and new language features such as Promises to enhance asynchronous programming. However, our work considers JavaScript code and delves deeper. For example, we characterize the usage of callback nesting and anonymous callbacks, which are known to cause maintenance problems.

Concurrency bug detection. EventRacer [51] detects data races in JavaScript applications. Zheng et al. [63] propose a static analysis method for detecting concurrency bugs caused by asynchronous calls in web applications. Similarly, WAVE [32] is a tool for identifying concurrency bugs by looking for the same sequence of user events leading to different final DOM-trees of the application.

Program comprehension. Clematis [13] is a technique for helping developers understand complex event-based and asynchronous interactions in JavaScript code by capturing low-level interactions and presenting those as higher-level behavioral models. Theseus [37] is an IDE extension that helps developers to navigate asynchronous and dynamic JavaScript execution.
Theseus has some limitations; for example, it does not support named callbacks. Our study demonstrates that over half of all callbacks are named, indicating that many applications will be negatively impacted by similar limitations in existing tools.

JavaScript Refactoring Tools. The closest work to ours is by Brodu et al. [17], who propose a compiler for converting nested callbacks (or an imbrication of continuations) into a sequence of Dues, a simpler version of Promises. There are several drawbacks to this approach: (1) the source code does not change, so the issues with understandability remain; (2) Dues do not support the critical promise notions of rejection and fulfillment, so the compiler can only rewrite error-first protocol callbacks in a simplified notation, without using the error-handling body of the callback; and (3) the approach requires developer intervention to specify which asynchronous callbacks are suitable candidates for conversion. In contrast, we introduce a tool that is completely automated.

A number of other JavaScript refactoring tools have been previously proposed. For example, Meawad et al. [41] proposed a tool to refactor eval statements into safer code, and Feldthaus et al. [21, 22] developed a technique for semi-automatic refactoring with a focus on renaming.
None of these works focuses on detecting or refactoring asynchronous callbacks to Promises.

Chapter 4
Empirical Study

4.1 Methodology

To characterize callback usage in JavaScript applications, we focus on the following three research questions.

RQ1: How prevalent are callbacks?
RQ2: How are callbacks programmed and used?
RQ3: Do developers use external APIs to handle callbacks?

Our analyses are open source [4] and all of our empirical data is available for download [9].

4.2 Subject Systems

We study 138 popular open source JavaScript subject systems from six distinct categories: NPM modules (86), web applications (16), game engines (16), client-side frameworks (8), visualization libraries (6), and games (6). NPM modules are used only on the server-side. Client-side frameworks, visualization libraries, and games include only client-side code. Web applications and game engines include both client-side and server-side code. Table 4.1 presents these categories, whether or not the category includes client-side and server-side application code, and the aggregate number of JavaScript files and lines of JavaScript code that we analyze for each application category.

Table 4.1: JavaScript subject systems in our study

    Category        Subject   Client   Server   Total     Total
                    systems   side     side     files     LOC
    NPM Modules        86              X         4,983    1,228,271
    Web Apps.          16     X        X         1,779      494,621
    Game Engines       16     X        X         1,740    1,726,122
    Frameworks          8     X                  2,374      711,172
    DataViz Libs.       6     X                  3,731      958,983
    Games               6     X                    347      119,279
    Total             138     X        X        14,954    5,238,448

The 86 NPM modules we study are the most depended-on modules in the NPM repository [2]. The other subject systems were selected from GitHub Showcases [3], where popular and trending open source repositories are organized around different topics. The subject systems we consider are JavaScript-only systems. Those systems that contain server-side components are written in Node.js (https://nodejs.org), a popular server-side JavaScript framework.
Overall, we study callbacks in over 5 million lines of JavaScript code.

¹ https://nodejs.org

4.3 Analysis

To address the three research questions, we have developed a static analyzer to search for different patterns of callbacks in JavaScript code².

² Our static analysis approach considers most program paths, but it does not handle cases like eval that require dynamic analysis.

Our static analyzer builds on prior JavaScript analysis tools, such as Esprima [31] to parse and build an AST, Estraverse [58] to traverse the AST, and TernJS [30], which applies a type inference technique by Hackett and Guo [29], to query for function type arguments. We also developed a custom set of analyses to identify callbacks and to measure their various properties of interest.

Duplicate code is an issue for any source code analysis tool that measures the prevalence of some language feature. Our analysis maintains a set of dependencies for a subject system and guarantees that each dependency is analyzed exactly once. To resolve dependencies expressed with require() calls, we use TernJS and its Node plugin.

In the rest of this section we detail our analysis for each of the three research questions.

Prevalence of callbacks (RQ1). To investigate the prevalence of callbacks in JavaScript code we consider all function definitions and function callsites in each subject system.
For each subject, we compute (1) the percentage of function definitions that accept callbacks as arguments, and (2) the percentage of function callsites that accept callbacks as arguments.

We say that f is a callback-accepting function if we find that at least one argument to f is used as a callback.

To determine whether a parameter p of a function f definition is a callback argument, we use a three-step process: (1) if p is invoked as a function in the body of f then p is a callback; (2) if p is passed to a known callback-accepting function (e.g., setTimeout) then p is a callback; (3) if p is used as an argument to an unknown function f′, then recursively check if p is a callback parameter in f′.

Note that for f to be a callback-accepting function, it is insufficient to find an invocation of f with a function argument p. To be a callback-accepting function, the argument p must be invoked inside f, or p must be passed down to some other function where it is invoked.

We do not analyze a subject's dependencies, such as libraries, to find callbacks³. The only time we analyze external libraries is when a subject system calls into a library and passes a function as an argument. In this case our analysis checks whether the passed function is invoked as a callback in the library code.

³ For instance, JavaScript files under the node_modules directory are excluded.

We also study whether callback usage patterns are different between server-side and client-side JavaScript code. Categorizing the projects known as purely client-side (MVC frameworks like Ember, Angular) or purely server-side (NPM modules) is easy. However, some projects contain both server-side and client-side JavaScript code. To distinguish client-side code from server-side code, we use the project directory structure. We assume that client-side code is stored in a directory named www, public, static, or client.
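The directory-name assumption can be illustrated with a minimal sketch. The helper name classifyPath, the 'server' fallback for all remaining files of a mixed project, and the example paths are hypothetical and introduced only for illustration; they are not taken from the analysis toolset.

```javascript
// Directory names that the text above treats as markers of client-side code.
var CLIENT_DIRS = ['www', 'public', 'static', 'client'];

// Hypothetical helper (not part of the thesis toolset): classify a file of a
// mixed client/server project by the directories on its path.
// Assumption: anything that is neither a dependency nor under a client
// directory is treated as server-side code.
function classifyPath(filePath) {
  // Split on POSIX and Windows separators and drop the file name itself.
  var dirs = filePath.split(/[\\/]/).slice(0, -1);
  if (dirs.indexOf('node_modules') !== -1) return 'excluded';
  for (var i = 0; i < dirs.length; i++) {
    if (CLIENT_DIRS.indexOf(dirs[i]) !== -1) return 'client';
  }
  return 'server';
}

console.log(classifyPath('public/js/app.js'));                // client
console.log(classifyPath('src/routes/index.js'));             // server
console.log(classifyPath('node_modules/async/lib/async.js')); // excluded
```

A path-based check of this kind is cheap but can misclassify projects that use other directory names, which is one reason a second signal is useful.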
We also identify client-side code through developer code annotations (e.g., /* jshint browser:true, jquery:true */).

     1  function getRecord(id, callback) {
     2    http.get('http://foo/' + id, function (err, doc) {
     3      if (err) {
     4        return callback(err);
     5      }
     6      return callback(null, doc);
     7    });
     8  }
     9
    10  var logStuff = function () { ... }
    11  getRecord('007', logStuff);

Listing 4.1: An example of an asynchronous anonymous callback.

For example, consider the code in Listing 4.1. To check if the getRecord() function invocation in line 11 takes a callback, we analyze its definition in line 1. We find that there is a path from the start of getRecord() to the callback invocation (which happens to be the logStuff argument provided to the getRecord() invocation). The path is: getRecord() → http.get() → Anonymous1() → callback(). Therefore, logStuff() is a callback function because it is passed as a function argument and it is invoked in that function. This makes getRecord() a callback-accepting function.

We label a callsite as callback-accepting if it corresponds to (1) a function that we determine to be callback-accepting, as described above, or (2) a function known a priori to be callback-accepting (e.g., setTimeout()). The getRecord callsite in line 11 is callback-accepting because the function getRecord was determined to be callback-accepting.

Callback usage in practice (RQ2). Developers use callbacks in different ways. Some of these are known to be problematic [49] for comprehension and maintenance (e.g., see Chapter 2). We characterize callback usage in three ways: we compute (1) the percentage of callbacks that are anonymous versus named, (2) the percentage of callbacks that are asynchronous, and (3) the callback nesting depth.

Anonymous versus named callbacks.
If a function callsite is identified as callback-accepting, and an anonymous function expression is used as an argument, we call it an instance of an anonymous callback.

Asynchronous callbacks. A callback passed into a function can be deferred and invoked at a later time. This deferral happens through known system APIs, which deal with the task queue in common browsers and Node.js environments. Our analysis detects a variety of APIs, including DOM events, network calls, timers, and I/O. Table 4.2 lists examples of these asynchronous APIs.

Table 4.2: Example Asynchronous APIs available to JavaScript programs

Category              Examples                                      Availability
DOM events            addEventListener, onclick                     Browser
Network calls         XMLHttpRequest.open                           Browser
Timers (macro-task)   setImmediate(), setTimeout(), setInterval()   Browser, Node.js
Timers (micro-task)   process.nextTick()                            Node.js
I/O                   APIs of fs, net                               Node.js

For each function definition, if a callback argument is passed into a known deferring function call, we label this callback as asynchronous.

Callback nesting. The callback depth is defined as the number of callbacks nested one inside the other. According to this definition, Listing 1.1 has a callback depth of three because of callbacks at lines 2, 3, and 5.

In cases where there are multiple instances of nested callbacks in a function, we count the maximum depth of nesting, including nesting in conditional branches. This under-approximates the number of nested callback instances. For example, Listing 4.2 shows an example code snippet from an open source application⁴. There are two sets of nested callbacks in this example, namely, at Lines 1, 4, 5 (depth of three) and a second set at Lines 1, 12, 13, 15 (depth of four). Our analysis computes the maximum nesting depth for this function, which is four.

Handling callbacks (RQ3). There are a number of solutions to help with the complexities associated with callbacks.
We consider three well-known solutions and measure (1) the prevalence of the error-first callback convention, (2) the prevalence and usage of Async.js [40], a popular control flow library, and (3) the prevalence of Promises [12], a recent language extension that provides an alternative to using callbacks. For each solution, we characterize its usage: are developers using the solution, and to what extent?

⁴ https://github.com/NodeBB/NodeBB

     1  define('admin/general/dashboard', 'semver', function (semver) {
     2    var Admin = {};
     3
     4    $('#logout-link').on('click', function () {
     5      $.post(RELATIVE_PATH + '/logout', function () {
     6        window.location.href = RELATIVE_PATH + '/';
     7      });
     8    });
     9
    10    ...
    11
    12    $('.restart').on('click', function () {
    13      bootbox.confirm('Are you sure you wish to restart NodeBB?', function (confirm) {
    14        if (confirm) {
    15          $(window).one('action:reconnected', function () {
    16            app.alert({ alert_id: 'instance_restart', });
    17          });
    18
    19          socket.emit('admin.restart');
    20        }
    21      });
    22    });
    23    return Admin;
    24  });

Listing 4.2: Example of multiple nested callbacks in one function.

Error-first callbacks. To detect error-first callbacks (see Chapter 2) we use a heuristic. We check if the first parameter p of a function f definition has the name 'error' or 'err'. Then, we check if f's callsites also contain 'error' or 'err' as their first argument. Thus, our analysis counts the percentage of function definitions that accept an error as the first argument, as well as the percentage of function callsites that are invoked with an error passed as the first argument.

Async.js. If a subject system has a dependency on Async.js (e.g., in its package.json), we count all invocations of the Async.js APIs in the subject's code.

Promises. We count instances of Promise creation and consumption for each subject system. If a new expression returns a Promise object (e.g., new Promise()), it is counted as a Promise creation.
Invocations of the then() method on a Promise object are counted as Promise consumption; this method is used to attach callbacks to a promise. These callbacks are then invoked when a promise changes its state (i.e., evaluates to success or an error).

4.4 Results

4.4.1 Prevalence of Callbacks (RQ1)

In the subject systems that we studied, on average, 10% of all function definitions and 19% of all function callsites were callback-accepting. Figure 4.1 depicts the percentage of callback-accepting function definitions and callsites, per category of systems (Table 4.1), and in aggregate. The figure also shows how these are distributed across client-side and server-side, indicating that server-side code generally contains more functions that take callbacks than client-side code.

Finding 1: On average, every 10th function definition takes a callback argument. Callback-accepting function definitions are more prevalent in server-side code (10%) than in client-side code (4.5%).

Finding 2: On average, every 5th function callsite takes a callback argument. Callback-accepting function callsites are more prevalent in server-side code (24%) than in client-side code (9%).

Implications. Callbacks are utilized across all subject categories we considered. Some categories contain a higher degree of callback usage. For example, the web applications category contained a higher fraction of both callback-accepting function definitions and invocations of such functions. The inter-category differences in callback usage were not large, however, and we believe that these differences can be ascribed to the fact that some categories contain more server-side code than others.
We believe the more extensive usage of callbacks in server-side code can be attributed to the continuation-passing style of programming that was advocated in the Node.js community from its inception.

Figure 4.1: Boxplots for percentage of callback-accepting function definitions and callsites per category, across client/server, and in total. (Groups: DataViz, Engines, Frameworks, Games, WebApps, NPM, Client, Server, Total; y-axis: Percentage; series: Callsites, Definitions.)

4.4.2 Callback Usage (RQ2)

Asynchronous Callbacks

As a reminder, an asynchronous callback is a callback that is eventually passed to an asynchronous API call like setTimeout() (see Section 4.3 for more details).

Figure 4.2 shows the prevalence of asynchronous callback-accepting function callsites. Across all applications, both the median and the mean percentage of callsites with asynchronous callbacks were 56%. The figure also partitions the data into the client-side and server-side categories. We find that usage of asynchronous callbacks is higher in client-side code. Of all callback-accepting callsites in client-side code, 72% took asynchronous callbacks. On the server-side, asynchronous callback usage is 55% on average.

Figure 4.2: Boxplots for percentage of asynchronous callback-accepting function callsites. (Groups: DataViz, Engines, Frameworks, Games, WebApps, NPM, Client, Server, Total; y-axis: Percentage.)

Finding 3: More than half of all callbacks are asynchronous. Asynchronous callbacks, on average, appear more frequently in client-side code (72%) than server-side code (55%).

Implications. The extensive use of asynchrony in the subjects we studied indicates that program analysis techniques that ignore the presence of asynchrony are inapplicable and may lead to poor results. Analyses of JavaScript must account for asynchrony.

The extensive use of asynchronous scheduling paired with callbacks surprised us. The asynchronous programming style significantly complicates program control flow and impedes program comprehension [13]. Yet there are few tools to help with this.
We think that the problem of helping developers reason about large JavaScript code bases containing asynchronous callbacks, both on the client-side and the server-side, deserves more attention from the research community.

Figure 4.3: Boxplots for percentage of anonymous callback-accepting function callsites per category, across client/server, and in total. (Groups: DataViz, Engines, Frameworks, Games, WebApps, NPM, Client, Server, Total; y-axis: Percentage.)

Anonymous Callbacks

Figure 4.3 shows the prevalence of anonymous callback-accepting function callsites. The median percentage across the categories ranges from 23% to 48%, which is fairly high considering that anonymous callbacks are difficult to understand and maintain. Figure 4.3 also shows the same data partitioned between client-side and server-side code, and indicates that server-side code contains a slightly higher percentage of anonymous callbacks than client-side code.

Finding 4: Over 43% of all callback-accepting function callsites are invoked with at least one anonymous callback. There is little difference between client-side and server-side code in the extent to which they use anonymous callbacks.

Implications. This finding indicates that in a large fraction of cases, a callback is used once (anonymously) and is never reused. It seems that developers find anonymous callbacks useful in spite of the associated comprehension, debugging, and testing challenges. We think that this topic deserves further study: it is important to understand why developers use anonymous callbacks and prefer them over named callbacks. Possible reasons for using anonymous callbacks could be code brevity, or creating temporary local scopes (e.g., in closures). We also think that the high fraction of anonymous callbacks indicates that this popular language feature is here to stay.
Therefore, it is worthwhile for the research community to invest time in developing tools that will support developers in handling anonymous callbacks.

Nested Callbacks

Figure 4.4 presents our results for the total number of instances of nested callbacks at each observed nesting level. We found that the majority of callbacks nest two levels deep. Figure 4.4 shows this unusual peak at nesting level of 2. We also found that callbacks are nested up to a depth of 8 (there were 29 instances of nesting at this level). In these extreme cases developers compose sequences of asynchronous callbacks with result values that flow from one callback into the next. These extreme nesting examples are available as part of our dataset [9].

Figure 4.4: Instances of nested callbacks for a particular nesting level. (x-axis: Nesting Level, 1 to 8; y-axis: No. of instances, 0 to 24,000.)

Finding 5: Callbacks are nested up to a depth of 8. There is a peak at nesting level of 2.

    1  $(document).ready(function () {
    2    $('.star').click(function (e) {
    3      ...
    4    })
    5  })

Listing 4.3: Example of a nested callback introduced by the wrapping $(document).ready() function from the jQuery library.

Implications. As with anonymous and asynchronous callbacks, callback nesting taxes code readability and comprehension. We find that nesting is widely used in practice and note that developers lack tools to manage callback nesting. We believe that there is ample opportunity in this area for tool builders and software analysis experts. The number of instances decreases from level 1 to 8, except at level 2. Based on our investigation of numerous level 2 callback nesting examples, we believe that the peak at level 2 is due to a common JavaScript practice in which project code is surrounded with an anonymous function from an external library. This is used, for example, for module exporting, loading, or to wait for the DOM to be fully loaded on the client-side.
Due to the extra callback surrounding the project code, in these types of projects, callbacks begin nesting at level 2. Listing 4.3 lists an example of this kind of nesting with the $(document).ready() function (line 1) from the popular jQuery library. This function waits for the DOM to load. It increases the callback nesting in the rest of the code by 1 (e.g., the callback on line 2 has a nesting level of 2).

4.4.3 Solutions (RQ3)

Error-first Protocol

We found that 20% of all function definitions follow the error-first protocol. The median percentage across the categories ranges from 4% to 50%. The fraction of function definitions that adhere to the error-first protocol is almost twice as high in the server-side code (30%) as in the client-side code (16%). In addition, the error-first protocol was the most common solution among the three solutions we considered. For example, 73% (63 out of 86) of NPM modules and 93% (15 out of 16) of web applications had instances of the error-first protocol.

Finding 6: Overall, every 5th function definition adheres to the error-first protocol. The error-first protocol is used twice as often in server-side code as in client-side code.

Implications. Although we found that a non-trivial fraction of JavaScript functions rely on the error-first protocol, it remains an ad-hoc solution that is loosely applied. The relatively low and highly variable use of the error-first protocol means that developers must check adherence manually and cannot depend on APIs and libraries to enforce it. Such idiom-based strategies for handling exceptions are known to be error-prone in other languages, such as C [18]. It would be interesting to study if this is also the case for JavaScript functions that follow the error-first callback idiom.

Async.js

To study Async.js, we considered subject systems that use this library. We found that only systems in the web applications and NPM modules categories used this library.
Our results show that 9 of 16 (56%) web applications and just 9 of 85 (11%) NPM modules use the Async.js library to manage asynchronous control flow. Table 4.3 shows the top 10 used functions from the Async.js API (by number of callsites) for these two categories of subject systems.

This table indicates that the sets of functions used in the top 10 list are similar. But there are notable differences: for example, nextTick was the most used Async.js method in web applications and just the 9th most used method in NPM modules. The nextTick method in Async.js is used to delay the invocation of the callback until a later tick of the event loop, which allows other events to precede the execution of the callback. In Node.js code, nextTick is implemented using the process.nextTick() method in the runtime. In browsers this call is implemented using setImmediate(callback) or setTimeout(callback, 0). In this case Async.js provides a single interface for developers to achieve the same functionality in both client-side and server-side code.

In NPM modules parallel is the most widely used Async.js method (it is the 6th most popular among web applications). This call is used to run an array of independent functions in parallel.

Table 4.3: Top 10 Async.js invoked methods in JavaScript web applications (left) and NPM modules (right). The ∗ symbol denotes calls that do not appear in both tables.

Web applications                   NPM modules
Rank  Method         Count         Rank  Method        Count
1     nextTick       18            1     parallel      189
2     queue∗         16            2     apply         81
3     each           14            3     waterfall     72
3     setImmediate∗  14            4     series        61
3     series         14            5     each          48
6     auto∗          11            6     map           37
6     waterfall      11            7     eachSeries∗   20
6     parallel       11            8     eachLimit∗    12
9     map            10            9     whilst∗       10
9     apply          10            9     nextTick      10

As a final example, the second-most used call in web applications, queue, does not appear in the top ten calls used by NPM modules.
The queue method functionality is similar to that of parallel, except that tasks can be added to the queue at a later time and the progress of tasks in the queue can be monitored.

We should note that because JavaScript is single-threaded, neither the parallel nor the queue Async.js call exposes true parallelism. Here, parallel execution means that there may be points in time when two or more tasks have started, but have not yet completed.

There are significant differences between web applications and NPM modules in terms of the Async.js API usage. To characterize this difference, we first ranked the API functions according to the number of times they were used in web applications and NPM modules. Then we analyzed the difference between the ranks of each function in the two categories. For example, the rank of nextTick is 9 and 1 in the NPM modules and web applications, respectively, making the absolute difference 8. Overall, the rank differences had a mean of 6.2, a median of 5.5, and a variance of 23.8. This indicates that the Async.js library is used differently in these two categories of subject systems.

Finding 7: More than half of the web applications (56%) use the Async.js library to manage asynchronous control flow. The usage is much lower (11%) in the NPM modules. In addition, the Async.js library is used differently (rank variance of 23.8) in these two categories of subject systems.

Implications. Libraries, such as Async.js, provide one means of coping with the complexity of callbacks. However, because library solutions are not provided natively by the language runtime, the developer community can be divided on which library to use, especially as there are many alternatives (see Chapter 2). We think that the difference in Async.js API usage by developers of web applications (that include both client-side code and server-side code) and NPM modules (exclusively server-side code) indicates different underlying concerns around callbacks and their management.
We think that this usage deserves further study and can inform the design of future libraries and proposals for language extensions [36, 42, 59].

Promises

Table 4.4 shows the percentage of subjects that create promises and the percentage of subjects that use promises.

Table 4.4: Percentage of subject systems creating and using Promises

Category            Subjects creating Promises (%)   Subjects using Promises (%)
DataViz libraries                 6                               31
Game Engines                      0                               25
Frameworks                       50                               75
Games                             0                               17
Web Applications                 13                               50
NPM Modules                       3                               12
Total                             8                               26

Figure 4.5 shows box plots for the number of Promise creating instances using new Promise() constructs and Promise usage, e.g., then(), in the different subject categories, partitioned across client/server, and in total. It should be noted that not all applications access a Promise through the new Promise() statement; some invoke library functions that return Promises.

Figure 4.5: The distribution of total Promise usage and creation instances by category, across client/server, and in total. (Groups: DataViz, Engines, Frameworks, Games, WebApps, NPM, Client, Server, Total; y-axis: No. of instances; series: Usage, Creation.)

In aggregate, we found that 37 of 138 (27%) applications use Promises. They were predominantly used by client-side frameworks (75%), with a maximum of 513 usage instances (across all frameworks) and a standard deviation of 343 usage instances. In all the other subject systems, usage of Promises was rare, with a mean close to zero. There was one outlier, the bluebird NPM module⁵, that had 2,032 Promise usage instances. This module implements the Promises specification as a library. We therefore omit it from our results.

Finding 8: 27% of subject systems use Promises. This usage is concentrated in client-side code, particularly in JavaScript frameworks.

Implications.
Although Promises are a promising language-based approach to resolving many of the challenges related to callbacks, such as nesting and error handling, we have not observed a significant uptake of Promises among the systems we studied. This could be because Promises are a relatively new addition to browsers and Node.js. It would be interesting to study how this adoption evolves and whether Promises lead to higher quality and more maintainable JavaScript code. Tools that automatically refactor callbacks into Promises would help developers to migrate existing large projects to use Promises.

⁵ https://www.npmjs.com/package/bluebird

4.5 Threats to Validity

There are both internal and external threats to validity for our study. We give an overview of these threats in this section.

Internal threats. Our JavaScript analyses rely on a number of development conventions. For example, our error-first callback analysis depends on a naming heuristic: the assumption that code adhering to the error-first protocol will name the first argument of a function as err or error. A threat is that we may be under-counting error-first protocol adherence by missing cases where the protocol is followed but a different argument name is used. We may also be over-counting adherence, since an argument name does not necessarily mean that the code uses the error-first protocol, or that it properly follows the protocol. We also rely on directory naming conventions and use code annotations as hints to identify client-side and server-side code.

We decided to count features of callback usage in particular ways. For example, we count callback nesting by taking the maximum depth of callback nesting for a function. This can provide an under-approximation of the number of instances of nested callbacks.

Our analyses are static. This limits the kinds of JavaScript behaviors that we can analyze. For example, we do not handle code in eval statements in our study.

External threats.
Although we study over 5 million lines of JavaScript code, our sample might not be representative, in a number of ways. First, it comes from open source projects of a particular size and maturity. Second, we consider projects that use JavaScript and are primarily mono-lingual. For example, we do not consider projects that use JavaScript on the client-side and Java on the server-side. As a result, our findings may not generalize to other types of JavaScript projects. However, the subject systems in our study represent six different categories and, as the first study of its kind, we believe our characterization study of JavaScript callback usage in practice is worthwhile, and hope that it will lead to other studies that consider a broader variety of subject systems.

All our empirical data and toolset are publicly available; since the subject systems are all open source, our study should be repeatable.

4.6 Conclusions

All modern JavaScript applications that handle and respond to events use callbacks. However, developers are frequently frustrated by "callback hell": the comprehension and maintainability challenges associated with nested, anonymous callbacks and asynchronous callback scheduling. In an empirical study of callback usage in practice, we study over 5 million lines of JavaScript code in 138 subject systems that span a variety of categories. We report on the prevalence of callbacks, their usage, and the prevalence of solutions that help to manage the complexity associated with callbacks. We hope that our study will inform the design of future JavaScript analysis and code comprehension tools. Our analysis [4] and empirical results [9] are available online.

Chapter 5

Refactoring

5.1 Background

In this section, we elaborate on the key background of the JavaScript language, its runtime, and promises, which are necessary to understand the rest of the thesis.

Callback Nesting and Error Handling in Practice.
JavaScript programs frequently contain sequences of asynchronous tasks that need to be completed sequentially (e.g., when a click event is fired, send data to the server, and then on response from the server, update an element in the DOM). Using callbacks to handle the control flow in these situations results in nested callbacks, which increase code complexity. Because each callback adds a new function definition and indentation level, this affects the understandability of the program.

In our empirical study (Section 4.4), callback depth was defined to be the number of callbacks nested one inside the other. Using static analysis we computed the maximum nesting depth for each function and found that the majority of callbacks are nested to two levels, nesting can happen as deep as 8 levels, and that the number of nested instances decreases from level 1 to 8, except at level 2.

We also found that 20% of all callback-accepting function definitions use the 'error-first callbacks' convention to propagate errors asynchronously.

To understand whether nested callbacks use the error-first protocol, we focused on a subset of 16 subject systems (having a total of 494 KLOC) from the previous study and counted the number of occurrences of the error-first protocol in nested callbacks across these systems. The results for these 16 web applications are shown in Figure 5.1. It shows the number of nested callbacks for each nesting level as black bars and the number of error-first protocol instances inside nested callbacks as light grey bars.

Figure 5.1: Count of nested callbacks with at least 1 error-first callback in web applications. (x-axis: Level of Nesting, L1 to L7; y-axis: No. of Instances, 0 to 3,500; series: Nested only, Nested with Error-first.)

We found that a large number of nested callbacks included at least one instance of the error-first protocol.
On average 28% of nested callbacks use the error-first protocol.

Uncaught errors. Because each callback is executed with a new call stack, uncaught errors cannot be propagated up the call stack to the original caller of the asynchronous function. Consider Listing 5.1: if an error is raised in handleError at lines 4, 9, or 13, the programmer cannot handle these errors with one catch block because each of these statements is executed within a fresh call stack. The programmer needs three catch blocks to handle uncaught exceptions from the three callback functions. To avoid missing error handlers, developers occasionally use a global catch-all uncaughtException handler as a hack, as shown below.

     1  getUser('jackson', function (error, user) {
     2    try {
     3      if (error) {
     4        handleError(error);
     5      } else {
     6        getNewTweets(user, function (error, tweets) {
     7          try {
     8            if (error) {
     9              handleError(error);
    10            } else {
    11              updateTimeline(tweets, function (error) {
    12                try {
    13                  if (error) handleError(error);
    14                } catch (e) {
    15                  globalErrorHandler(e);
    16                }
    17              });
    18            }
    19          } catch (e) {
    20            globalErrorHandler(e);
    21          }
    22        });
    23      }
    24    } catch (e) {
    25      globalErrorHandler(e);
    26    }
    27  });

Listing 5.1: A sequence of asynchronous operations.

    process.on('uncaughtException', function (err) {
      console.error('uncaughtException:', err.message)
      console.error(err.stack)
      process.exit(1)
    })

Synchronization. Callbacks do not have built-in synchronization and are not naturally idempotent. This causes problems when, for example, the same callback is accidentally invoked multiple times inside another continuation. Consider Listing 5.2: because there is no return statement within the if block, cb is executed twice if foo is truthy. Based on our experience, this is a common mistake among JavaScript developers.

Promises.
A promise is a design pattern that handles asynchronous events and solves many of the callback-related problems described previously. While promises have been used for some time in JavaScript with third party libraries, the next ECMAScript specification (version 6) [6] of the language has promises built in.

    function handler(cb, foo) {
      if (foo) cb(foo);
      cb(foo);
    }

Listing 5.2: A JavaScript snippet illustrating that callbacks are vulnerable to synchronization bugs.

With the promises design pattern, instead of accepting a callback as the continuation function, an asynchronous function returns a Promise instance. This instance represents a value that will be available sometime in the future, for example, after a deferred task has completed. A promise can be in one of three states:

Pending. The deferred task has not yet completed, so the outcome of the Promise has not been determined.
Fulfilled. The deferred task has completed successfully and its result is available.
Rejected. The deferred task has failed and a corresponding error is available.

A promise's initial state is pending, and it will transition to either fulfilled or rejected depending on the outcome of the deferred task. Once the caller of the asynchronous function has obtained a promise, it can register two continuation functions with the promise: a success handler and an optional error handler. The success handler is called when the promise enters the fulfilled state. The error handler is called when the promise enters the rejected state.

    getUser('jackson')
      .then(getNewTweets, handleError)
      .then(updateTimeline, handleError)
      .catch(globalErrorHandler);

Listing 5.3: A sequence of asynchronous operations composed with Promises.

Promises solve many of the problems associated with callbacks. Consider Listing 5.3, which is semantically equivalent to Listing 5.1, but uses promises. Callback nesting is eliminated by chaining promises.
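To make the chaining concrete, the following is a minimal sketch of how error-first functions could be wrapped by hand so that they return promises and can be chained. It is not the output of the refactoring tool described in this thesis: the bodies of getUser and getNewTweets and the promisify helper are hypothetical stand-ins introduced only for illustration.

```javascript
// Hypothetical error-first functions, standing in for the ones in Listing 5.1.
function getUser(name, callback) {
  setTimeout(function () {
    if (!name) return callback(new Error('no user name given'));
    callback(null, { name: name });
  }, 0);
}

function getNewTweets(user, callback) {
  setTimeout(function () {
    callback(null, ['hello from ' + user.name]);
  }, 0);
}

// Hypothetical helper: wrap an error-first function so that it returns a
// Promise. An error passed as the first callback argument rejects the
// promise; otherwise the promise is fulfilled with the result.
function promisify(fn) {
  return function () {
    var args = Array.prototype.slice.call(arguments);
    return new Promise(function (resolve, reject) {
      fn.apply(null, args.concat(function (error, result) {
        if (error) reject(error); else resolve(result);
      }));
    });
  };
}

var getUserP = promisify(getUser);
var getNewTweetsP = promisify(getNewTweets);

// The nested callbacks become a flat chain; one catch covers both steps.
getUserP('jackson')
  .then(getNewTweetsP)
  .then(function (tweets) { console.log(tweets); })
  .catch(function (error) { console.error(error.message); });
```

In this sketch a rejection from either step skips the remaining success handlers and reaches the single catch handler.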
Error handling is separated into a success handler (the first parameter of .then) and an error handler (the second parameter of .then). Uncaught errors are handled with .catch. Basic synchronization is handled automatically because promises guarantee that the error and success handlers only execute once.

5.2 Exploratory Study: Refactoring Callbacks to Promises

We carried out an exploratory study to better understand the extent and manner in which developers refactor callbacks into promises. Our exploratory study consists of three parts: (1) a manual inspection of issues on GitHub related to promises, (2) a manual inspection of pull requests on GitHub related to promises, and (3) an automated mining of commits that refactored asynchronous callbacks to promises.

5.2.1 Finding Issues Related to Callbacks and Promises

The first part of our study explored posts on GitHub's issue tracking system. GitHub is one of the most popular collaborative software-development platforms among JavaScript developers [24] and provides the largest publicly available dataset including developer discussions and development history. Because of its popularity and depth of content, we used GitHub's search feature to find issues related to refactoring of asynchronous callbacks to promises.

We first used the query string “promise callback language:JavaScript stars:>30 comments:>5 type:issue” to search for GitHub issue discussions that were non-trivial (containing more than 5 comments), associated with projects that were popular (starred by more than 30 users) and contained the terms promise and callback. This search resulted in 4,342 issues. We randomly sampled 11 issues from this set and manually inspected the discussions associated with the issues. We found that in the majority of issues (8 out of 11), the final consensus was to refactor the code to use promises instead of asynchronous callbacks.
Many discussion participants agreed that using promises would be beneficial to the project. However, the main reasons for reluctance to migrate to promises were that (1) promises may cause significant changes to existing APIs and (2) the development costs associated with the change were prohibitive.

Many users who showed interest in moving to promises emphasized various benefits of promises:

“I’ve recently solved various problems using promises, e.g., polyglot data migrations, browser/node client libraries for an HTTP API, kv store interfaces & controller code. I’ve both written completely new code and rewrapped callback-style code. Personally I’m very pleased with the amount of additional safety and expressiveness I’ve gained by using promises. I’m not dismissing callbacks per se, but personally I find it much simpler to reason about code using promises than code using callbacks & utility libraries like async.” ¹

“In general, I think promises provide a good foundation for async: they’re fairly simple and lightweight, but they make it easy to build higher level async constructs. For example, when.js’s when.map and when.reduce are surprisingly compact, but quite powerful. I see that as an advantage of promises, they are both useful in and of themselves, and also make good building blocks. I also feel that promises make for very clean API design. There is no need for callbacks [...] to be added to every function signature in your API. You can just return a promise instead.” ²

We then narrowed down the search by including the term refactor.³ This resulted in 351 issues, many of which indicated strong demand to refactor code to use promises. For example, one participant stated: “So this is something that is purely for devs but I think it is about time to do this. i.e.
git-task is a great candidate to take full advantage of promises and it would have made implementation of #602 much easier.” ⁴

Many of these requests came from users of JavaScript libraries who wanted promises as part of the library API: “Are there any plans for Promise support, alongside the callbacks and streams? Proper Promise support in any-db and any-db-transaction would be really nice :)” ⁵, and “Add promise API option?” ⁶. Some of the users encouraged the move to promises by sharing their own experiences of using promises: “We’ve recently converted pretty large internal codebases from async.js to promises and the code became smaller, more declarative, and cleaner at the same time.” ⁷

¹ https://github.com/share/ShareJS/issues/268
² https://github.com/gladiusjs/gladius-core/issues/127
³ Complete search string: “Refactor promises language:JavaScript stars:>20 type:issue”. We lowered the number of stars to capture more projects.
⁴ https://github.com/FredrikNoren/ungit/issues/603

5.2.2 Exploring Refactoring Pull Requests

The second part of our study explored pull requests on GitHub in order to determine whether developers acted on suggestions for refactoring asynchronous callbacks to promises. We did this by searching GitHub for pull requests associated with refactoring asynchronous callbacks to promises and manually inspecting the results. Our search used the following query string: “Refactor promises language:Javascript stars:>20 type:pr”. The search resulted in 451 pull requests. We observed that most of these pull requests were submitted as improvements to the project and involved replacing callbacks with Promises. These were either native Promises supported by the runtime or ones provided by third-party libraries like Bluebird, Q, or RSVP. We found that developers act on the desire for promises over asynchronous callbacks and do perform this type of refactoring in practice.
A more detailed listing of the discussions we explored in our study, along with listings of relevant quotes, can be found online [11].

⁵ https://github.com/grncdr/node-any-db/issues/66
⁶ https://github.com/addyosmani/psi/issues/56
⁷ https://github.com/meetfinch/decking/issues/183

5.2.3 Mining Commits

In the third part of our exploratory study, we mined 134 Node.js applications and NPM modules to look for examples of asynchronous callback to promise refactoring in practice. We discovered 39 instances of asynchronous callback to promise refactorings across 9 projects. This indicates that developers are interested in performing this refactoring in practice. We manually inspected each of these instances and found that five of them conform to the standard refactoring pattern recommended in developer blog posts [19].

This exploratory study demonstrates that developers see many advantages in migrating to promises. However, because of the complex control flow associated with callbacks and promises, refactoring code to use promises instead of asynchronous callbacks is difficult. Without tool support, the development costs associated with the change can become the major barrier to the adoption of promises. We believe there is a strong demand among JavaScript developers for automated tool support to perform these refactorings.

In this chapter, our goal is to develop an approach that can automatically refactor asynchronous callbacks to promises. The approach should have the following properties: (1) candidates for refactoring can be detected automatically, (2) the approach can refactor most asynchronous callbacks, (3) the approach produces code that preserves the behaviour of the original program, and (4) the understandability of the source code is improved. In the following section we describe our approach.

5.3 Approach

In this section we formally specify our proposed program refactorings.
Throughout we use the following notation: async denotes an asynchronous, built-in JavaScript API function, such as process.nextTick or fs.readFile. We use cb to denote a callback, that is, a function that is passed as an argument to other functions. For example, Listing 5.4 gives an abstract example of a function f that uses an asynchronous callback cb_f; line 16 in the listing contains a call site of f.

     1  function f(cb_f) {
     2    async(function cb_async(error, data) {
     3      if (error) cb_f(error, null);
     4      else cb_f(null, data);
     5    });
     6  }
     7  function cb_f(error, data) {
     8    if (error) {
     9      // Handle error
    10    }
    11    else {
    12      // Handle data
    13    }
    14  }
    15
    16  f(cb_f);

Listing 5.4: Abstract functions and call sites in the original program P.

[Figure 5.2: Overview of our approach. Diagram labels: JS program; Detection (Async Function Definition Detector, Async Callsite Detector); Conversion (Modify-Original, Wrap-around, Promise Creation, Promise Consumption, Error Path Extraction, Callsite Conversion); Optimization (Flatten Nested Callbacks).]

We transform P into P′ by transforming sub-elements of P. Figure 5.2 illustrates this process. In Sections 5.3.1 and 5.3.2 we describe a process to automatically discover instances of f that can be refactored using our method. In Section 5.3.3 we describe a process to derive from f a new asynchronous function f′ that returns a promise. In Section 5.3.4 we describe a process to derive succ and err, the success and error handlers for the promise, from cb_f. In Section 5.3.5 we describe a process to derive the new call site of f′ from the original call site of f. Finally, in Section 5.4 we demonstrate that P is semantically equivalent to P′.

5.3.1 Identifying Functions with Asynchronous Callbacks

Our first contribution is in helping developers to automatically identify instances of f to refactor. We consider a function to be an instance of f if async is directly invoked inside the body of the function and if one of the following is true:

1. cb_f = cb_async, or
2. cb_f is invoked inside the closure of cb_async

This rule ensures that f is asynchronous, that f accepts a callback as a parameter, and that the callback is deferred until the asynchronous API call finishes.

We identify instances of async using a whitelist of calls that we know to be asynchronous. This whitelist includes a variety of APIs, including DOM events, network calls, timers, and I/O. The complete list is available online [11].

5.3.2 Choosing a Refactoring Strategy

Our next contribution is to identify which of our automated refactoring techniques (if any) can transform f to f′. We describe two strategies for transforming callbacks into promises: modify-original and wrap-around. Table 5.1 shows the advantages and disadvantages of each strategy, which are discussed below.

Table 5.1: Refactoring strategies.

Strategy         | Advantages                                              | Disadvantages
modify-original  | Produces code similar to how developers would refactor | Transforms some instances
wrap-around      | Transforms most instances                               | Produces code that can be more complex than the original

Modify-original

In our exploratory study (Section 5.2), we observed the relative frequency of different kinds of promise refactorings performed by developers. The modify-original strategy is based on the most frequent type of refactoring that we observed. The limitation of modify-original is that it cannot transform more complex asynchronous callbacks. Candidate instances for the modify-original refactoring must meet the following preconditions. The rationale behind each precondition is given after it:

1. cb_f is invoked inside the closure of cb_async
   This ensures that cb_f ≠ cb_async. The case cb_f = cb_async would require a different transformation for cb_f that we do not currently support.
2. cb_f is not used in cb_async other than in instances of (1)
   This ensures that f does not use cb_f for anything other than as a callback function. Because this parameter is removed during the transformation, cb_f is no longer available to f at runtime.
3. f always returns void
   This precondition eliminates cases where an alternate synchronization method is used. We found that in most cases, when an asynchronous function returns a value, the return value is used as an identifier for synchronization. Custom synchronization strategies require detailed knowledge of their implementation to produce a valid refactoring and are therefore not handled by either of our transformation strategies.
4. cb_f is splittable
   This requires that cb_f has a success path and an error path that do not interact with each other. For example, if cb_f is using the error-first protocol, the error parameter cannot be used on the success path and the data parameter cannot be used on the error path. This is because promises separate the success and error handlers, so any interaction between the two paths cannot occur in a promises implementation.
5. f has exactly one async
   This is needed because only one promise is returned and the handler for a promise can only be invoked once. If more than one async is invoked, a significantly more complex refactoring is needed.
6. invocations of cb_f provide zero or one argument, or follow the error-first protocol
   This eliminates cases where more than one non-null argument is given to cb_f. This is a restriction of the current implementation of promises in JavaScript, whose resolve and reject functions each accept only one argument.
7. cb_async does not use variables named resolve or reject
   This prevents problems caused by variable hiding, since our method uses these variable names inside cb_async.
8. f is not contained in a third-party library
   Finally, this precondition prevents library code from being refactored.

Wrap-around

Because the modify-original strategy cannot transform asynchronous callbacks that do not satisfy one or more of the above preconditions, we also provide a strategy which (unlike modify-original) does not modify the body of f.
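A few concrete shapes may help when reading the modify-original preconditions and seeing which functions are left for wrap-around. The functions below are hypothetical examples written for this purpose (they are not taken from the studied projects), and setTimeout stands in for a whitelisted async API.

```javascript
// Meets the modify-original preconditions: the callback `done` is invoked
// only inside the closure of the async callback, it follows the error-first
// protocol, and loadGreeting itself returns nothing.
function loadGreeting(name, done) {
  setTimeout(function onTimeout() {
    if (typeof name !== 'string') done(new Error('name must be a string'), null);
    else done(null, 'hello ' + name);
  }, 0);
}

// Fails modify-original precondition 1: the callback is handed straight to
// the async API (cb_f = cb_async), so there is no enclosing async callback
// whose body could be rewritten. It remains a candidate for wrap-around.
function later(done) {
  setTimeout(done, 0);
}

// Fails precondition 3 under both strategies: the returned timer id acts as
// a custom handle that callers may use for cancellation or synchronization.
function cancellableLater(done) {
  return setTimeout(done, 0);
}

// A callback that is not splittable (precondition 4): the success path
// reads the error parameter, so the two paths interact.
function notSplittable(error, data) {
  if (error) console.error('failed:', error.message);
  else console.log('ok:', data, 'error slot was:', error);
}
```

In an actual tool these properties would be checked on the AST rather than by inspection; the sketch only shows what the accepted and rejected shapes look like.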
This strategy is able to refactor a larger number of asynchronous callback functions than modify-original. This strategy, however, comes at the cost of simplicity and comprehension, as it produces more code than the original and introduces a new function. Candidates for the wrap-around refactoring must satisfy the following preconditions:

1. cb_f = cb_async OR cb_f is invoked inside the closure of cb_async
2. cb_f is not used in f other than in instances of (1)
3. f always returns void
4. cb_f is splittable
5. f has exactly one async
6. invocations of cb_f provide zero or one argument, or follow the error-first protocol
7. f cannot be refactored by modify-original

The preconditions for the wrap-around strategy are more relaxed than the preconditions for the modify-original strategy. The wrap-around strategy does not include a number of preconditions found in the modify-original strategy. This means that the wrap-around strategy can support the following additional cases: cases where cb_f = cb_async (modify-original condition #1), cases where cb_async uses variables named resolve or reject (modify-original condition #7) and cases where f is contained in a third-party library (modify-original condition #8).

The rationale for the preconditions in the wrap-around strategy is as follows. Precondition (1) is the same as our precondition for identifying instances of f in Section 5.3.1. Preconditions (2-6) are the same as the modify-original preconditions. Precondition (7) ensures that the modify-original strategy is selected first, because it produces more understandable code.

5.3.3 Transforming the Asynchronous Function

In this subsection, we specify our transformations for deriving f′ from f.

Modify-original

In this strategy, we modify the body of f in the same way that a developer would be likely to perform this transformation.
First, our technique creates a new f′ that returns a promise:

    function f′() {
      return new Promise();
    }

The Promise constructor takes one argument: the factory function for the promise. To build this, we declare an anonymous function that wraps the body of f:

    function f′() {
      return new Promise(function (resolve, reject) {
        async(function cb′_async(error, data) {
          if (error) cb_f(error, null);
          else cb_f(null, data);
        });
      });
    }

Next, we replace invocations of cb_f with invocations of resolve and reject. Invocations of cb_f that pass a non-null error argument are converted into invocations of reject, which calls err. We look for arguments that use the error-first protocol or match the regular expression e|err|error to find these invocations. All other invocations of cb_f are converted into invocations of resolve, which calls succ.

    function f′() {
      return new Promise(function (resolve, reject) {
        async(function cb′_async(error, data) {
          if (error) reject(error);
          else resolve(data);
        });
      });
    }

Finally, in P′ we replace f with f′.

Listings 5.5 and 5.6 show how an asynchronous callback instance in a real-world JavaScript program is refactored to use Promises using PROMISESLAND. The refactored version of the function addTranslations does not accept a callback, and instead returns a Promise. The invocations of the callback (lines 5 and 15 in Listing 5.5) have been changed to reject and resolve (lines 6 and 16 in Listing 5.6) depending on whether an error occurred or not.

     1  function addTranslations(translations, callback) {
     2    translations = JSON.parse(translations);
     3    fs.readdir(__dirname + '/../client/src/translations/', function (err, pofiles) {
     4      if (err) {
     5        return callback(err);
     6      }
     7      var vars = [];
     8      pofiles.forEach(function (file) {
     9        var locale = file.slice(0, -3);
    10        if ((file.slice(-3) === '.po') && (locale !== 'template')) {
    11          vars.push({tag: locale, language: translations[locale]});
    12        }
    13      });
    14
    15      return callback(vars);
    16    });
    17  }
    18
    19  addTranslations(trans, jobComplete);

Listing 5.5: An example of an asynchronous callback before refactoring to Promises.

     1  function addTranslations(translations) {
     2    return new Promise(function (resolve, reject) {
     3      translations = JSON.parse(translations);
     4      fs.readdir(__dirname + '/../client/src/translations/', function (err, pofiles) {
     5        if (err) {
     6          reject(err);
     7        }
     8        var vars = [];
     9        pofiles.forEach(function (file) {
    10          var locale = file.slice(0, -3);
    11          if ((file.slice(-3) === '.po') && (locale !== 'template')) {
    12            vars.push({tag: locale, language: translations[locale]});
    13          }
    14        });
    15
    16        resolve(vars);
    17      });
    18    });
    19  }
    20
    21  addTranslations(trans).then(jobComplete);

Listing 5.6: An example of an asynchronous callback after refactoring to Promises using the modify-original strategy.

Wrap-around

In this strategy, we do not modify f. Instead, we wrap all of the calls to f inside a new function. We create this new function f′, which creates and returns a Promise. A call to f is inserted into the body of the factory method for the promise:

    function f′() {
      return new Promise(function (resolve, reject) {
        f();
      });
    }

    function f(cb_f) {
      async(function cb_async(error, data) {
        if (error) cb_f(error, null);
        else cb_f(null, data);
      });
    }

A new anonymous function is created as the continuation function for f. If cb_f follows the error-first protocol, the continuation function provides branches that direct the error parameter to reject and the data parameter to resolve:

    function f′() {
      return new Promise(function (resolve, reject) {
        f(function (err, data) {
          if (err !== null)
            return reject(err);
          resolve(data);
        });
      });
    }

    function f(cb_f) {
      async(function cb_async(error, data) {
        if (error) cb_f(error, null);
        else cb_f(null, data);
      });
    }

5.3.4 Transforming the Callback Function

From Section 5.3.3, we have a new asynchronous function f′ that returns a promise. We now transform all call sites of f to use the promise produced by f′. The first step is to identify call sites of f. We rely on the existing static analysis of TernJS [30], a type inference technique based on the work by Hackett and Guo [29], to determine the points-to relationships between call sites of f and the declaration of f.

Next, we convert all call sites to use f′. Consider c, a call site of f. c has a callback function cb_f, which handles both successful and unsuccessful executions of f. However, f′ requires separate handlers for successful and unsuccessful executions. From cb_f we derive two functions: the success handler succ and the error handler err. succ is the success-handling path of cb_f, while err is the error-handling path of cb_f. We therefore declare a success handler and an error handler for the promise. The code that is executed along the success path in cb_f is copied into succ, while all code that is executed along the error path in cb_f is copied into err.
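To make this derivation concrete, the sketch below starts from an error-first callback and splits it by hand into a success handler and an error handler, then registers both on a promise. All names (readSetting, readSettingP, onSetting, log) are hypothetical, setTimeout stands in for async, and the split is performed manually here rather than by the tool.

```javascript
// Hypothetical asynchronous function f using the error-first protocol.
function readSetting(key, cb) {
  setTimeout(function () {
    if (key === 'missing') cb(new Error('no such key: ' + key), null);
    else cb(null, key.toUpperCase());
  }, 0);
}

const log = [];

// Original callback cb_f: a single function handles both outcomes.
// Original call site: readSetting('theme', onSetting);
function onSetting(error, data) {
  if (error) {
    log.push('error: ' + error.message);
  } else {
    log.push('value: ' + data);
  }
}

// Derived handlers: each keeps the statements of one branch, and the
// `if (error)` test that selected the branch is dropped.
function succ(data) {
  log.push('value: ' + data);
}
function err(error) {
  log.push('error: ' + error.message);
}

// f' built in the wrap-around style: readSetting itself is left untouched.
function readSettingP(key) {
  return new Promise(function (resolve, reject) {
    readSetting(key, function (error, data) {
      if (error !== null) return reject(error);
      resolve(data);
    });
  });
}

// Converted call sites: succ and err replace the single callback.
const done = Promise.all([
  readSettingP('theme').then(succ, err),
  readSettingP('missing').then(succ, err),
]);
```

Because err is passed as the second argument of then, a rejection is handled at that call site and the promise returned by then is fulfilled, which is why Promise.all above does not reject.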
Any conditional statements that cause control flow to branch to the success or error paths in cb_f are omitted from the handlers.

    function succ(data) {
      // Handle data
    }
    function err(error) {
      // Handle error
    }

To determine the success path and error path of cb_f, we use a heuristic: we look for an if statement that checks if a parameter matching e|err|error is non-null. We consider the branch where the parameter is null to be the success path and the branch where the parameter is non-null to be the error path. This is based on the typical usage of the error-first protocol. Finally, in P′ we replace cb_f with succ and err.

5.3.5 Transforming the Call Site

Our last step to have a working program is to transform the call sites of f to invoke f′ instead. First, if the wrap-around strategy was used, the name of f is changed to f′. If the modify-original strategy was used, the name remains unchanged.

    f′(cb_f);

Next, since f′ does not accept a continuation function as an argument, we remove cb_f as an argument from our call to f′. Because our call to f′ produces a promise, we pass succ and err to the promise by appending the call .then(succ, err) to the promise returned by f′.

    f′().then(succ, err);

In some cases, no err exists. Either there was no error handling path in cb_f or one was not recognized by our heuristic. In this case, we add a comment in place of err as the second argument of then, which recommends that the developer create an error handler.

5.3.6 Flattening Promise Consumers

After a set of nested callbacks is converted into promises, the result is a set of nested promise consumers. Because [Promise].then also returns a Promise, we can improve readability by converting nested promises to a flat sequence of chained promises that are semantically equivalent. For example, Listing 5.7 has a set of nested promises that can be refactored to the chain of promises in Listing 5.8.

    getLocationDataNew("jackson").then(function (data) {
      getCoordinatesNew(data.address, data.country).then(function (longLat) {
        getNearbyATMsNew(longLat).then(function (atms) {
          console.log('Closest ATM is at: ' + atms[0]);
        });
      });
    });

Listing 5.7: Nested promise calls.

    getLocationDataNew("jackson").then(function (data) {
      return getCoordinatesNew(data.address, data.country);
    }).then(function (longLat) {
      return getNearbyATMsNew(longLat);
    }).then(function (atms) {
      return console.log('Closest ATM is at: ' + atms[0]);
    });

Listing 5.8: Chained promises after they are flattened.

We have two preconditions for flattening promise consumers (below, v is a variable):

1. ∀v declared in succ, v is not used inside a closure
2. only one nested promise is consumed inside succ

The first condition checks that no variable declared in succ is also used in one of the asynchronous handlers declared in succ. This condition is necessary because if we add the handler for the nested promise through a promise chain, then v, which is declared in succ, will no longer be available to the nested promise's handler through closure. This is illustrated in Listing 5.9. We cannot flatten these nested promises because the parameter data is used by the success handler of getNearbyATMsNew.

    getLocationDataNew("jackson").then(function (data) {
      getCoordinatesNew(data.address, data.country).then(function (longLat) {
        return getNearbyATMsNew(longLat).then(function (atms) {
          console.log("The closest ATM to " + data.address + " is: " + atms[0]);
        });
      });
    });

Listing 5.9: Nested promises which cannot be flattened.

The second condition checks that there is just one asynchronous call inside succ, since promise chaining does not support executing multiple asynchronous functions in parallel.

If the two preconditions are met, to flatten a promise chain we perform two transformations:

1. Each handler is modified to return a promise.
2. For each handler that is not at the start of the chain, a new call to [Promise].then is created after the previous handler is registered. The handler is passed to the previous promise in the chain.

5.4 Semantic Equivalence of Transformations

Next we use induction to show that our transformations do not change a program's semantics. That is, we show that these transformations are behaviour-preserving program refactorings. For the base case, we need to show that the following five properties are semantically equivalent between P and P′: (1) scheduling order, (2) function scope, (3) intra-procedural control flow, (4) inter-procedural control flow, and (5) data flow. Finally, we show that our transformation is semantically equivalent for (6) nested asynchronous callbacks.

[Figure 5.3: The sequence diagram on the left (a) shows the sequence for an asynchronous function that accepts a callback. The sequence diagram on the right (b) shows the sequence for an asynchronous function that returns a promise.]

5.4.1 Scheduling Order Equivalence

Consider the sequence diagrams in Figure 5.3. Diagram (a) shows the sequence for an asynchronous callback, while diagram (b) shows the sequence for an asynchronous function that returns a promise. From the diagrams we can see that both versions execute cb_async in the same tick. Since we have only one async inside f, any scheduling decisions made by async will be the same.

5.4.2 Function Scope Equivalence

We consider the function pairs f ⇔ f′, cb_async ⇔ cb′_async and cb_f ⇔ {succ, err} separately.

1) f ⇔ f′. Since f is unchanged in the wrap-around strategy, the scope is also unchanged.
In the modify-original strategy, there are three changes between f and f′:

1. The body of f is nested in a new function within f′. This does not affect the scope because anything available in the body of f is available inside the nested function through closure.
2. The parameter cb_f is not present in f′. This affects the scope if cb_f is used anywhere other than as the callback function. This case is filtered by our preconditions.
3. The parameters resolve and reject are added to the scope of f′. This causes a naming conflict or overwrites the scope if variables named resolve or reject exist in the scope of f. This case is also filtered by our preconditions.

2) cb_async ⇔ cb′_async. Because cb_async is nested within f, the same changes apply as the changes between f and f′. These changes do not affect the scope of cb′_async for the same reasons they do not affect f′. The only other change in cb′_async is that invocations of cb_f are replaced by invocations of resolve and reject. This change does not affect the scope.

3) cb_f ⇔ {succ, err}. Because succ and err are defined at the same level as cb_f, the closure is the same for all three functions. Variables declared within cb_f are copied to succ and err if they occur on the success path and error path respectively. As long as the success path and error path are correctly retrieved, these copies do not affect the semantics of the program because the variables are only used within the path that is being executed.

5.4.3 Intra-Procedural Control Flow Equivalence

We now show that the intra-procedural control flow of the functions in P is equivalent to the intra-procedural control flow of the functions in P′. We demonstrate that all transformations we perform on the elements in P produce new elements in P′ that have control flow equivalent to their counterparts in P.

1) f ⇔ f′. For the modify-original strategy, we add a return statement and an invocation of a Promise constructor.
The Promise does not semantically modify the program's behaviour and, because the body of f is unchanged, statement order is preserved. For the wrap-around strategy, f is unchanged.

2) cb_async ⇔ cb′_async. For the modify-original strategy, statement order is preserved because we replace calls to cb_f with calls to resolve or reject, which in turn invoke succ or err. These replacements do not change the control flow. For the wrap-around strategy, cb_async is unchanged.

3) cb_f ⇔ {succ, err}. Statement order is preserved because succ and err are built from the success path and error path of cb_f, which does not change the statement order on either path.

4) c ⇔ c′. Consider Figure 5.3. In this case statements are added to the control flow to create the promise. However, these statements do not affect the behaviour of the program and can be ignored. The execution order of statements in the call sites of f is maintained relative to the body of f.

5.4.4 Inter-Procedural Control Flow Equivalence

For inter-procedural control flow, we consider two cases: the control flow between call sites and async, and the control flow between cb_async and cb_f.

1) c → async ⇐⇒ c′ → async. Consider Figure 5.3. For the modify-original strategy, the control flow between call sites of f and the body of f now passes through a Promise constructor and a factory function. Neither the Promise nor the factory function semantically modifies the program's behaviour, and statement order between call sites of f and the body of f is preserved.

For the wrap-around strategy, the control flow between call sites of f and the body of f now passes through f′, a Promise constructor and a factory function. None of f′, the Promise or the factory function semantically modifies the program's behaviour, and statement order between call sites of f and the body of f is preserved.

2) cb_async → cb_f ⇐⇒ cb′_async → {succ, err}. Consider Figure 5.3. For the modify-original strategy, the control flow between cb′_async and {succ, err} now passes through a Promise.
The Promise does not semantically modify the program's behaviour, and statement order between cb′_async and {succ, err} is preserved.

For the wrap-around strategy, the control flow between cb_async and {succ, err} now passes through f′ and a Promise. Neither f′ nor the Promise semantically modifies the program's behaviour, and statement order between cb_async and {succ, err} is preserved.

5.4.5 Data Flow Equivalence

Because the control flow is equivalent, we demonstrate data flow equivalence by showing that the data passed between c′, f′, async, cb′_async, succ and err remains equivalent to the data passed between their counterparts in P.

1) c → async ⇐⇒ c′ → async. Data flow is preserved between c′ and f′ for all arguments besides cb_f since these arguments are unchanged. Because our preconditions state that cb_f is only used in f as the continuation function, the data flow is preserved for cb_f when we register succ and err through .then(succ, err).

2) cb_async → cb_f ⇐⇒ cb′_async → {succ, err}. Data flow is preserved between cb′_async and {succ, err} because of precondition (6) of modify-original in Section 5.3.2. With this condition, we know that at most one argument is passed to succ or err. The resolve and reject functions of a promise each take one parameter and propagate it from cb′_async to {succ, err}.

5.4.6 Equivalence of Nested Asynchronous Callbacks

We now take the ‘inductive step’ and show that our transformation is semantically equivalent for nested asynchronous callbacks.

Because we have a precondition that async is the only asynchronous call within f, we only need to consider asynchronous callbacks that are nested inside cb_f. Because cb_f is the last function in the control flow, we can ignore all other functions and refactorings. Our method will automatically refactor any asynchronous callbacks inside cb_f. After such a refactoring, since our base transformation produces code that is semantically equivalent, the nested asynchronous call will also be semantically equivalent.
It follows that all nested asynchronous callbacks will be semantically equivalent.

5.5 Implementation: PROMISESLAND

We have implemented our approach in a tool called PROMISESLAND. It is composed of two components: a static analyzer that searches for refactoring opportunities in the form of patterns of asynchrony in JavaScript code, and a transformation engine that refactors the detected opportunities into native Promises. PROMISESLAND builds on prior JavaScript analysis tools, such as Esprima [31] to parse and build an AST, Estraverse [58] to traverse the AST, and Escope [8] for scope analysis. We also use TernJS [30], a type inference technique based on the work by Hackett and Guo [29], to query for function type arguments.

Implementation Challenges. Library code should not be refactored by default. If there are possible refactorings in module code used by the project code, the user can be notified via INFO logs. Heuristics like the following are used to select local dependencies of the project while ignoring remote dependencies.

• Usually, external dependencies, which are downloaded from the public npm registry, are in the node_modules directory and are referred to in the source like this: require('module_foo')

• Internal dependencies, in contrast, are kept separate from the third-party modules, under version control, and are referred to by a file system path like this: require('./path/to/bar.js')

Limitations. A limitation of our implementation is that it depends on the soundness of current static analysis techniques. For example, if a points-to relationship between c and f is not discovered by static analysis, our technique will not refactor c. While this is true for points-to relationships between all elements of the transformation, in Section 5.6 we show that this rarely occurs in practice.

5.6 Evaluation

Our goal is to evaluate the efficacy of our approach in terms of its refactoring opportunity detection accuracy, refactoring correctness, and efficiency.
We address the following research questions in our empirical evaluation:

RQ1: Can PROMISESLAND accurately identify instances of asynchronous callbacks to be converted?

We consider PROMISESLAND as an automated technique that a developer can use to first identify refactoring opportunities in the code. Therefore, we assess how accurately PROMISESLAND can find asynchronous callbacks in JavaScript code.

RQ2: Can PROMISESLAND correctly refactor asynchronous callbacks to Promises?

The most important factor for adoption of refactoring tools has been determined [54] to be confidence in their correctness. We consider a refactoring correct if it preserves the behaviour after the transformation, which is critical in any refactoring technique.

RQ3: Is PROMISESLAND efficient?

Refactoring tools that are slow face adoption challenges [44] in practice. We evaluate the efficiency of PROMISESLAND in both the detection and transformation of asynchronous callbacks in terms of overhead and analysis speed.

We have made PROMISESLAND open source, and our full empirical dataset is available for download [11].

5.6.1 Detection Accuracy (RQ1)

To find out whether PROMISESLAND can accurately identify refactoring candidates in the form of asynchronous callbacks, we first manually inspected four subject systems (see Table 5.2) and counted asynchronous callbacks that can be converted to Promises. This set of subject systems consists of heroku-bouncer,⁸ a server-side middleware; moonridge,⁹ an isomorphic library for MongoDB; timbits,¹⁰ a client-side widget framework; and tingo-rest,¹¹ a REST-API wrapper for TingoDB. To limit the manual inspection effort, we included only four systems in this evaluation, although we think these four are representative as they include server-side code, client-side code, as well as isomorphic JavaScript (executed both on the client-side and server-side).

We then used PROMISESLAND to find refactoring instances, to measure precision and recall.
We define precision as the percentage of asynchronous callbacks that can safely be refactored without leading to test failures, across all asynchronous callbacks that PROMISESLAND detects. Recall pertains to the percentage of asynchronous callbacks that PROMISESLAND detects, across all asynchronous callbacks that exist in the subject system.

Table 5.2 presents our results. PROMISESLAND did not report any false positives, providing a precision of 100%. This means that although the static analysis is not sound, in practical use the tool does not detect any wrong instances, and the developer does not need to be concerned about refactoring incorrectly identified callbacks.

⁸ https://github.com/heroku/node-heroku-bouncer
⁹ https://github.com/capaj/Moonridge
¹⁰ https://github.com/postmedia/timbits
¹¹ https://github.com/lean-stack/node.tingo-rest

The recall was 83% on average. This is because PROMISESLAND missed a few instances of asynchronous callbacks. The reason is that our design is based on the premise that a callback can be considered semantically similar to a Promise (and thus become a refactoring candidate) only if it can be guaranteed that all paths of a function execute the callback asynchronously. To ensure statically that the callback is executed asynchronously and exactly once, we follow a conservative approach that can miss some of the potential candidates for conversion. That is the reason that our recall is not 100%. This means that in practice, although PROMISESLAND detects and transforms most of the candidates, a few can be missed.
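As an illustration of the precondition that every path must invoke the callback asynchronously, the sketch below contrasts a function with one synchronous path against a variant in which all paths are asynchronous. The names cache, fetchRemote, lookupMixed, lookupAsync and invokedBeforeReturn are hypothetical, and the sketch makes no claim about which concrete functions the tool accepts or rejects.

```javascript
// Hypothetical in-memory cache and asynchronous fallback.
var cache = { a: 1 };

function fetchRemote(key, cb) {
  setTimeout(function () { cb(null, key.length); }, 0);
}

// One path (cache hit) calls cb synchronously, the other asynchronously.
// A Promise handler would always run asynchronously, so this function is
// not guaranteed to behave like a Promise on every path.
function lookupMixed(key, cb) {
  if (key in cache) { return cb(null, cache[key]); }
  fetchRemote(key, cb);
}

// Every path defers the callback, and it is invoked exactly once.
function lookupAsync(key, cb) {
  if (key in cache) {
    setTimeout(function () { cb(null, cache[key]); }, 0);
    return;
  }
  fetchRemote(key, cb);
}

// Helper: reports whether the callback ran before fn returned.
function invokedBeforeReturn(fn, key) {
  var called = false;
  fn(key, function () { called = true; });
  return called;
}

console.log(invokedBeforeReturn(lookupMixed, 'a'));
console.log(invokedBeforeReturn(lookupAsync, 'a'));
```

A conservative static check has to establish the second shape for all paths, which is why candidates whose asynchrony cannot be proven are skipped rather than transformed.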
We believe this can be further improved using more advanced data-flow analysis techniques.

Table 5.2: Detection accuracy of the tool.

Subject System    LOC (JS)   Detected Instances   Refactored Instances   Precision (%)   Recall (%)
heroku-bouncer         947                    7                      6   100             85.7
moonridge            3,760                   19                     14   100             73.6
timbits              1,226                   17                     15   100             88.2
tingo-rest             238                    4                      4   100             100
Total                6,171                   47                     39   100 (avg)       82.9 (avg)

5.6.2 Refactoring Correctness (RQ2)

In prior research, Brodu et al. [17] proposed a compiler-based technique to convert nested callbacks into a simpler specification of Promises called Dues [16]. To evaluate PROMISESLAND, we select the subject systems used by Brodu et al. and compare our results to theirs. This set of subject systems consists of 64 Node.js modules and is expected to be representative of a majority of commonly used JavaScript modules. We measure how many asynchronous callback instances can be detected and converted to Promises without leading to failures of the existing test cases of the subject systems.

Out of the 64 modules, we first selected the ones with non-failing tests. We then instrumented the code to retain only the subject systems whose test cases covered the asynchronous callbacks in the code. This selection was needed to be able to verify behaviour preservation after the refactoring step. There were 21 subject systems that matched these two criteria, namely passing tests before refactoring and providing callback coverage. We use test cases for this purpose because prior research [27] has shown the effectiveness of test cases for providing an estimate of how reliable refactoring engines are for refactoring tasks on real software projects.

Then refactoring was performed with PROMISESLAND, which analyzed 438 JavaScript files and 108,615 lines of JavaScript code across all the 21 subject systems. This included the identification of asynchronous callback instances and refactoring them to Promises.
After each conversion, we ran the tests to verify whether the original behaviour is preserved.

The results of our comparison between the technique proposed by Brodu et al. [17] (indicated as Dues) and PROMISESLAND are shown in Figure 5.4. In all except two subject systems, PROMISESLAND was able to correctly transform more asynchronous callbacks right away than the Dues transpiler. The exceptions, namely express-user-couchdb and express-endpoint, were not compatible with Node.js version 0.12, which is needed for native Promises support. After minor modifications in the dependency declarations of those two projects to depend on the Bluebird third-party Promises implementation instead of native Promises, PROMISESLAND was able to refactor with passing tests in these two projects as well.

In total, the Dues transpiler converted 56 instances, while PROMISESLAND converted 188 instances (including those 56).

Out of the 188 converted instances, 73 were converted using the modify-original strategy and 115 were converted using the wrap-around strategy. When detecting compatible continuations for refactoring, the Dues compiler restricted itself to error-first callbacks only. However, PROMISESLAND does not have such constraints and determines the suitability for conversion by analyzing the body of the function itself. Therefore, our approach can select a larger set of asynchronous callbacks for safe conversion.

These results show not only the ability of PROMISESLAND to detect asynchronous callbacks, but also its correctness in transforming those into Promises.

[Figure 5.4: two bar charts. (a) No. of Instances: Dues 56, Promises 188. (b) No. of Projects: Dues 9, Promises 21.]

Figure 5.4: The bar chart on the left (a) shows the number of asynchronous callbacks converted into Dues by using the tool from [17] versus the number of asynchronous callbacks converted into Promises with PROMISESLAND. The bar chart on the right (b) shows the number of subject systems in which the tool from [17] was able to detect asynchronous callbacks for converting into Dues versus the number of subject systems in which PROMISESLAND was able to detect asynchronous callbacks for converting into Promises.

5.6.3 Performance (RQ3)

Since the refactoring tool will typically be used in a development environment, the analysis and transformation of the source code are expected to be completed within an acceptable time frame, without keeping the developer idle for too long. In this step, to characterize the performance of PROMISESLAND, we measured the time to analyze and refactor asynchronous callbacks.

Table 5.3 shows the performance statistics of each phase of the refactoring. All the measurements were taken on a system with a dual-core 2.16 GHz CPU and 4 GB of RAM, running Linux. In all cases the refactoring completed within 3 seconds. The last row of Table 5.3 shows the time taken for the complete refactoring process end-to-end. Since the migration from asynchronous callbacks to Promises will be a one-time task in software maintenance, we believe the time taken by our technique is acceptable and does not hinder the developers’ regular workflow.

Table 5.3: Performance measurements of PROMISESLAND (in seconds)

Phase                             Min    Max    Mean   Median
Async Function Detection          0.12   1.00   0.51   0.50
Promise Creation Conversion       0.10   0.49   0.29   0.29
Promise Consumption Conversion    0.11   0.47   0.27   0.30
Optimization and Re-writing       0.14   0.95   0.61   0.58
All Phases                        0.97   2.57   1.69   1.64

5.7 Discussion

Evaluating PROMISESLAND. We evaluated the correctness of PROMISESLAND by running an application’s tests after its code was refactored using the tool.
This is a sanity check that PROMISESLAND maintains program correctness. A more rigorous evaluation would require more formal techniques and is part of our future work.

Evaluating Promises. Although at least some developers prefer promises over asynchronous callbacks, we do not know of any research that considers whether the use of promises improves JavaScript code quality. Our work contributes two refactoring techniques and a tool, PROMISESLAND, that implements these techniques. In our evaluation, we focus on features of the tool, such as its precision and recall. Empirical evaluation of the promises language feature itself and its implications for software quality and developer productivity remains an open problem.

IDE Integration. By default PROMISESLAND refactors all asynchronous callbacks that it finds in the source code of an application, though it can also be run on a single source file. We believe PROMISESLAND can be integrated into common JavaScript IDEs to make it more easily accessible to developers, which forms part of our future work.

Async and Await. Promises are specified in the ECMAScript6 specification. ECMAScript7 [5], which is nearing completion, will provide a new option for handling asynchrony in the form of the async and await keywords. These will allow a linear programming style and permit traditional try/catch error handling, which is arguably more understandable than promises and will likely gain fast adoption among developers. However, our perspective is that, regardless of the underlying mechanism for managing asynchrony, the need for detecting and refactoring asynchronous callbacks will remain. The mechanisms described in this chapter and implemented as part of PROMISESLAND are a first step towards more powerful techniques. Promises explicitly encode success and failure paths, which are implicit in the error-first protocol.
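The difference between implicit and explicit paths can be sketched as follows. In the error-first version a single callback branches on its first argument, whereas in the Promise version the two paths are registered as separate handlers. The names parseJsonAsync, parseJsonPromise, summarize and summarizeP are hypothetical and used only for this illustration.

```javascript
// Hypothetical error-first API: parses JSON text asynchronously.
function parseJsonAsync(text, cb) {
  setTimeout(function () {
    var value;
    try {
      value = JSON.parse(text);
    } catch (e) {
      return cb(e);          // failure path: error as first argument
    }
    cb(null, value);         // success path: null error, then the result
  }, 0);
}

// Error-first consumer: both paths share one function and are
// distinguished only by a conditional on `err`.
function summarize(text, report) {
  parseJsonAsync(text, function (err, value) {
    if (err) { report('failed'); return; }
    report('ok: ' + Object.keys(value).length);
  });
}

// Promise-returning wrapper around the unchanged function.
function parseJsonPromise(text) {
  return new Promise(function (resolve, reject) {
    parseJsonAsync(text, function (err, value) {
      if (err) { reject(err); } else { resolve(value); }
    });
  });
}

// Promise consumer: the success and failure paths are separate handlers
// passed to .then(succ, err).
function summarizeP(text) {
  return parseJsonPromise(text).then(
    function succ(value) { return 'ok: ' + Object.keys(value).length; },
    function err() { return 'failed'; }
  );
}
```

Once the two paths are explicit in this form, a later rewrite to a linear style with try/catch has a clear mapping: the succ handler becomes the code after the awaited call and the err handler becomes the catch block.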
With the techniques developed in this chapter, if and when ECMAScript7 is standardized, we will be one step closer to automating the refactoring of JavaScript code to use async and await.

Backward Compatibility. Although all major JavaScript runtimes support promises, lack of backward-compatibility was a concern that we observed in discussions that we studied (Section 5.2). For example, one developer noted that “[I] too believe Promises are the future, but it seems that you need to make the users aware of what Promise library they should use or native browser Promises if supported.”¹² In other words, refactoring a library to use promises requires all clients of the library to change their code. Fortunately, PROMISESLAND can be used for this to some extent, but clients must be made aware of this tool.

¹² https://github.com/fixjs/define.js/issues/7

5.8 Conclusion

It is difficult to imagine a useful JavaScript application that does not use asynchronous callbacks; these are used by applications to respond to GUI events, receive network messages, schedule timers, etc. However, asynchronous callbacks present a number of software engineering challenges, including the inability to properly catch and handle errors, and callback nesting, which leads developers into “callback hell.” In this chapter we presented two refactorings, modify-original and wrap-around, to refactor asynchronous callbacks into promises, a JavaScript language feature that resolves some of the issues with asynchronous callbacks. We implemented both refactorings as part of the PROMISESLAND tool and evaluated it on 21 large JavaScript applications. We found that PROMISESLAND correctly refactors asynchronous callbacks to promises, refactors 235% more callbacks than a tool from prior work, and runs in under three seconds on all of our evaluation targets.

Chapter 6

Conclusion and Future Work

In the first part of this thesis, we present an empirical study to characterize JavaScript callback usage across 138 large JavaScript projects.
These include 86 Node.js modules from the NPM public package registry used in server-side code and 62 subject systems from a broad spectrum of categories, such as JavaScript MVC frameworks, games, and data visualization libraries. Analyzing JavaScript code statically to identify callbacks and to characterize their properties for such a study presents a number of challenges. For example, JavaScript is a loosely typed language and its functions are variadic, i.e., they accept a variable number of arguments. We developed new JavaScript analysis techniques, building on prior techniques and tools, to identify callbacks and to measure their various features.

The focus of our study is on gaining an understanding of callback usage in practice. We study questions such as: how often are callbacks used, how deeply are callbacks nested, are anonymous callbacks more common than named callbacks, are callbacks used differently on the client-side as compared to the server-side, and so on. Finally, we measure the extent to which developers rely on the “error-first protocol” best practice, and the adoption of two recent proposals to mitigate callback-related challenges, the Async.js library [40] and Promises [12].

The results of our study show that (1) callbacks are passed as arguments more than twice as often in server-side code than in client-side code, i.e., 24% of all call-sites use a callback in server-side code, while only 9% use a callback in client-side code; (2) anonymous callbacks are used in 42% of all callback-accepting function call-sites; (3) there is extensive callback nesting, namely, most callbacks nest 2 levels, and some nest as deep as 8 levels; and (4) there is an extensive use of asynchrony associated with callbacks: 75% of all client-side callbacks were used in conjunction with built-in asynchronous JavaScript APIs.

These results indicate that existing JavaScript analyses and tools [37, 43] often make simplifying assumptions about JavaScript callbacks that might
not be true in practice. For example, some of them ignore anonymous callbacks, asynchrony, and callback nesting altogether. Our work stresses the importance of empirically validating assumptions made in the designs of JavaScript analysis tools.

We believe that our characterization of the real-world use of callbacks in different types of JavaScript programs will be useful to tool builders who employ static and dynamic analyses (e.g., which language corner-cases to analyze). Our results will also make language designers more aware of how developers use callback-related language features in practice.

In the second part of this thesis we present a set of program analysis techniques to detect instances of asynchronous callbacks and to refactor such callbacks, including callbacks with the error-first protocol, into promises.

We started with an exploratory study (Section 5.2) in which we examined several GitHub issues and pull-requests containing terms related to refactoring of asynchronous callbacks into promises. We found that developers frequently want to refactor existing code that uses asynchronous callbacks into code that uses promises (GitHub search returned over 4K issues related to this topic). Furthermore, based on our reading of a random sample of these issues, developers have a hard time understanding this refactoring process. GitHub search returned only 451 pull-requests related to this topic (a small number of actual transformations as compared to the number of requests). The pull-requests we studied also reveal that the most common style of refactoring is project-independent and amenable to automation. Although our study is small, we found no mention or use of refactoring tools: it seems that currently developers manually refactor asynchronous callbacks.

Our exploratory study provides support for the utility of an automated refactoring tool.
We propose a set of static analysis techniques that support automated refactoring by: (1) discovering instances of asynchronous callbacks and (2) transforming instances of asynchronous callbacks into promises. We implemented these techniques in a tool called PROMISESLAND and evaluated it on 21 open source JavaScript projects containing a total of 108,615 lines of code. We found that PROMISESLAND performs favorably against recent work [17] that transforms the error-first protocol into Dues [16], a simpler (non-standard) form of promises. Specifically, when evaluating PROMISESLAND on projects evaluated in [17], we found that our technique is able to refactor 235% more asynchronous callbacks than the tool proposed in [17]. PROMISESLAND runs in under three seconds on all of the projects we evaluated, and we verified the correctness of our implementation by testing all of the refactorings with the test-suites that are distributed with these projects: all the test cases passed after our refactorings, pointing to the behaviour-preserving nature of our technique. We also manually studied the code of four of the projects to evaluate the precision and recall of PROMISESLAND. We found that it has an average precision of 100% and an average recall of 83%. We believe these results point to the real-world relevance and efficacy of our techniques.

6.1 Future Work

As future work, we plan to improve the techniques we devised, to gain further insights about JavaScript applications and other areas in software engineering. For example, we plan to do further investigations to explain the differences we observed in usages of callbacks in client-side vs server-side JavaScript. Another avenue of future work we are interested in is understanding why developers use different variations of callbacks and which type is used when. With paradigms like Functional Reactive Programming (FRP) gaining traction, we also plan to investigate whether the ways developers are using callbacks are changing.
To complement the work of this thesis we also plan on studying how different usages of callbacks impact code quality and how error-prone different callbacks are, by investigating the correlation of callbacks with bug reports.

Much like in JavaScript, anonymous functions (or Lambda expressions) are present in other programming languages such as C#, Racket, Scheme, Python and Ruby, as well. We plan to extend the techniques we developed to analyze programs of those other languages and characterize software engineering challenges related to asynchronous execution.

Bibliography

[1] Error Handling in Node.js. https://www.joyent.com/developers/node/design/errors, 2014. Accessed: 2015-11-30.
[2] Most depended-upon NPM packages. https://www.npmjs.com/browse/depended, 2014. Accessed: 2015-11-30.
[3] Github Showcases. https://github.com/showcases, 2014. Accessed: 2015-11-30.
[4] CallMeBack. https://github.com/saltlab/callmeback, 2015. Accessed: 2015-11-30.
[5] Status, process, and documents for ECMA262. https://github.com/tc39/ecma262, 2015. Accessed: 2015-11-30.
[6] The ECMAScript language specification. http://wiki.ecmascript.org/doku.php?id=harmony:specification_drafts, 2015. Accessed: 2015-11-30.
[7] ECMA General Assembly Press Release. http://www.ecma-international.org/news/Publication%20of%20ECMA-262%206th%20edition.htm, 2015. Accessed: 2015-11-30.
[8] Escope. https://github.com/estools/escope, 2015. Accessed: 2015-11-30.
[9] Don’t Call Us, We’ll Call You: Characterizing Callbacks in JavaScript. Dataset release. http://salt.ece.ubc.ca/callback-study/, 2015. Accessed: 2015-11-30.
[10] Can I use Promises? http://caniuse.com/#feat=promises, 2015. Accessed: 2015-11-30.
[11] Promisland: implementation and empirical dataset. http://salt.ece.ubc.ca/software/promisland, 2015. Accessed: 2015-11-30.
[12] Promises/A+ Promise Specification.
https://promisesaplus.com, 2015. Accessed: 2015-11-30.
[13] S. Alimadadi, S. Sequeira, A. Mesbah, and K. Pattabiraman. Understanding JavaScript Event-based Interactions. In Proceedings of the Intl. Conf. on Software Engineering (ICSE), pages 367–377. ACM, 2014.
[14] S. Alimadadi, A. Mesbah, and K. Pattabiraman. Hybrid DOM-sensitive change impact analysis for JavaScript. In Proceedings of the European Conference on Object-Oriented Programming (ECOOP), pages 321–345, 2015.
[15] E. Andreasen and A. Møller. Determinacy in static analysis for jQuery. In Proc. ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), October 2014.
[16] E. Brodu. Due. https://github.com/etnbrd/due, 2015. Accessed: 2015-11-30.
[17] E. Brodu, S. Frénot, and F. Oblé. Toward Automatic Update from Callbacks to Promises. In Proc. of the Workshop on All-Web Real-Time Systems (AWeS), pages 1:1–1:8, New York, NY, USA, 2015. ACM. ISBN 978-1-4503-3477-8. doi:10.1145/2749215.2749216.
[18] M. Bruntink, A. van Deursen, and T. Tourwé. Discovering Faults in Idiom-based Exception Handling. In Proceedings of the International Conf. on Software Engineering (ICSE), pages 242–251. ACM, 2006.
[19] B. Cavalier. Async programming part 2: Promises. http://blog.briancavalier.com/async-programming-part-2-promises/, 2013. Accessed: 2015-11-30.
[20] D. M. Clements. Decofun. https://github.com/davidmarkclements/decofun. Accessed: 2015-11-30.
[21] A. Feldthaus and A. Møller. Semi-automatic rename refactoring for JavaScript. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA), pages 323–338. ACM, 2013.
[22] A. Feldthaus, T. Millstein, A. Møller, M. Schäfer, and F. Tip. Tool-supported refactoring for JavaScript.
In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA), pages 119–138, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0940-0. doi:10.1145/2048066.2048078.
[23] A. Feldthaus, M. Schäfer, M. Sridharan, J. Dolby, and F. Tip. Efficient construction of approximate call graphs for JavaScript IDE services. In Proceedings of International Conference on Software Engineering (ICSE), pages 752–761. IEEE, 2013.
[24] K. Finley. Github has surpassed sourceforge and google code in popularity. 2011. http://readwrite.com/2011/06/02/github-has-passed-sourceforge.
[25] E. Fortuna, O. Anderson, L. Ceze, and S. Eggers. A limit study of JavaScript parallelism. In Proceedings of Intl. Symposium on Workload Characterization (IISWC), pages 1–10, 2010.
[26] K. Gallaba, A. Mesbah, and I. Beschastnikh. Don’t call us, we’ll call you: Characterizing callbacks in JavaScript. In Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). IEEE Computer Society, 2015.
[27] M. Gligoric, F. Behrang, Y. Li, J. Overbey, M. Hafiz, and D. Marinov. Systematic testing of refactoring engines on real software projects. In G. Castagna, editor, Proceedings of the European Conference on Object-Oriented Programming (ECOOP), volume 7920 of Lecture Notes in Computer Science, pages 629–653. Springer Berlin Heidelberg, 2013. ISBN 978-3-642-39037-1. doi:10.1007/978-3-642-39038-8_26.
[28] L. Gong, M. Pradel, M. Sridharan, and K. Sen. Dlint: Dynamically checking bad coding practices in JavaScript. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), pages 94–105. ACM, 2015. ISBN 978-1-4503-3620-8. doi:10.1145/2771783.2771809.
[29] B. Hackett and S.-y. Guo. Fast and Precise Hybrid Type Inference for JavaScript. In Proceedings of the Conference on Programming Language Design and Implementation (PLDI), pages 239–250.
ACM, 2012.
[30] M. Haverbeke. Tern. https://github.com/marijnh/tern, 2015. Accessed: 2015-11-30.
[31] A. Hidayat. Esprima. https://github.com/jquery/esprima, 2015. Accessed: 2015-11-30.
[32] S. Hong, Y. Park, and M. Kim. Detecting Concurrency Errors in Client-Side JavaScript Web Applications. In Proceedings of International Conference on Software Testing, Verification and Validation (ICST), pages 61–70. IEEE, 2014.
[33] D. Jang, R. Jhala, S. Lerner, and H. Shacham. An empirical study of privacy-violating information flows in JavaScript web applications. In Proceedings of Conf. on Comp. and Communications Security, pages 270–283. ACM, 2010. doi:http://doi.acm.org/10.1145/1866307.1866339.
[34] S. H. Jensen, P. A. Jonsson, and A. Møller. Remedying the eval that men do. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), pages 34–44. ACM, 2012. ISBN 978-1-4503-1454-1. doi:10.1145/2338965.2336758.
[35] V. Kashyap, K. Dewey, E. A. Kuefner, J. Wagner, K. Gibbons, J. Sarracino, B. Wiedermann, and B. Hardekopf. JSAI: A static analysis platform for JavaScript. In Proceedings of the ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE), pages 121–132. ACM, 2014. ISBN 978-1-4503-3056-5. doi:10.1145/2635868.2635904.
[36] Y. P. Khoo, M. Hicks, J. S. Foster, and V. Sazawal. Directing JavaScript with arrows. ACM Sigplan Notices, 44(12):49–58, 2009.
[37] T. Lieber. Theseus: understanding asynchronous code. In Proceedings of the Conference on Human Factors in Computing Systems, pages 2731–2736. ACM, 2013.
[38] M. Madsen, F. Tip, and O. Lhoták. Static analysis of event-driven node.js JavaScript applications. In Proc. ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), 2015.
[39] J. Martinsen, H. Grahn, and A. Isberg.
A comparative evaluation of JavaScript execution behavior. In Proceedings of Intl. Conf. on Web Engineering (ICWE), pages 399–402. Springer, 2011.
[40] C. McMahon. Async.js. https://github.com/caolan/async. Accessed: 2015-11-30.
[41] F. Meawad, G. Richards, F. Morandat, and J. Vitek. Eval begone!: semi-automated removal of eval from JavaScript programs. In Proceedings of the International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA), pages 607–620. ACM, 2012.
[42] L. A. Meyerovich, A. Guha, J. Baskin, G. H. Cooper, M. Greenberg, A. Bromfield, and S. Krishnamurthi. Flapjax: A Programming Language for Ajax Applications. In Proceedings of the Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA), pages 1–20, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-766-0. doi:10.1145/1640089.1640091.
[43] A. Milani Fard and A. Mesbah. JSNose: Detecting JavaScript Code Smells. In Proceedings of the International Conference on Source Code Analysis and Manipulation (SCAM), pages 116–125. IEEE, 2013.
[44] E. Murphy-Hill and A. Black. Breaking the barriers to successful refactoring. In Proceedings of the International Conference on Software Engineering (ICSE), pages 421–430, May 2008. doi:10.1145/1368088.1368146.
[45] H. V. Nguyen, C. Kästner, and T. N. Nguyen. Building call graphs for embedded client-side code in dynamic web applications. In Proceedings of the ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE), pages 518–529. ACM, 2014. ISBN 978-1-4503-3056-5. doi:10.1145/2635868.2635928.
[46] N. Nikiforakis, L. Invernizzi, A. Kapravelos, S. Van Acker, W. Joosen, C. Kruegel, F. Piessens, and G. Vigna. You are what you include: Large-scale evaluation of remote JavaScript inclusions. In Proceedings of the Conf. on Computer and Comm. Security, pages 736–747. ACM, 2012.
[47] F. Ocariza, K. Bajaj, K.
Pattabiraman, and A. Mesbah. An Empirical Study of Client-Side JavaScript Bugs. In Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pages 55–64. IEEE, 2013.
[48] F. Ocariza, K. Pattabiraman, and A. Mesbah. Detecting inconsistencies in JavaScript MVC applications. In Proceedings of the ACM/IEEE International Conference on Software Engineering (ICSE), pages 325–335. ACM, 2015.
[49] M. Ogden. Callback Hell. http://callbackhell.com, 2015. Accessed: 2015-11-30.
[50] S. Okur, D. L. Hartveld, D. Dig, and A. v. Deursen. A Study and Toolkit for Asynchronous Programming in C#. In Proceedings of the Intl. Conf. on Software Engineering (ICSE), pages 1117–1127. ACM, 2014.
[51] V. Raychev, M. Vechev, and M. Sridharan. Effective race detection for event-driven programs. In Proceedings of the International Conference on Object Oriented Programming Systems Languages & Applications (OOPSLA), pages 151–166. ACM, 2013.
[52] G. Richards, S. Lebresne, B. Burg, and J. Vitek. An Analysis of the Dynamic Behavior of JavaScript Programs. In Proceedings of the Conference on Programming Language Design and Implementation (PLDI), pages 1–12. ACM, 2010.
[53] G. Richards, C. Hammer, B. Burg, and J. Vitek. The eval that men do. In Proceedings of the European Conference on Object-Oriented Programming (ECOOP), pages 52–78. Springer, 2011.
[54] G. Soares, R. Gheyi, D. Serey, and T. Massoni. Making program refactoring safer. Software, IEEE, 27(4):52–57, 2010.
[55] M. Sridharan, J. Dolby, S. Chandra, M. Schäfer, and F. Tip. Correlation tracking for points-to analysis of JavaScript. In Proceedings of the European Conference on Object-Oriented Programming (ECOOP), pages 435–458. Springer-Verlag, 2012. ISBN 978-3-642-31056-0. doi:10.1007/978-3-642-31057-7_20.
[56] Stack Overflow. 2015 Developer Survey. http://stackoverflow.com/research/developer-survey-2015, 2015.
Accessed:2015-11-30. → pages 170[57] G. J. Sussman and G. L. Steele Jr. Scheme: A interpreter for extendedlambda calculus. Higher-Order and Symbolic Computation, 11(4):405–439,1998. → pages 6[58] Y. Suzuki. Estraverse. https://github.com/estools/estraverse, 2015.Accessed: 2015-11-30. → pages 15, 53[59] TypeScript. TypeScript. http://www.typescriptlang.org, 2015. Accessed:2015-11-30. → pages 12, 28[60] S. Wei and B. G. Ryder. State-sensitive points-to analysis for the dynamicbehavior of JavaScript objects. In Proceedings of European Conference onObject-Oriented Programming (ECOOP), pages 1–26. Springer, 2014. →pages 12[61] J. Weinberger, P. Saxena, D. Akhawe, M. Finifter, R. Shin, and D. Song. AnEmpirical Analysis of XSS Sanitization in Web Application Frameworks.Technical Report EECS-2011-11, UC Berkeley, 2011. → pages 11[62] C. Yue and H. Wang. Characterizing insecure JavaScript practices on theweb. In Proceedings of Intl. Conf. on World Wide Web (WWW), pages961–970. ACM, 2009. doi:http://doi.acm.org/10.1145/1526709.1526838. →pages 11[63] Y. Zheng, T. Bao, and X. Zhang. Statically locating web application bugscaused by asynchronous calls. In Proceedings of the InternationalConference on World Wide Web (WWW), pages 805–814. ACM, 2011. →pages 1271
