UBC Theses and Dissertations

Performance improvements in crawling modern Web applications

Zarei, Alireza


Today, a considerable portion of our society relies on Web applications to perform numerous tasks in everyday life, for example, transferring money by wire or purchasing flight tickets. To ensure that such pervasive Web applications perform robustly, various tools have been introduced by the software engineering research community and industry. Web application crawlers are one such tool, used in the testing and analysis of Web applications. Software testing, and in particular the testing of Web applications, plays an essential role in ensuring the quality and reliability of software systems. In this thesis, we aim to optimize the crawling of modern Web applications in terms of memory and time performance. In contrast to classic Web applications, modern Web applications are event-driven and have dynamic states. To improve the crawling process of modern Web applications, we focus on state transition management and on the scalability of the crawling process. To improve the time performance of the state transition management mechanism, we propose three alternative techniques, revised incrementally. In addition, to increase state coverage, i.e., the number of states crawled in a Web application, we propose an alternative solution for the storage and retrieval of dynamic states that reduces memory consumption. Moreover, we perform a memory analysis using memory profiling tools to identify areas for memory performance optimization. The proposed enhancements improve the time performance of state transition management by 253.34%. That is, the default state transition management takes 3.53 times as long as the proposed solution, which in turn means that time consumption is reduced by 71.69%. Moreover, the scalability of the crawling process is improved by 88.16%. That is, the proposed solution covers a considerably greater number of states when crawling Web applications. Finally, we identify the scalability bottlenecks to be addressed in future work.


Attribution-NonCommercial-ShareAlike 2.5 Canada