Performance Improvements in Crawling Modern Web Applications

by

Alireza Zarei

B.Sc., Amirkabir University of Technology (Tehran Polytechnic), 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF APPLIED SCIENCE
in
The Faculty of Graduate and Postdoctoral Studies
(Electrical and Computer Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

February 2014

© Alireza Zarei 2014

Abstract

Today, a considerable portion of our society relies on Web applications to perform numerous tasks in everyday life, for example transferring money over wire or purchasing flight tickets. To ensure that such pervasive Web applications perform robustly, various tools have been introduced in the software engineering research community and in industry. Web application crawlers are one instance of such tools, used in the testing and analysis of Web applications. Software testing, and in particular testing Web applications, plays an imperative role in ensuring the quality and reliability of software systems. In this thesis, we aim at optimizing the crawling of modern Web applications in terms of memory and time performance.

Modern Web applications are event driven and have dynamic states, in contrast to classic Web applications. Aiming at improving the crawling process of modern Web applications, we focus on state transition management and on the scalability of the crawling process. To improve the time performance of the state transition management mechanism, we propose three alternative techniques, revised incrementally. In addition, aiming at increasing state coverage, i.e. the number of states crawled in a Web application, we propose an alternative solution for the storage and retrieval of dynamic states that reduces memory consumption. Moreover, a memory analysis is performed using memory profiling tools to investigate further areas of memory performance optimization.

The proposed enhancements improve the time performance of the state transition management by 253.34%. That is, the time consumption of the default state transition management is 3.53 times that of the proposed solution, which in turn means the time consumption is reduced by 71.69%. Moreover, the scalability of the crawling process is improved by 88.16%; that is, the proposed solution covers a considerably greater number of states when crawling Web applications. Finally, we identify the bottlenecks of scalability to be addressed in future work.

Preface

This thesis is original, independent work, designed, carried out, and analyzed by the author, Alireza Zarei.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1 Introduction
2 Background
  2.1 Web Applications
    2.1.1 HTML and DOM
    2.1.2 Modern vs. Classic Web Applications
  2.2 Crawling Web Applications
    2.2.1 Crawling Classic Web Applications
    2.2.2 Crawling Modern Web Applications
    2.2.3 Problem and Proposed Solutions
  2.3 Performance Improvements
    2.3.1 Time Performance Improvement
    2.3.2 Memory Performance Improvement
  2.4 Graph Databases
3 Reverse Engineering Crawljax for Optimization
  3.1 Crawljax High Level Algorithm
  3.2 Reverse Engineering Crawljax
    3.2.1 Investigating Crawljax for Alternative Solutions
4 Motivation and Research Goals
  4.1 Motivation
  4.2 Research Questions
    4.2.1 Improving Time Performance
    4.2.2 Increasing State Coverage
  4.3 Goals
    4.3.1 Time Performance
    4.3.2 Memory Performance
  4.4 Methodology
5 Optimizing State Transition Management
  5.1 Tracking State Changes in Web Applications
  5.2 Foundations for the Proposed Techniques
    5.2.1 Tracking DOM Mutations
    5.2.2 Mutation Events
    5.2.3 Mutation Observers
    5.2.4 Mutation-Summary Library
    5.2.5 Proxies
  5.3 Alternative Methods for Tracking State Changes
    5.3.1 Shared Characteristics of the Solutions
    5.3.2 Proxy-Based Solution
    5.3.3 Plugin-Based Solution
    5.3.4 Plugin-Based Solution with In-Memory Agent
  5.4 Evaluation
    5.4.1 Research Questions Broken Down
    5.4.2 Experimental Design and Methodology
    5.4.3 Results
6 Increasing the Number of States Crawled
  6.1 Memory Management
    6.1.1 Java Virtual Machine Heap
    6.1.2 Garbage Collection
  6.2 Criteria for Alternative Solutions
  6.3 Criteria for Selecting a Graph Database
    6.3.1 Required Functionalities
    6.3.2 Licensing
    6.3.3 Partial Locks
    6.3.4 Storing Objects in Nodes and Edges
    6.3.5 API for Java and Maven Availability
  6.4 Comparison of Graph Databases
  6.5 Scalable Solution
    6.5.1 Memory Performance Trade-Offs
    6.5.2 Time Performance Trade-Offs
  6.6 Evaluation
    6.6.1 Research Questions
    6.6.2 Experimental Design and Methodology
    6.6.3 Results
  6.7 Further Memory Analysis
7 Discussion
  7.1 State Transition Management
    7.1.1 RQ1.A: Time Performance of the Proxy-Based Solution
    7.1.2 RQ1.B: Time Performance of the First Plugin-Based Solution
    7.1.3 RQ1.C: Time Performance of the Second Plugin-Based Solution
  7.2 Scalability
    7.2.1 RQ2.0: Determinism of the Crawler
    7.2.2 RQ2.A: Correctness
    7.2.3 RQ2.B: Memory Performance
    7.2.4 RQ2.C: Time Performance
  7.3 Threats to Validity
    7.3.1 Internal Validity
    7.3.2 External Validity
  7.4 Future Work
    7.4.1 Further Scalability through String Optimization
    7.4.2 Further Scalability through Candidate Elements Optimization
    7.4.3 Scalability Improvement by Text Compression Techniques
    7.4.4 Improving Time Performance Targeting
8 Related Work
  8.1 Optimizing the Crawling Process
  8.2 Graph Database Applications
9 Conclusion
Bibliography
Appendix A Deterministic Candidates

List of Tables

4.1 Average DOM size in characters
5.1 State transition management experiments objects
5.2 Proxy-based time consumption (milliseconds); the proxy-based "time overhead" is much greater than the time improvements, i.e. the "saved time".
5.3 Standard deviations of proxy-based time consumption
5.4 Relative standard deviation percentages (absolute values of the coefficient of variation) of proxy-based time consumption
5.5 Maximum values of proxy-based time consumption (milliseconds)
5.6 Minimum values of proxy-based time consumption (milliseconds)
5.7 Proxy-based bypassed comparisons: detailed information on events fired, comparisons categorized as unnecessary, false negatives and false positives.
5.8 External-js plugin-based solution times (milliseconds): the time improvement of the first plugin-based solution is much greater than the time overhead it imposes on the crawling process.
5.9 Standard deviations of external-js plugin-based time consumption
5.10 Relative standard deviation percentages (absolute values of the coefficient of variation) of external-js plugin-based time consumption
5.11 Maximum values of external-js plugin-based time consumption (milliseconds)
5.12 Minimum values of external-js plugin-based time consumption (milliseconds)
5.13 External-js plugin-based bypassed comparisons: detailed information on events fired, comparisons categorized as unnecessary, false negatives and false positives.
5.14 Internal-js plugin-based solution times (milliseconds): the second plugin-based solution overtakes the default process; the time it saves is greater than the time overhead it imposes on the crawling process.
5.15 Standard deviations of internal-js plugin-based time consumption
5.16 Relative standard deviation percentages (absolute values of the coefficient of variation) of internal-js plugin-based time consumption
5.17 Maximum values of internal-js plugin-based time consumption (milliseconds)
5.18 Minimum values of internal-js plugin-based time consumption (milliseconds)
5.19 Internal-js plugin-based bypassed comparisons: detailed information on events fired, comparisons categorized as unnecessary, false negatives and false positives.
6.1 Graph databases comparison: Neo4j emerged as the graph database most suited to our application.
6.2 Graph databases comparison scale levels
6.3 Correctness experiments objects
6.4 Scalability experiments objects
6.5 Scalability experiments results: the proposed version outperforms the default crawler in terms of the number of states crawled.
6.6 Time performance experiments results: the default crawler performs more efficiently in most cases; however, in two cases the proposed solution overtakes the default crawler, and in six out of ten cases the time overhead is less than 10%.
6.7 Initial memory experiment specifications
6.8 Second memory experiment specifications

List of Figures

2.1 Ajax vs. classic Web application models. Source: [1].
3.1 Crawljax high-level algorithm
3.2 Crawljax high-level algorithm with more details
5.1 Generic solution components
5.2 Proxy-based solution components
5.3 Plugin-based solution components: external injection
5.4 Plugin-based solution components: internal injection
5.5 Proxy-based solution time performance: the time overhead imposed by the proxy-based solution is much greater than the time it saves by bypassing unnecessary comparisons; hence, the default crawling performs more efficiently. The bars represent the averages taken over 5 rounds of experiments presented in Table 5.2.
5.6 External-js plugin-based solution times: the first plugin-based solution, which employs a Web server and injects scripts externally, overtakes the default process. The time improvement on average is much greater than the overhead imposed. The bars represent the averages taken over 5 rounds of experiments presented in Table 5.8.
5.7 Internal-js plugin-based solution times: the time the second plugin-based solution saves is greater than the time overhead it imposes on the process; hence, the second plugin-based solution also overtakes the default state transition management system. The bars represent the averages taken over 5 rounds of experiments presented in Table 5.14.
6.1 Java heap divided into generations, and the runtime options. Source: [2].
6.2 Scalable vs. default number of states crawled: in all cases the proposed version overtakes the default crawler in terms of the number of states crawled given a limited amount of heap memory.
6.3 Scalability improvements: the percentage of increase in the number of states crawled.
6.4 Scalable vs. default time performance: the proposed version overtakes the default crawler in only two cases out of ten, and in six cases the time overhead is less than 10%. However, on average the time overhead is 18.20%.
6.5 Memory consumption of the default version: the experiment is carried out on Google.com and the memory consumption is demonstrated.
6.6 Memory consumption of the proposed version: the experiment is carried out on Google.com and the memory consumption is demonstrated.
6.7 Memory consumption of the default version: the experiment is carried out on Yahoo.com and the memory consumption is demonstrated.
6.8 Memory consumption of the proposed version: the experiment is carried out on Yahoo.com and the memory consumption is demonstrated.
6.9 Default memory consumption: the lowest curve (in red) shows old generation allocation; on top of that, the young generation (in salmon) is presented. The highest curve (in blue) shows the maximum amount of memory available to be allocated, i.e. the Java heap size. The important observation here is that the old generation allocation never drops, as all the states are retained in memory. In this experiment the Java maximum heap size is set to 1 gigabyte and 330 states are crawled.
6.10 Improved memory consumption: the lowest curve (in red) shows old generation allocation; on top of that, the young generation is presented (in salmon). The highest curve (in blue) shows the maximum Java heap memory available to be allocated. One of the results of our work is that the old generation allocation occasionally drops, i.e. state data is freed. This enables the crawler to discover 542 states using 1 gigabyte of memory.

Acknowledgements

I wish to thank my supervisor, Dr. Ali Mesbah, for his careful supervision, guidance and support in carrying out the projects constituting this thesis. I would also like to express my gratitude and appreciation for the internship opportunities he brought about for me during my M.A.Sc. program.
Without doubt, this work would not have been possible without his generous support, help and close attention.

I would like to thank Peter Luong, president and founder of FusionPipe Software Solutions, for creating internship opportunities which gave me experience in preparing for industry and supported me in completing my program. I would also like to thank him for his kind supervision and his generous attitude in sharing his invaluable expertise with me during the projects. Special thanks to Sang Mah, whose enthusiasm, extraordinarily positive attitude, and tireless efforts at MITACS brought about numerous fruitful industry experiences for students such as me at UBC.

Moreover, I would like to thank my committee members, Dr. Karthik Pattabiraman and Dr. Sathish Gopalakrishnan, for kindly agreeing to participate in the committee evaluating my thesis. I would like to thank them for their utmost generosity with their time, their great improvement ideas, and their constructive feedback for revising my thesis.

I would also like to thank Dr. Pooyan Fazli for providing me with his effective mentorship and valuable time during the course of the program.

Last but not least, I would like to thank our friendly community in the Software Analysis and Testing (SALT) lab, who helped me greatly during the course of this research with their encouragement, support and feedback.

Dedication

I would like to dedicate this work to my family: my parents, whose love and support have always filled my heart with strength and hope. I wish to dedicate this work to them for giving me the opportunity to follow my heart and pursue my dreams. I would also like to dedicate this thesis to my dearest sisters, Mina and Maryam, who are the meaning of hope for me.

I would also like to dedicate this thesis to my friends, whose encouragement, support and understanding have always empowered me and kept me going during the most difficult times.

Chapter 1: Introduction

Nowadays, Web applications play an important role in performing various day-to-day tasks in businesses, industry, and individuals' daily lives. Performing tasks such as on-line banking or booking flight tickets has become possible through the prevalence of Web applications. The pervasiveness of Web applications in carrying out tasks in modern societies calls for ensured reliability and enhanced quality. To that aim, various tools have been introduced to help Web developers and quality assurance professionals deliver more robust Web applications. Web application crawlers are among the most important tools in the area of Web application development and search engines. Web crawlers are used in testing Web applications as well as in realizing the indexing of the World Wide Web (WWW) by search engines.

Traditionally, Web applications were composed of a number of self-reliant documents identified by unique Uniform Resource Locators (URLs) [3]. These documents were linked to each other by means of inter-document references called hypertext links or hyperlinks. As such, in order to navigate through a Web application in a Web browser, each time a hyperlink was clicked a new document was retrieved and loaded in the browser window, replacing the previous content of the browser window. This design of Web applications resulted in inefficient time and bandwidth consumption, because even for a minor change needed in the document loaded in the browser, the whole document had to be transmitted and replaced.

As a significant step towards the maturity of Web applications, in addition to the hyperlink-based URL transition mechanism, "modern" Web applications utilize "Ajax" technologies to provide "interactivity" and "responsiveness" [4] for users. Ajax technologies, including scripting languages such as JavaScript [5] and asynchronous method calls, help achieve dynamic Web applications. Dynamic modern Web applications boost interactivity by minimizing the time users need to wait for the application to become responsive again after each interaction, such as clicking on a photo. The advent of modern Web applications greatly improved the user experience, but on the other hand Web crawlers have to pay a price, because crawling event-driven modern Web applications is inherently more resource demanding.

Web application crawlers automate the crawling of Web applications by exploring hyperlink-based transitions as well as other types of transitions such as Ajax-based ones. These multi-layer tools are used for various purposes. They can be used for indexing Web applications in search engines. They are also used as a means for testing Web applications. Crawljax [4], the modern Web crawler discussed in this thesis, is able to explore JavaScript-enabled Web applications.

Crawling modern Web applications is inherently a resource-demanding (e.g. time, compute, memory) task for various reasons. First, as opposed to crawling classic static HTML Web pages, modern crawlers need to execute JavaScript and apply the dynamic changes to the state of the Web application being crawled, whereas in crawling traditional HTML-based Web applications a transition from one state to another is more of a retrieval and parsing process than an execution one. The event-driven model of modern Web applications, the execution of scripts, and the navigation of dynamic transitions are more time consuming than merely retrieving pages and following static URLs. The reason is that a modern Web crawler needs to wait for a browser (or another tool that can execute JavaScript) to load the content and execute the JavaScript.

In addition, modern Web application crawlers are composed of various layers and technologies working together to make the crawling process feasible. Especially the additional components for executing JavaScript and managing state navigation impose considerable resource requirements on the crawling process. For example, in crawling modern Web applications the crawler needs to communicate with the execution layer (e.g. a browser). These inter-process communications add sizable time consumption to the crawling process.

In this work, we aim at improving the memory and time performance of the crawling process for modern Web applications. We investigate two components of the crawling process to improve its time and memory efficiency. First, we focus on the changes that occur in Web applications and induce state transitions; by doing so, we aim to improve the time efficiency of the state transition mechanism of the crawling process. In addition, we investigate the memory utilization of the crawling process to improve the state coverage of the crawling; in other words, we aim at increasing the number of states crawled given a limited amount of resources (memory).

In the optimization of the state transition management, we propose three different solutions, improved incrementally over one another, to reduce the time consumption of tracking changes in the crawling process. We employ different methods and techniques to achieve this improvement in time performance.

In order to increase the number of states crawled given a limited amount of memory, we introduce a new component to the crawler that frees the memory required for saving the states of the Web application during the crawling process. To this aim, we replace the current storage and retrieval component, which relies on main memory, with our proposed solution to crawl a greater number of states.

The first alternative solution we propose, utilizing a proxy, did not overtake the default state transition management. However, the second enhancement, without a proxy, was able to improve the time performance of the state transition management by 253.34%; that is, the time consumption of the default state transition management is 3.53 times that of the proposed solution. The third alternative solution also overtook the default mechanism, by 197.48%. The second solution demonstrated the best performance among all available mechanisms.

Moreover, the scalability of the crawling process is improved by 88.16%; that is, the proposed solution covers a considerably greater number of states in crawling Web applications. This scalability improvement was achieved at the cost of an 18.20% time overhead. Finally, we identified the bottlenecks of scalability to be addressed in future work.

The rest of this work is organized as follows. In Chapter 2 we present the background information related to the concepts and technologies used in this work. In Chapter 3 we provide an overview of Crawljax, the crawler on which our ideas are examined; this overview is based on the reverse engineering effort we undertook to shed light on the rapidly changing characteristics of the crawler. Chapter 4 covers the motivation and the research goals. What follow afterward are the two main chapters covering the improvements we targeted in the crawling process: the time performance of the state transition management in Chapter 5, and increasing the number of states crawled in Chapter 6. Afterward, we revisit the research questions and address threats to validity and future work in the discussion in Chapter 7. Finally, a review of the related work concludes the thesis.

Chapter 2: Background

In this chapter we briefly touch upon the concepts and information that constitute the foundations on which this thesis is based. We start by shedding some light on Web application concepts, because Web applications are discussed as the objects of the experiments throughout the thesis. Web applications are directly used in designing the experiments, and they are also pivotal in explaining many parts of the thesis, such as the crawling process in Web crawlers and the memory analysis we performed. In addition, we briefly explain the crawling of both traditional and modern Web applications. Afterward, we proceed to pinpoint the aspects of the crawling process that we aim to improve.

2.1 Web Applications

A Web application is any piece of software that is built to be run in a common Web browser such as Mozilla Firefox®. In other words, a Web application is a special type of client-server application that uses a Web browser as its client.
Web applications usually consist of two components (sides): a client side and a server side. In this thesis we focus only on the client side. The area of Web application development is richly diverse, and there are manifold technologies, languages and standards used in developing Web applications. We give a short overview of some related technologies in the following.

Web applications constitute a considerable portion of deployed software in today's computing and business ecosystems. Web-mail services, news broadcasting Web sites and on-line banking portals are examples of Web applications used on a daily basis. Web applications vary greatly in complexity and size, from only a few lines of static HTML documents to hundreds of thousands of lines of code in a dynamic multi-tier application.

The technologies, languages, approaches and techniques used in creating Web applications are heterogeneously diverse. To name a few, HyperText Markup Language (HTML), eXtensible Markup Language (XML) and Cascading Style Sheets (CSS) are used for creating the content and defining the presentation of the content in the client-side user interface of Web applications. Scripting languages such as JavaScript [5] (ECMAScript [6]) empower the client side of Web applications with dynamic execution. They also provide control over content modifications and help reduce the amount of data transferred between the browser and the server by enabling dynamic updates to the user interface. On the server side, PHP, ASP.NET, and Java technologies are popular for building the engines of Web applications. The languages and technologies used in Web development are too numerous to be covered here, so in the rest of this chapter we only explain the concepts that are directly pertinent to our work.

2.1.1 HTML and DOM

The information openly published on the Internet is most effective when it can serve a vast audience. In order to maximize the reach of the content published on the Web, in the early days of the Internet a universal language that all computers could agree on was a desideratum. To that aim, HTML, developed by Tim Berners-Lee and popularized in the 1990s, became the "mother tongue" of the World Wide Web (Web). HTML is used in developing Web applications to create content, retrieve documents through hypertext links, add forms for accessing services, and include multimedia in a page [7]. HTML's role in Web applications is interrelated with its live, in-memory representation, which is called the HTML DOM.

As set out by the World Wide Web Consortium [8] (W3C) in the specification of the Document Object Model (DOM), the DOM is a language- and platform-independent convention that aims at unifying the usage of documents in computer programs. Using the DOM interface, programs and scripts achieve a unified way of accessing and modifying documents. Numerous types of documents, including XML and XHTML documents, which are incremental improvements over HTML, are used in the area of Web applications.

When an HTML document is retrieved from a Web server, the browser parses the page and builds an object representing the page based on the DOM specification. This object is called the DOM object or DOM tree. The scripts of the Web application (e.g. JavaScript) can access and modify the DOM tree. These accesses and manipulations can retrieve and modify the presentation, content and structure of the DOM tree. The DOM is also used by the layout engines of browsers: layout engines such as WebKit [9] use the DOM interface to render the presentation of Web pages.
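Since the crawler discussed in this thesis constantly extracts, parses and compares DOM trees, it is worth seeing how a DOM tree can be built and queried programmatically outside the browser. The following is a minimal sketch in Java using the jsoup HTML parser; jsoup and the example markup are illustrative assumptions for this sketch, not part of Crawljax's actual implementation.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class DomExample {
    public static void main(String[] args) {
        // Parse an HTML string into an in-memory DOM-like tree.
        String html = "<html><body><div id='news'><a href='/story1'>Story 1</a></div></body></html>";
        Document dom = Jsoup.parse(html);

        // Query the tree: select the first anchor element and read its text.
        Element link = dom.select("a").first();
        System.out.println("link text: " + link.text());

        // Modify the tree, much as client-side scripts modify the browser's DOM.
        dom.getElementById("news").append("<span>updated</span>");
        System.out.println(dom.body().html());
    }
}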
2.1.2 Modern vs. Classic Web Applications

The event-driven model of modern Web applications utilizes "Ajax" technologies to reduce the responsiveness gap between classic Web applications and desktop applications. Ajax is a collective short form for the combination of Asynchronous JavaScript and XML. Modern Web applications use different Ajax technologies: for example, XML and HTTP requests are used for communicating data, the HTML DOM for the live presentation of user interfaces, and CSS for a presentation layer over the content. In addition, JavaScript is used for DOM manipulation and for tying all the aforementioned technologies together [1].

As shown in Figure 2.1 from [1], the model of modern Web applications differs from that of classic Web applications in terms of the data exchange method and user interaction. In classic Web applications, the application is a set of separate "whole" HTML pages with logical links. These pages are connected together by hypertext links. In classic Web applications, the browser issues an HTTP request to the server of the application once a user clicks on a link or submits a form. The server then analyzes the request and creates the response as a complete new HTML page [1].

Figure 2.1: Ajax vs. classic Web application models. Source: [1].

On the other hand, in modern Web applications an Ajax engine enhances the user experience by eliminating the need to reload a whole new page each time data is exchanged with the server. That is, users' actions on a modern Web application result in asynchronous requests being sent to the server of the application. In the meantime, the application remains responsive and does not wait on hold while the server processes the request and sends the data back. The key difference here is that instead of reloading the whole application as a new HTML page, the Ajax engine processes the response data and applies partial changes to the DOM of the application. Hence, the data exchange is minimal and the user does not need to wait for the response and a complete reload of the application [1].

As modern Web applications are richer and have a greater number of components than classic Web applications, the crawling method for each category is different. In the following we give an overview of the crawling of each of these two types of Web applications and illustrate how they differ.

2.2 Crawling Web Applications

Crawling Web applications is the process of exploring the structure and the content of Web applications. In classic Web applications, exploring the application is carried out by following hypertext links. In modern Web applications, on the other hand, elements such as div and span can also be sources of transition from one page of the application to another. Hence, the crawling process is to some extent more resource intensive for modern Web applications. We cover the differences in crawling the two classes of Web applications in what follows.

2.2.1 Crawling Classic Web Applications

Hypertext links are pivotal in crawling classic Web applications. A classic Web crawler, comparable to the way general-purpose search engines such as Google® crawl, starts from a provided URL, retrieves the page, parses the page, extracts all absolute and relative URLs, and adds them to the queue designated for URLs to be crawled next. This process is repeated recursively until the queue is emptied or certain conditions are met.
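To make this hyperlink-driven process concrete, the sketch below shows a minimal breadth-first classic crawler in Java. It again assumes the jsoup library for fetching and parsing pages; the seed URL and the limit of 100 pages are illustrative, and the sketch deliberately omits robustness concerns such as politeness delays and error handling.

import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ClassicCrawler {
    public static void main(String[] args) throws Exception {
        Queue<String> frontier = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        frontier.add("http://example.com/");               // seed URL

        while (!frontier.isEmpty() && visited.size() < 100) {
            String url = frontier.poll();
            if (!visited.add(url)) {
                continue;                                   // already crawled
            }
            Document page = Jsoup.connect(url).get();       // retrieve and parse the page
            for (Element link : page.select("a[href]")) {
                String target = link.attr("abs:href");      // resolve relative URLs
                if (!target.isEmpty() && !visited.contains(target)) {
                    frontier.add(target);                    // schedule for a later iteration
                }
            }
        }
    }
}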
2.2.2 Crawling Modern Web Applications

In modern Web applications, on the other hand, dynamic state changes must be taken into account. In crawling modern Web applications, the presence of an Ajax engine in the browser is pivotal. This makes crawling modern Web applications different in a number of ways. First, there are elements other than hypertext links (e.g. div) that can transition the state of the application; these transitions are followed to discover new states and information. In addition, the dynamic nature of JavaScript requires the crawling process to wait for the execution of JavaScript and to allow the DOM tree to settle after firing events.

Because JavaScript-based reactions (function invocations) can be attached to almost any element in a Web page, modern Web application crawlers, in addition to parsing each page and extracting its hypertext links, need to examine other types of elements when exploring Web applications. Crawljax [4], a modern Web application crawler, for example, can be configured to examine a chosen set of HTML elements for exploring Web applications.

As a result of the differences mentioned above, the crawling process of modern Web applications is more involved than that of classic applications. The crawling algorithm for a modern Web crawler such as Crawljax starts with opening a URL and extracting all hypertext links and elements such as div, button, and span, saving them in a candidates queue. Afterward these elements are polled one by one and an event (such as click or hover) is fired on each of them. As non-href (non-hyperlink) elements do not guarantee a change in the URL, a sufficient amount of time is required to allow the dynamic changes and server communications to finish and to let the DOM settle. Then the crawler checks whether more information has been discovered. If so, the newly discovered parts of the application are parsed for candidate elements, and any elements found are added to the queue. This process continues recursively.
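The sketch below illustrates the event-firing step described above using Selenium WebDriver, the browser automation layer that Crawljax builds on. It clicks one candidate element, waits briefly for scripts and asynchronous calls to settle, and reads the DOM back for comparison. The fixed 500 ms wait, the choice of candidate element, and the naive string comparison are illustrative simplifications, not Crawljax's actual behavior.

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class FireEventExample {
    public static void main(String[] args) throws InterruptedException {
        WebDriver driver = new FirefoxDriver();
        driver.get("http://example.com/");

        String domBefore = driver.getPageSource();        // DOM snapshot before the event

        // Pick one candidate element (here simply the first div) and fire a click on it.
        WebElement candidate = driver.findElement(By.tagName("div"));
        candidate.click();

        // Give script execution and asynchronous server calls time to settle the DOM.
        Thread.sleep(500);

        String domAfter = driver.getPageSource();          // DOM snapshot after the event
        boolean stateChanged = !domAfter.equals(domBefore);
        System.out.println("state changed: " + stateChanged);

        driver.quit();
    }
}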
2.2.3 Problem and Proposed Solutions

As explained earlier, compared to crawling classic Web applications, crawling modern Web applications requires carrying out a greater number of steps. In particular, requirements such as examining a much wider set of HTML elements (e.g. span, href, and div), executing JavaScript code, and communicating with the server and the browser at each state make the crawling of modern Web applications more resource consuming.

Crawling modern Web applications is resource intensive. Resources such as time and memory are consumed to a considerable degree by modern crawlers. The reason is that the crawling process needs to employ additional components for performing the sub-tasks of modern crawling that do not exist in classic crawling.

There are a number of steps which are pertinent only to modern crawling. First, handling the execution of JavaScript has to be done in one way or another, such as by employing a browser. In addition, tracking dynamic state changes must be handled, as any interaction can lead to a new state in the application. Furthermore, managing the interactive communications between the client and the server in each dynamic state is essentially more time consuming than merely retrieving the data.

In order to mitigate the aforementioned resource-demanding qualities of crawling modern Web applications, in this thesis we aim at identifying and tackling some of the performance problems of the crawling process. We specifically focus on improving the time and memory performance of the crawling process. Regarding time performance, we focus on the state transition management of the crawling process; regarding memory performance, we investigate the state storage and retrieval aspects of the crawling process. The discussion continues by expanding upon the performance improvements that are addressed later in this thesis.

2.3 Performance Improvements

Improving the performance of a system is defined as tuning the way resources are consumed by the system in achieving its goals. As mentioned earlier, crawling modern Web applications imposes extra resource requirements on the underlying system the crawler is running on. Time, memory, and compute resources are examples of system resources that are consumed more intensively by a modern Web application crawler compared to classic crawlers. Hence, we aim at improving the time and memory performance of the crawling process by tuning the way specific aspects of the crawling are carried out.

2.3.1 Time Performance Improvement

The process of crawling modern Web applications is more time consuming than the classic one because modern crawling employs more layers of tools working together to achieve the task of crawling. The time needed for performing inter-process communications, performing rounds of requests and responses between the client and the server, and waiting for the user interface to settle after interactions are among the time-consuming tasks performed by the crawling process. In addition, the way specific sub-tasks such as managing the state transitions are performed in the crawling process creates optimization opportunities. In this work, we focus on the state transition management of the crawling process to reduce its time consumption.

2.3.2 Memory Performance Improvement

Memory is one of the system resources that are extensively used in the crawling process. Loading the layers of different tools and components that are necessary for performing the crawling is an important inducer of memory consumption. In addition, parsing HTML pages to extract HTML elements requires significant memory allocation. Another critical part of the crawling process is handling the storage, comparison, retrieval and identification of the states. In this work, we focus on improving the memory performance of this last part of the crawling process in order to improve the scalability of the crawling.

We utilize a graph database to overcome the obstacles hindering the scalability of the crawling process. Graph databases are introduced briefly in the next section to facilitate the presentation of their application to the crawler.
2.4 Graph Databases

Graph databases are a sub-category of NoSQL databases. NoSQL databases, also called "Not only SQL" databases, are a category of databases that provide ways other than SQL for storing and retrieving data. A graph database utilizes a graph data structure for storing and accessing data. The key point is that each item in a graph database provides direct, "look-up free" references to its adjacent elements; in other words, a graph database provides index-free adjacency.

Graph databases are optimized for use in software systems that operate on data that is naturally representable by graphs. The reason is that graph databases make use of graph structures, graph nodes, and graph edges for designing the data model and storing the data. Aside from the ease of modeling graph-like data, graph databases are geared to provide efficient mechanisms for performing graph-specific operations such as finding the shortest path between two nodes in a graph.

Graph databases vary in terms of capabilities, features and the scope for which they are designed. For example, some graph databases provide transaction handling and concurrent access while others do not. There are general-purpose graph databases and more specialized ones such as triple stores. Graph databases are widely used in industry, especially by social networking Web applications. For example, FlockDB [10] is used to support the storage and retrieval needs of the system underlying Twitter® [11].
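To give a flavor of the node-and-edge model, and because Neo4j is the graph database eventually selected in Chapter 6, the sketch below stores two dynamic states and one transition using Neo4j's embedded Java API. The API shown is typical of the Neo4j 2.x line that was current at the time of this work; the database path, property names and relationship type are illustrative assumptions rather than the actual schema used in our solution.

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;

public class GraphDbExample {
    // Edge label for state transitions; the name is an illustrative assumption.
    private enum RelTypes implements RelationshipType { TRANSITION }

    public static void main(String[] args) {
        GraphDatabaseService graphDb =
                new GraphDatabaseFactory().newEmbeddedDatabase("target/stateflow-db");

        try (Transaction tx = graphDb.beginTx()) {
            // Each node stores the serialized DOM of one dynamic state.
            Node index = graphDb.createNode();
            index.setProperty("dom", "<html>...index state...</html>");

            Node next = graphDb.createNode();
            next.setProperty("dom", "<html>...state 1...</html>");

            // The edge records which event led from one state to the other.
            Relationship edge = index.createRelationshipTo(next, RelTypes.TRANSITION);
            edge.setProperty("event", "click");
            edge.setProperty("element", "//DIV[@id='news']");

            tx.success();  // mark the transaction as successful before it closes
        }
        graphDb.shutdown();
    }
}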
We implement the aforementioned improvements as enhancements to Crawljax and perform empirical experiments to evaluate them. Hence, in the next chapter Crawljax is discussed in detail to explain its default algorithm and the enhancements we propose for the crawling process.

Chapter 3: Reverse Engineering Crawljax for Optimization

Crawljax [4] is a software tool for crawling modern Web applications. It is specifically delivered as a solution to the challenges of crawling Ajax-based Web applications. The challenges arise from several characteristics of modern Web applications, including the execution of scripts on the client side, asynchronous communication between the client and the server, and dynamic manipulation of DOM trees. These characteristics result in dynamic state changes in Ajax-based Web applications without a necessary change in the URL of the Web application. Crawljax captures these state changes and the transitions between states in a finite-state machine model. The structure underlying this finite-state machine is called the stateflow graph by its designers, and hereafter we refer to it as such. The stateflow graph, central to the memory enhancements suggested in this thesis, is a directed multigraph (a non-simple directed graph in which loops and multiple edges between vertices are permitted) with states as nodes and transitions as edges.

Crawljax utilizes the Selenium WebDriver [12] to bring up browsers and crawl Web applications. By running Web applications in a real browser, Crawljax ensures that the model it creates matches the real behavior of the Web application under crawl. The state machine created incrementally by Crawljax comprises the state space of the Web application under crawl, along with the navigational paths between states. This state machine can be used for various purposes such as quality assurance and program comprehension.

In this thesis we propose enhancements to Crawljax in order to improve the state coverage and time performance of the crawling process. We explain the internal structure of Crawljax and the parts of its crawling algorithm that are directly related to this work in order to present the work more clearly.

3.1 Crawljax High Level Algorithm

From a high-level perspective, Crawljax works as follows. A crawl session starts by providing the URL of the Web application and additional optional configurations as inputs to the crawler. A Web browser is brought up and driven to open the first page of the Web site referenced by the URL. Afterward, the loaded page is parsed and all of the candidate elements in the page, such as buttons and URLs, are identified to be clicked on in the next steps. Clicking on elements can lead to transitions to new states. Hence, after clicking on each candidate element, the current state of the DOM is compared to the previous state of the DOM. If the state has changed and it is not a replica, it is added to the inferred state machine.

input : URL, Configurations
output: stateFlowGraph, Logs
1  Open the URL in the browser;
2  Parse the loaded DOM and extract all the candidate elements;
3  Save the candidate elements in candidateElements;
4  while termination conditions not met do
5      Pick an element from candidateElements;
6      Fire a click event on the element;
7      Extract the state of the Web application from the browser;
8      Compare the new state with the last captured state;
9      if the new state is different from the previous state then
10         Compare the new state to all previous states;
11         if no replica of the state is found in the stateFlowGraph then
12             Add the new state to the stateFlowGraph;
13             Extract the candidate elements from the new state and store them in candidateElements;
14         Add the transition to the stateFlowGraph;

Figure 3.1: Crawljax high-level algorithm

3.2 Reverse Engineering Crawljax

As Crawljax is evolving rapidly, relying on the initial papers does not suffice for a deep understanding of the current state of the system. We therefore conducted a thorough source code investigation to be able to propose alternative solutions for optimizing the crawling process.

In this thesis, we investigate whether we can utilize a graph database for the storage and retrieval of data in Crawljax. As main memory is usually smaller than secondary storage in computer systems, our aim is to optimize memory consumption by storing the data in a graph database instead of main memory. Based on our previous experience with crawling Web applications with Crawljax, we knew that the states of the Web applications account for a sizable portion of memory consumption. Hence we came up with the idea of storing the stateflow graph of Crawljax in a graph database. There were numerous graph database options. In order to identify the right choice of database for our application, we analyzed the Crawljax implementation to see what criteria and requirements emerge to guide us in selecting the most suitable graph database.

3.2.1 Investigating Crawljax for Alternative Solutions

We investigated the source code of Crawljax to extract the criteria for selecting the graph database most suited to our application. Our findings from this investigation of Crawljax's internal components are as follows.

Crawljax opens a Web browser and goes to the URL of the application. After loading the index page, it examines the DOM tree of the page and extracts all the elements which are likely to change the state in case an event is fired on them. Then Crawljax fires events on these candidate elements and analyzes the changes. Finally, based on the analysis of the changes made to the DOM, Crawljax incrementally builds a stateflow graph which models the states deduced by crawling the application. To store this model, which represents the states and the transitions between them, a directed multigraph is utilized.
This model is defined as follows. The stateflow graph inferred from crawling a Web application A is denoted as a four-tuple (i, V, E, L). The meanings of these symbols are as follows [4]:

- i: the initial state of the Web application. This state is captured when the page has finished loading into the browser and the onload event is fired by the browser.
- V: the set of all states in the application. Each dynamic DOM state is represented by a vertex v which is a member of the set V.
- E: the set of all edges in the stateflow graph. The members of E are ordered pairs (v1, v2) which represent an edge from the vertex v1 to the vertex v2. Such a directed edge exists in the graph when state v2 is reachable from state v1 by firing some event on a clickable (a Web element that can be clicked on) c in state v1.
- L: a labelling function from E to the set of event types and DOM element properties.

This provides us with a solid mathematical understanding of the data structure that our alternative solutions need to provide.
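A minimal way to picture this four-tuple in code is sketched below. The class names echo Crawljax's terminology, but the implementation is an illustrative simplification written for this discussion, not Crawljax's actual source (which builds on a graph library and stores far more per state and per edge).

import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// An element of V: one dynamic DOM state.
class StateVertex {
    final String dom;
    StateVertex(String dom) { this.dom = dom; }
}

// An element of E together with its label L(e): the event type and the clickable it was fired on.
class Eventable {
    final StateVertex from;
    final StateVertex to;
    final String eventType;      // e.g. "click"
    final String elementXPath;   // identifies the clickable element

    Eventable(StateVertex from, StateVertex to, String eventType, String elementXPath) {
        this.from = from;
        this.to = to;
        this.eventType = eventType;
        this.elementXPath = elementXPath;
    }
}

// The stateflow graph (i, V, E, L) as a directed multigraph.
class StateFlowGraph {
    final StateVertex index;                                  // i: the initial state
    final Set<StateVertex> states = new LinkedHashSet<>();    // V
    final List<Eventable> edges = new ArrayList<>();          // E; labels are carried on each edge

    StateFlowGraph(StateVertex index) {
        this.index = index;
        states.add(index);
    }

    void addState(StateVertex v) { states.add(v); }

    // Multiple edges between the same pair of states are permitted, as in a multigraph.
    void addTransition(Eventable e) { edges.add(e); }
}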
There are also other important points which must be taken into consideration. If a change is detected in the state of the Web application, the newly detected state is compared to all states in the state machine. If a duplicate state is found, the new state is not added to the stateflow graph. However, regardless of the newness of the found state, a new edge is added to the stateflow graph, from the previous state to the current state. Edges hold additional information such as the type of the event that was fired and the element which the event was fired on. In addition, the current state of the state machine is updated to the newly reached state. This means the crawling continues by exploring the new state, hence a depth-first traversal.

Crawljax has an option to crawl an application with multiple browsers. In concurrent crawling sessions, all crawling nodes share the same stateflow graph, but each crawler node has its own state machine. The state machine is an abstract interface on top of the stateflow graph; hence the stateflow graph is the more concrete data structure in Crawljax. The state machines, in combination with the stateflow graph, ensure synchronized reading and updating of the states and navigational paths.

In other words, the state machine is divided into a global stateflow graph shared by all crawling nodes and multiple state machine instances, one for each crawling node. The crawling nodes are synchronized over the clickable element they are examining. This measure is taken to prevent multiple crawling nodes from exploring the same clickable.

Among the numerous classes and data structures in Crawljax, there are two data structures that are of critical importance when considering the consequences of concurrent modifications. In particular, these data structures encapsulate the logic of the steps that we need to implement for safe concurrent storing and retrieving of data into and from the graph database.

The first data structure stores all the states with candidate elements and is called statesWithCandidates. The second data structure, called cache, is a map from states to queues of actual candidate elements. Each queue stores a series of candidate elements which are extracted from the Web page; these elements are extracted when the browser has loaded the specific state mapped to the queue. Crawljax begins crawling by loading the index state and then running a crawling node called CrawlTaskConsumer, with which the first state, the index state, is crawled. The candidate elements of the first state are put into the cache as a starting point for the rest of the crawling.

A more detailed, but still very high-level, description of the multi-node crawling process of Crawljax is illustrated in Algorithm 3.2.

input : URL, Configurations
output: stateFlowGraph, Logs
1  Open the URL in the browser and load the index page;
2  Parse the index and extract all the candidate elements;
3  currentState ← index;
4  stateWithCandidates ← {index};
5  cache ← {(index, CE)} such that CE = {e : e is a candidate element in the index};
6  stateFlowGraph ← ∅;
7  while stateWithCandidates ≠ ∅ do
8      Poll a state s from stateWithCandidates;
9      Reset the browser;
10     Move the browser from index to s;
11     Retrieve s.candidateElements from cache;
12     while s.candidateElements ≠ ∅ do
13         Poll an element e of s from s.candidateElements;
14         Fire a click event on e;
15         detectedState ← current state of the application in the browser;
16         if detectedState ≠ s then
17             Add e to crawlPath;
18             Compare detectedState with all other states in stateFlowGraph;
19             if detectedState ∉ stateFlowGraph then
20                 stateFlowGraph ← stateFlowGraph + {detectedState};
21                 Parse detectedState for candidate elements;
22                 stateWithCandidates ← stateWithCandidates + {detectedState};
23                 cache ← cache + {(detectedState, CanElements)} such that CanElements = {e : e is a candidate element in detectedState};
24             else
25                 crawlPaths ← crawlPaths + {crawlPath};
26             stateFlowGraph ← stateFlowGraph + {e};
27             currentState ← detectedState;

Figure 3.2: Crawljax high-level algorithm with more details

Chapter 4: Motivation and Research Goals

In order to realize a system for crawling Ajax-enabled Web applications, numerous layers of software tools and technologies, such as JavaScript execution engines and DOM processing libraries, must be put in place to obtain a purposeful crawling outcome. Ajax-enabled Web applications are crawled for different purposes, including "program comprehension" and "analysis and testing of dynamic Web states". For example, the model Crawljax infers can be used for "generating a static version of the application" [4].

Because of its inherent complexity, a purposeful crawling process, capable of producing a pragmatic model of Web applications, introduces manifold areas of optimization. These areas of optimization, once considered carefully, provide the opportunity to examine novel ideas for improving the crawling of Web applications.

In the following we explain the goals of the thesis by enumerating some challenges associated with crawling Web applications and proposing our ideas for attacking these problems. We put forward two high-level research questions to pinpoint a bird's-eye view of the goals of the thesis. Afterward, we elaborate on these questions by taking them to lower levels of abstraction to target more specific questions.

4.1 Motivation

Crawling Ajax-enabled Web applications is a predominantly resource-consuming task for the computer system hosting the crawler. This high degree of resource consumption rests, in large part, on two sets of drivers. First, Ajax crawlers perform various sub-tasks by utilizing a multi-layer stack of tools. Second, Web applications, which are the objects of the crawling, can be of considerable size in terms of the memory required to store their various components, such as their DOM or their GUI.

In crawling Web applications, tasks such as retrieving a Web application from its server, executing the application's scripts (chiefly JavaScript), communicating with a browser, tracking states and state transitions, and processing the content of the DOM impose considerable memory and time consumption on the computer system hosting the crawler.

In addition, providing a mechanism for handling dynamic state transitions in Web applications imposes a high level of resource consumption on crawling. Especially when applications have many states and those states are large, memory consumption becomes challenging. Table 4.1 presents the data we collected about the size of the DOM trees during our experiments. The data shows that most of the Web applications in the Alexa top list have states that are, on average, of considerable size. The average length of the states across all applications listed in Table 4.1 is around 232 thousand characters. Moreover, Crawljax stores a stripped version of the DOM beside the original DOM; this stripped version has less detail but is of almost the same size. In addition, the size of the Java char type is two bytes. Hence, the size of a state in memory is on average around 928 kilobytes (multiplying the average length of the DOM by four).

Table 4.1: Average DOM size in characters

Web Application    DOM Length
google.com         311671.49
wikipedia.org      242418.29
live.com           46303.59
twitter.com        192321.88
qq.com             603644.86
amazon.com         290661.37
linkedin.com       28188.43
baidu.com          184074.15
facebook.com       41485.17
youtube.com        310531.81
yhoo.com           304536.09
average            232348.83

In order to improve the crawling of Ajax-enabled Web applications, aiming at mitigating the memory and time consumption of the task of Web application crawling, we propose two enhancements to the crawling process. In the following, we state our enhancements targeting: 1) the improvement of the time performance of the state transition management, and 2) the memory performance of the state storage and retrieval component.

4.2 Research Questions

In this thesis, we initially defined two research questions to pinpoint the objectives of the thesis. These research questions address enhancing the performance of the crawler in crawling Ajax-enabled Web applications. In particular, we aim at conducting research to assess our ideas for improving the time consumption and memory utilization of the crawler in performing the crawling task. Our research questions address the following aspects of the crawling process:

Idea 1: Improving the time consumption of crawling by revising the process of examining state transitions in Ajax-enabled Web applications.

Idea 2: Optimizing memory utilization to improve the memory consumption of the crawling process.

The research questions are stated in the following and will later be broken down into more specific and more detailed sub-questions. Moreover, as we made progress in the thesis project and performed the experiments in detail, analysis of the results revealed interesting facts about the time requirements and memory utilization of our proposed methods. These discoveries guided us to design further research questions to assess additional enhancements and to explore further corners of the crawling process.

4.2.1 Improving Time Performance

As mentioned earlier, there are numerous aspects and sub-tasks in crawling Web applications that have the potential to be investigated for optimization opportunities. One important aspect of Ajax-enabled Web applications is that they can change their states dynamically; namely, they are dynamically stateful. More specifically, Ajax-enabled Web applications can transition into different dynamic states as the Ajax technologies (such as asynchronous calls and script execution) change the state of the Web application upon the occurrence of different events. Hence, managing the changes in the states and the transitions between the states is of high importance in the crawling process. As such, if we aim at modeling the states of Web applications during the crawling process, we have to provide some mechanism for keeping track of state changes in the Ajax-enabled Web application.

Keeping track of state changes and the transitions between states imposes significant complexity on the crawler and hence creates potential for optimization. One of the important steps in tracking the state changes is to determine whether, at specific points in time, the state of the Web application has changed or not. Another important issue is to check whether the state to which we have just transitioned is a replica (i.e. a clone of one of the previously stored states) or an unprecedented state. As such, we think the state comparison step can be attacked to reduce the crawling time.

Here we state our first idea for improving the crawling of Ajax-enabled Web applications as a high-level research question, which will later be broken into more specific sub-questions:

RQ1: How much can the time consumption of the state transition management be improved by enhancing the state comparison mechanism?

The intuition behind this research question is that we know, from experience, that comparing the states of Ajax-enabled Web applications is an expensive task. State comparison is time demanding because it includes several sub-tasks dealing with multiple layers of software tools. For example, accessing the states requires communicating with a browser. In addition, the states are, on average, large objects. This translates to an expensive time requirement for the extraction of the state data from the browser, the instantiation of the state objects in the crawler, and the comparison itself.

4.2.2 Increasing State Coverage

Memory utilization is another aspect of the crawling of Ajax-enabled Web applications that has significant implications for the performance of the crawler. From experience crawling Web applications with Crawljax in previous studies, we know that the states of Web applications are usually of considerable size. States of Ajax-enabled Web applications are stored in memory while a Web application is explored by the crawler. The large size of the states, multiplied by the number of states crawled up to any given point in the crawl, accounts for a sizable amount of memory consumption. We believe that this imposed consumption of main memory is one of the obstacles to increasing the coverage and comprehensiveness of crawling. As such, we believe that improving the memory consumption is a potential optimization area for achieving more coverage in crawling Ajax-enabled Web applications. We state our second idea for enhancing the crawling of Ajax-enabled Web applications as a research question which targets the memory utilization of the crawling process:

RQ2: To what degree can the coverage and comprehensiveness of crawling be improved by relinquishing state storage to a graph database?

What we endeavor to achieve here is to assess the effect of improving main memory consumption on the state coverage achieved in crawling. In particular, we are interested in improving the way state data is currently stored in the crawling process. We do this by analyzing the effects of enhancements made to the state storage mechanism on the memory performance of the crawler and on the number of states it can crawl.

4.3 Goals

The goal of this thesis is to improve the performance of the crawling of Web applications; in doing so, we assess our ideas for enhancing the crawling process. In particular, we target improving the time performance and memory performance of the crawling. In order to improve the time performance of the crawling, we aim at reducing the time the crawler spends on tracking state changes and the transitions between states. In addition, in order to improve the memory performance of crawling, we target the reduction of the per-state memory consumption of Web crawling so as to explore more states and increase the coverage of the crawling of Web applications.

4.3.1 Time Performance

In the process of crawling an Ajax-enabled Web application, as the application transitions to different states, the crawler needs to handle the transitions and to track and store the states and the navigational paths between them. In order to track the transitions, every time an event is fired on the Web application (line 14 of Algorithm 3.2), the crawler checks whether the Web application has transitioned to a new state or not (line 16). This task is done by comparing the states of the Web application before and after firing the event. We believe this practice results in some inefficiencies that create optimization opportunities.

In order to improve the time performance of the crawling, we aim at improving the mechanism the crawler uses for state comparison. To perform the state comparison, the crawler maintains the two states (before and after firing events, denoted by s and detectedState) in memory. These states are extracted from the browser by taking snapshots of the DOM tree of the application (line 15). Afterward, the crawler compares the two states in order to decide whether an alteration has happened to the state of the application. However, extracting the DOM from the browser, initiating a new state object, and comparing the objects after each event is very time consuming. Hence we suggest eliminating these steps whenever faster alternatives can be found.

If no event attribute (e.g. onclick) is attached to an HTML element, firing events (e.g. click) on the element will cause no change to the state of the Web application. If we have an alternative way of being notified that no changes have happened to the DOM, we can safely skip the whole comparison process and save the time that would otherwise have been spent.
However, the alternatives must be fast enough to overtake the currentcomparison mechanism.4.3.2 Memory PerformanceMemory consumption of the crawling process is significantly considerable.Expensive DOM related operations such as comparison, stripping and stringreplacements impose a sizable memory requirement on the crawler. In ad-dition, IO processes, mainly used for communications with the browser addsubstantial memory overhead to the crawling. Furthermore, during the ex-ploration of Web applications, the crawler builds an inferred state machinein the main memory. This huge state machine, storing the states discov-ered in the crawl session as well as the transitional links between the states,consumes almost half of the memory allocated to the crawler.In order to improve the memory performance, we focus on the mecha-nism currently employed for managing the state machine model. Storing,retrieving and assuring the uniqueness of states are the key tasks carried outin the crawling process and are directly related to the state machine model.As a Web application is crawled, states are discovered one by one and addedto the state machine. The memory is allocated gradually for each detectedstate. However, the main memory of the system is limited. Hence, at somepoint, when a specific number of states are stored in the state machine, thereis no capacity left in the main memory for allocating space to crawling tasks(including adding a new state). As a result, the crawling process halts (onerrors), and having covered a specific number of states the crawler quites onan out of memory error.In order to improve the state space coverage in crawling Web applica-tions, we have to dismantle the memory limitation. Considering the hugesize of the stateflow graph, we target freeing the memory required for stor-234.4. Methodologyage and retrieval of the states and transitions between the states. We aimat freeing the memory so it can be allocated to the rest of the tasks in thecrawling process. If we are able to do so, we could continue the crawlingfurther and discover more states and subsequently cover a greater numberof states than the original crawler does.4.4 MethodologyIn this thesis we investigate if application of a number of enhancements to thecrawling process can result in time and memory performance optimization incrawling Web applications. Specifically, tracking the changes in DOM treesof Web applications is enhanced in three different methods as an endeavorto optimize the time spent in states comparisons. In addition we investigatehow utilizing a graph database for storage and retrieval of the state machinemodel affects memory performance of the crawling process.Crawljax has various extension points, called plug-ins, which can beutilized for performing additional tasks at specific points in the crawlingprocess. We also add new extension points to the Crawljax. Availing ofthe current and new extension points we apply our alternative methodsfor tracking DOM-tree changes in the applications. In order to expedite thestate comparison process we relinquish the change tracking to agents that wedeveloped to work on the browser side instead of the crawling engine?s side.This way we install an agent in the browser side and aim at bypassing theexpensive DOM extraction, initiation and comparison operations. 
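To make this idea concrete, the sketch below shows, in simplified Java, how a crawl loop could consult a browser-side flag before paying for DOM extraction and comparison. This is an illustrative sketch rather than Crawljax's actual code: the flag name window.__domChanged, the class name, and the method name are assumptions; only Selenium's standard JavascriptExecutor interface is relied upon.

    import org.openqa.selenium.JavascriptExecutor;
    import org.openqa.selenium.WebDriver;

    // Illustrative sketch (not Crawljax's implementation): read a mutation flag
    // maintained by the injected browser-side agent, so that DOM extraction and
    // comparison can be skipped when nothing has changed.
    public class MutationFlagChecker {

        private final WebDriver driver;

        public MutationFlagChecker(WebDriver driver) {
            this.driver = driver;
        }

        // Reads and resets the flag; returns true only if the agent observed a change.
        public boolean domChangedSinceLastEvent() {
            Object result = ((JavascriptExecutor) driver).executeScript(
                    "var changed = window.__domChanged === true;"
                  + "window.__domChanged = false;"
                  + "return changed;");
            return Boolean.TRUE.equals(result);
        }
    }

In the crawl loop, the expensive path (retrieving the DOM string from the browser, constructing a state object, and comparing it against the previous state) would then only be taken when domChangedSinceLastEvent() returns true.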
Proxies(explained in next chapter) and Crawljax plug-ins are used to install thestate tracker agent.There is a sizable graph data structure in the crawling process, calledstateflow graph, which holds the main output of the crawling process. Aim-ing at making memory utilization more efficient, we introduce a new compo-nent to the crawler for replacing this data structure and handling the storageand retrieval of the states transitions model. The component that we intro-duce interfaces a graph database to store the stateflow graph structure anddata in a database rather than main memory. This method is expected toreduce the memory consumption, however it might impose an IO overheadaffecting both time and memory performances. It is the experimental resultsthat will determine to what extant this method is effective in boosting theperformance of the crawling.In order to assess the effects of the proposed enhancements on the timeand memory performances of the crawling process, we conduct experiments244.4. Methodologyon ten Web applications from Alexa?s top Websites list. As these Web sitesare complex Web applications composed of manifold components, they arenot deterministically crawlable. That is, crawling the same Web applicationmultiple times does not result in the same sequence of states each time. Assuch, in order to mitigate the effects of this nondeterministic behavior, wecrawl each applications multiple times and take average over the collectedresults.In the next two chapters we explain in details how we applied the two en-hancements to the crawling process. In addition, the way experiments wereconducted is discussed and the results of the experiments are also presentedat each chapter. The next chapter covers the improvements we proposed forstate transition management and the chapter after that discusses improvingthe memory consumption of the crawling process.25Chapter 5Optimizing State TransitionManagementThe crawler models the Web application being crawled by inferring a statemachine representing the states of the application and the navigational pathsbetween the states. Therefore, keeping track of the client-side states and thetransitions between the states is a substantial task in the crawling process.In order to manage the state transitions, the crawler communicates with thebrowser in which the application is running. These communications includetaking a snapshot of the state of the application at certain occasions to test ifany changes have been made to the state of the Web application (lines 15 and16 of algirithm 3.2). As these sub-tasks are expensive operations in terms ofthe time they consume, we investigate methods of optimizing the transitionmanagement aspects of the crawling process. Throughout this chapter theconcept of the ?state? of a Web application comes up frequently. Hencewe shed some light on what we refer to as the state early in the chapter toimprove the clarity.Web Applications States The state of a Web application at any givenpoint of time is dependent on values of multiple components of the Web ap-plication. This multivariate concept includes both server-side and browserside variables. The browser side state itself, cab be delineated to the valuesof the DOM-tree elements, the values of the JavaScript variables, JavaScriptfunctions and some less frequent elements such as Web Storage [13]. How-ever, as the crawler focuses on the DOM-tree of the Web application, in thisthesis, what we refer to as the state of the Web application is as follows:? 
In the browser the state of the applications is the DOM Object?s state.? In the crawler, the state is the string representation of the DOM-treeafter being processed and normalized (stripped).The crawler representation of the state has less details. For examplesome white spaces such as tabs and carriage returns are removed from the265.1. Tracking State Changes in Web Applicationsoriginal DOM tree. Hence, chances are that minimal changes to the DOM-tree are filtered by the crawler. The reverse cannot be true as every changein the crawler side must be originated from some change in the browserDOM-tree.5.1 Tracking State Changes in Web ApplicationsCurrently what Crawljax does to track the changes applied to the states ofWeb applications is as follows. After firing an event on some specific candi-date element in a page of the application, the crawler retrieves the currentstate of the application from the browser. For example, in photo gallery ap-plication, Crawljax clicks on one of the photo albums. As a result the GUIof the photo applications changes so that a number of photo thumbnails areloaded into the view. Crawljax retrieves the DOM of this new view of theapplication.Afterward the crawler constructs suitable data holders for shaping theraw data retrieved from the browser into usable objects representing thethe state of the Web application. Then it compares this newly retrievedstate (photo thumbnails DOM) with the previous state (the home page) soas to test if changes have been made to the application. Performing thesesub-tasks results in expensive inter-process communications (e.g. betweenthe browser and Crawljax), string operations, object construction and anumber of sizable IO operations for sending the state of the applicationfrom browser to the crawler. Here we investigate alternative approaches fortracking the state of the application aiming at eliminating or bypassing someof these operations, whenever possible, to improve the time efficiency of thecrawling.Investigating the way currently the crawler performs the transition checks,we came up with the idea of eliminating unnecessary comparisons. In par-ticular, if no event attribute is attached to an element, firing events on theelements causes no change in the state of the Web application. Hence, inthese situations, if we can be notified that there is no change in the appli-cation since last time, we can avoid:1. Retrieving a large amount of data representing the state of the Webapplication from the browser2. Processing the raw data representing the state of the application toinitiate the objects representing the state275.2. Foundations for the Proposed Techniques3. Performing a comparison on two usually large objects representing theprevious and newly retrieved statesThe key point here is that we do not aim at eliminating these stepscompletely; We only want to find alternative methods that are able to bypassthe unnecessary comparisons. To that goal, we need to find an effective wayto get notified of changes as they are made to the state of the application.5.2 Foundations for the Proposed TechniquesWe employed a number of different technologies and techniques in our pro-posed methods for improving the management of states transitions. Webriefly touch upon each concept and different options that we had and enu-merate their points of strength and disadvantages. 
We make an introduction to the different options available for tracking DOM mutations and to the proxy technology that we utilized in this project.

5.2.1 Tracking DOM Mutations

DOM mutations are any changes made to the DOM tree. The mutations include addition, deletion and modification of the content of the elements of the DOM and also structural changes to sub-trees of the DOM. Changing the name of an attribute, deleting a node in the DOM and modifying the textual content of a node are examples of DOM mutations.

Crawljax, the Web application crawler whose performance improvement is the subject of this thesis, crawls Web applications and builds a stateflow graph of the transitions between different states in a Web application. For the purpose of its crawling, Crawljax considers the current state of the DOM tree as the state of the Web application under crawl at any given point of time. As such, tracking the changes made to the DOM, namely DOM mutations, is of high importance in identifying state transitions during crawling sessions.

As Web technologies and standards have evolved over the past decade, different mechanisms and APIs have become available to Web developers for tracking DOM mutations. Mutation Events, Mutation Observers, and finally the mutation-summary library are the main options for tracking DOM mutations in developing Web applications. We opt for utilizing the mutation-summary library in this thesis. We justify this choice of technology by explaining each option in the next sections.

5.2.2 Mutation Events

The Mutation Events interface [14], which was introduced in DOM Level 2, is a mechanism for tracking the changes made to the DOM. As the design of the Mutation Events interface was considered to be flawed, it is in the process of being dropped and is considered deprecated. There are a number of performance and bug-causing issues associated with this interface: the mechanism is slow; it triggers too many reports, synchronously, upon the occurrence of changes to the DOM; and it leads to the development of crash-prone Web applications [15].

5.2.3 Mutation Observers

Mutation Observers provide another mechanism for receiving notifications about changes made to the DOM tree. Mutation Observers are introduced in DOM3 [16] and are designed to replace Mutation Events for tracking alterations of the DOM. Mutation Observers provide callback functions to handle the changes. This has the advantage of handling multiple changes as a group, whereas with Mutation Events a separate event is triggered for every single alteration.

5.2.4 Mutation-Summary Library

The Mutation-Summary library [17] is an open-source library implemented in JavaScript and provided by Google®. Being implemented on top of the Mutation Observers API, the mutation-summary library provides a more reliable way for tracking the changes made to the DOM. The improvements of this library over the Mutation Observer API include the ease of using its API, and grouping and filtering DOM alterations before reporting.

The mutation-summary library works in a way that it takes snapshots of the DOM at specific intervals and then compares these snapshots in order to find and report the differences between two subsequent snapshots. This means that transient changes that dissipate shortly, in such a short time that they do not last till the next snapshot, are not reported.
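To illustrate what such a browser-side tracker can look like, the sketch below builds the agent directly on the standard Mutation Observer API that the mutation-summary library wraps; it simply raises a flag on any observed change. The script is kept as a Java string constant so that it can be installed through Selenium's JavascriptExecutor. The flag name, the guard against double installation, and the class name are illustrative assumptions, not the exact implementation used in the thesis.

    import org.openqa.selenium.JavascriptExecutor;
    import org.openqa.selenium.WebDriver;

    // Illustrative agent installer; the injected script is a sketch based on the
    // standard MutationObserver API rather than the mutation-summary library.
    public class MutationAgentInstaller {

        private static final String AGENT_SCRIPT =
              "if (!window.__mutationAgentInstalled) {"
            + "  window.__mutationAgentInstalled = true;"
            + "  window.__domChanged = false;"
            + "  new MutationObserver(function() {"
            + "    window.__domChanged = true;"      // any mutation marks the state as dirty
            + "  }).observe(document, {"
            + "    childList: true, subtree: true,"
            + "    attributes: true, characterData: true"
            + "  });"
            + "}";

        public void install(WebDriver driver) {
            ((JavascriptExecutor) driver).executeScript(AGENT_SCRIPT);
        }
    }

A small installer of this kind can then be invoked by whichever injection mechanism is chosen; the alternatives discussed in the following sections differ mainly in where and how this installation happens, through a proxy or through crawler plug-ins.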
As thestate comparison intervals in Crawljax are significantly longer than thoseof Mutation Observers and mutation-summary Library, we are interestedonly in more durable changes that last for a considerable period of time.Hence, the summarizing feature in this technology is very amenable to ouruse-case. However, in use-cases where transient DOM state changes are ofimportance, this library should not be used [18].295.3. Alternative Methods for Tracking State Changes5.2.5 ProxiesA proxy is a concept in computer networking which allows implementingencapsulation in the design of network architectures [19]. A proxy server,is a server intercepting the communications between a client and its server.Intermediating between the client and the server , proxies can access andmodify the request from client before sending it to the server. Likewise,proxies can also access and make alterations to the response from the serverbefore handing it to the client. Proxies are of different types and are usedfor various purposes.Different types of proxies include but not limited to Open Proxies, For-ward Proxies and Reverse Proxies. Providing anonymity on the Internet,filtering, caching and eavesdropping are examples of using proxies. Prox-ies can be implemented in different layers of computer networking. Proxyservers are in the form of software application or a physical device such asa router. In this thesis we set up a software proxy intercepting HTTP andHTTPS communication in the application layer.Having explained the foundations upon which our proposed methods aredrawn, we continue the discussion by presenting the alternative methods wecame up with for improving the time consumption in the crawling.5.3 Alternative Methods for Tracking StateChangesWe aim at bypassing unnecessary comparisons. Unnecessary comparisonshappen when an event is fired on some element but it does not lead to anychange in the DOM-tree, even a single byte. The current crawler, however,performs the three steps mentioned above. We propose the idea of installinga light weight agent in the browser side, instead of in the crawler side, totrack the changes in the DOM-tree. This agent should have the followingcharacteristics:? It must be able to track every single change applied to the DOM.? It should not impose considerable time overhead on the browser andas a result on the crawling process.? It must provide some interface that can be utilized for communicatingfrom the crawler305.3. Alternative Methods for Tracking State ChangesWe came up with three different solutions to build our agent with theaforementioned desired characteristics. However, it is the experimental re-sults that show how efficient the solutions are. Our solutions have threecomponents as it is shown in the figure 5.1. The injector component isresponsible for installing the agent in the browser side. The agent itselfis responsible for tracking DOM mutations and setting the mutation flagwhenever some changes are made to the DOM. Finally a communicationcomponent that retrieving the mutation flag from the agent in the browseris its main responsibility.Generic Solution ComponentsMutation Tracker AgentMutation-FlagRetrieverAgentInjectorFigure 5.1: Generic solution componentsWe present the common approaches and technologies utilized in all al-ternative methods we proposed in the next section. 
Following the common-alities the differences are presented and each method is explained in details.5.3.1 Shared Characteristics of the SolutionsRegarding the DOM change tracking part we utilized the Mutation-Summarylibrary to get notified of every single change made to the DOM. The com-munication component of the agent was made feasible by introducing newextension points to the crawler. Finally, in order to install the agent, wemade use of three different methods utilizing multiple technologies.The Mutation-Summary library is very flexible in allowing us to definewhat type of changes are of importance to us. As every single change in theDOM-tree has the ability to be translated into a crawler-side state change,we set the mutation tracker capabilities to track all types of change, in-cluding textual and structural changes in forms of additions, deletions and315.3. Alternative Methods for Tracking State Changessubstitutions. In addition we designed a flag for storing the mutation statusof the DOM-tree. This mutation flag is originally reset to false and upon ev-ery single change made to the DOM it is set to true. True means that somechanges have been made to the DOM-tree and hence a full state comparisonis required.A new extension point was introduced to the crawler to enable commu-nications between our agent and the crawler. As it is shown in algorithm3.2, the Crawljax algorithm fires an event on a candidate element, waits fora certain amount of time and then retrieves the DOM-tree from the browser.However, we made it possible to add additional functionalities at the veryexact point before retrieving the DOM-tree from browser. We utilized thisextension point to build the communication part of our agent. What we doat this point is that instead of retrieving a huge DOM-tree from the browser,we retrieve our mutation flag; If the flag is set to true we proceed with theordinary full comparison process, otherwise, assured that the state of theDOM has not been changed we bypass the unnecessary DOM-tree retrievaland comparison steps.Now we move on with explaining each alternative solutions in detailsand present the way they differ and how they are improved incrementallyover one another.5.3.2 Proxy-Based SolutionIn the first solution we utilized a proxy to install our agent in the browserside. The agent is installed in the browser-side by intercepting the infor-mation that is sent by the Web server of the application being crawled.The information sent by the Web server is processed and whenever a newpage is sent to the browser the agent is appended to the page so as totrack the alterations of the DOM-tree. We set up a Web server to host theMutation-Summary library so it can be included as part of our agent. TheMutation-Summary is hooked to the application as an external JavaScript.The communication with the agent is performed utilizing the newly cre-ated extension point explained earlier. The components of the proxy-basedsolution are outlined in figure 5.2.The proxy helps us achieve full control over data exchanged between thebrowser and the server and install our agent smoothly. However, it imposesa considerable latency on processing the information and as a result the timeit saves is less than its overhead. Hence we continued improving the agentin the tow versions that are described in the following sections.325.3. 
Figure 5.2: Proxy-based solution components

5.3.3 Plugin-Based Solution

As depicted in figure 5.3, in this version we retained the Web server and the communication point, but we eliminated the proxy component to achieve time efficiency. As such, the role of the proxy, which was installing the agent in the browser side, is carried out by building a new plug-in for the crawler. This plug-in is attached to the crawler at an extension point called the On-Page-Load plug-in. Hence, the plug-in installs the agent every time a new page is loaded in the browser. In addition, whenever the communication component attempts to retrieve the mutation flag, it checks whether the agent is present in the browser. If the agent is not present, it injects it again so it is installed and ready to function for the next iteration. This version of the solution works significantly more efficiently than the proxy-based solution and outperforms the default behavior of the crawler. We, however, aspire to further improve it, as explained in the second plugin-based method.

Figure 5.3: Plugin-based solution components (external injection)

5.3.4 Plugin-Based Solution with In-Memory Agent

The performance achieved by the plugin-based solution realized our goal of improving the time consumption of the transition management. However, we had an intuition that placing the mutation-summary library on a Web server and fetching it repeatedly imposes a time overhead on the solution. Hence, we designed another solution in which we eliminated the Web server, as shown in figure 5.4. In this solution, the installer component holds a copy of the library and injects it directly into the Web page at the same time as the core of the agent is injected. The difference here is that in the previous version, a reference to the mutation-summary library was injected into the page and the browser had to load the actual library itself, but in the new version the actual library is injected directly. The other components are the same as in the plugin-based solution.

Figure 5.4: Plugin-based solution components (internal injection)

Having covered the three enhancements proposed for improving the time consumption of the crawling, we present the experiments conducted to evaluate the three solutions in the next section.

5.4 Evaluation

We conducted a series of experiments to answer the research questions that shaped the objectives of this thesis. In this section we break down the first high level research question into three specific questions. Afterward, we present the design of our experiments, the methodology of conducting them and the results achieved from carrying out these experiments.

5.4.1 Research Questions Broken Down

Our first high level research question targets improving the time consumption of the crawling process.
Having explained more details about our en-hancements to the state transition management, we delineate RQ1 into morespecific research questions to investigate the performance of our solutions.We have three research questions associated with improving the time perfor-mance of the crawling process. As explained in details earlier, these solutionsare proxy-based, plug-in based with Web server and plug-in based with in-memory injection. Each research question addresses the time performanceof one of these solutions. Hence we state the following sub-questions:RQ1.a : To what extent does the proxy-based solution for managing statetransitions improve the time performance of the currently employedmechanism for managing state transitions?RQ1.b : To what extent does the Plug-in-based solution for managing statetransitions improve the time performance of the currently employedmechanism for managing state transitions?RQ1.c : To what extent does the Plug-in-based Solution with in-memoryagent for managing state transitions improve the time performance ofthe currently employed mechanism for managing state transitions?We will investigate these question and explain the experiments conductedto answer these questions in details in the following sections. We shed lightson how we investigated the effects of replacing the default behavior of thecrawler by our alternative solutions.5.4.2 Experimental Design and MethodologyWe designed a serious of experiments to compare the enhanced versions withthe default version of the crawler to answer our research questions. In eachexperiment we evaluate the enhancements made to the crawler against the355.4. Evaluationdefault behavior of the crawler. In each enhancement proposal additionalfunctionality has been introduced to the crawling algorithm. These enhance-ments help track the mutations of DOM trees to bypass unnecessary com-parisons. On the other hand, these enhancements can potentially introducetime overheads to the system. Hence we devised a number of experimentsto measure the variables pertaining to the efficiency and correctness of thevarious solutions in hand.In these experiments the main variable measured is the time consumptionof default vs. enhanced crawling. However, we measure additional variablesto achieve more insights over the performance and the correctness of the so-lutions. First, we measure the time overhead of the mutation tracker agenti.e. the proposed solution. In addition, we measure the time spent on com-paring identical DOM trees whose comparison is unnecessary. We performedmore measurements whose variables are explained in the following.First and foremost, we measure the time consumption each alternativesolution imposes as an extra overhead on the system. This time is dividedinto two categories. First, there is an overhead associated with installingthe agent in the application. This is the time required to intercept andmodify the responses sent from the server for the proxy-based solution andplug-in injection time for the two other solutions. The second category isthe time each solution needs to retrieve the mutation flag. Furthermore,we measure the time spent on performing unnecessary comparisons whichmust be bypassed. These unnecessary comparisons are bypassed by theenhanced solutions. Another variable that we measure is false negatives. Afalse negative happens when the mutation flag shows no change in the stateof the DOM but actual DOM comparison informs us of mutations in thestate of the application. 
The total time of each experiment, the size of the DOM-trees, the number of events fired, the number of bypassed comparisons, and the total number of states crawled are additional variables that we measure in the experiments.

Experimentation Objects  In these experiments we use the top most-visited Web sites from the Alexa list. The Web applications crawled as the objects of the experiments are listed in table 5.1 along with the IDs assigned to them. These IDs are used to save the limited space in the tables and charts illustrating the experiments' results.

Table 5.1: State transition management experiments objects

  ID    Web Application
  ggle  www.google.com
  wkpd  www.wikipedia.com
  live  www.live.com
  twtr  www.twitter.com
  qq    www.qq.com
  amzn  www.amazon.com
  lnkd  www.linkedin.com
  tbao  www.taobao.com
  yndx  www.yandex.ru
  yhoo  www.yahoo.com

The dynamics of Web application responses require careful attention in designing the experiments. One important issue that needs careful attention is that when performing an experiment on the client side of a Web application, the server side characteristics should be taken into consideration. It is quite possible that inputting the same sequence of events to the GUI of a Web application results in two different responses from the server. In order to mitigate the effects of the non-determinism induced by the server side and the network characteristics, we repeat each experiment five times for each Web application.

We measured the time consumption by means of the System class time measurement capability. The maximum number of states for each experiment was initially set to 30. However, after conducting the experiments we observed that if some applications are allowed to run for a maximum of 24 hours, some of them could use up the default heap space and result in a heap space error. Hence, we changed the maximum heap size and repeated the experiments with the maximum number of states set to 10. The results of performing the experiments, along with precise definitions of the variables measured, are presented in the next section.

5.4.3 Results

Each solution is evaluated by crawling all the object Web applications. For each experiment, one Web application from the objects is crawled by the crawler. The time saved by the solution and the time overhead imposed on the crawler by the solution are measured in each experiment. The experiments are run five times and the average (arithmetic mean) is taken over the results. Results for the proxy-based, plugin-based with Web server, and plugin-based solutions are presented in figures 5.5, 5.6, and 5.7 respectively.

Designing the experiments, we strove to measure all the meaningful metrics that are of importance to the evaluation of our proposed methods. Here we explain these variables before presenting the results in the tables and charts.

Saved Time : The time improvement achieved by introducing the new solution. That is the time spent on unnecessary comparisons (i.e. the time our solutions save). This includes the time that is spent on communicating with the browser and retrieving the state of the application. The additional overheads such as constructing objects for states are also included in this variable. This is the main variable that proposed enhancements must maximize.

Overhead : The time overhead imposed by the new solution on the crawler. This is the main variable that proposed solutions must minimize. Overhead time is the sum of flag time and proxy time for the proxy-based solution.
This is the sume of flag time and plug-in time for theplugin based solutions. These are defined next.Proxy : The time consumed by the proxy. In the first solution which proxyplays a significant role we measure the time overhead of interceptingcommunicants for injecting our agent.Plugin : The time consumed by the ?plug-in? based injections. In the twoplug-in-based solutions we measure the time overhead of injecting ouragent.Flag : The ?time? consumed for retrieving the mutation ?flag?. This in-cludes the time required for communication with the browser, exe-cuting javaScript in the application and retrieving the value of themutation flag.Time : The total time of an experiment. That is the time beginning fromstart of a crawl session until either reaching the maximum number ofstates or quiting the experiment for other reasons such as exhaustingthe state space of the application under crawl.Events Fired : Total number of ?events fired? on a Web application bythe crawler. The crawler fires events such as the click event on specificelements in the applications under crawl.385.4. EvaluationBypassed : The number of comparisons that are bypassed because theagent notified the crawler that no change has been made to the stateof the Web application.States : The number of states crawled in the experiment.FN : The number of false negatives. False negative in this contexts are caseswhere the mutation flag suggests bypassing a comparison because nochange has been reported by the agent, but the actual comparisonreveals that changes have been made to the state of the applicationand have been recognized by the agent.FP : The number of false positives. In this context a false positive happenswhen existence of changes are reported by the agent but the actualcomparison decides the change is not significant enough to be consid-ered as a state change.Having covered the variables and acronyms that are to be used in the resultswe present the results achieved from conducting the experiments in thefollowing section.RQ1.A: Time Performance of the Proxy-Based Solution The proxy-based solution was thoroughly assessed by crawling the experiment objectsmultiple times. Figure 5.5 shows the main variables that were measured inthe experiments:Improvement: The time the proxy-based solution saves by eliminatingunnecessary comparisons and communications.Overhead: The time overhead imposed by the proxy-bases solution.As it is presented in figure 5.5, proxy-bases solution time overhead is morethan the time it saves by improving the crawling process. The new solutiondoes not overtake the default behavior significantly. The new solutions per-forms similar to the default version for four of the the ten cases. The defaultcrawler performs significantly better on six objects out of ten.The proxy-based solution does not improve the time performance of thestate transition management process.Table 5.2 presents the the experiments results in more details. For eachWeb application the time saved by bypassing unnecessary comparisons are395.4. Evaluationggle wkpd live twtr qq amzn lnkd yndx tbao yhoo averageProxy?based Time Improvements vs. Time Overheads Web ApplicationsTime (s)0100200300400500 Saved TimeOverhead TimeFigure 5.5: Proxy-based solution time performance: the time overhead im-posed by the proxy-based solution is much greater than the time it saves bybypassing unnecessary comparisons. Hence, the default crawling performsmore efficiently. 
The bars represent the averages taken over 5 rounds ofexperiments presented in table 5.2.listed. In addition, the time overheads, one for retrieving the flag and an-other one imposed by the proxy, are shown separately. These two itemsare also summed up together as the total overhead and are listed under the?Time Overhead? title. The total time of the experiments are also includedin table 5.2. The values shown in table 5.2 are averages taken over fiverounds of experiments. Further statistical measures about the results arealso provided; standard deviation in table 5.3, absolute value of coefficientof variation (i.e. relative standard deviation) in table 5.4, maximum valuesis table 5.5, and minimum values in table 5.6.Table 5.7 presents the detailed data about the number of events fired onexperiment objects and how many of these events resulted in new states. Italso includes how many times the agent bypassed unnecessary comparisons.The column, ?Events Fired? shows the total number of events fired on theapplication. In the default crawler, each fired event results in a comparisonto check whether the state of the application is changed or not. However, theproposed solution bypasses a number of these comparisons. These measurednumbers of bypassed comparisons are listed in the ?Bypassed? column. Thetotal number of states crawled in the experiments are included in the ?State?405.4. EvaluationTable 5.2: Proxy-based time consumption (milliseconds); the proxy-based?time overhead? is much greater than the time improvements i.e. ?savedtime?.ID Saved Time Proxy Time Flag Time Time Overhead Total Timeggle 138.2 8931.6 130 9061.6 57113.4wkpd 18601.4 83642.8 22008.2 105651 243323.4live 6851.4 1606.6 739.4 2346 97208.2twtr 4359.8 1558.8 496.8 2055.6 202418.8qq 0 434564.8 97.4 434662.2 951066.2amzn 47224.2 97183.8 914.8 98098.6 341746.6lnkd 6087.8 30472.2 949.6 31421.8 114203yndx 0 58768.8 91 58859.8 67407.2tbao 132.6 332581.8 97.2 332679 616755.6yhoo 327 5152.2 46.6 5198.8 26142.6average 8372.24 105446.34 2557.1 108003.44 271738.5column. False negative and false positives are also presented in ?FN? and?FP? columns receptively.RQ1.B: Time Performance of the First Plugin-Based Solution Thefirst pulgin-based solution, which utilizes a Web server for hosting the mutation-summery library is assessed for its time performance. We present the resultsof the experiments in figure 5.6. We performed the experiments on the ob-ject Web applications to compare the time performance of the proposedplugin-based solution against the default crawling process. In these exper-iments multiple variables are measured. However, the main two variablesthat are present in figure 5.6are as following:Improvement: The time the solution saves by eliminating unnecessarycomparisons and communication steps when there is no change in thestate of the Web application after firing an event.Overhead: The time overhead imposed by the plugin-based solution.As it is shown in figure 5.6 the sever-utilizing plugin-based proposedsolution performs more efficiency in the carrying out the state transition415.4. 
EvaluationTable 5.3: Standard deviations of proxy-based time consumptionID Saved Time Proxy Time Flag Time Time Overheadggle 9.44 973.08 5.24 977.29wkpd 2470.24 6258.46 2855.96 8666.19live 1039.96 157.61 82.49 205.57twtr 406.34 87.45 50.11 76.06qq 0.00 22790.28 6.27 22786.71amzn 4365.87 14216.21 104.32 14300.70lnkd 891.01 3187.10 137.90 3298.76yndx 0.00 4028.57 12.88 4036.75tbao 12.30 27018.00 9.26 27025.11yhoo 36.93 290.98 3.71 290.22management tasks. As it is depicted in the bar chart in figure 5.6, on averagethe proposed solution consumes less time than the default crawling process.In addition, in half of the cases the proposed solution consumes considerablyless time than the time the default crawler spends on performing unnecessarycomparisons. It is only in one case that the proposed solution is more of anoverhead than an improvement, and in four cases out of ten the differenceis not very significant.The plugin-based solution, utilizing a web server, improves the time perfor-mance of the state transition management process by 253.34%. That meansthe time consumption ratio of default to proposed is 3.5334, which meansthe time consumption is reduced by 71.69%.Aside from figure 5.6 which shows the gist of the results of the exper-iments, we present the details of the data gathered in the experiments onthe the sever-utilizing plugin-based proposed solution in table 5.8. As it isshown in table 5.8 time improvements of the solution in the experimentsare listed by Web application. The improvements are listed in the ?SavedTime? column, and the time overhead is provided as a total number underthe ?Time Overhead? title. The time overhead is also broken down in to itsconstituents, plugin time and mutation flag retrieval time in columns ?Plu-gin Time? and ?Flag Time? respectively. The values presented in table 5.8lists the averages taken over results of five rounds of experiments. Further425.4. EvaluationTable 5.4: Relative standard deviations percentages (absolute values ofcoefficient of variation) of proxy-based time consumptionID Saved Time Proxy Time Flag Time Time Overheadggle 6.83 10.89 4.03 10.78wkpd 13.27 7.48 12.97 8.20live 15.17 9.81 11.15 8.76twtr 9.32 5.60 10.08 3.70qq Not defined 5.24 6.43 5.24amzn 9.24 14.62 11.40 14.57lnkd 14.63 10.45 14.52 10.49yndx Not defined 6.85 14.15 6.85tbao 9.27 8.12 9.52 8.12yhoo 11.29 5.64 7.97 5.58statistical measures about the results are also provided; standard deviationin table 5.9, absolute value of coefficient of variation (i.e. relative standarddeviation) in table 5.10, maximum values is table 5.11, and minimum valuesin table 5.12.Table 5.13 sheds more lights on the details of how the aforementionedtime performance improvements are achieved by the sever-utilizing plugin-based proposed solution. The variables presented in this table have the samemeaning as those presented in table 5.7.RQ1.C: Time Performance of the Second Plugin-Based SolutionThe second plugin-base solution, which does not utilize a Web server forhosting the mutation-summary library and instead injects it directly, wasalso assessed thoroughly in the same manner as its two predecessors. Wepresent the gist of the experiments carried out on the ten object Web appli-cations in figure 5.7. As it is shown in the bar chart, the second plugin-basedsolution performs more efficiently than the default crawling process. As itis shown in the average bars, the proposed solution consumes less time thanthe default crawler on average. 
Moreover, in six cases out of ten, the pro-posed solution takes over the default crawler. It is only in one case that theproposed version consumes significantly more time than the default version.435.4. EvaluationTable 5.5: Maximum values of proxy-based time consumption (millisec-onds)ID Saved Time Proxy Time Flag Time Time Overheadggle 152 10273 137 10406wkpd 20647 9 7 24209 115775live 7468 1783 820 2603twtr 4664 1683 568 2140qq 0 469333 103 469427amzn 51946 107874 6 108880lnkd 6757 32909 1054 33953yndx 0 63470 101 63569tbao 147 369165 107 369267yhoo 377 5615 52 5661The plugin-based solution, not utilizing a Web sever, improves the process ofstate transition management by 197.48%. That means the time consumptionratio of default to proposed is 2.9748, which means the time consumption isreduced by 66.38%.In addition to figure 5.7 which presents the essence of the experimentsresults, we shed more lights on the details of the time consumption by thesecond plugin-based solution in 5.14. The variables presented in this tablehave the same meaning as those in table 5.8. Here again, the time overheadis presented in two ways. First, it is broken down to its two components,the plugin time and the flag time. In addition, it is presented as the sumof its two constituents in the column ?Time Overhead?. The time improve-ments of the proposed solutions is listed in the column ?Saved Time?. Theaverages taken over the results of five rounds of experiments are listed intable 5.8. Further statistical measures about the results are also provided;standard deviation in table 5.15, absolute value of coefficient of variation(i.e. relative standard deviation) in table 5.16, maximum values is table5.17, and minimum values in table 5.18.Table 5.19 presents the data achieved from experiments which illustratesin details the manner in which the second plugin-based solution bypassesunnecessary comparisons and improves the time efficiency of the crawlingprocess. The variables presented in this table are the same as those presentedin tables 5.13 and 5.7.445.4. EvaluationTable 5.6: Minimum values of proxy-based time consumption (milliseconds)ID Saved Time Proxy Time Flag Time Time Overheadggle 129 8038 124 8164wkpd 14511 75281 17168 92449live 4 1445 609 2105twtr 3665 1449 442 1965qq 0 408490 88 408591amzn 40142 72889 770 73659lnkd 4569 24990 742 25732yndx 0 55830 69 55901tbao 118 299326 85 299411yhoo 294 4894 42 4942As presented in the tables and figures so far, in summary, the proxy-basedsolution does not improve the time performance of the state transition man-agement process. The first plugin-based solution, utilizing a web server,improves the time performance significantly. On average the first plugin-based solution improves the time performance by 253.34%. That meansthe time consumption ratio of default to proposed is 3.5334, which meansthe time consumption is reduced by 71.69%. The third solution, which isplugin-based and does not utilize a Web sever, improves the process of statetransition management by 197.48%. That means the time consumption ra-tio of default to proposed is 2.9748, which means the time consumption isreduced by 66.38%.We will elaborate more on the results when revisiting research questionsin the discussion chapter. Having addressed state transition managementimprovements we continue with presenting the improvements made to thescalability of the crawling in the next chapter.455.4. 
EvaluationTable 5.7: Proxy-based bypassed comparisons: the detailed information onevents fired, comparisons that were categorized as unnecessary, false nega-tives and false positives.ID Events Fired Bypassed States FN FPggle 10 1 10 0 0wkpd 169.8 154.2 10 4.2 6.6live 72.6 62.4 10 8.8 1.2twtr 50.8 41 10 8.6 0.8qq 9 0 10 0 0amzn 92.8 83.2 10 0 0.6lnkd 100 90.4 10 0 0.6yndx 9 0 10 0 0tbao 10 1 10 0 0yhoo 17.8 7.6 10 0 1.2ggle wkpd live twtr qq amzn lnkd yndx tbao yhoo averageExternal?JS?Plugin?based Time Improvements vs. OverheadsWeb ApplicationsTime (s)051015202530 ImprovementOverheadFigure 5.6: External js-plugin-based solution times: the first plugin-basedsolution, which employs a Web server and injects scripts externally, overtakesthe default process. The time improvements on average is much greater thanthe overheads imposed. The bars represent the averages taken over 5 roundsof experiments presented in table 5.8.465.4. EvaluationTable 5.8: External js-plugin-based solution times (milliseconds): the timeimprovements by the first plugin-based solution is much greater than thetime overheads it imposes on the crawling process.ID Saved Time Plugin TIme Flag Time Time Overhead Total Timeggle 0 136.8 190.8 327.6 57113.8wkpd 12846 144.6 22757.2 22901.8 243323.6live 6908.6 727.4 1211.8 1939.2 97208.4twtr 6461.8 147.4 1106.6 1254 202418.8qq 0 101.8 159 260.8 951066amzn 22757.2 120.2 939.2 1059.4 341746lnkd 3066.2 96.4 1432.8 1529.2 114203yndx 0 109 144 253 67407.2tbao 0 412.2 175.2 587.4 616755.8yhoo 2400.4 509 424.4 933.4 26142.6average 5444.02 250.48 2854.1 3104.58 271738.52Table 5.9: Standard deviations of external-js plugin-based time consumptionID Saved Time Plugin Time Flag Time Time Overheadggle 0.00 12.76 26.54 22.57wkpd 1808.80 6.77 2015.18 2004.42live 691.93 53.09 160.99 122.64twtr 1078.02 15.06 127.21 76.19qq 0.00 8.17 14.71 16.62amzn 1354.61 9.65 68.53 76.86lnkd 353.50 10.45 75.71 111.24yndx 0.00 8.46 6.44 24.64tbao 0.00 34.71 24.94 62.02yhoo 184.15 48.40 33.26 83.67475.4. EvaluationTable 5.10: Relative standard deviations percentages (absolute values ofcoefficient of variation) of external-js plugin-based time consumptionID Saved Time Plugin Time Flag Time Time Overheadggle Not defined 9.32 13.90 6.88wkpd 14.08 4.68 8.85 8.75live 10.01 7.29 13.28 6.32twtr 16.68 10.21 11.49 6.07qq Not defined 8.02 9.25 6.37amzn 5.95 8.03 7.29 7.25lnkd 11.52 10.84 5.28 7.27yndx Not defined 7.75 4.47 9.73tbao Not defined 8.42 14.23 10.55yhoo 7.67 9.50 7.83 8.96Table 5.11: Maximum values of external-js plugin-based time consumption(milliseconds)ID Saved Time Plugin Time Flag Time Time Overheadggle 0 151 211 356wkpd 14130 151 25032 25882live 8085 785 1457 2058twtr 8337 160 1329 1354qq 0 111 178 279amzn 24122 130 1023 1154lnkd 3372 114 1506 1682yndx 0 119 151 280tbao 0 467 192 628yhoo 2664 554 465 1036485.4. 
EvaluationTable 5.12: Minimum values of external-js plugin-based time consumption(milliseconds)ID Saved Time Plugin Time Flag Time Time Overheadggle 0 124 147 294wkpd 9765 134 20029 20840live 6286 654 1090 1764twtr 5751 130 1029 1181qq 0 91 141 239amzn 20484 110 854 985lnkd 2579 88 1332 1391yndx 0 97 135 225tbao 0 379 131 478yhoo 2184 440 386 813Table 5.13: External js-plugin-based bypassed comparisons: the detailed in-formation on events fired, comparisons that were categorized as unnecessary,false negatives and false positives.ID Events fired ByPassed States FN FPggle 9 0 10 0 0wkpd 170.6 158.4 10 1 3.2live 72.2 62.6 10 0 0.6twtr 69.4 59.8 10 0 0.6qq 9 0 10 0 0amzn 64.6 54 10 0 1.6lnkd 100.4 91 10 0 0.4yndx 9 0 10 0 0tbao 9 0 10 0 0yhoo 18.8 9.8 10 0 0495.4. Evaluationggle wkpd live twtr qq amzn lnkd yndx tbao yhoo averageInternal?JS?Plugin?based Time Improvements vs. Overheads Web ApplicationsTime (s)051015202530 ImprovementOverheadFigure 5.7: Internal js-plugin-based solution times: the time the secondplugin-based solution improves is greater than the time overheads it im-poses on the process. hence, the second plugin-based solution also overtakesthe default state transition management system. The bars represent theaverages taken over 5 rounds of experiments presented in table 5.14.Table 5.14: Internal js-plugin-based solution times (milliseconds): the sec-ond plugin-based solution overtakes the default process. The time it im-proves is greater than the time overhead it imposes on the crawling process.ID Saved Time Plugin Time Flag Time Time Overhead Total Timeggle 0 163.8 160.6 324.4 112488.6wkpd 12516.6 169.4 22748.8 22918.2 196764.2live 7024.2 261.4 1063.4 1324.8 96941.2twtr 4144.8 123 787.8 910.8 190713.8qq 0 147.6 133.4 281 233494.2amzn 12997.6 459.6 558 1017.6 131194.4lnkd 3046.6 127 1387 1514 115787yndx 0 102.6 179.2 281.8 65686.2tbao 546.2 245.8 190.4 436.2 550140.2yhoo 1941 290 296 586 650855average 4221.7 209.02 2750.46 2959.48 234406.48505.4. EvaluationTable 5.15: Standard deviations of internal-js plugin-based time consump-tionID Saved Time Plugin Time Flag Time Time Overheadggle 0.00 19.28 16.29 14.12wkpd 1914.28 30.95 1883.50 2698.28live 629.92 29.74 78.32 80.70twtr 192.48 9.67 92.94 68.77qq 0.00 18.15 6.99 25.42amzn 656.91 37.29 43.82 141.02lnkd 314.51 5.34 44.67 152.09yndx 0.00 12.34 9.81 17.43tbao 33.74 32.67 17.53 35.49yhoo 191.88 18.49 23.44 35.89Table 5.16: Relative standard deviations percentages (absolute values ofcoefficient of variation) of internal-js plugin-based time consumptionID Saved Time Plugin Time Flag Time Time Overheadggle Not defined 11.77 10.14 4.35wkpd 15.29 18.26 8.27 11.77live 8.96 11.37 7.36 6.09twtr 4.64 7.86 11.79 7.55qq Not defined 12.29 5.23 9.04amzn 5.05 8.11 7.85 13.85lnkd 10.32 4.20 3.22 10.04yndx Not defined 12.02 5.47 6.18tbao 6.17 13.29 9.20 8.13yhoo 9.88 6.37 7.91 6.12515.4. 
EvaluationTable 5.17: Maximum values of internal-js plugin-based time consumption(milliseconds)ID Saved Time Plugin Time Flag Time Time Overheadggle 0 195 176 347wkpd 15774 224 24571 27275live 7726 300 1148 1406twtr 4393 135 899 992qq 0 178 142 311amzn 13517 505 619 1243lnkd 3381 135 1456 1665yndx 0 123 193 309tbao 603 300 217 473yhoo 2195 316 324 623Table 5.18: Minimum values of internal-js plugin-based time consumption(milliseconds)ID Saved Time Plugin Time Flag Time Time Overheadggle 0 149 133 308wkpd 11139 150 20701 20397live 6321 237 967 1218twtr 3937 110 701 828qq 0 134 128 252amzn 12217 409 510 905lnkd 2711 120 1332 1349yndx 0 91 167 262tbao 518 218 171 396yhoo 1766 266 266 533525.4. EvaluationTable 5.19: Internal js-plugin-based bypassed comparisons: the detailed in-formation on events fired, comparisons that were categorized as unnecessary,false negatives and false positives.ID Events fired ByPassed States FN FPggle 9 0 10 0 0wkpd 170.2 158.4 10 1.2 2.8live 73.2 63.4 10 0 0.8twtr 49.6 40 10 0 0.6qq 9 0 10 0 0amzn 34.6 25.4 10 1.8 0.2lnkd 100.4 91.2 10 0 0.2yndx 9 0 10 0 0tbao 13.8 4.8 10 0 0yhoo 16.6 7 10 0 0.653Chapter 6Increasing the Number ofStates CrawledIn this chapter, we make the case for improving the scalability of crawlingmodern Web applications. As mentioned earlier, crawling modern Web ap-plications is a resource intensive process. In particular, managing storageand retrieval of states of Web applications has the potential to overwhelmthe memory of the crawler. The memory required for storing the states? datais likely to be of considerable size in modern Web applications. This mainlydepends on the average size of the states as well as the number of statesin the application under crawl. As the number of states in the state ma-chine increases, the memory allocated to the state machine grows larger andlarger. This continues up to a point where no further memory allocation ispossible because all memory available to the crawler is already allocated. Atthis point, the crawler breaks and the crawling process stops. To tackle thisproblem, we aimed at improving the memory utilization so that a ?greaternumber of states can be crawled? before all memory is allocated.In this chapter, first we briefly explain the background knowledge andfoundations on which the proposed enhancements are laid. As such, we touchupon the way memory management is achieved in Java because Crawljax,the crawler on which we examined our abstract ideas, is implemented in Java.This is done to pave the way for explaining the proposed improvements,the experiments, and the memory analysis. The memory management isfollowed by explaining how we chose a graph database for application inour solution. In fact, in the process of realizing our ideas into functionalmeasurable capabilities, we needed to select a graph database for creating analternative solution for improving the memory consumption in the crawlingprocess. Hence early in this chapter, we explain how the comparison is doneamong all viable options for selecting a graph database.In what follows, we continue with touching upon the foundation of mem-ory management to help present the suggested improvements, the memoryanalysis and the experiments.546.1. Memory Management6.1 Memory ManagementThe way memory is allocated and freed in the crawler does not solely dependon the way the tool is implemented. In fact, it is only the memory allocationthat is performed directly in the manner the design and implementation ofthe crawler dictates. 
However, the deallocation part of memory management is performed automatically by lower-level layers of the Java technology, which complicates memory measurements and optimizations in this project. Hence, we provide a brief introduction to the founding concepts of memory management in the Java Virtual Machine (JVM) on which the crawler is executed.

6.1.1 Java Virtual Machine Heap

The JVM heap is the part of JVM memory in which all class instances and arrays reside. All threads in a JVM share the same heap. The JVM heap is created when the JVM starts up. The heap can be of either a variable or an invariable size. In the former case, the heap can increase in size if the application requires more memory. Likewise, the heap size can also be reduced if the allocated heap is too large for the current consumption requirements of the application [20, 21].

There are also a number of JVM options that can be set (e.g. as command line arguments) to specify the initial, minimum, and maximum size of the JVM heap. Aside from being limited by the maximum size option (-Xmx), the size of the physical memory available to the underlying system on which the JVM runs is an upper limit for the heap size.

JVMs can also implement memory management processes to elevate the efficiency of memory utilization. These management processes are sometimes referred to as garbage collection (GC) processes. In order to make room for creating new objects, memory management processes remove the objects that no longer need to be retained in memory.

However, if an application consumes the heap memory to a degree that there is no more space left for creating further required objects, an out-of-memory error occurs in the JVM and the application execution stops:

"If a computation requires more heap than can be made available by the automatic storage management system, the Java Virtual Machine throws an OutOfMemoryError." (JVM Specification [21])

6.1.2 Garbage Collection

Garbage collection, or memory management in Java, is the process of removing the objects that will not be used in the future so as to prepare allocatable space for new objects in memory. Contrary to the C and C++ languages, which require the programmer to be in charge of object deallocation, Java takes care of clearing memory of unused objects. In what follows, memory management concepts such as the nursery, young generation, and old generation are described to shed light on how the JVM performs the memory management task.

As depicted in figure 6.1, there are two significant divisions in the JVM heap: the nursery (or young generation) and the old generation. The nursery is the part of the heap where new objects are allocated. When the nursery is full, the memory management process performs a garbage collection in the nursery to make room for new objects. At the same time, the objects that are still in use, and as a result are not collectible, are promoted to another part of the heap which is called the old generation. The garbage collection on the nursery is called a young or minor collection. When the old generation space is full, the garbage is collected in that space too. The GC on the old generation space is called an old or major collection.

Figure 6.1: Java heap divided into generations and the runtime options. Source: [2].

The concept of having a nursery in the heap is introduced to improve the performance of garbage collection. As "most objects are temporary and short-lived" [21], a two-generational heap allows implementing two types of GC to improve the efficiency.
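Before turning to the generational details below, a small standalone Java sketch (our own illustration, not part of Crawljax) shows how the heap ceiling set by the -Xmx option manifests at run time. Retaining a growing list of objects mimics what happens in the default crawler as states accumulate in the in-memory state machine:

    // Run with, for example: java -Xms256m -Xmx1g HeapWatcher
    // -Xms sets the initial heap size and -Xmx the maximum; once the live set of
    // objects no longer fits under -Xmx, the JVM throws an OutOfMemoryError.
    import java.util.ArrayList;
    import java.util.List;

    public class HeapWatcher {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            List<byte[]> retained = new ArrayList<>(); // stands in for states kept alive
            try {
                while (true) {
                    retained.add(new byte[1024 * 1024]); // allocate 1 MB and keep a reference
                    long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
                    long maxMb = rt.maxMemory() / (1024 * 1024); // reflects the -Xmx setting
                    System.out.println("used " + usedMb + " MB of " + maxMb + " MB");
                }
            } catch (OutOfMemoryError e) {
                // The -Xmx ceiling has been reached: this is the point at which the
                // default crawler would stop discovering new states.
                retained.clear();
                System.out.println("Heap exhausted; no further allocation was possible.");
            }
        }
    }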
As the most recently allocated objects are in the nursery, the GC required for the nursery can generally be much faster than the GC of the old generation space or the GC of a single-generational heap (a heap that treats old and new objects in the same manner in terms of collecting them).

Another improvement in the garbage collection process is that the JVM reserves a part of the nursery for the most recent objects. This area is called the keep area, and it is exempted from the first upcoming young collection. The reason behind this is that objects that have just been created in the keep area are very likely to survive the next young collection and hence be promoted to the old generation. The problem here is that once objects are promoted to the old generation space, they become immune to young collection and hence are kept in memory until the next major garbage collection. However, the keep area objects are either cleared or promoted by the second upcoming young collection.

Having covered the foundations of memory management in Java, which will help clarify the memory-related concepts throughout the chapter, we continue by discussing the qualities identified as requisite for alternative methods aiming at improving the memory consumption.

6.2 Criteria for Alternative Solutions

Optimizing the memory utilization is one of the goals of this work. As mentioned previously, there are multiple drivers that cause extensive memory consumption in modern Web application crawling. We, however, focus on the memory requirements arising from managing the storage of states in the crawling process. If the state space of a Web application is of considerable size and the states of the application are sizable on average, the crawling process runs out of memory at some point and stops crawling. In order to mitigate this limitation, we propose our ideas for improving the scalability of the crawling process.

Investigating the scalability problem of the crawling process, we put forward the idea of creating an alternative method for storing the state machine containing the states of the application. As the state machine is concurrently accessed and plays a significant role in the crawling process, our alternative solutions must deliver the following qualities:

- It must provide safe concurrent write and read capabilities for storing and retrieving states.
- It must not impose a considerable time requirement on the crawling process.
- It must have the flexibility for designing a suitable graph structure similar to the structure of the state machine, which is a graph data structure in the crawler.
- The memory overhead imposed by the execution of the solution and by the communication with it must not defeat the purpose of the solution. That is, it must use less memory than the default crawler.

Taking the above characteristics into consideration and searching for solutions to achieve our goal of improving memory consumption, we put forward the idea of utilizing a graph database.

Graph databases are optimized for use in applications whose data can naturally be represented in a graph-like form. In addition, some graph databases provide transaction management and multiple-access capabilities that are able to handle concurrent accesses. However, there are many graph databases, with various complexities and capabilities. We have to conduct a comparison to decide which graph database is most suitable for use in our solution.
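Before conducting that comparison, it is useful to make concrete what "storing the state machine" demands from any back end. The following minimal Java sketch summarizes the kind of operations an alternative state store must expose; the names are illustrative only (they are not Crawljax's actual interfaces), and the detailed functionality list extracted from the crawler follows in Section 6.3.1:

    // Illustrative sketch of the operations an alternative state store must support.
    // S stands for the state type and E for the transition (edge) type; in Crawljax
    // these roles are played by richer classes, so treat this as a summary, not an API.
    import java.util.List;
    import java.util.Set;

    public interface StateStore<S, E> {

        /** Adds the state only if no equivalent state is present; must be thread safe. */
        boolean addStateIfAbsent(S state);

        /** Adds a directed edge; both endpoints must already exist in the store. */
        boolean addEdge(S from, S to, E event);

        /** Number of states stored so far, maintained without locking the whole graph. */
        int stateCount();

        /** Outgoing and incoming edges of a given state. */
        Set<E> outgoingEdges(S state);
        Set<E> incomingEdges(S state);

        /** A shortest path of events from the index (start) state to a target state. */
        List<E> shortestPathFromIndex(S target);

        /** All states and all edges, e.g. for exporting the crawl model. */
        Set<S> allStates();
        Set<E> allEdges();
    }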
Furthermore, the time and memory overheads imposed by the selected database will be the focus of our research questions, and they will be extensively investigated in our experiments.

In order to perform this comparison, we need to set out the criteria for it, including what functionalities need to be supported and what qualities must be held by the database. In the following section we explain how we extract the criteria for selecting the database by reverse engineering the crawler.

6.3 Criteria for Selecting a Graph Database

Based on the investigation of Crawljax, we discovered the required functionalities that the alternative solutions must provide. A preliminary study of the concepts and characteristics of graph databases also helped us find a number of additional criteria for comparing current graph databases. We use these criteria to sift the available options and choose the graph database that fulfills our requirements best. The guidelines for choosing the database are explained in what follows. They include required functionalities, licensing, partial locks, the ability to have complex nodes in the graph, and productivity and maintenance issues such as offering an API for Java and Maven public repository availability.

6.3.1 Required Functionalities

Based on the investigation of the source code of Crawljax, we deduce that any alternative solution we provide for the Crawljax state-storage and retrieval mechanism has to provide the following functionalities:

- Concurrent reading from the stateflow graph.
- Concurrent writing to the stateflow graph.
- Ensuring uniqueness when adding states.
- Ensuring the presence of the start and end nodes when adding an edge in the stateflow graph.
- Managing the count of states already in the graph in a block-free and thread-safe manner.
- Feasibility of implementing a directed multigraph data structure.
- Finding the previous state of a given state in the state machine.
- Calculating the path to a given state from the index state.
- Getting all outgoing edges from a node.
- Finding all incoming edges to a node.
- Finding the outgoing nodes from a given node.
- Finding the target node of an edge.
- Checking if an edge exists between two nodes.
- Calculating the shortest paths from a given vertex to another vertex, ideally by Dijkstra's algorithm.
- Finding all nodes.
- Finding all edges.

We also identified a number of implicit requirements that we needed to fulfill. These requirements, extracted from the investigation of the crawler logic and code, are presented briefly as follows. First, we should be aware that whenever changes are made to the state of the application under crawl, the state must be compared to all previous states in the stateflow graph. Hence we need a method for performing these comparisons to ensure the uniqueness of states added to the stateflow graph. However, if the states are stored as primary keys in the database, or if the database does not allow duplicate vertices, the database itself might have an optimized way of performing the comparisons. On the other hand, we might not have to store the whole state object in the stateflow graph; instead, it might suffice to store the IDs of the states or their hashed representations. In that case, we have to provide a mechanism for linking these states' keys to the actual states' data.
6.3.2 Licensing

It is very important for us to ensure that the licensing characteristics of the graph database match the Crawljax open-source licensing. In particular, we consider whether it is proprietary software or whether it is available under some type of open-source license. As Crawljax itself is offered under the generous Apache version 2 license, we would like to avoid imposing any limitations on users of the crawler. To that aim, we avoid choosing more restrictive licenses and proprietary licenses.

6.3.3 Partial Locks

In the default version of the crawler, the methods that add states to the multigraph are synchronized over the entire stateflow graph. Thus, when a crawling node is examining the uniqueness of a new state, or when a crawling node is adding the new state to the graph, the whole stateflow graph is locked. What is ideal for us is to find a graph database that is able to provide a facility to update, add, or delete states and edges without locking the entire graph. Hence the ability to lock the database partially is of critical importance to us in the selection of a graph database.

6.3.4 Storing Objects in Nodes and Edges

Crawljax stores various pieces of information in the nodes and edges of the stateflow graph. As the states and the navigational paths between them comprise multiple objects, our database must be able to store these objects, or at least a transformation of the objects, in its nodes and edges.

6.3.5 API for Java and Maven Availability

As Crawljax is completely implemented in Java and utilizes Maven dependency management facilities, we prefer to use a database that provides a Java API and can be added as a dependency to Crawljax via public Maven repositories.

Having covered the criteria and guidelines for selecting a graph database for use in our solution, we go on with presenting the actual comparison and selection of the database in the next section.

6.4 Comparison of Graph Databases

We conducted a comparison among the available graph databases in order to select the most applicable graph database for our application. The comparison is based on the criteria listed in section 6.3. In addition to those criteria, we considered the design flexibility and the maintenance implications of our design as well. In particular, although some of the triple stores seemed to be promising in terms of the maturity of the software product, we decided not to proceed with triple stores. The reason was to avoid imposing unnecessary complexity on the crawler system. This complexity would be introduced by having to transform graph-like data into a set of complex relationships that could be shaped into triples. Hence, to eliminate the need for this transformation, we omitted triple stores from our final choices.

Table 6.1 shows the comparison of the databases that we considered for use in our solution. The abbreviations used in the table have the following meanings:

Funct.: The functionalities listed in the guidelines in section 6.3.1.
J & M: API for Java and Maven public repository availability.
SCONE: Storing complex objects in nodes and edges.
GDB: Graph database name.
PL: Partial locks.

As this comparison is performed in a more qualitative than quantitative manner, we use a three-level scale for each criterion. These levels are depicted as the background color of the cells of the table.
These levels are shown in table 6.2.

Table 6.1: Graph databases comparison: Neo4j emerged as the graph database most suited to our application.

GDB            Licensing    J & M      PL   Funct.     SCONE
Neo4j          GPLv3        Yes        Yes  Yes        Yes
FlockDB        APL2         Not Maven  Yes  No         No
GraphDB        APL2         Not Maven  Yes  Yes        HyperGraph
AllegroGraph   Proprietary  Not Maven  Yes  Partially  TripleStore
BigData        GPLv2        Yes        Yes  Yes        RDF
Filament       BSD          Poor       Yes  Poor       No
Graphbase      Proprietary  Not Maven  Yes  Yes        RDF
HyperGraphDB   LGPL         Yes        Yes  Complex    HyperGraph
InfiniteGraph  Proprietary  Yes        Yes  Yes        Yes
InfoGrid       AGPLv3       Yes        Yes  Partially  No
Titan          APL2         Immature   Yes  Yes        Yes

Table 6.2: Graph databases comparison scale levels

Level                Color
Acceptable           Green
Average              Yellow
Not very acceptable  Red

Based on the comparison presented in table 6.1, it emerged that Neo4j is the graph database most suited for use in our application. As shown in table 6.1, this graph database satisfies all our desired characteristics. In particular, its extensive locking options are very useful for the concurrent access requirements of the crawling process. It also provides optional multilevel caching that can optimize the time performance of the crawling. It is implemented in Java and is available in the Maven public repository. Furthermore, its open-source license fits well with the Crawljax license.

Having covered the foundations on which the scalable solution is based, the discussion continues by presenting our ideas for improving the number of states crawled in the next section.

6.5 Scalable Solution

Having selected Neo4j as our choice of graph database, we built an alternative solution for storage and retrieval of the state machine. Aiming at freeing the memory needed for storing the state machine, we utilized the graph database to delegate all the state storage functionalities to secondary memory. We utilized the graph database capabilities to provide all the previous functionalities of the in-memory stateflow graph. In addition, we wrapped the solution in the very same interface as that of the in-memory stateflow graph. By introducing this new component to the crawler, we aimed at increasing the scalability of crawling by exploring a greater number of states.

Handling concurrent accesses to the state-holder data structure is one of the important aspects of the crawling algorithm. Our alternative solution handles multiple concurrent accesses by utilizing the database's transaction management capabilities. The Neo4j graph database transactions support the ACID properties and ensure the safety of concurrent read and write operations.

As the API we provide for the proposed state storage solution is the same as the default one, the changes are encapsulated only within the scope of the storage component. However, one of the internal differences of the proposed solution is that it has to serialize the live objects into persisted byte arrays so as to be compatible with the data types the graph database accepts. Hence, we changed numerous classes with minor alterations to fulfill the requirement for serializability. Bearing in mind that introducing a database component to the system brings about time and memory trade-offs, we describe them in the following.

6.5.1 Memory Performance Trade-Offs

Delegating the management of the states and the transitions between them to a component that predominantly does not use main memory for storing data frees a considerable amount of space in the main memory.
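As an illustration of this delegation, the sketch below stores two states and the transition between them through Neo4j's embedded Java API (assuming a Neo4j 2.x-style API and plain Java serialization). The label, property keys, and relationship type are our own illustrative choices rather than the exact names used inside the crawler:

    // A sketch of persisting a crawl state and a transition outside the Java heap.
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    import org.neo4j.graphdb.DynamicLabel;
    import org.neo4j.graphdb.DynamicRelationshipType;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.graphdb.factory.GraphDatabaseFactory;

    public class StatePersistenceSketch {

        private final GraphDatabaseService db =
                new GraphDatabaseFactory().newEmbeddedDatabase("target/crawl-db");

        /** Serializes a live object into the byte-array form Neo4j can store. */
        private static byte[] toBytes(Serializable object) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            return bytes.toByteArray();
        }

        /** Stores two states and the transition between them in one ACID transaction. */
        public void saveTransition(String fromId, Serializable fromState,
                                   String toId, Serializable toState) throws IOException {
            try (Transaction tx = db.beginTx()) {
                Node from = db.createNode(DynamicLabel.label("State"));
                from.setProperty("id", fromId);
                from.setProperty("payload", toBytes(fromState));

                Node to = db.createNode(DynamicLabel.label("State"));
                to.setProperty("id", toId);
                to.setProperty("payload", toBytes(toState));

                from.createRelationshipTo(to, DynamicRelationshipType.withName("TRANSITIONS_TO"));
                tx.success(); // commit; writers only contend on the nodes they touch
            }
        }
    }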
On the other hand, the utilization of a database introduces an additional memory overhead. The database (which is not an in-memory one) requires significantly frequent IO operations. These IO operations are memory consuming. In addition, we serialize the states and transitions before storing them in the database. Likewise, we deserialize them before bringing them back to life and using them in the crawling process. The serialization and deserialization processes carry considerable memory overheads. The transactions conducted by the database cause substantial memory overhead as well.

6.5.2 Time Performance Trade-Offs

In the current state of the crawler, the whole state machine is locked when a process wants to add a new state to the data structure. This puts other processes on hold and increases the time consumption of the crawling. In the scalable solution, however, partially locking transactions expedite the process of crawling by eliminating the need for locking the entire state machine. On the other hand, the scalable solution needs to communicate with the hard disk very frequently. This means the additional latency of the IO operations has the potential to impose a significant time overhead on the crawling. It is, however, the evaluation that answers how significant it can be. In the next section we discuss the evaluation of the scalable solution.

6.6 Evaluation

In the scalable crawling solution, we aimed at increasing the number of states crawled by improving the memory performance of the crawling. To this aim, we utilized a graph database to free the main memory from being allocated for state storage purposes. We aimed at answering the research questions by conducting experiments addressing the correctness, memory performance, and time performance of the solution.

We conducted a series of experiments to answer the research questions that shaped the objectives of this work. In this section we present the design of our experiments, the methodology of conducting them, and the results achieved from carrying out these experiments. We discuss the evaluation of the scalable crawling solution and the results of the related experiments in the following.

6.6.1 Research Questions

Considering the time and memory trade-offs of the scalable solution, we are interested in investigating how our solution performs in real crawling sessions. As such, we designed a number of research questions to shed light on the applicability of our solution. We defined three research questions to pinpoint the goals of the enhancements carried out in this work. These research questions address the correctness, scalability, and performance of the proposed solution compared to the default version of the crawler.

In particular, we address three different aspects of the proposed solution for improving the scalability of crawling. We investigated whether the solution performs the task of crawling in a correct manner. In addition, we measured how scalably the new solution works compared to the default version of the crawler. The time performance of the alternative solution was investigated too. In addition, a new research question emerged from interesting observations made during the experiments.

To delve into the goals of the work, here we break down the second research question into three more specific research questions to pinpoint the objectives of the research.

Correctness The first sub-question addresses the correctness of our graph database-based solution. This is indeed important because the state machine plays a very critical role in the crawling process, especially taking into account the concurrent aspects of the crawler.
RQ2.a: Does (and to what extent) the proposed solution perform the task of crawling a Web application correctly?

This research question inherently dictates a type of experiment that is very similar to what is generally carried out in the process of testing software artifacts. Thus, we need to define what correctness means in this context. In fact, similar to software testing, we need an oracle to decide whether the task of crawling is performed correctly or not.

Scalability: Memory Performance The second sub-question addresses the scalability of the new mechanism for managing state storage and retrieval.

RQ2.b: To what extent can the proposed crawler crawl Web applications in a more scalable manner than the default version of the crawler does?

In the context of this work, we define the scalability of a crawler as the ability to crawl and store more states within a given Web application utilizing a certain amount of main memory. Hence, this RQ aims to determine whether the proposed version is able to overtake the default version in crawling a larger number of states within a Web application while the maximum Java heap space is limited to a certain amount of memory.

Time Performance In the third sub-question we target the time consumption of the proposed solution compared to the default crawler.

RQ2.c: How does the graph database-backed solution perform in crawling Web applications in terms of time consumption?

This research question examines in particular the time the proposed version requires to crawl a Web application compared to the default version of the crawler. We are interested to see which version crawls a specific number of states in a Web application in a shorter amount of time, given a limited amount of memory.

Having stated and explained the research questions highlighting the objectives of the work, we move on to the discussion of how we pursued the answers to these questions by designing the experiments.

6.6.2 Experimental Design and Methodology

Correctness Experiments As the very first step, correctness must be defined in the context of this work. In order to achieve this goal, we need to find a baseline for assessing our new solution. This baseline is naturally the original crawler. Hence, we assume that the default version of the crawler crawls a Website correctly. As such, correctness, in this context, means performing the task of crawling in the same manner as the default version of the crawler does. This, in turn, suggests that correctness means producing the same output as the original version. So the default version can be utilized to obtain an oracle for assessing the proposed version. Moreover, this oracle can be used to determine to what extent the new version performs in equivalence to the default version. As mentioned in the introduction, the crawler builds a model out of the crawling of a Web application in the form of a stateflow graph. We utilize this stateflow graph model to compare whether our proposed crawler produces the same output as the default version.

There are two important issues that need to be addressed before we use the stateflow graph for shaping our oracle. First, we should be aware that we are implicitly assuming the default version of the crawler is deterministic. Deterministic in this context means that if we crawl the same Web application repeatedly, the same stateflow graphs are built as the result of the crawling sessions.
In addition, we need to have a Web application whose state transitions remain the same over the course of multiple crawling sessions. In other words, the Web application must not behave randomly, must not have time-based behavior, or any other type of behavior that leads to different stateflow graphs over multiple crawling sessions.

The first step in this experiment is to either find or build a deterministic Web application to be used as an object of the correctness experiments. Building one such Web application seemed to be more feasible, if not the only option, because we could not yet use the crawler for finding deterministic Websites. The reason is that we did not yet know whether the crawler is deterministic. As such, we built a Web application that has multiple clickables. Firing click events on these clickables causes the state of the application (the DOM tree) to transition to numerous states. The key point here is that the application is designed in a way that clicking on a specific sequence of elements will always result in the same sequence of state transitions. This Web application is assured to work deterministically, having been tested thoroughly and manually. Having a simple and manageable design, the application is completely deterministic.

Having confidence in the determinism of the object application, we need to investigate whether the crawler has deterministic behavior too. Determinism for the crawler means that, given the Web application is deterministic, the partial stateflow graph created by the crawler is identical over multiple crawl sessions. This means that if we set a limit on the total number of states crawled and crawl the Web application twice, the resulting stateflow graphs, SFG1 and SFG2, must have the following properties:

- The number of states found in both stateflow graphs, SFG1 and SFG2, must be equal.
- All the states present in SFG1 must be present in SFG2 too, and vice versa.
- Given there is a transition T1 between states S1 and S2 in SFG1, there must be a transition T2 in SFG2 which joins the counterparts of S1 and S2.

By partial stateflow graph we mean that the states of the Web application are not exhaustively discovered to the point that no further state is detectable in the Web application.

In order to assess if these qualities hold in the crawler, we conduct an experiment to answer the following research question:

RQ2.0: Is the default version of the crawler deterministic in crawling Web applications?

In order to answer this question, we crawl the object application we developed ten times and compare the stateflow graphs to see if they hold the above qualities. The comparison results show that the stateflow graphs satisfy the above criteria. This means we can rely on the deterministic nature of the crawler to form our correctness oracle.

Having discussed the determinism of the crawler, we proceed with investigating whether the graph database-backed solution works correctly. To assess the correctness of the proposed solution, we crawl the deterministic Web application with both the default crawler and the crawler enhanced by our solution. Afterward, we compare the two stateflow graphs based on the properties listed above.

In order to further investigate the correctness of the proposed solution, we do not limit the experiments to the deterministic Web application developed by ourselves. Having ensured the default crawler is deterministic, we use it to find more deterministic Web applications.
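The properties above boil down to checking that the two graphs contain exactly the same states and that every transition in one graph has a counterpart in the other. A simplified Java sketch of such a check, with states reduced to identifier strings (for instance, hashes of their DOM trees), is shown below; it illustrates the oracle rather than reproducing the exact comparison code used in the experiments:

    import java.util.Map;
    import java.util.Set;

    public final class StateflowGraphComparator {

        /**
         * Checks the equivalence properties over two partial stateflow graphs, each
         * reduced to a set of state identifiers and a map from a state to the
         * identifiers of the states it has transitions to.
         */
        public static boolean equivalent(Set<String> states1, Map<String, Set<String>> edges1,
                                         Set<String> states2, Map<String, Set<String>> edges2) {
            // Property 1: equal number of states.
            if (states1.size() != states2.size()) {
                return false;
            }
            // Property 2: every state of SFG1 is present in SFG2 and vice versa.
            if (!states1.equals(states2)) {
                return false;
            }
            // Property 3: each transition S1 -> S2 in SFG1 has a counterpart in SFG2
            // (checked in both directions for symmetry).
            return containsAllTransitions(edges2, edges1) && containsAllTransitions(edges1, edges2);
        }

        private static boolean containsAllTransitions(Map<String, Set<String>> superGraph,
                                                      Map<String, Set<String>> subGraph) {
            for (Map.Entry<String, Set<String>> entry : subGraph.entrySet()) {
                Set<String> targets = superGraph.get(entry.getKey());
                if (targets == null || !targets.containsAll(entry.getValue())) {
                    return false;
                }
            }
            return true;
        }
    }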
The solution we came up with for finding additional deterministic Web applications is to crawl a Web application multiple times with the default crawler and compare the stateflow graphs. If the stateflow graphs are equal, we deduce that the application is deterministic and can be used in the correctness experiments.

We started running the determinism test on the top Alexa Websites to check whether we could find objects for our experiments. However, these most visited Web applications are highly complex, and we found that none of the top ten were deterministic. We also tested other simple applications that we expected might be deterministic, but still none of them were. What we did next was to start crawling random Websites, hoping that we could find some deterministic Web applications. Fortunately, we found a number of Web applications that were deterministic, so we used them as objects of the correctness experiments. The applications found to be deterministic are listed in table 6.3, and the Websites that were crawled but were not deterministic are listed in appendix A.

Table 6.3: Correctness experiments objects

States Crawled  Web Application
50              www.al-awda.org
50              www.martinkrenn.net
50              www.thething.it
50              www.c-level.org
50              www.rprogress.org/index.htm
50              www.math.mcgill.ca det
1               www.isc.org/downloads/BIND
1               www.hunyyoung.com
2               home.planet.nl/mooij321
1               www.antique-hangups.com
3               www.project451.com
1               lmuwnmd.wpengine.com/...

Scalability Experiments As we finished the correctness experiments, having been assured that the new solution crawls Web applications in the same manner as the default crawler and produces the same output, we proceeded with the memory performance experiments. However, the first results showed that our solution was not able to outperform the default version in terms of the number of states it crawled. We concluded that this was either a sign of inefficient memory consumption in our solution or an indicator of too large a memory overhead imposed by the solution.

As a result, we conducted a comprehensive memory analysis on the default crawler and the new solution to understand how exactly the memory is consumed by different components of the crawler and to find inefficiencies in them. This memory analysis gave us insight into the internal behavior of the crawler and helped resolve the inefficiencies. The memory analysis is presented after the experiment results in section 6.7.

Conducting the scalability experiments, we investigated whether the new version could improve the number of states crawled compared to the default version of the crawler, as stated in research question RQ2.b. To this aim, we crawled a number of top Alexa Web sites with the proposed crawler and the default crawler to compare memory utilization in the two versions of the crawler. The important variables and facts about the experiments are as follows:

- Top 10 Alexa Websites are crawled as the objects of our scalability experiments.
- We limit the memory by setting the maximum Java heap size to one Gigabyte.
- The most important criterion for comparison is the number of states crawled given the limitation on the memory usage.
- For each crawling experiment, the default version and the enhanced crawler are run with the same configurations.
- The maximum number of states is set to be unlimited, so the crawler crawls until it uses up all the memory, given that the Web application has enough states.
- DOM size for each state is measured and recorded.

To evaluate the memory performance of the proposed solution we crawled the ten most popular Websites from the Alexa top sites. With each version of the crawler, we crawled each Web application five times and then took the average of the results over the five crawling sessions. In each experiment we collected the average DOM size of the states in characters. Most importantly, the number of states crawled up to the very point before throwing the out-of-memory exception was recorded in the experiments. The objects of the scalability experiment are presented in table 6.4.

Table 6.4: Scalability experiments objects

ID    Web Application
ggle  www.google.com
wkpd  www.wikipedia.com
live  www.live.com
twtr  www.twitter.com
qq    www.qq.com
amzn  www.amazon.com
lnkd  www.linkedin.com
baid  www.baidu.com
facb  www.facebook.com
yaho  www.yahoo.com

As mentioned in the background, there is always a limitation on the amount of memory available to the Java heap, be it a start-up option or the maximum capacity of the system on which it is running. We, however, set this limitation to one Gigabyte. We selected 1 Gigabyte as the upper limit because, in a system used by developers for programming, such as the system we have been using, 1 Gigabyte is almost the maximum amount that can be allocated to the Java heap while still being able to run other tasks, e.g. the experiments in our case, smoothly without having the system freeze. The experiments end when the crawler allocates all the Java heap memory and throws an out-of-memory error because there is no more memory left to allocate to the requests of the crawling process.

We had various options for measuring memory consumption in the experiments. We utilized the VisualVM and YourKit Java profilers for monitoring the memory consumption of the two versions of the crawler. In section 6.6.3 we discuss the results of the experiments.

Time Performance Experiments As mentioned in the explanation of RQ2.c, time performance in this context is assessed holistically and is based on the time required to crawl Web applications by the two versions of the crawler. In particular, we aim at investigating which version requires less time to crawl a certain number of states in the object Web applications. There are a number of important differences between these two versions of the crawler that are decisive for the results of the experiments. In particular, we are interested in investigating the counteracting effects of the following characteristics.

First, the scalable version utilizes a database that resides in secondary memory (the hard disk). This means the database needs to perform a considerable number of I/O requests and, as a result, might consume significant time while storing and retrieving data. On the other hand, and more interestingly, the database does not have to lock the stateflow graph to add or update data, in contrast to the default version, which locks the stateflow graph; as a consequence, threads cannot update the model simultaneously in the default version.

The objects of these experiments are the same as those of the scalability experiments listed in table 6.4. We crawl each of these Web applications five times with both crawlers and measure the time. The maximum number of states is set to 100 and the Java heap is set to a maximum of one Gigabyte. As mentioned in the scalability experiment, there is always a maximum for the Java heap. If we had not set the maximum manually, it would have been the default maximum.
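For reference, the sketch below shows how one such time-performance run could be configured. We assume a Crawljax 3.x-style builder API here; the exact method names may differ between releases, so treat this as an outline of the configuration rather than verbatim code. The one-Gigabyte heap cap is not part of this configuration; it is passed to the JVM separately (e.g. -Xmx1g):

    import java.util.concurrent.TimeUnit;

    import com.crawljax.core.CrawljaxRunner;
    import com.crawljax.core.configuration.CrawljaxConfiguration;
    import com.crawljax.core.configuration.CrawljaxConfiguration.CrawljaxConfigurationBuilder;

    public class TimeExperimentLauncher {
        public static void main(String[] args) throws Exception {
            CrawljaxConfigurationBuilder builder =
                    CrawljaxConfiguration.builderFor("http://www.google.com");
            builder.setMaximumStates(100);                 // cap used in the time experiments
            builder.setMaximumDepth(10);
            builder.setMaximumRunTime(10, TimeUnit.HOURS); // generous wall-clock safety limit
            builder.crawlRules().clickDefaultElements();   // same click rules for both versions

            long start = System.currentTimeMillis();
            new CrawljaxRunner(builder.build()).call();    // blocks until crawling finishes
            System.out.println("Crawl took " + (System.currentTimeMillis() - start) + " ms");
        }
    }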
Hence, given the fact that all objects were crawlable up to 100 states while utilizing 1 Gigabyte of memory, we opted to set this upper limit, which also fits well with the scalability experiments.

Having covered the design and methodology of the experiments, we present the results of the experiments in the following section.

6.6.3 Results

The results of the experiments are presented here to answer the three research questions that were explained in the previous sections. These research questions targeted the correctness, scalability, and time performance of the proposed solution.

RQ2.A: Correctness We conducted the correctness experiments on the Web applications listed in table 6.3. In addition, we ran the experiment on the deterministic Web application that we developed as well.

The results show that the new solution works correctly in crawling all of the Web applications crawled. Hence, the answer to RQ2.a is positive.

The scalable crawler produces exactly the same output as the default crawler. In fact, the first correctness experiment that we ran was not positive, but that led us to find and fix the bugs, after which the results were positive for all Web applications.

RQ2.B: Scalability The gist of the results of the scalability experiments is presented in the bar charts of figures 6.2 and 6.3. As presented in figure 6.2, the results show that in all experiments the proposed solution performs more efficiently and demonstrates more scalability than the default version. In addition, as shown in figure 6.3, the proposed crawler achieves improvements in all experiments with an overall average of 88.16%. Hence, the answer to question RQ2.b is:

The proposed solution improves the scalability of the crawling process by 88.16% on average.

Table 6.5 shows how the new solution outperforms the default version when tested on the different Web application objects. This table shows the details of the results.

Figure 6.2: Scalable vs. default no. of states crawled: in all cases the proposed version overtakes the default crawler in terms of the number of states crawled given a limited amount of heap memory. (Bar chart over the Web applications ggle, wkpd, live, twtr, qq, amzn, lnkd, baid, facb, yaho, and their average; y-axis: No. of States, 0-3000; series: Default and Scalable.)

Figure 6.3: Scalability improvements: the percentage of increase in the number of states crawled. (Bar chart per Web application; y-axis: Improvement Percentage, 0-350.)

RQ2.C: Time Performance As presented in table 6.6, the default crawler performs more efficiently than the proposed solution. This is, in fact, not surprising. As discussed in 6.5.2, the time performance trade-offs anticipate the fact that introducing a slower component, such as secondary memory, to the crawling process can slow down the system. Moreover, compared to the scalability achieved by the solution, the time overhead is not intolerable. Hence, the answer to the question is: although
the proposed solution outperforms the default crawler in two cases, and in six out of ten cases the time overhead is less than 10%, on average the time performance of the default crawling technique looks more promising.

The proposed solution imposes a time overhead of 18.20% on average; if Facebook.com is excluded from the results, the rest of the experiments show a time overhead of 6.13%.

6.7 Further Memory Analysis

As mentioned earlier, we performed a comprehensive memory analysis on both the default crawler and the enhanced version. We performed this analysis to gain more insight into the way memory is utilized in the heap, so as to crawl a greater number of states by means of improving the memory performance.

The initial goal of this memory analysis was to discover the inefficiencies of the proposed version so as to achieve more scalability. This led to finding two memory leak points and patching them. However, after finding the memory leak points and achieving the scalability, we continued the analysis in more depth so that it can be utilized in future memory optimizations.

Table 6.5: Scalability experiments results: the proposed version outperforms the default crawler in terms of the number of states crawled.

ID       Default States  Scalable States  Improvement (%)
ggle     257.4           1057.6           310.88
wkpd     347.6           919.6            164.56
live     1407.4          2593.4           84.27
twtr     114             133.2            16.84
qq       130.2           176.4            35.48
amzn     284.8           412.8            44.94
lnkd     1357.8          2506.8           84.62
baid     247             303.4            22.83
facb     1362.4          2096.4           53.88
yaho     331.4           541              63.25
average  584             1074.06          88.16

We have considered two main approaches to compare the efficiency of memory consumption of the two versions of the crawling algorithm. First, we can limit the amount of memory available to both crawlers and measure the number of states crawled before all memory is consumed. The other option is to limit the number of states and measure the amount of memory consumed in the crawling. Both of these methods have some advantages and disadvantages.

As the aim of this work was to increase the number of states crawled given a limited amount of memory (e.g. one Gigabyte), measuring the number of states is a more intuitive approach for carrying out the experiments. However, measuring the memory gives us more insight into what is taking place internally and helps us understand whether the memory is used efficiently. Moreover, measuring the memory consumption can provide pointers for finding inefficient parts of the crawler and fixing them. Hence we use both methods to achieve a deeper insight into the memory consumption characteristics of the crawlers.

In order to compare the scalability of the two versions of the crawler, we started off by carrying out an experiment with the specifications listed in table 6.7.

What we measured here was the number of states crawled given the limited amount of memory. Neither of the crawlers exceeded the time limit. In fact, the time limit was deliberately set to 10 hours as, by experience,
we knew that this amount of time is enough for both crawlers to exceed the memory limit. The results, however, to our surprise, showed that the proposed solution did not outperform the default crawler in terms of the number of states crawled. The default version crawled 533 states while the proposed solution crawled only 378 states.

Figure 6.4: Scalable vs. default time performance: the proposed version overtakes the default crawler only in two cases out of ten, and in six cases the time overhead is less than 10%. However, on average the time overhead is 18.20%. (Bar chart over the Web applications ggle, wkpd, live, twtr, qq, amzn, lnkd, baid, facb, yaho, and their average; y-axis: Time (s), 0-1500; series: Default and Scalable.)

Using the second approach, we carried out another experiment by profiling the crawlers using the VisualVM profiling tool. We set up a second experiment with the specifications outlined in table 6.8.

The result of the experiment, however, shows that the default version consumes less memory than the enhanced version, which was surprising to us. The memory consumption of the default version, shown in figure 6.5, reaches a maximum of 78 Megabytes, while the enhanced crawler exceeds a memory consumption of 197 Megabytes, as shown in figure 6.6. As a result, we repeated the same experiment but changed the object of the experiment, crawling www.yahoo.com instead of the previous URL. However, the results were still not promising. We observe, as shown in figures 6.7 and 6.8, a spike at the early stages of crawling for both crawlers. Afterward, neither of the crawlers shows a significant improvement over the other version.

Table 6.6: Time performance experiments results: the default crawler performs more efficiently in most of the cases. However, in two cases the proposed solution overtakes the default crawler, and in six out of ten cases the time overhead is less than 10%.

ID              Default Time (s)  Scalable Time (s)  Overhead (%)
ggle            219.2             231.8              5.75
wkpd            439.4             557                26.76
live            510.2             512.4              0.43
twtr            1237.8            1535.8             24.07
qq              837.2             952.4              13.76
amzn            310.2             271.4              -12.51
lnkd            196               170.6              -12.96
baid            969.4             1007.6             3.94
facb            137.8             312.6              126.85
yaho            281.6             298.2              5.89
Average         513.88            584.98             18.20
Avg excl. facb  555.67            615.24             6.13

By observing the memory consumption graphs of VisualVM and the log output of the crawlers, we came to the conclusion that we needed to revise the experiments. First, observing considerable spikes at the beginning of the crawling shows that 100 states is not a justifiable limit for the number of states, because the memory required for the storage of the states is not considerable enough to make the maximum occur at the end of the crawling rather than at the beginning. In addition, the graphs produced by VisualVM do not distinguish between the old generation and the young generation. Hence we considered utilizing a more powerful tool providing us with more information about the memory utilization.

In fact, automatic garbage collection in Java, which cleans unused memory in a relatively complicated way, makes memory measurement experiments more cumbersome. The main reason is that the frequency of garbage collection does not depend solely on having unused memory to be cleaned, i.e. the frequency is not constant. The frequency is also dependent on the amount of memory available to be allocated. The less memory available in the heap, the more frequently garbage collections occur.

In addition, there are two types of garbage collection, major and minor garbage collections, each of which affects the used memory measurements differently.
Table 6.7: Initial memory experiment specifications

Aspect                    Specification
URL                       www.yahoo.com
Maximum Number of States  Unlimited
Maximum Depth             10
Maximum Time              10 Hours
Maximum Memory            1.5 Gigabytes

Table 6.8: Second memory experiment specifications

Aspect                    Specification
URL                       www.google.com
Maximum Number of States  100
Maximum Depth             10
Maximum Time              10 Hours
Maximum Memory            1.5 Gigabytes
Profiling Tool            VisualVM

There is almost always a difference between the amount of memory allocated and the amount of memory that we expect to be allocated based on our assumptions about the algorithm of a Java program. Hence, measuring memory at specific points in time, without confidence about whether garbage collection has been performed or not, is not the best available indicator of memory utilization.

In the default crawler, as the states' data is retained in memory until the end of the crawling, we expect this data not to be garbage collected, neither by major collection nor by minor collection. On the other hand, when running the proposed crawler, we expect the states' data to be garbage collected once in a while, as all references to this data are gone after saving the state data in the database. As such, we need the young generation and old generation measurements to understand whether the objects related to the states in memory are garbage collected or not.

Hence, instead of comparing memory at specific points in time, we need to have old and young generation data and observe their trends. What we expect to observe, by comparing the trends of the old generation in the two crawlers, is that the states' information is freed in the enhanced crawler. So we profile both crawlers with the YourKit Java profiler to achieve a better understanding of the memory consumption in both crawlers.

Figure 6.5: Memory consumption of the default version: the experiment is carried out on google.com and the memory consumption is demonstrated.

First of all, we found two leak points in the proposed crawler. In fact, what held us back from achieving more scalability was that as the states were being added to the stateflow graph one by one, they were also being referenced in another sub-component of the crawler that was responsible for managing the paths of the crawling. By fixing the leak points in those parts, we achieved the scalable results presented in the scalability experiment results.

Utilizing the YourKit profiler, we took snapshots of the memory usage of the default crawler and discovered the information summarized as follows:

- The state machine method in which the new states are created (newStateFor) accounts for 52 percent of the memory allocated to objects.
- Inspecting the new DOM accounts for 45 percent of the memory allocated to objects.
- 99 percent of the objects created by newStateFor are state objects.
- Objects allocated by the inspectNewDOM method are mostly DOM-related objects such as HTML element, HTML anchor element, HTML script element, etc.

Figure 6.6: Memory consumption of the proposed version: the experiment is carried out on google.com and the memory consumption is demonstrated.

We conducted the same analysis on the snapshot taken from the scalable version, and the information is summarized in the following:

- The state machine method in which the new states are created (newStateFor) accounts for 52 percent of the memory allocated to objects.
- Inspecting the new DOM accounts for 64 percent of the memory allocated to objects.
- Similar to the default version, objects allocated by the inspectNewDOM method are mostly DOM-related objects such as HTML element, HTML anchor element, HTML script element, etc.
- The method follow(), which brings the browser from the index to the state that must be explored, allocates 24 percent of the memory to objects mostly used for IO.
- State objects retain only 2 percent of the memory.
- The candidate actions account for 69 percent of the memory.

We performed seven different memory inefficiency inspections on both crawlers to diagnose potential memory inefficiencies.

Figure 6.7: Memory consumption of the default version: the experiment is carried out on yahoo.com and the memory consumption is demonstrated.

The plots of memory utilization by the two versions are provided in figures 6.9 and 6.10.

As expected, the old generation for the default crawler is ascending because states are not freed and are retained in memory. However, for the proposed solution, the freeing of state objects makes the graph descend at times.

As presented in the tables and figures of this chapter, the proposed solution works correctly in crawling all object Web applications. In addition, the proposed solution improves the scalability of the crawling process by 88.16%. This improvement is achieved at the cost of a time overhead. The proposed crawler imposes a time overhead of 18.20% on average; if Facebook.com is not considered, the rest of the experiments suggest a time overhead of 6.13%.

Having illustrated our improvement ideas on the scalability of the crawler and presented the results of the experiments, we revisit the questions and the answers obtained for them in this work in the next chapter.

Figure 6.8: Memory consumption of the proposed version: the experiment is carried out on yahoo.com and the memory consumption is demonstrated.

Figure 6.9: Default memory consumption: the lowest curve (in red) shows the old generation allocation. On top of that, the young generation (in salmon) is presented. The highest curve (in blue) shows the maximum amount of memory available to be allocated, i.e. the Java heap size. The important observation here is that the old generation allocation never drops, as all the states are retained in memory. In this experiment the Java maximum heap size is set to 1 Gigabyte and 330 states are crawled. (Plot: Java heap memory consumption of the default crawler; x-axis: Time (seconds); y-axis: Memory (Megabytes); series: Old Generation, Young Generation, Heap Size.)

Figure 6.10: Improved memory consumption: the lowest curve (in red) shows the old generation allocation. On top of that, the young generation (in salmon) is presented. The highest curve (in blue) shows the maximum Java heap memory available to be allocated. One of the results of our work is that the old generation allocation occasionally drops, i.e. the states' data is freed. This enables the crawler to discover 542 states using 1 Gigabyte of memory. (Plot: Java heap memory consumption of the scalable crawler; x-axis: Time (seconds); y-axis: Memory (Megabytes); series: Old Generation, Young Generation, Heap Size.)

Chapter 7
Discussion

Having presented the proposed enhancements and their evaluations in the last two chapters, in this chapter we revisit the research questions and discuss the findings in more detail.
We review the findings of the state transition management work first, and then continue with discussing the answers to the research questions of the scalability work. Furthermore, we discuss the threats to validity of the work as well. Finally, the ideas that emerged from this thesis for the direction of future work conclude this chapter.

7.1 State Transition Management

Time performance improvement was the target of the enhancements we put forward for the state transition management of the crawling process. In order to achieve the improvement, we brought forth and developed three alternative methods for managing the transition of states in the crawling process. The results of the evaluation of these methods, revised incrementally one after another, are discussed in this section.

7.1.1 RQ1.A: Time Performance of the Proxy-Based Solution

The goal of the experiments on the proxy-based solution was to examine to what extent this method is able to improve the time efficiency of state transition management. As shown in figure 5.5, the time overhead the proposed enhancement imposes on the crawling process is significantly greater than the time improvement it brings about. In eight cases out of ten, the default crawling process performs better than the proposed proxy-based solution. Moreover, in seven of those eight cases, the default crawler consumes significantly less time than the proposed solution. The proposed solution overtakes the default crawling process only in two cases, and the performance difference in these two cases is not as considerable as the difference in the cases where it is overtaken by the default one.

All in all, as shown in the average bar, the default version performs better than the proposed one. Having performed this experiment, we now have a better insight into the effects of utilizing a proxy for time optimization in the crawling process. In fact, the findings of this part of the work helped guide the rest of the study to a great extent. For example, we had planned to form two other projects for which proxy utilization would have been a pivotal issue.

7.1.2 RQ1.B: Time Performance of the First Plugin-Based Solution

Similar to RQ1.a, the goal here was to assess the time performance, but for the second solution we proposed. Having revised this solution based on the lessons learned from the development and evaluation of the previous solution, we assessed the time performance of the solution with the same set of experiments as that of the proxy-based solution.

As shown in the bar chart of figure 5.6, the first plugin-based solution improves the time performance of the state transition management to a great extent. The proposed solution overtakes the default version in 6 cases, and in the rest of the cases the difference is not very significant. The proposed solution, on average, improves the time consumption of the transition management process by 253.34%.

7.1.3 RQ1.C: Time Performance of the Second Plugin-Based Solution

The time performance of this version of the crawler was assessed in the very same way as its two predecessors. As illustrated in figure 5.7, the third solution improves the time performance of the state transition management of the process. Although this version of the crawling process was theoretically more refined than the second solution, the results show that this version, on average, improves the time performance of the transition management process by 197.48%, which is less than the second solution.
However, this version also significantly overtakes the default version.

7.2 Scalability

In the scalability part of this work, we developed the idea of creating an alternative solution for the storage of Web application states' data. The alternative solution employs a graph database for storing the data in order to let the main memory be available to the rest of the tasks in the crawling process. We devised three research questions targeting the correctness, memory performance, and time performance of the crawling process. In the process of carrying out the work, further questions emerged, which we seized the opportunity to examine.

7.2.1 RQ2.0: Determinism of the Crawler

In the process of designing the experiments for pursuing answers to RQ2.a, which targeted the correctness of the proposed solution, we faced the challenge of developing an oracle for examining the correctness of the solution. As we opted to utilize the default solution in shaping the oracle, we had to figure out whether the crawler behaves deterministically. Hence, we formed RQ2.0 and, carrying out the experiments, the results showed that the default crawler is completely deterministic.

7.2.2 RQ2.A: Correctness

Having assured the determinism of the default crawler, we built and harvested a number of object Web applications and assessed the correctness of the proposed solution. The oracle used for testing the correctness of the solution was formed over the stateflow graph that the crawler builds out of crawling a Web application. The results of the experiments show that the proposed solution crawled all the object Web applications completely correctly. It discovered the same stateflow graphs as the default crawler.

7.2.3 RQ2.B: Memory Performance

Having obtained full confidence in the correctness of the proposed solution from the experiment results, we continued with the memory performance part of the work. RQ2.b investigates to what extent the proposed crawler increases the number of states crawled. The answer to this question is illustrated in figures 6.2 and 6.3. As shown in the figures, the proposed solution is able to outperform the default crawler in all instances and achieves a greater state coverage. The proposed solution, on average, improves the scalability by 88.16%.

The scalability improvements achieved here can potentially be useful if the crawler is deployed as a cloud computing service (e.g. on Amazon Web Services). Especially by supporting partial locks on the stateflow graph, the proposed crawler can help implement an efficient concurrent deployment of the crawler. On the contrary, the default version of the crawler needs to lock the whole stateflow graph when it needs to add states or transitions to the data structure.

7.2.4 RQ2.C: Time Performance

In order to improve the scalability of the crawling process, we opted for creating an alternative solution utilizing a graph database. As mentioned in the discussion of the trade-offs, we investigated the effects of the discussed counteracting factors.
As the results of the experiments show, the default version of the crawler outperforms the proposed solution in most of the experiments. However, in two out of ten instances the proposed solution overtakes the default version, and in five out of ten cases the time overhead imposed by the proposed solution is less than 10%.

Having provided an overview of the research questions and the answers obtained from the experiments, we move on to discussing the threats to validity.

7.3 Threats to Validity

This section sheds more light on some of the concerns we had during this work and provides details on how we endeavored to mitigate them. We address both the internal and the external threats to validity of this work.

7.3.1 Internal Validity

In the first part of this work, the state transition management improvements, we encountered a substantial obstacle. The top Alexa websites are highly complex Web applications, and they are not deterministic by any means. To mitigate the determinism problem in the experiments, we decided to run a greater number of experiments on each object and then average the results. However, after carrying out the experiments, we observed that the state transition management part of the crawling process accounts for only a tiny portion of the total crawling time. As a result, we strove to find a method that could compare the two versions of the crawler under the same circumstances. Hence, we put forward the idea of having both versions of the state transition management plugged into the crawler at the same time. This way, when checking whether the application has transitioned to a new state, we first use the mutation-summary library to inquire about changes made to the DOM. Afterward, regardless of the report provided by our agent, we perform the default comparison behavior. If the mutation summary reports no change and the comparison confirms it, the time spent on the default comparison is regarded as the time that would have been saved had only the proposed solution been plugged in. We measure the time each of the solutions consumes on performing its tasks and then compare them based on these times.

7.3.2 External Validity

Regarding the external validity of this work, the most important concern is how well the conclusions can be generalized. Firstly, the Web applications selected as the objects of the experiments can strongly affect the final results. The set of all Web applications on the Internet is too large to find a reasonably representative subset. Hence, to mitigate the systematic error, i.e., the bias in selecting object Web applications, we opted to choose the most visited websites from the Alexa top list.

In addition, the configurations of the crawler for running the experiments provide a wide set of options; for example, the types of HTML elements chosen to be clicked on during the crawl can be pivotal to the final results of the experiments. To mitigate this bias, we took care to select configurations that are most natural, general, and meaningful for crawling a Web application for testing, comprehension, or analysis purposes.

Having discussed the threats to validity of this work, we move on to outlining the directions of this research for future work.

7.4 Future Work

Throughout the multiple projects that shaped this thesis, we gained greater insight into the area of crawling modern Web applications.
From the experiences gained in this thesis, we bring forth the following ideas to move this work to the next level. From a higher-level perspective, these ideas lie in the areas of improving the time performance and further improving the scalability of the crawling process.

7.4.1 Further Scalability through String Optimization

One of our presumptions about the crawling process was that the memory allocated to states' data is the only obstacle to achieving absolute scalability. That is, we assumed that if memory is not used for state storage, crawling can continue indefinitely. However, from performing the memory analysis, we discovered two important areas of limitation. First, one of the diagnostic tests for the memory consumption of the crawler revealed that a very considerable amount of memory is wasted by having multiple instances of the same string in memory. Further investigation showed that these strings originate predominantly from the HTML attributes and elements extracted from the DOM tree. A possible solution is to employ string interning techniques. The second area of memory optimization follows in the next section.

7.4.2 Further Scalability through Candidate Elements Optimization

Performing the memory analysis, we discovered that the storage of candidate clickable elements is the primary consumer of memory resources. As such, we think the next area worth investigating to improve scalability is developing alternative methods for storing candidate clickables. This can be carried out either by utilizing a graph database or by changing the order in which the elements are stored. That is, it is possible to change the logic so that there is no need to keep the candidate clickables of all stored states in memory.

7.4.3 Scalability Improvement by Text Compression Techniques

As string objects containing HTML elements consume a significant amount of memory in crawling Web applications, we believe that utilizing text compression techniques has the potential to improve the scalability of the crawler. In particular, large DOM strings of the same application may share a considerable portion of content, which creates memory optimization opportunities.

7.4.4 Improving Time Performance Targeting

RQ1 targeted the time performance of the state transition management of the crawling process. As presented in the results, the improvements achieved in the second and the third solutions were very significant. In addition, the lessons learned from the proxy-based solution were very instructive in guiding the directions of the rest of the project, especially which directions not to pursue. However, we also learned from the second part of the project, during the memory analysis, that by utilizing a tool such as the YourKit profiler we might be able to find more significant areas for optimization. That is, although we are not certain, it is possible that a preliminary comparison of the significance of candidate optimization areas, rather than selecting them based solely on our insight into the crawling process, would result in more significant achievements.

Chapter 8
Related Work

Web crawlers have been a major area of research [22-34] since the invention of the Web. Optimizing Web crawlers, as a consequence, has been an attractive research topic as well [35-37]. Here we discuss a number of studies in the literature that focus on optimizing the performance of different Web crawlers.
In addition, since we performed a comparison of graph databases, we also enumerate some studies comparing graph databases.

8.1 Optimizing the Crawling Process

Jourdan et al. propose a strategy for efficient crawling of modern Web applications (which they call rich Internet applications) [38]. The proposed method creates a model for predicting the behavior of modern Web applications and, based on this model, produces an execution plan that is optimized for finding new states in the shortest amount of time. Afterward, the model is updated based on the discovered states if the new states are not compatible with the previous model. The authors compare their technique with other modern Web crawlers and with depth-first and breadth-first crawling strategies.

In another study [39], Jourdan et al. improve their previous work [38] by utilizing statistics gathered during a crawl session to predict optimized strategies for selecting the next part of the application to be crawled. The authors compare the performance of their proposed strategy with the breadth-first strategy, the depth-first strategy, and their previous crawling strategy.

In 1999, Heydon et al. [35] investigated the inefficiencies in their Java Web crawler called Mercator. In particular, they found that they had to refactor their crawler to align the code better with the characteristics of the Java Runtime Environment, such as the way memory is allocated and deallocated on the Java heap, and to avoid excessive synchronization. They investigated synchronization, memory management, and other aspects of Java and achieved performance improvements in the crawler. In another paper [36], they describe their Web crawler implemented in Java by enumerating the components of web crawlers and explaining the alternatives and trade-offs of different approaches. They compare the performance of their crawler with that of Google in terms of the rate of downloading documents, the volume of data downloaded, and the HTTP requests sent.

Another study in the area of optimizing the crawling process of web applications was done by Edwards et al. [37] at the IBM research center. This study focuses on optimizing the Webfountain web crawler. Webfountain is an incremental web crawler, i.e., the crawler updates the versions of the websites it has already crawled when they change.

Edwards' paper has a target that is relatively different from the aim of this thesis. They aim to optimize various competing criteria of crawling the Web, such as freshness, in Webfountain. They propose an adaptive model that optimizes the management of the URL queue and the internal variables of crawling in Webfountain. They utilize simulation and computational models to evaluate their work, but hope to evaluate it on real data once Webfountain is operational.

8.2 Graph Database Applications

In this thesis we compared many graph databases to select the one most suited to our application. Graph databases have either been the target of research or been utilized in many research projects [40-48]. We also found research focusing on comparing different aspects of graph databases [49-52]. A study that is to some extent similar to our comparison addresses the models of graph databases: in a survey, Angles et al. [53] investigate the area of modeling graph databases. They focus on various aspects of modeling, such as query languages, data structures, and integrity constraints.
We have drawn on these studies in performing a comparison of our own for choosing the database best suited to our application.

Chapter 9
Conclusion

Aiming at improving the time consumption and scalability of crawling modern Web applications, we proposed and evaluated our enhancement ideas in this thesis. Availing of efficient DOM monitoring techniques, we focused on providing three alternative solutions for reducing the time consumption of the "state transition management" sub-task of the crawling process. The main idea for improving the time consumption was to bypass unnecessary steps of the state transition management by efficiently tracking alterations made to the DOM. In addition, utilizing a graph database for the storage and retrieval of dynamic Web states, we aimed at increasing the number of states crawled by freeing the memory held by states during a crawling session.

We used Crawljax, a modern Web crawler, as the platform for turning our ideas into concrete implementations so as to evaluate them. As such, we reverse engineered Crawljax to gain insight into the default crawling process and applied our ideas as enhancements to Crawljax.

In order to improve the time consumption of the state transition management, we developed three alternative solutions that were revised incrementally one after another. In all three solutions, an agent for tracking DOM mutations was developed to be installed on the browser side of Web applications. The first solution utilized a proxy for installing the monitoring agent and a Web server for hosting the agent. The proxy was replaced by Crawljax plugins in the second solution, but the Web server was retained for hosting the agent. Finally, in the third solution, agent installation was carried out by Crawljax plugins in the same manner as the second solution, but the Web server was eliminated from the design and the agent was incorporated into the core of the solution.

The crawling process captures dynamic state changes and the navigational paths between the states to form a finite-state machine model called the stateflow graph. Aiming at reducing main memory consumption to increase the number of states crawled, we proposed an alternative solution for storing the stateflow graph data structure in a graph database on secondary memory. We conducted a comparison of graph databases in which Neo4j emerged as the graph database most suited for our application. We implemented a scalable solution utilizing the graph database to free memory from states' data and to provide the rest of the crawling tasks with the freed memory, so that crawling can continue and cover a greater number of states.

The proposed solutions were evaluated by comparing them with the default crawler. Five rounds of experiments were carried out on ten Web applications selected from the Alexa most visited websites. In the experiments on state transition management, the maximum number of states was set to 10 and the time consumptions of the default crawling process and each enhanced version were measured simultaneously. For the state coverage improvement, the output of the scalable crawler (the stateflow graph) was compared with the output of the default version to ascertain the correctness of the proposed solution. The maximum amount of Java heap memory was set to one gigabyte, and the number of states crawled was measured to evaluate the scalability of the proposed solution.
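For reference, such a heap cap corresponds to the JVM's standard maximum-heap option; a generic invocation could look like the following (the jar name is hypothetical, not the exact command used in the experiments):

    java -Xmx1g -jar crawljax-scalability-experiment.jar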
To assess the scalable solution in terms of time consumption, the maximum number of states was set to 100 and the total time consumptions of the scalable solution and the default crawler were measured.

The results of the experiments showed that the proxy-based solution was not able to reduce the time consumption of the state transition management. Having learned that the proxy imposed a significant time overhead on the crawling process, we changed the direction of the thesis. The second solution, revised based on the results of the proxy-based solution, was able to improve the time performance of the state transition management by, on average, 253.34%; that is, the time consumption ratio of default to proposed is 3.53. For the third solution, the ratio of the time consumption of the default version to that of the proposed solution is 2.9748, which means a 197.48% improvement in time performance.

Comparing the stateflow graphs created by the scalable and default crawlers upon crawling deterministic web applications, we observed that the scalable solution crawled correctly, in the exact same manner as the default crawler. The proposed solution improved the scalability by increasing the number of states crawled, on average, by 88.16%. However, this scalability improvement was achieved at the cost of a time overhead of 18.20%.

A memory analysis was conducted on both the scalable solution and the default solution. The main outcomes of this memory analysis were finding two memory leak points in the scalable solution, the resolution of which led to the achieved state coverage improvements, and identifying scalability obstacles that present optimization opportunities for future work.

Finally, as a measure to improve the reproducibility of the experiments, in addition to including the object Web applications in the thesis, the implementation of the scalable solution is accessible as open source software at https://github.com/saltlab/crawljax-graphdb.

Bibliography

[1] J. J. Garrett et al., "Ajax: A new approach to web applications," 2005.

[2] "Oracle java micro edition embedded client customization guide," 2012. http://docs.oracle.com/javame/config/cdc/cdc-opt-impl/ojmeec/1.1/custom/html/tuning.htm, [retrieved: Sep. 25, 2013].

[3] T. Berners-Lee, L. Masinter, M. McCahill, et al., "Uniform resource locators (url)," 1994.

[4] A. Mesbah, A. van Deursen, and S. Lenselink, "Crawling ajax-based web applications through dynamic analysis of user interface state changes," vol. 6, p. 3, ACM, 2012.

[5] D. Goodman, M. Morrison, and B. Eich, JavaScript Bible. John Wiley & Sons, Inc., 2007.

[6] E. Ecma, "262: Ecmascript language specification," ECMA (European Association for Standardizing Information and Communication Systems), pub-ECMA: adr, 1999.

[7] D. Raggett, A. Le Hors, I. Jacobs, et al., "Html 4.01 specification," W3C recommendation, vol. 24, 1999.

[8] "World Wide Web Consortium." http://www.w3.org/, [retrieved: Sep. 23, 2013].

[9] "The WebKit open source project." http://www.webkit.org/, [retrieved: Sep. 10, 2013].

[10] "Introducing FlockDB." https://blog.twitter.com/2010/introducing-flockdb, [retrieved: Apr. 16, 2013].

[11] "Twitter.com." https://twitter.com/, [retrieved: Oct. 7, 2013].

[12] "SeleniumHQ browser automation tool." http://www.seleniumhq.org/, [retrieved: Aug. 12, 2013].

[13] "Web storage W3C recommendation." http://www.w3.org/TR/webstorage/, [retrieved: Sep. 18, 2013].

[14] "Mutation events." https://developer.mozilla.org/en-US/docs/Web/Guide/API/DOM/Events/Mutation_events?redirectlocale=en-, [retrieved: Nov. 21, 2013].

[15] R. Weinstein, "DOM mutation events replacement: The story so far - existing points of consensus," Dec. 2011. http://lists.w3.org/Archives/Public/public-webapps/2011JulSep/0779.html, [retrieved: Nov. 21, 2013].
[16] "Document Object Model (DOM) Level 3 Events specification." http://www.w3.org/TR/DOM-Level-3-Events/#events-mutationevents, [retrieved: Nov. 21, 2013].

[17] "Mutation-summary JavaScript library." https://code.google.com/p/mutation-summary/, [retrieved: Nov. 22, 2013].

[18] "Mutation observers vs. mutation summary." https://code.google.com/p/mutation-summary/wiki/DOMMutationObservers, [retrieved: Nov. 22, 2013].

[19] M. Shapiro, "Structure and encapsulation in distributed systems: the proxy principle," in Proc. 6th Int. Conf. on Distributed Computing Systems (ICDCS), pp. 198-204, May 1986.

[20] "Understanding memory management." http://docs.oracle.com/cd/E13150_01/jrockit_jvm/jrockit/geninfo/diagnos/garbage_collect.html, [retrieved: Oct. 1, 2013].

[21] T. Lindholm, F. Yellin, G. Bracha, and A. Buckley, The Java Virtual Machine Specification. 500 Oracle Parkway, Redwood City, California 94065, U.S.A., Java SE 7 ed., 2013.

[22] Y. Guo, K. Li, K. Zhang, and G. Zhang, "Board forum crawling: a web crawling method for web forum," in Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence, pp. 745-748, IEEE Computer Society, 2006.

[23] V. Shkapenyuk and T. Suel, "Design and implementation of a high-performance distributed web crawler," in Data Engineering, 2002. Proceedings. 18th International Conference on, pp. 357-368, IEEE, 2002.

[24] J. Cho and H. Garcia-Molina, "Parallel crawlers," in Proceedings of the 11th international conference on World Wide Web, pp. 124-135, ACM, 2002.

[25] C. C. Aggarwal, F. Al-Garawi, and P. S. Yu, "Intelligent crawling on the world wide web with arbitrary predicates," in Proceedings of the 10th international conference on World Wide Web, pp. 96-105, ACM, 2001.

[26] S. Raghavan and H. Garcia-Molina, "Crawling the hidden web," 2000.

[27] J. L. Wolf, M. S. Squillante, P. Yu, J. Sethuraman, and L. Ozsen, "Optimal crawling strategies for web search engines," in Proceedings of the 11th international conference on World Wide Web, pp. 136-147, ACM, 2002.

[28] G. Pant, P. Srinivasan, and F. Menczer, "Crawling the web," in Web Dynamics, pp. 153-177, Springer, 2004.

[29] M. Ehrig and A. Maedche, "Ontology-focused crawling of web documents," in Proceedings of the 2003 ACM symposium on Applied computing, pp. 1174-1178, ACM, 2003.

[30] S. Chakrabarti, B. E. Dom, and M. H. van den Berg, "System and method for focussed web crawling," July 9, 2002. US Patent 6,418,433.

[31] G. S. Manku, A. Jain, and A. Das Sarma, "Detecting near-duplicates for web crawling," in Proceedings of the 16th international conference on World Wide Web, pp. 141-150, ACM, 2007.

[32] C. Castillo, "Effective web crawling," in ACM SIGIR Forum, vol. 39, pp. 55-56, ACM, 2005.

[33] M. Álvarez, A. Pan, J. Raposo, and J. Hidalgo, "Crawling web pages with support for client-side dynamism," in Advances in Web-Age Information Management, pp. 252-262, Springer, 2006.

[34] S. Chakrabarti, M. Van den Berg, and B. Dom, "Focused crawling: a new approach to topic-specific web resource discovery," Computer Networks, vol. 31, no. 11, pp. 1623-1640, 1999.

[35] A. Heydon and M. Najork, "Performance limitations of the java core libraries," in Proceedings of the ACM 1999 conference on Java Grande, pp. 35-41, ACM, 1999.

[36] A. Heydon and M. Najork, "Mercator: A scalable, extensible web crawler," World Wide Web, vol. 2, no. 4, pp. 219-229, 1999.
[37] J. Edwards, K. McCurley, and J. Tomlin, "An adaptive model for optimizing performance of an incremental web crawler," in Proceedings of the 10th international conference on World Wide Web, pp. 106-113, ACM, 2001.

[38] K. Benjamin, G. Von Bochmann, M. E. Dincturk, G.-V. Jourdan, and I. V. Onut, A Strategy for Efficient Crawling of Rich Internet Applications. Springer, 2011.

[39] M. E. Dincturk, S. Choudhary, G. Von Bochmann, G.-V. Jourdan, and I. V. Onut, "A statistical approach for efficient crawling of rich internet applications," in Web Engineering, pp. 362-369, Springer, 2012.

[40] S. Zhang, M. Hu, and J. Yang, "Treepi: A novel graph indexing method," in Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on, pp. 966-975, IEEE, 2007.

[41] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor, "Freebase: a collaboratively created graph database for structuring human knowledge," in Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pp. 1247-1250, ACM, 2008.

[42] M. Graves, E. R. Bergeman, and C. B. Lawrence, "Graph database systems," Engineering in Medicine and Biology Magazine, IEEE, vol. 14, no. 6, pp. 737-745, 1995.

[43] B. Iordanov, "Hypergraphdb: a generalized graph database," in Web-Age Information Management, pp. 25-36, Springer, 2010.

[44] J. Huan, W. Wang, J. Prins, and J. Yang, "Spin: mining maximal frequent subgraphs from graph databases," in Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 581-586, ACM, 2004.

[45] C. Vicknair, M. Macias, Z. Zhao, X. Nan, Y. Chen, and D. Wilkins, "A comparison of a graph database and a relational database: a data provenance perspective," in Proceedings of the 48th annual Southeast regional conference, p. 42, ACM, 2010.

[46] R. Angles and C. Gutierrez, "Querying rdf data from a graph database perspective," in The Semantic Web: Research and Applications, pp. 346-360, Springer, 2005.

[47] D. W. Williams, J. Huan, and W. Wang, "Graph database indexing using structured graph decomposition," in Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on, pp. 976-985, IEEE, 2007.

[48] K. Riesen and H. Bunke, "Iam graph database repository for graph based pattern recognition and machine learning," in Structural, Syntactic, and Statistical Pattern Recognition, pp. 287-297, Springer, 2008.

[49] S. Jouili and V. Vansteenberghe, "An empirical comparison of graph databases," in Social Computing (SocialCom), 2013 International Conference on, pp. 708-715, IEEE, 2013.

[50] M. Buerli and C. P. S. L. Obispo, "The current state of graph databases," 2012.

[51] A. Popescu, "A comparison of 7 graph databases," 2013. http://nosql.mypopescu.com/post/40759505554/a-comparison-of-7-graph-databases, [retrieved: Feb. 2, 2013].

[52] M. C. Alekh Jindal, "Benchmarking graph databases," 2013. http://istc-bigdata.org/index.php/benchmarking-graph-databases, [retrieved: Jan. 29, 2013].

[53] R. Angles and C. Gutierrez, "Survey of graph database models," ACM Computing Surveys (CSUR), vol. 40, no. 1, p. 1, 2008.

Appendix A
Deterministic Candidates

In this appendix we list all the object Web applications that we crawled to find deterministic applications. A deterministic application was required for forming an oracle for the correctness experiments.
The list is as follows:

http://www.heatcityreview.com/somervillenews.htm
http://www.martinkrenn.net
http://www.kastanova.nl
http://www.al-awda.org
http://thething.it
http://www.engruppo.com
http://www.turbopatents.com
http://www.ece.ubc.ca
http://www.ubc.ca
http://www.cultofmac.com
http://www.python.org
http://metrics.codahale.com
http://www.uss.de
http://www.facebook.com
https://www.google.ca
http://www.bing.com
https://mail.google.com
https://github.com
https://code.google.com/p/guava-libraries
http://www.vogella.com
http://www.cacno.org
http://sedo.co.uk/search/details.php4?domain=thestart.net
http://www.iltasanomat.fi
http://www.fairvote.org
http://www.beehive.nu
http://www.littlewhitedog.com
http://www.provincetown.com
http://www.expedia.ca/?semcid=ni.ask.12908&kword=DEFAULT.-NNNN.kid&rfrr=Redirect.From.www.expedia.com/Home.htm
http://www.airtoons.com
http://www.loco.pl/pl
http://www.grudge-match.com/current.html
http://www.kastanova.nl
http://www.acces-local.com/wordpress
http://www.sfchronicle.com
http://www.eldritch.com
http://www.eldritch.com
http://ruyguy15.150m.com
http://www.bghs.org
http://www.axis-of-aevil.net
http://www.infoshop.org
http://www.introducingmonday.co.uk
http://www.linuxdevcenter.com/pub/a/linux/2000/06/29/hdparm.html
http://www.twentysevenrecords.com
http://www.hallwalls.org
http://www.justfood.org
http://bshigley.tumblr.com
http://www.ottawacitizen.com/index.html
http://emeraldforestseattle.com/forums/ubbthreads
http://www.cancernews.com/default2.asp
http://www.usablenet.com
http://www.viz.com/naruto
http://www.viz.com/naruto
http://www.needcoffee.com
http://www.libraryplanet.com
http://www.coversproject.com
http://windows.microsoft.com/en-us/windows/support#top-solutions=windows-8
http://buffalo.bisons.milb.com/index.jsp?sid=t422
http://www.b3ta.com
http://antiadvertisingagency.com
http://www.sfbike.org
http://www.grimemonster.com
http://www.threadless.com
https://wilwheaton.net
http://www.hasbrouck.org
http://www.uberbin.net
http://www.gaijinagogo.com
http://lalibertad.com.co/dia/p0.html
http://www.wrightfield.com
http://www.modernhumorist.com
http://www.bcdb.com
http://desktopgaming.com
http://www.metalbite.com
http://nowyoulistentomelittlemissy.blogspot.ca
http://www.cimgf.com
http://www.paulmadonna.com
http://dawnm.com
http://typicalculture.com/wordpress
http://www.vegweb.com
http://www.newdream.org
http://www.isc.org/downloads/BIND/
http://hunyyoung.com
http://home.planet.nl/ mooij321
http://www.antique-hangups.com
http://www.project451.com//
http://lmuwnmd.wpengine.com/wp-signup.php?new=www.techblog.com//
http://c-level.org
http://rprogress.org/index.htm
http://www.math.mcgill.ca
