UBC Theses and Dissertations

Efficient dynamic analysis of Android applications

Ahmed, Khaled

Abstract

Despite efforts to identify low-quality applications (apps), such as apps that crash or leak sensitive user information, the official Android Google Play app store continues to be infected by such apps. This causes issues ranging from negative user experience to millions of dollars in financial losses. One approach that the store uses to check apps is dynamic program analysis, in which an app is executed and its runtime behavior is observed. However, existing dynamic analysis techniques often sacrifice the comprehensiveness of the analysis (i.e., they do not report all interesting behaviors) in favor of efficiency. This trade-off is necessary because Android devices have limited resources, and the system aborts unresponsive or memory-hungry apps, terminating slow analyses prematurely. In this thesis, we propose techniques to improve the comprehensiveness of dynamic analyses while keeping them efficient. Specifically, we focus on two of the most prominent (yet heavy) analysis techniques: (a) slicing, which retrieves a slice, i.e., a set of statements that affect the execution of a specific statement (e.g., a buggy line of code), and (b) taint analysis, which identifies whether specific data flows out of the app. Our proposed slicing approach provides more comprehensive slices (more relevant statements) than the state-of-the-art while being ten times more efficient. Our taint analysis approach can report full information flow paths, unlike state-of-the-art approaches that only report flow endpoints. We also showcase the usefulness of reporting full paths by introducing a new approach that classifies suspicious and legitimate information flow paths with high accuracy while remaining efficient. We also introduce new datasets to aid with the evaluation of our approaches for real use cases and in realistic settings. For slicing, we introduce the first dataset of manually generated slices for real bugs from real apps.
For taint analysis, we create a dataset of apps with malicious code that entered the Google Play Store. We characterize their attacks and propose a novel representation for summarizing their activation methods. We use information flows from this dataset in evaluating our path classification approach. Finally, we implement our approaches in open-source tools and make them and our datasets available to the research community.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International