UBC Theses and Dissertations

Automatically associating resources with tasks based on a software developer’s activity

Salomon, Marie


Developing and maintaining software is a complex process that consists of many different tasks and activities. Despite substantial research into how software developers work, there are few techniques to help track which resources, in particular which parts of the source code, a developer needs to complete a task. In this thesis, we explore whether the resources a software developer works on can be automatically associated with the appropriate task assigned to that developer, based on the semantic similarity between the resource content and the task description. We explore a design space involving three similarity techniques (Term Frequency-Inverse Document Frequency (TF-IDF), Bidirectional Encoder Representations from Transformers (BERT), and word2vec) and three ways of segmenting the work a developer performs (time intervals, a set number of developer interactions, and a sliding time window). To explore this design space, we undertook three case studies of three developers from the same open source project, focusing on the effectiveness, measured in precision, with which different similarity and segmentation techniques associate resources with tasks. Despite variation by developer, we found that TF-IDF combined with segmenting a developer’s activity by time yields the highest precision, but with low recall. We found that BERT, combined with segmenting a developer’s activity by a set number of interactions, gives the best balance between precision and recall. Future research should explore how to personalize the combination of similarity technique and segmentation approach for each developer, to best associate resources with the task that developer is working on.
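To make one point in this design space concrete, the following is a minimal sketch of TF-IDF similarity combined with segmentation by a set number of interactions. It is not the thesis's implementation: the function names, the tokenizer, the smoothed IDF formula, the use of cosine similarity, and the sample tasks and interactions are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on any non-alphanumeric character (an assumed, simple tokenizer).
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (a dict of term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (an assumed variant).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: count * idf[t] for t, count in Counter(toks).items()}
            for toks in tokenized]


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment_by_count(interactions, size):
    # Split the interaction stream into consecutive chunks of `size` items.
    return [interactions[i:i + size] for i in range(0, len(interactions), size)]


def associate(interactions, tasks, size):
    # For each segment, return the (task_id, score) of the most similar task.
    segments = segment_by_count(interactions, size)
    task_ids = list(tasks)
    docs = [tasks[t] for t in task_ids] + [" ".join(s) for s in segments]
    vectors = tfidf_vectors(docs)
    task_vecs = vectors[:len(task_ids)]
    seg_vecs = vectors[len(task_ids):]
    results = []
    for seg_vec in seg_vecs:
        scores = [cosine(seg_vec, task_vec) for task_vec in task_vecs]
        best = max(range(len(task_ids)), key=lambda i: scores[i])
        results.append((task_ids[best], scores[best]))
    return results


# Hypothetical task descriptions and resource contents, for illustration only.
tasks = {
    "T1": "fix login authentication bug in session handler",
    "T2": "update rendering of chart axis labels",
}
interactions = [
    "class SessionHandler login authentication token",
    "def check_login session password",
    "chart axis labels draw",
    "render chart legend axis",
]
result = associate(interactions, tasks, size=2)
```

Here each segment of two interactions is compared against every task description, and the task with the highest similarity is chosen. Swapping `segment_by_count` for a time-based or sliding-window segmenter, or replacing the TF-IDF vectors with BERT or word2vec embeddings, would give other points in the design space described above.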


Attribution-NonCommercial-NoDerivatives 4.0 International