- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Investigating the impact of methodological choices...
Open Collections
UBC Theses and Dissertations
UBC Theses and Dissertations
Investigating the impact of methodological choices on source code maintenance analyses Ahmad, Syed Ishtiaque
Abstract
Many prediction models rely on past data about how a system evolves to learn and anticipate the number of changes and bugs it will have in the future. As a software engineer or data scientist creates these models, they need to make several methodological choices such as deciding on size measurements, whether size should be controlled, from what time range metrics should be obtained, etc. In this work, we demonstrate how different methodological decisions can cause practitioners to reach conclusions that are significantly and meaningfully different. For example, when measuring SLOC from evolving source code of a method, one could decide to use the initial, median, average, final, or a per-change measure of method size. These decisions matter; for instance, one prior study observed better performance of code metrics for defect prediction in general, while another study found negative results when performance was evaluated through a time-based approach. Our result identifies the reason behind this contradiction is due to the age of the methods not being explicitly controlled, where the first six months of a method’s evolution could have provided a better understanding of maintenance. Understanding the impact of these different methodological decisions is especially important given the increasing significance of approaches that use these large datasets for software analysis tasks. This work can impact both practitioners and researchers by helping them understand which of the methodological choices underpinning their analyses are important, and which are not; this can lead to more consistency among research studies and improved decision making for deployed analyses.
Item Metadata
Title |
Investigating the impact of methodological choices on source code maintenance analyses
|
Creator | |
Supervisor | |
Publisher |
University of British Columbia
|
Date Issued |
2021
|
Description |
Many prediction models rely on past data about how a system evolves to learn and anticipate the number of changes and bugs it will have in the future. As a software engineer or data scientist creates these models, they need to make several methodological choices such as deciding on size measurements, whether size should be controlled, from what time range metrics should be obtained, etc. In this work, we demonstrate how different methodological decisions can cause practitioners to reach conclusions that are significantly and meaningfully different. For example, when measuring SLOC from evolving source code of a method, one could decide to use the initial, median, average, final, or a per-change measure of method size. These decisions matter; for instance, one prior study observed better performance of code metrics for defect prediction in general, while another study found negative results when performance was evaluated through a time-based approach. Our result identifies the reason behind this contradiction is due to the age of the methods not being explicitly controlled, where the first six months of a method’s evolution could have provided a better understanding of maintenance. Understanding the impact of these different methodological decisions is especially important given the increasing significance of approaches that use these large datasets for software analysis tasks. This work can impact both practitioners and researchers by helping them understand which of the methodological choices underpinning their analyses are important, and which are not; this can lead to more consistency among research studies and improved decision making for deployed analyses.
|
Genre | |
Type | |
Language |
eng
|
Date Available |
2021-09-10
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivatives 4.0 International
|
DOI |
10.14288/1.0401968
|
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor |
University of British Columbia
|
Graduation Date |
2021-11
|
Campus | |
Scholarly Level |
Graduate
|
Rights URI | |
Aggregated Source Repository |
DSpace
|
Item Media
Item Citations and Data
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International