UBC Theses and Dissertations

Code comment generation by incorporating pre-defined application programming interface documentation

Shahbazi, Ramin

Abstract

Code comments substantially aid program comprehension and therefore save considerable time and effort in software maintenance. However, comments are often outdated or missing, especially in complex software projects. As a result, several automatic comment generation models have been developed. These models aim to predict a natural-language comment for a given code snippet. Recently, several studies have investigated the effect of integrating external knowledge on the quality of generated comments. In this work, we propose a solution, APIContext2Com, that improves the quality of generated comments by incorporating pre-defined Application Programming Interface (API) context. The API context includes the definition and description of the pre-defined APIs used within the code snippets. Since the API information expresses the functionality of a code snippet, it can help generate a better code summary. We introduce a seq-2-seq encoder-decoder neural network model with different sets of multiple encoders to effectively transform distinct inputs into target comments. A ranking mechanism is also developed to exclude less relevant APIs, so that unrelated knowledge is filtered out before it reaches the model. We evaluate the proposed approach on a large-scale Java dataset containing over 130,000 records, and the experimental findings reveal that the proposed model significantly outperforms competitive baselines. Compared to the best baseline, Rencos, the proposed approach improves the BLEU1, BLEU2, BLEU3, BLEU4, METEOR, and ROUGE-L scores by 1.88 (8.24%), 2.16 (17.58%), 1.38 (18.3%), 0.73 (14.17%), 1.58 (14.98%), and 1.9 (6.92%), respectively.
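
The abstract does not say how the ranking mechanism scores APIs, so the following is only an illustrative sketch of the general idea of filtering out less relevant APIs. It assumes a simple token-overlap (Jaccard) score between the code snippet and each API's name and description. The function names, the `top_k` cutoff, and the example API descriptions are all hypothetical and are not taken from the thesis.

```python
import re


def tokenize(text):
    """Return the set of lowercase word tokens, splitting camelCase identifiers."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return set(re.findall(r"[a-z0-9]+", spaced.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if either set is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_apis(code, api_docs, top_k=2):
    """Keep the top_k APIs whose name and description overlap most with the code.

    Assumption: overlap-based scoring stands in for whatever relevance
    measure the thesis actually uses.
    """
    code_tokens = tokenize(code)
    scored = [
        (jaccard(code_tokens, tokenize(name + " " + doc)), name)
        for name, doc in api_docs.items()
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical Java snippet and hand-written API descriptions.
code = "String line = reader.readLine(); list.add(line.trim());"
api_docs = {
    "BufferedReader.readLine": "Reads a line of text.",
    "String.trim": "Returns a string with leading and trailing whitespace removed.",
    "Object.hashCode": "Returns a hash code value for the object.",
}
kept = rank_apis(code, api_docs, top_k=2)
print(kept)
```

In this toy input, the unrelated `Object.hashCode` entry shares no tokens with the snippet and is dropped, so only the two remaining API descriptions would be passed on to the encoders.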

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International