UBC Theses and Dissertations

Parameter efficient code representation learning

Saberi Tirani, Iman

Abstract

This dissertation addresses efficient code representation learning: embedding code and related text into vectors that capture semantics, syntax, and context, in support of tasks such as code summarization and code generation. We make four contributions: (1) knowledge transformation, using lightweight adapters to adapt models such as RoBERTa for code summarization; (2) knowledge aggregation, using our proposed AdvFusion architecture for multilingual integration; (3) knowledge injection, using our proposed syntax-aware NER adapters to inject syntactic information; and (4) contextual retrieval, using our proposed Programming Knowledge Graph (PKG) for finer-grained context retrieval in code generation. Our methods outperform baselines, for example on Ruby summarization (BLEU 16.53 vs. 14.75), Java refinement (BLEU 78.2), and code generation (PKG improves accuracy by up to 34% on MBPP).

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International