UBC Faculty Research and Publications

LLM-Enhanced Framework for Building Domain-Specific Lexicon for Urban Power Grid Design

Xu, Yan; Wang, Tao; Yuan, Yang; Huang, Ziyue; Chen, Xi; Zhang, Bo; Zhang, Xiaorong; Wang, Zehua

Abstract

Traditional methods for urban power grid design have struggled to meet the demands of multi-energy integration and high-resilience scenarios because of issues such as delayed terminology updates and semantic ambiguity. Current techniques for constructing domain-specific lexicons suffer from insufficient coverage of specialized vocabulary and imprecise synonym mining, which restrict the semantic parsing capabilities of intelligent design systems. To address these challenges, this study proposes a framework, based on Large Language Models (LLMs), for constructing a domain-specific lexicon for urban power grid design. The aim is to enhance the accuracy and practicality of the lexicon through multi-level term extraction and synonym expansion. First, a structured corpus covering national and industry standards in the power field was constructed. An improved Term Frequency–Inverse Document Frequency (TF-IDF) algorithm, combined with mutual information and adjacency entropy filtering mechanisms, was used to extract high-quality seed vocabulary from 3426 candidate terms. Multi-level prompt templates were then designed to guide LLM-based synonym mining, incorporating a self-correction mechanism for semantic verification to mitigate errors caused by model hallucinations. This approach built a domain-specific lexicon comprising 3426 core seed words and 10,745 synonyms. The average cosine similarity of synonym pairs reached 0.86, and expert validation confirmed an accuracy rate of 89.3%. Text classification experiments showed that integrating the domain-specific dictionary improved the classifier’s F1-score by 9.2%, demonstrating the effectiveness of the method.
By embedding domain-driven constraints and validation workflows, this research constructs, for the first time, a high-precision terminology dictionary for the field of power design. It addresses the insufficient coverage and imprecise expansion of traditional methods and supports the development of semantically intelligent systems for smart urban power grid design, giving it significant practical application value.
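
The filtering and evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the restriction to two-token candidates, and the thresholds `min_pmi` and `min_entropy` are assumptions made for illustration, and the improved TF-IDF weighting and the LLM prompting steps are not shown.

```python
import math
from collections import Counter


def pmi(pair_count, left_count, right_count, total):
    """Pointwise mutual information (bits) of a two-token candidate term."""
    p_pair = pair_count / total
    return math.log2(p_pair / ((left_count / total) * (right_count / total)))


def adjacency_entropy(neighbors):
    """Shannon entropy (bits) of the tokens observed next to a candidate."""
    counts = Counter(neighbors)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def keep_candidate(pmi_score, left_entropy, right_entropy,
                   min_pmi=1.0, min_entropy=0.5):
    """Keep a candidate that is internally cohesive (high PMI) and appears
    in varied contexts on both sides (high adjacency entropy).
    The default thresholds are hypothetical, not taken from the paper."""
    return pmi_score >= min_pmi and min(left_entropy, right_entropy) >= min_entropy


def cosine_similarity(u, v):
    """Cosine similarity of two embedding vectors, e.g. a seed word and a synonym."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

In this sketch a candidate passes only if both tests hold, so a frequent but context-bound fragment (low adjacency entropy on one side) is rejected even when its PMI is high. The same `cosine_similarity` helper could be averaged over synonym pairs to produce a figure comparable to the reported 0.86, assuming embeddings are available for each term.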
