From Embeddings to Entities : A Comparative Analysis of RAG Architectures in Academic Domains

UBC Undergraduate Research

Harjono, Karel Joshua

Abstract

Retrieval-Augmented Generation (RAG) systems are transforming how AI models access and use external knowledge, particularly in domain-specific applications such as education. Traditional RAG methods typically rely on vector store retrieval, which excels at semantic similarity matching but struggles with transparency and structured reasoning. This thesis explores an alternative approach, GraphRAG, which uses knowledge graphs to encode explicit relationships between entities from a given passage, potentially offering improved context relevance and explainability. Through a controlled evaluation involving curated datasets across seven academic disciplines and six question types, this thesis compares the performance, retrieval accuracy, and transparency of GraphRAG and vector-based RAG systems. Results show comparable performance across most metrics, with GraphRAG offering notable advantages in source traceability and structured retrieval. Additionally, this study introduces a domain-specific benchmark dataset to assess RAG systems in educational contexts. The findings highlight the value of structured retrieval in enhancing trust and interpretability in AI-assisted learning environments and suggest directions for future research on evaluation methodologies and user interface improvements.

Item Metadata

Title	From Embeddings to Entities : A Comparative Analysis of RAG Architectures in Academic Domains
Creator	Harjono, Karel Joshua
Date Issued	2025-04
Genre	Graduating Project
Type	Text
Language	eng
Series	University of British Columbia. COSC_O 449
Date Available	2025-05-12
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0448869
URI	http://hdl.handle.net/2429/91114
Affiliation	Science, Irving K. Barber Faculty of (Okanagan); Computer Science, Mathematics, Physics and Statistics, Department of (Okanagan)
Peer Review Status	Unreviewed
Scholarly Level	Undergraduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
