dc.contributor.advisor: Maximilian Rohrer
dc.contributor.author: Sælemyr, Joakim
dc.contributor.author: Femdal, Halvor Thorstensen
dc.date.accessioned: 2025-02-15T17:11:45Z
dc.date.issued: 2024
dc.identifier: no.nhh:wiseflow:7200393:61696255
dc.identifier.uri: https://hdl.handle.net/11250/3178510
dc.description.abstract: This thesis investigates how Retrieval-Augmented Generation (RAG) improves the ability of Large Language Models (LLMs) to filter information from financial documents. For this task, we first develop NorwegianFinanceQA, a dataset containing 433 queries from the financial reports of 9 Norwegian companies, divided into text- and table-related queries. Next, we evaluate the retrieval accuracy and efficiency of RAG systems with different chunking techniques: character-based, recursive, and semantic splitting. Additionally, we propose a table-specific summarization approach. Our results suggest that table summaries achieve perfect accuracy for table queries while at the same time increasing efficiency. However, this improvement comes at the expense of text-query performance. Our findings highlight the importance of tailored chunking strategies when using LLMs and RAG systems for information retrieval in a financial context.
dc.language: eng
dc.publisher: NORWEGIAN SCHOOL OF ECONOMICS
dc.title: Chunk Smarter, Retrieve Better: Enhancing LLMs in Finance : An Empirical Comparison of Chunking Techniques in Retrieval Augmented Generation for Financial Reports
dc.type: Master thesis

