latex/sections/03_architecture.tex
\subsection{Core Components}
The system is composed of several interrelated components that work together to deliver financial advisory services. The agent layer contains specialized AI agents built on top of the datapizza-ai framework.\footnote{\url{https://github.com/datapizza-labs/datapizza-ai}.} Base Agent serves as an abstract foundation that encapsulates common agent functionality including initialization, configuration loading, system prompt management, and tool registration. All specialized agents inherit from Base Agent to ensure consistency and reduce code duplication. Chatbot Agent focuses on natural conversation and financial profile extraction, maintaining conversation history and interpreting user intent while guiding users toward providing relevant financial information. Financial Advisor Agent specializes in portfolio generation and analysis, utilizing RAG to access historical financial data and providing evidence-based recommendations.
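To make the inheritance pattern concrete, the following minimal sketch illustrates how such a hierarchy could be structured; every class name beyond those mentioned above, and every method signature, is an illustrative assumption rather than the actual datapizza-ai interface:

\begin{verbatim}
# Illustrative sketch only: names and signatures are assumptions,
# not the verbatim datapizza-ai API.
from abc import ABC, abstractmethod
import json

class BaseAgent(ABC):
    """Shared foundation: configuration, system prompt, tool registry."""

    def __init__(self, config_path, system_prompt):
        with open(config_path) as f:
            self.config = json.load(f)        # configuration loading
        self.system_prompt = system_prompt    # system prompt management
        self.tools = {}                       # tool registration

    def register_tool(self, name, fn):
        self.tools[name] = fn

    @abstractmethod
    def run(self, user_message):
        """Each specialized agent implements its own behavior."""

class ChatbotAgent(BaseAgent):
    def run(self, user_message):
        # Conversation handling and profile extraction would go here.
        return f"(reply to: {user_message})"
\end{verbatim}

Centralizing configuration loading and tool registration in the base class is what keeps the specialized agents small and mutually consistent.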
The data model layer ensures type safety and clear data contracts throughout the system. Financial Profile captures demographic and financial characteristics including age, employment status, income, expenses, debt, savings, investment experience, risk tolerance, and financial goals. Portfolio represents investment recommendations including asset allocations, expected returns, volatility metrics, and diversification ratios. PAC Metrics encodes portfolio analysis metrics including PAC (\textit{piano di accumulo}, i.e., accumulation plan) values, performance indicators, and risk assessment. These models are implemented using Pydantic, which enforces type safety at runtime and provides validation.
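A minimal Pydantic sketch of these contracts is shown below; the field selection follows the description above, but the exact field names and types are assumptions:

\begin{verbatim}
# Hypothetical field selection based on the prose description;
# the real models may differ.
from pydantic import BaseModel, Field

class FinancialProfile(BaseModel):
    age: int = Field(ge=18)
    employment_status: str
    monthly_income: float
    monthly_expenses: float
    debt: float = 0.0
    savings: float = 0.0
    investment_experience: str
    risk_tolerance: str
    financial_goals: list[str]

class Portfolio(BaseModel):
    allocations: dict[str, float]       # asset ticker -> weight
    expected_return: float
    volatility: float
    diversification_ratio: float

class PACMetrics(BaseModel):
    monthly_contribution: float         # PAC (piano di accumulo) value
    performance_indicators: dict[str, float]
    risk_assessment: str
\end{verbatim}

Instantiating a model with out-of-range or mistyped values (for example, a non-numeric income) raises a \texttt{ValidationError} at runtime, which is exactly the type-safety guarantee described above.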
The retrieval layer implements retrieval-augmented generation specifically for financial data. RAG Asset Retriever maintains a vector database of historical ETF and stock data, supports semantic search for relevant financial assets, integrates historical price data spanning ten years, and provides structured asset information for recommendations. This component is essential for grounding LLM recommendations in factual, historical information rather than allowing the model to generate potentially inaccurate figures.
latex/sections/05_rag.tex

\subsection{RAG Architecture}
The pipeline begins with the \texttt{ingest\_pdfs} method, which recursively scans the data directory to extract text from PDF documents using the \texttt{pypdf} library. To preserve context while maintaining granularity, the extracted text is processed into overlapping chunks of 800 characters with a 120-character overlap; the overlap reduces the risk that semantically related content is split across chunk boundaries and lost during retrieval.
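A condensed sketch of this ingestion step, assuming a simple sliding-window chunker (function and variable names are illustrative, not the verbatim implementation):

\begin{verbatim}
from pathlib import Path
from pypdf import PdfReader

CHUNK_SIZE, OVERLAP = 800, 120

def ingest_pdfs(data_dir):
    """Recursively scan data_dir and return overlapping text chunks."""
    chunks = []
    for pdf_path in Path(data_dir).rglob("*.pdf"):   # recursive scan
        pages = PdfReader(str(pdf_path)).pages
        text = "".join(page.extract_text() or "" for page in pages)
        step = CHUNK_SIZE - OVERLAP  # 680-character stride between starts
        for i, start in enumerate(range(0, len(text), step)):
            chunks.append({"source": pdf_path.name,
                           "chunk_id": i,
                           "text": text[start:start + CHUNK_SIZE]})
    return chunks
\end{verbatim}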
For the embedding stage, the system employs the Sentence Transformer framework, specifically utilizing the \texttt{all-roberta-large-v1} model. This model transforms text chunks into high-dimensional dense vectors (1024 dimensions), capturing deep semantic relationships that simple keyword searches would miss. To optimize performance and reduce computational overhead, the system implements a caching mechanism: once generated, the embeddings and associated metadata are serialized into an embeddings \texttt{.pkl} file. Subsequent initializations of the agent load this index directly from disk, eliminating the need for re-processing the entire dataset.
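The caching behavior can be sketched as follows (the file path, function name, and payload layout are assumptions; the real implementation may organize the index differently):

\begin{verbatim}
import os
import pickle
from sentence_transformers import SentenceTransformer

INDEX_PATH = "embeddings.pkl"

def build_or_load_index(chunks):
    """Encode chunks once; reuse the serialized index on later startups."""
    if os.path.exists(INDEX_PATH):           # cache hit: skip re-encoding
        with open(INDEX_PATH, "rb") as f:
            payload = pickle.load(f)
        return payload["vectors"], payload["chunks"]
    model = SentenceTransformer("all-roberta-large-v1")  # 1024-dim vectors
    vectors = model.encode([c["text"] for c in chunks])  # numpy array
    with open(INDEX_PATH, "wb") as f:
        pickle.dump({"vectors": vectors, "chunks": chunks}, f)
    return vectors, chunks
\end{verbatim}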
The retrieval logic follows a \textit{top-K} similarity approach. When a query is received, it is encoded into the same vector space as the document chunks. The system then computes the cosine similarity between the query vector and the entire embedding matrix using scikit-learn's optimized routines. The $k$ most relevant segments (with a default of \texttt{k = 15}) are returned to the Financial Advisor Agent, ranked by their similarity score. This architecture ensures that the LLM is provided with the most contextually relevant financial data to ground its recommendations in factual evidence.
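A sketch of this retrieval step, reusing the in-memory index from the previous snippet (the \texttt{retrieve} name is an assumption):

\begin{verbatim}
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def retrieve(query, model, vectors, chunks, k=15):
    """Return the k chunks most similar to the query, best first."""
    q = model.encode([query])                  # shape (1, 1024)
    scores = cosine_similarity(q, vectors)[0]  # one score per stored chunk
    top = np.argsort(scores)[::-1][:k]         # indices of the k best matches
    return [{**chunks[i], "score": float(scores[i])} for i in top]
\end{verbatim}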
The data processing pipeline is designed to transform unstructured PDF prospectuses into a searchable knowledge base. The process begins by scanning the dataset directory for PDF files using a recursive globbing strategy. Text extraction is performed page-by-page, and the resulting strings are partitioned into chunks of 800 characters with a 120-character overlap to maintain semantic continuity.
Each chunk is then transformed into a dense vector representation using the Sentence Transformer model. Unlike generic text processing, this pipeline focuses on capturing the specific terminology found in financial asset descriptions. The system computes these embeddings during the initial setup and stores the entire payload—consisting of the original text, metadata (such as the source file name and chunk ID), and the numerical vectors—into a serialized Pickle file for efficient subsequent loading.
\subsection{Performance Optimization}
To ensure a responsive user experience, the system implements several optimization strategies at the retrieval level. Instead of relying on external database calls, the RAG Asset Retriever loads the entire embedding index into memory as a NumPy array. This allows for near-instantaneous similarity computations using optimized matrix operations.
The use of a global cache for the embedding model ensures that the Sentence Transformer is loaded into memory only once, significantly reducing the latency of subsequent user queries. Furthermore, by utilizing the \texttt{all-roberta-large-v1} model, the system achieves a balance between high-dimensional semantic accuracy (1024 dimensions) and computational speed, even on hardware without dedicated GPU acceleration.
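A minimal sketch of such a process-wide cache (the accessor name is illustrative):

\begin{verbatim}
from sentence_transformers import SentenceTransformer

_MODEL = None  # module-level singleton, shared by all queries in the process

def get_model():
    """Load the embedding model once; return the cached instance afterwards."""
    global _MODEL
    if _MODEL is None:
        _MODEL = SentenceTransformer("all-roberta-large-v1")
    return _MODEL
\end{verbatim}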