
2 ‐ Methodology overview

Nikola Milosevic edited this page Jan 26, 2025 · 4 revisions

Basic overview

VerifAI is a Retrieval-Augmented Generation (RAG) system designed for referenced question answering (QA) in the biomedical domain. It consists of three main components:

  • The information retrieval component, based on hybrid semantic and lexical search, retrieves relevant documents and provides context for the generative LLM.
  • The generative component, based on a large language model that is either self-hosted or called through an available API (e.g. OpenAI), generates the answer.
  • The verification system uses an encoder transformer model to cross-check the generated answer against the sources it was based on, and flags any potential hallucinations.
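As an illustration of what "hybrid semantic and lexical search" can mean in practice, here is a minimal sketch that fuses two score lists into one ranking. It assumes min-max normalization and a weighted sum, which is one common fusion scheme; the page does not say how VerifAI actually combines the scores, and the names `min_max_normalize`, `hybrid_rank`, and `lexical_weight` are hypothetical.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so lexical and semantic scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(
    lexical: Dict[str, float],
    semantic: Dict[str, float],
    lexical_weight: float = 0.5,  # assumption: equal weighting by default
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Combine lexical and semantic scores with a weighted sum (illustrative only)."""
    lex = min_max_normalize(lexical)
    sem = min_max_normalize(semantic)
    combined = {
        doc_id: lexical_weight * lex.get(doc_id, 0.0)
        + (1.0 - lexical_weight) * sem.get(doc_id, 0.0)
        for doc_id in set(lex) | set(sem)
    }
    # Sort by descending score; break ties by document id for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

A document missing from one of the two result lists simply contributes a score of 0 for that side, so documents found by both searches tend to rank higher.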

Schematically, the system can be illustrated by the following image: image

The primary component of this toolbox is the information retrieval engine, which searches indexed documents (VerifAI BioMed indexes a dataset of abstracts from the PubMed database). Depending on the configuration settings, the question-answering system uses either a standalone fine-tuned LLM or an LLM API, such as OpenAI's, to generate answers with the retrieved documents in context. A fact-checking (verification) engine then examines the generated answer and identifies any potential hallucinations.
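The retrieve, generate, verify flow described above can be sketched as a small pipeline with pluggable stages. This is only an outline under stated assumptions: the names `Document`, `CheckedClaim`, and `answer_with_verification` are hypothetical, the answer is naively split into claims at periods, and the toy stand-ins at the bottom replace the real search index, LLM, and encoder-based verifier.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Document:
    doc_id: str
    text: str


@dataclass
class CheckedClaim:
    claim: str
    supported: bool  # False marks a potential hallucination


def answer_with_verification(
    question: str,
    retrieve: Callable[[str], List[Document]],
    generate: Callable[[str, List[Document]], str],
    verify: Callable[[str, Document], bool],
) -> List[CheckedClaim]:
    """Run retrieval, generation, and verification in sequence."""
    docs = retrieve(question)            # 1. fetch context documents
    answer = generate(question, docs)    # 2. generate an answer from that context
    # 3. split the answer into claims (assumption: one claim per sentence)
    claims = [s.strip() for s in answer.split(".") if s.strip()]
    # A claim counts as supported if at least one retrieved document backs it.
    return [CheckedClaim(c, any(verify(c, d) for d in docs)) for c in claims]


# Toy stand-ins for demonstration only; not the real components.
def toy_retrieve(question: str) -> List[Document]:
    return [Document("doc-1", "Aspirin reduces fever. It is widely used.")]


def toy_generate(question: str, docs: List[Document]) -> str:
    return "Aspirin reduces fever. Aspirin cures cancer."


def toy_verify(claim: str, doc: Document) -> bool:
    # Placeholder check: exact substring match instead of an encoder model.
    return claim.lower() in doc.text.lower()


if __name__ == "__main__":
    for item in answer_with_verification(
        "What does aspirin do?", toy_retrieve, toy_generate, toy_verify
    ):
        print(item)
```

Passing the three stages as callables mirrors the configurable design described above: a self-hosted model or an API-backed one can be swapped in for `generate` without touching the rest of the flow.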

Search

Generative question answering engine

Verification approaches

User interface
