Inline Provider

Overview

The inline provider runs Ragas evaluation directly within the Llama Stack server process. This is the simplest deployment option and is ideal for development, testing, and lightweight production scenarios.

Architecture

The inline provider architecture is straightforward, running everything within the Llama Stack server process:

graph TB
    A[Client Request] --> B[Llama Stack Server]

    subgraph LSS ["🟦 Llama Stack Server"]
        style LSS stroke-dasharray: 5 5
        B --> C[Inline Ragas Provider]
        C --> D[RAGAS Engine]
        D --> E[Direct Memory Processing]
        E --> F[Results]
        F --> B
    end

    B --> A

Components

Llama Stack Server

The main server process that handles all requests and coordinates between different providers.

Inline Ragas Provider

The provider implementation that handles RAGAS evaluation requests directly within the server process.

RAGAS Engine

The core RAGAS evaluation engine that runs the actual metrics calculations.

Direct Memory Processing

All data processing happens in memory within the same process, providing fast access but sharing resources.

Process Flow

  1. Request Reception: Client requests are received by the Llama Stack server

  2. Provider Selection: Server routes evaluation requests to the inline RAGAS provider

  3. Direct Processing: Provider loads data and runs evaluation in the same process

  4. Result Return: Results are immediately available and returned to the client

Installation

Prerequisites

  • Python 3.12 or later

  • uv package manager

One-liner setup for the impatient

  • Running this command starts a Llama Stack server with the Ragas provider installed, using the minimal distribution from the distribution/ directory.

  • Note that we are asking for the [distro] dependency group (more info below).

  • Also note that, for this one-liner to work, you will need to have your environment variables set up (see Environment Variables below).

uv run --with llama-stack-provider-ragas[distro] llama stack run distribution/run-inline.yaml

Installing with uv

To get started with uv, create a virtual environment and install from PyPI:

uv venv --python=3.12
source .venv/bin/activate
uv pip install llama-stack-provider-ragas
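
To confirm the package landed in the active environment, a quick import check can help (a minimal sketch; the top-level package name matches the module referenced later in the distribution config):

# Sanity check that the provider package is importable in the active environment.
import llama_stack_provider_ragas

print("found provider package:", llama_stack_provider_ragas.__name__)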

Development setup

If you plan to contribute or modify the code, clone the repository and install it as an editable package:

git clone https://github.com/trustyai-explainability/llama-stack-provider-ragas
cd llama-stack-provider-ragas
uv pip install -e .

Optional dependencies for inline provider

The package includes several optional dependency groups:

Group    Description
dev      Development dependencies including testing tools, linting, and type checking
distro   Dependencies to use the provided minimal Llama Stack distribution under distribution/

Installing with optional dependencies

# For development (includes all dependencies)
uv pip install -e ".[dev]"

# For using the sample distribution
uv pip install -e ".[distro]"

# Base installation (inline provider only)
uv pip install -e "."

Configuration

The inline provider requires minimal configuration beyond the standard Llama Stack setup.

Environment Variables

Create a .env file in the project root with:

EMBEDDING_MODEL=all-MiniLM-L6-v2

Distribution Configuration

The repository includes a sample Llama Stack distribution configuration that uses Ollama as a provider for inference and embeddings.

The inline provider is set up in the following section of run-inline.yaml:

eval:
  - provider_id: trustyai_ragas_inline
    provider_type: inline::trustyai_ragas
    module: llama_stack_provider_ragas.inline
    config:
      embedding_model: ${env.EMBEDDING_MODEL}

Usage

Starting the Server

Start the Llama Stack server with the included distribution configuration:

dotenv run uv run llama stack run distribution/run-inline.yaml
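
Once the server is up, you can optionally verify that the inline provider was registered. A minimal sketch using the llama-stack-client SDK, assuming the server listens on the default port 8321 (adjust if your distribution config differs):

# Sketch: list registered providers and look for the inline Ragas eval provider.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")
for provider in client.providers.list():
    print(provider.api, provider.provider_id, provider.provider_type)
# Expect an eval entry with provider_id "trustyai_ragas_inline" (from run-inline.yaml).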

Example Usage

The repository includes demonstration examples in the demos/ directory showing how to use the provider.
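
For orientation, the sketch below shows roughly what an evaluation call could look like through the llama-stack-client Python SDK. The benchmark id, dataset id, scoring function name, and the exact shape of benchmark_config are placeholders and assumptions that vary with the Llama Stack version and the Ragas metrics you enable; treat the scripts in demos/ as the authoritative reference.

# Hypothetical sketch of driving the inline Ragas provider via the
# llama-stack-client SDK. Identifiers and config fields below are
# illustrative placeholders; see demos/ for working examples.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # default server port

# Register a benchmark backed by the Ragas eval provider. Assumes a dataset
# with RAG traces (questions, contexts, answers) has already been registered.
client.benchmarks.register(
    benchmark_id="ragas::demo",          # placeholder id
    dataset_id="my-rag-dataset",         # placeholder dataset
    scoring_functions=["faithfulness"],  # placeholder Ragas metric name
)

# Run the evaluation; benchmark_config follows the generic eval API and its
# exact fields depend on the Llama Stack version in use.
job = client.eval.run_eval(
    benchmark_id="ragas::demo",
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "my-inference-model",  # placeholder inference model id
            "sampling_params": {"max_tokens": 256},
        },
    },
)
print(job)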