AI-Powered Template Generation (RAG)

Generate OS image templates from natural language descriptions using Retrieval-Augmented Generation (RAG). The AI feature searches existing templates semantically, then uses an LLM to generate a new template grounded in real, working examples.

Phase 1 - This guide covers the current implementation: core RAG with basic CLI (semantic search, template generation, embedding cache). See the ADR for the full roadmap (query classification, conversational refinement, agentic validation).

Prerequisites

| Requirement | Details |
| --- | --- |
| `os-image-composer` binary | Built via `earthly +build` or `go build ./cmd/os-image-composer` |
| AI provider (one of) | Ollama (local) or an OpenAI API key |
| Template files | At least a few `image-templates/*.yml` files to serve as the RAG knowledge base |

Install Ollama (recommended)

Ollama runs models locally - no API keys, no cloud costs.

```bash
# Install Ollama (Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull the required models
ollama pull nomic-embed-text   # embedding model (768 dimensions)
ollama pull llama3.1:8b        # default chat/generation model

# Verify the server is running
ollama list
```

Tip: Alternative embedding models are supported: mxbai-embed-large (1024 dims) and all-minilm (384 dims). Change the model in os-image-composer.yml if needed.
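
For example, switching to one of those alternatives is a one-line override in `os-image-composer.yml` (key path as in the Full Configuration Reference; since the new model produces vectors of a different dimensionality, you will likely also want to run `--clear-cache` afterwards so cached embeddings are regenerated):

```yaml
ai:
  ollama:
    embedding_model: mxbai-embed-large   # 1024 dims (default: nomic-embed-text, 768 dims)
```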


Quick Start (Ollama - local, free)

With Ollama running, no configuration is needed:

```bash
# 1. Make sure Ollama is serving (default http://localhost:11434)
ollama serve &

# 2. Generate a template from a natural language description
./os-image-composer ai "create a minimal edge image for ubuntu with SSH"

# 3. Search for relevant templates without generating
./os-image-composer ai --search-only "cloud image with monitoring"
```

Quick Start (OpenAI - cloud)

```bash
# 1. Set your API key
export OPENAI_API_KEY="sk-..."

# 2. Generate using OpenAI
./os-image-composer ai --provider openai "create a minimal elxr image for IoT"
```

CLI Reference

```bash
os-image-composer ai [query] [flags]
```

Generate a Template

```bash
./os-image-composer ai "create a minimal edge image for elxr with docker support"
```

The command will:

  1. Index all templates in image-templates/ (with embedding cache)
  2. Perform semantic search to find the most relevant templates
  3. Show the top reference templates and their similarity scores
  4. Generate a new YAML template grounded in those examples

Search Only

Find relevant templates without invoking the LLM:

```bash
./os-image-composer ai --search-only "cloud deployment with monitoring"
```

Output shows each matching template with a score breakdown:

```text
Found 5 matching templates:

1. elxr-cloud-amd64.yml
   Score: 0.87 (semantic: 0.92, keyword: 0.75, package: 0.60)
   Description: Cloud-ready eLxr image for VM deployment
   Distribution: elxr12, Architecture: x86_64, Type: raw
```

Save to File

```bash
# Save to image-templates/my-custom-image.yml
./os-image-composer ai "create a minimal edge image" --output my-custom-image

# Save to a specific path
./os-image-composer ai "create an edge image" --output /tmp/my-image.yml
```

If the output filename matches one of the reference templates returned by the current search, you will be prompted before overwriting.

Cache Management

Embeddings are cached to avoid recomputation on each run. The cache automatically invalidates when a template's content changes (SHA256 hash).
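
This invalidation rule is easy to reproduce by hand: the digest depends only on the bytes hashed, so a cached embedding remains valid until the template content changes. A minimal illustration with `sha256sum` (the cache's on-disk layout is an implementation detail and may differ):

```bash
# Same content always hashes to the same digest -> cache hit
printf 'hello' | sha256sum
# 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824  -

# Any edit, however small, changes the digest -> cached embedding is discarded
printf 'hello!' | sha256sum
```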

```bash
# Show cache statistics (entries, size, model, dimensions)
./os-image-composer ai --cache-stats

# Clear the embedding cache (forces re-indexing on next run)
./os-image-composer ai --clear-cache
```

All Flags

| Flag | Default | Description |
| --- | --- | --- |
| `--provider` | `ollama` | AI provider: `ollama` or `openai` |
| `--templates-dir` | `./image-templates` | Directory containing template YAML files |
| `--search-only` | `false` | Only search, don't generate |
| `--output` | (none) | Save generated template (name or path) |
| `--cache-stats` | `false` | Show cache statistics |
| `--clear-cache` | `false` | Clear the embedding cache |

Configuration

Zero Configuration (Ollama)

If Ollama is running on localhost:11434 with nomic-embed-text and llama3.1:8b pulled, everything works out of the box - no config file changes required.

Switching to OpenAI

The AI command currently selects the provider via CLI flags, not os-image-composer.yml.

Use --provider openai when running os-image-composer ai:

```bash
./os-image-composer ai --provider openai "minimal Ubuntu server image for cloud VMs"
```

You also need an API key:

```bash
export OPENAI_API_KEY="sk-..."
```

The config snippet below shows the global config file schema for reference:

```yaml
ai:
  provider: openai
```

Full Configuration Reference

All settings are optional. Defaults are shown below - only override what you need to change in os-image-composer.yml:

```yaml
ai:
  provider: ollama                # "ollama" or "openai"
  templates_dir: ./image-templates

  ollama:
    base_url: http://localhost:11434
    embedding_model: nomic-embed-text   # 768 dims
    chat_model: llama3.1:8b
    timeout: "120s"                     # request timeout

  openai:
    embedding_model: text-embedding-3-small
    chat_model: gpt-4o-mini
    timeout: "60s"                      # request timeout

  cache:
    enabled: true
    dir: ./.ai-cache

  # Advanced - rarely need to change
  scoring:
    semantic_weight: 0.70   # embedding similarity weight
    keyword_weight: 0.20    # keyword overlap weight
    package_weight: 0.10    # package name matching weight
```

| Environment Variable | Description |
| --- | --- |
| `OPENAI_API_KEY` | Required when `provider: openai` |

How It Works

```text
User query ──► Index templates ──► Semantic search ──► Build LLM context ──► Generate YAML
                  │                     │
                  ▼                     ▼
             Embedding cache      Hybrid scoring
             (SHA256 hash)    (semantic + keyword + package)
```

1. Indexing - On first run (or when templates change), each template in `image-templates/` is parsed and converted to a searchable text representation. An embedding vector is generated via the configured provider and cached locally.
2. Search - The user query is embedded and compared against all template vectors using cosine similarity. A hybrid score combines:
   - Semantic similarity (70%) - how closely the meaning matches
   - Keyword overlap (20%) - exact term matches
   - Package matching (10%) - package name overlap
3. Generation - The top-scoring templates are included as context for the LLM, which generates a new YAML template grounded in real, working examples.
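
The per-template score printed by `--search-only` follows from these weights. Applying them literally to the component scores in the sample output above (semantic 0.92, keyword 0.75, package 0.60) gives roughly 0.85; the shipped scorer may round or normalize slightly differently, so treat this as an illustration of the weighting rather than an exact reproduction:

```bash
# Hybrid score = 0.70*semantic + 0.20*keyword + 0.10*package (default weights)
awk 'BEGIN { printf "%.3f\n", 0.70*0.92 + 0.20*0.75 + 0.10*0.60 }'
# 0.854
```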


Enriching Templates with Metadata

Templates work without metadata, but adding an optional metadata section improves search accuracy:

```yaml
metadata:
  description: "Cloud-ready eLxr image for VM deployment on AWS, Azure, GCP"
  use_cases:
    - cloud-deployment
  keywords:
    - cloud
    - cloud-init
    - aws
    - azure

image:
  name: elxr-cloud-amd64
  # ... rest of template
```

All metadata fields are optional. Templates without metadata are still indexed using their filename, distribution, architecture, image type, and package lists.


Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| `failed to create AI engine` | Ollama not running | Run `ollama serve` |
| `connection refused :11434` | Ollama server down | Start Ollama: `ollama serve` |
| Embeddings fail | Model not pulled | `ollama pull nomic-embed-text` |
| Chat generation fails | Chat model not pulled | `ollama pull llama3.1:8b` |
| Poor search results | Stale cache | `./os-image-composer ai --clear-cache` |
| OpenAI auth error | Missing API key | `export OPENAI_API_KEY="sk-..."` |
| Slow first run | Building embedding cache | Normal; subsequent runs use the cache |
| No matching templates found | Empty templates dir | Check that `--templates-dir` points to your templates |

Related Documentation