feat(contextual-prefix): add local ollama prefix tier (tier 2.5) by vinsocci · Pull Request #63 · AgriciDaniel/claude-obsidian

vinsocci · 2026-05-29T01:11:27Z

Summary

contextual-prefix.py shipped three prefix-generation tiers:

anthropic-api — --allow-egress gated, sends page bodies off-machine (~$12/1k docs)
claude-cli — --allow-egress gated, sends page bodies off-machine (free via CC subscription)
synthetic — on-machine, template-based, lower quality

For users who already run a local LLM, there was no way to get LLM-quality contextual prefixes without egress — a gap between "good but leaves the machine" and "private but weak."

This adds a 4th tier between claude-cli and synthetic:

tier 2.5 "ollama" — POST each chunk to a local ollama /api/chat with the page body for context. Default-on when ollama is reachable; no flag needed for the common localhost case.

Egress posture (consistent with the existing model)

Local ollama (127.0.0.1 / localhost / ::1) needs no flag.
Non-localhost OLLAMA_URL requires --allow-remote-ollama, mirroring the scripts/tiling-check.py:351 default-deny precedent.
--no-ollama skips the tier even when reachable.
Asymmetric fallback: on ollama failure the tier drops to synthetic and does not climb to claude-cli/anthropic-api — egress was never consented to on this path, so silently egressing on local-LLM failure would violate the user's posture.

API surface

New flags: --ollama-model (default qwen2.5:7b-instruct), --allow-remote-ollama, --no-ollama
New functions: ollama_url / ollama_is_local / ollama_reachable / ollama_prefix
pick_prefix_tier() and generate_prefix() extended; the anthropic-api and claude-cli paths are unchanged.

Test plan

Verified on macOS (Obsidian 1.12.7, ollama + qwen2.5:7b-instruct), all against this repo's own public demo vault:

Tier picker (49 demo pages, --peek):

invocation	result
default (local ollama up)	all 49 → `tier=ollama`
`--no-ollama`	all 49 → `tier=synthetic`

Example prefixes (generated by this branch, tier=ollama):

wiki/concepts/Search Experience Optimization.md → "This chunk outlines the methodology, process, and key innovation of SXO in SEO analysis."
wiki/concepts/Pro Hub Challenge.md → "This chunk outlines the Pro Hub Challenge, detailing its structure and rules for both challenges."
wiki/concepts/cherry-picks.md → "This chunk outlines Tier 1 features for quick implementation in the Cherry-Picks feature backlog."

Production scale: 895 chunks across 401 pages generated with tier=ollama, zero egress, no errors.

Notes

Pairs naturally with the existing ollama rerank stage (rerank.py), so the full contextual-retrieval pipeline (prefix → BM25 → cosine rerank) can run entirely on-machine.
Default model qwen2.5:7b-instruct is a suggestion, not a requirement — --ollama-model accepts any pulled tag (tested also with qwen2.5-coder:14b).

🤖 Generated with Claude Code

contextual-prefix.py shipped three tiers: anthropic-api and claude-cli (both --allow-egress gated, both send page bodies off-machine) and synthetic (on-machine, template-based, lower quality). For users with a local LLM already running, there was no way to get LLM-quality contextual prefixes WITHOUT egress — the gap between "good but leaves the machine" and "private but weak." This adds a 4th tier between claude-cli and synthetic: tier 2.5 "ollama" — POST each chunk to a local ollama /api/chat with the page body for context. Default-ON when ollama is reachable; no flag needed for the common (localhost) case. Egress posture (consistent with the existing model): - Local ollama (127.0.0.1/localhost/::1) needs no flag. - Non-localhost OLLAMA_URL requires --allow-remote-ollama, mirroring the scripts/tiling-check.py:351 default-deny precedent. - --no-ollama skips the tier even if reachable. - Asymmetric fallback: on ollama failure the tier drops to synthetic and does NOT climb to claude-cli/anthropic-api — egress was never consented to on this path, so silently egressing on local-LLM failure would violate the user's posture. New flags: --ollama-model (default qwen2.5:7b-instruct), --allow-remote-ollama, --no-ollama. New functions: ollama_url/ollama_is_local/ollama_reachable/ ollama_prefix. pick_prefix_tier() and generate_prefix() extended; the "anthropic-api" and "claude-cli" paths are unchanged. Verified on macOS (Obsidian 1.12.7, ollama + qwen2.5:7b-instruct): - tier picker over 49 public demo pages: default -> all tier=ollama; --no-ollama -> all tier=synthetic. - example prefix (wiki/concepts/Search Experience Optimization.md): "This chunk outlines the methodology, process, and key innovation of SXO in SEO analysis." - production scale: 895 chunks across 401 pages generated with tier=ollama, zero egress, no errors. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

vinsocci requested a review from AgriciDaniel as a code owner May 29, 2026 01:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(contextual-prefix): add local ollama prefix tier (tier 2.5)#63

feat(contextual-prefix): add local ollama prefix tier (tier 2.5)#63
vinsocci wants to merge 1 commit into
AgriciDaniel:mainfrom
vinsocci:feat/ollama-prefix-tier

vinsocci commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

vinsocci commented May 29, 2026

Summary

Egress posture (consistent with the existing model)

API surface

Test plan

Notes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant