tag1consulting/scolta-python


scolta-python

AI-powered search with Pagefind — the Python language binding of Scolta. A faithful port of scolta-php.

Scolta is a scoring/ranking/AI layer over Pagefind, a static client-side search engine. The browser-side scoring engine (scolta-core compiled to WebAssembly) re-ranks Pagefind results and drives an optional LLM tier (query expansion, summarization, follow-ups). This binding does the server-side work:

  • gets content out of the application,
  • builds and maintains a Pagefind-compatible index in-process (pure-Python indexer — no Pagefind binary required at runtime), with an input-side token cache so re-indexing after a content edit only re-tokenizes changed pages,
  • proxies AI calls (Anthropic native + any OpenAI-compatible endpoint),
  • serves the reused WASM/JS/CSS asset bundle and exposes config.
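The input-side token cache mentioned above can be pictured with a minimal sketch. This is not the package's actual implementation: the `TokenCache` class, the `tokenize` helper, and the choice of SHA-256 as the change detector are all assumptions made for illustration. It only shows the idea that unchanged pages are served from the cache while edited pages are tokenized again.

```python
import hashlib
import re


def tokenize(text: str) -> list[str]:
    # Hypothetical stand-in tokenizer; the real Scolta tokenizer is more involved.
    return re.findall(r"\w+", text.lower())


class TokenCache:
    """Hypothetical cache mapping page id -> (content hash, tokens)."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[str, list[str]]] = {}
        self.misses = 0  # number of times a page actually had to be tokenized

    def tokens_for(self, page_id: str, text: str) -> list[str]:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        cached = self._entries.get(page_id)
        if cached is not None and cached[0] == digest:
            return cached[1]  # content unchanged: reuse the stored tokens
        self.misses += 1
        tokens = tokenize(text)
        self._entries[page_id] = (digest, tokens)
        return tokens


cache = TokenCache()
pages = {"/a": "Hello world", "/b": "Static search"}

# First indexing run: every page is new, so both are tokenized.
for page_id, body in pages.items():
    cache.tokens_for(page_id, body)

# Edit one page and re-index: only the edited page is tokenized again.
pages["/b"] = "Static client-side search"
for page_id, body in pages.items():
    cache.tokens_for(page_id, body)
```

Keying on a hash of the input text (rather than a timestamp) means a page that is saved without changes still hits the cache.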

The pure-Python indexer is the default (indexer: auto). The Pagefind binary pipeline is ported too but is opt-in (indexer: binary), with the same auto-fallback-to-Python-when-the-binary-is-unavailable behaviour as the PHP binding.
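The selection rule described above might look roughly like the following sketch. The function name `resolve_indexer`, the use of `shutil.which` to detect the binary, and the executable name `pagefind` are assumptions for illustration, not the package's API; only the `auto` and `binary` setting values come from the text above.

```python
import shutil


def resolve_indexer(setting: str = "auto", binary_name: str = "pagefind") -> str:
    """Return which indexer to run: "python" or "binary" (hypothetical helper)."""
    if setting == "auto":
        # The pure-Python indexer is the default.
        return "python"
    if setting == "binary":
        # Opt-in binary pipeline; fall back to Python if the binary is not on PATH.
        if shutil.which(binary_name) is not None:
            return "binary"
        return "python"
    raise ValueError(f"unknown indexer setting: {setting!r}")


print(resolve_indexer("auto"))
print(resolve_indexer("binary", binary_name="no-such-pagefind-binary"))
```

Both calls select the Python indexer: the first because it is the default, the second because the requested binary cannot be found.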

Platform integration for Django/Wagtail lives in the companion scolta-django package.

Status

Work-in-progress port of scolta-php. See CLAUDE.md for the porting conventions and the per-phase progress.

Requirements

  • Python 3.10+
  • Optional: PyICU (the [icu] extra) for higher-quality Unicode diacritic normalization in the tokenizer. Without it the tokenizer uses a strtr-style fallback, exactly as scolta-php does without ext-intl.
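As a rough illustration of the optional-dependency pattern described above, the sketch below uses PyICU when it is importable and a fixed character table otherwise. The function name `strip_diacritics`, the ICU transform ID, and the (deliberately small) fallback table are assumptions chosen for the example; the actual tokenizer may normalize differently.

```python
try:
    import icu  # provided by PyICU (the optional [icu] extra)

    # Decompose, drop combining marks, recompose.
    _FOLD = icu.Transliterator.createInstance("NFD; [:Nonspacing Mark:] Remove; NFC")

    def strip_diacritics(text: str) -> str:
        return _FOLD.transliterate(text)

except ImportError:
    # strtr-style fallback: a fixed character-for-character replacement table.
    # This table is illustrative and covers only a few Latin letters.
    _GROUPS = {
        "a": "àáâãäå", "c": "ç", "e": "èéêë", "i": "ìíîï",
        "n": "ñ", "o": "òóôõö", "u": "ùúûü", "y": "ýÿ",
    }
    _TABLE = str.maketrans(
        {ch: base for base, chars in _GROUPS.items() for ch in chars}
    )

    def strip_diacritics(text: str) -> str:
        return text.translate(_TABLE)


print(strip_diacritics("café naïve"))
```

The ICU path handles any script with combining marks, while the table only covers the characters it lists, which is the quality difference the requirement note refers to.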

Development

uv venv --python 3.12
uv pip install -e ".[dev]"
uv run pytest
uv run ruff check

About

Privacy-first AI search for Python, the pure-Python Scolta binding. Builds a static Pagefind index in-process; search runs entirely in the visitor's browser.
