- Repository:
castorini/nuggetizer - Primary language: Python 3.10+
- Purpose: create/score/assign factual nuggets for RAG evaluation using LLM backends (OpenAI, Azure OpenAI, OpenRouter, vLLM).
src/nuggetizer/models/nuggetizer.py: main orchestration (Nuggetizer) for create/score/assign.src/nuggetizer/core/: core types, metrics, sync/async LLM handlers, base protocols.src/nuggetizer/prompts/: prompt builders and YAML prompt templates.scripts/: CLI pipelines for dataset-scale JSONL processing.examples/: end-to-end usage examples (sync and async).docs/: assets only (logo currently).
- Build backend:
setuptools.build_metaviapyproject.toml. - Dependencies are dynamic and sourced from
requirements.txt. - Install for development with
pip install -e .. - Recommended local environment from README: conda env with Python 3.10.
- API keys are loaded from
.envbysrc/nuggetizer/utils/api.py. - Supported env vars:
- OpenAI:
OPEN_AI_API_KEYorOPENAI_API_KEY - OpenRouter:
OPENROUTER_API_KEY - Azure OpenAI:
AZURE_OPENAI_API_BASE,AZURE_OPENAI_API_VERSION,AZURE_OPENAI_API_KEY
- OpenAI:
- Keep provider fallback behavior intact in
LLMHandler/AsyncLLMHandler:- OpenAI first when available, OpenRouter fallback when enabled/available.
- vLLM uses local base URL (
http://localhost:<port>/v1) with placeholder key.
- Formatting/linting/type checks are enforced by pre-commit:
- Ruff (
ruff-check --fix,ruff-format) - MyPy (strict-ish config in
pyproject.toml)
- Ruff (
- Run before committing:
pre-commit run --all-files
- Type hints are expected for new/changed code (
disallow_untyped_defs = true). - Preserve dataclass and Enum-based type contracts in
core/types.py.
- PR CI (
.github/workflows/pr-format.yml) runs on PRs tomain. - CI currently validates style/type only via pre-commit (ruff + mypy).
- No dedicated automated test suite is present; validate behavior using examples/scripts locally.
- Lint/type:
pre-commit run --all-files
- Quick smoke checks:
python3 examples/e2e.py --helppython3 examples/async_e2e.py --helppython3 scripts/create_nuggets.py --helppython3 scripts/assign_nuggets.py --helppython3 scripts/calculate_metrics.py --help
scripts/create_nuggets.pyexpects JSONL records withqueryandcandidates.scripts/assign_nuggets.pyjoins nugget JSONL with answer JSONL (topic_idmapping).scripts/calculate_metrics.pycomputes per-record and global metrics from assignments.- Scripts append to output JSONL in some paths; avoid accidental duplicate processing.
- Keep public constructor behavior stable in
Nuggetizer(model args, provider flags, window/max controls). - Avoid breaking JSONL schemas produced by
scripts/unless all downstream consumers are updated. - When editing prompt templates, verify prompt loader paths and assignment/score label compatibility.
- Preserve retry and key-rotation logic in LLM handlers unless intentionally redesigning error handling.
- Version is defined in
pyproject.toml(project.version) and managed withbumpverconfig. - If doing a release bump, update versioned references consistently per bumpver patterns.