Commit 9919e01

Author: Paul Kyle
Message: release: sync public v0.8.9
1 parent 742660e

54 files changed

Lines changed: 4402 additions & 1399 deletions

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+[Unit]
+Description=Palinode BGE-M3 GPU keepalive ping
+After=network.target
+
+[Service]
+Type=oneshot
+ExecStart=/bin/bash -c 'curl -s --max-time 30 -X POST "${OLLAMA_URL}/api/embed" -H "Content-Type: application/json" -d "{\"model\":\"bge-m3\",\"input\":\"keepalive\"}" > /dev/null 2>&1'
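
Review note: the shell quoting in `ExecStart` makes the request hard to read, so here is a rough Python sketch of the same ping. It is an illustration only and not part of the commit. `build_keepalive_request` and `send_keepalive` are hypothetical names; the endpoint path and JSON body are copied from the unit above. The `timeout` argument is a socket-level timeout, so it only approximates curl's `--max-time 30`.

```python
import json
import urllib.request


def build_keepalive_request(ollama_url: str) -> urllib.request.Request:
    """Build the POST request that the unit's curl command sends (hypothetical helper)."""
    body = json.dumps({"model": "bge-m3", "input": "keepalive"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ollama_url}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_keepalive(ollama_url: str, timeout: float = 30.0) -> int:
    """Send the ping and return the HTTP status code; the response body is discarded."""
    request = build_keepalive_request(ollama_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Building the request separately from sending it keeps the payload checkable without a running Ollama instance.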
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+[Unit]
+Description=Palinode BGE-M3 GPU keepalive (every 20 min)
+After=network.target
+
+[Timer]
+OnBootSec=2min
+OnUnitActiveSec=20min
+Unit=palinode-embed-keepalive.service
+
+[Install]
+WantedBy=default.target

docs/CHANGELOG.md

Lines changed: 57 additions & 0 deletions
@@ -4,6 +4,63 @@ All notable changes to Palinode. Format follows [Keep a Changelog](https://keepa
 
 ## Unreleased
 
+## [0.8.9] — 2026-05-05
+
+A reliability and architecture release. Closes the MCP cold-start timeout class and lands
+four rounds of internal architecture deepening (no breaking changes). Two latent bugs
+surfaced and were fixed along the way.
+
+### Fixed
+
+- **MCP cold-start timeouts.** `palinode_save`, `palinode_search`, and `palinode_session_end`
+now use explicit 90 s httpx timeouts instead of the 30 s / 60 s defaults. BGE-M3 cold-starts
+on a warm-VRAM GPU take ~54 s on first inference; the old defaults caused `-32001 Request
+timed out` errors from the MCP client before the response arrived.
+- **`_warmup_embed()` background task** fires on MCP stdio startup, pre-heating the embedding
+GPU so the first real tool call doesn't hit the cold-start window.
+- **Trigger cooldown `TypeError`.** `check_triggers()` cooldown path subtracted a naive
+datetime (parsed via `fromisoformat`) from an aware `_utc_now()`. Any second-fire of a real
+trigger would raise `TypeError`. Hidden because no test exercised the cooldown-without-bypass
+path against a real DB and the API layer caught it as a 500. Caught during the architecture
+refactor.
+- **Frontmatter body truncation in consolidation runner.** `_get_decisions_for_project` used
+`content.split('---')` without `maxsplit`. If a decision file's body contained a horizontal
+rule, the body would be truncated at the first `---`. Fixed via the new
+`consolidation/frontmatter` module.
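
Review note: the two latent bugs listed under Fixed are easy to reproduce in isolation. The sketch below is not the project's code. `ensure_utc` is a hypothetical helper, the timestamps and file content are made up, and treating naive timestamps as UTC is an assumption about how they were stored. The actual fix in `consolidation/frontmatter` may work differently from the bounded split shown here.

```python
from datetime import datetime, timezone

# Bug 1: subtracting a naive datetime from an aware one raises TypeError.
aware_now = datetime(2026, 5, 5, 12, 0, tzinfo=timezone.utc)  # stand-in for _utc_now()
last_fired = datetime.fromisoformat("2026-05-05T11:30:00")  # no offset, so naive

try:
    aware_now - last_fired
    naive_subtraction_failed = False
except TypeError:
    naive_subtraction_failed = True


def ensure_utc(value: datetime) -> datetime:
    """Attach UTC to naive timestamps; leave aware ones untouched (assumed policy)."""
    return value if value.tzinfo is not None else value.replace(tzinfo=timezone.utc)


elapsed = aware_now - ensure_utc(last_fired)  # now a valid timedelta

# Bug 2: an unbounded split on '---' also cuts at horizontal rules in the body.
content = (
    "---\ntitle: Use SQLite\n---\n"
    "Context paragraph.\n\n---\n\nDecision paragraph.\n"
)

truncated_body = content.split("---")[2]  # stops at the horizontal rule
full_body = content.split("---", 2)[2]  # maxsplit=2 keeps the rest of the body
```

With `maxsplit=2` only the two frontmatter delimiters are consumed, so any later `---` stays part of the body.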
+
+### Added
+
+- **`deploy/systemd/palinode-embed-keepalive.{service,timer}.template`** — user-level systemd
+timer that pings BGE-M3 every 20 minutes to keep GPU kernels warm between sessions.
+- **`EmbedderProtocol` and `LLMProvider`** typed Protocol classes for embedding and LLM
+generation, each with `OllamaProvider` + `FakeProvider` adapters. Production callers
+default-construct the real adapter; tests can inject `FakeProvider` to run without live
+Ollama. Aligns with the tools-over-pipeline approach: model swaps become adapter swaps.
+- **`palinode/core/memory_paths`** with typed exception hierarchy (`MemoryPathError`,
+`MemoryPathTraversal`, `MemoryPathNotFound`, `MemoryPathTooLarge`). Replaces
+`HTTPException`-coupled path validation with typed errors that any surface can consume.
+- **`palinode/consolidation/proposal`** — typed `ProposalOp` dataclass and `OpKind` enum for
+the consolidation pipeline. The deterministic executor now accepts typed ops instead of
+`dict[str, Any]` with defensive runtime checks.
+- **`palinode/core/git_persistence`** — single seam for git write operations
+(`write_and_commit`, `commit_existing`, `push`) with typed error hierarchy. Replaces 9
+inline `subprocess.run(["git", ...])` call sites that had inconsistent `check=` semantics,
+cwd resolution, and error handling.
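
Review note: the `EmbedderProtocol` / `FakeProvider` entry describes a structural-typing seam. The sketch below shows the general pattern with `typing.Protocol`. Only the class names come from the changelog. The `embed` method name, its signature, the fake's vector scheme, and the `index_note` caller are all assumptions made for illustration.

```python
from typing import Protocol


class EmbedderProtocol(Protocol):
    """Structural type for anything that turns text into a vector (method name assumed)."""

    def embed(self, text: str) -> list[float]: ...


class FakeProvider:
    """Deterministic stand-in for tests; it never contacts Ollama."""

    def __init__(self, dimensions: int = 4) -> None:
        self.dimensions = dimensions
        self.calls: list[str] = []

    def embed(self, text: str) -> list[float]:
        # Record the call and return a predictable vector based on text length.
        self.calls.append(text)
        return [float(len(text))] * self.dimensions


def index_note(text: str, embedder: EmbedderProtocol) -> dict[str, object]:
    """Hypothetical caller that depends on the protocol, not on a concrete adapter."""
    return {"text": text, "vector": embedder.embed(text)}


fake = FakeProvider(dimensions=3)
record = index_note("keepalive", fake)
```

Because `Protocol` matching is structural, `FakeProvider` does not need to inherit from `EmbedderProtocol`; a real adapter with the same method would slot into `index_note` unchanged.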
+
+### Internal
+
+- **Architecture deepening pass (rounds 1–4).** `server.py` 3088 → 2398 lines (−22%);
+`store.py` 1413 → 1158 lines (−18%). 14 new modules in `palinode/core/`, `palinode/api/`,
+and `palinode/consolidation/`; 12 new direct unit-test files adding 100+ seam tests.
+No behaviour changes outside the two bug fixes called out under Fixed.
+- Round 1: `similarity`, `wiki`, `summarize`, `db`, `triggers`, `entity_graph` extracted;
+`EmbedderProtocol` introduced.
+- Round 2: `middleware`, `rate_limit`, `frontmatter` extracted; direct seam tests for
+triggers, entity graph, indexer dedup, middleware, frontmatter.
+- Round 3: `LLMProvider`, `memory_paths`, `ProposalOp` typed seams.
+- Round 4: `git_persistence` write seam; callers migrated off the `store.py` re-export
+facade onto canonical module imports.
+
 ## [0.8.8] — 2026-05-01
 
 A pure-security release in response to the MCP Marketplace security audit.

examples/compaction-demo/README.md

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ Between pass 1 and pass 2 (9 operations):
 - 4 × `ARCHIVE` — moved superseded entries to `archive/2026/my-app-status.md`
 - 2 × `UPDATE` — final wording pass on merged lines
 
-All 22 operations were **proposed by an LLM** but **applied by deterministic Python** (`palinode/consolidation/executor.py`). The LLM never touches the file directly — it only emits JSON like `{"op": "SUPERSEDE", "id": "f-0317-1", "superseded_by": "f-0324-2", "reason": "stripe integration actually shipped on the 24th"}` and the executor validates and applies it. That's the [ADR-001](../../ADR-001-tools-over-pipeline.md) invariant in action.
+All 22 operations were **proposed by an LLM** but **applied by deterministic Python** (`palinode/consolidation/executor.py`). The LLM never touches the file directly — it only emits JSON like `{"op": "SUPERSEDE", "id": "f-0317-1", "superseded_by": "f-0324-2", "reason": "stripe integration actually shipped on the 24th"}` and the executor validates and applies it. That's the tools-over-pipeline invariant in action.
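
Review note: the "LLM proposes, deterministic Python applies" split can be sketched as a validation step that turns the emitted JSON into a typed value before anything touches a file. This is a minimal sketch, not the code in `palinode/consolidation/executor.py`. The names `ProposalOp` and `OpKind` appear in this commit's changelog, but the fields, the validation rules, and `parse_op` are assumptions, and the enum lists only the kinds visible in this hunk.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OpKind(Enum):
    # Only the kinds named in the README hunk; the real enum may define more.
    SUPERSEDE = "SUPERSEDE"
    ARCHIVE = "ARCHIVE"
    UPDATE = "UPDATE"


@dataclass(frozen=True)
class ProposalOp:
    kind: OpKind
    id: str
    reason: str = ""
    superseded_by: Optional[str] = None


def parse_op(raw: str) -> ProposalOp:
    """Validate one LLM-emitted JSON object and convert it to a typed op."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("proposal must be a JSON object")
    try:
        kind = OpKind(data["op"])
    except (KeyError, ValueError) as exc:
        raise ValueError(f"unknown or missing op: {data.get('op')!r}") from exc
    if not isinstance(data.get("id"), str):
        raise ValueError("proposal needs a string 'id'")
    if kind is OpKind.SUPERSEDE and not isinstance(data.get("superseded_by"), str):
        raise ValueError("SUPERSEDE needs a string 'superseded_by'")
    return ProposalOp(kind, data["id"], str(data.get("reason", "")), data.get("superseded_by"))


op = parse_op(
    '{"op": "SUPERSEDE", "id": "f-0317-1", "superseded_by": "f-0324-2", '
    '"reason": "stripe integration actually shipped on the 24th"}'
)
```

Rejecting unknown op kinds at the parsing boundary means the code that applies changes only ever sees values it knows how to handle.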
 
 ## Why this matters
 
palinode/api/__init__.py

Whitespace-only changes.
