Commit 504d3f0

claude, frousselet
authored and committed
Add native Claude (Anthropic) provider to Ask Cairn assistant
Claude is not OpenAI-compatible, so it gets its own client against the native Messages API (POST /v1/messages, x-api-key header, top-level system parameter, content block list) rather than reusing OpenAICompatibleClient.

Routing (chat_json) uses forced tool use: a single `plan` tool whose input_schema is the routing schema, with tool_choice pinned to it. This is the reliable structured-output path on Claude, and one that accepts the plan schema's free-form `arguments` object. No temperature or thinking is sent, since both are rejected with HTTP 400 on the current Opus family (the default model tier). embed() raises a clear error: Anthropic has no embeddings endpoint, so semantic search must use another provider.

Register `anthropic` (alias `claude`) in get_client(), add it to PROVIDER_LABELS, add a dedicated test module plus factory tests, and document it (assistant module spec, settings table, README, .env.example, CHANGELOG).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
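The request shape described in the commit message can be sketched in isolation. This is a minimal illustration, not the committed code: `build_routing_payload` is a hypothetical helper name, and the `max_tokens` value of 1024 is an arbitrary placeholder (the commit reads it from `settings.AI_ASSISTANT_MAX_TOKENS`).

```python
def build_routing_payload(model, max_tokens, messages, plan_schema):
    """Build a Messages API body that forces a single `plan` tool call.

    Hypothetical helper for illustration; it mirrors the shape described in
    the commit message rather than reproducing the committed client.
    """
    # Claude takes the system prompt as a top-level field, so hoist it out
    # of the OpenAI-style message list.
    system = "\n\n".join(
        m["content"] for m in messages if m["role"] == "system" and m["content"]
    )
    payload = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [m for m in messages if m["role"] != "system"],
        "tools": [
            {
                "name": "plan",
                "description": "Return the execution plan for the question.",
                "input_schema": plan_schema,
            }
        ],
        # Pinning tool_choice to the tool asks the model to answer with a
        # tool_use block whose `input` follows the schema.
        "tool_choice": {"type": "tool", "name": "plan"},
    }
    if system:
        payload["system"] = system
    # Deliberately no `temperature` or `thinking` keys, per the commit message.
    return payload


body = build_routing_payload(
    "claude-opus-4-8",
    1024,  # placeholder value, not taken from the commit
    [
        {"role": "system", "content": "route this"},
        {"role": "user", "content": "hi"},
    ],
    {"type": "object"},
)
```

The resulting `body` carries the system prompt at the top level and only the user turn in `messages`, which matches the shape the commit's test module asserts on.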
1 parent 010b4b1 commit 504d3f0

10 files changed

Lines changed: 386 additions & 20 deletions

.env.example

Lines changed: 4 additions & 0 deletions
@@ -29,6 +29,10 @@ POSTGRES_PORT=5432
 # AI_ASSISTANT_API_KEY=your-openai-api-key
 # AI_ASSISTANT_MODEL=gpt-4o-mini
 # AI_ASSISTANT_BASE_URL=https://api.openai.com/v1 # default; set for a custom gateway
+# Claude (Anthropic, native Messages API; no embeddings, so no semantic search):
+# AI_ASSISTANT_PROVIDER=anthropic
+# AI_ASSISTANT_API_KEY=your-anthropic-api-key
+# AI_ASSISTANT_MODEL=claude-opus-4-8
 # Self-hosted alternative (local LLM, no data egress): point at your own Ollama:
 # AI_ASSISTANT_PROVIDER=ollama
 # AI_ASSISTANT_OLLAMA_URL=http://host.docker.internal:11434

CHANGELOG.md

Lines changed: 2 additions & 1 deletion
@@ -9,7 +9,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
-- **Ask Cairn: OpenAI and OpenAI-compatible providers**: the assistant gains an `openai` backend (`AI_ASSISTANT_PROVIDER=openai`) that targets OpenAI (ChatGPT, e.g. `gpt-4o-mini`) out of the box and, via `AI_ASSISTANT_BASE_URL`, any other endpoint implementing the OpenAI `/chat/completions` and `/embeddings` API (vLLM, LiteLLM, LocalAI, Together, Groq...). The shared request/response handling was extracted into a generic `OpenAICompatibleClient`; the existing `MistralClient` is now a thin subclass of it (Mistral already exposes an OpenAI-compatible API), so behaviour is unchanged for Mistral users. `AI_ASSISTANT_BASE_URL` now defaults to empty and each provider falls back to its own endpoint (`mistral` -> `api.mistral.ai`, `openai` -> `api.openai.com`); set it only to target a custom gateway.
+- **Ask Cairn: OpenAI and OpenAI-compatible providers**: the assistant gains an `openai` backend (`AI_ASSISTANT_PROVIDER=openai`) that targets OpenAI (ChatGPT, e.g. `gpt-4o-mini`) out of the box and, via `AI_ASSISTANT_BASE_URL`, any other endpoint implementing the OpenAI `/chat/completions` and `/embeddings` API (vLLM, LiteLLM, LocalAI, Together, Groq...). The shared request/response handling was extracted into a generic `OpenAICompatibleClient`; the existing `MistralClient` is now a thin subclass of it (Mistral already exposes an OpenAI-compatible API), so behaviour is unchanged for Mistral users. `AI_ASSISTANT_BASE_URL` now defaults to empty and each provider falls back to its own endpoint (`mistral` -> `api.mistral.ai`, `openai` -> `api.openai.com`, `anthropic` -> `api.anthropic.com`); set it only to target a custom gateway.
+- **Ask Cairn: Claude (Anthropic) provider**: a native `anthropic` backend (`AI_ASSISTANT_PROVIDER=anthropic`) talks to Claude through the Messages API (`POST /v1/messages`, `x-api-key` header, top-level `system`, `content` block list) - Claude is not OpenAI-compatible, so it has its own client. Routing uses forced tool use (a `plan` tool whose `input_schema` is the routing schema) and no `temperature`/`thinking` is sent (both 400 on the current Opus family). Set `AI_ASSISTANT_MODEL` to a Claude model id (e.g. `claude-opus-4-8`). Semantic search is not available with this provider, since Anthropic has no embeddings API.
 
 ## [0.27.1] - 2026-06-14
 
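The per-provider endpoint fallback described in the changelog entry can be illustrated with a small sketch. `DEFAULT_BASE_URLS` and `resolve_base_url` are hypothetical names; the OpenAI and Anthropic URLs appear elsewhere in this commit, while the `/v1` path on the Mistral host is an assumption.

```python
# Hypothetical lookup table for illustration only.
DEFAULT_BASE_URLS = {
    "mistral": "https://api.mistral.ai/v1",  # path is an assumption
    "openai": "https://api.openai.com/v1",
    "anthropic": "https://api.anthropic.com/v1",
}


def resolve_base_url(provider, configured=""):
    """Return the configured gateway URL, else the provider's own endpoint."""
    # An empty AI_ASSISTANT_BASE_URL means "use the provider default".
    base = configured or DEFAULT_BASE_URLS[provider]
    # Strip a trailing slash so callers can append "/messages" and similar.
    return base.rstrip("/")


anthropic_url = resolve_base_url("anthropic")
gateway_url = resolve_base_url("openai", "https://gateway.example/v1/")
```

Under these assumptions, leaving the setting empty resolves to the provider's own host, and a custom gateway value takes precedence for any provider.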

README.md

Lines changed: 2 additions & 2 deletions
@@ -13,7 +13,7 @@ Manage your organisation's security posture, track compliance with regulatory fr
 - **Risks** : ISO 27005 and EBIOS RM (ANSSI v1.5, workshops 0 to 5) assessments, threat and vulnerability catalogs, treatment plans and formal risk acceptance
 - **Compliance** : frameworks, requirements, assessments, findings, action plans and inter-framework mappings, with Excel import
 - **Steering** : real-time dashboard, KPI indicators, ISO 27001 management reviews, and PDF/DOCX/PPTX report generation (SoA, audit report, risk register, meeting minutes)
-- **Ask Cairn (optional)** : natural-language questions in the command palette ("Which decisions were made at the last management review?"), answered by a pluggable LLM provider (Mistral AI by default; OpenAI / any OpenAI-compatible endpoint; self-hosted Ollama) that cites real records and enforces your permissions, with thumbs up/down feedback that admins can export to improve the assistant
+- **Ask Cairn (optional)** : natural-language questions in the command palette ("Which decisions were made at the last management review?"), answered by a pluggable LLM provider (Mistral AI by default; OpenAI / any OpenAI-compatible endpoint; Claude; self-hosted Ollama) that cites real records and enforces your permissions, with thumbs up/down feedback that admins can export to improve the assistant
 
 Everything is bilingual (English/French), audit-ready (full change history, versioning, lifecycle workflows with approvals) and access-controlled (role-based permissions, scope-based tenancy, passkey login).
 
@@ -47,7 +47,7 @@ To run the published image without cloning the repository, and for production no
 
 ## Tech stack
 
-Django 5.2 LTS, PostgreSQL 16, Django REST Framework, Django Channels + Redis (real-time), Bootstrap 5.3 + HTMX (frontend), Docker. Optional: Mistral AI, OpenAI / OpenAI-compatible endpoints, or self-hosted Ollama (Ask Cairn assistant).
+Django 5.2 LTS, PostgreSQL 16, Django REST Framework, Django Channels + Redis (real-time), Bootstrap 5.3 + HTMX (frontend), Docker. Optional: Mistral AI, OpenAI / OpenAI-compatible endpoints, Claude (Anthropic), or self-hosted Ollama (Ask Cairn assistant).
 
 ## Licence
 

assistant/providers/anthropic.py

Lines changed: 184 additions & 0 deletions
@@ -0,0 +1,184 @@
+"""Anthropic (Claude) backend for the assistant (native Messages API).
+
+Claude is NOT OpenAI-compatible : it uses ``POST /v1/messages`` with an
+``x-api-key`` header, a top-level ``system`` parameter, and a ``content`` block
+list in the response. It therefore needs its own client rather than the shared
+``OpenAICompatibleClient``.
+
+Two operations are implemented: a chat completion constrained to a JSON Schema
+for tool routing (done with forced tool use, the reliable structured-output
+path on Claude) and a plain-text chat completion for the final summary
+sentence. Embeddings are not provided : Anthropic has no embeddings endpoint,
+so semantic search must use another provider (see ``embed``).
+
+Only the calling user's question and the compact, identifier-stripped record
+fields produced by the read-only catalog tools leave the platform; ids and
+UUIDs are scrubbed before the summary call (see ``engine._strip_identifiers``).
+"""
+
+import logging
+
+import httpx
+from django.conf import settings
+
+from assistant.providers.base import (
+    BaseClient,
+    MalformedModelOutput,
+    ModelNotAvailable,
+    ServiceUnreachable,
+)
+
+logger = logging.getLogger(__name__)
+
+
+class AnthropicClient(BaseClient):
+    PROVIDER_LABEL = "Claude"
+    # Applied when neither the constructor argument nor
+    # ``settings.AI_ASSISTANT_BASE_URL`` is set. The Messages endpoint is
+    # ``{base_url}/messages`` (so the default resolves to
+    # ``https://api.anthropic.com/v1/messages``).
+    DEFAULT_BASE_URL = "https://api.anthropic.com/v1"
+    # Pinned API version sent on every request (Anthropic requirement).
+    ANTHROPIC_VERSION = "2023-06-01"
+    # Name of the synthetic tool used to force structured routing output.
+    PLAN_TOOL_NAME = "plan"
+
+    def __init__(self, base_url=None, model=None, api_key=None):
+        self.base_url = (
+            base_url or settings.AI_ASSISTANT_BASE_URL or self.DEFAULT_BASE_URL
+        ).rstrip("/")
+        self.model = model or settings.AI_ASSISTANT_MODEL
+        self.api_key = api_key if api_key is not None else settings.AI_ASSISTANT_API_KEY
+        self.timeout = httpx.Timeout(
+            settings.AI_ASSISTANT_TIMEOUT,
+            connect=settings.AI_ASSISTANT_CONNECT_TIMEOUT,
+        )
+
+    def _headers(self):
+        if not self.api_key:
+            raise ServiceUnreachable(
+                f"{self.PROVIDER_LABEL} API key is not configured "
+                "(set AI_ASSISTANT_API_KEY)."
+            )
+        return {
+            "x-api-key": self.api_key,
+            "anthropic-version": self.ANTHROPIC_VERSION,
+            "content-type": "application/json",
+        }
+
+    @staticmethod
+    def _split_system(messages):
+        """Split OpenAI-style messages into Claude's (system, messages) shape.
+
+        Claude takes the system prompt as a top-level parameter, not as a
+        message with ``role: "system"``; user/assistant turns stay in
+        ``messages``.
+        """
+        system_parts = []
+        chat = []
+        for message in messages:
+            role = message.get("role")
+            content = message.get("content", "")
+            if role == "system":
+                if content:
+                    system_parts.append(content)
+            else:
+                chat.append({"role": role, "content": content})
+        return "\n\n".join(system_parts), chat
+
+    def _base_payload(self, messages):
+        system, chat = self._split_system(messages)
+        # No temperature / thinking: both are rejected (HTTP 400) on the
+        # current Opus family, which is the default model.
+        payload = {
+            "model": self.model,
+            "max_tokens": settings.AI_ASSISTANT_MAX_TOKENS,
+            "messages": chat,
+        }
+        if system:
+            payload["system"] = system
+        return payload
+
+    def _post(self, payload):
+        try:
+            return httpx.post(
+                f"{self.base_url}/messages",
+                json=payload,
+                headers=self._headers(),
+                timeout=self.timeout,
+            )
+        except (httpx.ConnectError, httpx.TimeoutException) as exc:
+            raise ServiceUnreachable(str(exc)) from exc
+        except httpx.HTTPError as exc:
+            raise ServiceUnreachable(str(exc)) from exc
+
+    def _raise_for_status(self, resp):
+        if resp.status_code in (401, 403):
+            # Never surface the key or auth detail to the caller.
+            logger.error(
+                "%s authentication failed (HTTP %s)",
+                self.PROVIDER_LABEL,
+                resp.status_code,
+            )
+            raise ServiceUnreachable("authentication failed")
+        if resp.status_code == 404:
+            raise ModelNotAvailable(self.model)
+        if resp.status_code >= 400:
+            raise ServiceUnreachable(f"HTTP {resp.status_code}: {resp.text[:200]}")
+
+    def _content_blocks(self, resp):
+        try:
+            blocks = resp.json()["content"]
+        except (KeyError, TypeError, ValueError) as exc:
+            raise MalformedModelOutput(resp.text[:200]) from exc
+        if not isinstance(blocks, list):
+            raise MalformedModelOutput(resp.text[:200])
+        return blocks
+
+    def chat_json(self, messages, json_schema, think=None):
+        """Chat completion constrained to ``json_schema``; returns the parsed object.
+
+        Uses forced tool use : a single ``plan`` tool whose ``input_schema`` is
+        the routing schema, with ``tool_choice`` pinned to it. The model must
+        emit a ``tool_use`` block whose ``input`` is the structured plan. The
+        plan schema keeps a free-form ``arguments`` object, which Claude tool
+        input schemas accept; server-side validation in the engine is the real
+        safety net.
+        """
+        payload = self._base_payload(messages)
+        payload["tools"] = [
+            {
+                "name": self.PLAN_TOOL_NAME,
+                "description": "Return the execution plan for the question.",
+                "input_schema": json_schema,
+            }
+        ]
+        payload["tool_choice"] = {"type": "tool", "name": self.PLAN_TOOL_NAME}
+        resp = self._post(payload)
+        self._raise_for_status(resp)
+        for block in self._content_blocks(resp):
+            if block.get("type") == "tool_use" and block.get("name") == self.PLAN_TOOL_NAME:
+                parsed = block.get("input")
+                if not isinstance(parsed, dict):
+                    raise MalformedModelOutput(str(parsed)[:200])
+                return parsed
+        raise MalformedModelOutput(resp.text[:200])
+
+    def chat_text(self, messages):
+        """Plain-text chat completion."""
+        resp = self._post(self._base_payload(messages))
+        self._raise_for_status(resp)
+        text = "".join(
+            block.get("text", "")
+            for block in self._content_blocks(resp)
+            if block.get("type") == "text"
+        )
+        return text.strip()
+
+    def embed(self, texts):
+        """Embeddings are unsupported : Anthropic has no embeddings endpoint."""
+        raise ServiceUnreachable(
+            "The Claude provider does not support embeddings (Anthropic has no "
+            "embeddings API). Disable AI_ASSISTANT_SEMANTIC_ENABLED, or set "
+            "AI_ASSISTANT_PROVIDER to a provider with embeddings for indexing."
+        )
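The response-side handling in `chat_json` and `chat_text` can be exercised without Django or httpx. The following is a rough standalone sketch: `extract_plan` and `join_text` are hypothetical function names, `ValueError` stands in for the project's `MalformedModelOutput`, and the sample response is invented for illustration.

```python
def extract_plan(blocks, tool_name="plan"):
    """Return the `input` of the first tool_use block for `tool_name`."""
    for block in blocks:
        if block.get("type") == "tool_use" and block.get("name") == tool_name:
            plan = block.get("input")
            if not isinstance(plan, dict):
                # Stand-in for MalformedModelOutput in the real client.
                raise ValueError("tool_use input is not a JSON object")
            return plan
    # The model answered with text only, or called a different tool.
    raise ValueError("no tool_use block for the plan tool")


def join_text(blocks):
    """Concatenate the text blocks of a response and trim the result."""
    return "".join(
        block.get("text", "") for block in blocks if block.get("type") == "text"
    ).strip()


# Invented sample shaped like a Messages API response body.
sample = {
    "content": [
        {"type": "text", "text": " Routing the question. "},
        {"type": "tool_use", "name": "plan", "input": {"steps": []}},
    ]
}

plan = extract_plan(sample["content"])
summary = join_text(sample["content"])
```

Treating a missing `tool_use` block as an error, rather than trying to parse free text, keeps the routing path strict; the commit notes that engine-side validation remains the final check.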

assistant/providers/base.py

Lines changed: 7 additions & 2 deletions
@@ -55,8 +55,9 @@ def get_client():
     ``mistral`` (third-party API) is the default. ``openai`` covers OpenAI
     (ChatGPT) and any other OpenAI-compatible endpoint selected through
     ``AI_ASSISTANT_BASE_URL`` (vLLM, LiteLLM, LocalAI, Together, Groq...).
-    ``ollama`` (self-hosted local LLM) remains selectable for those who point
-    it at their own instance.
+    ``anthropic`` targets Claude through the native Messages API. ``ollama``
+    (self-hosted local LLM) remains selectable for those who point it at their
+    own instance.
     """
     provider = (settings.AI_ASSISTANT_PROVIDER or "mistral").lower()
     if provider == "ollama":
@@ -71,4 +72,8 @@ def get_client():
         from assistant.providers.openai_compatible import OpenAICompatibleClient
 
         return OpenAICompatibleClient()
+    if provider in ("anthropic", "claude"):
+        from assistant.providers.anthropic import AnthropicClient
+
+        return AnthropicClient()
     raise ServiceUnreachable(f"Unknown AI assistant provider: {provider!r}")
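The alias handling in `get_client()` amounts to normalising the setting before dispatch. Here is a simplified sketch of that idea, assuming a table-driven variant instead of the `if` chain used in the commit; `PROVIDER_ALIASES`, `KNOWN_PROVIDERS`, and `normalize_provider` are hypothetical names, and `ValueError` stands in for `ServiceUnreachable`.

```python
# Hypothetical tables; the provider names themselves come from the diff.
PROVIDER_ALIASES = {"claude": "anthropic"}
KNOWN_PROVIDERS = {"mistral", "openai", "ollama", "anthropic"}


def normalize_provider(raw):
    """Map a raw AI_ASSISTANT_PROVIDER value to a canonical provider name."""
    # An unset or empty value falls back to the default provider.
    name = (raw or "mistral").lower()
    # Resolve aliases such as "claude" to their canonical name.
    name = PROVIDER_ALIASES.get(name, name)
    if name not in KNOWN_PROVIDERS:
        # Stand-in for ServiceUnreachable in the real factory.
        raise ValueError(f"Unknown AI assistant provider: {name!r}")
    return name


from_alias = normalize_provider("Claude")
from_default = normalize_provider(None)
```

The commit keeps the lazy imports inside each branch, so a table like this would only choose the name; importing the client class would still happen at dispatch time.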

assistant/tests/test_anthropic.py

Lines changed: 138 additions & 0 deletions
@@ -0,0 +1,138 @@
+"""Unit tests for the Anthropic (Claude) provider client (no real sockets)."""
+
+import json
+
+import httpx
+import pytest
+from django.test import override_settings
+
+from assistant.providers.anthropic import AnthropicClient
+from assistant.providers.base import (
+    MalformedModelOutput,
+    ModelNotAvailable,
+    ServiceUnreachable,
+)
+
+
+class FakeResponse:
+    def __init__(self, status_code=200, payload=None, text=""):
+        self.status_code = status_code
+        self._payload = payload if payload is not None else {}
+        self.text = text or json.dumps(self._payload)
+
+    def json(self):
+        return self._payload
+
+
+def _patch_post(monkeypatch, responses):
+    calls = []
+
+    def fake_post(url, json=None, headers=None, timeout=None):
+        calls.append({"url": url, "payload": dict(json), "headers": dict(headers or {})})
+        item = responses.pop(0)
+        if isinstance(item, Exception):
+            raise item
+        return item
+
+    monkeypatch.setattr(httpx, "post", fake_post)
+    return calls
+
+
+def _client():
+    return AnthropicClient(
+        base_url="https://api.anthropic.com/v1",
+        model="claude-opus-4-8",
+        api_key="sk-ant-test",
+    )
+
+
+def test_chat_json_uses_forced_tool_and_returns_input(monkeypatch):
+    payload = {"content": [{"type": "tool_use", "name": "plan", "input": {"steps": []}}]}
+    calls = _patch_post(monkeypatch, [FakeResponse(payload=payload)])
+    result = _client().chat_json(
+        [
+            {"role": "system", "content": "route this"},
+            {"role": "user", "content": "hi"},
+        ],
+        {"type": "object"},
+    )
+    assert result == {"steps": []}
+    call = calls[0]
+    assert call["url"].endswith("/messages")
+    # Native auth headers, not Bearer.
+    assert call["headers"]["x-api-key"] == "sk-ant-test"
+    assert call["headers"]["anthropic-version"] == "2023-06-01"
+    body = call["payload"]
+    assert body["model"] == "claude-opus-4-8"
+    assert body["max_tokens"] >= 1
+    # System prompt is hoisted to the top-level field, not a message.
+    assert body["system"] == "route this"
+    assert body["messages"] == [{"role": "user", "content": "hi"}]
+    # Sampling params / thinking must NOT be sent (400 on the Opus family).
+    assert "temperature" not in body
+    assert "thinking" not in body
+    # Structured output via forced tool use.
+    assert body["tools"][0]["name"] == "plan"
+    assert body["tools"][0]["input_schema"] == {"type": "object"}
+    assert body["tool_choice"] == {"type": "tool", "name": "plan"}
+
+
+def test_chat_json_without_tool_use_raises_malformed(monkeypatch):
+    payload = {"content": [{"type": "text", "text": "no plan here"}]}
+    _patch_post(monkeypatch, [FakeResponse(payload=payload)])
+    with pytest.raises(MalformedModelOutput):
+        _client().chat_json([{"role": "user", "content": "hi"}], {"type": "object"})
+
+
+def test_chat_text_concatenates_text_blocks(monkeypatch):
+    payload = {"content": [{"type": "text", "text": " Two open "}, {"type": "text", "text": "risks. "}]}
+    _patch_post(monkeypatch, [FakeResponse(payload=payload)])
+    assert _client().chat_text([{"role": "user", "content": "hi"}]) == "Two open risks."
+
+
+@override_settings(AI_ASSISTANT_BASE_URL="", AI_ASSISTANT_MODEL="claude-opus-4-8")
+def test_defaults_to_anthropic_endpoint(monkeypatch):
+    payload = {"content": [{"type": "text", "text": "ok"}]}
+    calls = _patch_post(monkeypatch, [FakeResponse(payload=payload)])
+    AnthropicClient(api_key="sk-ant-test").chat_text([{"role": "user", "content": "hi"}])
+    assert calls[0]["url"] == "https://api.anthropic.com/v1/messages"
+
+
+def test_missing_api_key_raises_clear_error(monkeypatch):
+    def boom(*a, **k):
+        raise AssertionError("must not hit the network without an API key")
+
+    monkeypatch.setattr(httpx, "post", boom)
+    client = AnthropicClient(base_url="https://api.anthropic.com/v1", model="m", api_key="")
+    with pytest.raises(ServiceUnreachable) as exc:
+        client.chat_text([{"role": "user", "content": "hi"}])
+    assert "API key" in str(exc.value)
+
+
+def test_auth_error_maps_to_unreachable_without_leaking(monkeypatch):
+    _patch_post(monkeypatch, [FakeResponse(status_code=401, text="invalid x-api-key")])
+    with pytest.raises(ServiceUnreachable) as exc:
+        _client().chat_text([{"role": "user", "content": "hi"}])
+    assert "sk-ant-test" not in str(exc.value)
+
+
+def test_unknown_model_maps_to_model_not_available(monkeypatch):
+    _patch_post(monkeypatch, [FakeResponse(status_code=404, text="model not found")])
+    with pytest.raises(ModelNotAvailable):
+        _client().chat_text([{"role": "user", "content": "hi"}])
+
+
+def test_connect_error_maps_to_unreachable(monkeypatch):
+    _patch_post(monkeypatch, [httpx.ConnectError("refused")])
+    with pytest.raises(ServiceUnreachable):
+        _client().chat_text([{"role": "user", "content": "hi"}])
+
+
+def test_embed_is_unsupported(monkeypatch):
+    def boom(*a, **k):
+        raise AssertionError("embed must not hit the network")
+
+    monkeypatch.setattr(httpx, "post", boom)
+    with pytest.raises(ServiceUnreachable) as exc:
+        _client().embed(["x"])
+    assert "embeddings" in str(exc.value).lower()
