Commit 2efd5a7

Merge pull request #56 from ggozad/feat/capabilities
Add SkillsCapability, thinking configuration, and script timeout
2 parents 403d8ce + 03992c2 commit 2efd5a7

11 files changed: 399 additions, 50 deletions

CHANGELOG.md

Lines changed: 14 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -2,6 +2,20 @@
22

33
## [Unreleased]
44

5+
### Added
6+
7+
- **`SkillsCapability`**: New pydantic-ai capability wrapping `SkillToolset` + system prompt. Provides a single-line integration path via `Agent(capabilities=[SkillsCapability(...)])`. `SkillToolset` remains available for advanced use cases.
8+
- **Skill thinking configuration**: Skills can specify a `thinking` effort level (`True`, `'low'`, `'medium'`, `'high'`, etc.) to configure reasoning on their sub-agents. Supported across providers via pydantic-ai's unified thinking setting.
9+
10+
### Fixed
11+
12+
- **Script timeout**: `run_script` now enforces a timeout (default 120s, configurable via the `HAIKU_SKILLS_SCRIPT_TIMEOUT` env var). Previously, a hanging script would block the agent forever.
13+
14+
### Changed
15+
16+
- Bump pydantic-ai dependency from `>=1.63.0` to `>=1.71.0`.
17+
- AG-UI state restoration now uses pydantic-ai's `for_run()` hook instead of overriding `get_tools()`.
18+
519
## [0.11.0] - 2026-03-26
620

721
### Added

docs/tutorial.md

Lines changed: 78 additions & 0 deletions
@@ -51,6 +51,48 @@ print(result.output)
5151

5252
By default, `SkillToolset` exposes a single `execute_skill` tool. When the agent calls it, a focused sub-agent spins up with that skill's instructions and tools — then returns the result. The main agent never sees the skill's internal tools. For an alternative approach where the main agent calls skill tools directly, see [Direct mode](#direct-mode).
5353

54+
### Using SkillsCapability
55+
56+
`SkillsCapability` bundles the toolset and system prompt into a single pydantic-ai [capability](https://ai.pydantic.dev/capabilities/):
57+
58+
```python
59+
from pathlib import Path

from pydantic_ai import Agent
60+
from haiku.skills import SkillsCapability
61+
62+
agent = Agent(
63+
"anthropic:claude-sonnet-4-5-20250929",
64+
capabilities=[
65+
SkillsCapability(
66+
skill_paths=[Path("./skills")],
67+
skill_model="openai:gpt-4o-mini",
68+
),
69+
],
70+
)
71+
```
72+
73+
This is equivalent to the `SkillToolset` + `build_system_prompt` setup above.
74+
75+
`SkillsCapability` accepts the same parameters as `SkillToolset`:
76+
77+
```python
78+
# Load skills from installed entrypoint packages
79+
cap = SkillsCapability(use_entrypoints=True)
80+
81+
# Custom preamble for the system prompt
82+
cap = SkillsCapability(
83+
skill_paths=[Path("./skills")],
84+
preamble="You are a data science assistant.",
85+
)
86+
87+
# Direct mode (no sub-agents)
88+
cap = SkillsCapability(
89+
skill_paths=[Path("./skills")],
90+
use_subagents=False,
91+
)
92+
```
93+
94+
The underlying toolset is accessible via `cap.toolset` for advanced use (registry access, state snapshots, AG-UI event sink).
95+
5496
!!! note
5597
When a skill directory has validation errors (bad frontmatter, name mismatch, etc.), the error is collected and discovery continues. The CLI prints these as warnings to stderr.
5698

@@ -80,6 +122,12 @@ The `run_script` tool resolves the right executor based on file extension:
80122

81123
The skill directory is prepended to `PYTHONPATH`, so Python scripts can import sibling modules.
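A minimal sketch of what prepending the skill directory to `PYTHONPATH` can look like when building the environment for a child process. The helper name `build_script_env` is hypothetical and is not taken from haiku.skills; the library's actual implementation may differ.

```python
import os
from pathlib import Path
from typing import Optional


def build_script_env(
    skill_dir: Path, base_env: Optional[dict[str, str]] = None
) -> dict[str, str]:
    """Return a copy of the environment with skill_dir first on PYTHONPATH.

    Hypothetical helper, for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    # Prepend so that sibling modules in the skill directory are found
    # before same-named modules elsewhere on the path.
    parts = [str(skill_dir)] + ([existing] if existing else [])
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env
```

Because the directory goes first, a script in the skill can `import` a module that sits next to it without any packaging step.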
82124

125+
Scripts are subject to a timeout (default 120 seconds). Override it with the `HAIKU_SKILLS_SCRIPT_TIMEOUT` environment variable:
126+
127+
```bash
128+
export HAIKU_SKILLS_SCRIPT_TIMEOUT=300 # 5 minutes
129+
```
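The timeout behaviour can be sketched with `asyncio.wait_for` around `communicate()`, which is the pattern this commit adds to `run_script`. The function names `resolve_timeout` and `run_with_timeout` are illustrative assumptions, not part of the haiku.skills API.

```python
import asyncio
import os
import sys
from typing import Optional

DEFAULT_TIMEOUT = 120.0  # seconds, matching the documented default


def resolve_timeout(explicit: Optional[float] = None) -> float:
    """An explicit argument wins, then the environment variable, then the default."""
    if explicit is not None:
        return explicit
    return float(os.environ.get("HAIKU_SKILLS_SCRIPT_TIMEOUT", DEFAULT_TIMEOUT))


async def run_with_timeout(argv: list[str], timeout: float) -> str:
    """Run argv and kill the process if it exceeds timeout seconds."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()        # stop the hanging child
        await proc.wait()  # reap it so no zombie process is left behind
        raise RuntimeError(f"timed out after {timeout}s") from None
    return stdout.decode().strip()


if __name__ == "__main__":
    # A quick command finishes well inside the limit.
    argv = [sys.executable, "-c", "print('ok')"]
    print(asyncio.run(run_with_timeout(argv, resolve_timeout())))
```

Killing and then awaiting the process matters: cancelling `communicate()` alone would leave the child running in the background.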
130+
83131
### Resources
84132

85133
Skills can expose files (references, templates, data) as resources. Any non-script, non-Python file in the skill directory is automatically discovered. The sub-agent receives a `read_resource` tool to read these on demand. This works for both filesystem and entrypoint skills (entrypoint skills must set `path` in their factory).
@@ -256,6 +304,36 @@ toolset.build_state_snapshot() # {"calculator": {"history": []}}
256304

257305
When `execute_skill` runs a skill whose tools modify state, the toolset computes a JSON Patch delta and returns it as a `StateDeltaEvent` — compatible with the [AG-UI protocol](ag-ui.md).
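To make the idea of a JSON Patch delta concrete, here is a stdlib-only sketch that compares two state snapshots and emits RFC 6902 style operations. The function `diff_state` is a hypothetical illustration; the project lists `jsonpatch` as a dependency, and this sketch is not that library and does not claim to match how the toolset computes its delta.

```python
from typing import Any


def _escape(key: str) -> str:
    # JSON Pointer (RFC 6901) escaping: "~" becomes "~0", "/" becomes "~1".
    return key.replace("~", "~0").replace("/", "~1")


def diff_state(
    old: dict[str, Any], new: dict[str, Any], path: str = ""
) -> list[dict[str, Any]]:
    """Return JSON Patch style operations that turn old into new.

    Recurses into nested dicts; any other changed value (lists included)
    is replaced wholesale.
    """
    ops: list[dict[str, Any]] = []
    for key in old:
        pointer = f"{path}/{_escape(key)}"
        if key not in new:
            ops.append({"op": "remove", "path": pointer})
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            ops.extend(diff_state(old[key], new[key], pointer))
        elif old[key] != new[key]:
            ops.append({"op": "replace", "path": pointer, "value": new[key]})
    for key in new:
        if key not in old:
            pointer = f"{path}/{_escape(key)}"
            ops.append({"op": "add", "path": pointer, "value": new[key]})
    return ops


before = {"calculator": {"history": []}}
after = {"calculator": {"history": ["2 + 2 = 4"]}}
delta = diff_state(before, after)
# delta == [{"op": "replace", "path": "/calculator/history", "value": ["2 + 2 = 4"]}]
```

Sending only the delta keeps the event small: a client that already holds the previous snapshot can apply the operations instead of receiving the full state again.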
258306

307+
### Configuring thinking
308+
309+
Skills can request a thinking/reasoning effort level for their sub-agent:
310+
311+
```python
312+
def create_skill() -> Skill:
313+
metadata, instructions = parse_skill_md(Path(__file__).parent / "SKILL.md")
314+
315+
return Skill(
316+
metadata=metadata,
317+
instructions=instructions,
318+
tools=[solve],
319+
thinking="high", # enable deep reasoning
320+
)
321+
```
322+
323+
Supported values:
324+
325+
| Value | Meaning |
326+
|-------|---------|
327+
| `True` | Enable thinking at the provider's default effort |
328+
| `False` | Disable thinking (ignored on always-on models) |
329+
| `'minimal'`, `'low'`, `'medium'`, `'high'`, `'xhigh'` | Specific effort level |
330+
| `None` (default) | Don't configure thinking — use model defaults |
331+
332+
This uses pydantic-ai's [unified thinking setting](https://ai.pydantic.dev/thinking/), which works across Anthropic, OpenAI, Google, and other providers. Provider-specific thinking settings take precedence when both are set.
333+
334+
!!! note
335+
`thinking` only applies in sub-agent mode (the default). In direct mode (`use_subagents=False`), skill tools run inside the main agent — the main agent's model settings control thinking, not the skill's.
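The rules in the table and the note can be sketched as a small decision function. This is an illustration only: a plain `dict` stands in for pydantic-ai's `ModelSettings`, and `ThinkingValue` and `settings_for` are hypothetical names that do not exist in haiku.skills.

```python
from typing import Literal, Optional, Union

# Stand-in for pydantic-ai's ThinkingLevel; the values mirror the table above.
ThinkingValue = Union[bool, Literal["minimal", "low", "medium", "high", "xhigh"]]


def settings_for(
    thinking: Optional[ThinkingValue], use_subagents: bool = True
) -> Optional[dict]:
    """Return per-run settings for a skill's sub-agent, or None for model defaults."""
    if not use_subagents:
        # Direct mode: skill tools run in the main agent, so the main
        # agent's own settings govern thinking.
        return None
    if thinking is None:
        # Not configured: pass nothing so the model's defaults apply.
        return None
    # False is a real setting (disable thinking), so it is tested with
    # "is None" above rather than by truthiness.
    return {"thinking": thinking}
```

The `is None` check is the important detail: testing `if not thinking` would silently drop an explicit `False` and treat it like "not configured".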
336+
259337
## MCP skills
260338

261339
Any [MCP](https://modelcontextprotocol.io/) server can be wrapped as a skill using `skill_from_mcp`. The MCP server's tools become the sub-agent's tools — the main agent still only sees `execute_skill`.

haiku/skills/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@
44
resolve_model,
55
run_agui_stream,
66
)
7+
from haiku.skills.capability import SkillsCapability
78
from haiku.skills.mcp import skill_from_mcp
89
from haiku.skills.models import (
910
Skill,

haiku/skills/agent.py

Lines changed: 44 additions & 23 deletions
@@ -22,7 +22,8 @@
2222
RetryPromptPart,
2323
)
2424
from pydantic_ai.models import Model
25-
from pydantic_ai.toolsets import FunctionToolset, ToolsetTool
25+
from pydantic_ai.settings import ModelSettings
26+
from pydantic_ai.toolsets import FunctionToolset
2627

2728
from haiku.skills.models import Skill
2829
from haiku.skills.prompts import SKILL_PROMPT
@@ -139,10 +140,22 @@ def _discover_scripts(skill: Skill) -> list[str]:
139140
)
140141

141142

142-
def _create_run_script(skill: Skill) -> Callable[..., Any]:
143+
SCRIPT_TIMEOUT_DEFAULT = 120.0
144+
145+
146+
def _create_run_script(
147+
skill: Skill, timeout: float | None = None
148+
) -> Callable[..., Any]:
143149
"""Create a run_script tool bound to a specific skill."""
144150
assert skill.path is not None
145151
scripts_dir = (skill.path / "scripts").resolve()
152+
resolved_timeout = (
153+
timeout
154+
if timeout is not None
155+
else float(
156+
os.environ.get("HAIKU_SKILLS_SCRIPT_TIMEOUT", SCRIPT_TIMEOUT_DEFAULT)
157+
)
158+
)
146159

147160
async def run_script(script: str, arguments: str = "") -> str:
148161
"""Execute a script from the skill's scripts/ directory.
@@ -171,7 +184,14 @@ async def run_script(script: str, arguments: str = "") -> str:
171184
stdout=asyncio.subprocess.PIPE,
172185
stderr=asyncio.subprocess.PIPE,
173186
)
174-
stdout, stderr = await proc.communicate()
187+
try:
188+
stdout, stderr = await asyncio.wait_for(
189+
proc.communicate(), timeout=resolved_timeout
190+
)
191+
except TimeoutError:
192+
proc.kill()
193+
await proc.wait()
194+
raise RuntimeError(f"Script {script} timed out after {resolved_timeout}s")
175195
if proc.returncode != 0:
176196
output = stderr.decode().strip() or stdout.decode().strip()
177197
raise RuntimeError(
@@ -249,11 +269,15 @@ async def event_handler(
249269
tools=tools,
250270
toolsets=skill.toolsets or None,
251271
)
272+
model_settings = (
273+
ModelSettings(thinking=skill.thinking) if skill.thinking is not None else None
274+
)
252275
result = await agent.run(
253276
request,
254277
deps=deps,
255278
usage_limits=UsageLimits(request_limit=20),
256279
event_stream_handler=event_handler,
280+
model_settings=model_settings,
257281
)
258282
text = result.output
259283
return text, collected_events, emitted_events
@@ -308,30 +332,27 @@ def _register_skill_state(self, skill: Skill) -> None:
308332
else:
309333
self._namespaces[namespace] = skill.state_type()
310334

311-
async def get_tools(self, ctx: RunContext[Any]) -> dict[str, ToolsetTool[Any]]:
312-
# Overridden to restore AG-UI state from deps before returning tools.
313-
# get_tools() is the only per-run hook with RunContext access in the
314-
# toolset API — there is no dedicated per-run setup method.
315-
self._maybe_restore_state(ctx)
316-
return await super().get_tools(ctx)
317-
318-
def _maybe_restore_state(self, ctx: RunContext[Any]) -> None:
319-
"""Restore namespace state from deps if it carries AG-UI state.
335+
async def for_run(self, ctx: RunContext[Any]) -> "SkillToolset":
336+
"""Restore AG-UI state from deps before the run starts.
320337
321338
Uses identity check (``is``) so we restore once per AG-UI request
322-
(each request creates a new dict) but not on every model step within
323-
a single run.
339+
(each request creates a new dict) but not redundantly within a run.
324340
"""
325341
deps = ctx.deps
326-
if deps is None or not hasattr(deps, "state"):
327-
return
328-
state = deps.state
329-
if not isinstance(state, dict) or not state:
330-
return
331-
if state is self._last_restored_state:
332-
return
333-
self._last_restored_state = state
334-
self.restore_state_snapshot(state)
342+
if deps is not None and hasattr(deps, "state"):
343+
state = deps.state
344+
if (
345+
isinstance(state, dict)
346+
and state
347+
and state is not self._last_restored_state
348+
):
349+
self._last_restored_state = state
350+
self.restore_state_snapshot(state)
351+
return self
352+
353+
@property
354+
def use_subagents(self) -> bool:
355+
return self._use_subagents
335356

336357
@property
337358
def registry(self) -> SkillRegistry:

haiku/skills/capability.py

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
1+
from collections.abc import Callable
2+
from pathlib import Path
3+
from typing import Any
4+
5+
from pydantic_ai import RunContext
6+
from pydantic_ai.capabilities import AbstractCapability
7+
from pydantic_ai.models import Model
8+
9+
from haiku.skills.agent import SkillToolset
10+
from haiku.skills.models import Skill
11+
from haiku.skills.prompts import DEFAULT_PREAMBLE, build_system_prompt
12+
13+
14+
class SkillsCapability(AbstractCapability[Any]):
15+
"""Expose skills as a pydantic-ai capability.
16+
17+
Wraps a ``SkillToolset`` and provides both the toolset and a system prompt
18+
built from the skill catalog. The underlying toolset is accessible via the
19+
``toolset`` attribute for advanced use (state, registry, event sink, etc.).
20+
"""
21+
22+
def __init__(
23+
self,
24+
*,
25+
skills: list[Skill] | None = None,
26+
skill_paths: list[Path] | None = None,
27+
use_entrypoints: bool = False,
28+
skill_model: str | Model | None = None,
29+
use_subagents: bool = True,
30+
preamble: str = DEFAULT_PREAMBLE,
31+
) -> None:
32+
self.toolset = SkillToolset(
33+
skills=skills,
34+
skill_paths=skill_paths,
35+
use_entrypoints=use_entrypoints,
36+
skill_model=skill_model,
37+
use_subagents=use_subagents,
38+
)
39+
self.preamble = preamble
40+
41+
def get_toolset(self) -> SkillToolset:
42+
return self.toolset
43+
44+
def get_instructions(self) -> Callable[[RunContext[Any]], str]:
45+
def _instructions(ctx: RunContext[Any]) -> str:
46+
return build_system_prompt(
47+
self.toolset.skill_catalog,
48+
preamble=self.preamble,
49+
use_subagents=self.toolset.use_subagents,
50+
)
51+
52+
return _instructions

haiku/skills/models.py

Lines changed: 13 additions & 0 deletions
@@ -8,6 +8,7 @@
88
from pydantic import BaseModel, ConfigDict, Field, PrivateAttr, field_validator
99
from pydantic_ai import Tool
1010
from pydantic_ai.models import Model
11+
from pydantic_ai.settings import ThinkingLevel
1112
from pydantic_ai.toolsets import AbstractToolset
1213

1314

@@ -84,6 +85,7 @@ class Skill(BaseModel):
8485
_toolsets: list[AbstractToolset[Any]] = PrivateAttr(default_factory=list)
8586
_state_type: type[BaseModel] | None = PrivateAttr(default=None)
8687
_state_namespace: str | None = PrivateAttr(default=None)
88+
_thinking: ThinkingLevel | None = PrivateAttr(default=None)
8789
_factory: Callable[..., "Skill"] | None = PrivateAttr(default=None)
8890

8991
def __init__(
@@ -93,13 +95,15 @@ def __init__(
9395
toolsets: Sequence[AbstractToolset[Any]] | None = None,
9496
state_type: type[BaseModel] | None = None,
9597
state_namespace: str | None = None,
98+
thinking: ThinkingLevel | None = None,
9699
**data: Any,
97100
) -> None:
98101
super().__init__(**data)
99102
self._tools = list(tools) if tools else []
100103
self._toolsets = list(toolsets) if toolsets else []
101104
self._state_type = state_type
102105
self._state_namespace = state_namespace
106+
self._thinking = thinking
103107

104108
@property
105109
def tools(self) -> list[Tool | Callable[..., Any]]:
@@ -133,6 +137,14 @@ def state_namespace(self) -> str | None:
133137
def state_namespace(self, value: str | None) -> None:
134138
self._state_namespace = value
135139

140+
@property
141+
def thinking(self) -> ThinkingLevel | None:
142+
return self._thinking
143+
144+
@thinking.setter
145+
def thinking(self, value: ThinkingLevel | None) -> None:
146+
self._thinking = value
147+
136148
def reconfigure(self, **kwargs: Any) -> None:
137149
"""Re-create this skill with new factory arguments.
138150
@@ -146,6 +158,7 @@ def reconfigure(self, **kwargs: Any) -> None:
146158
self._toolsets = new_skill._toolsets
147159
self._state_type = new_skill._state_type
148160
self._state_namespace = new_skill._state_namespace
161+
self._thinking = new_skill._thinking
149162
self.model = new_skill.model
150163

151164
def state_metadata(self) -> StateMetadata | None:

pyproject.toml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -31,15 +31,15 @@ dependencies = [
3131
"ag-ui-protocol>=0.1.13",
3232
"jsonpatch>=1.33",
3333
"pydantic>=2.12.5",
34-
"pydantic-ai-slim[mcp,openai]>=1.63.0",
34+
"pydantic-ai-slim[mcp,openai]>=1.71.0",
3535
"pyyaml>=6.0.3",
3636
"skills-ref>=0.1.1",
3737
]
3838

3939
[project.optional-dependencies]
4040
signing = ["pathspec>=0.12.0", "sigstore>=4.0.0"]
4141
tui = [
42-
"pydantic-ai-slim[logfire]>=1.63.0",
42+
"pydantic-ai-slim[logfire]>=1.71.0",
4343
"textual>=8.0.0",
4444
"typer>=0.24.1",
4545
"python-dotenv>=1.2.1",
@@ -70,7 +70,7 @@ dev = [
7070
"pytest-asyncio>=1.3.0",
7171
"pytest-cov>=7.0.0",
7272
"pytest-recording>=0.13.4",
73-
"pydantic-ai-slim[openai,logfire]>=1.63.0",
73+
"pydantic-ai-slim[openai,logfire]>=1.71.0",
7474
"ruff>=0.15.2",
7575
"pathspec>=0.12.0",
7676
"sigstore>=4.0.0",
