ggozad
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 14 additions & 0 deletions
diff --git a/docs/tutorial.md b/docs/tutorial.md
Lines changed: 78 additions & 0 deletions
diff --git a/haiku/skills/__init__.py b/haiku/skills/__init__.py
Lines changed: 1 addition & 0 deletions
diff --git a/haiku/skills/agent.py b/haiku/skills/agent.py
Lines changed: 44 additions & 23 deletions
diff --git a/haiku/skills/capability.py b/haiku/skills/capability.py
Lines changed: 52 additions & 0 deletions
diff --git a/haiku/skills/models.py b/haiku/skills/models.py
Lines changed: 13 additions & 0 deletions
diff --git a/pyproject.toml b/pyproject.toml
Lines changed: 3 additions & 3 deletions
@@ -2,6 +2,20 @@
 
 ## [Unreleased]
 
+### Added
+
+- **`SkillsCapability`**: New pydantic-ai capability wrapping `SkillToolset` + system prompt. Provides a single-line integration path via `Agent(capabilities=[SkillsCapability(...)])`. `SkillToolset` remains available for advanced use cases.
+- **Skill thinking configuration**: Skills can specify `thinking` effort level (`True`, `'low'`, `'medium'`, `'high'`, etc.) to configure reasoning on their sub-agents. Supported across providers via pydantic-ai's unified thinking setting.
+
+### Fixed
+
+- **Script timeout**: `run_script` now enforces a timeout (default 120s, configurable via `HAIKU_SKILLS_SCRIPT_TIMEOUT` env var). Previously a hanging script would block the agent forever.
+
+### Changed
+
+- Bump pydantic-ai dependency from `>=1.63.0` to `>=1.71.0`.
+- AG-UI state restoration now uses pydantic-ai's `for_run()` hook instead of overriding `get_tools()`.
+
 ## [0.11.0] - 2026-03-26
 
 ### Added
 
@@ -51,6 +51,48 @@ print(result.output)
 
 By default, `SkillToolset` exposes a single `execute_skill` tool. When the agent calls it, a focused sub-agent spins up with that skill's instructions and tools — then returns the result. The main agent never sees the skill's internal tools. For an alternative approach where the main agent calls skill tools directly, see [Direct mode](#direct-mode).
 
+### Using SkillsCapability
+
+`SkillsCapability` bundles the toolset and system prompt into a single pydantic-ai [capability](https://ai.pydantic.dev/capabilities/):
+
+```python
+from pydantic_ai import Agent
+from haiku.skills import SkillsCapability
+
+agent = Agent(
+    "anthropic:claude-sonnet-4-5-20250929",
+    capabilities=[
+        SkillsCapability(
+            skill_paths=[Path("./skills")],
+            skill_model="openai:gpt-4o-mini",
+        ),
+    ],
+)
+```
+
+This is equivalent to the `SkillToolset` + `build_system_prompt` setup above.
+
+`SkillsCapability` accepts the same parameters as `SkillToolset`:
+
+```python
+# Load skills from installed entrypoint packages
+cap = SkillsCapability(use_entrypoints=True)
+
+# Custom preamble for the system prompt
+cap = SkillsCapability(
+    skill_paths=[Path("./skills")],
+    preamble="You are a data science assistant.",
+)
+
+# Direct mode (no sub-agents)
+cap = SkillsCapability(
+    skill_paths=[Path("./skills")],
+    use_subagents=False,
+)
+```
+
+The underlying toolset is accessible via `cap.toolset` for advanced use (registry access, state snapshots, AG-UI event sink).
+
 !!! note
     When a skill directory has validation errors (bad frontmatter, name mismatch, etc.), the error is collected and discovery continues. The CLI prints these as warnings to stderr.
 
@@ -80,6 +122,12 @@ The `run_script` tool resolves the right executor based on file extension:
 
 The skill directory is prepended to `PYTHONPATH`, so Python scripts can import sibling modules.
 
+Scripts are subject to a timeout (default 120 seconds). Override it with the `HAIKU_SKILLS_SCRIPT_TIMEOUT` environment variable:
+
+```bash
+export HAIKU_SKILLS_SCRIPT_TIMEOUT=300  # 5 minutes
+```
+
 ### Resources
 
 Skills can expose files (references, templates, data) as resources. Any non-script, non-Python file in the skill directory is automatically discovered. The sub-agent receives a `read_resource` tool to read these on demand. This works for both filesystem and entrypoint skills (entrypoint skills must set `path` in their factory).
@@ -256,6 +304,36 @@ toolset.build_state_snapshot()    # {"calculator": {"history": []}}
 
 When `execute_skill` runs a skill whose tools modify state, the toolset computes a JSON Patch delta and returns it as a `StateDeltaEvent` — compatible with the [AG-UI protocol](ag-ui.md).
 
+### Configuring thinking
+
+Skills can request a thinking/reasoning effort level for their sub-agent:
+
+```python
+def create_skill() -> Skill:
+    metadata, instructions = parse_skill_md(Path(__file__).parent / "SKILL.md")
+
+    return Skill(
+        metadata=metadata,
+        instructions=instructions,
+        tools=[solve],
+        thinking="high",   # enable deep reasoning
+    )
+```
+
+Supported values:
+
+| Value | Meaning |
+|-------|---------|
+| `True` | Enable thinking at the provider's default effort |
+| `False` | Disable thinking (ignored on always-on models) |
+| `'minimal'`, `'low'`, `'medium'`, `'high'`, `'xhigh'` | Specific effort level |
+| `None` (default) | Don't configure thinking — use model defaults |
+
+This uses pydantic-ai's [unified thinking setting](https://ai.pydantic.dev/thinking/), which works across Anthropic, OpenAI, Google, and other providers. Provider-specific thinking settings take precedence when both are set.
+
+!!! note
+    `thinking` only applies in sub-agent mode (the default). In direct mode (`use_subagents=False`), skill tools run inside the main agent — the main agent's model settings control thinking, not the skill's.
+
 ## MCP skills
 
 Any [MCP](https://modelcontextprotocol.io/) server can be wrapped as a skill using `skill_from_mcp`. The MCP server's tools become the sub-agent's tools — the main agent still only sees `execute_skill`.
 
@@ -4,6 +4,7 @@
     resolve_model,
     run_agui_stream,
 )
+from haiku.skills.capability import SkillsCapability
 from haiku.skills.mcp import skill_from_mcp
 from haiku.skills.models import (
     Skill,
 
@@ -22,7 +22,8 @@
     RetryPromptPart,
 )
 from pydantic_ai.models import Model
-from pydantic_ai.toolsets import FunctionToolset, ToolsetTool
+from pydantic_ai.settings import ModelSettings
+from pydantic_ai.toolsets import FunctionToolset
 
 from haiku.skills.models import Skill
 from haiku.skills.prompts import SKILL_PROMPT
@@ -139,10 +140,22 @@ def _discover_scripts(skill: Skill) -> list[str]:
     )
 
 
-def _create_run_script(skill: Skill) -> Callable[..., Any]:
+SCRIPT_TIMEOUT_DEFAULT = 120.0
+
+
+def _create_run_script(
+    skill: Skill, timeout: float | None = None
+) -> Callable[..., Any]:
     """Create a run_script tool bound to a specific skill."""
     assert skill.path is not None
     scripts_dir = (skill.path / "scripts").resolve()
+    resolved_timeout = (
+        timeout
+        if timeout is not None
+        else float(
+            os.environ.get("HAIKU_SKILLS_SCRIPT_TIMEOUT", SCRIPT_TIMEOUT_DEFAULT)
+        )
+    )
 
     async def run_script(script: str, arguments: str = "") -> str:
         """Execute a script from the skill's scripts/ directory.
@@ -171,7 +184,14 @@ async def run_script(script: str, arguments: str = "") -> str:
             stdout=asyncio.subprocess.PIPE,
             stderr=asyncio.subprocess.PIPE,
         )
-        stdout, stderr = await proc.communicate()
+        try:
+            stdout, stderr = await asyncio.wait_for(
+                proc.communicate(), timeout=resolved_timeout
+            )
+        except TimeoutError:
+            proc.kill()
+            await proc.wait()
+            raise RuntimeError(f"Script {script} timed out after {resolved_timeout}s")
         if proc.returncode != 0:
             output = stderr.decode().strip() or stdout.decode().strip()
             raise RuntimeError(
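
The two hunks above resolve a timeout (explicit argument first, then the `HAIKU_SKILLS_SCRIPT_TIMEOUT` environment variable, then the 120 s default) and kill the subprocess when `communicate()` overruns it. A minimal standalone sketch of that pattern follows, using only the standard library. `resolve_timeout` and `run_with_timeout` are hypothetical names for illustration, not functions in `haiku.skills`. The sketch catches `asyncio.TimeoutError`, which is an alias of the builtin `TimeoutError` from Python 3.11 onward, so it also behaves the same on older interpreters.

```python
import asyncio
import os
import sys
from typing import List, Optional

SCRIPT_TIMEOUT_DEFAULT = 120.0


def resolve_timeout(timeout: Optional[float] = None) -> float:
    """Explicit argument wins, then the env var, then the default."""
    if timeout is not None:
        return timeout
    return float(
        os.environ.get("HAIKU_SKILLS_SCRIPT_TIMEOUT", SCRIPT_TIMEOUT_DEFAULT)
    )


async def run_with_timeout(argv: List[str], timeout: Optional[float] = None) -> str:
    """Run a command and kill it if it exceeds the resolved timeout."""
    limit = resolve_timeout(timeout)
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, _stderr = await asyncio.wait_for(proc.communicate(), timeout=limit)
    except asyncio.TimeoutError:
        # wait_for cancels communicate(); the child is still alive, so kill
        # it and reap it before reporting the failure.
        proc.kill()
        await proc.wait()
        raise RuntimeError(f"Command timed out after {limit}s") from None
    return stdout.decode().strip()
```

Killing and then awaiting `proc.wait()` avoids leaving a zombie process behind; raising `RuntimeError` mirrors how the diff reports non-zero exit codes.
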
@@ -249,11 +269,15 @@ async def event_handler(
         tools=tools,
         toolsets=skill.toolsets or None,
     )
+    model_settings = (
+        ModelSettings(thinking=skill.thinking) if skill.thinking is not None else None
+    )
     result = await agent.run(
         request,
         deps=deps,
         usage_limits=UsageLimits(request_limit=20),
         event_stream_handler=event_handler,
+        model_settings=model_settings,
     )
     text = result.output
     return text, collected_events, emitted_events
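
The hunk above builds `ModelSettings` only when `skill.thinking` is not `None`, so `False` (disable thinking) is still passed through instead of being treated as unset. The sketch below isolates that distinction with a plain dict. `settings_for` and `ThinkingSketch` are hypothetical names, and the dict is only a stand-in for pydantic-ai's `ModelSettings`; the real `ThinkingLevel` type may differ in detail.

```python
from typing import Literal, Optional, Union

# Stand-in for the values listed in the tutorial table (an assumption, not
# the actual pydantic-ai type definition).
ThinkingSketch = Union[bool, Literal["minimal", "low", "medium", "high", "xhigh"]]


def settings_for(thinking: Optional[ThinkingSketch]) -> Optional[dict]:
    """Return per-run settings, or None to leave model defaults untouched."""
    # Compare against None explicitly: a truthiness check would drop False,
    # which is a meaningful request to disable thinking.
    if thinking is None:
        return None
    return {"thinking": thinking}
```

With this shape, `settings_for(None)` yields no settings at all, while `settings_for(False)` yields a settings object that explicitly turns thinking off.
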
@@ -308,30 +332,27 @@ def _register_skill_state(self, skill: Skill) -> None:
         else:
             self._namespaces[namespace] = skill.state_type()
 
-    async def get_tools(self, ctx: RunContext[Any]) -> dict[str, ToolsetTool[Any]]:
-        # Overridden to restore AG-UI state from deps before returning tools.
-        # get_tools() is the only per-run hook with RunContext access in the
-        # toolset API — there is no dedicated per-run setup method.
-        self._maybe_restore_state(ctx)
-        return await super().get_tools(ctx)
-
-    def _maybe_restore_state(self, ctx: RunContext[Any]) -> None:
-        """Restore namespace state from deps if it carries AG-UI state.
+    async def for_run(self, ctx: RunContext[Any]) -> "SkillToolset":
+        """Restore AG-UI state from deps before the run starts.
 
         Uses identity check (``is``) so we restore once per AG-UI request
-        (each request creates a new dict) but not on every model step within
-        a single run.
+        (each request creates a new dict) but not redundantly within a run.
         """
         deps = ctx.deps
-        if deps is None or not hasattr(deps, "state"):
-            return
-        state = deps.state
-        if not isinstance(state, dict) or not state:
-            return
-        if state is self._last_restored_state:
-            return
-        self._last_restored_state = state
-        self.restore_state_snapshot(state)
+        if deps is not None and hasattr(deps, "state"):
+            state = deps.state
+            if (
+                isinstance(state, dict)
+                and state
+                and state is not self._last_restored_state
+            ):
+                self._last_restored_state = state
+                self.restore_state_snapshot(state)
+        return self
+
+    @property
+    def use_subagents(self) -> bool:
+        return self._use_subagents
 
     @property
     def registry(self) -> SkillRegistry:
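
`for_run()` above uses an identity check (`is not`) rather than equality to decide whether a state dict has already been restored. A small sketch of why that matters, with a hypothetical `StateHolder` class standing in for `SkillToolset`: the same dict object is restored once, while a new dict with equal contents (which, per the docstring, is what each AG-UI request supplies) triggers another restore.

```python
from typing import Any, Optional


class StateHolder:
    """Hypothetical stand-in for the restore-once logic in for_run()."""

    def __init__(self) -> None:
        self._last_restored_state: Optional[dict] = None
        self.restore_count = 0

    def maybe_restore(self, state: Any) -> None:
        # Skip non-dict or empty state, and skip the exact object seen last time.
        if (
            isinstance(state, dict)
            and state
            and state is not self._last_restored_state
        ):
            self._last_restored_state = state
            # The real code calls restore_state_snapshot(state) here.
            self.restore_count += 1
```

An equality check would treat two requests carrying identical state as the same request and skip the second restore; identity avoids that at the cost of assuming each request really does build a fresh dict.
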
 
@@ -0,0 +1,52 @@
+from collections.abc import Callable
+from pathlib import Path
+from typing import Any
+
+from pydantic_ai import RunContext
+from pydantic_ai.capabilities import AbstractCapability
+from pydantic_ai.models import Model
+
+from haiku.skills.agent import SkillToolset
+from haiku.skills.models import Skill
+from haiku.skills.prompts import DEFAULT_PREAMBLE, build_system_prompt
+
+
+class SkillsCapability(AbstractCapability[Any]):
+    """Expose skills as a pydantic-ai capability.
+
+    Wraps a ``SkillToolset`` and provides both the toolset and a system prompt
+    built from the skill catalog.  The underlying toolset is accessible via the
+    ``toolset`` attribute for advanced use (state, registry, event sink, etc.).
+    """
+
+    def __init__(
+        self,
+        *,
+        skills: list[Skill] | None = None,
+        skill_paths: list[Path] | None = None,
+        use_entrypoints: bool = False,
+        skill_model: str | Model | None = None,
+        use_subagents: bool = True,
+        preamble: str = DEFAULT_PREAMBLE,
+    ) -> None:
+        self.toolset = SkillToolset(
+            skills=skills,
+            skill_paths=skill_paths,
+            use_entrypoints=use_entrypoints,
+            skill_model=skill_model,
+            use_subagents=use_subagents,
+        )
+        self.preamble = preamble
+
+    def get_toolset(self) -> SkillToolset:
+        return self.toolset
+
+    def get_instructions(self) -> Callable[[RunContext[Any]], str]:
+        def _instructions(ctx: RunContext[Any]) -> str:
+            return build_system_prompt(
+                self.toolset.skill_catalog,
+                preamble=self.preamble,
+                use_subagents=self.toolset.use_subagents,
+            )
+
+        return _instructions
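
`get_instructions()` returns a callable rather than a string, so the prompt is built from `self.toolset.skill_catalog` when the callable is invoked, not when the capability is constructed. The following sketch illustrates that late evaluation with hypothetical stand-ins (`FakeToolset`, `SketchCapability`, and a simplified `build_prompt`). It does not use pydantic-ai, and the prompt format is an assumption, since the real `build_system_prompt` output is not shown in this diff.

```python
from typing import Callable, List


class FakeToolset:
    """Hypothetical toolset holding a mutable catalog of skill names."""

    def __init__(self, skills: List[str]) -> None:
        self.skill_catalog = list(skills)


def build_prompt(catalog: List[str], preamble: str) -> str:
    # Simplified placeholder for build_system_prompt (format is an assumption).
    return preamble + "\nSkills: " + ", ".join(catalog)


class SketchCapability:
    def __init__(self, toolset: FakeToolset, preamble: str) -> None:
        self.toolset = toolset
        self.preamble = preamble

    def get_instructions(self) -> Callable[[], str]:
        def _instructions() -> str:
            # The catalog is read at call time, so later changes are reflected.
            return build_prompt(self.toolset.skill_catalog, self.preamble)

        return _instructions
```

In this sketch, appending a skill to `toolset.skill_catalog` after `get_instructions()` has been called still changes what the returned callable produces on its next invocation.
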
@@ -8,6 +8,7 @@
 from pydantic import BaseModel, ConfigDict, Field, PrivateAttr, field_validator
 from pydantic_ai import Tool
 from pydantic_ai.models import Model
+from pydantic_ai.settings import ThinkingLevel
 from pydantic_ai.toolsets import AbstractToolset
 
 
@@ -84,6 +85,7 @@ class Skill(BaseModel):
     _toolsets: list[AbstractToolset[Any]] = PrivateAttr(default_factory=list)
     _state_type: type[BaseModel] | None = PrivateAttr(default=None)
     _state_namespace: str | None = PrivateAttr(default=None)
+    _thinking: ThinkingLevel | None = PrivateAttr(default=None)
     _factory: Callable[..., "Skill"] | None = PrivateAttr(default=None)
 
     def __init__(
@@ -93,13 +95,15 @@ def __init__(
         toolsets: Sequence[AbstractToolset[Any]] | None = None,
         state_type: type[BaseModel] | None = None,
         state_namespace: str | None = None,
+        thinking: ThinkingLevel | None = None,
         **data: Any,
     ) -> None:
         super().__init__(**data)
         self._tools = list(tools) if tools else []
         self._toolsets = list(toolsets) if toolsets else []
         self._state_type = state_type
         self._state_namespace = state_namespace
+        self._thinking = thinking
 
     @property
     def tools(self) -> list[Tool | Callable[..., Any]]:
@@ -133,6 +137,14 @@ def state_namespace(self) -> str | None:
     def state_namespace(self, value: str | None) -> None:
         self._state_namespace = value
 
+    @property
+    def thinking(self) -> ThinkingLevel | None:
+        return self._thinking
+
+    @thinking.setter
+    def thinking(self, value: ThinkingLevel | None) -> None:
+        self._thinking = value
+
     def reconfigure(self, **kwargs: Any) -> None:
         """Re-create this skill with new factory arguments.
 
@@ -146,6 +158,7 @@ def reconfigure(self, **kwargs: Any) -> None:
         self._toolsets = new_skill._toolsets
         self._state_type = new_skill._state_type
         self._state_namespace = new_skill._state_namespace
+        self._thinking = new_skill._thinking
         self.model = new_skill.model
 
     def state_metadata(self) -> StateMetadata | None:
 
@@ -31,15 +31,15 @@ dependencies = [
     "ag-ui-protocol>=0.1.13",
     "jsonpatch>=1.33",
     "pydantic>=2.12.5",
-    "pydantic-ai-slim[mcp,openai]>=1.63.0",
+    "pydantic-ai-slim[mcp,openai]>=1.71.0",
     "pyyaml>=6.0.3",
     "skills-ref>=0.1.1",
 ]
 
 [project.optional-dependencies]
 signing = ["pathspec>=0.12.0", "sigstore>=4.0.0"]
 tui = [
-    "pydantic-ai-slim[logfire]>=1.63.0",
+    "pydantic-ai-slim[logfire]>=1.71.0",
     "textual>=8.0.0",
     "typer>=0.24.1",
     "python-dotenv>=1.2.1",
@@ -70,7 +70,7 @@ dev = [
     "pytest-asyncio>=1.3.0",
     "pytest-cov>=7.0.0",
     "pytest-recording>=0.13.4",
-    "pydantic-ai-slim[openai,logfire]>=1.63.0",
+    "pydantic-ai-slim[openai,logfire]>=1.71.0",
     "ruff>=0.15.2",
     "pathspec>=0.12.0",
     "sigstore>=4.0.0",