feat: add visor-agent-dev skill for AI coding assistants by buger · Pull Request #562 · probelabs/visor

buger · 2026-03-24T07:17:09Z

Summary

Adds a cross-tool skill (SKILL.md) that guides AI coding agents through the full visor assistant development workflow
Works out of the box with Claude Code, Codex, and OpenCode via symlinks to the canonical .claude/skills/ location
All commands use npx -y @probelabs/visor@latest so no global install is needed

What the skill covers

Safety guardrails: prevents destructive actions, unauthorized real provider usage, secret leakage, and accidental runner starts
Development loop: write YAML tests → validate config → iterate with mocks → graduate to real providers → debug with traces
Precise test running: single case (--only), stage filtering (#stage), parallel suites, discovery, debug flags
Interactive testing: --message flow with task tracking, TUI mode
Trace debugging: tasks list → tasks show → tasks trace --full
Configuration patterns: skills, workflows, intents, knowledge files with examples
References: Oel production agent as a real-world example

Files

.claude/skills/visor-agent-dev/SKILL.md — canonical skill definition
.agents/skills/visor-agent-dev — symlink for Codex discovery
.opencode/skills/visor-agent-dev — symlink for OpenCode discovery

Test plan

Verify SKILL.md renders correctly in Claude Code via /visor-agent-dev
Verify symlinks resolve correctly on clone (ls -la .agents/skills/ / .opencode/skills/)
Test that Codex discovers the skill via $ mention or /skills list
Run a sample workflow using the skill's instructions end-to-end

🤖 Generated with Claude Code

Cross-tool skill (Claude Code, Codex, OpenCode) that guides developers through the full visor assistant development workflow: writing YAML tests, config validation, mock/real provider testing, interactive --message testing, task trace debugging, and response evaluation. Includes safety guardrails to prevent destructive actions, unauthorized real provider usage, and secret leakage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

probelabs · 2026-03-24T07:49:32Z

PR Overview

This PR adds a comprehensive AI coding assistant skill (visor-agent-dev) that guides developers through building and extending visor-based AI assistants. The skill is designed to work with multiple AI coding platforms (Claude Code, Codex, OpenCode) through symlinked discovery paths.

Files Changed

| File | Status | Lines | Description |
| --- | --- | --- | --- |
| `.claude/skills/visor-agent-dev/SKILL.md` | Added | +498 | Canonical skill definition with full development workflow |
| `.agents/skills/visor-agent-dev` | Added | +1 | Symlink for Codex discovery |
| `.opencode/skills/visor-agent-dev` | Added | +1 | Symlink for OpenCode discovery |

Total: 500 additions, 0 deletions across 3 files

What This PR Accomplishes

Cross-platform AI skill support - Single skill definition discoverable by Claude Code (.claude/skills/), Codex (.agents/skills/), and OpenCode (.opencode/skills/)
Comprehensive development workflow covering:
- Safety guardrails (prevents destructive actions, unauthorized provider usage, secret leakage)
- Test-driven development with YAML test suites
- Configuration validation
- Mock-based iteration before real provider usage
- Interactive testing with --message flow
- Trace debugging with tasks commands
- Configuration patterns for skills, workflows, intents, and knowledge files
Command reference - All commands use npx -y @probelabs/visor@latest for zero-install execution

Key Technical Changes

Skill Structure

The SKILL.md file follows the Claude Code skill format:

Frontmatter: name, description, argument-hint, allowed-tools
Safety Rules: 7 explicit guardrails
Development Workflow: 8-step iterative process
Configuration Patterns: YAML examples for skills, workflows, intents
Command Reference: Table of key visor CLI commands
Real Agent Example: References the Oel production assistant

Symlink Strategy

.claude/skills/visor-agent-dev/SKILL.md  (canonical)
  ↑
  ├── .agents/skills/visor-agent-dev → ../../.claude/skills/visor-agent-dev  (symlink for Codex)
  └── .opencode/skills/visor-agent-dev → ../../.claude/skills/visor-agent-dev  (symlink for OpenCode)
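The relative targets can be checked in isolation. The following is a minimal Python sketch, not part of the PR, that rebuilds the layout in a temporary directory and confirms that SKILL.md is reachable through both links. The helper names `build_layout` and `resolves_to_skill` are hypothetical, and the placeholder SKILL.md content is invented for the check. It assumes a platform where creating symlinks is permitted.

```python
import os
import tempfile
from pathlib import Path


def build_layout(root: Path) -> dict[str, Path]:
    """Create the canonical skill directory plus the two relative symlinks."""
    canonical = root / ".claude" / "skills" / "visor-agent-dev"
    canonical.mkdir(parents=True)
    # Placeholder content; the real SKILL.md is 498 lines long.
    (canonical / "SKILL.md").write_text("---\nname: visor-agent-dev\n---\n")

    links = {}
    for tool_dir in (".agents", ".opencode"):
        link = root / tool_dir / "skills" / "visor-agent-dev"
        link.parent.mkdir(parents=True)
        # The target is relative to the directory that contains the link.
        os.symlink("../../.claude/skills/visor-agent-dev", link,
                   target_is_directory=True)
        links[tool_dir] = link
    return links


def resolves_to_skill(link: Path) -> bool:
    """True when the path is a symlink and SKILL.md is reachable through it."""
    return link.is_symlink() and (link / "SKILL.md").is_file()


with tempfile.TemporaryDirectory() as tmp:
    links = build_layout(Path(tmp))
    results = {name: resolves_to_skill(path) for name, path in links.items()}

print(results)
```

A check of this kind could run after clone to catch the broken-link cases raised in the review, though how each tool actually discovers skills is outside what this sketch verifies.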

Safety Guardrails Implemented

Never run with real providers without user confirmation
Never modify production configs without approval
Never commit secrets (use ${ENV_VAR} references)
Never add destructive allowed_commands patterns
Never run with live service flags (--slack, --telegram, etc.) without explicit request
Ask before external API calls
Always include disallowed_commands with destructive patterns
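Two of these rules (no destructive allowed_commands patterns, always include disallowed_commands) lend themselves to a mechanical pre-flight check. The sketch below is illustrative only and is not something the PR ships. It assumes patterns take the `command:argument-glob` form seen in the skill example (`'grep:*'`, `'rm:*'`); the `DESTRUCTIVE` list and the function name `check_skill_commands` are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical set of command names treated as destructive for this sketch.
DESTRUCTIVE = ("rm", "sudo", "dd", "mkfs")


def check_skill_commands(skill: dict) -> list[str]:
    """Return human-readable problems found in one skill definition."""
    problems = []
    for pattern in skill.get("allowed_commands", []):
        # Assumed format: "<command glob>:<argument glob>".
        command_glob = pattern.split(":", 1)[0]
        for name in DESTRUCTIVE:
            if fnmatch(name, command_glob):
                problems.append(
                    f"allowed_commands pattern {pattern!r} permits {name!r}"
                )
    if not skill.get("disallowed_commands"):
        problems.append("disallowed_commands is missing or empty")
    return problems


good = {
    "id": "my-new-skill",
    "allowed_commands": ["grep:*", "find:*"],
    "disallowed_commands": ["rm:*", "sudo:*"],
}
bad = {"id": "risky-skill", "allowed_commands": ["rm:*", "*:*"]}

print(check_skill_commands(good))  # []
for problem in check_skill_commands(bad):
    print(problem)
```

A wildcard such as `'*:*'` is flagged once per destructive command it would permit, which keeps the report explicit about what the pattern opens up.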

Architecture & Impact Assessment

graph TD
    A[AI Coding Assistant] --> B{Platform Detection}
    B -->|Claude Code| C[.claude/skills/visor-agent-dev/SKILL.md]
    B -->|Codex| D[.agents/skills/visor-agent-dev → C]
    B -->|OpenCode| E[.opencode/skills/visor-agent-dev → C]
    
    C --> F[Development Workflow]
    F --> G[Step 1: Understand Goal]
    F --> H[Step 2: Write YAML Tests]
    F --> I[Step 3: Validate Config]
    F --> J[Step 4: Run with Mocks]
    F --> K[Step 5: Real Providers]
    F --> L[Step 6: Interactive Testing]
    F --> M[Step 7: Debug Traces]
    F --> N[Step 8: Evaluate Quality]
    
    H --> O[*.tests.yaml files]
    I --> P[visor validate --config]
    J --> Q[visor test --only case#stage]
    K --> R[visor test --no-mocks]
    L --> S[visor --message --tui]
    M --> T[visor tasks trace --full]
    N --> U[visor tasks evaluate]
    
    style C fill:#e1f5ff
    style D fill:#ffe1e1
    style E fill:#e1ffe1

Affected System Components

No code changes - This is purely documentation/configuration
AI Assistant Tooling - New skill for AI coding assistants
Developer Experience - Streamlined onboarding for visor development

Development Flow Diagram

flowchart LR
    A[Write Tests] --> B[Validate Config]
    B --> C[Run with Mocks]
    C --> D{Pass?}
    D -->|No| A
    D -->|Yes| E{User Approval?}
    E -->|No| C
    E -->|Yes| F[Real Providers]
    F --> G[Interactive Testing]
    G --> H[Debug Traces]
    H --> I{Working?}
    I -->|No| A
    I -->|Yes| J[Complete]

Scope Discovery & Context Expansion

Direct Impact

AI coding assistants using Claude Code, Codex, or OpenCode
Developers building visor assistants, skills, workflows, or checks
Testing and debugging workflows for visor configurations

Related Files (Referenced in Skill)

assistant.yaml - Main assistant configuration
config/skills.yaml - Skill definitions
config/intents.yaml - Intent routing
config/projects.yaml - Project catalog
defaults/assistant.yaml - Built-in patterns
defaults/skills/ - Built-in skill examples
docs/ - Knowledge files
workflows/ - Reusable pipelines
*.tests.yaml - Test suites

Reference Implementation

The skill references the Oel production assistant at ../refine/Oel/ as a real-world example containing:

25+ skills with tools & knowledge
3 intents (chat, evaluate_ticket, release_notes)
30+ knowledge files (3,600+ lines)
Multiple workflows including Slack integration

Configuration Patterns Covered

Skills

- id: my-new-skill
  description: "needs to [what triggers this skill]"
  requires: [code-explorer]
  knowledge: |
    ## Instructions
  tools:
    my-tool:
      command: npx
      args: [my-mcp-server]
  allowed_commands: ['grep:*', 'find:*']
  disallowed_commands: ['rm:*', 'sudo:*']

Workflows

version: "1.0"
id: my-workflow
inputs: [query]
steps:
  fetch-data:
    type: mcp
    method: search
  process:
    type: ai
    depends_on: [fetch-data]
outputs:
  - name: result
    value_js: "return outputs?.['process']?.text ?? null;"
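The `depends_on` field implies an ordering between steps. As a rough illustration of that idea, and not a description of how visor schedules work internally, the Python sketch below derives a valid execution order for the two steps in the workflow example using the standard library's `graphlib`. The function name `execution_order` is hypothetical.

```python
from graphlib import TopologicalSorter

# The two steps from the workflow example, reduced to plain dictionaries.
steps = {
    "fetch-data": {"type": "mcp", "method": "search"},
    "process": {"type": "ai", "depends_on": ["fetch-data"]},
}


def execution_order(steps: dict) -> list[str]:
    """Order step names so that every step follows its dependencies."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(spec.get("depends_on", [])) for name, spec in steps.items()}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
print(order)  # ['fetch-data', 'process']
```

`TopologicalSorter` raises `graphlib.CycleError` when steps depend on each other in a loop, which is one way a validator could reject such a workflow.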

Test Assertions

calls - Verify step execution counts
outputs - Check values with equals, matches, contains
prompts - Verify AI input with contains
llm_judge - Semantic evaluation
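To make the three `outputs` operators concrete, here is a small Python sketch of how such checks could be evaluated. The semantics are assumptions made for illustration (exact equality for `equals`, a regular-expression search for `matches`, substring containment for `contains`) and may differ from the visor test runner; `check_output` is a hypothetical name. The `calls`, `prompts`, and `llm_judge` assertion types are not modelled.

```python
import re


def check_output(actual, assertion: dict) -> bool:
    """Evaluate one assertion, given as a single-operator dictionary."""
    if "equals" in assertion:
        return actual == assertion["equals"]
    if "matches" in assertion:
        # Assumed semantics: regex search anywhere in the stringified value.
        return re.search(assertion["matches"], str(actual)) is not None
    if "contains" in assertion:
        return assertion["contains"] in str(actual)
    raise ValueError(f"unsupported assertion: {sorted(assertion)}")


print(check_output("release v1.2.3", {"contains": "v1.2"}))         # True
print(check_output("release v1.2.3", {"matches": r"v\d+\.\d+\.\d+"}))  # True
print(check_output("release v1.2.3", {"equals": "release"}))        # False
```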

Review Notes

Symlink validation - Verify symlinks resolve correctly after clone
Platform testing - Test skill discovery in Claude Code (/visor-agent-dev), Codex ($ mention), and OpenCode
Command accuracy - All CLI commands use npx -y @probelabs/visor@latest format
Safety completeness - Guardrails cover destructive actions, secrets, and live services
Documentation clarity - Workflow is step-by-step with examples

Test Plan (from PR author)

Verify SKILL.md renders correctly in Claude Code via /visor-agent-dev
Verify symlinks resolve correctly on clone
Test that Codex discovers the skill via $ mention or /skills list
Run a sample workflow using the skill's instructions end-to-end

Metadata

Review Effort: 2 / 5
Primary Label: documentation

Powered by Visor from Probelabs

Last updated: 2026-03-24T08:19:07.716Z | Triggered by: pr_opened | Commit: e44fada

💡 TIP: You can chat with Visor using /visor ask <your question>

probelabs · 2026-03-24T07:49:39Z

✅ Security Check Passed

No security issues found – changes LGTM.

Architecture Issues (10)

| Severity | Location | Issue | 💡 Suggestion |
| --- | --- | --- | --- |
| 🟠 Error | `.claude/skills/visor-agent-dev/SKILL.md:1-498` | The PR introduces symlinks (.agents/skills/visor-agent-dev and .opencode/skills/visor-agent-dev) that point to ../../.claude/skills/visor-agent-dev. This creates a cross-platform compatibility issue: Windows doesn't support symlinks by default and requires developer mode or administrator privileges. The symlinks will break on Windows systems without proper permissions, causing the skill to be undiscoverable by Codex and OpenCode. | Consider alternative approaches: 1) Use hard copies or build scripts to replicate the file during installation, 2) Document Windows symlink requirements and provide fallback, 3) Use a plugin system where each AI tool registers its skill directory, 4) Create a post-install script that sets up symlinks conditionally based on OS. |
| 🟠 Error | `.claude/skills/visor-agent-dev/SKILL.md:14-27` | Safety rules are embedded inline in the skill documentation rather than being codified in executable validation. Rules like 'NEVER run with real providers without user confirmation' are documentation-only; there's no enforcement mechanism. An AI agent could bypass these rules if it doesn't fully comprehend the documentation or if the skill context is truncated. | Implement safety as executable guardrails: 1) Add a --safe-mode flag that requires explicit --unsafe flag to disable, 2) Create a pre-flight check that validates destructive commands aren't in allowed_commands, 3) Add confirmation prompts for real provider usage, 4) Use criticality modes to enforce safety at the engine level rather than relying on documentation. |
| 🟠 Error | `.agents/skills/visor-agent-dev:1` | The symlink .agents/skills/visor-agent-dev → ../../.claude/skills/visor-agent-dev assumes a specific directory structure. If the repository is cloned differently (e.g., shallow clone, sparse checkout, or different working directory), the relative path may not resolve correctly. The symlink provides no validation that the target exists. | Add a post-clone setup script that validates symlink targets and creates them if missing. Document this in CONTRIBUTING.md. Consider using Git LFS or subtree if directory structure flexibility is needed. |
| 🟡 Warning | `.claude/skills/visor-agent-dev/SKILL.md:1-498` | The SKILL.md file (498 lines) duplicates extensive documentation that already exists in visor/docs/ directory. This creates maintenance burden: updates to visor's workflow must be synchronized across multiple files. The skill file should reference existing documentation rather than duplicating it. | Replace duplicated content with concise summaries and links to canonical documentation. Keep the skill focused on AI-specific guidance (safety rules, tool usage patterns) rather than general visor documentation. |
| 🟡 Warning | `.claude/skills/visor-agent-dev/SKILL.md:30-498` | The skill file is monolithic (498 lines) covering testing, validation, debugging, configuration patterns, and reference examples. This violates single-responsibility principle and makes the skill difficult to navigate, maintain, and extend. Changes to one area require re-parsing the entire file. | Split into focused, composable skills: 1) visor-test-runner (testing commands and assertions), 2) visor-config-validator (validation and schema), 3) visor-debugger (trace debugging and inspection), 4) visor-scaffolding (project setup patterns). Use skill dependencies (requires field) to compose them. |
| 🟡 Warning | `.claude/skills/visor-agent-dev/SKILL.md:177-194` | The skill uses both --only and --case flags for running single test cases, stating they're 'equivalent'. This inconsistency is confusing: users need to remember two flags for the same operation. The visor test framework should have a single canonical flag with aliases documented separately. | Standardize on one primary flag (--only as the canonical choice) and document --case as a deprecated alias. Update the CLI help text to reflect this. Ensure all documentation and examples use the canonical flag consistently. |
| 🟡 Warning | `.claude/skills/visor-agent-dev/SKILL.md:196-274` | The skill documents test flags (--max-suites, --max-parallel, --debug, --bail, --progress, --json, --report, --summary) without explaining their interactions or dependencies. For example, --max-suites controls parallel test file execution while --max-parallel controls check parallelism within a file; this distinction is unclear. Users may combine these incorrectly leading to resource exhaustion or unexpectedly slow runs. | Add a 'Parallelism and Resource Management' section that explains: 1) The difference between file-level and check-level parallelism, 2) How to calculate total parallelism (max-suites × max-parallel), 3) Recommended settings for different machine sizes, 4) Common pitfalls. |
| 🟡 Warning | `.claude/skills/visor-agent-dev/SKILL.md:324-377` | The --message flow creates tasks in SQLite but doesn't explain task lifecycle, cleanup, or concurrency implications. Users running multiple --message commands may accumulate tasks in .visor/agent-tasks.db without understanding how to manage them. The skill references 'visor tasks list' but doesn't explain task persistence, TTL, or manual cleanup. | Add a 'Task Management' section explaining: 1) Task lifecycle (submitted → working → completed/failed), 2) Where tasks are stored (.visor/agent-tasks.db), 3) How to list/clean up old tasks, 4) Concurrency limits, 5) Task ID format and how to use it with trace/evaluate commands. |
| 🟡 Warning | `.claude/skills/visor-agent-dev/SKILL.md:412-428` | The LLM evaluation section (tasks evaluate) lacks details on evaluation criteria, cost implications, and result interpretation. Users may run evaluations without understanding what's being measured or how to act on results. The section is too brief for a feature that consumes API credits. | Expand with: 1) Default evaluation criteria, 2) How to customize evaluation criteria, 3) Cost estimate per evaluation, 4) How to interpret scores, 5) Integration with CI, 6) Example of iterating based on evaluation feedback. |
| 🟡 Warning | `.opencode/skills/visor-agent-dev:1` | The symlink .opencode/skills/visor-agent-dev assumes OpenCode discovers skills from this exact path. If OpenCode changes its skill discovery mechanism or path structure, this integration breaks silently. There's no version compatibility check or fallback mechanism. | Document the OpenCode version this was tested with and add a compatibility check in the skill frontmatter. Consider a skill manifest file that declares supported tool versions. Create a test that validates skill discovery works with each tool's latest version as part of CI. |
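The parallelism formula suggested in the review (max-suites × max-parallel) can be worked through with a short Python sketch. It takes the review's description of the two flags at face value, so treat the product as an upper bound under that assumption and not as documented visor behavior. Both function names are hypothetical.

```python
def peak_concurrency(max_suites: int, max_parallel: int) -> int:
    """Upper bound on simultaneously running checks: suites x checks per suite."""
    if max_suites < 1 or max_parallel < 1:
        raise ValueError("both limits must be at least 1")
    return max_suites * max_parallel


def max_parallel_within(budget: int, max_suites: int) -> int:
    """Largest per-suite limit that keeps the product within a concurrency budget."""
    if max_suites < 1:
        raise ValueError("max_suites must be at least 1")
    if budget < max_suites:
        raise ValueError("budget cannot cover even one check per suite")
    return budget // max_suites


print(peak_concurrency(4, 3))     # 12
print(max_parallel_within(8, 4))  # 2
```

For example, running 4 suites at once with up to 3 parallel checks each could mean 12 checks in flight, and staying within a budget of 8 with 4 suites would cap the per-suite limit at 2.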

Performance Issues (1)

| Severity | Location | Issue |
| --- | --- | --- |
| 🟠 Error | `contract:0` | Output schema validation failed: must have required property 'issues' |

Last updated: 2026-03-24T07:59:17.089Z | Triggered by: pr_opened | Commit: e44fada

buger merged commit 323657c into main Mar 24, 2026
12 checks passed

buger merged 1 commit into main from feat/visor-agent-dev-skill
