feat: add visor-agent-dev skill for AI coding assistants #562

Merged: buger merged 1 commit into main from feat/visor-agent-dev-skill on Mar 24, 2026

buger (Contributor) commented Mar 24, 2026

Summary

  • Adds a cross-tool skill (SKILL.md) that guides AI coding agents through the full visor assistant development workflow
  • Works out of the box with Claude Code, Codex, and OpenCode via symlinks to the canonical .claude/skills/ location
  • All commands use npx -y @probelabs/visor@latest so no global install is needed

What the skill covers

  • Safety guardrails: prevents destructive actions, unauthorized real provider usage, secret leakage, and accidental runner starts
  • Development loop: write YAML tests → validate config → iterate with mocks → graduate to real providers → debug with traces
  • Precise test running: single case (--only), stage filtering (#stage), parallel suites, discovery, debug flags
  • Interactive testing: --message flow with task tracking, TUI mode
  • Trace debugging: tasks list → tasks show → tasks trace --full
  • Configuration patterns: skills, workflows, intents, knowledge files with examples
  • References: Oel production agent as a real-world example

Files

  • .claude/skills/visor-agent-dev/SKILL.md — canonical skill definition
  • .agents/skills/visor-agent-dev — symlink for Codex discovery
  • .opencode/skills/visor-agent-dev — symlink for OpenCode discovery

Test plan

  • Verify SKILL.md renders correctly in Claude Code via /visor-agent-dev
  • Verify symlinks resolve correctly on clone (ls -la .agents/skills/ / .opencode/skills/)
  • Test that Codex discovers the skill via $ mention or /skills list
  • Run a sample workflow using the skill's instructions end-to-end

🤖 Generated with Claude Code

Cross-tool skill (Claude Code, Codex, OpenCode) that guides developers
through the full visor assistant development workflow: writing YAML tests,
config validation, mock/real provider testing, interactive --message testing,
task trace debugging, and response evaluation.

Includes safety guardrails to prevent destructive actions, unauthorized
real provider usage, and secret leakage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@buger buger merged commit 323657c into main Mar 24, 2026
12 checks passed

probelabs Bot commented Mar 24, 2026

PR Overview

This PR adds a comprehensive AI coding assistant skill (visor-agent-dev) that guides developers through building and extending visor-based AI assistants. The skill is designed to work with multiple AI coding platforms (Claude Code, Codex, OpenCode) through symlinked discovery paths.

Files Changed

| File | Status | Lines | Description |
| --- | --- | --- | --- |
| .claude/skills/visor-agent-dev/SKILL.md | Added | +498 | Canonical skill definition with full development workflow |
| .agents/skills/visor-agent-dev | Added | +1 | Symlink for Codex discovery |
| .opencode/skills/visor-agent-dev | Added | +1 | Symlink for OpenCode discovery |

Total: 500 additions, 0 deletions across 3 files

What This PR Accomplishes

  1. Cross-platform AI skill support - Single skill definition discoverable by Claude Code (.claude/skills/), Codex (.agents/skills/), and OpenCode (.opencode/skills/)

  2. Comprehensive development workflow covering:

    • Safety guardrails (prevents destructive actions, unauthorized provider usage, secret leakage)
    • Test-driven development with YAML test suites
    • Configuration validation
    • Mock-based iteration before real provider usage
    • Interactive testing with --message flow
    • Trace debugging with tasks commands
    • Configuration patterns for skills, workflows, intents, and knowledge files
  3. Command reference - All commands use npx -y @probelabs/visor@latest for zero-install execution
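The zero-install convention above can be wrapped in a small helper so every invocation starts from the same prefix. This is a minimal sketch, not part of the PR: the function name, the case and stage names, and the `user_approved` parameter are hypothetical, and the `test`, `--only case#stage`, and `--no-mocks` spellings are taken from this review's diagram rather than checked against the CLI help.

```python
import shlex

# Prefix quoted in the PR description: no global install required.
BASE = ["npx", "-y", "@probelabs/visor@latest"]


def visor_test_argv(case=None, stage=None, no_mocks=False, user_approved=False):
    """Build an argv list for a test run (hypothetical helper)."""
    # Mirrors the first safety rule: real providers only after confirmation.
    if no_mocks and not user_approved:
        raise PermissionError("real providers need explicit user confirmation")
    argv = BASE + ["test"]
    if case:
        # "case#stage" narrows the run to one stage of one case.
        selector = f"{case}#{stage}" if stage else case
        argv += ["--only", selector]
    if no_mocks:
        argv.append("--no-mocks")
    return argv


if __name__ == "__main__":
    # shlex.join quotes the selector so '#' is not treated as a shell comment.
    print(shlex.join(visor_test_argv("my-case", "review")))
```

Returning a list rather than a string keeps the selector intact when it is handed to `subprocess.run` without a shell.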

Key Technical Changes

Skill Structure

The SKILL.md file follows the Claude Code skill format:

  • Frontmatter: name, description, argument-hint, allowed-tools
  • Safety Rules: 7 explicit guardrails
  • Development Workflow: 8-step iterative process
  • Configuration Patterns: YAML examples for skills, workflows, intents
  • Command Reference: Table of key visor CLI commands
  • Real Agent Example: References the Oel production assistant

Symlink Strategy

.claude/skills/visor-agent-dev/SKILL.md  (canonical)
  ↑
  ├── ../../.agents/skills/visor-agent-dev  (symlink for Codex)
  └── ../../.opencode/skills/visor-agent-dev (symlink for OpenCode)
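The test plan asks that the symlinks resolve correctly after a clone. A minimal sketch of such a check follows, assuming the three paths listed in this PR. The helper names are hypothetical, and the demo builds the layout in a scratch directory instead of touching a real checkout. On Windows, `os.symlink` may fail without developer mode or elevated privileges, which is the portability concern raised in the review.

```python
import os
import tempfile
from pathlib import Path

CANONICAL = Path(".claude/skills/visor-agent-dev")
LINKS = [
    Path(".agents/skills/visor-agent-dev"),    # Codex discovery path
    Path(".opencode/skills/visor-agent-dev"),  # OpenCode discovery path
]


def check_links(repo_root: Path) -> dict:
    """Map each discovery path to True if it is a symlink to the canonical dir."""
    target = (repo_root / CANONICAL).resolve()
    results = {}
    for link in LINKS:
        path = repo_root / link
        results[str(link)] = path.is_symlink() and path.resolve() == target
    return results


def build_demo_repo(root: Path) -> None:
    """Recreate the PR's layout inside a scratch directory."""
    skill_dir = root / CANONICAL
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text("# visor-agent-dev\n")
    for link in LINKS:
        (root / link.parent).mkdir(parents=True)
        # Relative target, as in the PR: ../../.claude/skills/visor-agent-dev
        os.symlink(Path("..") / ".." / CANONICAL, root / link, target_is_directory=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_repo(Path(tmp))
        for name, ok in check_links(Path(tmp)).items():
            print(f"{name}: {'ok' if ok else 'BROKEN'}")
```

Pointing `check_links` at a real clone would report any discovery path that was checked out as a plain file instead of a link.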

Safety Guardrails Implemented

  1. Never run with real providers without user confirmation
  2. Never modify production configs without approval
  3. Never commit secrets (use ${ENV_VAR} references)
  4. Never add destructive allowed_commands patterns
  5. Never run with live service flags (--slack, --telegram, etc.) without explicit request
  6. Ask before external API calls
  7. Always include disallowed_commands with destructive patterns
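Rules 3, 4, and 7 are mechanical enough to check before a config is used. The sketch below shows one way such a pre-flight check could look. It is illustrative only: the `preflight` function and the `env` key are hypothetical, and the destructive prefixes are limited to the two that appear in the skill example in this review (`rm:*`, `sudo:*`).

```python
import re

# Prefixes taken from the disallowed_commands example in this review.
DESTRUCTIVE = ("rm:", "sudo:")
# A value such as ${API_KEY}; the exact reference syntax is an assumption.
ENV_REF = re.compile(r"^\$\{[A-Z_][A-Z0-9_]*\}$")


def preflight(skill: dict) -> list:
    """Return human-readable problems found in one skill definition."""
    problems = []
    allowed = skill.get("allowed_commands", [])
    disallowed = skill.get("disallowed_commands", [])
    # Rule 4: no destructive patterns may be allowed.
    for pattern in allowed:
        if pattern.startswith(DESTRUCTIVE):
            problems.append(f"destructive pattern in allowed_commands: {pattern}")
    # Rule 7: destructive patterns must be explicitly disallowed.
    for prefix in DESTRUCTIVE:
        if not any(p.startswith(prefix) for p in disallowed):
            problems.append(f"disallowed_commands is missing a {prefix}* pattern")
    # Rule 3: secrets are referenced, never inlined ("env" is a hypothetical key).
    for key, value in skill.get("env", {}).items():
        if not ENV_REF.match(str(value)):
            problems.append(f"env.{key} should be a ${{ENV_VAR}} reference, not a literal")
    return problems


if __name__ == "__main__":
    risky = {"allowed_commands": ["grep:*", "rm:*"], "disallowed_commands": ["sudo:*"]}
    for problem in preflight(risky):
        print(problem)
```

A check like this does not replace the documented rules; it only catches the cases that can be detected from the config alone.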

Architecture & Impact Assessment

graph TD
    A[AI Coding Assistant] --> B{Platform Detection}
    B -->|Claude Code| C[.claude/skills/visor-agent-dev/SKILL.md]
    B -->|Codex| D[.agents/skills/visor-agent-dev → C]
    B -->|OpenCode| E[.opencode/skills/visor-agent-dev → C]
    
    C --> F[Development Workflow]
    F --> G[Step 1: Understand Goal]
    F --> H[Step 2: Write YAML Tests]
    F --> I[Step 3: Validate Config]
    F --> J[Step 4: Run with Mocks]
    F --> K[Step 5: Real Providers]
    F --> L[Step 6: Interactive Testing]
    F --> M[Step 7: Debug Traces]
    F --> N[Step 8: Evaluate Quality]
    
    H --> O[*.tests.yaml files]
    I --> P[visor validate --config]
    J --> Q[visor test --only case#stage]
    K --> R[visor test --no-mocks]
    L --> S[visor --message --tui]
    M --> T[visor tasks trace --full]
    N --> U[visor tasks evaluate]
    
    style C fill:#e1f5ff
    style D fill:#ffe1e1
    style E fill:#e1ffe1

Affected System Components

  • No code changes - This is purely documentation/configuration
  • AI Assistant Tooling - New skill for AI coding assistants
  • Developer Experience - Streamlined onboarding for visor development

Development Flow Diagram

flowchart LR
    A[Write Tests] --> B[Validate Config]
    B --> C[Run with Mocks]
    C --> D{Pass?}
    D -->|No| A
    D -->|Yes| E[User Approval?]
    E -->|No| C
    E -->|Yes| F[Real Providers]
    F --> G[Interactive Testing]
    G --> H[Debug Traces]
    H --> I{Working?}
    I -->|No| A
    I -->|Yes| J[Complete]
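The flowchart can also be read as a small transition function, which makes the approval gate before real providers explicit. This is a sketch of the diagram only, with hypothetical step names derived from the node labels. It does not describe how visor itself sequences work.

```python
# Steps with exactly one successor in the flowchart.
LINEAR = {
    "write_tests": "validate_config",
    "validate_config": "run_with_mocks",
    "real_providers": "interactive_testing",
    "interactive_testing": "debug_traces",
}


def next_step(step: str, passed: bool = True, approved: bool = False) -> str:
    """Return the node that follows `step` in the flowchart."""
    if step in LINEAR:
        return LINEAR[step]
    if step == "run_with_mocks":
        if not passed:
            return "write_tests"  # "Pass?" answered No
        # "User Approval?" gate: stay on mocks until the user says yes.
        return "real_providers" if approved else "run_with_mocks"
    if step == "debug_traces":
        # "Working?" decides between finishing and another iteration.
        return "complete" if passed else "write_tests"
    raise ValueError(f"unknown step: {step}")


if __name__ == "__main__":
    step = "write_tests"
    while step != "complete":
        print(step)
        step = next_step(step, passed=True, approved=True)
```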

Scope Discovery & Context Expansion

Direct Impact

  • AI coding assistants using Claude Code, Codex, or OpenCode
  • Developers building visor assistants, skills, workflows, or checks
  • Testing and debugging workflows for visor configurations

Related Files (Referenced in Skill)

  • assistant.yaml - Main assistant configuration
  • config/skills.yaml - Skill definitions
  • config/intents.yaml - Intent routing
  • config/projects.yaml - Project catalog
  • defaults/assistant.yaml - Built-in patterns
  • defaults/skills/ - Built-in skill examples
  • docs/ - Knowledge files
  • workflows/ - Reusable pipelines
  • *.tests.yaml - Test suites

Reference Implementation

The skill references the Oel production assistant at ../refine/Oel/ as a real-world example containing:

  • 25+ skills with tools & knowledge
  • 3 intents (chat, evaluate_ticket, release_notes)
  • 30+ knowledge files (3,600+ lines)
  • Multiple workflows including Slack integration

Configuration Patterns Covered

Skills

- id: my-new-skill
  description: "needs to [what triggers this skill]"
  requires: [code-explorer]
  knowledge: |
    ## Instructions
  tools:
    my-tool:
      command: npx
      args: [my-mcp-server]
  allowed_commands: ['grep:*', 'find:*']
  disallowed_commands: ['rm:*', 'sudo:*']

Workflows

version: "1.0"
id: my-workflow
inputs: [query]
steps:
  fetch-data:
    type: mcp
    method: search
  process:
    type: ai
    depends_on: [fetch-data]
outputs:
  - name: result
    value_js: "return outputs?.['process']?.text ?? null;"
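The `depends_on` field in the workflow above implies an execution order: `process` cannot start until `fetch-data` has finished. A minimal sketch of how that order can be derived with a topological sort follows. The step dictionary copies the example, and the `execution_order` helper is hypothetical; visor's actual scheduler may behave differently.

```python
from graphlib import TopologicalSorter

# Steps copied from the workflow example, reduced to the fields used here.
steps = {
    "fetch-data": {"type": "mcp", "method": "search"},
    "process": {"type": "ai", "depends_on": ["fetch-data"]},
}


def execution_order(steps: dict) -> list:
    """Order step ids so that every dependency runs before its dependents."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(spec.get("depends_on", [])) for name, spec in steps.items()}
    # static_order() raises graphlib.CycleError if steps depend on each other.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(steps))
```

Running the sort when a config is loaded would also surface circular `depends_on` chains before any step executes.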

Test Assertions

  • calls - Verify step execution counts
  • outputs - Check values with equals, matches, contains
  • prompts - Verify AI input with contains
  • llm_judge - Semantic evaluation
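To make the three `outputs` matchers concrete, the sketch below evaluates them against a single value. The semantics are an assumption, not taken from the skill file: `equals` is treated as exact equality, `matches` as a regular-expression search, and `contains` as a substring test. The `check_output` helper is hypothetical.

```python
import re


def check_output(actual, expectation: dict) -> bool:
    """Return True if `actual` satisfies every matcher present in `expectation`."""
    # equals: exact comparison (assumed semantics).
    if "equals" in expectation and actual != expectation["equals"]:
        return False
    # matches: regular expression searched anywhere in the text (assumed).
    if "matches" in expectation and re.search(expectation["matches"], str(actual)) is None:
        return False
    # contains: plain substring test (assumed).
    if "contains" in expectation and expectation["contains"] not in str(actual):
        return False
    return True


if __name__ == "__main__":
    text = "v1.2.3 released"
    print(check_output(text, {"matches": r"v\d+\.\d+\.\d+", "contains": "released"}))
    print(check_output(text, {"equals": "v1.2.4 released"}))
```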

Review Notes

  1. Symlink validation - Verify symlinks resolve correctly after clone
  2. Platform testing - Test skill discovery in Claude Code (/visor-agent-dev), Codex ($ mention), and OpenCode
  3. Command accuracy - All CLI commands use npx -y @probelabs/visor@latest format
  4. Safety completeness - Guardrails cover destructive actions, secrets, and live services
  5. Documentation clarity - Workflow is step-by-step with examples

Test Plan (from PR author)

  • Verify SKILL.md renders correctly in Claude Code via /visor-agent-dev
  • Verify symlinks resolve correctly on clone
  • Test Codex discovers the skill via $ mention or /skills list
  • Run a sample workflow using the skill's instructions end-to-end
Metadata
  • Review Effort: 2 / 5
  • Primary Label: documentation

Powered by Visor from Probelabs

Last updated: 2026-03-24T08:19:07.716Z | Triggered by: pr_opened | Commit: e44fada

💡 TIP: You can chat with Visor using /visor ask <your question>


probelabs Bot commented Mar 24, 2026

✅ Security Check Passed

No security issues found – changes LGTM.

Architecture Issues (10)

Severity Location Issue
🟠 Error .claude/skills/visor-agent-dev/SKILL.md:1-498
The PR introduces symlinks (.agents/skills/visor-agent-dev and .opencode/skills/visor-agent-dev) that point to ../../.claude/skills/visor-agent-dev. This creates a cross-platform compatibility issue: Windows doesn't support symlinks by default and requires developer mode or administrator privileges. The symlinks will break on Windows systems without proper permissions, causing the skill to be undiscoverable by Codex and OpenCode.
💡 Suggestion: Consider alternative approaches: 1) Use hard copies or build scripts to replicate the file during installation, 2) Document Windows symlink requirements and provide fallback, 3) Use a plugin system where each AI tool registers its skill directory, 4) Create a post-install script that sets up symlinks conditionally based on OS.
🟠 Error .claude/skills/visor-agent-dev/SKILL.md:14-27
Safety rules are embedded inline in the skill documentation rather than being codified in executable validation. Rules like 'NEVER run with real providers without user confirmation' are documentation-only - there's no enforcement mechanism. An AI agent could bypass these rules if it doesn't fully comprehend the documentation or if the skill context is truncated.
💡 Suggestion: Implement safety as executable guardrails: 1) Add a --safe-mode flag that requires explicit --unsafe flag to disable, 2) Create a pre-flight check that validates destructive commands aren't in allowed_commands, 3) Add confirmation prompts for real provider usage, 4) Use criticality modes to enforce safety at the engine level rather than relying on documentation.
🟠 Error .agents/skills/visor-agent-dev:1
The symlink .agents/skills/visor-agent-dev → ../../.claude/skills/visor-agent-dev assumes a specific directory structure. If the repository is cloned differently (e.g., shallow clone, sparse checkout, or different working directory), the relative path may not resolve correctly. The symlink provides no validation that the target exists.
💡 Suggestion: Add a post-clone setup script that validates symlink targets and creates them if missing. Document this in CONTRIBUTING.md. Consider using Git LFS or subtree if directory structure flexibility is needed.
🟡 Warning .claude/skills/visor-agent-dev/SKILL.md:1-498
The SKILL.md file (498 lines) duplicates extensive documentation that already exists in visor/docs/ directory. This creates maintenance burden - updates to visor's workflow must be synchronized across multiple files. The skill file should reference existing documentation rather than duplicating it.
💡 Suggestion: Replace duplicated content with concise summaries and links to canonical documentation. Keep the skill focused on AI-specific guidance (safety rules, tool usage patterns) rather than general visor documentation.
🟡 Warning .claude/skills/visor-agent-dev/SKILL.md:30-498
The skill file is monolithic (498 lines) covering testing, validation, debugging, configuration patterns, and reference examples. This violates single-responsibility principle and makes the skill difficult to navigate, maintain, and extend. Changes to one area require re-parsing the entire file.
💡 Suggestion: Split into focused, composable skills: 1) visor-test-runner (testing commands and assertions), 2) visor-config-validator (validation and schema), 3) visor-debugger (trace debugging and inspection), 4) visor-scaffolding (project setup patterns). Use skill dependencies (requires field) to compose them.
🟡 Warning .claude/skills/visor-agent-dev/SKILL.md:177-194
The skill uses both --only and --case flags for running single test cases, stating they're 'equivalent'. This inconsistency is confusing - users need to remember two flags for the same operation. The visor test framework should have a single canonical flag with aliases documented separately.
💡 Suggestion: Standardize on one primary flag (--only as the canonical choice) and document --case as a deprecated alias. Update the CLI help text to reflect this. Ensure all documentation and examples use the canonical flag consistently.
🟡 Warning .claude/skills/visor-agent-dev/SKILL.md:196-274
The skill documents test flags (--max-suites, --max-parallel, --debug, --bail, --progress, --json, --report, --summary) without explaining their interactions or dependencies. For example, --max-suites controls parallel test file execution while --max-parallel controls check parallelism within a file - this distinction is unclear. Users may combine these incorrectly leading to resource exhaustion or unexpectedly slow runs.
💡 Suggestion: Add a 'Parallelism and Resource Management' section that explains: 1) The difference between file-level and check-level parallelism, 2) How to calculate total parallelism (max-suites × max-parallel), 3) Recommended settings for different machine sizes, 4) Common pitfalls.
🟡 Warning .claude/skills/visor-agent-dev/SKILL.md:324-377
The --message flow creates tasks in SQLite but doesn't explain task lifecycle, cleanup, or concurrency implications. Users running multiple --message commands may accumulate tasks in .visor/agent-tasks.db without understanding how to manage them. The skill references 'visor tasks list' but doesn't explain task persistence, TTL, or manual cleanup.
💡 Suggestion: Add a 'Task Management' section explaining: 1) Task lifecycle (submitted → working → completed/failed), 2) Where tasks are stored (.visor/agent-tasks.db), 3) How to list/clean up old tasks, 4) Concurrency limits, 5) Task ID format and how to use it with trace/evaluate commands.
🟡 Warning .claude/skills/visor-agent-dev/SKILL.md:412-428
The LLM evaluation section (tasks evaluate) lacks details on evaluation criteria, cost implications, and result interpretation. Users may run evaluations without understanding what's being measured or how to act on results. The section is too brief for a feature that consumes API credits.
💡 Suggestion: Expand with: 1) Default evaluation criteria, 2) How to customize evaluation criteria, 3) Cost estimate per evaluation, 4) How to interpret scores, 5) Integration with CI, 6) Example of iterating based on evaluation feedback.
🟡 Warning .opencode/skills/visor-agent-dev:1
The symlink .opencode/skills/visor-agent-dev assumes OpenCode discovers skills from this exact path. If OpenCode changes its skill discovery mechanism or path structure, this integration breaks silently. There's no version compatibility check or fallback mechanism.
💡 Suggestion: Document the OpenCode version this was tested with and add a compatibility check in the skill frontmatter. Consider a skill manifest file that declares supported tool versions. Create a test that validates skill discovery works with each tool's latest version as part of CI.
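
The parallelism warning above estimates total parallelism as max-suites × max-parallel. A short worked sketch of that arithmetic follows, assuming (as the warning describes) that --max-suites bounds concurrent test files and --max-parallel bounds concurrent checks within each file. Both function names are hypothetical.

```python
def worst_case_concurrency(max_suites: int, max_parallel: int) -> int:
    """Upper bound on checks running at once: files in flight times checks per file."""
    if max_suites < 1 or max_parallel < 1:
        raise ValueError("both limits must be at least 1")
    return max_suites * max_parallel


def per_file_limit(budget: int, max_suites: int) -> int:
    """Largest max-parallel value that keeps the product within `budget`."""
    if budget < 1 or max_suites < 1:
        raise ValueError("budget and max_suites must be at least 1")
    # Never return less than 1, even if the budget is smaller than max_suites.
    return max(1, budget // max_suites)


if __name__ == "__main__":
    # 4 suites with 3 checks each can run up to 12 checks at the same time.
    print(worst_case_concurrency(4, 3))
    # With a budget of 8 concurrent checks and 4 suites, use 2 per file.
    print(per_file_limit(8, 4))
```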

Performance Issues (1)

Severity Location Issue
🟠 Error contract:0
Output schema validation failed: must have required property 'issues'

Powered by Visor from Probelabs

Last updated: 2026-03-24T07:59:17.089Z | Triggered by: pr_opened | Commit: e44fada
