feat: Add agent-test-generation skill by hocokahu · Pull Request #26 · warpdotdev/oz-skills

hocokahu · 2026-05-29T23:02:19Z

@zachbai I thought it would be useful to add a skill that generates test cases for agentic coding projects. Let me know your thoughts. Monocle2AI is an open-source project under the Linux Foundation.

Summary

Adds agent-test-generation — a skill that scaffolds monocle_test_tools pytest tests for Python AI agent apps across LangGraph, Google ADK, OpenAI, Microsoft Agent Framework, CrewAI, LlamaIndex, and Strands.

Covers seven test categories:

Agent & Tool Routing — positive + negative tests that the right agent/tool runs for each request
Input Validation — verify user inputs are forwarded into agent/tool calls
Output Validation — verify outputs contain expected content
Performance — token-limit and duration bounds
Quality Assessment — dual-mode (see below)
Multi-task Orchestration — complex multi-agent requests
Individual Agent Testing — each sub-agent in isolation
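To make the categories above concrete, here is a minimal sketch in plain Python. It does not use the monocle_test_tools API; `Trace` and `run_travel_agent` are hypothetical stand-ins for a real agent app and its recorded run, and the token and duration limits are made-up values. It only illustrates what a routing, input, output, performance, and multi-task check might assert.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Hypothetical record of one agent run (not a monocle_test_tools type)."""
    tools_called: list = field(default_factory=list)
    tool_inputs: dict = field(default_factory=dict)
    output: str = ""
    total_tokens: int = 0
    duration_s: float = 0.0


def run_travel_agent(request: str) -> Trace:
    """Toy stand-in for an agent app: routes to tools by keyword."""
    start = time.perf_counter()
    trace = Trace()
    replies = []
    text = request.lower()
    if "flight" in text:
        trace.tools_called.append("book_flight")
        trace.tool_inputs["book_flight"] = request
        replies.append("Flight booked.")
    if "hotel" in text:
        trace.tools_called.append("book_hotel")
        trace.tool_inputs["book_hotel"] = request
        replies.append("Hotel booked.")
    trace.output = " ".join(replies)
    # Crude word count used as a placeholder for real token usage.
    trace.total_tokens = len(request.split()) + len(trace.output.split())
    trace.duration_s = time.perf_counter() - start
    return trace


def test_routing_positive_and_negative():
    trace = run_travel_agent("Book a flight to Paris")
    assert "book_flight" in trace.tools_called      # positive: right tool ran
    assert "book_hotel" not in trace.tools_called   # negative: wrong tool did not


def test_input_is_forwarded():
    trace = run_travel_agent("Book a flight to Paris")
    assert "Paris" in trace.tool_inputs["book_flight"]


def test_output_contains_expected_content():
    trace = run_travel_agent("Book a hotel in Rome")
    assert "Hotel booked." in trace.output


def test_performance_bounds():
    trace = run_travel_agent("Book a flight to Paris")
    assert trace.total_tokens <= 50   # assumed token limit
    assert trace.duration_s < 1.0     # assumed duration bound, in seconds


def test_multi_task_orchestration():
    trace = run_travel_agent("Book a flight and a hotel in Paris")
    assert trace.tools_called == ["book_flight", "book_hotel"]
```

Saved as a `test_*.py` file, these functions are collected by pytest without further setup. In a real suite the stub would be replaced by the application's agent and the assertions by the corresponding monocle_test_tools checks.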

Dual-mode quality assessment

The quality file works with or without a cloud key:

Local (default, no key) — deterministic contains_output/does_not_have_any_output assertions plus optional BERTScore semantic similarity (auto-skips if bert_score isn't pip-installed). No network, no LLM call.
Cloud (OKAHU_API_KEY set) — LLM-as-judge classification via Okahu (sentiment, toxicity, bias, hallucination, etc.). Tests are gated by pytest.mark.skipif so they skip cleanly when the key is absent. The Cloud eval is provided at no cost.
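The gating described above can be sketched with standard pytest markers. This is an illustration under assumptions, not the skill's generated file: `fake_agent_reply` is a hypothetical stand-in for the agent under test, the 0.8 similarity threshold is arbitrary, and the Okahu evaluation call is deliberately left out because its API is not shown in this PR description.

```python
import importlib.util
import os

import pytest

# Evaluated once at import time, which is how skipif conditions are applied.
HAS_OKAHU_KEY = bool(os.environ.get("OKAHU_API_KEY"))
HAS_BERT_SCORE = importlib.util.find_spec("bert_score") is not None


def quality_mode() -> str:
    """Return 'cloud' when OKAHU_API_KEY is set, otherwise 'local'."""
    return "cloud" if os.environ.get("OKAHU_API_KEY") else "local"


def fake_agent_reply(prompt: str) -> str:
    """Hypothetical stand-in for the agent under test."""
    return "Your flight to Paris is booked."


def test_local_deterministic():
    # Plain substring checks: no key, no network, no LLM call.
    reply = fake_agent_reply("Book a flight to Paris")
    assert "Paris" in reply
    assert "error" not in reply.lower()


@pytest.mark.skipif(not HAS_BERT_SCORE, reason="bert_score is not installed")
def test_local_semantic_similarity():
    from bert_score import score  # imported lazily so the file loads without it

    reply = fake_agent_reply("Book a flight to Paris")
    reference = "Your Paris flight has been booked."
    _, _, f1 = score([reply], [reference], lang="en")
    assert f1.mean().item() > 0.8  # assumed threshold


@pytest.mark.skipif(not HAS_OKAHU_KEY, reason="OKAHU_API_KEY is not set")
def test_cloud_llm_as_judge():
    # The LLM-as-judge call would go here; it is omitted in this sketch.
    assert os.environ["OKAHU_API_KEY"]
```

Without the key and without bert_score installed, pytest reports the last two tests as skipped with their stated reasons, and only the deterministic test runs.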

Scaffold monocle_test_tools pytest tests for Python AI agent apps (LangGraph, Google ADK, CrewAI, LlamaIndex, Strands). Covers seven test categories: routing, input/output validation, performance, multi-agent orchestration, individual-agent isolation, and quality assessment. Quality assessment is dual-mode: deterministic + optional BERTScore similarity locally (no API key, no LLM call); LLM-as-judge classification via Okahu cloud when OKAHU_API_KEY is set.

hocokahu force-pushed the add-agent-test-generation-skill branch from 3caa53b to abd8f01 on June 5, 2026 22:42

hocokahu wants to merge 1 commit into warpdotdev:main from hocokahu:add-agent-test-generation-skill
