maestro/source/skills/accelerate/SKILL.md at main · sharpdeveye/maestro

name	accelerate
description	Use when the workflow is too slow, too expensive, or both and needs latency, cost, or token usage optimization.
argument-hint	[target metric]
category	enhancement
version	2.0.0
user-invocable	true

MANDATORY PREPARATION

Invoke /agent-workflow — it contains workflow principles, anti-patterns, and the Context Gathering Protocol. Follow the protocol before proceeding — if no workflow context exists yet, you MUST run /teach-maestro first. Consult the context-management reference in the agent-workflow skill for window optimization and budget strategies.

Make the workflow faster and cheaper without sacrificing quality. Measure before and after.

Performance Audit

Measure current performance:

Current metrics:
  Latency (p50): ___ms
  Latency (p95): ___ms
  Cost per request: $___
  Token usage (avg): ___ input / ___ output
  Error rate: ___%
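
To fill in the template above, the percentiles and error rate can be computed from logged request data. The following is a minimal sketch that assumes you already collect per-request latencies in milliseconds and an error count; the function name and the returned keys are illustrative, not part of the skill.

```python
import statistics


def summarize_requests(latencies_ms, error_count):
    """Summarize latency percentiles and error rate for a batch of requests.

    latencies_ms: one latency sample (in ms) per request.
    error_count: how many of those requests failed.
    """
    if len(latencies_ms) < 2:
        raise ValueError("need at least two latency samples")
    # quantiles(n=100) returns the 99 cut points p1..p99, so index 49 is p50
    # and index 94 is p95. "inclusive" treats the samples as the full population.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "error_rate_pct": 100.0 * error_count / len(latencies_ms),
    }
```

Run it once before any change to record the baseline, then again after each optimization, using the same sample of requests where possible.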

Acceleration Strategies

Reduce Token Usage

Shorten system prompts (remove redundant instructions)
Compress few-shot examples to minimum viable length
Use structured output schemas instead of verbose text
Summarize context instead of passing raw documents
Reduce output length requirements
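
As one illustration of removing redundant instructions, the sketch below drops repeated lines from a system prompt and estimates the saving. Both helpers are hypothetical, and the four-characters-per-token ratio is only a rough assumption; use your provider's tokenizer for real measurements.

```python
def dedupe_instructions(system_prompt):
    """Drop repeated instruction lines, ignoring case and extra whitespace."""
    seen = set()
    kept = []
    for line in system_prompt.splitlines():
        key = " ".join(line.lower().split())
        if key and key in seen:
            continue  # an equivalent instruction was already kept
        seen.add(key)
        kept.append(line)
    return "\n".join(kept)


def rough_token_estimate(text):
    """Crude size estimate. Assumption: about 4 characters per token."""
    return max(1, len(text) // 4)
```

Comparing `rough_token_estimate` before and after gives a quick signal, but the quality check still has to confirm that nothing important was removed.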

Model Cascading

Route simple tasks to cheaper/faster models
Escalate only complex tasks to capable models
Use a classification step to determine task complexity before routing
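
The routing idea above can be sketched as a classifier followed by a lookup. The keyword heuristic, the word-count threshold, and the model names below are all placeholders chosen for illustration; a real workflow might use a small model or a trained classifier instead.

```python
# Hypothetical model identifiers; substitute the models your workflow uses.
MODEL_BY_COMPLEXITY = {
    "simple": "small-fast-model",
    "complex": "large-capable-model",
}

# Assumed markers of multi-step work; tune these against real traffic.
MULTI_STEP_MARKERS = {"then", "compare", "analyze", "plan"}


def classify_complexity(task):
    """Label a task 'simple' or 'complex' with a cheap word-level heuristic."""
    words = task.lower().split()
    if len(words) > 50 or MULTI_STEP_MARKERS.intersection(words):
        return "complex"
    return "simple"


def route(task):
    """Pick the model for a task based on its estimated complexity."""
    return MODEL_BY_COMPLEXITY[classify_complexity(task)]
```

Because misrouting a hard task to the cheaper model can lower quality, it is worth checking the router's decisions against golden tests before relying on it.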

Caching

Cache responses for identical or near-identical inputs
Cache tool results with appropriate TTL
Cache embeddings for frequently-queried documents
Use semantic caching for similar (not identical) queries
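
A minimal sketch of response caching with a TTL follows. It treats prompts that differ only in case or whitespace as identical, which is a much weaker notion of "near-identical" than semantic caching (that would need embeddings and a similarity threshold, not shown here). The class name and the injectable clock are assumptions made to keep the example testable.

```python
import hashlib
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    @staticmethod
    def key_for(prompt):
        # Normalize case and whitespace so trivially different prompts share a key.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, prompt):
        key = self.key_for(prompt)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # entry is stale; evict it
            return None
        return value

    def put(self, prompt, value):
        self._store[self.key_for(prompt)] = (self._clock() + self._ttl, value)
```

Choose the TTL per data source: a short TTL (or no caching at all) suits anything that depends on real-time data.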

Parallelization

Run independent tool calls in parallel
Run independent agent steps in parallel
Use streaming to start processing before the full response arrives
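
For independent tool calls, one way to run them concurrently in Python is `asyncio.gather`. In this sketch, `call_tool` is a stand-in that simulates an I/O-bound call with a sleep; the tool names and delays are made up.

```python
import asyncio
import time


async def call_tool(name, delay_s):
    """Stand-in for an I/O-bound tool call (hypothetical)."""
    await asyncio.sleep(delay_s)
    return f"{name}:done"


async def run_independent_calls():
    # gather schedules all three calls concurrently and returns their
    # results in the order the calls were passed in.
    return await asyncio.gather(
        call_tool("search", 0.2),
        call_tool("db_lookup", 0.2),
        call_tool("weather", 0.2),
    )


def timed_run():
    """Run the calls and report the results with wall-clock time."""
    start = time.perf_counter()
    results = asyncio.run(run_independent_calls())
    return results, time.perf_counter() - start
```

Run sequentially, the three calls would take about 0.6 s; run concurrently, total time is close to the slowest single call. This only helps when the calls do not depend on each other's output.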

Context Optimization

Retrieve less, retrieve better (improve retrieval precision)
Use context compression techniques
Implement a sliding window for long conversations
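
A sliding window over conversation history can be sketched as follows. The message format (dicts with `role` and `content`) and the whitespace-based token counter are assumptions; pass in a real tokenizer-backed counter for accurate budgets.

```python
def sliding_window(messages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep system messages plus the most recent messages that fit the budget.

    count_tokens defaults to a crude word count (an assumption, not a tokenizer).
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(count_tokens(m["content"]) for m in system)

    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = count_tokens(message["content"])
        if cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        budget -= cost
    return system + list(reversed(kept))
```

Dropping old turns can lose facts the user stated early on, so it is worth verifying with golden tests that long conversations still behave correctly.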

Acceleration Report

For each optimization:

What changed: Specific modification
Before: Latency/cost/tokens before
After: Latency/cost/tokens after
Quality impact: Any quality change (verify with golden tests)
Trade-off: What was sacrificed for the improvement
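
One way to keep report entries consistent is to store them in a small structure and derive the percentage changes. The sketch below mirrors the fields listed above; the class names, field names, and summary format are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    latency_p95_ms: float
    cost_per_request_usd: float
    avg_tokens: float


def percent_change(before, after):
    """Relative change in percent; negative means the value went down."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before * 100.0


@dataclass
class ReportEntry:
    what_changed: str
    before: Metrics
    after: Metrics
    quality_impact: str
    trade_off: str

    def summary(self):
        latency = percent_change(self.before.latency_p95_ms, self.after.latency_p95_ms)
        cost = percent_change(self.before.cost_per_request_usd, self.after.cost_per_request_usd)
        tokens = percent_change(self.before.avg_tokens, self.after.avg_tokens)
        return (
            f"{self.what_changed}: p95 latency {latency:+.1f}%, "
            f"cost {cost:+.1f}%, tokens {tokens:+.1f}% | "
            f"quality: {self.quality_impact} | trade-off: {self.trade_off}"
        )
```

Each entry then carries its own before/after numbers, which makes the checklist items easy to verify.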

Acceleration Checklist

Baseline metrics recorded before any changes
Each optimization measured with before/after comparison
Quality impact verified (golden tests still pass)
Trade-offs documented for each change
Cost/latency improvements quantified

Recommended Next Step

After optimization, run /evaluate to verify quality didn't degrade, or /iterate to set up continuous monitoring.

NEVER:

Optimize without measuring first (you need a baseline)
Sacrifice quality for speed without explicit user approval
Cache outputs that depend on real-time data
Skip the quality check after optimization
Optimize prematurely (make it correct first, then make it fast)
