diff --git a/AUTHORING.md b/AUTHORING.md index 6c6cf1e..5572da5 100644 --- a/AUTHORING.md +++ b/AUTHORING.md @@ -11,10 +11,10 @@ Skills earn their keep on repeated, opinionated workflows. Before writing one, c - **Tool-bounded.** Uses only the tools and data it truly needs. Fewer moving parts means fewer ways to fail. - **Deterministic where possible.** Same input should produce a similar output across runs. Lean on scripts for the deterministic parts. - **Short execution path.** Few steps, low latency, low token cost. Long workflows belong in a checklist or split skills. -- **Recoverable failures.** Detects errors and either retries or exits cleanly with a useful message — never leaves the user mid-state. +- **Recoverable failures.** Detects errors and either retries or exits cleanly with a useful message, and never leaves the user mid-state. - **Context-light.** Works from the user's prompt and the skill body. Doesn't require long conversation history or hidden setup. -If the task fails several of these, it is probably documentation, a runbook, or a one-off prompt — not a skill. +If the task fails several of these, it is probably documentation, a runbook, or a one-off prompt, not a skill. ## Write the description for the goal, not the mechanics @@ -25,13 +25,13 @@ The `description` is the only part of the skill always loaded into context. The The agent matches descriptions against what the user is trying to *achieve*. Internal mechanics (which library, which container, which API) belong in the body of `SKILL.md`. ```yaml -# Good — names the goal and the trigger surface +# Good: names the goal and the trigger surface description: >- Port a CUDA kernel to HIP and flag anything that needs manual review. Use when the user wants to run CUDA code on AMD GPUs, mentions hipify, HIP, ROCm porting, or asks how to convert a .cu file. 
-# Bad — describes how the skill works internally +# Bad: describes how the skill works internally description: >- Runs hipify-perl on .cu files, parses the output, and post-processes the result with regex rules. @@ -41,8 +41,8 @@ description: >- - **Third person.** The description is injected into the system prompt. Use *"Ports CUDA kernels..."*, not *"I help you port..."* or *"You can use this to..."*. - **State WHAT and WHEN.** What the skill produces, and the situations in which the agent should reach for it. -- **Include the trigger surface.** List the words and phrases a user is likely to say — product names, file extensions, API names, error messages. Missing triggers cause under-triggering. -- **Add negative triggers when boundaries are easily crossed.** *"Do not use for system-wide installs — see X instead."* +- **Include the trigger surface.** List the words and phrases a user is likely to say, including product names, file extensions, API names, and error messages. Missing triggers cause under-triggering. +- **Add negative triggers when boundaries are easily crossed.** *"Do not use for system-wide installs; see X instead."* - **Be pushy when the use case is ambiguous.** It is better to err toward being invoked than to be silently skipped. - **Stay under ~1024 characters** (the hard cap on Anthropic-compatible runtimes). @@ -75,7 +75,7 @@ Database migrations want low freedom. Code review wants high freedom. Mismatched ### Use progressive disclosure, one level deep -Link from `SKILL.md` directly to reference files. Do not chain references through intermediate files — agents may only partially read deeply nested content. +Link from `SKILL.md` directly to reference files. Do not chain references through intermediate files because agents may only partially read deeply nested content. ``` skill-name/ @@ -132,7 +132,7 @@ Test the skill the way users will hit it: 1. Run a fresh agent against ~10 prompts that *should* trigger the skill and ~10 that *shouldn't*. 
The description should route both sets correctly. 2. Run the skill end-to-end on a real machine. Watch where the agent hesitates, asks unnecessary questions, or goes off-script. -3. Bring those observations back into the skill — usually as a sharper description, a clearer default, or a missing prerequisite — rather than adding more prose. +3. Bring those observations back into the skill, usually as a sharper description, a clearer default, or a missing prerequisite, rather than adding more prose. ## Pre-publish checklist @@ -152,7 +152,7 @@ Test the skill the way users will hit it: ## Validating locally -The structural rules from this guide — frontmatter shape, name format, description length, and `SKILL.md` body size — are enforced by `scripts/validate_skills.py` and run on every pull request. Run them locally before pushing: +The structural rules from this guide (frontmatter shape, name format, description length, and `SKILL.md` body size) are enforced by `scripts/validate_skills.py` and run on every pull request. Run them locally before pushing: ```bash ./scripts/check.sh # validates every skill (same command CI runs) diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..1d65b4b --- /dev/null +++ b/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2026 dholanda + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/README.md b/README.md index 10bbb8c..bf13b7b 100644 --- a/README.md +++ b/README.md @@ -1,14 +1,30 @@ # AMD Skills -AMD Skills package the knowledge, scripts, and conventions for working with AMD hardware and software — ROCm, HIP, MIGraphX, ROCm-aware PyTorch and JAX, Instinct GPUs, Ryzen AI, Lemonade, and the broader AMD developer stack — and deliver them in a form AI coding agents can load on demand. +
-When a developer asks an agent to "set up ROCm in this container," "port this CUDA kernel to HIP," "tune this model for an MI300X," or "integrate local AI into my app," the agent pulls in an AMD-authored skill instead of guessing. +![AMD](https://img.shields.io/badge/AMD-Skills-ED1C24?logo=amd&logoColor=white) +![ROCm](https://img.shields.io/badge/ROCm-Enabled-green) +![Ryzen AI](https://img.shields.io/badge/Ryzen_AI-Ready-1F6FEB) +![Agent Skills](https://img.shields.io/badge/Agent_Skills-Format-7B2D8E) +[![Cursor](https://img.shields.io/badge/Cursor-Compatible-000000?logo=cursor&logoColor=white)](https://cursor.com) +[![Claude Code](https://img.shields.io/badge/Claude_Code-Compatible-D97757)](https://www.anthropic.com/claude-code) +[![OpenAI Codex](https://img.shields.io/badge/OpenAI_Codex-Compatible-412991)](https://openai.com/codex/) +[![Gemini CLI](https://img.shields.io/badge/Gemini_CLI-Compatible-4285F4)](https://ai.google.dev/gemini-api/docs/cli) +[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE) +AMD Skills +Give your AI agents the power of AMD's optimized ecosystem. -Skills in this repository follow the standardized [Agent Skills](https://github.com/anthropics/skills) format and are designed to interoperate with the major coding agents — Cursor, Claude Code, OpenAI Codex, and Gemini CLI. +[**Browse the Skill Catalog ->**](#the-catalog) + +
+ +AMD Skills provide agents with knowledge, scripts, and conventions for working with AMD hardware and software. + +Skills in this repository follow the standardized [Agent Skills](https://github.com/anthropics/skills) format and are designed to interoperate with major coding agents such as Cursor, Claude Code, OpenAI Codex, and Gemini CLI. ## What is a skill? -A skill is a self-contained folder that bundles everything an agent needs to perform a focused task: instructions, helper scripts, prompts, templates, and references. At its core is a `SKILL.md` file with YAML frontmatter — a `name` and a short `description` that tells the agent *when* the skill should activate — followed by the guidance the agent reads while the skill is in use. +A skill is a self-contained folder that bundles everything an agent needs to perform a focused task: instructions, helper scripts, prompts, templates, and references. At its core is a `SKILL.md` file with YAML frontmatter (a `name` and a short `description` that tells the agent *when* the skill should activate), followed by the guidance the agent reads while the skill is in use. ``` skills/ @@ -22,9 +38,9 @@ When an agent decides a skill is relevant (or you invoke it explicitly), it load ## Why a skill, not a doc? -Documentation describes an API surface — every flag, every option, neutral by design. A skill encodes the opinionated path: which flags, which container image, which `gfx` target, which environment variables, in what order. It captures the decisions a senior AMD engineer makes without thinking, in a form the agent can apply consistently across teams and repositories. +Documentation describes an API surface: every flag, every option, neutral by design. A skill encodes the opinionated path: which flags, which container image, which `gfx` target, which environment variables, in what order. 
It captures the decisions a senior AMD engineer makes without thinking, in a form the agent can apply consistently across teams and repositories. -Skills earn their keep on repeated, opinionated workflows — exactly where the AMD stack lives. +Skills earn their keep on repeated, opinionated workflows, exactly where the AMD stack lives. ## The catalog @@ -38,7 +54,7 @@ Diagnose, configure, and tune AMD silicon directly. | --- | --- | | `rocm-doctor` | Detect driver / kernel / ROCm / framework mismatches and propose fixes. | | `gfx-target-chooser` | Pick the right `gfx942` / `gfx90a` / `gfx1100` target and matching compiler flags. | -| `mi300x-tuner` | Opinionated training and inference tuning for MI300X — TunableOp, FSDP, FlashAttention. | +| `mi300x-tuner` | Opinionated training and inference tuning for MI300X, including TunableOp, FSDP, and FlashAttention. | | `rocm-container-picker` | Map a workload to a known-good `rocm/*` container image. | | `ryzen-ai-deploy` | Prepare, quantize, and deploy models to Ryzen AI NPUs across the ONNX, PyTorch, and hybrid CPU/NPU/iGPU paths. | @@ -72,11 +88,11 @@ Close the loop from trace to fix to ship. | `migraphx-deploy` | Compile an ONNX model with MIGraphX and benchmark it on a target. | | `rocm-ci-template` | Drop-in GitHub Actions for AMD-targeted projects. | -> Skills land incrementally — see [Status](#status) for what is available today. +> Skills land incrementally; see [Status](#status) for what is available today. ## A federated catalog -The AMD stack is large and moves fast. ROCm, HIP, MIGraphX, vLLM-AMD, Ryzen AI, the framework integrations — each has its own team, release cadence, and validation matrix. A single monorepo of skills, maintained by one central team, would always be a step behind. +The AMD stack is large and moves fast. ROCm, HIP, MIGraphX, vLLM-AMD, Ryzen AI, and framework integrations each have their own team, release cadence, and validation matrix. 
A single monorepo of skills, maintained by one central team, would always be a step behind. So skills here are **federated**: each skill is owned and versioned by the team that owns the product it describes, and this repository is the **catalog** that brings them together. @@ -108,11 +124,11 @@ Concretely: Each skill stays close to the engineers who ship the underlying product, the CI that validates it, and the release tag that pins it. -This repo also acts as an **incubator**: a skill can start its life under `skills/` here to iterate quickly, then graduate to its product repo and be re-pointed from `catalog/` once it has a clear owner — no change for installed users. +This repo also acts as an **incubator**: a skill can start its life under `skills/` here to iterate quickly, then graduate to its product repo and be re-pointed from `catalog/` once it has a clear owner, with no change for installed users. ### What this means for you -- **One install, full coverage.** You add this repository through the plugin flow of your agent and you get the whole AMD catalog — you do not need to track and install skills product by product. +- **One install, full coverage.** You add this repository through the plugin flow of your agent and you get the whole AMD catalog, so you do not need to track and install skills product by product. - **Skills update with the products they describe.** When ROCm cuts a new release, the ROCm team updates the ROCm skills as part of that release. You see the new behavior the next time you pull the catalog. - **Skills you can trust.** Each skill is signed off by the team that owns the underlying product, not assembled second-hand by a separate documentation team. @@ -171,19 +187,19 @@ Once a skill is installed, reference it in plain language while talking to your - "Use the `migraphx-deploy` skill to compile this ONNX model for `gfx942` and benchmark it." - "Use the `omniperf-tune` skill to find the bottleneck in this training step." 
-The agent loads the matching `SKILL.md` and any helper scripts, then carries out the task. In most cases the agent will pick the right skill on its own from the description — explicit invocation is a fallback, not a requirement. +The agent loads the matching `SKILL.md` and any helper scripts, then carries out the task. In most cases the agent will pick the right skill on its own from the description; explicit invocation is a fallback, not a requirement. ## Contributing a skill We welcome contributions from AMD engineers, partners, and the community. There are two contribution paths, matching how the catalog is organized. -### Path A — Skills authored in this repository +### Path A: Skills authored in this repository Best for cross-cutting skills that do not have a natural product home. 1. Copy an existing skill folder under `skills/` as a starting point and rename it. 2. Update the `SKILL.md` frontmatter so the `name` and `description` clearly explain *what* the skill does and *when* an agent should reach for it. -3. Add the supporting scripts, templates, and reference docs your instructions point to. Keep skills focused — one well-scoped task per skill is better than one mega-skill. +3. Add the supporting scripts, templates, and reference docs your instructions point to. Keep skills focused: one well-scoped task per skill is better than one mega-skill. 4. Register the skill in `.claude-plugin/marketplace.json` with a human-readable description. 5. Validate the skill locally before pushing: ```bash @@ -191,22 +207,22 @@ Best for cross-cutting skills that do not have a natural product home. ``` 6. Open a pull request. The `validate` GitHub Actions workflow runs `./scripts/check.sh` and must pass before merge. See [AUTHORING.md](AUTHORING.md#validating-locally) for the full set of enforced rules. 
-### Path B — Skills authored in a product repository +### Path B: Skills authored in a product repository Best for skills that should ship and version with a product (HIP, MIGraphX, Ryzen AI, vLLM-AMD, etc.). -1. Add the skill folder to your product repository — a common location is `.agents/skills//`. +1. Add the skill folder to your product repository; a common location is `.agents/skills//`. 2. Open a pull request here that adds an entry to `catalog/` pointing at the skill's location and pinning a tag. 3. CI will validate the linked skill against the same rules as in-repo skills, and the central plugin manifests will surface it through one install. ### Writing tips -See [AUTHORING.md](AUTHORING.md) for the full authoring guide — when a task is a good fit for a skill, how to write a description that routes correctly, and the conventions every AMD skill should follow. The essentials: +See [AUTHORING.md](AUTHORING.md) for the full authoring guide, including when a task is a good fit for a skill, how to write a description that routes correctly, and the conventions every AMD skill should follow. The essentials: - Optimize the `description` for *agent routing*, not marketing copy. Describe the user's goal, not how the skill works internally. - Be explicit about prerequisites: ROCm version, kernel, GPU architecture, container image. - Prefer scripts and runnable commands over prose where possible. -- Call out known pitfalls — driver mismatches, unsupported architectures, environment variables that silently change behavior. +- Call out known pitfalls: driver mismatches, unsupported architectures, and environment variables that silently change behavior. 
## Status diff --git a/assets/banner.jpg b/assets/banner.jpg new file mode 100644 index 0000000..accec6e Binary files /dev/null and b/assets/banner.jpg differ diff --git a/skills/local-ai-app-integration/SKILL.md b/skills/local-ai-app-integration/SKILL.md index 495b258..c3b12d6 100644 --- a/skills/local-ai-app-integration/SKILL.md +++ b/skills/local-ai-app-integration/SKILL.md @@ -13,8 +13,8 @@ description: >- # Local AI App Integration (Embeddable Lemonade) Add a local AI mode to an existing app that already talks to a cloud AI API -(OpenAI, Anthropic, or Ollama-compatible). The app launches `lemond` — the -Embeddable Lemonade binary — as a private subprocess and the existing client +(OpenAI, Anthropic, or Ollama-compatible). The app launches `lemond`, the +Embeddable Lemonade binary, as a private subprocess and the existing client talks to it on `http://localhost:PORT/api/v1`. The user gets local, private, hardware-optimized inference (CPU, AMD iGPU/dGPU, XDNA2 NPU) with no separate install. @@ -26,11 +26,11 @@ Use this skill when **all** of the following are true: - The app already calls a cloud AI service over HTTP (OpenAI Chat Completions, Anthropic Messages, or Ollama). - The user wants that AI to run on the end-user's PC, with the AI engine - bundled into the app — not as a separate user install. + bundled into the app, not as a separate user install. - The target platform is Windows x64 or Linux x64 (macOS embeddable is in beta). If the user instead wants a **system-wide** Lemonade Server (one install, -shared across apps), do not use this skill — point them at +shared across apps), do not use this skill; point them at `https://lemonade-server.ai/install_options.html` and the standard OpenAI base URL `http://localhost:13305/api/v1`. @@ -52,7 +52,7 @@ Track progress against this checklist. Move on only when each step verifies. --- -## Step 1 — Survey the app +## Step 1: Survey the app Find every place the app currently calls a cloud AI API. 
Search the repo for: @@ -65,16 +65,16 @@ Record three things before continuing: 1. **Client library and language** (e.g., `openai-python`, `openai-node`, `@anthropic-ai/sdk`, `go-openai`, raw `fetch`). -2. **Modalities used** — text chat, tool calling, embeddings, image gen, +2. **Modalities used:** text chat, tool calling, embeddings, image gen, transcription, TTS. This drives the model + backend choice in Step 2. 3. **One single place** where the base URL and API key are constructed. If there isn't one, refactor to one before going further. Local-mode toggling must flip exactly one config object. -## Step 2 — Pick a model + backend profile +## Step 2: Pick a model + backend profile Choose **one** default profile based on the app's primary modality. Do not -ship a buffet — ship one good default and document how the user can override +ship a buffet. Ship one good default and document how the user can override it. | App's primary need | Default model | Recipe | Why | @@ -93,7 +93,7 @@ unset. Override only if the app has hard hardware requirements. For more options and tradeoffs, see [reference.md](reference.md). -## Step 3 — Place Embeddable Lemonade in the app's tree +## Step 3: Place Embeddable Lemonade in the app's tree Get the embeddable artifact from the latest Lemonade release: @@ -118,12 +118,12 @@ vendor/lemonade/ models--unsloth--Qwen3-4B-GGUF/ ``` -**Bundle decisions — pick deliberately:** +**Bundle decisions: pick deliberately** - **Backends:** Bundle `llamacpp:vulkan` at packaging time (works on every GPU). Install `llamacpp:rocm` at first run on supported AMD systems via `POST /v1/install` after probing `GET /v1/system-info`. Never ship every - backend — the artifact balloons. + backend, or the artifact balloons. - **Models:** Either bundle the default model under `models/` (offline install, larger installer) **or** pull on first run with `POST /v1/pull` (smaller installer, needs network). Pick one and document it. 
@@ -135,7 +135,7 @@ Strip what you don't ship: delete the `lemonade` CLI and `resources/defaults.json` from the shipping artifact once `config.json` is initialized. -## Step 4 — Add a `lemond` launcher +## Step 4: Add a `lemond` launcher The launcher is a thin process supervisor. Its only jobs: @@ -227,9 +227,9 @@ async function waitForHealth(port, key, timeoutMs) { } ``` -## Step 5 — Re-point the existing client at `lemond` +## Step 5: Re-point the existing client at `lemond` -Change exactly two values in the app's existing client config — the base URL +Change exactly two values in the app's existing client config: the base URL and the API key. Nothing else. | Existing client | New `base_url` | New auth | @@ -257,7 +257,7 @@ resp = client.chat.completions.create( ) ``` -## Step 6 — Wait for health, then preload the default model +## Step 6: Wait for health, then preload the default model `lemond` lazy-loads models on first inference. To eliminate cold-start latency on the user's first message, preload right after the health check @@ -273,7 +273,7 @@ Content-Type: application/json If the model isn't downloaded yet, follow the recovery flow in Step 7. -## Step 7 — Lifecycle and recovery +## Step 7: Lifecycle and recovery These are the only failure modes worth handling. Do not over-engineer. @@ -283,13 +283,13 @@ These are the only failure modes worth handling. Do not over-engineer. 
| `/v1/load` returns 500 with backend error | Backend not installed for this hardware | `GET /v1/system-info`, pick a supported backend, `POST /v1/install` with `{"recipe": "...", "backend": "..."}`, retry | | Subprocess exits immediately | Port already in use by another `lemond` | Pick a new free port and retry once | | `/v1/health` never returns 200 | First-run backend extraction is slow on cold disk | Extend timeout to 90s on first launch, 30s after | -| HTTP 401 on every request | Forgot the `Authorization: Bearer` header | Audit the client config — Lemonade rejects unauth'd calls when `LEMONADE_API_KEY` is set | +| HTTP 401 on every request | Forgot the `Authorization: Bearer` header | Audit the client config because Lemonade rejects unauth'd calls when `LEMONADE_API_KEY` is set | **Shutdown:** On app exit, `proc.terminate()` (Unix) or `proc.kill()` (Windows). `lemond` flushes config and exits cleanly within a -couple of seconds. Always wait on the process — never orphan it. +couple of seconds. Always wait on the process; never orphan it. -**Do not** parse `lemond` stdout to detect readiness — use the HTTP +**Do not** parse `lemond` stdout to detect readiness; use the HTTP `/v1/health` probe. Stdout format is not a stable contract. --- @@ -302,7 +302,7 @@ The integration is done when **all** of these are true: - [ ] `GET /api/v1/health` returns 200 within the timeout. - [ ] The default model loads successfully via `POST /v1/load`. - [ ] The existing client's chat / image / speech call returns a valid - response with the base URL and key swapped — no other code changed. + response with the base URL and key swapped, with no other code changed. - [ ] Killing the parent process leaves no `lemond` subprocess behind. - [ ] On a fresh machine without the optimal backend, the app still works via the Vulkan fallback bundled in `bin/`. 
diff --git a/skills/local-ai-app-integration/reference.md b/skills/local-ai-app-integration/reference.md index 43745cc..cf15a9d 100644 --- a/skills/local-ai-app-integration/reference.md +++ b/skills/local-ai-app-integration/reference.md @@ -1,4 +1,4 @@ -# Local AI App Integration — Reference +# Local AI App Integration: Reference Detailed reference material for the `local-ai-app-integration` skill. Read this only when the main `SKILL.md` flow needs a decision that isn't covered @@ -32,7 +32,7 @@ hardware-optimized one at first run after a system probe. | `cpu` | x86_64 CPU | Windows, Linux | Install only if you need a non-Vulkan CPU path. | | `metal` | Apple Silicon | macOS (beta) | macOS-only path. | -### Text generation (NPU recipes — Windows only) +### Text generation (NPU recipes, Windows only) | Recipe | Backend | Hardware | Notes | |---|---|---|---| @@ -64,7 +64,7 @@ hardware-optimized one at first run after a system probe. ## Model picker by use case -Pick **one** model as the app default. Do not list options to the user — +Pick **one** model as the app default. Do not list options to the user; ship a default and document how to override. | Use case | Recommended model | Approx size | Recipe | @@ -177,7 +177,7 @@ hand-editing `config.json`, or at runtime via `POST /internal/set`. | `host` | string | Default `127.0.0.1`. **Do not** expose on `0.0.0.0` from an embedded app. | | `log_level` | enum | `trace`/`debug`/`info`/`warning`/`error`/`fatal`/`none` | | `global_timeout` | int seconds | HTTP client timeout for backend installs and pulls | -| `no_broadcast` | bool | **Set `true` for embedded apps** — disables UDP discovery beacon | +| `no_broadcast` | bool | **Set `true` for embedded apps** to disable the UDP discovery beacon | | `extra_models_dir` | string | Search path for arbitrary GGUFs (see below) | ### Deferred (apply on next load)