From 82177168eecdcbda74013d053b9d19352764e5c2 Mon Sep 17 00:00:00 2001
From: Daniel Holanda <holand.daniel@gmail.com>
Date: Thu, 28 May 2026 11:58:34 -0700
Subject: [PATCH 1/3] Refactor categories

---
 README.md | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)
diff --git a/README.md b/README.md
index e194a9e..fc9be48 100644
--- a/README.md
+++ b/README.md
@@ -18,11 +18,13 @@
 
 </div>
 
-AMD Skills give coding agents the knowledge, scripts, and conventions they need to work with AMD hardware and software. Each skill follows the standardized [Agent Skills](https://github.com/anthropics/skills) format and works with Cursor, Claude Code, OpenAI Codex, and Gemini CLI.
+AMD Skills provide agents with knowledge, scripts, and conventions for working with AMD hardware and software.
+
+Skills in this repository follow the standardized [Agent Skills](https://github.com/anthropics/skills) format and are designed to interoperate with major coding agents such as Cursor, Claude Code, OpenAI Codex, and Gemini CLI.
 
 ## Installation
 
-AMD Skills is built directly into Claude and Cursor. **No install. No setup.**
+AMD Skills are built directly into Claude and Cursor. **No install. No setup.**
 
 Just ask something like: `"Use AMD Skills to integrate local AI into my app"`.
 
@@ -62,9 +64,9 @@ Embed AMD-optimized AI into end-user applications.
 | [`local-ai-app-integration`](skills/local-ai-app-integration/SKILL.md) | Integrate local AI into cloud LLM apps for offline support, better privacy, and lower API costs. | in-repo |
 | [`local-ai-use`](skills/local-ai-use/SKILL.md) | Route image generation, text-to-speech, and speech-to-text through a local AI server to reduce token cost. | in-repo |
 
-### Hardware-native skills
+### Platform readiness
 
-Diagnose, configure, and tune AMD devices directly.
+Diagnose, configure, and ready AMD systems for AI workloads: drivers, BIOS, memory pools, `gfx` targets, and framework setup.
 
 | Skill | What it does | Source |
 | --- | --- | --- |
@@ -72,10 +74,11 @@ Diagnose, configure, and tune AMD devices directly.
 | [`rocm-doctor`](skills/rocm-doctor/SKILL.md) | Diagnose ROCm / PyTorch / llama.cpp failures on AMD GPUs against a fixed list of known misconfigurations. | in-repo |
 | `mi-tuner` | Opinionated inference tuning for MI accelerators (TunableOp, FSDP, FlashAttention). | _planned_ |
 | `gfx-target-chooser` | Pick the right `gfx942` / `gfx90a` / `gfx1100` target and matching compiler flags. | _planned_ |
+| `pytorch-rocm-setup` | Get a known-good PyTorch + ROCm stack running on a target node, end to end. | _planned_ |
 
-### Kernel optimization
+### Kernel engineering
 
-Write, tune, and reason about GPU kernels for AMD targets. All entries are federated from [`AMD-AGI/Apex`](https://github.com/AMD-AGI/Apex) at `main` (`tools/skills/`).
+Author, tune, and reason about GPU kernels for AMD targets. All entries are federated from [`AMD-AGI/Apex`](https://github.com/AMD-AGI/Apex) at `main` (`tools/skills/`).
 
 | Skill | What it does | Source |
 | --- | --- | --- |
@@ -97,9 +100,8 @@ Bring existing workloads onto AMD.
 | --- | --- | --- |
 | `cuda-to-hip` | Port CUDA kernels with `hipify` and flag anything that needs manual review. | _planned_ |
 | `vllm-rocm` | Stand up vLLM on AMD with the right environment variables and model configurations. | _planned_ |
-| `pytorch-rocm-setup` | Get a known-good PyTorch + ROCm stack running on a target node, end to end. | _planned_ |
 
-### Profiling and delivery
+### Performance & delivery
 
 Close the loop from trace to fix to ship.
 
@@ -135,15 +137,26 @@ The AMD stack is large and moves fast. ROCm, HIP, Ryzen AI, and framework integr
 
 This repo also acts as an **incubator**: a skill can start under `skills/` to iterate quickly, then graduate to its product repo and be re-pointed from `scripts/sources.yml` once it has a clear owner, with no change for installed users.
 
-- **One install, full coverage.** Add this repository through your agent's plugin flow and you get the whole AMD catalog.
-- **Skills update with the products they describe.** When ROCm cuts a release, the ROCm team updates the ROCm skills as part of that release.
-- **Skills you can trust.** Each skill is signed off by the team that owns the underlying product.
+```
+skills/                  # All skills the agent can load (in-repo skills + vendored copies of federated skills)
+.cursor-plugin/          # Cursor plugin manifest
+.claude-plugin/          # Claude Code marketplace manifest
+.github/workflows/       # CI for validating skills and the `import-external-skills` workflow
+scripts/                 # Tooling for publishing, regenerating manifests, and importing
+scripts/sources.yml      # Master list of external skill sources for federation
+```
 
-Each vendored skill carries a `.federated.json` marker that records the upstream repo and pinned commit, so the importer can refresh or remove it without disturbing in-repo skills.
+In-repo skills are authored directly under `skills/`. Federated skills are
+declared in [`scripts/sources.yml`](scripts/sources.yml) and vendored into
+`skills/` by the manually dispatched `import-external-skills` workflow,
+which opens a pull request with the imported copies. Each vendored skill
+carries a `.federated.json` marker that records the upstream repo and
+pinned commit, so the importer can refresh or remove it without disturbing
+in-repo skills.
 
-## Manual installation
+## Manual installation
 
-AMD Skills are compatible with Cursor, Claude Code, OpenAI Codex, and Gemini CLI.
+AMD Skills are compatible with Cursor, Claude Code, OpenAI Codex, and Gemini CLI. Follow the steps for your agent.
 
 ### Cursor
 
@@ -160,7 +173,7 @@ Register this repository as a plugin marketplace, then install individual skills
 
 ### OpenAI Codex
 
-Copy or symlink the desired folders from `skills/` into one of Codex's standard skill locations (for example `$REPO_ROOT/.agents/skills` or `$HOME/.agents/skills`). Codex discovers `SKILL.md` files automatically.
+Copy or symlink the desired folders from `skills/` into one of Codex's standard skill locations (for example `$REPO_ROOT/.agents/skills` or `$HOME/.agents/skills`). Codex will discover the `SKILL.md` files automatically.
 
 ### Gemini CLI
 
@@ -172,7 +185,7 @@ gemini extensions install https://github.com/amd/skills.git --consent
 
 ## Using a skill
 
-Reference it in plain language while talking to your agent. The agent loads the matching `SKILL.md` and any helper scripts, then carries out the task.
+Once a skill is installed, reference it in plain language while talking to your agent. For example:
 
 - "Use AMD Skills to integrate local AI capabilities into my app with Embeddable Lemonade."
 - "Use AMD Skills to convert these CUDA kernels and flag anything that needs manual review."

From f0b97fe7f699d9994ece72802e2b00fc387ddcdb Mon Sep 17 00:00:00 2001
From: Daniel Holanda <holand.daniel@gmail.com>
Date: Thu, 28 May 2026 12:00:57 -0700
Subject: [PATCH 2/3] Add new skills to README

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index fc9be48..00a7424 100644
--- a/README.md
+++ b/README.md
@@ -72,7 +72,6 @@ Diagnose, configure, and ready AMD systems for AI workloads: drivers, BIOS, memo
 | --- | --- | --- |
 | [`apu-memory-tuner`](skills/apu-memory-tuner/SKILL.md) | Inspect and tune the shared-vs-dedicated memory split (GTT / UMA Frame Buffer) on AMD Ryzen APUs. | in-repo |
 | [`rocm-doctor`](skills/rocm-doctor/SKILL.md) | Diagnose ROCm / PyTorch / llama.cpp failures on AMD GPUs against a fixed list of known misconfigurations. | in-repo |
-| `mi-tuner` | Opinionated inference tuning for MI accelerators (TunableOp, FSDP, FlashAttention). | _planned_ |
 | `gfx-target-chooser` | Pick the right `gfx942` / `gfx90a` / `gfx1100` target and matching compiler flags. | _planned_ |
 | `pytorch-rocm-setup` | Get a known-good PyTorch + ROCm stack running on a target node, end to end. | _planned_ |
 
@@ -100,6 +99,7 @@ Bring existing workloads onto AMD.
 | --- | --- | --- |
 | `cuda-to-hip` | Port CUDA kernels with `hipify` and flag anything that needs manual review. | _planned_ |
 | `vllm-rocm` | Stand up vLLM on AMD with the right environment variables and model configurations. | _planned_ |
+| `serving-llms-on-instinct` | Deploy LLM inference on AMD Instinct GPUs end to end: detect hardware (or onboard via AMD Developer Cloud), validate model fit, apply the right vLLM recipe, and launch a benchmarked endpoint. SGLang support and engine/backend selection are planned for later phases. | _planned_ |
 
 ### Performance & delivery
 

From 4ebefe610f0a233aecf485141d2a4f43d4acf655 Mon Sep 17 00:00:00 2001
From: Daniel Holanda <holand.daniel@gmail.com>
Date: Thu, 28 May 2026 12:04:00 -0700
Subject: [PATCH 3/3] Adjust catalog

---
 README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 00a7424..bcc91f5 100644
--- a/README.md
+++ b/README.md
@@ -52,6 +52,11 @@ Skills earn their keep on repeated, opinionated workflows, exactly where the AMD
 
 ## The catalog
 
+> [!IMPORTANT]
+> **The catalog is under active development.** Skills, categories, and descriptions are changing fast. Expect entries to appear, move, and get renamed without notice.
+>
+> **Target: ready for testing by June 12.** Until then, treat anything below as a preview.
+
 The initial catalog is organized into five focus areas.
 
 
@@ -108,7 +113,6 @@ Close the loop from trace to fix to ship.
 | Skill | What it does | Source |
 | --- | --- | --- |
 | [`rocprof-compute`](skills/rocprof-compute/SKILL.md) | Profile AMD GPU kernels with `rocprof-compute` to collect metrics, roofline data, and bottleneck analysis. | [Apex](https://github.com/AMD-AGI/Apex) |
-| `rocprof-capture` | Capture and interpret a `rocprof` trace for a workload. | _planned_ |
 | `omniperf-tune` | Run `omniperf`, locate the bottleneck, and suggest the fix. | _planned_ |
 | `quark-quantize` | Quantize PyTorch / ONNX models with [AMD Quark](https://github.com/amd/Quark) and export for AMD deployment. | _planned_ |