Revert "Revert "Rework skill validator script (#11)""#16
Revert "Revert "Rework skill validator script (#11)""#16
Conversation
There was a problem hiding this comment.
Pull request overview
This PR reverts #15 by switching skill validation back to a “validate all skills” approach, moving validation logic into tools/validate-skills.sh, and updating CI and contributor documentation accordingly.
Changes:
- Add
tools/validate-skills.shto validate every skill directory (with CI vs local output differences). - Update GitHub Actions workflow to run the new script and validate all skills.
- Add
CONTRIBUTING.mdguidance for structural and LLM-based skill testing, and remove the prior.github/scripts/validate-skills.sh.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
tools/validate-skills.sh |
New validation script that validates all skills and writes CI output to the job summary. |
CONTRIBUTING.md |
Documents how to add and test skills, referencing the new validation script. |
.github/workflows/validate-skills.yml |
Updates CI to run the new validation script and validate all skills on PRs. |
.github/scripts/validate-skills.sh |
Removed prior CI-focused “validate changed skills” script. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
dacharyc
left a comment
There was a problem hiding this comment.
Thanks for making the PR again, @nirinchev - I think there are some philosophical details to iron out around what changes we may want to make to best enable contributors.
There was a problem hiding this comment.
I didn't get a chance to ask on the original PR, but can you say more about the use case for this move? The original PR description only says:
Moves validate-skills.sh script to tools so that it can more easily be accessed outside of github actions.
If the desire is to run the skill-validator tool outside of CI, I think a separate script with its own entrypoint makes more sense. This script is optimized for CI usage, including these three flags which I would probably not run for local checks or other purposes: --strict --emit-annotations -o markdown
If the desire is to run the skill-validator tool on multiple skills, the script isn't needed at all - skill-validator already handles multi-skill validation. You can just invoke it with skill-validator check skills/ and it will check all skills in the directory.
And finally, if you want a script entrypoint for the tool to ensure the tool is installed before running the script, I'd probably recommend using the enterprise version of the tool, skill-validator-ent, which I wrote as a wrapper around the core lib that works with our AWS Bedrock model provider/identity management. That's the one I refer to in the review-skill skill for our use here. The CI predated development of skill-validator-ent, and also doesn't use any of the enterprise functionality, so I didn't change it here - but if the use case is a script for people to run locally, we should change it.
There was a problem hiding this comment.
My thinking was mostly to align what we use locally and what's on CI. While we do output problems on CI in the summary section, it's always easier to be able to run whatever CI is doing and see the impact of your changes without pushing. And while I know the script is a simple wrapper around the skill-validator cli, I believe it still offers some value by abstracting away the API and helping with the installation process. Again, I know that this is information people could discover by reading what the script does, then replicating it manually, I wanted to address the case "Oh my CI failed, let me make some changes, let me run the same command CI is using locally and see if it succeeds".
On the enterprise cli - I have no opinions, fine with moving to that or to some other tool, my main goal is to have the same scripts run on CI available locally so that contributors can validate their work without having to copy-paste commands.
|
|
||
| - name: Validate changed skills | ||
| run: bash .github/scripts/validate-skills.sh "${{ github.base_ref }}" | ||
| - name: Validate all skills |
There was a problem hiding this comment.
This behavior was intentional. I only want the CI to validate a changed skill because I don't want the CI to block merging a changed skill if some other skill is now failing. If we want to have some other CI to run a scheduled check to confirm all skills are still known good, I'd recommend setting that up as a separate CI run.
There was a problem hiding this comment.
I have made the validate check required, which means there should be no skills in main that are failing validation. It's not clear to me what we're gaining here by only checking the changed files.
|
|
||
| on: | ||
| pull_request: | ||
| paths: |
There was a problem hiding this comment.
I only want to run the validator when a PR has changed skills - I don't want to run it when PRs touch unrelated infrastructure. This scope gate lets us avoid unnecessary CI runs.
There was a problem hiding this comment.
The CI run is super cheap - half a minute where the majority of the time is spent on the setup. By making the check only run on skill changes, we cannot make the validate check required for merging as there's going to be PRs where it doesn't run.
|
|
||
| ### Structural testing | ||
|
|
||
| Use the `tools/validate-skills.sh` script to test the structural validity of the skill. This script uses the [`skill-validator`](https://github.com/agent-ecosystem/skill-validator) tool to check the skill's metadata and instructions. |
There was a problem hiding this comment.
I don't think we suggest contributors use a tools/validate-skills.sh script for structural validation. I think we should advise folks to use the review-skill skill. The review-skill skill runs the structural validation checks, performs some additional quality checks, and provides optional LLM-as-judge checks, and provides details interpreted by the agent using framing to help SMEs understand what they need to change based on the output of the structural and other checks.
Running the bare tool is fine for folks who are very familiar with what it measures and how to apply it, but I don't think it's contributor-friendly for SMEs who may not be familiar with what a Skill needs or how to add it.
There was a problem hiding this comment.
This is the script being run on CI and this is what's going to gate PR merges - SMEs should be capable of running the same scripts CI runs to unblock themselves and resolve problems raised by it. I'm happy to also suggest running the review-skill skill, but anything that runs on CI, we should be able to run locally.
|
Closing this in favor of upcoming PR. |
Reverts #15