Pull request overview
Adds new reusable GitHub Actions/workflows to support automated “autosolve” (assess + implement) flows and changelog-driven release tagging, along with a lightweight bash test harness and CI wiring for the repo.
Changes:
- Introduce `autotag-from-changelog` composite action + script + tests to create/push tags based on `CHANGELOG.md`.
- Add `autosolve` composite actions (`assess`, `implement`), shared bash utilities, prompts, and reusable workflows (Jira + GitHub Issue).
- Add bash test framework (`test.sh`, `test_helpers.sh`) plus CI workflow to run tests on PRs.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| `test_helpers.sh` | Adds shared bash assertions/helpers for repository test scripts. |
| `test.sh` | Adds a simple test runner that discovers and executes `*_test.sh` files. |
| `autotag-from-changelog/auto-tag-release.sh` | Implements tagging logic based on `CHANGELOG.md` state. |
| `autotag-from-changelog/auto-tag-release_test.sh` | Tests tagging behavior using temporary git repos. |
| `autotag-from-changelog/action.yml` | Composite action wrapper for the changelog autotag script. |
| `autosolve/scripts/shared.sh` | Shared autosolve functions (validation, prompt building, result parsing, CLI install). |
| `autosolve/scripts/shared_test.sh` | Unit tests for shared autosolve functions. |
| `autosolve/scripts/assess.sh` | Runs read-only Claude assessment and extracts structured outputs. |
| `autosolve/scripts/assess_test.sh` | Tests assess output formatting/extraction behavior. |
| `autosolve/scripts/implement.sh` | Runs Claude implementation, security validation, push+PR creation, and output plumbing. |
| `autosolve/scripts/implement_test.sh` | Tests `security_check` behavior in a temporary git repo. |
| `autosolve/scripts/jira.sh` | Jira prompt building, commenting, transitions, and final status helpers. |
| `autosolve/scripts/jira_test.sh` | Tests non-HTTP Jira helper functions. |
| `autosolve/run_step.sh` | Entry-point wrapper to run autosolve script functions from the workspace CWD. |
| `autosolve/prompts/security-preamble.md` | System/security preamble injected into prompts. |
| `autosolve/prompts/assessment-footer.md` | Standardizes assessment output markers. |
| `autosolve/prompts/implementation-footer.md` | Standardizes implementation output markers and instructions. |
| `autosolve/assess/action.yml` | Composite action wiring for assess flow. |
| `autosolve/implement/action.yml` | Composite action wiring for implement flow (incl. security check and PR creation). |
| `actions_helpers.sh` | Adds common logging + GitHub Actions output helpers. |
| `actions_helpers_test.sh` | Tests for `actions_helpers.sh` helpers. |
| `README.md` | Documents new actions/workflows and local development/testing. |
| `CLAUDE.md` | Adds repo conventions and guidance for Claude-driven workflows and testing. |
| `CHANGELOG.md` | Adds entries describing the new actions/workflows. |
| `.shellcheckrc` | Configures shellcheck behavior for repo sourcing patterns. |
| `.github/workflows/test.yml` | Adds CI job to run `./test.sh` on PRs. |
| `.github/workflows/jira-autosolve.yml` | Adds reusable Jira autosolve workflow that composes assess+implement. |
| `.github/workflows/github-issue-autosolve.yml` | Adds reusable GitHub Issue autosolve workflow (assess+implement + commenting/label mgmt). |
Feedback from testing in ccloud-private-automation-testing

While building a test workflow that uses the composite actions directly (not the reusable workflows), I noticed the README's auth documentation is a bit thin for direct action usage. The reusable workflows ([...]

This works fine once you read [...]

    # Direct action usage with Vertex AI
    - uses: google-github-actions/auth@v3
      with:
        project_id: my-project
        service_account: my-sa@my-project.iam.gserviceaccount.com
        workload_identity_provider: projects/.../providers/...
    - uses: cockroachdb/actions/autosolve/assess@v1
      env:
        CLAUDE_CODE_USE_VERTEX: "1"
        ANTHROPIC_VERTEX_PROJECT_ID: my-project
        CLOUD_ML_REGION: us-east5
      with:
        prompt: "Fix the bug"
Bug: status reports SUCCESS when push_and_pr fails

When the implementation succeeds and security check passes, but the [...]

From a test run log: [...]

The implementation step succeeded (Claude created the file), security check passed, but [...]

    # implement.sh:set_implement_outputs()
    if [ "$impl_result" = "SUCCESS" ] && [ "$security_conclusion" != "failure" ]; then
      status="SUCCESS"

It doesn't check whether [...]

Suggested fix: pass [...]
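A minimal sketch of the direction this bug report points in, assuming the outcome of the push/PR step is made available next to the other two results. The function name `compute_status`, the `pr_conclusion` argument, and the `FAILED` value are illustrative assumptions, not names confirmed by the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): derive the final status from all
# three stages instead of only the implementation and security results.
compute_status() {
  local impl_result="$1" security_conclusion="$2" pr_conclusion="$3"
  if [ "$impl_result" = "SUCCESS" ] \
    && [ "$security_conclusion" != "failure" ] \
    && [ "$pr_conclusion" != "failure" ]; then
    echo "SUCCESS"
  else
    # "FAILED" is an assumed failure label for this sketch.
    echo "FAILED"
  fi
}

# Implementation and security check passed, but the push/PR step failed.
compute_status "SUCCESS" "success" "failure"
```

With the third condition in place, a failed push or PR creation can no longer be reported as an overall success.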
Bug: summary output is always empty (AUTOSOLVE_TMPDIR not shared between steps)

The assessment summary (and likely the implementation summary too) is always empty because [...]

Each step in a composite action runs in a new shell.

    # run_step.sh
    if [ -z "${AUTOSOLVE_TMPDIR:-}" ]; then
      AUTOSOLVE_TMPDIR="$(mktemp -d "${TMPDIR:-/tmp}/autosolve_XXXXXX")"
      export AUTOSOLVE_TMPDIR
    fi

So what happens: [...]

Observed output: [...]

The [...]

Possible fix: write [...]

    if [ -z "${AUTOSOLVE_TMPDIR:-}" ]; then
      AUTOSOLVE_TMPDIR="$(mktemp -d "${TMPDIR:-/tmp}/autosolve_XXXXXX")"
      export AUTOSOLVE_TMPDIR
      echo "AUTOSOLVE_TMPDIR=$AUTOSOLVE_TMPDIR" >> "${GITHUB_ENV:-/dev/null}"
    fi
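The behavior described in this report can be reproduced locally. The sketch below simulates two steps as separate shell processes: an `export` made in the first is invisible to the second, while a value appended to an env file can be picked up again. The temp file standing in for `$GITHUB_ENV` and the manual read-back are assumptions of the simulation; on a real runner, GitHub Actions loads that file between steps.

```shell
#!/usr/bin/env bash
# Stand-in for the file that GitHub Actions exposes to each step as $GITHUB_ENV.
github_env="$(mktemp)"

# "Step 1": a separate shell process creates the directory and records it.
bash -c '
  dir="$(mktemp -d "${TMPDIR:-/tmp}/autosolve_XXXXXX")"
  export AUTOSOLVE_TMPDIR="$dir"        # visible only inside this process
  echo "AUTOSOLVE_TMPDIR=$dir" >> "$1"  # survives after the process exits
' _ "$github_env"

# "Step 2" without the env file: a fresh shell does not see the export.
without_file="$(env -u AUTOSOLVE_TMPDIR bash -c 'echo "${AUTOSOLVE_TMPDIR:-}"')"

# "Step 2" with the env file: the recorded value is read back and passed in,
# emulating what the runner does before starting the next step.
recorded="$(sed -n 's/^AUTOSOLVE_TMPDIR=//p' "$github_env")"
with_file="$(AUTOSOLVE_TMPDIR="$recorded" bash -c 'echo "$AUTOSOLVE_TMPDIR"')"

echo "without env file: '${without_file}'"
echo "with env file:    '${with_file}'"

# Clean up the temporary directory and file created by the simulation.
rm -rf "$recorded" "$github_env"
```

The first line of output shows an empty value, which matches the symptom in the report: each step creates its own temp directory and the later step looks in the wrong place.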
Docs: callers should checkout the PR base branch, not the trigger ref

The autosolve actions work on whatever is already checked out. When a caller uses [...]

For example, if the workflow runs from branch [...]

This also causes a downstream problem: if the branch has workflow file modifications relative to the fork's main, GitHub requires the [...]

The fix in the caller workflow is simple:

    - uses: actions/checkout@v5
      with:
        ref: main  # checkout the PR base branch, not the trigger ref
        fetch-depth: 0
        persist-credentials: false  # prevent checkout's credential helper from interfering with fork push

Worth documenting in the README examples, especially for [...]

Also noting that [...]
Cherry-picked actions_helpers.sh and test_helpers.sh from PR cockroachdb#5 (CNSL-1944 autosolve branch) to support the release-version-extract action's log_error and set_output functions.
- Added comprehensive test workflow covering:
  - Valid changelog format validation
  - Breaking change detection (full mode)
  - Version ordering validation (valid and invalid)
  - Date ordering validation
  - Invalid changelog format handling
  - Multiple version validation depth
  - Breaking change indentation handling
- Cherry-picked actions_helpers.sh and test_helpers.sh from PR cockroachdb#5 (CNSL-1944 autosolve branch) to support the validate_version_order.sh script's log_error and set_output functions.
Feature request: actions should write to $GITHUB_STEP_SUMMARY

Currently callers need to add their own summary step to get results visible in the Actions UI. The composite actions could write to [...]

This way every caller gets a nice summary in the Actions UI for free without extra workflow boilerplate. Example of what it could look like:

    # In set_assess_outputs or set_implement_outputs
    {
      echo "## Autosolve Result"
      echo "**Assessment:** $assessment"
      echo "**Implementation:** $status"
      if [ -n "$pr_url" ]; then
        echo "**PR:** $pr_url"
      fi
      if [ -n "$summary" ]; then
        echo "### Summary"
        echo "$summary"
      fi
    } >> "${GITHUB_STEP_SUMMARY:-/dev/null}"

The [...]
Bug:
Addressed all feedback, thanks for the thorough testing!

Bug: [...]

Bug: status reports SUCCESS when [...]

Docs: auth for direct action usage. Added an "Authentication" section to the README with a Vertex AI example showing the env vars needed for direct composite action usage.

Docs: callers should checkout the PR base branch. Added a "Caller checkout" section documenting [...]

Feature: [...]

Bug: [...]

Also cleaned up [...]
Bug: reusable workflow shell steps can't find scripts when called cross-repo

The [...]

    - name: Build prompt
      run: ${{ github.workspace }}/autosolve/run_step.sh github_issues build_github_issue_prompt

When called from a different repo, [...]

Error: [...]

The composite action [...]

Affected steps: Build prompt, Remove label, Set final status, Comment on issue.

Possible fixes: [...]
Bug: reusable workflow shell steps can't find scripts cross-repo. Fixed. Created internal composite actions [...]

Both workflows ([...]
Follow-up: local action paths still resolve to caller's workspace

The fix for the [...]

Error from the latest test run: [...]

The reusable workflow checks out [...]

Possible fixes: [...]

Option 3 is probably cleanest since it's how the composite actions ([...]

Test repo: [...]
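The path problem running through this thread comes down to where a script is looked up: the caller's workspace does not contain the actions repo's files. As a hedged illustration only (not one of the options proposed in the thread), the sketch below shows a script locating its own directory via `BASH_SOURCE[0]`, so sibling files resolve correctly whatever the current directory is. The temp directories and the `where_am_i.sh` script are hypothetical stand-ins; in a composite action, the `github.action_path` context plays the role of the known starting path.

```shell
#!/usr/bin/env bash
# Hypothetical layout: one temp directory stands in for the checked-out
# actions repo, another for the calling repository's workspace.
actions_repo="$(mktemp -d)"
caller_workspace="$(mktemp -d)"
mkdir -p "$actions_repo/autosolve"

# Illustrative script that resolves its own directory instead of relying on
# the current working directory or the caller's workspace path.
cat > "$actions_repo/autosolve/where_am_i.sh" <<'EOF'
#!/usr/bin/env bash
script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
echo "$script_dir"
EOF
chmod +x "$actions_repo/autosolve/where_am_i.sh"

# Invoke it from the caller's workspace, as a cross-repo step would.
resolved="$(cd "$caller_workspace" && "$actions_repo/autosolve/where_am_i.sh")"
expected="$(cd "$actions_repo/autosolve" && pwd -P)"

echo "resolved script dir: $resolved"
```

The caller's workspace never gains an `autosolve/` directory, which is why a path built from the workspace root fails while a path anchored at the script's own location does not.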
Pull request overview
Adds a new “autosolve” automation suite (assess + implement) for automatically evaluating and attempting fixes via Claude Code, along with supporting scripts, prompts, reusable workflows, and a lightweight bash test harness.
Changes:
- Introduces `autosolve/assess` and `autosolve/implement` composite actions plus shared step runner/scripts.
- Adds security/prompt templates and GitHub Issue autosolve reusable workflow integration.
- Standardizes bash testing via `test_helpers.sh` and expands test discovery to include root-level `*_test.sh`.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `test_helpers.sh` | New shared bash test helper functions (`expect_success`, `expect_failure`, `print_results`). |
| `test.sh` | Updates test discovery depth and uses `BASH_SOURCE[0]` for script dir. |
| `autotag-from-changelog/auto-tag-release_test.sh` | Refactors tests to use shared test helpers. |
| `autotag-from-changelog/auto-tag-release.sh` | Switches error logging to shared `actions_helpers.sh` helpers. |
| `autosolve/scripts/shared_test.sh` | Adds unit tests for `shared.sh` functions. |
| `autosolve/scripts/shared.sh` | Adds shared autosolve functions (validation, prompt building, result parsing, final status). |
| `autosolve/scripts/implement_test.sh` | Adds unit tests for `implement.sh` security/output behavior. |
| `autosolve/scripts/implement.sh` | Adds implementation flow, blocked-path security checks, PR creation, and outputs. |
| `autosolve/scripts/github_issues.sh` | Adds GitHub Issues prompt builder + issue comment/label helpers. |
| `autosolve/scripts/assess_test.sh` | Adds unit tests for assessment outputs. |
| `autosolve/scripts/assess.sh` | Adds assessment flow (read-only tools) and outputs. |
| `autosolve/run_step.sh` | Adds a step entrypoint to source scripts and invoke functions from workspace cwd. |
| `autosolve/prompts/security-preamble.md` | Adds a security preamble injected into prompts. |
| `autosolve/prompts/implementation-footer.md` | Adds implementation instructions + required success/failed marker. |
| `autosolve/prompts/assessment-footer.md` | Adds assessment instructions + required proceed/skip marker. |
| `autosolve/implement/action.yml` | Defines the composite implement action interface and steps. |
| `autosolve/assess/action.yml` | Defines the composite assess action interface and steps. |
| `actions_helpers_test.sh` | Adds tests for action helper logging/output helpers. |
| `actions_helpers.sh` | Adds shared logging + `GITHUB_OUTPUT` helper functions. |
| `README.md` | Documents new autosolve actions and reusable workflows. |
| `CLAUDE.md` | Updates repo conventions for changelog ordering, sourcing patterns, and shell style guidance. |
| `CHANGELOG.md` | Adds changelog entries for the new autosolve functionality. |
| `.shellcheckrc` | Adds ShellCheck config for external sources and `SCRIPTDIR` source path resolution. |
| `.github/workflows/github-issue-autosolve.yml` | Adds reusable workflow for GitHub Issue-driven autosolve runs. |
fantapop left a comment
I've reviewed this thing line by line and left a bunch of comments
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@linhcrl I think this one is ready for your review. I'm sorry, it's somewhat gigantic. I have personally gone over it a few times now and done multiple rounds of testing. Please feel free to comment on anything that you think could be clearer, or is unnecessary. There is one more thing I'm going to test manually right now which is specifying a skill to run.
Summary

- [...] `github-issue-autosolve` workflow that uses Claude Code to automatically assess and resolve GitHub issues. Triggered via label, it runs a two-phase pipeline: assess (read-only evaluation of whether the task is suitable) then implement (make changes, security-check, push to a fork, and open a draft PR).
- [...] `autosolve/assess` and `autosolve/implement`) that can also be used independently in custom workflows.
- [...] `actions_helpers.sh`), a test framework (`test_helpers.sh`), and shellcheck config (`.shellcheckrc`) to support the repo's growing script base.
- [...] `.github/workflows/`), with symlink-traversal detection.

Test plan

- `./test.sh` passes all 62 tests (helpers, assess, implement, shared, autotag)

Co-Authored-By: Claude
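The summary mentions blocked paths with symlink-traversal detection. The sketch below illustrates the general idea under stated assumptions; it is not the check in `implement.sh`. The function name `is_blocked`, its arguments, and the demo layout are hypothetical. It resolves symlinks in a changed file's parent directory before comparing against a blocked prefix, so a path that reaches `.github/workflows/` through a symlinked directory is still caught. It does not handle the case where the file itself is a symlink.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if rel_path, after resolving symlinked
# parent directories, falls under the blocked prefix; 1 if not; 2 on error.
is_blocked() {
  local repo_root="$1" rel_path="$2" blocked="$3"
  local root_real parent_real resolved
  root_real="$(cd "$repo_root" && pwd -P)" || return 2
  # Resolve the parent directory physically; the file itself may not exist yet.
  parent_real="$(cd "$repo_root/$(dirname "$rel_path")" 2>/dev/null && pwd -P)" || return 2
  # Rebuild the path relative to the repository root.
  resolved="${parent_real#"$root_real"}/$(basename "$rel_path")"
  resolved="${resolved#/}"
  case "$resolved" in
    "$blocked"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Demo repository: "innocent" is a symlink that points into the blocked directory.
repo="$(mktemp -d)"
mkdir -p "$repo/.github/workflows" "$repo/src"
ln -s .github/workflows "$repo/innocent"

if is_blocked "$repo" "innocent/ci.yml" ".github/workflows/"; then
  echo "innocent/ci.yml resolves into a blocked path"
fi
```

A plain string comparison on `innocent/ci.yml` would miss this case, which is the reason to resolve the physical path first.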