Generic autosolve github workflow for automated issue resolution #5

Open
fantapop wants to merge 2 commits into main from
CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution

Conversation

@fantapop
Contributor

@fantapop fantapop commented Mar 13, 2026

Summary

  • Add a reusable github-issue-autosolve workflow that uses Claude Code to automatically assess and resolve GitHub issues. Triggered via label, it runs a two-phase pipeline: assess (read-only evaluation of whether the task is suitable) then implement (make changes, security-check, push to a fork, and open a draft PR).
  • Add two composite actions (autosolve/assess and autosolve/implement) that can also be used independently in custom workflows.
  • Add shared shell helpers (actions_helpers.sh), a test framework (test_helpers.sh), and shellcheck config (.shellcheckrc) to support the repo's growing script base.
  • Security: blocked-path enforcement prevents Claude from modifying sensitive paths (e.g., .github/workflows/), with symlink-traversal detection.
  • Auth: supports both Anthropic API key and Vertex AI (via Workload Identity Federation).
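The blocked-path bullet above can be illustrated with a small sketch. This is not the code from implement.sh; the function name `is_blocked_path` and its argument order are assumptions made for illustration. It checks a changed path against blocked prefixes twice: once literally, and once after resolving symlinks in the parent directory. It does not handle the case where the final path component is itself a symlink.

```shell
#!/usr/bin/env bash
# is_blocked_path REPO_ROOT PATH PREFIX...   (hypothetical helper, not from the PR)
# Returns 0 if PATH (relative to REPO_ROOT) falls under any blocked PREFIX,
# either literally or after resolving symlinks in its parent directory.
is_blocked_path() {
  local repo_root="$1" path="$2"
  shift 2
  local parent resolved_parent resolved physical_root prefix

  parent="$(dirname -- "$path")"
  physical_root="$(cd -- "$repo_root" && pwd -P)"

  # Resolve the parent directory physically; fall back to the literal path
  # when the parent does not exist (for example, a deleted file).
  if resolved_parent="$(cd -- "$repo_root/$parent" 2>/dev/null && pwd -P)"; then
    resolved="$resolved_parent/$(basename -- "$path")"
  else
    resolved="$repo_root/$path"
  fi
  # Make the resolved path relative to the repository root again.
  resolved="${resolved#"$physical_root"/}"

  for prefix in "$@"; do
    case "$path" in "$prefix"*) return 0 ;; esac
    case "$resolved" in "$prefix"*) return 0 ;; esac
  done
  return 1
}
```

With a symlink `sneaky -> .github/workflows`, a change to `sneaky/ci.yml` is reported as blocked even though the literal path does not start with `.github/workflows/`.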

Test plan

Co-Authored-By: Claude

@fantapop fantapop requested a review from Copilot March 13, 2026 23:06
@fantapop fantapop changed the title from "Cnsl 1944 generic autosolve git hub workflow for automated issue resolution" to "Generic autosolve github workflow for automated issue resolution" Mar 13, 2026

Copilot AI left a comment

Pull request overview

Adds new reusable GitHub Actions/workflows to support automated “autosolve” (assess + implement) flows and changelog-driven release tagging, along with a lightweight bash test harness and CI wiring for the repo.

Changes:

  • Introduce autotag-from-changelog composite action + script + tests to create/push tags based on CHANGELOG.md.
  • Add autosolve composite actions (assess, implement), shared bash utilities, prompts, and reusable workflows (Jira + GitHub Issue).
  • Add bash test framework (test.sh, test_helpers.sh) plus CI workflow to run tests on PRs.

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| test_helpers.sh | Adds shared bash assertions/helpers for repository test scripts. |
| test.sh | Adds a simple test runner that discovers and executes *_test.sh files. |
| autotag-from-changelog/auto-tag-release.sh | Implements tagging logic based on CHANGELOG.md state. |
| autotag-from-changelog/auto-tag-release_test.sh | Tests tagging behavior using temporary git repos. |
| autotag-from-changelog/action.yml | Composite action wrapper for the changelog autotag script. |
| autosolve/scripts/shared.sh | Shared autosolve functions (validation, prompt building, result parsing, CLI install). |
| autosolve/scripts/shared_test.sh | Unit tests for shared autosolve functions. |
| autosolve/scripts/assess.sh | Runs read-only Claude assessment and extracts structured outputs. |
| autosolve/scripts/assess_test.sh | Tests assess output formatting/extraction behavior. |
| autosolve/scripts/implement.sh | Runs Claude implementation, security validation, push+PR creation, and output plumbing. |
| autosolve/scripts/implement_test.sh | Tests security_check behavior in a temporary git repo. |
| autosolve/scripts/jira.sh | Jira prompt building, commenting, transitions, and final status helpers. |
| autosolve/scripts/jira_test.sh | Tests non-HTTP Jira helper functions. |
| autosolve/run_step.sh | Entry-point wrapper to run autosolve script functions from the workspace CWD. |
| autosolve/prompts/security-preamble.md | System/security preamble injected into prompts. |
| autosolve/prompts/assessment-footer.md | Standardizes assessment output markers. |
| autosolve/prompts/implementation-footer.md | Standardizes implementation output markers and instructions. |
| autosolve/assess/action.yml | Composite action wiring for assess flow. |
| autosolve/implement/action.yml | Composite action wiring for implement flow (incl. security check and PR creation). |
| actions_helpers.sh | Adds common logging + GitHub Actions output helpers. |
| actions_helpers_test.sh | Tests for actions_helpers.sh helpers. |
| README.md | Documents new actions/workflows and local development/testing. |
| CLAUDE.md | Adds repo conventions and guidance for Claude-driven workflows and testing. |
| CHANGELOG.md | Adds entries describing the new actions/workflows. |
| .shellcheckrc | Configures shellcheck behavior for repo sourcing patterns. |
| .github/workflows/test.yml | Adds CI job to run ./test.sh on PRs. |
| .github/workflows/jira-autosolve.yml | Adds reusable Jira autosolve workflow that composes assess+implement. |
| .github/workflows/github-issue-autosolve.yml | Adds reusable GitHub Issue autosolve workflow (assess+implement + commenting/label mgmt). |

@fantapop fantapop force-pushed the CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution branch 4 times, most recently from 2b58edf to c3e5ff5 Compare March 16, 2026 22:30
@fantapop fantapop changed the base branch from main to add-claude.md March 16, 2026 22:33
@fantapop
Contributor Author

Feedback from testing in ccloud-private-automation-testing

While building a test workflow that uses the composite actions directly (not the reusable workflows), I noticed the README's auth documentation is a bit thin for direct action usage.

The reusable workflows (github-issue-autosolve.yml, jira-autosolve.yml) handle auth setup internally — they accept auth_mode, vertex_project_id, etc. as inputs and set the right env vars on each step. But when using autosolve/assess and autosolve/implement directly, the caller needs to know to:

  1. Run google-github-actions/auth@v3 (or equivalent) themselves
  2. Set the env vars (CLAUDE_CODE_USE_VERTEX, ANTHROPIC_VERTEX_PROJECT_ID, CLOUD_ML_REGION) on each action step

This works fine once you read github-issue-autosolve.yml as a reference, but the README's "Required authentication" section only mentions the reusable workflow inputs (auth_mode: vertex). It would be helpful to add a note or example showing the env vars needed for direct composite action usage. Something like:

# Direct action usage with Vertex AI
- uses: google-github-actions/auth@v3
  with:
    project_id: my-project
    service_account: my-sa@my-project.iam.gserviceaccount.com
    workload_identity_provider: projects/.../providers/...

- uses: cockroachdb/actions/autosolve/assess@v1
  env:
    CLAUDE_CODE_USE_VERTEX: "1"
    ANTHROPIC_VERTEX_PROJECT_ID: my-project
    CLOUD_ML_REGION: us-east5
  with:
    prompt: "Fix the bug"
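A caller using the composite actions directly could also fail fast when one of these variables is missing. The sketch below is only an illustration: the function name `missing_vertex_env` is an assumption and nothing like it is confirmed to exist in the PR. The three variable names are the ones listed above.

```shell
#!/usr/bin/env bash
# missing_vertex_env   (hypothetical pre-flight check, not part of the PR)
# Prints the required Vertex AI variables that are unset or empty and
# returns non-zero if any are missing.
missing_vertex_env() {
  local name missing=""
  for name in CLAUDE_CODE_USE_VERTEX ANTHROPIC_VERTEX_PROJECT_ID CLOUD_ML_REGION; do
    # printenv only sees exported variables, which is how `env:` on a
    # workflow step provides them to the process.
    if [ -z "$(printenv "$name")" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  return 0
}
```

Running it as the first command of a step makes a missing `env:` entry show up as a clear message rather than as an authentication failure later in the run.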

@fantapop
Contributor Author

Bug: status reports SUCCESS when push_and_pr fails

When the implementation succeeds and security check passes, but the push_and_pr step fails (e.g., fork repo not accessible), the final status is still reported as SUCCESS.

From a test run log:

=== Final Result ===
Assessment: PROCEED
Implementation status: SUCCESS
PR URL: 
Branch: 

The implementation step succeeded (Claude created the file), security check passed, but push_and_pr failed with exit code 128 (repo not found). Despite this, set_implement_outputs reported status=SUCCESS because it only checks IMPL_RESULT and SECURITY_CONCLUSION:

# implement.sh:set_implement_outputs()
if [ "$impl_result" = "SUCCESS" ] && [ "$security_conclusion" != "failure" ]; then
    status="SUCCESS"

It doesn't check whether push_and_pr actually succeeded. The PR_URL and BRANCH_NAME being empty are clues, but the status itself is misleading.

Suggested fix: pass steps.pr.conclusion into set_implement_outputs and factor it into the status determination.
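A minimal sketch of what the suggested status logic could look like. The function name `compute_status`, its positional arguments, and the `create_pr` flag are assumptions for illustration; the real set_implement_outputs may read its inputs differently. The PR conclusion values follow the usual step conclusions (success, failure, cancelled, skipped).

```shell
#!/usr/bin/env bash
# compute_status IMPL_RESULT SECURITY_CONCLUSION CREATE_PR PR_CONCLUSION
# (hypothetical helper; argument names and order are assumptions)
# Prints SUCCESS only when every stage that was expected to run succeeded.
compute_status() {
  local impl_result="$1" security_conclusion="$2" create_pr="$3" pr_conclusion="$4"

  # Implementation or security check failed: nothing else matters.
  if [ "$impl_result" != "SUCCESS" ] || [ "$security_conclusion" = "failure" ]; then
    echo "FAILED"
    return
  fi

  # A PR was requested, so the push/PR step must have succeeded too.
  if [ "$create_pr" = "true" ] && [ "$pr_conclusion" != "success" ]; then
    echo "FAILED"
    return
  fi

  echo "SUCCESS"
}
```

With this shape, the failing run described above (implementation succeeded, security passed, push_and_pr failed) would report FAILED instead of SUCCESS.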

@fantapop
Contributor Author

Bug: summary output is always empty (AUTOSOLVE_TMPDIR not shared between steps)

The assessment summary (and likely the implementation summary too) is always empty because AUTOSOLVE_TMPDIR is not shared across composite action steps.

Each step in a composite action runs in a new shell. run_step.sh creates AUTOSOLVE_TMPDIR and exports it, but that export only lives within that shell process:

# run_step.sh
if [ -z "${AUTOSOLVE_TMPDIR:-}" ]; then
  AUTOSOLVE_TMPDIR="$(mktemp -d "${TMPDIR:-/tmp}/autosolve_XXXXXX")"
  export AUTOSOLVE_TMPDIR
fi

So what happens:

  1. run_assessment step creates /tmp/autosolve_abc123/, writes assessment_result.txt there
  2. set_assess_outputs step starts a new shell, AUTOSOLVE_TMPDIR is empty, creates /tmp/autosolve_xyz789/, can't find assessment_result.txt → summary is empty

Observed output:

Assessment: PROCEED
Summary:

The assessment value itself works because it's passed via GITHUB_OUTPUT step outputs, not the temp file. But summary and result depend on reading from the temp dir.

Possible fix: write AUTOSOLVE_TMPDIR to GITHUB_ENV in run_step.sh so it persists across steps:

if [ -z "${AUTOSOLVE_TMPDIR:-}" ]; then
  AUTOSOLVE_TMPDIR="$(mktemp -d "${TMPDIR:-/tmp}/autosolve_XXXXXX")"
  export AUTOSOLVE_TMPDIR
  echo "AUTOSOLVE_TMPDIR=$AUTOSOLVE_TMPDIR" >> "${GITHUB_ENV:-/dev/null}"
fi
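The behavior can be reproduced locally with a rough simulation. The helper below is hypothetical and only approximates what the runner does: it starts each "step" as a fresh bash process with a clean environment and exports the KEY=VALUE lines from a file standing in for GITHUB_ENV.

```shell
#!/usr/bin/env bash
# run_isolated_step ENV_FILE COMMAND   (hypothetical helper for local experiments)
# Runs COMMAND in a fresh bash process with a clean environment, after
# exporting the KEY=VALUE lines found in ENV_FILE. This only approximates
# how the Actions runner applies $GITHUB_ENV between steps.
run_isolated_step() {
  local env_file="$1" cmd="$2"
  env -i PATH="$PATH" GITHUB_ENV="$env_file" bash -c '
    while IFS="=" read -r key value; do
      [ -n "$key" ] && export "$key=$value"
    done < "$GITHUB_ENV"
    eval "$1"
  ' _ "$cmd"
}

env_file="$(mktemp)"

# Step 1: create the temp dir, record it in the env file, write a result.
run_isolated_step "$env_file" '
  dir="$(mktemp -d)"
  echo "AUTOSOLVE_TMPDIR=$dir" >> "$GITHUB_ENV"
  echo "PROCEED" > "$dir/assessment_result.txt"
'

# Step 2: a new process still finds the file, because the directory path
# came from the env file rather than from an export in step 1.
run_isolated_step "$env_file" 'cat "$AUTOSOLVE_TMPDIR/assessment_result.txt"'
```

If step 1 only used `export` and skipped the write to the env file, step 2 would see an empty AUTOSOLVE_TMPDIR, which matches the empty summary observed above.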

@fantapop
Contributor Author

Docs: callers should checkout the PR base branch, not the trigger ref

The autosolve actions work on whatever is already checked out. When a caller uses workflow_dispatch and runs from a non-default branch, actions/checkout checks out that branch by default. The autosolve action then branches from there, and the resulting PR includes unrelated commits from the triggering branch — not just Claude's changes.

For example, if the workflow runs from branch test-autosolve (which has a workflow file change), the PR against main includes both the workflow change and Claude's work.

This also causes a downstream problem: if the branch has workflow file modifications relative to the fork's main, GitHub requires the workflow scope on the push PAT — even though Claude never touched workflow files. Checking out the base branch avoids this entirely.

The fix in the caller workflow is simple:

- uses: actions/checkout@v5
  with:
    ref: main  # checkout the PR base branch, not the trigger ref
    fetch-depth: 0
    persist-credentials: false  # prevent checkout's credential helper from interfering with fork push

Worth documenting in the README examples, especially for workflow_dispatch use cases. The issues: [labeled] trigger doesn't have this problem since it always runs on the default branch.

Also noting that persist-credentials: false on the checkout step is important — without it, the checkout action's credential helper can interfere with the fork push token's credential helper set up by implement.sh.
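A caller could additionally guard against this before invoking the action. The sketch below is an assumption-laden illustration: the helper name `commits_ahead_of` is hypothetical, and the commented guard assumes the base branch is available as origin/main.

```shell
#!/usr/bin/env bash
# commits_ahead_of BASE_REF   (hypothetical helper)
# Prints the number of commits reachable from HEAD but not from BASE_REF.
commits_ahead_of() {
  git rev-list --count "$1..HEAD"
}

# Example guard, assuming the base branch has been fetched as origin/main:
#   if [ "$(commits_ahead_of origin/main)" -ne 0 ]; then
#     echo "checkout is ahead of the PR base; the PR would include extra commits" >&2
#     exit 1
#   fi
```

A non-zero count means the working tree already carries commits that would end up in the autosolve PR alongside Claude's changes.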

@fantapop fantapop force-pushed the CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution branch 4 times, most recently from 6ff8275 to 2a0f999 Compare March 18, 2026 00:53
linhcrl added a commit to linhcrl/actions that referenced this pull request Mar 18, 2026
Cherry-picked actions_helpers.sh and test_helpers.sh from PR cockroachdb#5
(CNSL-1944 autosolve branch) to support the release-version-extract
action's log_error and set_output functions.
linhcrl added a commit to linhcrl/actions that referenced this pull request Mar 18, 2026
- Added comprehensive test workflow covering:
  - Valid changelog format validation
  - Breaking change detection (full mode)
  - Version ordering validation (valid and invalid)
  - Date ordering validation
  - Invalid changelog format handling
  - Multiple version validation depth
  - Breaking change indentation handling

- Cherry-picked actions_helpers.sh and test_helpers.sh from PR cockroachdb#5
  (CNSL-1944 autosolve branch) to support the validate_version_order.sh
  script's log_error and set_output functions.
@fantapop
Contributor Author

Feature request: actions should write to $GITHUB_STEP_SUMMARY

Currently callers need to add their own summary step to get results visible in the Actions UI. The composite actions could write to $GITHUB_STEP_SUMMARY directly in the set_assess_outputs and set_implement_outputs functions — they already have all the data (assessment result, summary, PR URL, branch, etc.).

This way every caller gets a nice summary in the Actions UI for free without extra workflow boilerplate. Example of what it could look like:

# In set_assess_outputs or set_implement_outputs
{
  echo "## Autosolve Result"
  echo "**Assessment:** $assessment"
  echo "**Implementation:** $status"
  if [ -n "$pr_url" ]; then
    echo "**PR:** $pr_url"
  fi
  if [ -n "$summary" ]; then
    echo "### Summary"
    echo "$summary"
  fi
} >> "${GITHUB_STEP_SUMMARY:-/dev/null}"

The :-/dev/null fallback keeps it safe for local testing where GITHUB_STEP_SUMMARY isn't set.

@fantapop
Contributor Author

Bug: github_token secret name collides with reserved name

The github-issue-autosolve.yml reusable workflow defines a secret named github_token:

    secrets:
      github_token:
        required: true

This causes a validation error when called:

secret name `github_token` within `workflow_call` can not be used since it would collide with system reserved name

The secret needs to be renamed to something like gh_token or repo_token to avoid the collision with the system-provided GITHUB_TOKEN.

@fantapop fantapop force-pushed the CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution branch from 2a0f999 to a950e62 Compare March 18, 2026 17:54
@fantapop
Contributor Author

Addressed all feedback — thanks for the thorough testing!

Bug: AUTOSOLVE_TMPDIR not shared between steps — Fixed. run_step.sh now writes AUTOSOLVE_TMPDIR to GITHUB_ENV so it persists across composite action steps.

Bug: status reports SUCCESS when push_and_pr fails — Fixed. set_implement_outputs now checks PR_CONCLUSION and INPUT_CREATE_PR. When PR creation is enabled and the PR step didn't succeed, status is FAILED. Added 4 tests covering the matrix (full success, PR failure, create_pr=false, impl failure).

Docs: auth for direct action usage — Added an "Authentication" section to the README with a Vertex AI example showing the env vars needed for direct composite action usage.

Docs: callers should checkout the PR base branch — Added a "Caller checkout" section documenting ref: main and persist-credentials: false for workflow_dispatch use cases.

Feature: $GITHUB_STEP_SUMMARY — Done. Both set_assess_outputs and set_implement_outputs now write to $GITHUB_STEP_SUMMARY with assessment result, status, PR URL, branch, and summary. Falls back to /dev/null for local testing.

Bug: github_token secret name collision — Renamed to repo_token throughout github-issue-autosolve.yml.

Also removed the 2>/dev/null stderr redirections across all production scripts, so errors now flow to the output for debugging.

@fantapop
Contributor Author

Bug: reusable workflow shell steps can't find scripts when called cross-repo

The github-issue-autosolve.yml reusable workflow has shell steps that reference scripts via ${{ github.workspace }}:

- name: Build prompt
  run: ${{ github.workspace }}/autosolve/run_step.sh github_issues build_github_issue_prompt

When called from a different repo, github.workspace resolves to the caller's checkout directory, not cockroachdb/actions. So it looks for ccloud-private-automation-testing/autosolve/run_step.sh which doesn't exist.

Error:

/home/runner/work/ccloud-private-automation-testing/ccloud-private-automation-testing/autosolve/run_step.sh: No such file or directory

The composite action reference (uses: ./autosolve/assess) works correctly because ./ in uses: resolves within the reusable workflow's repo, whereas run: commands that use ${{ github.workspace }} resolve to the caller's workspace.

Affected steps: Build prompt, Remove label, Set final status, Comment on issue.

Possible fixes:

  1. Move these shell scripts into composite actions (like assess/implement) so uses: ./ resolution works
  2. Use ${{ github.action_path }} or a different mechanism to locate the scripts relative to the action repo
  3. Check out the actions repo alongside the caller's repo and reference scripts from there

@fantapop fantapop force-pushed the CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution branch from a950e62 to 1e3ddeb Compare March 18, 2026 18:17
@fantapop
Contributor Author

Bug: reusable workflow shell steps can't find scripts cross-repo — Fixed. Created internal composite actions autosolve/github-issues-helpers and autosolve/jira-helpers that wrap the shell calls. These use ${{ github.action_path }} which resolves to the actions repo regardless of caller, unlike ${{ github.workspace }} which resolves to the caller's checkout.

Both workflows (github-issue-autosolve.yml and jira-autosolve.yml) now use uses: ./autosolve/*-helpers for all shell steps. The helper actions are documented as internal-only (not intended for direct external use).
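Once a step is running a script that ships with the action, that script can also locate its siblings relative to itself rather than relative to the workspace. The following is a generic bash sketch of that idea, not code from the PR; the helper name `dir_of` is an assumption, and the commented usage assumes a layout like autosolve/run_step.sh next to autosolve/scripts/.

```shell
#!/usr/bin/env bash
# dir_of FILE   (hypothetical helper)
# Prints the absolute directory containing FILE. The subshell keeps the
# caller's working directory unchanged.
dir_of() {
  (cd -- "$(dirname -- "$1")" && pwd)
}

# Inside a script shipped with the action one might write, for example:
#   SCRIPT_DIR="$(dir_of "${BASH_SOURCE[0]}")"
#   source "$SCRIPT_DIR/scripts/shared.sh"
```

This only helps after the script itself has been found, which is the part that ${{ github.action_path }} takes care of in the helper actions.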

@fantapop
Contributor Author

Follow-up: local action paths still resolve to caller's workspace

The fix for the ${{ github.workspace }} shell script issue replaced the run: steps with local composite action references (e.g., ./autosolve/github-issues-helpers). However, this has the same underlying problem — in a reusable workflow, relative paths like ./autosolve/... resolve to the caller's workspace, not cockroachdb/actions.

Error from the latest test run:

Can't find 'action.yml', 'action.yaml' or 'Dockerfile' under 
'/home/runner/work/ccloud-private-automation-testing/ccloud-private-automation-testing/autosolve/github-issues-helpers'.
Did you forget to run actions/checkout before running your local action?

The reusable workflow checks out cockroachlabs/ccloud-private-automation-testing (the caller), so autosolve/github-issues-helpers doesn't exist there.

Possible fixes:

  1. Checkout cockroachdb/actions into a subdirectory — add a step that checks out the actions repo alongside the caller's code, then reference the local actions from that path
  2. Inline the logic — put the shell commands directly in run: steps rather than delegating to local composite actions
  3. Use fully-qualified action references — reference actions as cockroachdb/actions/autosolve/github-issues-helpers@branch instead of ./autosolve/github-issues-helpers

Option 3 is probably cleanest since it's how the composite actions (autosolve/assess, autosolve/implement) are already referenced by callers.

Test repo: cockroachlabs/ccloud-private-automation-testing, run triggered by labeling issue #3.

@fantapop fantapop force-pushed the CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution branch from 1e3ddeb to 4ea0712 Compare March 18, 2026 18:27
@fantapop fantapop force-pushed the CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution branch 15 times, most recently from 50c09c8 to dedeca0 Compare March 18, 2026 22:00
@fantapop fantapop requested a review from Copilot March 18, 2026 22:34

Copilot AI left a comment

Pull request overview

Adds a new “autosolve” automation suite (assess + implement) for automatically evaluating and attempting fixes via Claude Code, along with supporting scripts, prompts, reusable workflows, and a lightweight bash test harness.

Changes:

  • Introduces autosolve/assess and autosolve/implement composite actions plus shared step runner/scripts.
  • Adds security/prompt templates and GitHub Issue autosolve reusable workflow integration.
  • Standardizes bash testing via test_helpers.sh and expands test discovery to include root-level *_test.sh.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| test_helpers.sh | New shared bash test helper functions (expect_success, expect_failure, print_results). |
| test.sh | Updates test discovery depth and uses BASH_SOURCE[0] for script dir. |
| autotag-from-changelog/auto-tag-release_test.sh | Refactors tests to use shared test helpers. |
| autotag-from-changelog/auto-tag-release.sh | Switches error logging to shared actions_helpers.sh helpers. |
| autosolve/scripts/shared_test.sh | Adds unit tests for shared.sh functions. |
| autosolve/scripts/shared.sh | Adds shared autosolve functions (validation, prompt building, result parsing, final status). |
| autosolve/scripts/implement_test.sh | Adds unit tests for implement.sh security/output behavior. |
| autosolve/scripts/implement.sh | Adds implementation flow, blocked-path security checks, PR creation, and outputs. |
| autosolve/scripts/github_issues.sh | Adds GitHub Issues prompt builder + issue comment/label helpers. |
| autosolve/scripts/assess_test.sh | Adds unit tests for assessment outputs. |
| autosolve/scripts/assess.sh | Adds assessment flow (read-only tools) and outputs. |
| autosolve/run_step.sh | Adds a step entrypoint to source scripts and invoke functions from workspace cwd. |
| autosolve/prompts/security-preamble.md | Adds a security preamble injected into prompts. |
| autosolve/prompts/implementation-footer.md | Adds implementation instructions + required success/failed marker. |
| autosolve/prompts/assessment-footer.md | Adds assessment instructions + required proceed/skip marker. |
| autosolve/implement/action.yml | Defines the composite implement action interface and steps. |
| autosolve/assess/action.yml | Defines the composite assess action interface and steps. |
| actions_helpers_test.sh | Adds tests for action helper logging/output helpers. |
| actions_helpers.sh | Adds shared logging + GITHUB_OUTPUT helper functions. |
| README.md | Documents new autosolve actions and reusable workflows. |
| CLAUDE.md | Updates repo conventions for changelog ordering, sourcing patterns, and shell style guidance. |
| CHANGELOG.md | Adds changelog entries for the new autosolve functionality. |
| .shellcheckrc | Adds ShellCheck config for external sources and SCRIPTDIR source path resolution. |
| .github/workflows/github-issue-autosolve.yml | Adds reusable workflow for GitHub Issue-driven autosolve runs. |

Contributor Author

@fantapop fantapop left a comment

I've reviewed this thing line by line and left a bunch of comments

@fantapop fantapop force-pushed the CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution branch 4 times, most recently from a4995af to 7d64d1b Compare March 19, 2026 17:40
Base automatically changed from add-claude.md to main March 19, 2026 18:07
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@fantapop fantapop force-pushed the CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution branch from 7d64d1b to d8bc767 Compare March 19, 2026 18:11
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@fantapop fantapop force-pushed the CNSL-1944-generic-autosolve-git-hub-workflow-for-automated-issue-resolution branch from d8bc767 to fed481b Compare March 19, 2026 18:52
@fantapop fantapop marked this pull request as ready for review March 19, 2026 19:03
@fantapop
Copy link
Contributor Author

@linhcrl I think this one is ready for your review. I'm sorry, it's somewhat gigantic. I have personally gone over it a few times now and done multiple rounds of testing. Please feel free to comment on anything that you think could be clearer or is unnecessary. There is one more thing I'm going to test manually right now: specifying a skill to run.

2 participants