Commit d41580a

Merge origin/main into codex/translate-acp-host-capabilities
Signed-off-by: Andrew Harvard <aharvard@squareup.com>
2 parents 65ff7d7 + e790d37 commit d41580a

116 files changed

Lines changed: 8832 additions & 4810 deletions

.agents/skills/code-review/SKILL.md

Lines changed: 25 additions & 8 deletions
@@ -92,6 +92,16 @@ You are a senior engineer conducting a thorough code review. Review **only the l
 - **AnimatePresence**: Is it used properly with unique keys for dialog/modal transitions?
 - **Reduced Motion**: Is `useReducedMotion()` respected for accessibility?

+### Async State, Defaults & Persistence
+- **Async Source of Truth**: During async provider/model/session mutations, does UI/session/localStorage state update only after the backend accepts the change? If the UI updates optimistically, is there an explicit rollback path?
+- **UI/Backend Drift**: Could the UI show provider/model/project/persona X while the backend is still on Y after a failed mutation, delayed prepare, or pending-to-real session handoff?
+- **Requested vs Fallback Authority**: Do explicit user or caller selections stay authoritative over sticky defaults, saved preferences, aliases, or fallback resolution?
+- **Dependent State Invalidation**: When a parent selection changes (provider/project/persona/workspace/etc.), are dependent values like `modelId`, `modelName`, defaults, or cached labels cleared or recomputed so stale state does not linger?
+- **Persisted Preference Validation**: Are stored selections validated against current inventory/capabilities before reuse, and do stale values fail soft instead of breaking creation flows?
+- **Compatibility of Fallbacks**: Are default or sticky selections guaranteed to remain compatible with the active concrete provider/backend, instead of leaking across providers?
+- **Best-Effort Lookups**: Do inventory/config/default-resolution lookups degrade gracefully on transient failure, or can they incorrectly block a primary flow that should still work with a safe fallback?
+- **Draft/Home/Handoff Paths**: If the product has draft, Home, pending, or pre-created sessions, did you review those handoff paths separately from the already-active session path?
+
 ### General Code Quality
 - **Error Handling**: Are errors handled gracefully with user-friendly messages?
 - **Loading States**: Are loading states shown during async operations?
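
Reviewer note: the "Requested vs Fallback Authority", "Persisted Preference Validation", and "Compatibility of Fallbacks" items in this hunk can be read as one resolution order. The TypeScript below is a minimal sketch of that order, not code from this repository. `Inventory`, `Selection`, `isAvailable`, and `resolveSelection` are hypothetical names, and the sketch assumes an explicit request has already been validated by its caller.

```typescript
// Hypothetical shapes for illustration only; nothing here is taken from the repo.
interface Inventory {
  // provider id -> model ids that provider currently offers
  [providerId: string]: string[];
}

interface Selection {
  providerId: string;
  modelId: string;
}

// True when the selection exists and its model is still in the inventory.
function isAvailable(
  sel: Selection | undefined,
  inventory: Inventory,
): sel is Selection {
  return (
    sel !== undefined &&
    (inventory[sel.providerId] ?? []).includes(sel.modelId)
  );
}

function resolveSelection(
  requested: Selection | undefined,
  saved: Selection | undefined,
  inventory: Inventory,
  activeProviderId: string,
): Selection | undefined {
  // 1. An explicit user or caller request is never overridden by defaults.
  if (requested !== undefined) return requested;

  // 2. A saved preference is reused only if it is still in the inventory
  //    and belongs to the active provider (no leaking across providers).
  if (isAvailable(saved, inventory) && saved.providerId === activeProviderId) {
    return saved;
  }

  // 3. Fail soft: first model of the active provider, or undefined so the
  //    caller can continue without a preselected model.
  const first = (inventory[activeProviderId] ?? [])[0];
  return first === undefined
    ? undefined
    : { providerId: activeProviderId, modelId: first };
}
```

With this ordering, a saved model that has dropped out of the inventory resolves to the active provider's first model rather than blocking the creation flow.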
@@ -104,13 +114,18 @@ You are a senior engineer conducting a thorough code review. Review **only the l

 ### Step 0: Run Quality Checks

-Before reading any code, run the project's CI gate to establish a baseline:
+Before reading any code, run the project's CI gate to establish a baseline. Use **check-only** commands so the baseline never mutates the working tree — otherwise auto-formatters can introduce unstaged diffs and you'll end up reviewing formatter output instead of the author's actual changes.
+
+Avoid `just check-everything` as the baseline in this repo: that recipe runs `cargo fmt --all` in write mode and will modify the working tree. Run the non-mutating equivalents instead:

 ```bash
-just ci
+cargo fmt --all -- --check
+cargo clippy --all-targets -- -D warnings
+(cd ui/desktop && pnpm run lint:check)
+./scripts/check-openapi-schema.sh
 ```

-This runs: `pnpm check` (Biome lint/format + file sizes), `pnpm typecheck`, `just clippy` (Rust linting), `pnpm test`, `pnpm build`, and `just tauri-check` (Rust type checking).
+If the project has a stronger pre-push or CI gate than this helper set, run that fuller gate when the review is meant to be PR-ready, but only after confirming it is also non-mutating (or run it from a clean stash). In this repo, targeted tests for the changed area plus the pre-push checks are often the practical follow-up.

 Report the results as pass/fail. Any failures are automatically **P0** issues and should appear at the top of the findings list. Do not skip this step even if the user only wants a quick review.

@@ -120,7 +135,8 @@ For each file in the list:

 1. Run `git diff main...HEAD -- <file>` to get the exact lines that changed
 2. Review **only those changed lines** against the Review Checklist — do not flag issues in unchanged code
-3. Note the file path and line numbers from the diff output for each issue found
+3. For stateful UI or async flow changes, trace the full path end to end: user selection -> local/session state update -> persistence -> backend prepare/set/update call -> failure/rollback path
+4. Note the file path and line numbers from the diff output for each issue found

 ### Step 2: Categorize Issues
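
Reviewer note: the end-to-end trace added as step 3 in this hunk (user selection -> local/session state update -> persistence -> backend call -> failure/rollback path) can be sketched as a single function. This is an illustration under stated assumptions, not the project's implementation: `SessionState`, `changeProvider`, `setOnBackend`, and `persist` are hypothetical names, and the backend call is assumed to reject when it refuses the change.

```typescript
// Hypothetical state shape; not taken from the repository.
interface SessionState {
  providerId: string;
  modelId: string | null;
}

// Assumed contract: the promise rejects when the backend refuses the change.
type Backend = (next: SessionState) => Promise<void>;

async function changeProvider(
  state: { current: SessionState },
  providerId: string,
  setOnBackend: Backend,
  persist: (s: SessionState) => void,
): Promise<boolean> {
  const previous = state.current;

  // Dependent state invalidation: a new provider clears the old model.
  const next: SessionState = { providerId, modelId: null };

  // Optimistic local update so the UI reacts immediately.
  state.current = next;

  try {
    await setOnBackend(next);
  } catch {
    // Explicit rollback path. Only restore if no newer change has replaced
    // the optimistic value in the meantime.
    if (state.current === next) state.current = previous;
    return false;
  }

  // Persist only after the backend has accepted the change, so stored
  // preferences cannot drift from backend state.
  persist(next);
  return true;
}
```

When reviewing a real change, each arrow in the trace should map to a visible line in the diff. A missing `catch` branch or a `persist` call placed before the backend call is the kind of finding the checklist is aiming at.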

@@ -152,16 +168,17 @@ After reviewing all files, provide:

 ### Step 3b: Self-Check

-Before presenting findings to the user, silently review the issue list two more times:
+Before presenting findings to the user, silently review the issue list three times:

 1. **Pass 1**: For each issue, ask — is this genuinely a problem, or could it be intentional/acceptable? Remove false positives.
 2. **Pass 2**: For each remaining issue, ask — does the recommended fix actually improve the code, or is it a matter of preference?
+3. **Pass 3**: For async state/default-resolution issues, ask — can the UI, persisted state, and backend ever disagree after a failure, fallback, or session handoff?

-After both passes, tag each surviving issue as one of:
+After these passes, tag each surviving issue as one of:
 - **[Must Fix]** — clear violation, will likely get flagged in PR review
 - **[Your Call]** — valid concern but may be intentional or a reasonable tradeoff (e.g. stepping outside the design system for a specific reason). Present it but let the user decide.

-Only present issues that survived both passes.
+Only present issues that survived these passes.

 ### Step 4: Fix Issues

@@ -189,7 +206,7 @@ Once all issues are fixed, display:

 **✅ Code review complete! All issues have been addressed.**

-Your code is ready to commit and push. Lefthook will run the full CI gate (`just ci`) automatically when you push.
+Your code is ready to commit and push. Lefthook and CI will run the repo's configured gates when you push.

 Next steps: generate a PR summary that explains the intent of this change, what files were modified and why, and how to verify the changes work.

.github/workflows/bundle-goose2.yml

Lines changed: 81 additions & 9 deletions
@@ -38,6 +38,11 @@ on:
         required: false
         default: ""
         type: string
+      windows-signing:
+        description: "Whether to perform Windows signing via Azure Trusted Signing"
+        required: false
+        default: false
+        type: boolean
       cli-run-id:
         description: >
           Run ID of a prior build-cli.yml workflow run to download the goose
@@ -125,7 +130,7 @@ jobs:

       - name: Cache Rust dependencies
         if: inputs.cli-run-id == ''
-        uses: Swatinem/rust-cache@v2
+        uses: Swatinem/rust-cache@e18b497796c12c097a38f9edb9d0641fb99eee32 # v2
         with:
           key: goose2-macos-arm64

@@ -175,13 +180,11 @@ jobs:
           certificate-password: ${{ secrets.APPLE_CERTIFICATE_PASSWORD }}

       # ── Tauri bundle ──
-      - name: Check disk space before bundle
-        run: df -h
-
       - name: Bundle Goose 2 (pnpm tauri build)
         env:
+          APPLE_SIGNING_IDENTITY: ${{ inputs.signing && 'Developer ID Application' || '' }}
           APPLE_ID: ${{ inputs.signing && secrets.APPLE_ID || '' }}
-          APPLE_ID_PASSWORD: ${{ inputs.signing && secrets.APPLE_ID_PASSWORD || '' }}
+          APPLE_PASSWORD: ${{ inputs.signing && secrets.APPLE_ID_PASSWORD || '' }}
           APPLE_TEAM_ID: ${{ inputs.signing && secrets.APPLE_TEAM_ID || '' }}
         working-directory: ui/goose2
         run: |
@@ -291,7 +294,7 @@ jobs:

       - name: Cache Rust dependencies
         if: inputs.cli-run-id == ''
-        uses: Swatinem/rust-cache@v2
+        uses: Swatinem/rust-cache@e18b497796c12c097a38f9edb9d0641fb99eee32 # v2
         with:
           key: goose2-macos-x86_64

@@ -360,8 +363,9 @@ jobs:
       # ── Tauri bundle (cross-compile for Intel) ──
       - name: Bundle Goose 2 for Intel
         env:
+          APPLE_SIGNING_IDENTITY: ${{ inputs.signing && 'Developer ID Application' || '' }}
           APPLE_ID: ${{ inputs.signing && secrets.APPLE_ID || '' }}
-          APPLE_ID_PASSWORD: ${{ inputs.signing && secrets.APPLE_ID_PASSWORD || '' }}
+          APPLE_PASSWORD: ${{ inputs.signing && secrets.APPLE_ID_PASSWORD || '' }}
           APPLE_TEAM_ID: ${{ inputs.signing && secrets.APPLE_TEAM_ID || '' }}
         working-directory: ui/goose2
         run: |
@@ -477,7 +481,7 @@ jobs:

       - name: Cache Rust dependencies
         if: inputs.cli-run-id == ''
-        uses: Swatinem/rust-cache@v2
+        uses: Swatinem/rust-cache@e18b497796c12c097a38f9edb9d0641fb99eee32 # v2
         with:
           key: goose2-linux-x86_64

@@ -564,6 +568,7 @@ jobs:
     runs-on: windows-latest
     timeout-minutes: 60
     permissions:
+      id-token: write
       contents: read
       actions: read
     steps:
@@ -621,7 +626,7 @@ jobs:

       - name: Cache Rust dependencies
         if: inputs.cli-run-id == ''
-        uses: Swatinem/rust-cache@v2
+        uses: Swatinem/rust-cache@e18b497796c12c097a38f9edb9d0641fb99eee32 # v2
         with:
           key: goose2-windows-x86_64

@@ -697,3 +702,70 @@ jobs:
           name: Goose2-windows-x64-msi
           path: ui/goose2/src-tauri/target/x86_64-pc-windows-msvc/release/bundle/msi/*.msi
           if-no-files-found: warn
+
+  sign-windows:
+    name: "Sign Windows installers"
+    needs: bundle-windows
+    if: inputs.windows-signing
+    runs-on: windows-latest
+    environment: signing
+    permissions:
+      id-token: write
+      contents: read
+      actions: read
+    steps:
+      - name: Download NSIS installer
+        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7
+        with:
+          name: Goose2-windows-x64-nsis
+          path: unsigned/nsis
+
+      - name: Download MSI installer
+        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7
+        with:
+          name: Goose2-windows-x64-msi
+          path: unsigned/msi
+
+      - name: Azure login
+        uses: azure/login@a457da9ea143d694b1b9c7c869ebb04ebe844ef5 # v2
+        with:
+          client-id: ${{ secrets.AZURE_CLIENT_ID }}
+          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
+          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
+
+      - name: Sign Windows installers with Azure Trusted Signing
+        uses: azure/trusted-signing-action@db7a3a6bd3912025c705162fb7475389f5b69ec6 # v1
+        with:
+          endpoint: ${{ secrets.AZURE_SIGNING_ENDPOINT }}
+          trusted-signing-account-name: ${{ secrets.AZURE_SIGNING_ACCOUNT_NAME }}
+          certificate-profile-name: ${{ secrets.AZURE_CERTIFICATE_PROFILE_NAME }}
+          files-folder: ${{ github.workspace }}/unsigned
+          files-folder-filter: exe,msi
+          files-folder-recurse: true
+
+      - name: Verify signed installers
+        shell: pwsh
+        run: |
+          $files = Get-ChildItem -Path unsigned -Recurse -Include *.exe,*.msi
+          foreach ($file in $files) {
+            Write-Output "Verifying signature: $($file.FullName)"
+            $sig = Get-AuthenticodeSignature $file.FullName
+            if ($sig.Status -ne "Valid") {
+              throw "Signature invalid for $($file.Name): $($sig.Status)"
+            }
+            Write-Output "✅ Signature valid: $($file.Name)"
+          }
+
+      - name: Upload signed NSIS installer
+        uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6
+        with:
+          name: Goose2-windows-x64-nsis-signed
+          path: unsigned/nsis/*.exe
+          if-no-files-found: error
+
+      - name: Upload signed MSI installer
+        uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6
+        with:
+          name: Goose2-windows-x64-msi-signed
+          path: unsigned/msi/*.msi
+          if-no-files-found: error

.github/workflows/pr-smoke-test.yml

Lines changed: 16 additions & 8 deletions
@@ -108,9 +108,13 @@ jobs:
           node-version: '22'

       - name: Install agentic providers
-        run: npm install -g @anthropic-ai/claude-code @openai/codex @google/gemini-cli @zed-industries/claude-agent-acp @zed-industries/codex-acp
+        run: npm install -g @anthropic-ai/claude-code @zed-industries/claude-agent-acp @zed-industries/codex-acp

-      - name: Run Smoke Tests with Provider Script
+      - name: Install Node.js Dependencies
+        run: source ../../bin/activate-hermit && pnpm install --frozen-lockfile
+        working-directory: ui/desktop
+
+      - name: Run Smoke Tests (Normal Mode)
         env:
           ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
           OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
@@ -127,12 +131,10 @@ jobs:
           SKIP_BUILD: 1
           SKIP_PROVIDERS: ${{ vars.SKIP_PROVIDERS || '' }}
         run: |
-          # Ensure the HOME directory structure exists
           mkdir -p $HOME/.local/share/goose/sessions
           mkdir -p $HOME/.config/goose
-
-          # Run the provider test script (binary already built and downloaded)
-          bash scripts/test_providers.sh
+          source ../../bin/activate-hermit && pnpm run test:integration:providers
+        working-directory: ui/desktop

       - name: Set up Python
         uses: actions/setup-python@v5
@@ -188,6 +190,10 @@ jobs:
       - name: Make Binary Executable
         run: chmod +x target/debug/goose

+      - name: Install Node.js Dependencies
+        run: source ../../bin/activate-hermit && pnpm install --frozen-lockfile
+        working-directory: ui/desktop
+
       - name: Run Provider Tests (Code Execution Mode)
         env:
           ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
@@ -205,7 +211,8 @@ jobs:
         run: |
           mkdir -p $HOME/.local/share/goose/sessions
           mkdir -p $HOME/.config/goose
-          bash scripts/test_providers_code_exec.sh
+          source ../../bin/activate-hermit && pnpm run test:integration:providers-code-exec
+        working-directory: ui/desktop

   compaction-tests:
     name: Compaction Tests
@@ -277,7 +284,8 @@ jobs:
           GOOSE_PROVIDER: anthropic
           GOOSE_MODEL: claude-sonnet-4-5-20250929
           SHELL: /bin/bash
+          SKIP_BUILD: 1
         run: |
           echo 'export PATH=/some/fake/path:$PATH' >> $HOME/.bash_profile
-          source ../../bin/activate-hermit && pnpm run test:integration:debug
+          source ../../bin/activate-hermit && pnpm run test:integration:goosed
         working-directory: ui/desktop

.github/workflows/pr-website-preview.yml

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ jobs:
   cleanup:
     runs-on: ubuntu-latest
     needs: deploy
-    if: github.event.action == 'closed'
+    if: github.event.action == 'closed' && github.event.pull_request.head.repo.full_name == 'aaif-goose/goose'
     permissions:
       contents: write
     steps:
