Skip to content

chore: promote staging to staging-promote/c79754df-23099429381 (2026-03-15 04:45 UTC)#1192

Merged
henrypark133 merged 2 commits intomainfrom
staging-promote/15ab156d-23103553911
Mar 16, 2026
Merged

chore: promote staging to staging-promote/c79754df-23099429381 (2026-03-15 04:45 UTC)#1192
henrypark133 merged 2 commits intomainfrom
staging-promote/15ab156d-23103553911

Conversation

@ironclaw-ci
Copy link
Contributor

@ironclaw-ci ironclaw-ci bot commented Mar 15, 2026

Auto-promotion from staging CI

Batch range: c79754df2888ac7e2704d6cf4686b111eceee959..15ab156d62632e173d9a10933b775cece6ea66a5
Promotion branch: staging-promote/15ab156d-23103553911
Base: staging-promote/c79754df-23099429381
Triggered by: Staging CI batch at 2026-03-15 04:45 UTC

Commits in this batch (2):

Current commits in this promotion (2)

Current base: main
Current head: staging-promote/15ab156d-23103553911
Current range: origin/main..origin/staging-promote/15ab156d-23103553911

Auto-updated by staging promotion metadata workflow

Waiting for gates:

  • Tests: pending
  • E2E: pending
  • Claude Code review: pending (will post comments on this PR)

Auto-created by staging-ci workflow

ilblackdragon and others added 2 commits March 15, 2026 03:17
* fix: eliminate panic paths in production code and document infallible operations

PolicyRule::new() now returns Result instead of panicking on invalid
caller-supplied regex. CreateJobTool returns ToolError when job_manager
is unconfigured instead of panicking. Remaining infallible unwrap/expect
calls (hardcoded regexes, compile-time constants, guarded accesses)
are annotated with SAFETY comments. Where possible, unwraps are replaced
with safer patterns: split_last(), if-let, match-destructure, and
reusing peek() values.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use inline lowercase safety comments to match CI pattern

The no-panics CI check greps for '// safety:' (lowercase, inline)
to suppress false positives. Switch from block SAFETY comments to
inline safety comments on the .unwrap() lines.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add regression tests for panic-path fixes

- PolicyRule::new returns Err on invalid regex (not panic)
- CreateJobTool::execute_sandbox returns ToolError when job_manager is None

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add inline // safety: comments on all infallible unwrap/expect lines

The CI no-panics check requires '// safety:' on the same line as
unwrap()/expect() to suppress false positives. Move safety annotations
from block comments to inline comments on every infallible production
unwrap/expect across all touched files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: trigger CI with skip-regression-check label

[skip-regression-check]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: remove redundant block-level SAFETY comments

Each unwrap/expect now carries its own inline // safety: annotation,
making the standalone block comments above them redundant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add Criterion benchmarks for safety layer hot paths

Add benchmark suite using Criterion.rs for performance-critical paths:

- benches/safety_check.rs: Sanitizer (clean/adversarial), Validator
  (normal/long/tool params), LeakDetector (clean/secrets/HTTP scan)
- benches/tool_dispatch.rs: JSON parsing, schema validation patterns,
  tool output serialization

CI compiles benchmarks on every PR to prevent regressions.
Run locally with: cargo bench

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add bench-compile to CI roll-up job

Include bench-compile in the run-tests roll-up job's needs array
so benchmark compilation failures block PRs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add black_box to benchmarks, use real SafetyLayer pipeline

- Wrap all benchmark inputs in criterion::black_box to prevent
  compiler optimization from skewing results
- Replace generic JSON benchmarks in tool_dispatch.rs with actual
  SafetyLayer pipeline benchmarks (sanitize_tool_output, wrap_for_llm,
  scan_inbound_for_secrets)
- Keep JSON parsing benchmarks for tool parameter overhead measurement

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: apply cargo fmt to benchmark files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: copy benches/ in Dockerfile to fix manifest parse error

Cargo.toml references [[bench]] targets that must exist for manifest
parsing to succeed. Add COPY benches/ to the Docker build stage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: re-trigger CI after adding skip-regression-check label

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review comments on criterion benchmarks

- Move header string allocations outside b.iter() closure in
  http_request_scan to avoid measuring allocation overhead
- Add .unwrap() to serde_json::from_str results in JSON parsing
  benchmarks to catch invalid JSON instead of silently benchmarking
  error construction
- Add comment explaining why benches/ COPY is needed in Dockerfile
  ([[bench]] entries require source files for cargo manifest parsing)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: update Cargo.lock with criterion dependencies

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(bench): build secret-like strings at runtime to avoid CI secret scanners

Construct AWS key and GitHub token patterns via format!() concatenation
so the literal strings don't appear in source and trigger push protection
or secret scanning in CI pipelines. The resulting strings still match
LeakDetector patterns for valid benchmarking.

[skip-regression-check]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: rename tool_dispatch bench, drop async_tokio, replace JSON benchmarks

1. Rename `tool_dispatch.rs` → `safety_pipeline.rs` to match actual
   content (SafetyLayer pipeline benchmarks).
2. Drop unused `async_tokio` feature from criterion dependency.
3. Replace serde_json::from_str benchmarks (third-party only) with
   Validator::validate_tool_params exercising IronClaw's recursive
   validation on simple, complex, and deeply nested JSON inputs.
4. Add `--all-features` to CI bench-compile to match clippy/test
   convention and verify both DB backends.

Addresses zmanian's review feedback on PR #836.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added scope: agent Agent core (agent loop, router, scheduler) scope: tool/builtin Built-in tools scope: tool/wasm WASM tool sandbox scope: tool/mcp MCP client scope: llm LLM integration scope: workspace Persistent memory / workspace scope: config Configuration size: XL 500+ changed lines scope: extensions Extension management scope: setup Onboarding / setup scope: sandbox Docker sandbox scope: ci CI/CD workflows scope: dependencies Dependency updates risk: high Safety, secrets, auth, or critical infrastructure contributor: core 20+ merged PRs labels Mar 15, 2026
@claude
Copy link

claude bot commented Mar 15, 2026

Code review

No issues found.

Summary

This PR adds Criterion benchmarks for safety layer hot paths and improves error handling throughout the codebase. Reviewed for:

Security & Safety (Agent 1): No vulnerabilities detected. PR actually improves security by eliminating ~5 panic paths in production code. All .unwrap() calls on hardcoded regex literals are properly justified with safety comments.

Architecture & Patterns (Agent 2): Follows CLAUDE.md guidelines. Error handling uses proper Result types. Benchmark isolation is correct. Minor opportunity to consolidate hardcoded regexes into LazyLock constants to reduce .expect() comments, but current approach is acceptable.

Bug Scan (Agent 3): No logic errors detected. Refactorings (e.g., split_last() pattern in settings, RTF extractor) are semantically correct and improve code clarity.

Performance & Production (Agent 4): Benchmark allocations follow correct Criterion pattern (allocated once per group, reused across iterations, not per-measurement). No resource leaks or blocking operations. Reasonable to add explicit sample_size() constraints if benchmarks run on constrained CI systems, but not critical.

Details

  • CI job: New bench-compile job compiles benchmarks with all features (compile-only validation is reasonable)
  • Error handling: PolicyRule::new() now returns Result instead of panicking on invalid regex — test coverage added
  • Benchmark data: Secret patterns built at runtime to avoid CI secret scanner false positives (good practice)
  • Code comments: All unsafe .unwrap() calls are documented with safety justifications

🤖 Generated with Claude Code

Base automatically changed from staging-promote/c79754df-23099429381 to staging-promote/e805ec61-23059634819 March 16, 2026 16:01
Base automatically changed from staging-promote/e805ec61-23059634819 to main March 16, 2026 16:23
@henrypark133 henrypark133 merged commit 218e877 into main Mar 16, 2026
38 checks passed
@henrypark133 henrypark133 deleted the staging-promote/15ab156d-23103553911 branch March 16, 2026 20:26
@ironclaw-ci ironclaw-ci bot mentioned this pull request Mar 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: high Safety, secrets, auth, or critical infrastructure scope: agent Agent core (agent loop, router, scheduler) scope: ci CI/CD workflows scope: config Configuration scope: dependencies Dependency updates scope: extensions Extension management scope: llm LLM integration scope: sandbox Docker sandbox scope: setup Onboarding / setup scope: tool/builtin Built-in tools scope: tool/mcp MCP client scope: tool/wasm WASM tool sandbox scope: workspace Persistent memory / workspace size: XL 500+ changed lines staging-promotion

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants