[ENH](faults): Add Tilt fault injection CLI #6881

Open
rescrv wants to merge 2 commits into rescrv/wire-up-faults from rescrv/fault-inject-tilt

@rescrv rescrv commented Apr 10, 2026

Description of changes

Add a chroma-fault binary for injecting, listing, and clearing
faults against Tilt's rust-log-service with either Tilt instance
defaults or an explicit service address.
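The choice between Tilt defaults and an explicit address could be modeled roughly as below. This is only an illustrative sketch: the `Target` type, the `from_arg` and `endpoint` helpers, and the `ASSUMED_TILT_DEFAULT` address are all hypothetical and are not taken from the `chroma-fault` source.

```rust
// Hypothetical sketch: none of these names or addresses come from the PR.
// The default below is a placeholder, not the real Tilt log-service address.
const ASSUMED_TILT_DEFAULT: &str = "http://localhost:50054";

/// Where the CLI should send fault injection requests.
#[derive(Debug, PartialEq)]
enum Target {
    /// Fall back to the (assumed) Tilt default address.
    TiltDefault,
    /// Use the address supplied by the caller.
    Explicit(String),
}

impl Target {
    /// Prefer an explicit, non-empty address; otherwise use the Tilt default.
    fn from_arg(addr: Option<&str>) -> Self {
        match addr {
            Some(a) if !a.trim().is_empty() => Target::Explicit(a.trim().to_string()),
            _ => Target::TiltDefault,
        }
    }

    /// The endpoint string a gRPC client would connect to.
    fn endpoint(&self) -> &str {
        match self {
            Target::TiltDefault => ASSUMED_TILT_DEFAULT,
            Target::Explicit(a) => a,
        }
    }
}

fn main() {
    // The first positional argument, if present, is treated as the address.
    let target = Target::from_arg(std::env::args().nth(1).as_deref());
    println!("would connect to {}", target.endpoint());
}
```

Keeping the fallback in one place means the inject, list, and clear paths would all resolve the address the same way.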

Wire the faults feature into Tilt and CI builds for
chroma-log-service, and only register the fault injection gRPC
service when that feature is enabled.
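For readers unfamiliar with Cargo feature gating, conditional registration generally looks like the sketch below. It is a simplified stand-in, not the log-service code: `registered_services` is a hypothetical helper, the service names are plain placeholder strings rather than real gRPC services, and it assumes the crate declares a `faults` feature in its `Cargo.toml`.

```rust
/// Hypothetical helper: returns the names of the services that would be
/// registered. Real code would add gRPC services to a server builder instead.
fn registered_services() -> Vec<&'static str> {
    // `mut` is only needed when the `faults` feature is enabled.
    #[allow(unused_mut)]
    let mut services = vec!["LogService"];

    // This statement is compiled only when building with `--features faults`,
    // so a default build never exposes the fault injection endpoint.
    #[cfg(feature = "faults")]
    services.push("FaultInjectionService");

    services
}

fn main() {
    for name in registered_services() {
        println!("registering {name}");
    }
}
```

Because the gate is applied at compile time, a build without the feature contains no fault injection code path at all, rather than one that is merely disabled at runtime.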

Add warning logs around injected wal3 upload faults and read
repair, and raise the garbage collector dispatcher queue sizes in
worker configs.

Test plan

Existing tests run in CI; the new feature is only tested manually right now.

Migration plan

N/A

Observability plan

N/A

Documentation Changes

N/A

Co-authored-by: AI

@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

propel-code-bot bot commented Apr 10, 2026

Add chroma-fault CLI and wire feature-gated fault injection across log-service, wal3, and Tilt/CI

This PR introduces a new Rust CLI binary chroma-fault in rust/faults/src/bin/chroma-fault.rs to inject, list, and clear faults via gRPC against Tilt rust-log-service instances, with support for default Tilt addresses or explicit endpoints. It also expands fault handling in wal3 and log-service to support both global upload faults and replica-specific upload faults, while adding structured warning logs for injected faults and read-repair activity.

In addition, the PR wires the faults feature into local/CI image builds and runtime service registration. rust-log-service now conditionally registers FaultInjectionService only when the faults feature is enabled, and build pipelines (Tiltfile, .github/actions/tilt-setup-prebuild/docker-bake.hcl, rust/Dockerfile) pass feature flags accordingly. Worker config queue sizes for garbage_collector are increased, and related tests/exports/dependencies are updated to support the new behavior.

This summary was automatically generated by @propel-code-bot

@propel-code-bot propel-code-bot bot left a comment

No issues were identified across reviewed changes; implementation appears sound and ready.

Status: No Issues Found | Risk: Low

Review Details

📁 16 files reviewed | 💬 0 comments

@rescrv rescrv changed the title from "feat(faults): Add Tilt fault injection CLI" to "[ENH](faults): Add Tilt fault injection CLI" on Apr 10, 2026
blacksmith-sh bot commented Apr 10, 2026

Found 2 test failures on Blacksmith runners:

Failures

  • test_invalid_sha256
  • test_default_ef/test_invalid_sha256[single-region]
