feat(coprocessor): throttle dependent ops#1905

Closed
Eikix wants to merge 4 commits into main from codex/dependent-ops-throttle

@Eikix Eikix commented Feb 4, 2026

What

  • DB-backed monotonic clamp for dependent ops scheduling (prevents schedule_order inversions across main/catchup/poller).
  • New dependence_chain_schedule table (1 row per chain) used to clamp via upsert+returning.
  • Limiter stays optional; dependent-ops-rate-per-min=0 still disables.

Why

  • Throttling writes schedule_order into the future; later inserts with smaller schedule_order can invert execution order.
  • Per-replica state isn’t enough when multiple host-listener (HL) types (main/catchup/poller) write schedule_order. A DB clamp gives a single monotonic timeline across restarts/instances.
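
To make the inversion concrete, here is a minimal sketch with made-up numbers. It assumes schedule_order can be modelled as microseconds since some epoch and that the limiter simply adds a delay; the helper name schedule_order_us is hypothetical and not taken from the PR.

```rust
/// Hypothetical model: schedule_order (µs) = arrival time + limiter delay.
fn schedule_order_us(arrival_us: u64, delay_us: u64) -> u64 {
    arrival_us + delay_us
}

fn main() {
    // TXa arrives first while the limiter is saturated and is pushed 5 s ahead.
    let tx_a = schedule_order_us(1_000_000, 5_000_000);
    // TXb arrives 1 s later, once the limiter has refilled, so it gets no delay.
    let tx_b = schedule_order_us(2_000_000, 0);
    // Without a clamp, TXb sorts before TXa although it was inserted later.
    assert!(tx_b < tx_a);
    println!("TXa={tx_a} TXb={tx_b} inverted={}", tx_b < tx_a);
}
```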

How

  • For dependent ops only: compute schedule_order, apply limiter delay, then clamp_schedule_order_db:
    • INSERT … ON CONFLICT DO UPDATE with GREATEST(last + 1µs, computed)
    • returns clamped schedule_order
  • New migration: 20260204120000_dependence_chain_schedule.sql
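
The clamp step above could look roughly like the following. This is a sketch under stated assumptions, not the PR's code: the SQL text in CLAMP_SQL is a guess at the upsert shape (only the table and column names come from the migration), and ChainSchedule is a hypothetical in-memory stand-in for the table so the semantics can be run without a database. Timestamps are modelled as µs integers.

```rust
use std::collections::HashMap;

/// Assumed shape of the upsert; the real statement in the PR may differ.
const CLAMP_SQL: &str = "
INSERT INTO dependence_chain_schedule (dependence_chain_id, last_scheduled_at)
VALUES ($1, $2)
ON CONFLICT (dependence_chain_id) DO UPDATE
SET last_scheduled_at = GREATEST(
    dependence_chain_schedule.last_scheduled_at + INTERVAL '1 microsecond',
    EXCLUDED.last_scheduled_at
)
RETURNING last_scheduled_at";

/// Hypothetical in-memory stand-in for the table: one entry per chain.
#[derive(Default)]
struct ChainSchedule {
    last_scheduled_us: HashMap<Vec<u8>, u64>,
}

impl ChainSchedule {
    /// Mirrors GREATEST(last + 1µs, computed) and returns the clamped value.
    fn clamp(&mut self, chain_id: &[u8], computed_us: u64) -> u64 {
        let clamped = match self.last_scheduled_us.get(chain_id) {
            Some(&last) => computed_us.max(last + 1),
            None => computed_us, // first op on this chain: plain insert
        };
        self.last_scheduled_us.insert(chain_id.to_vec(), clamped);
        clamped
    }
}

fn main() {
    let mut table = ChainSchedule::default();
    let a = table.clamp(b"chain-a", 6_000_000); // TXa, deferred by the limiter
    let b = table.clamp(b"chain-a", 2_000_000); // TXb, computed earlier than TXa
    assert_eq!(a, 6_000_000);
    assert_eq!(b, 6_000_001); // pushed to 1µs after TXa
    // Other chains are unaffected.
    assert_eq!(table.clamp(b"chain-b", 2_000_000), 2_000_000);
    println!("{CLAMP_SQL}");
}
```

Because the upsert and the read happen in one statement, concurrent writers on the same chain serialize on the row, which is the hot-row contention noted under Risks.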

Impact / load

  • +1 upsert per dependent op (PK lookup, no scans).
  • Contention only on hot chains (already serialized work).
  • Non-dependent ops unaffected.

Risks / mitigations

  • Hot-row contention → limited to single chain; acceptable for testnet throttling.
  • Missing migration → clear startup failure; documented.
  • Precision edge (µs clamp) → relies on DB timestamp precision; uses 1µs to preserve order.

Why this is minimal

  • No new worker, queue, or scheduler changes.
  • No protocol changes; uses existing schedule_order.

Tracking

Testing

  • cargo fmt --manifest-path coprocessor/fhevm-engine/host-listener/Cargo.toml
  • SQLX_OFFLINE=true cargo check -p host-listener
  • SQLX_OFFLINE=true cargo clippy -p host-listener -- -D warnings

@cla-bot cla-bot bot added the cla-signed label Feb 4, 2026
@Eikix Eikix force-pushed the codex/dependent-ops-throttle branch from 26d26a4 to 6aeb36b Compare February 4, 2026 16:28

Eikix commented Feb 4, 2026

Quick note on Antoniu's inversion concern: the PoC already clamps schedule_order to be monotonic per dependence chain when the limiter is enabled (schedule_order = max(base+defer, last_scheduled+1µs)). This prevents TXb from being scheduled earlier than TXa even if the delay shrinks later.

Caveat: the clamp state is in-memory per replica; it resets on restart and doesn’t coordinate across replicas. If we need strict monotonicity across restarts/replicas we’d need to persist last_scheduled per chain (bigger change).
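
A small sketch of that caveat, assuming µs integer timestamps and a hypothetical helper clamp_in_memory that implements the formula quoted above. Passing None models a replica that has no record of the chain, either because it restarted or because another replica scheduled the earlier transaction.

```rust
/// max(base + defer, last_scheduled + 1µs), with `None` meaning this replica
/// has no in-memory record for the chain.
fn clamp_in_memory(last_scheduled_us: Option<u64>, base_us: u64, defer_us: u64) -> u64 {
    let computed = base_us + defer_us;
    match last_scheduled_us {
        Some(last) => computed.max(last + 1),
        None => computed,
    }
}

fn main() {
    // A replica schedules TXa with a 5 s deferral.
    let tx_a = clamp_in_memory(None, 1_000_000, 5_000_000);

    // Same replica, state intact: TXb lands 1µs after TXa.
    let tx_b_same = clamp_in_memory(Some(tx_a), 2_000_000, 0);
    assert_eq!(tx_b_same, tx_a + 1);

    // After a restart, or on a different replica, the state is gone
    // and TXb is scheduled before TXa.
    let tx_b_fresh = clamp_in_memory(None, 2_000_000, 0);
    assert!(tx_b_fresh < tx_a);
}
```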

@Eikix Eikix force-pushed the codex/dependent-ops-throttle branch from 6aeb36b to 86b8688 Compare February 4, 2026 16:38
@Eikix Eikix changed the title feat(host-listener): throttle dependent ops feat(coprocessor): throttle dependent ops Feb 4, 2026
@Eikix Eikix force-pushed the codex/dependent-ops-throttle branch from 86b8688 to 44472f0 Compare February 4, 2026 16:41

Eikix commented Feb 4, 2026

Next step: implement a minimal PoC that uses DB as source of truth for dependency limit! ETA tmr

mergify bot commented Feb 4, 2026

🧪 CI Insights

Here's what we observed from your CI run for ffab6b1.

🟢 All jobs passed!

But CI Insights is watching 👀

@Eikix Eikix marked this pull request as ready for review February 4, 2026 21:04
@Eikix Eikix requested a review from a team as a code owner February 4, 2026 21:04

Eikix commented Feb 4, 2026

@claude /pr-review

claude bot commented Feb 4, 2026

Claude finished @Eikix's task in 7m 27s


PR Review Complete

  • Pre-review validation
  • Gather context (PR summary & guidelines)
  • Run parallel review agents
  • Validate findings
  • Post review

Summary

Overall this is a well-designed feature that addresses a real problem (schedule_order inversions across host-listener instances). The DB-backed monotonic clamp is a minimal and effective solution.

Issues Found

Severity | Count | Description
Bug | 1 | Type mismatch: TIMESTAMPTZ column vs PrimitiveDateTime return type

Bug Details

Type mismatch in dependence_chain_schedule table - The migration uses TIMESTAMPTZ but the Rust code expects PrimitiveDateTime. In sqlx:

  • TIMESTAMP → PrimitiveDateTime
  • TIMESTAMPTZ → OffsetDateTime

The RETURNING clause will fail at runtime. Fix: Change TIMESTAMPTZ to TIMESTAMP in the migration for consistency with existing schedule_order columns.

Positive Observations

  • Clean token bucket implementation for rate limiting
  • Good use of Prometheus metrics for observability
  • Unit tests cover the limiter logic
  • Helpful documentation in README
  • Graceful degradation when limiter is disabled (rate=0)
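
For readers unfamiliar with the pattern, a token bucket limiter with a rate=0 bypass might be sketched as below. This is an illustration only and not the PR's implementation: the type TokenBucket, its fields, the burst parameter, and the choice to pass time explicitly in µs are all assumptions. Tokens are stored in scaled integer units so that the refill is exactly rate_per_min units per µs.

```rust
/// One token, scaled so that refill is `rate_per_min` units per microsecond.
const UNITS_PER_TOKEN: i64 = 60_000_000;

/// Hypothetical token bucket; `now_us` is supplied by the caller for determinism.
struct TokenBucket {
    rate_per_min: i64,
    capacity: i64,
    tokens: i64,
    last_refill_us: u64,
}

impl TokenBucket {
    fn new(rate_per_min: u32, burst: u32, now_us: u64) -> Self {
        let capacity = i64::from(burst) * UNITS_PER_TOKEN;
        Self {
            rate_per_min: i64::from(rate_per_min),
            capacity,
            tokens: capacity,
            last_refill_us: now_us,
        }
    }

    /// Consumes one token and returns the deferral in µs (0 if a token was available).
    /// A negative balance represents ops already queued behind the limiter.
    fn delay_us(&mut self, now_us: u64) -> u64 {
        if self.rate_per_min == 0 {
            return 0; // limiter disabled
        }
        let elapsed_us = now_us.saturating_sub(self.last_refill_us);
        let elapsed = i64::try_from(elapsed_us).unwrap_or(i64::MAX);
        let refill = elapsed.saturating_mul(self.rate_per_min);
        self.tokens = self.tokens.saturating_add(refill).min(self.capacity);
        self.last_refill_us = now_us;
        self.tokens -= UNITS_PER_TOKEN;
        if self.tokens >= 0 {
            0
        } else {
            // Time until the balance is back to zero, rounded up.
            let deficit = -self.tokens;
            ((deficit + self.rate_per_min - 1) / self.rate_per_min) as u64
        }
    }
}

fn main() {
    // 60 ops/min with a burst of 2: the third op at t=0 waits one second.
    let mut bucket = TokenBucket::new(60, 2, 0);
    assert_eq!(bucket.delay_us(0), 0);
    assert_eq!(bucket.delay_us(0), 0);
    assert_eq!(bucket.delay_us(0), 1_000_000);

    // rate=0 disables the limiter entirely.
    let mut disabled = TokenBucket::new(0, 1, 0);
    assert_eq!(disabled.delay_us(0), 0);
}
```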


@claude claude bot left a comment

[AUTOMATED]

Code Review Summary

Overall this is a well-designed feature that addresses a real problem (schedule_order inversions across host-listener instances). The DB-backed monotonic clamp is a minimal and effective solution.

Issues Found

1 bug requiring attention before merge.

Positive Observations

  • Clean token bucket implementation for rate limiting
  • Good use of Prometheus metrics for observability
  • Unit tests cover the limiter logic
  • Documentation in README is helpful
  • Graceful degradation when limiter is disabled (rate=0)

@@ -0,0 +1,4 @@
CREATE TABLE IF NOT EXISTS dependence_chain_schedule (
dependence_chain_id bytea PRIMARY KEY,
last_scheduled_at TIMESTAMPTZ NOT NULL

[AUTOMATED] Bug: Type mismatch between TIMESTAMPTZ column and PrimitiveDateTime return type

The column last_scheduled_at is defined as TIMESTAMPTZ, but the Rust function clamp_schedule_order_db() returns PrimitiveDateTime. In sqlx with the time crate:

  • TIMESTAMP (without timezone) maps to PrimitiveDateTime
  • TIMESTAMPTZ (with timezone) maps to OffsetDateTime

The RETURNING last_scheduled_at clause will return a TIMESTAMPTZ value, and sqlx will fail at runtime when trying to decode it into PrimitiveDateTime.

Evidence: The existing schedule_order columns in the codebase use TIMESTAMP (not TIMESTAMPTZ):

-- 20250703000000_add_schedule_order_column.sql
ADD COLUMN IF NOT EXISTS schedule_order TIMESTAMP NOT NULL DEFAULT NOW();

Suggested fix: Change TIMESTAMPTZ to TIMESTAMP for consistency with existing schedule_order columns:

CREATE TABLE IF NOT EXISTS dependence_chain_schedule (
    dependence_chain_id  bytea PRIMARY KEY,
    last_scheduled_at     TIMESTAMP NOT NULL
);

Confidence: 92/100

@Eikix Eikix marked this pull request as draft February 4, 2026 21:28
@Eikix
Copy link
Copy Markdown
Contributor Author

Eikix commented Feb 5, 2026

Closing, in favour of #1907 which replaces it

@Eikix Eikix closed this Feb 5, 2026