Skip to content

refactor(sync): drop batch_id protocol fields#1

Merged
haex-space merged 2 commits into
mainfrom
refactor/remove-batch-fields
Apr 21, 2026
Merged

refactor(sync): drop batch_id protocol fields#1
haex-space merged 2 commits into
mainfrom
refactor/remove-batch-fields

Conversation

@haex-space

@haex-space haex-space commented Apr 21, 2026

Copy link
Copy Markdown
Owner

Summary

  • Remove batchId / batchSeq / batchTotal from pushChangeSchema and the SyncChange wire type
  • Delete validateBatches + BatchValidationError and the /sync/push call-site that invoked them
  • Drop tests/syncBatchValidation.test.ts (validated a protocol the client no longer speaks)

Why

Transaction-scope HLC (see haex-vault refactor/crdt-transaction-scope-hlc) is now the semantic grouping key for a sender-side transaction. The batch_id trio was transport metadata from an older protocol iteration that never reached the database, and the client has already stopped sending it, so the server-side validator ran against nothing.

HLC-group atomicity is enforced by the client instead: it chunks pushes at HLC boundaries, and the server persists every request in a single db.transaction().

Companion work

Test plan

  • bun test — 279/279 pass after validateBatches removed
  • E2E: tests/sync suite on the companion branches — 97/97 pass
  • CI green on this branch

Summary by CodeRabbit

  • Refactor
    • Removed batch grouping fields from sync protocol
    • Server-side batch validation eliminated; atomicity now handled through client-side chunking at transaction boundaries
    • Updated sync endpoint documentation to clarify transaction-scoped change handling

Transaction-scope HLC (see haex-vault refactor/crdt-transaction-scope-hlc)
is now the semantic grouping key for a sender-side transaction. batchId /
batchSeq / batchTotal were transport metadata from an earlier protocol
iteration that never made it to the database; the client no longer sends
them and the completeness check based on them ran against nothing.

Remove the fields from pushChangeSchema and the SyncChange wire type,
delete validateBatches + BatchValidationError and their call-site, and
drop the corresponding unit tests. HLC-group atomicity is now enforced
upstream: the client chunks pushes at HLC boundaries, and the server
persists every request in a single db.transaction().
@coderabbitai

coderabbitai Bot commented Apr 21, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@haex-space has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 9 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 3 minutes and 9 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 45f338d7-6423-4797-b94d-43d54ebf0ce4

📥 Commits

Reviewing files that changed from the base of the PR and between 813db75 and b400284.

📒 Files selected for processing (1)
  • tests/helpers/setup.ts
📝 Walkthrough

Walkthrough

This pull request removes batch validation infrastructure from the sync system. The changes eliminate batch-related schema fields (batchId, batchSeq, batchTotal), the server-side validateBatches function, and corresponding tests. The system now relies exclusively on HLC timestamps for transaction grouping and client-side chunking at HLC boundaries.

Changes

Cohort / File(s) Summary
Schema and Type Updates
src/routes/sync.schemas.ts, src/utils/syncUtils.ts
Removed optional batch fields (batchId, batchSeq, batchTotal) from schema and SyncChange type definition. Updated hlcTimestamp documentation to clarify transaction-scoped semantics.
Batch Validation Removal
src/routes/sync.ts
Removed validateBatches import and function call from /sync/push handler. Updated exports to remove validateBatches and BatchValidationError. Modified endpoint comment to reflect HLC-group atomicity assumptions and note absence of server-side completeness verification.
Test Cleanup
tests/syncBatchValidation.test.ts
Deleted entire test suite covering batch validation scenarios including completeness checks, duplicate sequence detection, and edge cases.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Batches be gone, HLC takes the stage,
No more sequence counts to engage,
Transaction-scoped timestamps align,
Client-side chunking keeps things fine!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: removing batch_id protocol fields from the sync module, which is the primary refactoring across all modified files.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/remove-batch-fields

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routes/sync.schemas.ts`:
- Line 45: The hlcTimestamp field currently uses hlcTimestamp: z.string() which
is too permissive; replace it with a structural validator—either
z.string().regex(...) with the canonical HLC pattern or z.string().refine(value
=> isValidHlc(value), { message: 'invalid HLC timestamp' }) and add a small
helper isValidHlc or parseHlcTimestamp that enforces the HLC format/components
(e.g., logical counter, wall-time, node id) so malformed values are rejected by
the schema; update the schema entry for hlcTimestamp to use that validator and
return a clear error message on failure.

In `@src/routes/sync.ts`:
- Around line 126-129: Before starting the transaction, validate that all
incoming records in the sync request have the same hlcTimestamp to enforce the
single-HLC invariant: iterate the array payload (e.g., request body field that
contains changes/ops), collect the distinct hlcTimestamp values and if more than
one is present return a 400 error (do not start the transaction) with an
explanatory message; this check should be placed in the sync route handler just
before the transaction block that uses lastHlc so lastHlc remains unambiguous.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 1e06ec2e-c7b8-42e7-a220-86b60ba35c7a

📥 Commits

Reviewing files that changed from the base of the PR and between 8fc1503 and 813db75.

📒 Files selected for processing (4)
  • src/routes/sync.schemas.ts
  • src/routes/sync.ts
  • src/utils/syncUtils.ts
  • tests/syncBatchValidation.test.ts
💤 Files with no reviewable changes (1)
  • tests/syncBatchValidation.test.ts

rowPks: z.string(), // JSON string
columnName: sqlIdentifier.nullable(),
hlcTimestamp: z.string(),
hlcTimestamp: z.string(), // Transaction-scope HLC; shared by every change from one sender-side tx

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate hlcTimestamp structurally instead of accepting any string.

On Line 45, hlcTimestamp: z.string() is too permissive now that HLC is the primary transaction/ordering key. Malformed values can break ordering assumptions in push conflict handling.

Suggested hardening
 export const pushChangeSchema = z.object({
   tableName: sqlIdentifier,
   rowPks: z.string(), // JSON string
   columnName: sqlIdentifier.nullable(),
-  hlcTimestamp: z.string(), // Transaction-scope HLC; shared by every change from one sender-side tx
+  hlcTimestamp: z
+    .string()
+    .min(1)
+    // Replace with your canonical HLC regex/parser if available.
+    .regex(/^[^\\s]+$/, 'invalid hlcTimestamp'),
   deviceId: z.string().optional(),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
hlcTimestamp: z.string(), // Transaction-scope HLC; shared by every change from one sender-side tx
export const pushChangeSchema = z.object({
tableName: sqlIdentifier,
rowPks: z.string(), // JSON string
columnName: sqlIdentifier.nullable(),
hlcTimestamp: z
.string()
.min(1)
// Replace with your canonical HLC regex/parser if available.
.regex(/^[^\s]+$/, 'invalid hlcTimestamp'),
deviceId: z.string().optional(),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/sync.schemas.ts` at line 45, The hlcTimestamp field currently uses
hlcTimestamp: z.string() which is too permissive; replace it with a structural
validator—either z.string().regex(...) with the canonical HLC pattern or
z.string().refine(value => isValidHlc(value), { message: 'invalid HLC timestamp'
}) and add a small helper isValidHlc or parseHlcTimestamp that enforces the HLC
format/components (e.g., logical counter, wall-time, node id) so malformed
values are rejected by the schema; update the schema entry for hlcTimestamp to
use that validator and return a clear error message on failure.

Comment thread src/routes/sync.ts
Comment on lines +126 to +129
// Apply changes atomically in a transaction so either ALL changes are
// applied or NONE. HLC-group atomicity is guaranteed by the client
// chunking at HLC boundaries before calling this endpoint — the server
// trusts the sender's grouping and does no completeness verification.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Enforce the single-HLC request invariant server-side.

Line 126-Line 129 explicitly trusts client chunking, but there is no guard against mixed hlcTimestamp values in one request. That can violate the documented contract and make lastHlc ambiguous.

Suggested guard before transaction
   const changes = rawChanges as PushChange[]
+  const uniqueHlc = new Set(changes.map((c) => c.hlcTimestamp))
+  if (uniqueHlc.size > 1) {
+    return c.json({ error: 'All changes in one push request must share the same hlcTimestamp' }, 400)
+  }

   try {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/sync.ts` around lines 126 - 129, Before starting the transaction,
validate that all incoming records in the sync request have the same
hlcTimestamp to enforce the single-HLC invariant: iterate the array payload
(e.g., request body field that contains changes/ops), collect the distinct
hlcTimestamp values and if more than one is present return a 400 error (do not
start the transaction) with an explanatory message; this check should be placed
in the sync route handler just before the transaction block that uses lastHlc so
lastHlc remains unambiguous.

src/db/index.ts throws on module load when DATABASE_URL is unset.
federation.helpers.test.ts mocks ../src/db via mock.module(), but ESM
imports are hoisted, so the mock is registered too late when another
test file triggers the load first. CI runner order surfaces this,
local runs do not.

Same fix pattern as STORAGE_ENCRYPTION_KEY (2d14c95): set a dummy
value in the bun preload so module-load never throws. postgres() is
lazy and opens no connection until a query runs, so tests that mock
the db see no real I/O.
@haex-space haex-space merged commit 572fe06 into main Apr 21, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant