feat(09): backup API for embedded and server modes#81
Conversation
…t constructors - BackupResult<S> wraps BackupManifest + new session (nullable for leave-closed/stopped) - BackupExecutor implements close-copy-manifest-restart algorithm with SHA-256 hashing - All-args package-private constructors added to BackupManifest and BackupFileMetadata - Mode validation: embedded rejects leaveStopped, server rejects leaveClosed - Unit tests cover copy, manifest JSON round-trip, mode rejection, path containment
- EmbeddedSession expanded to 8-param constructor with backup callback slot - ServerSession expanded to 7-param constructor, backup stub replaced with real impl - JNA backend constructs backup lambdas in doStartEmbedded and doStartServer - Panama backend constructs backup lambdas in doStartEmbedded and doStartServer - BackupExecutor made public for cross-module access from backends - Tests updated for new constructor signatures with null-rejection coverage
- SUMMARY.md with execution metrics and deviation documentation - STATE.md updated with phase 9 position and decisions - ROADMAP.md marks phase 9 plan 1 complete - REQUIREMENTS.md marks BKUP-01, BKUP-02, BKUP-03 complete
# Conflicts: # .planning/ROADMAP.md # .planning/STATE.md
- Sentinel file copy, manifest parsing, leaveClosed behavior - Mode-specific option rejection (leaveStopped on embedded) - Null options and closed session guard tests
- Sentinel file copy, manifest parsing, server restart after backup - leaveStopped behavior, leaveClosed rejection - Null options and closed session guard tests - findFreePort helper for port allocation
- SUMMARY.md with 20 tests across 4 files - STATE.md, ROADMAP.md, REQUIREMENTS.md updated - BKUP-04 marked complete
Critical: - Fix native handle leak: move closed.set(true) to finally block after backupAction runs, preventing handle leak on pre-validation failures - Fix premature directory deletion: only clean up backup dir in leave-inactive and restart-failure paths, not when restart succeeds High: - Reject non-existent source path with explicit error instead of silently creating empty backup - Narrow catch to IOException only; let RuntimeException propagate immediately instead of treating programming errors as recoverable - Propagate IOException exc in deleteDirectoryQuietly postVisitDirectory Important: - Return "0644" fallback on non-POSIX systems instead of null for manifest mode field interop with Go implementation - Remove dead Gson no-arg constructor from BackupManifest - Add fileCount/totalBytes/sizeBytes non-negative bounds checks - Replace internal "Phase 10" references with user-facing messages - Remove dead BackupOptions.toJson() Tests: - Add 5 null-arg rejection tests for BackupExecutor.execute() - Add backupSetsClosedAfterActionNotBefore test - Add closeAfterBackupIsNoOp tests for both session types - Add manifest list immutability tests - Add extractPersistPath precedence test - Add BackupFileMetadataTest with null/bounds validation
|
Claude finished @tazarov's task in 2m 24s —— View job Code Review — feat(09): backup API
Overall the implementation is solid — the transactional error-handling model is well-thought-out, the lock-based concurrency in session classes is correct, and the test coverage is thorough. A few issues are worth addressing before merge. Findings1. Bug: Partial backup directory not cleaned up when copy fails and restart succeeds
When // restart succeeded
if (backupError != null) {
throw toRuntimeException(backupError); // dest NOT cleaned up
}The The test 2. Inconsistency: embedded persist path sourced from YAML, server sourced from FFI
For embedded sessions, For server sessions (line 141), the path is fetched from the running shim via If the Chroma shim resolves relative paths differently from the JVM (e.g., relative to the shim's own working directory or binary location), the embedded backup would copy from the wrong directory. In practice the working directories are likely the same, but the server approach is more robust. Consider using an FFI accessor for embedded sessions too, or documenting the assumption. 3. Minor:
|
| Priority | Finding |
|---|---|
| Bug | #1 — Partial backup not cleaned up when copy fails + restart succeeds |
| Design | #2 — Embedded persistPath sourced from YAML vs FFI |
| Minor | #3, #4, #5, #6 — Cosmetic / documentation |
- Fix partial backup cleanup on copy-fail + restart-success path - Use FFI chroma_embedded_persist_path instead of YAML parsing - Remove redundant savedYaml aliases in JNA/Panama runtimes - Pass actual shim version as wrapperVersion in backup manifest - Enforce non-null mode in BackupFileMetadata record
|
Claude finished @tazarov's task in 2m 37s —— View job Code Review — feat(09): backup API (round 2)
|
| # | Finding | Status |
|---|---|---|
| 1 | Bug: partial backup not cleaned up on copy-fail + restart-success | ✅ Fixed (BackupExecutor.java:124-127 + test at line 287) |
| 2 | Inconsistency: embedded persist path from YAML vs FFI | ✅ Fixed — both modes now use embeddedPersistPath(handle) via FFI |
| 3 | Minor: redundant savedYaml alias |
✅ Fixed — lambda captures configYaml directly |
| 4 | Minor: wrapperVersion hardcoded to "java" |
✅ Fixed — now uses doVersion() (actual shim version) |
| 5 | Minor: BackupFileMetadata.mode nullable but undocumented |
✅ Fixed — compact constructor now rejects null via requireNonNull |
| 6 | Minor: BackupManifest.files() returns empty list without Javadoc |
Still present — cosmetic only |
New Finding
Bug: Native handle leaked on pre-validation failure in backup()
EmbeddedSession.java:130-134 and ServerSession.java:137-141
Both session classes have the same pattern:
try {
return backupAction.apply(options);
} finally {
closed.set(true); // runs even when BackupExecutor throws before close
}BackupExecutor.execute() can throw before calling closeAction — specifically during pre-validation at:
- line 52:
isWithinPath(dest, source)—IllegalArgumentExceptionif dest is inside source - line 57:
ensureEmptyDir(dest)—IllegalArgumentExceptionif dest is non-empty/file, orChromaExceptionon I/O failure
If either of these throws, the session's closeAction has not been called — the native handle is still alive. But closed.set(true) still executes in the finally block, which means:
- The session is now in a zombie state: not open, but handle not freed
- Subsequent
session.close()calls seeclosed == trueand skipcloseAction - The native handle is permanently leaked
The PR description mentions "Fix native handle leak on pre-validation failure" as one of the 18 previously-addressed findings, but the current code still has this gap for failures that occur before the closeAction runs inside BackupExecutor.
Fix: closed should only be set true once the close action has actually been invoked. One approach: propagate pre-validation exceptions without touching closed, and only mark the session closed after closeAction succeeds. A cleaner approach is to have BackupExecutor signal whether the close action ran (e.g. a boolean [] sentinel, or a wrapper exception type), so the session can decide whether to mark itself closed.
Summary
The implementation is in very good shape — all round-1 findings were resolved cleanly. Finding #6 (Javadoc on files()) remains but is cosmetic. The new finding above is a real resource-leak bug affecting the pre-validation error path; the happy path and post-close error paths are all correct.
EmbeddedSession and ServerSession set closed=true in a finally block during backup(), even when BackupExecutor threw before invoking closeAction (e.g. destination-inside-source or non-empty-destination). This leaked the native handle since subsequent close() calls were no-ops. Add PreValidationFailure marker exception to BackupExecutor so sessions can distinguish pre-validation failures (session stays open) from post-closeAction failures (session marked closed to prevent double-free).
|
Claude finished @tazarov's task in 1m 18s —— View job Code Review — feat(09): backup API (round 3)
Round-2 Finding — Status
Fix VerificationThe pre-validation leak fix is clean. The solution introduces a // EmbeddedSession.java:130-139 (ServerSession mirrors this)
try {
BackupResult<EmbeddedSession> result = backupAction.apply(options);
closed.set(true); // set only on success
return result;
} catch (BackupExecutor.PreValidationFailure e) {
throw (RuntimeException) e.getCause(); // no closed.set — session stays alive ✅
} catch (RuntimeException e) {
closed.set(true); // close action ran (or tried), mark invalidated ✅
throw e;
}This correctly handles all three states:
One Remaining Cosmetic Note
public List<BackupFileMetadata> files() {
return files == null ? Collections.emptyList() : Collections.unmodifiableList(files);
}When SummaryAll substantive findings across both rounds are resolved. The |

Summary
Phase 9: Backup API
Goal: Users can back up Chroma data from both embedded and server modes, producing a directory with a manifest file that records backup metadata
Status: Verified (6/6 must-haves)
Implements Java backup API across core, JNA, and Panama modules.
BackupExecutorhandles the close-copy-manifest-restart lifecycle with transactional error handling. BothEmbeddedSession.backup()andServerSession.backup()delegate to this executor with mode-specific close/restart actions. Produces a backup directory containing apersist/snapshot andbackup_manifest.jsonwith SHA-256 checksums.Changes
Plan 01: Core backup implementation
BackupExecutor— static utility implementing close-copy-manifest-restart with cleanup-on-failureBackupResult<S>— generic record wrapping manifest + new sessionBackupMode— enum for embedded/server wire formatBackupOptions— builder with destination, includeMetadata, leaveInactive flagsEmbeddedSessionandServerSessionPlan 02: Integration tests
Review fixes (18 findings addressed)
Requirements Addressed
EmbeddedSession.backup(options)performs directory copy with manifestServerSession.backup(options)performs stop-backup-restart cycleBackupOptionsbuilder supports destination, includeMetadata, leaveInactiveVerification
Key Files
Created:
java/core/.../BackupExecutor.java— core backup algorithmjava/core/.../BackupResult.java— generic result recordjava/core/.../BackupMode.java— embedded/server enumjava/core/.../BackupFileMetadataTest.java— metadata validation testsjava/jna/.../JnaEmbeddedBackupTest.java+JnaServerBackupTest.javajava/panama/.../PanamaEmbeddedBackupTest.java+PanamaServerBackupTest.javaModified:
java/core/.../EmbeddedSession.java— backup method + lockjava/core/.../ServerSession.java— backup method + lockjava/core/.../BackupManifest.java— bounds checks, removed dead constructorjava/core/.../BackupOptions.java— removed dead toJsonjava/jna/.../JnaChromaRuntime.java— backup lambda constructionjava/panama/.../PanamaChromaRuntime.java— backup lambda construction