Conversation
| const LOCK_OPTIONS: LockOptions = { ...LOCK_OPTIONS_SYNC, retries: { retries: 20, minTimeout: 50, maxTimeout: 1000 } }; | ||
| const STALE_MS = 30_000; | ||
| const LOCK_OPTIONS_SYNC: LockOptions = { stale: STALE_MS, realpath: false }; | ||
| const LOCK_OPTIONS: LockOptions = { |
There was a problem hiding this comment.
nit: cleanupStaleTmpFiles() runs before the lock is acquired. Since tmp files are named {name}.enc.{pid}.{counter}.tmp and the cleanup matches any PID prefix for this store name, there's a small race with other language server processes (e.g. multiple VS Code windows sharing the same storageDir) — process A could delete a .tmp file that process B is about to rename.
Moving this call inside the lock block (after lockSyncWithRetry, before the existsSync check) would close that window.
| this.saveSync(); | ||
| } | ||
| } else { | ||
| this.saveSync(); |
There was a problem hiding this comment.
lockSyncWithRetry does up to 200 × 50ms of Atomics.wait, which hard-blocks the Node.js event loop for up to 10 seconds. This runs in the constructor during LSP initialization, so the editor gets no responses while it's spinning.
The main scenario: process A crashes while holding the lock. The stale timeout is 30s, so process B retries synchronously until proper-lockfile considers the lock stale. If that takes longer than the 10s retry budget, the server fails to start.
This is an improvement over the previous no-retry lockSync (which would just fail immediately), but worth noting the tradeoff. Could the constructor fail fast and defer store initialization to an async path to avoid blocking?
Purpose
Prepare the CloudFormation Language Server to migrate its persistence layer from LMDB to an encrypted file-based datastore on all platforms (currently only Windows uses it). The migration is gated behind a new feature flag and the file store itself is hardened for production use.
Two Core Changes
1. New
FileDbFeature FlagA new feature flag controls whether the file-based datastore is used instead of LMDB on non-Windows systems.
Only alpha is opted in
2. EncryptedFileStore Hardening
The file-based datastore received significant reliability improvements across six areas:
Atomic writes with fsync — After writing a temp file, data is flushed to disk before the atomic rename. The parent directory is also fsynced to ensure the directory entry is durable. Windows silently skips directory fsync since NTFS journaling handles durability.
Rename retry logic — File renames now retry up to 10 times on transient OS errors (EPERM, EACCES, EBUSY, ENOENT). This handles Windows-specific issues like antivirus or indexers holding file handles. Failed temp files are cleaned up on final failure.
Serialized write queue — Writes are chained through an internal promise queue so multiple async operations within the same process don't interleave. Each write waits for the previous one to finish before acquiring the cross-process file lock.
Improved locking strategy — The lock target changed from the data file itself to the parent directory with an explicit lockfile path. This avoids issues where the data file gets replaced via rename while a lock still references it. Lock retry parameters were retuned: more retries, shorter intervals, and randomized jitter to reduce contention. The synchronous lock used during construction now retries instead of failing immediately.
Stale temp file cleanup — On startup, the store scans its directory and removes leftover
.tmpfiles from prior crashed processes. Only files matching the store's own name prefix are removed.Resilience to external file deletion — If the data file is deleted between operations (e.g., by another process), the store gracefully starts from an empty state instead of throwing an error.