Filedb improv by satyakigh · Pull Request #537 · aws-cloudformation/cloudformation-languageserver

satyakigh · 2026-04-23T20:36:40Z

Purpose

Prepare the CloudFormation Language Server to migrate its persistence layer from LMDB to an encrypted file-based datastore on all platforms (currently only Windows uses it). The migration is gated behind a new feature flag and the file store itself is hardened for production use.

Two Core Changes

1. New `FileDb` Feature Flag

A new feature flag controls whether the file-based datastore is used instead of LMDB on non-Windows systems.

Environment	Enabled	Fleet %
Alpha	Yes	100%
Beta	No	100%
Prod	No	0%

Only Alpha is opted in.
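A minimal sketch of how the table above might translate into a store-selection check. The `FlagConfig` shape, the bucket-based fleet percentage, and both function names are hypothetical; the PR description does not show how the flag is actually evaluated.

```typescript
// Hypothetical shape for one row of the rollout table; not the project's real type.
interface FlagConfig {
    enabled: boolean;
    fleetPercent: number; // 0..100
}

// Values copied from the table in the PR description.
const FILE_DB_FLAG: Record<string, FlagConfig> = {
    alpha: { enabled: true, fleetPercent: 100 },
    beta: { enabled: false, fleetPercent: 100 },
    prod: { enabled: false, fleetPercent: 0 },
};

// Assumption: each client maps to a stable bucket in [0, 100), and the flag is on
// only when it is enabled AND the bucket falls under the fleet percentage.
function isFlagOn(config: FlagConfig, bucket: number): boolean {
    return config.enabled && bucket < config.fleetPercent;
}

// Windows already uses the file store; on other platforms the flag decides.
function useFileStore(platform: string, env: string, bucket: number): boolean {
    if (platform === "win32") return true;
    const config = FILE_DB_FLAG[env];
    return config !== undefined && isFlagOn(config, bucket);
}
```

Under these assumptions, a Linux client in Alpha gets the file store, while Linux clients in Beta and Prod stay on LMDB regardless of bucket.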

2. EncryptedFileStore Hardening

The file-based datastore received significant reliability improvements across six areas:

Atomic writes with fsync — After writing a temp file, data is flushed to disk before the atomic rename. The parent directory is also fsynced to ensure the directory entry is durable. Windows silently skips directory fsync since NTFS journaling handles durability.
Rename retry logic — File renames now retry up to 10 times on transient OS errors (EPERM, EACCES, EBUSY, ENOENT). This handles Windows-specific issues like antivirus or indexers holding file handles. Failed temp files are cleaned up on final failure.
Serialized write queue — Writes are chained through an internal promise queue so multiple async operations within the same process don't interleave. Each write waits for the previous one to finish before acquiring the cross-process file lock.
Improved locking strategy — The lock target changed from the data file itself to the parent directory with an explicit lockfile path. This avoids issues where the data file gets replaced via rename while a lock still references it. Lock retry parameters were retuned: more retries, shorter intervals, and randomized jitter to reduce contention. The synchronous lock used during construction now retries instead of failing immediately.
Stale temp file cleanup — On startup, the store scans its directory and removes leftover .tmp files from prior crashed processes. Only files matching the store's own name prefix are removed.
Resilience to external file deletion — If the data file is deleted between operations (e.g., by another process), the store gracefully starts from an empty state instead of throwing an error.
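The first three items can be sketched with Node's built-in `fs` module. This is an illustration of the techniques as described, not the code from the PR: the function names, the `WriteQueue` class, the temp file naming, and the absence of a delay between rename attempts are all assumptions.

```typescript
import { closeSync, fsyncSync, mkdtempSync, openSync, readFileSync, renameSync, unlinkSync, writeSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Error codes the PR description lists as transient.
const RETRYABLE = new Set(["EPERM", "EACCES", "EBUSY", "ENOENT"]);

// Rename with bounded attempts; removes the temp file if every attempt fails.
// A real implementation would likely pause between attempts.
function renameWithRetry(from: string, to: string, maxAttempts = 10): void {
    for (let attempt = 1; ; attempt++) {
        try {
            renameSync(from, to);
            return;
        } catch (err) {
            const code = (err as NodeJS.ErrnoException).code;
            if (attempt < maxAttempts && code !== undefined && RETRYABLE.has(code)) continue;
            try {
                unlinkSync(from);
            } catch {
                // Temp file is already gone; nothing to clean up.
            }
            throw err;
        }
    }
}

// Flush the directory entry; skipped on Windows, as the description states.
function fsyncDir(dir: string): void {
    if (process.platform === "win32") return;
    const fd = openSync(dir, "r");
    try {
        fsyncSync(fd);
    } finally {
        closeSync(fd);
    }
}

// Write to a temp file, fsync it, rename it over the target, then fsync the parent.
function atomicWriteSync(target: string, data: string): void {
    const tmp = `${target}.${process.pid}.tmp`; // hypothetical naming
    const fd = openSync(tmp, "w");
    try {
        writeSync(fd, data);
        fsyncSync(fd);
    } finally {
        closeSync(fd);
    }
    renameWithRetry(tmp, target);
    fsyncDir(dirname(target));
}

// In-process write queue: each task starts only after the previous one settles.
class WriteQueue {
    private tail: Promise<void> = Promise.resolve();

    enqueue<T>(task: () => Promise<T>): Promise<T> {
        const result = this.tail.then(task);
        // Swallow the outcome on the chain so one failed write does not block later ones.
        this.tail = result.then(() => undefined, () => undefined);
        return result;
    }
}

// Usage: two writes to a scratch directory; the second replaces the first.
const dir = mkdtempSync(join(tmpdir(), "filedb-sketch-"));
const target = join(dir, "store.enc");
atomicWriteSync(target, "first");
atomicWriteSync(target, "second");
console.log(readFileSync(target, "utf8"));
```

In the store described above, the task passed to `enqueue` would acquire the cross-process lock before writing, so the queue orders writes within one process and the lock orders them across processes.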

kddejong · 2026-04-24T15:36:34Z

-const LOCK_OPTIONS: LockOptions = { ...LOCK_OPTIONS_SYNC, retries: { retries: 20, minTimeout: 50, maxTimeout: 1000 } };
+const STALE_MS = 30_000;
+const LOCK_OPTIONS_SYNC: LockOptions = { stale: STALE_MS, realpath: false };
+const LOCK_OPTIONS: LockOptions = {


nit: cleanupStaleTmpFiles() runs before the lock is acquired. Since tmp files are named {name}.enc.{pid}.{counter}.tmp and the cleanup matches any PID prefix for this store name, there's a small race with other language server processes (e.g. multiple VS Code windows sharing the same storageDir) — process A could delete a .tmp file that process B is about to rename.

Moving this call inside the lock block (after lockSyncWithRetry, before the existsSync check) would close that window.
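For reference, a rough sketch of the prefix match this comment describes, assuming the `{name}.enc.{pid}.{counter}.tmp` naming quoted above. The function name comes from the comment, but the signature and body here are assumptions, not the PR's code. Because the match ignores the PID, running it outside the lock can remove a temp file that another live process still needs.

```typescript
import { mkdtempSync, readdirSync, unlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Removes leftover temp files for one store and returns the names it deleted.
// Assumed signature; intended to be called while holding the store's lock.
function cleanupStaleTmpFiles(dir: string, storeName: string): string[] {
    const prefix = `${storeName}.enc.`;
    const removed: string[] = [];
    for (const entry of readdirSync(dir)) {
        // Matches any PID and counter, which is why the lock matters.
        if (!entry.startsWith(prefix) || !entry.endsWith(".tmp")) continue;
        try {
            unlinkSync(join(dir, entry));
            removed.push(entry);
        } catch {
            // Another process may have renamed or removed the file already.
        }
    }
    return removed;
}

// Usage: only the temp file belonging to "store" is removed.
const dir = mkdtempSync(join(tmpdir(), "filedb-cleanup-"));
for (const file of ["store.enc", "store.enc.4242.0.tmp", "other.enc.4242.0.tmp"]) {
    writeFileSync(join(dir, file), "");
}
const removed = cleanupStaleTmpFiles(dir, "store");
console.log(removed);
```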

kddejong · 2026-04-24T17:08:54Z

+                    this.saveSync();
+                }
+            } else {
                this.saveSync();


lockSyncWithRetry does up to 200 × 50ms of Atomics.wait, which hard-blocks the Node.js event loop for up to 10 seconds. This runs in the constructor during LSP initialization, so the editor gets no responses while it's spinning.

The main scenario: process A crashes while holding the lock. The stale timeout is 30s, so process B retries synchronously until proper-lockfile considers the lock stale. If that takes longer than the 10s retry budget, the server fails to start.

This is an improvement over the previous no-retry lockSync (which would just fail immediately), but worth noting the tradeoff. Could the constructor fail fast and defer store initialization to an async path to avoid blocking?
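To make the blocking cost concrete, here is a small sketch of a synchronous retry loop built on `Atomics.wait`. It is not the PR's `lockSyncWithRetry`; `sleepSync` and `retrySync` are hypothetical names, and the lock attempt is represented by an arbitrary callback.

```typescript
// Blocks the calling thread for roughly `ms` milliseconds; nothing else runs meanwhile.
function sleepSync(ms: number): void {
    Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Calls `attempt` until it succeeds or the retry budget is spent.
// Worst-case blocking time is about retries * intervalMs (200 * 50 ms = 10 s above).
function retrySync<T>(attempt: () => T, retries: number, intervalMs: number): T {
    let lastError: unknown;
    for (let i = 0; i <= retries; i++) {
        try {
            return attempt();
        } catch (err) {
            lastError = err;
            if (i < retries) sleepSync(intervalMs);
        }
    }
    throw lastError;
}
```

Every `sleepSync` call holds the event loop, so no LSP request is answered until the loop exits. An async initialization path would replace the sleep with an awaited timer at the cost of making store readiness asynchronous.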

satyakigh added 2 commits April 23, 2026 15:21

Improve EncryptedFileStore for crash safety and cross-process concurrency

717d67c

add feature flag

413e4a8

satyakigh requested a review from a team as a code owner April 23, 2026 20:36

kddejong reviewed Apr 24, 2026

Comment thread tools/telemetry-generator.ts

kddejong reviewed Apr 24, 2026

satyakigh closed this Apr 26, 2026

satyakigh deleted the filedb-improv branch May 1, 2026 20:27

satyakigh wants to merge 2 commits into main from filedb-improv
