feat: cold-start optimization (measurement + lazy imports + fork init) by ogabrielluiz · Pull Request #12783 · langflow-ai/langflow

ogabrielluiz · 2026-04-20T14:38:08Z

Summary

End-to-end cold-start optimization for Langflow. Consolidates work originally split across seven stacked PRs (#12783, #12784, #12785, #12786, #12788, #12789, #12798) into a single landing surface for QA and review.

Nothing has reached release-1.10.0 yet. The dependent PRs were auto-marked as merged by GitHub when their head branches became reachable from this branch during the consolidation merges; their original review threads are preserved on each closed PR.

What's in this PR

1. Measurement harness (originally #12783)

Reproducible cold-start benchmarking for lfx run and langflow run scenarios. Driver, mock LLM, measurement Dockerfile (python:3.13-slim + uv + hyperfine), threshold-based regression gate in CI. Zero-cost stdlib checkpoint instrumentation in lfx._bench, gated on LFX_BENCHMARK_CHECKPOINTS. Make targets: bench-local, bench-docker, bench-snapshot, bench-verify-synthetic. New benchmarks dependency group.
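A minimal sketch of env-gated checkpoint instrumentation, to illustrate why it can be near zero-cost when disabled. Only the LFX_BENCHMARK_CHECKPOINTS variable and the idea of a dump() call come from this PR; `make_checkpointer`, the record shape, and the JSON output are assumptions and not the actual lfx._bench API.

```python
import json
import os
import time


def make_checkpointer(env=os.environ):
    """Return (checkpoint, dump). Both are no-ops unless the env var is set.

    Hypothetical helper: the gate is evaluated once, so the disabled path
    costs one function call that does nothing.
    """
    if not env.get("LFX_BENCHMARK_CHECKPOINTS"):
        def noop_checkpoint(name):
            return None

        def noop_dump():
            return "[]"

        return noop_checkpoint, noop_dump

    records = []

    def checkpoint(name):
        # Monotonic clock, stdlib only, no allocation beyond the tuple.
        records.append((name, time.perf_counter_ns()))

    def dump():
        return json.dumps([{"name": n, "t_ns": t} for n, t in records])

    return checkpoint, dump
```

A caller would place `checkpoint("after-imports")` and similar calls at the landmarks of interest and emit `dump()` once at exit.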

2. Component index correctness + build caching (originally #12784)

Lazy asyncio.Lock on ComponentCache so the import-time singleton does not crash on Python 3.13+. Bounded concurrency on module scans so asyncio's default thread pool is not exhausted under high component counts. Atomic disk-cache write via tempfile + os.replace, version-stamped with the lfx package version so lfx-only deployments do not invalidate on every restart. Read-time stale-cache peek runs outside the lock to avoid widening the lock-hold window during multi-MB disk reads.
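The atomic, version-stamped write can be sketched as follows. This is an illustration of the tempfile + os.replace pattern under assumed names (`write_index_atomically`, `read_index_if_current`, and the `{"version", "entries"}` payload are hypothetical), not the code in components.py.

```python
import json
import os
import tempfile


def write_index_atomically(path, entries, version):
    """Write the cache so readers see either the old file or the new one."""
    payload = {"version": version, "entries": entries}
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for os.replace to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a half-written temp file behind.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def read_index_if_current(path, version):
    """Return cached entries, or None when missing, corrupt, or stale."""
    try:
        with open(path, encoding="utf-8") as fh:
            payload = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    if not isinstance(payload, dict) or payload.get("version") != version:
        return None
    return payload.get("entries")
```

Stamping with the package version means a restart with the same version is a cache hit, while an upgrade invalidates the file on the next read.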

3. Deferred imports off the graph hot path (originally #12785)

Field-typing layer split into static metadata (lfx/field_typing/names.py) and class-object resolution (constants.py) via PEP 562 __getattr__. Stub fallbacks for langchain transitive failures (Windows c10.dll and similar) with one-time warnings and fork-aware cache cleanup. DEFAULT_IMPORT_STRING preamble removed from validate.create_class; user components must include their own imports, with an actionable NameError hint mapping each previously-auto-injected name to the right from X import Y line. PEP 563 lazy annotations preserved through prepare_global_scope, so TYPE_CHECKING-only langchain symbols resolve correctly at tool-mode introspection time.
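A minimal sketch of the PEP 562 mechanism, for readers unfamiliar with module-level `__getattr__`. It builds a throwaway module object so the example is self-contained; in a real package the function would simply be defined at the top level of the module. The helper name and the name-to-module mapping are assumptions, and stdlib classes stand in for the langchain types.

```python
import importlib
import types


def build_lazy_module(name, lazy_names):
    """Create a module whose listed attributes are imported on first access.

    lazy_names maps attribute name -> module that defines it (hypothetical).
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        try:
            target = lazy_names[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target), attr)
        setattr(module, attr, value)  # cache so later lookups bypass this hook
        return value

    module.__getattr__ = __getattr__
    return module


constants = build_lazy_module("demo_constants", {"Fraction": "fractions"})
half = constants.Fraction(1, 2)  # the import of `fractions` happens here
```

The static-metadata side of the split would hold only the names, so code that needs to list them never triggers the import.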

4. Lazy `validate.py` exec_globals (originally #12786)

_LazyImportProxy stand-ins for langchain modules in user component code, deferred to first real use. Eager imports preserved for non-langchain modules (Pydantic strict-validators, lfx constants). Star imports always resolve eagerly. Windows _MissingModulePlaceholder for unavailable C-extensions (e.g. jq).
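The deferred-import idea can be illustrated with a small proxy placed in the globals passed to exec. This is a sketch under assumptions: `LazyModuleProxy` is a hypothetical stand-in, not the PR's _LazyImportProxy, and a stdlib module is used in place of langchain.

```python
import importlib


class LazyModuleProxy:
    """Stand-in for a module; the real import runs on first attribute access."""

    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None

    def __getattr__(self, attr):
        # __getattr__ is only invoked for names not found on the proxy itself,
        # so _module_name and _module never reach this method.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, attr)


# Component code that never touches `colorsys` never pays for importing it.
exec_globals = {"colorsys": LazyModuleProxy("colorsys")}
exec("hls = colorsys.rgb_to_hls(1.0, 0.0, 0.0)", exec_globals)
```

A proxy like this only defers attribute access, which is one reason names needed as concrete values during class-body evaluation are better resolved eagerly.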

5. Service init restructuring + fork-safe container preload (originally #12788)

PreloadStep enum and is_step_complete gates so Gunicorn workers skip work the master already completed (profile pictures, bundles, types cache, starter projects, agentic globals/MCP). Master preload runs in preload._run_master_preload, with two parallel waves via asyncio.gather:

Wave 1: copy_profile_pictures || load_bundles_with_error_handling (different filesystem subtrees, no shared state)
Wave 2 (agentic_experience only): initialize_agentic_global_variables || auto_configure_agentic_mcp_server (independent per the prerequisite DAG)

Types cache stays sequential after wave 1 because it scans settings.components_path, which the bundle step extends. Master disposes the DB engine and external cache socket before fork to prevent worker connection-pool inheritance. gc.freeze() after preload moves preloaded objects into the permanent generation so the cyclic GC does not unshare COW pages in workers.
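The wave structure and the pre-fork freeze can be sketched roughly as below. The step keys, the `run_master_preload` signature, and the placement of wave 2 after the types cache are assumptions made for illustration; only the wave 1 pairing, the sequential types cache, and the use of asyncio.gather and gc.freeze() come from the description above.

```python
import asyncio
import gc


async def run_master_preload(steps, agentic=False):
    """Run preload steps in waves; `steps` maps a name to an async callable."""
    order = []

    async def run(name):
        await steps[name]()
        order.append(name)

    # Wave 1: independent filesystem work, safe to overlap.
    await asyncio.gather(run("copy_profile_pictures"), run("load_bundles"))
    # Sequential: depends on the components path that the bundle step extends.
    await run("types_cache")
    if agentic:
        # Wave 2: independent of each other (ordering vs. types cache assumed).
        await asyncio.gather(run("agentic_globals"), run("agentic_mcp"))
    return order


def freeze_before_fork():
    """Move all currently tracked objects to the permanent generation."""
    gc.freeze()
    return gc.get_freeze_count()
```

Frozen objects are ignored by later collections, so a collection in a worker does not write to their GC headers and the pages stay shared with the master.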

6. Release notes (originally #12789, doc page dropped)

The standalone Deployment/deployment-cold-start.mdx page was dropped because the cold-start work is internal-track for now (master/worker COW, gc.freeze, asyncio.gather are architecture, not operator-facing config). Release notes still carry the user-visible bullet and the before/after numbers table.

7. TCP-probe readiness fix (originally #12798)

langflow_run_http_ready benchmark scenario uses a TCP probe instead of an HTTP probe, removing false negatives when the HTTP server is bound but not yet serving.

How to verify

make bench-local runs against the dev venv (fast iteration)
make bench-docker builds the measurement image and runs all scenarios end-to-end
make bench-verify-synthetic injects a synthetic regression and proves the CI gate trips
Apply the run-benchmark-snapshot label to capture authoritative numbers against release-1.10.0
Apply the run-benchmarks label to verify against thresholds

For prefork validation, run with LANGFLOW_GUNICORN_PRELOAD=true and watch logs for [preload] lines; workers should log Skipping ...: inherited from master for completed preload steps.
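The worker-side skip behind those log lines can be pictured with a small sketch. PreloadStep and is_step_complete are named in this PR; the enum members, `mark_step_complete`, `run_step`, and the module-level set are assumptions for illustration. In a prefork model the set populated in the master is inherited by each worker at fork time.

```python
import enum


class PreloadStep(enum.Enum):
    # Hypothetical members; the real enum may differ.
    PROFILE_PICTURES = "profile_pictures"
    BUNDLES = "bundles"
    TYPES_CACHE = "types_cache"


_completed_steps = set()


def mark_step_complete(step):
    _completed_steps.add(step)


def is_step_complete(step):
    return step in _completed_steps


def run_step(step, action):
    """Run `action` unless the step was already completed; report if it ran."""
    if is_step_complete(step):
        return False  # a worker would log the skip here
    action()
    mark_step_complete(step)
    return True
```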

Adds a reproducible cold-start benchmarking harness covering lfx run bare boot, lfx run <flow> first-execution, and langflow run end-to-end restart scenarios.

- src/backend/tests/benchmarks/: driver, scenarios, fixtures, mock LLM, measurement Dockerfile, thresholds.json sentinel baseline, conftest.
- src/lfx/src/lfx/_bench.py: stdlib-only checkpoint instrumentation gated on LFX_BENCHMARK_CHECKPOINTS.
- src/lfx/src/lfx/cli/run.py: integrates checkpoint hooks at after-imports, before-run-flow, after-run-flow landmarks; dump() on completion.
- .github/workflows/cold-start-benchmark.yml: label-gated CI regression gate with per-scenario matrix jobs. run-benchmarks (verify) and run-benchmark-snapshot (capture) label triggers.
- Makefile: bench-local / bench-docker / bench-snapshot / bench-verify-synthetic.
- pyproject.toml + uv.lock: benchmarks dep group.

coderabbitai · 2026-04-20T14:38:26Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0ad8b08c-b2ae-4c31-a689-83193305a836


Lazy asyncio.Lock on ComponentCache (safe under module-import-time constraints), bounded concurrency on dynamic component loading, non-blocking async read of the persisted index, cache-hit short-circuit that skips the full package walk when the installed lfx version matches, atomic + version-stamped writes, stale-index warning.

- src/lfx/src/lfx/interface/components.py: all correctness + caching changes
- src/lfx/tests/unit/test_component_index.py: parity + correctness tests

Observable flow behavior is unchanged: every change is accompanied by a deep parity snapshot test that loads a flow via the component cache and asserts byte-identical final output + vertex execution order.
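A minimal sketch of the first two items, lazy lock creation and bounded concurrency. The class and function names and the default limit of 4 are assumptions, not the code in components.py.

```python
import asyncio


class LazyLockCache:
    """Singleton-style cache whose lock is created on first use, not at import."""

    def __init__(self):
        self._lock = None
        self.value = None

    @property
    def lock(self):
        if self._lock is None:
            self._lock = asyncio.Lock()
        return self._lock


async def scan_modules(names, load, max_concurrency=4):
    """Run the blocking `load` per name in threads, at most N at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def scan_one(name):
        async with semaphore:
            return await asyncio.to_thread(load, name)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(scan_one(n) for n in names))
```

The semaphore caps how many scans are in flight, so a large component count queues on the semaphore instead of saturating the default executor.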

Renames test classes to descriptive names and strips internal roadmap IDs (IMP-XX, IDX-XX, D-XX, plan 0N-NN) from comments and docstrings. No behavioral changes.

Class renames:
- TestIMP02NoPandas -> TestLfxImportsWithoutPandas
- TestIMP07FieldTyping -> TestFieldTypingDefersLangchainCore

Rewrites prepare_global_scope to build a LazyImportProxy-backed globals mapping instead of eagerly calling importlib.import_module() for every langchain_* name in the prepended DEFAULT_IMPORT_STRING. Components that do not reference a given langchain symbol no longer trigger its import, cutting transformers/torch off the component-instantiation path.

Narrowed scope: only langchain / langchain_core / langchain_classic / langchain_text_splitters / langchain_community prefixes are deferred. Everything else (stdlib, pydantic, lfx) resolves eagerly so class-body validators get concrete values.

Renames test files / directory / functions to descriptive names and strips internal roadmap IDs (SVC-XX, CNT-XX, IDX-06, D-XX, phase_04).

- Directory: phase_04_service_init_parity/ -> service_init_parity/
- Files:
  - test_svc01_starter_hash_cache.py -> test_starter_project_hash_gate.py
  - test_svc02_dependency_review.py -> test_lifespan_dependency_review.py
  - test_svc02_gather_structure.py -> test_lifespan_gather_structure.py
  - test_svc03_mcp_event_readiness.py -> test_mcp_event_readiness.py
  - test_svc04_restart_integration.py -> test_langflow_restart_parity.py
- Functions: test_svc04_* -> descriptive names

No behavioral changes.

Adds docs/docs/Deployment/deployment-cold-start.mdx documenting UV_COMPILE_BYTECODE, multi-stage layer separation, pre-warmed venv patterns, pre-bake-deps recipes, and LANGFLOW_GUNICORN_PRELOAD guidance. Cross-links from the Docker deployment guide and the production best practices guide. Registers the new page in the sidebar. Adds a release-notes.mdx bullet summarizing the cold-start performance improvements across lfx run and langflow run, with headline scenario numbers.

Removes internal roadmap IDs (IDX-01, MEAS-03, Phase 2, Phase 5) from the cold-start deployment guide. No behavioral changes.

Swap the `Application startup complete.` stdout marker for a TCP connect probe against 127.0.0.1:7860 so the scenario no longer races against langflow's structlog processor pipeline. The same approach is used by _langflow_no_change_restart_supervisor.py, whose header note already called out this class of failure. The scenario's thresholds.json entry is still the `mean_ms: 0` sentinel, so continue-on-error stays set for this matrix cell until a run-benchmark-snapshot captures a real baseline. Workflow comments and the generated thresholds.json `_note` are updated to reflect the fix.

Addresses two low-severity review notes on #12798.

- Pre-flight check before launching the child: if 127.0.0.1:7860 already accepts connections (dev server, leftover benchmark boot), fail fast with exit code 3 rather than race against the stale listener and emit a bogus near-zero LANGFLOW_READY_MS.
- Post-ready child-liveness check: after a successful TCP connect, verify the child is still running. If it already exited, the connect landed on someone else's listener and we refuse to record a measurement.

Also drops the redundant exit_code accumulator: failure paths now return directly inside the try/finally, and the finally block still runs.
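A minimal sketch of a TCP connect probe of the kind described here, usable both for the pre-flight check and for the readiness wait. The function names, timeouts, and polling interval are assumptions; the child-liveness check is omitted for brevity.

```python
import socket
import time


def port_accepts_connections(host, port, timeout=0.25):
    """Return True if a TCP connect to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def wait_until_listening(host, port, deadline_s, poll_s=0.05):
    """Poll until the port accepts a connection.

    Returns elapsed milliseconds, or None if the deadline passes first.
    """
    start = time.perf_counter()
    while time.perf_counter() - start < deadline_s:
        if port_accepts_connections(host, port):
            return (time.perf_counter() - start) * 1000.0
        time.sleep(poll_s)
    return None
```

A driver following the notes above would call `port_accepts_connections("127.0.0.1", 7860)` before launching the child and exit early if it returns True, since any later measurement could be answered by the stale listener.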

…sions

Captured a real baseline for langflow_run_http_ready via snapshot-mode run 24784108652 on 12798/merge@93c8aa10: 22183.74ms mean, 246ms stddev over 5 runs. Updated thresholds.json, including a refreshed snapshot for the other scenarios on the same run.

With the sentinel gone, dropped langflow_run_http_ready from every continue-on-error expression and from the regression-comment-skip condition, so the gate now enforces a real regression ceiling on that scenario.

langflow_run_no_change_restart is retained at its 2026-04-20 baseline (11324.7ms) rather than the 0.85ms the current run produced; the low number is a known self_measuring dispatch bug, not a real measurement. Workflow comments updated to reflect this.

When a thresholds.json entry has runs=0 AND mean_ms<=0, treat it as a placeholder for a scenario that has never been snapshotted: record the current measurement for visibility but do not trip the gate. The next run-benchmark-snapshot anchors the real baseline.

Previously, any baseline mean_ms<=0 tripped the gate unconditionally. That meant a new scenario landing as a sentinel (mean_ms=0, runs=0, the convention for "tracked but not yet anchored") would fail the workflow on its very first run, forcing the contributor to either:

- snapshot on the PR branch (discouraged, because authoritative baselines should come from main), or
- add continue-on-error: true in the workflow matrix as a hack, then remember to remove it in a followup PR after the baseline lands.

Distinguishing runs=0 (unanchored) from runs>0 with mean_ms=0 (intentionally-zeroed Path-B sentinel) preserves the existing sentinel-trip semantic for the latter case.
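The decision rule can be sketched as a small function. The `runs` and `mean_ms` fields come from the message above; the function name, the returned labels, and the 10% tolerance are assumptions for illustration and do not reflect the real gate's threshold.

```python
def evaluate_gate(baseline, measured_ms, tolerance=0.10):
    """Classify one scenario against its thresholds.json entry.

    Returns "unanchored", "fail", or "pass".
    """
    runs = baseline.get("runs", 0)
    mean_ms = baseline.get("mean_ms", 0)
    if mean_ms <= 0:
        if runs == 0:
            # Never snapshotted: record the measurement, do not trip the gate.
            return "unanchored"
        # runs > 0 with a zeroed mean is an intentional sentinel: still trips.
        return "fail"
    ceiling = mean_ms * (1 + tolerance)  # tolerance is a hypothetical value
    return "pass" if measured_ms <= ceiling else "fail"
```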

The unreleased deployment-cold-start page lives only in docs/docs/, not in the versioned snapshots. Slug-based links like /deployment-cold-start resolve against the latest released version (1.9.0), where the page does not exist, and break the Docusaurus build. Relative .mdx links resolve within the same docs version.

Cold-start work is internal-track for now. The implementation details (master/worker COW, gc.freeze, gunicorn preload, asyncio.gather waves) read as architecture rather than user-facing configuration. The user-facing win (Langflow starts faster) happens automatically; if and when prefork mode becomes a recommended path, the env-var reference can carry a one-line note.

- Remove docs/docs/Deployment/deployment-cold-start.mdx
- Remove sidebar entry
- Remove the Cold-start optimization section from deployment-docker.mdx
- Remove the Cold-start optimization bullet from deployment-prod-best-practices.mdx
- Trim the deployment-tuning paragraph and table footnote in release-notes.mdx so the cold-start row no longer references the dropped page

codecov · 2026-05-06T17:44:16Z

Codecov Report

❌ Patch coverage is 70.91837% with 171 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.31%. Comparing base (78f82ca) to head (034eccb).
⚠️ Report is 5 commits behind head on release-1.10.0.

Files with missing lines	Patch %	Lines
...ase/langflow/initial_setup/starter_project_hash.py	0.00%	60 Missing ⚠️
src/lfx/src/lfx/custom/validate.py	70.94%	47 Missing and 5 partials ⚠️
src/lfx/src/lfx/_bench.py	60.00%	15 Missing and 1 partial ⚠️
src/lfx/src/lfx/interface/components.py	85.56%	10 Missing and 4 partials ⚠️
src/lfx/src/lfx/field_typing/constants.py	93.10%	3 Missing and 3 partials ⚠️
...c/lfx/src/lfx/custom/custom_component/component.py	60.00%	4 Missing ⚠️
src/lfx/src/lfx/utils/type_hints.py	77.77%	4 Missing ⚠️
src/lfx/src/lfx/base/data/base_file.py	25.00%	3 Missing ⚠️
src/backend/base/langflow/server.py	71.42%	2 Missing ⚠️
src/lfx/src/lfx/cli/run.py	77.77%	1 Missing and 1 partial ⚠️
... and 6 more

❌ Your project check has failed because the head coverage (51.03%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@                Coverage Diff                 @@
##           release-1.10.0   #12783      +/-   ##
==================================================
- Coverage           54.36%   53.31%   -1.05%     
==================================================
  Files                2091     2095       +4     
  Lines              191692   192072     +380     
  Branches            27455    27506      +51     
==================================================
- Hits               104204   102406    -1798     
- Misses              86333    88500    +2167     
- Partials             1155     1166      +11

Flag	Coverage Δ
backend	`50.26% <22.22%> (-7.37%)`	⬇️
frontend	`54.54% <ø> (+0.07%)`	⬆️
lfx	`51.03% <78.69%> (+0.43%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
...rc/backend/base/langflow/field_typing/constants.py	`0.00% <ø> (ø)`
src/backend/base/langflow/initial_setup/setup.py	`41.67% <ø> (-12.23%)`	⬇️
src/backend/base/langflow/preload.py	`88.23% <100.00%> (-10.32%)`	⬇️
src/lfx/src/lfx/base/prompts/utils.py	`53.57% <100.00%> (+3.57%)`	⬆️
src/lfx/src/lfx/base/tools/component_tool.py	`48.90% <100.00%> (+1.38%)`	⬆️
src/lfx/src/lfx/field_typing/names.py	`100.00% <100.00%> (ø)`
src/lfx/src/lfx/graph/graph/utils.py	`72.74% <100.00%> (ø)`
src/lfx/src/lfx/graph/state/model.py	`92.30% <100.00%> (+0.12%)`	⬆️
src/lfx/src/lfx/graph/vertex/param_handler.py	`83.09% <100.00%> (ø)`
src/lfx/src/lfx/graph/vertex/vertex_types.py	`43.58% <100.00%> (ø)`
... and 26 more

... and 182 files with indirect coverage changes


github-actions · 2026-05-06T17:45:05Z

Build successful! ✅
Deploying docs draft.
Deploy successful! View draft

github-actions · 2026-05-06T17:49:46Z

Frontend Unit Test Coverage Report

Coverage Summary

Lines	Statements	Branches	Functions
	37.46% (44822/119652)	67.62% (6197/9164)	37.13% (1029/2771)

Unit Test Results

Tests	Skipped	Failures	Errors	Time
4253	0 💤	0 ❌	0 🔥	8m 29s ⏱️

The test asserted that initialize_auto_login_default_superuser() is called exactly once in main.py. That call was intentionally removed — superuser initialization is handled inside initialize_services() via setup_superuser(), which includes file-lock protection for multi-worker environments. The standalone call in main.py was always redundant and its removal didn't break AUTO_LOGIN behavior.

This reverts commit 629c8ab.

This reverts commit c6adefd.

github-actions Bot added the enhancement New feature or request label Apr 20, 2026

ogabrielluiz and others added 2 commits April 20, 2026 11:40

[autofix.ci] apply automated fixes

620fbfe

ogabrielluiz mentioned this pull request Apr 20, 2026

feat(components): component index correctness + build caching #12784

Merged

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Apr 20, 2026

ogabrielluiz and others added 4 commits April 20, 2026 11:43

feat(imports): defer heavy imports off the Graph hot path

f52b6c5

[autofix.ci] apply automated fixes (attempt 2/3)

68826bc

[autofix.ci] apply automated fixes

e18a461

ogabrielluiz mentioned this pull request Apr 20, 2026

feat(imports): defer heavy imports off the Graph hot path #12785

Merged

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Apr 20, 2026

ogabrielluiz mentioned this pull request Apr 20, 2026

fix(custom): lazy exec_globals in validate.prepare_global_scope #12786

Merged

ogabrielluiz and others added 3 commits April 20, 2026 11:46

feat(services): service init restructuring + container fork safety

7cba56b

[autofix.ci] apply automated fixes

c21cdb5

This was referenced Apr 20, 2026

feat(services): service init restructuring + container fork safety #12788

Merged

docs(deployment): cold-start optimization guide + release notes #12789

Merged

ogabrielluiz and others added 8 commits April 20, 2026 13:53

refactor(services): finish rename + strip remaining internal vocab

03c73a5

refactor(docs): strip internal planning vocabulary

43feb33

Removes internal roadmap IDs (IDX-01, MEAS-03, Phase 2, Phase 5) from the cold-start deployment guide. No behavioral changes.

[autofix.ci] apply automated fixes

7ad386c

ogabrielluiz requested review from RamGopalSrikar and jordanrfrazier May 6, 2026 17:17

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels May 6, 2026

ogabrielluiz marked this pull request as ready for review May 6, 2026 17:33

ogabrielluiz added the run-benchmark-snapshot Triggers the cold-start-benchmark workflow in snapshot mode (captures authoritative baseline) label May 6, 2026

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels May 6, 2026


github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels May 6, 2026

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels May 8, 2026

jordanrfrazier force-pushed the cold-start/01-measurement-foundation branch from 742aa3f to 629c8ab Compare May 8, 2026 18:25

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels May 8, 2026

Revert "test: delete stale test_main_superuser_init"

c6adefd

This reverts commit 629c8ab.

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels May 8, 2026

Reapply "test: delete stale test_main_superuser_init"

11c4f2c

This reverts commit c6adefd.

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels May 8, 2026

ogabrielluiz wants to merge 61 commits into release-1.10.0 from cold-start/01-measurement-foundation


4 participants
