Skip to content

feat: check_models external readiness check#712

Open
mikeknep wants to merge 8 commits into
mainfrom
dd-model-health/mknepper
Open

feat: check_models external readiness check#712
mikeknep wants to merge 8 commits into
mainfrom
dd-model-health/mknepper

Conversation

@mikeknep

Copy link
Copy Markdown
Contributor

📋 Summary

Introduces a new check_models CLI command and DataDesigner interface method for checking external readiness of models and tools without triggering a full workload (preview or create). This is the "external deps" analogue to the existing validate functionality (internal coherence).

🔗 Related Issue

N/A, direct to PR

🔄 Changes

  • Lifts the checks run by DatasetBuilder to a standalone readiness.py engine module. Used by both the builder and the end user-facing interfaces
  • Creates a flags.py engine module where the async engine flag is centralized. This cleans up some duplication of the env var magic string / constant that was floating around in a few places.

🧪 Testing

  • make test passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

✅ Checklist

  • Follows commit message conventions
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@mikeknep mikeknep requested a review from a team as a code owner May 29, 2026 15:02
@greptile-apps

greptile-apps Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR extracts the model and MCP tool health-check logic from DatasetBuilder into a shared readiness.py engine module, centralises the async-engine feature flag into flags.py, and uses these foundations to expose a new check_models method on DataDesigner and a matching check-models CLI command.

  • New engine/readiness.py: run_readiness_check is now the single code path for model and MCP tool probes, called by both DatasetBuilder.build/build_preview and the new DataDesigner.check_models, eliminating any risk of the standalone method drifting from the workload startup gate.
  • New engine/flags.py: DATA_DESIGNER_ASYNC_ENGINE is centralised, removing three separate inline os.environ.get calls and making monkeypatching in tests target a single canonical location.
  • Ordering change in DatasetBuilder.build / build_preview: _use_async (and therefore client_concurrency_mode) is now resolved before the readiness check is invoked, so configs that fall back to the sync engine (e.g. those using allow_resize) run the model health check in sync mode — matching the actual build path rather than blindly using the raw env-var flag.

Confidence Score: 5/5

Safe to merge; the change is a well-scoped refactor with no observable regression on the workload path.

The extraction of health-check logic into readiness.py is functionally equivalent to the removed DatasetBuilder methods, and the test suite covers model probe, MCP probe, async dispatch, timeout/cancel, and every column-type variant. The one deliberate behavioral shift — resolving _use_async before the readiness check rather than after the resume-compat gate — is actually more correct, and existing tests exercise it.

No files require special attention.

Important Files Changed

Filename Overview
packages/data-designer-engine/src/data_designer/engine/readiness.py New module; correctly lifts model and MCP tool health-check logic out of DatasetBuilder, parameterized on ClientConcurrencyMode. Logic is functionally equivalent to the removed methods.
packages/data-designer-engine/src/data_designer/engine/flags.py New module centralising DATA_DESIGNER_ASYNC_ENGINE flag; clean single source of truth for tests to monkeypatch.
packages/data-designer-engine/src/data_designer/engine/dataset_builders/dataset_builder.py Removes _run_model_health_check_if_needed and _run_mcp_tool_check_if_needed; replaces both call sites with run_readiness_check. _use_async is now resolved before the health check, making client_concurrency_mode consistent with actual execution mode.
packages/data-designer/src/data_designer/interface/data_designer.py Adds check_models method; delegates to run_readiness_check with the same resolved client_concurrency_mode used by the workload path.
packages/data-designer/src/data_designer/cli/controllers/generation_controller.py Adds run_check_models; mirrors run_validate pattern with typed-error class name surfaced in the error message for easier user diagnosis.
packages/data-designer/src/data_designer/cli/commands/check_models.py New CLI command; thin delegation to GenerationController.run_check_models, mirrors validate.py structure.
scripts/health_checks.py Simplified to use DataDesigner.check_models; loses the 2-attempt retry for chat completion and explicit embedding-vector size validation present in the old direct ModelFacade approach.
packages/data-designer-engine/tests/engine/test_readiness.py Comprehensive new test file covering model probe, MCP probe, ordering, async dispatch (including timeout/cancel), and all column-type coverage paths.

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as check-models CLI
    participant Ctrl as GenerationController
    participant DD as DataDesigner
    participant RP as ResourceProvider
    participant R as readiness.py
    participant MR as ModelRegistry
    participant MCP as MCPRegistry

    User->>CLI: dd check-models config.yaml
    CLI->>Ctrl: run_check_models(config_source)
    Ctrl->>Ctrl: _load_config(config_source)
    Ctrl->>DD: DataDesigner()
    Ctrl->>DD: check_models(config_builder)
    DD->>RP: _create_resource_provider("check-models", config_builder)
    DD->>DD: _resolve_client_concurrency_mode(config_builder)
    DD->>R: run_readiness_check(columns, resource_provider, client_concurrency_mode)
    R->>MR: run_health_check / arun_health_check(model_aliases)
    MR-->>R: OK / raises ModelError
    R->>MCP: run_health_check(tool_aliases)
    MCP-->>R: OK / raises RuntimeError
    R-->>DD: None (success)
    DD-->>Ctrl: None
    Ctrl->>User: All models and tools responded successfully

    Note over DD,R: Same run_readiness_check path used by DatasetBuilder.build() / build_preview()
Loading

Reviews (4): Last reviewed commit: "align readiness with client mode" | Re-trigger Greptile

if not model_aliases:
return

if flags.DATA_DESIGNER_ASYNC_ENGINE:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small edge case: check_models() can build sync-mode clients when the config forces sync fallback, like allow_resize=True, but readiness still branches on the raw async env flag and calls arun_health_check(). I know allow_resize / sync are probably on the way out, so maybe this is just a known limitation, but it might be worth either matching the resolved client mode here or adding a regression/explicit guard so users do not hit an async/sync client internals error.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I overlooked this, thanks. Addressed in c9c118f. This should be fairly straightforward to unwind if we fully drop the sync engine.

fake_loop = Mock()

with (
patch("data_designer.engine.readiness.ensure_async_engine_loop", return_value=fake_loop, create=True),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: this patch target does not look like it is used. _run_model_health_check() imports ensure_async_engine_loop inside the function from data_designer.engine.dataset_builders.utils.async_concurrency, so patching data_designer.engine.readiness.ensure_async_engine_loop will not intercept it. Could patch the source path instead and drop create=True; same thing applies to the timeout test below.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thanks. Fixed in 8f52dbd

@andreatgretel

Copy link
Copy Markdown
Contributor

I think this new readiness path could eventually replace the implementation in scripts/health_checks.py, but probably not the script itself. The script is catalog-scoped, checking the predefined provider/model defaults when env vars are present; check_models is config-scoped. A nice follow-up might be to keep the script as the maintainer entrypoint, but have it build small temporary configs from PREDEFINED_PROVIDERS_MODEL_MAP and call the shared readiness path so the actual probe logic does not drift.

mikeknep added 8 commits June 10, 2026 09:45
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
@mikeknep mikeknep force-pushed the dd-model-health/mknepper branch from c4715f9 to c9c118f Compare June 10, 2026 16:26
@mikeknep

Copy link
Copy Markdown
Contributor Author

@andreatgretel I updated the health_checks script to use this new method in b3a9d88, LMK what you think!

@mikeknep mikeknep requested a review from andreatgretel June 10, 2026 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants