Skip to content

fix(BA-5983): accept partial ModelDefinition input in deployment API#11531

Merged
jopemachine merged 18 commits into
mainfrom
fix/BA-5983-modelconfig-fields-optional
May 11, 2026
Merged

fix(BA-5983): accept partial ModelDefinition input in deployment API#11531
jopemachine merged 18 commits into
mainfrom
fix/BA-5983-modelconfig-fields-optional

Conversation

@jopemachine
Copy link
Copy Markdown
Member

@jopemachine jopemachine commented May 8, 2026

Resolves BA-5983

Summary

  • addModelRevision rejected requests that omitted ModelConfigInput.name (and similarly model_path, ModelServiceConfigInput.port, ModelHealthCheckInput.path) with ModelConfig.name is required, even though those fields are routinely supplied by other layers in the revision merge chain — the runtime variant's default_model_definition, the model vfolder's model-definition.yaml, a revision preset, or the model_mount_destination default.
  • Re-bind ModelConfigInputGQL / ModelDefinitionInputGQL / ModelServiceConfigInputGQL / ModelHealthCheckInputGQL to the *Draft DTO variants from common/config so every field the merge chain can supply is nullable at the GQL boundary.
  • Required-field enforcement is unchanged: it still happens inside {ModelConfig,ModelHealthCheck,ModelServiceConfig}Draft.to_resolved() after the full merge in DeploymentController.add_revision().

Schema impact (input-only, non-breaking):

  • ModelConfigInput.{name, modelPath}: non-null → nullable
  • ModelServiceConfigInput.{port, shell, preStartActions}: non-null → nullable
  • ModelHealthCheckInput.{path, interval, maxRetries, maxWaitTime, expectedStatusCode, initialDelay}: non-null → nullable
  • ModelDefinitionInput.models: non-null → nullable
  • All output (ModelConfig, ModelDefinition, etc.) types untouched.

The supergraph / schema.graphql / v2-schema.graphql regeneration will be pushed by the update-api-schema workflow on this PR.

Test plan

  • addModelRevision succeeds when modelDefinition.models[].name / modelPath are omitted and the runtime variant or vfolder model-definition.yaml supplies them.
  • addModelRevision succeeds when modelDefinition is omitted entirely (variant baseline alone is sufficient).
  • Existing requests that DO supply name / modelPath continue to work unchanged.
  • addModelRevision still fails with a clear error when no layer supplies a required field (validation at to_resolved() is preserved).
  • update-api-schema workflow regenerates schema.graphql / v2-schema.graphql / supergraph.graphql cleanly.

🤖 Generated with Claude Code


📚 Documentation preview 📚: https://sorna--11531.org.readthedocs.build/en/11531/


📚 Documentation preview 📚: https://sorna-ko--11531.org.readthedocs.build/ko/11531/

The addModelRevision mutation rejected requests that omitted
ModelConfigInput.name (and similarly model_path, ModelServiceConfigInput.port,
ModelHealthCheckInput.path), even though those fields are routinely supplied
by other layers in the revision merge chain — the runtime variant's
default_model_definition, the model vfolder's model-definition.yaml, a
revision preset, or the model_mount_destination default.

Bind the input types (ModelConfigInputGQL, ModelDefinitionInputGQL,
ModelServiceConfigInputGQL, ModelHealthCheckInputGQL) to the *Draft DTO
variants from common/config so every field that the merge chain can
supply is nullable at the GQL boundary. Required-field validation stays
where it belongs — in {ModelConfig,ModelHealthCheck,ModelServiceConfig}Draft
.to_resolved() after the full merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the size:M 30~100 LoC label May 8, 2026
@jopemachine jopemachine added this to the 26.4 milestone May 8, 2026
@github-actions github-actions Bot added the comp:manager Related to Manager component label May 8, 2026
Co-authored-by: octodog <mu001@lablup.com>
@github-actions github-actions Bot added size:L 100~500 LoC area:docs Documentations and removed size:M 30~100 LoC labels May 8, 2026
Re-route the GraphQL/REST v2 input boundary through dedicated
``Model{HealthCheck,Metadata,ServiceConfig,Config,Definition}Input``
DTOs in ``common/dto/manager/v2/deployment/request.py`` instead of
binding directly to the merge-chain ``*Draft`` domain models from
``common/config``. The ``*Draft`` types remain the internal merge-chain
representation; the boundary types are owned by the v2 DTO package.

- New v2 Input DTOs mirror the structure of the corresponding
  ``*Draft`` types (every field optional) so the request layer stays
  permissive and lower-priority sources (runtime variant default,
  revision preset, vfolder ``model-definition.yaml``) can supply
  whatever the request omits.
- ``CreateRevisionInputDTO``/``AddRevisionGQLInputDTO``/v2
  ``RevisionInput`` now type ``model_definition`` as
  ``ModelDefinitionInput`` instead of leaking
  ``ModelDefinitionDraft`` from ``common/config`` into the v2 DTO
  package.
- Add ``to_model_definition_draft`` converter alongside the DTOs and
  call it at the GQL adapter boundary (``manager/api/adapters/
  deployment/adapter.py``) before constructing
  ``ModelRevisionCreator``; the legacy REST path is unchanged
  (still uses the deprecated ``common/dto/manager/deployment``
  ``RevisionInput``).
- Re-bind ``Model*InputGQL`` types in
  ``manager/api/gql/deployment/types/revision.py`` to the new v2
  Input DTOs and drop the temporary ``*DraftDTO`` aliases introduced
  in the previous commit.

Required-field validation still happens in
``ModelConfigDraft.to_resolved`` / ``ModelHealthCheckDraft.to_resolved``
/ ``ModelServiceConfigDraft.to_resolved`` after the merge, which is
called in ``DeploymentController.add_revision``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the comp:common Related to Common component label May 8, 2026
@jopemachine jopemachine removed this from the 26.4 milestone May 8, 2026
jopemachine and others added 2 commits May 8, 2026 20:02
Drop verbose per-field descriptions on the new v2 ModelConfig /
ModelDefinition / ModelServiceConfig / ModelHealthCheck Input DTOs and
their GQL counterparts where the field name is already self-describing
or the description merely repeated the merge-chain note. Keep one
class-level note on ModelDefinitionInput as the single canonical
explanation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/ai/backend/common/dto/manager/v2/deployment/request.py Outdated
jopemachine and others added 4 commits May 10, 2026 18:10
Address review feedback by moving the standalone
``to_model_definition_draft`` helper into ``ModelDefinitionInput.to_draft``
and update the test fixtures to use ``ModelDefinitionInput`` (matching
the field type) instead of ``ModelDefinitionDraft``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pin the BA-5983 contract: GraphQL/REST inputs accept all-optional
fields, but ``to_resolved()`` after the revision merge chain still
raises when no source supplies a required field. Three groups:

- ``ModelDefinitionInput.to_draft()`` produces a valid empty/partial
  draft without raising
- Empty request merges with variant baseline / preset and resolves to
  the baseline values (partial overrides also combine correctly)
- Missing ``name`` / ``model_path`` / ``port`` / health-check ``path``
  with no baseline raises ``ValueError`` at ``to_resolved()``

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ``merge_revision_drafts`` helper was removed in #11250 in favor of
the ``RevisionDraft.merge(self, other)`` instance method, but
``RevisionDraftReader`` docstring still pointed at the old name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collapse the four ``test_missing_*_raises`` cases into a single
parametrized test. Each scenario carries the ``ModelDefinitionInput``
shape and the expected ``ValueError`` pattern; new cases can be added
as ``pytest.param`` entries.

The remaining tests (DTO→draft conversions, baseline/preset merge
shape) are intentionally left non-parametrized — their assertions
differ in depth and the per-source distinction is meaningful.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jopemachine jopemachine changed the title fix(BA-5983): make ModelConfig GQL input fields optional fix(BA-5983): accept partial ModelDefinition input in deployment API May 11, 2026
jopemachine and others added 2 commits May 11, 2026 12:14
Apply parametrize to the two test groups that previously held one
test method per scenario:

- ``TestModelDefinitionInputToDraft`` — collapse three round-trip
  cases into a single ``test_to_draft_preserves_input_shape`` that
  asserts ``draft.model_dump() == input.model_dump()``. The invariant
  is the same for every scenario (empty / partial / nested); the
  inputs now live as ``pytest.param`` entries.
- ``TestEmptyInputMergesWithBaseline`` — extract a ``ResolvedExpectation``
  dataclass so the three "merge produces correct resolved value"
  scenarios share one test body. New cases (additional sources,
  deeper overrides) only need a new ``pytest.param``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
``ModelDefinitionInput.to_draft()`` used ``model_dump()`` with default
arguments, which dumps every field — including the unset ones at their
``None`` default. Round-tripping that through ``model_validate`` left
the resulting draft with ``model_fields_set`` containing every field,
so every ``None`` looked "explicitly set" and clobbered lower-priority
baselines during the revision merge chain.

Switch to ``model_dump(exclude_unset=True)`` so the resulting draft's
``model_fields_set`` reflects only what the caller actually provided.
This is what makes the BA-5983 scenario actually work end-to-end: a
request that omits ``name`` / ``model_path`` / ``service.port`` /
``health_check.path`` lets the variant baseline (or preset) fill them
in instead of nulling them out.

Extend the merge test to cover this directly — every "missing required
field" scenario now layers a baseline draft together with a request
draft so the merge actually combines fields across sources. Without
the fix, the model_path / service_port / health_check_path cases
would raise the wrong error first (e.g. ``ModelConfig.name is required``
fires before ``model_path``) because every request-side ``None`` would
clobber the baseline's preserved value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size:XL 500~ LoC and removed size:L 100~500 LoC labels May 11, 2026
jopemachine and others added 5 commits May 11, 2026 12:30
Insert a real ``RuntimeVariantRow`` with a baseline
``default_model_definition`` (round-trips through ``PydanticColumn``
serialization), then exercise the production ``RevisionDraftReader``
+ ``RevisionDraft.merge`` pipeline against a request draft built from
``ModelDefinitionInput.to_draft()``.

Scenarios:

- Empty input + baseline supplying full required tree → resolved
  ``ModelConfig`` carries baseline values verbatim.
- Partial request (name only) + baseline (name + model_path) →
  request wins on ``name``; baseline's ``model_path`` survives.
- Baseline missing ``name`` → ``to_resolved()`` raises
  ``ModelConfig.name is required``.
- Baseline supplying empty ``service`` → ``ModelServiceConfig.port
  is required``.
- Baseline supplying empty ``health_check`` → ``ModelHealthCheck.path
  is required``.

The synthetic merge tests still cover the pure merge functions
in-process; this file pins the DB → reader → merge → resolve loop
that the ``add_model_revision`` action actually runs in production.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restructure the DB-backed merge test so the baseline ``ModelDefinitionDraft``
lives in a fixture (the DB-side value) and the parametrize tables only
carry the actual request input + expected outcome.

Two scenario groups, each pinned to its own baseline fixture:

- ``TestMergeWithFullBaseline`` — baseline ships every required field;
  parametrize cases probe how different requests (empty, partial
  override) combine with that baseline.
- ``TestMergeRaisesWithIncompleteBaseline`` — baseline ships an
  incomplete definition so the resolve-time check fires. The request
  is always all-empty here; the parametrize table only varies the
  baseline shape + expected error pattern.

Shared helpers (``_seed_variant``, ``_merge_via_reader``) move to
module scope so both classes consume the same DB and reader setup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…puts

Restructure so each baseline shape lives in its own test class with
its own ``variant_id`` fixture seeding the DB. The parametrize tables
inside each class only carry the request ``ModelDefinitionInput`` and
the expected outcome — there is no ``baseline``/``incomplete_baseline``
parameter, since "what's in the DB" is the fixture's responsibility.

Four classes, one per baseline shape:

- ``TestMergeWithCompleteBaseline`` — variant ships every required
  field; any request resolves successfully.
- ``TestMergeWhenBaselineLacksName`` — request must supply ``name``,
  otherwise ``to_resolved()`` raises.
- ``TestMergeWhenBaselineLacksServicePort`` — request must supply
  ``service.port``.
- ``TestMergeWhenBaselineLacksHealthCheckPath`` — request must supply
  ``service.health_check.path``.

Each "lacks-X" class pairs a parametrized success test (request
supplies the missing field) with a dedicated failure test (empty
request → ``to_resolved()`` raises).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The synthetic merge test file (test_model_definition_merge.py)
duplicated every scenario the DB-backed test already covers — same
"empty input + baseline → resolved values" and "missing field
→ raise" shapes, but with hand-built drafts instead of the real
``RevisionDraftReader`` path. Keep the realistic version
(test_revision_merge_db.py) as the single source of truth for the
merge contract.

The one piece worth preserving from the synthetic file is the pure
``to_draft`` conversion check. Move that into the existing DTO test
module as ``TestModelDefinitionInputToDraft``, tightened to assert
``draft.model_fields_set == input.model_fields_set`` — the precise
invariant the merge logic relies on (BA-5983 broke because
``model_dump()`` was used without ``exclude_unset=True``, leaving
every field "set" on the draft).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove the DB-backed revision merge test and the to_draft invariant
test added in this PR. Existing tests in
``tests/unit/common/dto/manager/v2/deployment/test_request.py``
remain — those were updated only to swap ``ModelDefinitionDraft()``
for ``ModelDefinitionInput()`` so they typecheck against the new
field type.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size:L 100~500 LoC and removed size:XL 500~ LoC labels May 11, 2026
@jopemachine jopemachine requested a review from a team May 11, 2026 03:55
@jopemachine jopemachine marked this pull request as ready for review May 11, 2026 03:55
Copilot AI review requested due to automatic review settings May 11, 2026 03:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Deployment v2 GraphQL input layer so addModelRevision can accept partial ModelDefinition inputs (allowing required fields like model name/path/service port/healthcheck path to be supplied later by the revision merge chain: runtime variant baseline, vfolder model-definition.yaml, revision preset, and mount defaults), while keeping required-field validation at the final to_resolved() step after merging.

Changes:

  • Rebind GraphQL deployment revision inputs (ModelConfigInput, ModelDefinitionInput, ModelServiceConfigInput, ModelHealthCheckInput, ModelMetadataInput) to v2 request DTOs where fields are nullable.
  • Convert request-layer ModelDefinitionInput into ModelDefinitionDraft via a to_draft() helper and wire this into the deployment adapter’s create and add_revision flows.
  • Regenerate GraphQL reference schemas and update unit tests/changelog entry accordingly.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/unit/common/dto/manager/v2/deployment/test_request.py Updates unit tests to use the new ModelDefinitionInput request DTO instead of ModelDefinitionDraft.
src/ai/backend/manager/sokovan/deployment/revision_draft/reader.py Updates docstring wording to reflect RevisionDraft.merge layering semantics.
src/ai/backend/manager/api/gql/deployment/types/revision.py Switches GraphQL input types to v2 request DTO-backed inputs and makes fields nullable.
src/ai/backend/manager/api/adapters/deployment/adapter.py Converts request ModelDefinitionInput to ModelDefinitionDraft before invoking domain-layer creators.
src/ai/backend/common/dto/manager/v2/deployment/request.py Introduces all-optional request DTOs for model definition/config/service/healthcheck + ModelDefinitionInput.to_draft().
docs/manager/graphql-reference/v2-schema.graphql Regenerated schema reflecting nullable model-definition-related input fields.
docs/manager/graphql-reference/supergraph.graphql Regenerated supergraph reflecting nullable model-definition-related input fields.
changes/11531.fix.md Adds changelog entry for making these GraphQL input fields optional.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +811 to 827
class ModelHealthCheckInputGQL(PydanticInputMixin[ModelHealthCheckInputDTO]):
interval: float | None = gql_field(
description="Interval in seconds between health checks.", default=None
)
path: str = gql_field(description="Path to check for health status.")
max_retries: int = gql_field(
description="Maximum number of retries for health check.", default=10
path: str | None = gql_field(description="Path to check for health status.", default=None)
max_retries: int | None = gql_field(
description="Maximum number of retries for health check.", default=None
)
max_wait_time: float = gql_field(
description="Maximum time in seconds to wait for a health check response.", default=15.0
max_wait_time: float | None = gql_field(
description="Maximum time in seconds to wait for a health check response.", default=None
)
expected_status_code: int = gql_field(
description="Expected HTTP status code for a healthy response.", default=200
expected_status_code: int | None = gql_field(
description="Expected HTTP status code for a healthy response.", default=None
)
initial_delay: float = gql_field(
description="Initial delay in seconds before the first health check.", default=60.0
initial_delay: float | None = gql_field(
description="Initial delay in seconds before the first health check.", default=None
)
Comment on lines +837 to +848
class ModelServiceConfigInputGQL(PydanticInputMixin[ModelServiceConfigInputDTO]):
pre_start_actions: list[PreStartActionInputGQL] | None = gql_field(
description="List of pre-start actions to execute before starting the model service.",
default=strawberry.UNSET,
default=None,
)
start_command: list[str] | None = gql_field(
description="Command to start the model service.", default=None
)
shell: str = gql_field(
description="Shell configured for the model service.",
default="/bin/bash",
shell: str | None = gql_field(
description="Shell configured for the model service.", default=None
)
port: int = gql_field(description="Port number for the model service. Must be greater than 1.")
port: int | None = gql_field(description="Port number for the model service.", default=None)
Comment on lines +891 to +892
name: str | None = gql_field(description="Name of the model.", default=None)
model_path: str | None = gql_field(description="Path to the model file.", default=None)
description="List of models in the model definition."
class ModelDefinitionInputGQL(PydanticInputMixin[ModelDefinitionInputDTO]):
models: list[ModelConfigInputGQL] | None = gql_field(
description="List of models in the model definition.", default=None
… types

``ModelHealthCheckDraft.to_resolved`` and
``ModelServiceConfigDraft.to_resolved`` previously duplicated every
default value (``10.0``, ``10``, ``15.0``, ``200``, ``60.0`` for the
health-check fields; ``[]`` and ``"/bin/bash"`` for the service
config) inline as ``if self.x is not None else <default>`` branches.
Those literals were already declared on the strict ``ModelHealthCheck``
/ ``ModelServiceConfig`` classes via ``Field(default=...)``, so the
project carried the same constants in two places at risk of drift.

Switch ``to_resolved`` to drop ``None`` scalars via
``model_dump(exclude_none=True)`` and let the strict type's field
defaults apply during ``model_validate``/constructor call. Required-
field checks (``path``, ``port``) stay as explicit ``ValueError``
raises so the error message remains domain-specific rather than a
generic ``pydantic.ValidationError``. The nested ``health_check``
draft is resolved out-of-band before the strict service config is
composed, since it carries its own required-field check.

Behavior-preserving:

- Every ``Field(default=...)`` on the strict type matches the
  literal the old code wrote inline (verified field-by-field).
- ``pre_start_actions``: old ``or []`` and new ``exclude_none=True``
  produce identical results for the two reachable shapes (``None``
  and ``list[PreStartAction]``).
- Field-level constraints (``gt``, ``ge``) still fire because both
  the old constructor call and the new ``model_validate`` execute
  Pydantic validators.

Existing tests in test_config / test_revision_draft_merge /
test_model_definition_start_command_compat / test_revision_draft_reader
still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment on lines +537 to +539
# Drop unset (None) fields so the strict type's ``Field(default=...)``
# declarations remain the single source of truth for default values.
return ModelHealthCheck.model_validate(self.model_dump(exclude_none=True))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jopemachine jopemachine merged commit bbb8e02 into main May 11, 2026
36 checks passed
@jopemachine jopemachine deleted the fix/BA-5983-modelconfig-fields-optional branch May 11, 2026 04:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:docs Documentations comp:common Related to Common component comp:manager Related to Manager component size:L 100~500 LoC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants