Skip to content

fix(BA-6062): Wrap legacy string based start_command via shell -c instead of shlex.split#11648

Open
seedspirit wants to merge 3 commits into
mainfrom
fix/BA-6062
Open

fix(BA-6062): Wrap legacy string based start_command via shell -c instead of shlex.split#11648
seedspirit wants to merge 3 commits into
mainfrom
fix/BA-6062

Conversation

@seedspirit
Copy link
Copy Markdown
Contributor

@seedspirit seedspirit commented May 18, 2026

Summary

  • Replace shlex.split normalization with shell -c wrapping for legacy string start_command in stored model definitions, preserving shell semantics (pipes, redirections, env expansion, multiline). Effective on data-hydration paths only — see Scope below.
  • Add Alembic migration 70ffcaaa5f5c that re-wraps existing single-element argv lists whose only token contains whitespace. Repairs rows produced by prior migration 8c1f7d3a9e2b which stored legacy shell strings as a meaningless one-token list like ["python service.py"].
  • Helper _wrap_str_start_command_into_argv is non-mutating: returns a new dict with overridden start_command only when wrapping is needed.

Scope

The wrap fires at the strict-type boundary (ModelServiceConfig.model_validate) and on draft hydration of stored payloads:

  • DB read via PydanticColumn(ModelDefinition) / PydanticColumn(ModelDefinitionDraft).
  • vfolder model-definition.yaml validation through model_definition_iv (trafaret).
  • Preset / runtime variant JSONB columns hydrated into ModelDefinitionDraft.

REST v2 / GraphQL request input is not affected: ModelServiceConfigInput (common/dto/manager/v2/deployment/request.py) still types start_command as list[str] | None, so request payloads must be argv lists. The wrap is purely a backward-compat for pre-existing stored data and user-authored yaml.

Test plan

  • Alembic upgrade/downgrade/re-upgrade verified locally on 5 representative rows in runtime_variants (whitespace 1-elem → rewrap, no-ws 1-elem preserved, multi-elem preserved, null preserved, custom-shell vs DEFAULT_SHELL fallback). Downgrade no-op, re-upgrade idempotent.
  • Unit: tests/unit/manager/data/deployment/test_model_definition_start_command_compat.py covers Pydantic, trafaret, raw YAML, and PydanticColumn read paths with both default-shell and custom-shell cases.
  • CI: pants test --changed-since=... green.

Resolves BA-6062

seedspirit and others added 2 commits May 18, 2026 12:56
…f shlex.split

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions github-actions Bot added size:L 100~500 LoC comp:manager Related to Manager component comp:common Related to Common component require:db-migration Automatically set when alembic migrations are added or updated labels May 18, 2026
@github-actions github-actions Bot added size:XL 500~ LoC and removed size:L 100~500 LoC labels May 18, 2026
@seedspirit seedspirit marked this pull request as ready for review May 18, 2026 04:18
@seedspirit seedspirit requested a review from a team as a code owner May 18, 2026 04:18
Copilot AI review requested due to automatic review settings May 18, 2026 04:18
@seedspirit seedspirit added this to the 26.4 milestone May 18, 2026
@github-actions github-actions Bot added size:L 100~500 LoC and removed size:XL 500~ LoC labels May 18, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR changes legacy string start_command handling for model service definitions from shlex.split() normalization to shell execution via [shell, "-c", command], plus a migration intended to repair previously persisted malformed single-element argv values.

Changes:

  • Adds shared default shell wrapping for string start_command in Pydantic and trafaret validation paths.
  • Adds Alembic migration 70ffcaaa5f5c to rewrite selected persisted single-element argv lists.
  • Updates compatibility tests and adds a changelog entry.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
src/ai/backend/common/config.py Replaces shlex.split coercion with shell wrapping and updates default shell usage.
src/ai/backend/manager/models/alembic/versions/70ffcaaa5f5c_rewrap_legacy_string_start_command_via_.py Adds data migration for previously wrapped single-element start_command lists.
tests/unit/manager/data/deployment/test_model_definition_start_command_compat.py Updates tests to expect [shell, "-c", command] wrapping.
changes/11648.fix.md Adds release note for the compatibility fix and migration.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/ai/backend/common/config.py
Comment thread src/ai/backend/common/config.py
Comment thread src/ai/backend/common/config.py
Comment thread src/ai/backend/common/config.py
@seedspirit seedspirit marked this pull request as draft May 18, 2026 06:30
@seedspirit seedspirit marked this pull request as ready for review May 18, 2026 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:common Related to Common component comp:manager Related to Manager component require:db-migration Automatically set when alembic migrations are added or updated size:L 100~500 LoC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants