Add multi-architecture support to AQUA Shape Recommender #1336

Merged
mrDzurb merged 8 commits into main from feature/shape-recommender-v2-multi-architecture on Feb 28, 2026

Aryanag2 (Member) commented on Feb 11, 2026

Summary

Upgrade the shape recommender to support four architecture types using the strategy pattern: text generation, multimodal VLMs, embedding models, and audio/ASR.

Changes

  • New architectures supported:

    • ✅ Multimodal VLMs (LLaVA, Nemotron-VL, Qwen2-VL, InternVL, Phi3-V)
    • ✅ Embedding models (BERT, RoBERTa, E5-Mistral, GTE, ModernBERT)
    • ✅ Audio/ASR (Whisper all sizes)
    • ✅ Text-generation (Llama, Mistral, Qwen, Falcon) - no changes to existing behavior
  • Key implementation:

    • Add ParsedModelConfig with detect_architecture() as single routing point
    • Add EmbeddingConfig, WhisperConfig, VisionConfig for new architectures
    • Add memory estimators: VisionMemoryEstimator, EmbeddingMemoryEstimator, WhisperMemoryEstimator
    • Implement strategy pattern with 4 concrete strategies in new strategies/ package
    • Add StrategyFactory for architecture-to-strategy routing
    • Remove text-generation-only gate from _get_model_config()
    • Add architecture-specific vLLM flags (--limit-mm-per-prompt, --task embedding, etc.)
    • Graceful error handling for multimodal models with incomplete sub-configs
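The routing described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: only the names `detect_architecture` and `StrategyFactory` come from the description, while the `Architecture` enum, the detection rules, the strategy class names, and the `create`/`name` methods are assumptions made for the example. The detection rules rely on fields that commonly appear in Hugging Face `config.json` files (`model_type`, `architectures`, `vision_config`).

```python
from abc import ABC, abstractmethod
from enum import Enum


class Architecture(Enum):
    # Hypothetical enum; the PR may represent architectures differently.
    TEXT_GENERATION = "text_generation"
    MULTIMODAL = "multimodal"
    EMBEDDING = "embedding"
    AUDIO = "audio"


def detect_architecture(config: dict) -> Architecture:
    """Single routing point: map a raw config.json dict to an architecture.

    The rules below are illustrative assumptions, not the PR's actual logic.
    """
    model_type = str(config.get("model_type", "")).lower()
    arch_names = [str(name).lower() for name in config.get("architectures", [])]
    if model_type == "whisper":
        return Architecture.AUDIO
    if "vision_config" in config:
        return Architecture.MULTIMODAL
    if any(name.endswith("forcausallm") for name in arch_names):
        return Architecture.TEXT_GENERATION
    if model_type in {"bert", "roberta", "modernbert"}:
        return Architecture.EMBEDDING
    raise ValueError(f"Unrecognized architecture for model_type={model_type!r}")


class RecommendationStrategy(ABC):
    """Common interface that every concrete strategy implements."""

    @abstractmethod
    def name(self) -> str:
        ...


class TextGenerationStrategy(RecommendationStrategy):
    def name(self) -> str:
        return "text-generation"


class MultimodalStrategy(RecommendationStrategy):
    def name(self) -> str:
        return "multimodal"


class EmbeddingStrategy(RecommendationStrategy):
    def name(self) -> str:
        return "embedding"


class AudioStrategy(RecommendationStrategy):
    def name(self) -> str:
        return "audio"


class StrategyFactory:
    # Architecture-to-strategy routing table.
    _registry = {
        Architecture.TEXT_GENERATION: TextGenerationStrategy,
        Architecture.MULTIMODAL: MultimodalStrategy,
        Architecture.EMBEDDING: EmbeddingStrategy,
        Architecture.AUDIO: AudioStrategy,
    }

    @classmethod
    def create(cls, architecture: Architecture) -> RecommendationStrategy:
        return cls._registry[architecture]()


def strategy_for(config: dict) -> RecommendationStrategy:
    """Detect the architecture once, then hand off to the matching strategy."""
    return StrategyFactory.create(detect_architecture(config))
```

Keeping detection in one function means callers never branch on model type themselves; supporting another architecture would then amount to one new enum member, one detection rule, and one registry entry.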

Impact

  • Models previously rejected with "not supported" errors now get proper recommendations
  • Zero breaking changes - all existing text-generation models work identically
  • Enables AQUA service to support new model types

Testing

  • All 39 tests passing in test_recommend.py
    • 27 existing tests (zero regressions)
    • 12 new architecture tests added in TestNewArchitectures
      • Audio/Whisper: 4 tests ✅
      • Embedding: 5 tests ✅
      • Multimodal VLM: 3 tests ✅
  • Test data included: 17 HuggingFace config.json files for new architectures
  • Command used: pytest tests/unitary/with_extras/aqua/test_recommend.py -v
  • Backward compatibility verified: All text-generation tests continue to pass with identical behavior

Files Changed

  • Modified: constants.py, estimator.py, llm_config.py, recommend.py, test_recommend.py
  • New: strategies/__init__.py, strategies/base.py, strategies/text.py, strategies/multimodal.py, strategies/embedding.py, strategies/audio.py
  • Test data: 17 config files in test_data/recommend/config-json-files/

Upgrade shape recommender to support 4 architecture types using strategy pattern:
- Text-generation (Llama, Mistral, Qwen, Falcon) - no changes to existing behavior
- Multimodal VLMs (LLaVA, Nemotron-VL, Qwen2-VL, InternVL, Phi3-V)
- Embedding models (BERT, RoBERTa, E5-Mistral, GTE, ModernBERT)
- Audio/ASR (Whisper all sizes)

Key changes:
- Add ParsedModelConfig with detect_architecture() as single routing point
- Add EmbeddingConfig, WhisperConfig, VisionConfig for new architectures
- Add memory estimators: VisionMemoryEstimator, EmbeddingMemoryEstimator, WhisperMemoryEstimator
- Implement strategy pattern with 4 concrete strategies in new strategies/ package
- Add StrategyFactory for architecture-to-strategy routing
- Remove text-generation-only gate from _get_model_config()
- Add architecture-specific vLLM flags (--limit-mm-per-prompt, --task embedding, etc.)

Models previously rejected with "not supported" errors now get proper recommendations.
Zero breaking changes - all existing text-generation models work identically.

Tested: 9/10 existing tests pass, 1 test has expected behavior change
(Whisper now succeeds via new entry point instead of being rejected)

Signed-off-by: Aryan Gosaliya <aryan.gosaliya@oracle.com>
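As a rough illustration of the architecture-specific vLLM flags mentioned in this commit, the sketch below maps an architecture label to extra command-line arguments. Only the flag names `--limit-mm-per-prompt` and `--task embedding` are taken from the commit message. The function name, the architecture labels, and the `image=<n>` value format are assumptions for the example, and the exact value syntax depends on the vLLM version in use.

```python
from typing import List


def extra_vllm_flags(architecture: str, max_images: int = 1) -> List[str]:
    """Return extra vLLM CLI arguments for an architecture label.

    Hypothetical helper: the labels and the "image=<n>" value are assumptions.
    """
    if architecture == "multimodal":
        # Cap how many images a single prompt may carry.
        return ["--limit-mm-per-prompt", f"image={max_images}"]
    if architecture == "embedding":
        # Flag and value as written in the commit message.
        return ["--task", "embedding"]
    # Text generation (and anything else in this sketch) needs no extra flags.
    return []
```

Returning a list of separate tokens, instead of one joined string, keeps the result safe to append to an argument vector without additional shell quoting.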
oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) on Feb 11, 2026
…ure support

- Add graceful error handling for multimodal models with incomplete sub-configs
- Wrap text_config and vision_config parsing in try-except blocks
- Allow either text or vision config to succeed for VLMs
- Update test_llm_config_unsupported_models to reflect new error messages
- Remove obsolete test_which_shapes_valid (replaced by TestNewArchitectures)
- All 39 tests in test_recommend.py now pass
- New architecture tests (audio, embedding, multimodal): 12/12 passing

Signed-off-by: Aryan Gosaliya <aryan.gosaliya@oracle.com>
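The graceful handling described in this commit might look something like the sketch below, where each sub-config is parsed in its own try-except block and the model is rejected only if both fail. The helper names, the required keys, and the error wording are assumptions for illustration. Only the `text_config` and `vision_config` keys and the "either may succeed" rule come from the commit message.

```python
from typing import List, Optional, Tuple

# Assumed minimal set of required fields; the real parser likely needs more.
REQUIRED_KEYS = ("hidden_size", "num_hidden_layers")


def parse_sub_config(raw: dict, label: str) -> dict:
    """Validate one sub-config and return only the fields this sketch uses."""
    missing = [key for key in REQUIRED_KEYS if key not in raw]
    if missing:
        raise ValueError(f"{label} is missing {missing}")
    return {key: raw[key] for key in REQUIRED_KEYS}


def parse_multimodal(config: dict) -> Tuple[Optional[dict], Optional[dict]]:
    """Parse text and vision sub-configs independently of each other."""
    errors: List[str] = []
    text_cfg: Optional[dict] = None
    vision_cfg: Optional[dict] = None

    try:
        text_cfg = parse_sub_config(config["text_config"], "text_config")
    except (KeyError, ValueError) as exc:
        errors.append(f"text_config: {exc}")

    try:
        vision_cfg = parse_sub_config(config["vision_config"], "vision_config")
    except (KeyError, ValueError) as exc:
        errors.append(f"vision_config: {exc}")

    # Fail only when neither sub-config could be parsed.
    if text_cfg is None and vision_cfg is None:
        raise ValueError("No usable sub-config found: " + "; ".join(errors))
    return text_cfg, vision_cfg
```

Collecting the individual failures before raising keeps the final error message informative when both sub-configs are incomplete.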
github-actions commented:

📌 Cov diff with main: 0%

📌 Overall coverage: 17.42%

- 7 Whisper/audio model configs (openai/whisper-*)
- 6 embedding model configs (BAAI/bge-*, sentence-transformers/*)
- 4 multimodal VLM configs (llava-hf/*, lmms-lab/*)

These config.json files are required for TestNewArchitectures tests to pass
in GitHub Actions CI/CD workflow.

Signed-off-by: Aryan Gosaliya <aryan.gosaliya@oracle.com>
github-actions commented:

📌 Cov diff with main: 0%

📌 Overall coverage: 17.42%

Aryanag2 changed the title from "Add multi-architecture support to AQUA Shape Recommender" to "[WIP]Add multi-architecture support to AQUA Shape Recommender" on Feb 11, 2026
_summarize_shapes_for_seq_lens uses LLMConfig as a type annotation but
it was never imported after the V2 refactor, causing a NameError at class
definition time and preventing the CLI from loading at all.

Signed-off-by: Aryan Gosaliya <aryan.gosaliya@oracle.com>
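The failure mode fixed by this commit can be reproduced in isolation. The sketch below is illustrative only: the class bodies are stand-ins for the real `LLMConfig` and recommender, and the `defines_cleanly` helper is hypothetical. On interpreters that evaluate annotations eagerly, a parameter annotation naming a symbol that was never imported raises `NameError` as soon as the `def` statement runs. Python 3.14 and later defer annotation evaluation by default, so the broken variant may load there without error.

```python
# Annotation refers to a name that is never imported or defined.
BROKEN = """
class Recommender:
    def summarize(self, config: LLMConfig):
        return config
"""

# Fix 1: make the name available before the class body runs
# (a stand-in class here; in the real module this would be an import).
FIXED = """
class LLMConfig:
    pass

class Recommender:
    def summarize(self, config: LLMConfig):
        return config
"""

# Fix 2: a string annotation is stored as text and not evaluated at definition time.
QUOTED = """
class Recommender:
    def summarize(self, config: "LLMConfig"):
        return config
"""


def defines_cleanly(source: str) -> bool:
    """Execute the source in a fresh namespace and report whether it loads."""
    try:
        exec(source, {})
        return True
    except NameError:
        return False


if __name__ == "__main__":
    for label, source in (("broken", BROKEN), ("fixed", FIXED), ("quoted", QUOTED)):
        print(label, defines_cleanly(source))
```

Importing the name is the most direct fix when the class is needed at runtime anyway; a quoted annotation is an alternative when the import would be circular or is needed only for type checking.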
github-actions commented:

📌 Cov diff with main: 54%

📌 Overall coverage: 56.77%

github-actions commented:

📌 Cov diff with main: 54%

📌 Overall coverage: 56.75%

… strategies

Signed-off-by: Aryan Gosaliya <aryan.gosaliya@oracle.com>
github-actions commented:

📌 Cov diff with main: 49%

📌 Overall coverage: 56.70%

mrDzurb (Member) commented on Feb 26, 2026

LGTM!

github-actions commented:

📌 Cov diff with main: 49%

📌 Overall coverage: 56.75%

mrDzurb changed the title from "[WIP]Add multi-architecture support to AQUA Shape Recommender" to "Add multi-architecture support to AQUA Shape Recommender" on Feb 27, 2026
mrDzurb merged commit f605539 into main on Feb 28, 2026
16 of 20 checks passed
github-actions commented:

📌 Cov diff with main: 49%

📌 Overall coverage: 56.74%

3 participants