Add multi-architecture support to AQUA Shape Recommender #1335

Closed
Aryanag2 wants to merge 5 commits into main from feature/shape-recommender-v2-multi-architecture

@Aryanag2
Member

Summary

Upgrade the shape recommender to support four architecture types using the strategy pattern: text-generation, multimodal VLMs, embedding models, and audio/ASR.

Changes

  • New architectures supported:

    • ✅ Multimodal VLMs (LLaVA, Nemotron-VL, Qwen2-VL, InternVL, Phi3-V)
    • ✅ Embedding models (BERT, RoBERTa, E5-Mistral, GTE, ModernBERT)
    • ✅ Audio/ASR (Whisper all sizes)
    • ✅ Text-generation (Llama, Mistral, Qwen, Falcon) - no changes to existing behavior
  • Key implementation:

    • Add ParsedModelConfig with detect_architecture() as single routing point
    • Add EmbeddingConfig, WhisperConfig, VisionConfig for new architectures
    • Add memory estimators: VisionMemoryEstimator, EmbeddingMemoryEstimator, WhisperMemoryEstimator
    • Implement strategy pattern with 4 concrete strategies in new strategies/ package
    • Add StrategyFactory for architecture-to-strategy routing
    • Remove text-generation-only gate from _get_model_config()
    • Add architecture-specific vLLM flags (--limit-mm-per-prompt, --task embedding, etc.)
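
For readers who have not seen the diff, the routing described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the architecture labels, the detection heuristics, and the `vllm_flags()` method are assumptions made for the example. Only the names `detect_architecture` and `StrategyFactory` and the two vLLM flags come from the description.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type

# Hypothetical architecture labels; the PR's real constants are not shown.
TEXT, MULTIMODAL, EMBEDDING, AUDIO = "text_generation", "multimodal", "embedding", "audio"


def detect_architecture(config: dict) -> str:
    """Classify a Hugging Face style config dict (assumed heuristics)."""
    if config.get("model_type") == "whisper":
        return AUDIO
    if "vision_config" in config:
        return MULTIMODAL
    architectures = config.get("architectures", [])
    if any(name.endswith("ForCausalLM") for name in architectures):
        return TEXT
    if any(name.endswith("Model") for name in architectures):
        return EMBEDDING
    raise ValueError(f"Cannot detect architecture from: {architectures}")


class RecommendationStrategy(ABC):
    """One strategy per architecture; only the vLLM flags are modelled here."""

    @abstractmethod
    def vllm_flags(self) -> List[str]:
        ...


class TextStrategy(RecommendationStrategy):
    def vllm_flags(self) -> List[str]:
        return []  # text-generation needs no extra flags in this sketch


class MultimodalStrategy(RecommendationStrategy):
    def vllm_flags(self) -> List[str]:
        # The flag's value is omitted: its format depends on the vLLM version.
        return ["--limit-mm-per-prompt"]


class EmbeddingStrategy(RecommendationStrategy):
    def vllm_flags(self) -> List[str]:
        return ["--task", "embedding"]


class AudioStrategy(RecommendationStrategy):
    def vllm_flags(self) -> List[str]:
        return []  # no audio-specific flag is named in the description


class StrategyFactory:
    """Map an architecture label to its strategy class."""

    _registry: Dict[str, Type[RecommendationStrategy]] = {
        TEXT: TextStrategy,
        MULTIMODAL: MultimodalStrategy,
        EMBEDDING: EmbeddingStrategy,
        AUDIO: AudioStrategy,
    }

    @classmethod
    def create(cls, architecture: str) -> RecommendationStrategy:
        try:
            return cls._registry[architecture]()
        except KeyError:
            raise ValueError(f"Unsupported architecture: {architecture}") from None


# Usage: detect once, then let the factory pick the strategy.
arch = detect_architecture({"model_type": "bert", "architectures": ["BertModel"]})
flags = StrategyFactory.create(arch).vllm_flags()
```

Keeping detection in one function and dispatch in one registry means a new architecture needs only a new strategy class and a registry entry, which is the usual motivation for this pattern.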

Impact

  • Models previously rejected with "not supported" errors now get proper recommendations
  • Zero breaking changes - all existing text-generation models work identically
  • Enables AQUA service to support new model types

Testing

  • 9/10 existing tests pass
  • 1 test has expected behavior change (Whisper now succeeds via new entry point instead of being rejected)
  • Package installs successfully with all dependencies

Files Changed

  • Modified: constants.py, estimator.py, llm_config.py, recommend.py
  • New: strategies/__init__.py, strategies/base.py, strategies/text.py, strategies/multimodal.py, strategies/embedding.py, strategies/audio.py

Next Steps

  • Update test test_llm_config_unsupported_models to validate both entry points
  • Add integration tests for new strategies
  • Update documentation

- Changed huggingface_hub==0.26.2 to huggingface_hub in opctl extras
- Allows installation of latest huggingface_hub version (1.3.7+)
- All ADS APIs using huggingface_hub verified compatible with latest version
- Replace deprecated 'huggingface-cli login' with 'hf auth login' in error messages
- Remove deprecated 'new_session' parameter from huggingface_hub.login() call
- HfHubHTTPError and GatedRepoError now require a 'response' keyword argument in v1.0+
- Updated tests to pass response=MagicMock() when mocking these exceptions
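
The test change in the last two bullets can be illustrated with a small sketch. To stay self-contained it uses a stand-in exception, `FakeHubHTTPError`, that only mirrors the keyword-only `response` argument described above; the `fetch_status` helper and the mocked client are likewise hypothetical and not taken from the ADS test suite.

```python
from unittest.mock import MagicMock


class FakeHubHTTPError(Exception):
    """Stand-in for an exception whose constructor requires `response=`."""

    def __init__(self, message: str, *, response):
        super().__init__(message)
        self.response = response


def fetch_status(client, repo_id: str) -> str:
    """Hypothetical caller that turns a hub error into a status string."""
    try:
        client.model_info(repo_id)
        return "ok"
    except FakeHubHTTPError as err:
        return f"error {err.response.status_code}"


# The mock must now build the exception with a response object,
# otherwise the constructor itself raises TypeError.
client = MagicMock()
client.model_info.side_effect = FakeHubHTTPError(
    "not found", response=MagicMock(status_code=404)
)
result = fetch_status(client, "org/model")
```

Constructing the exception without `response=` fails before the code under test runs, which is why mocks written for older versions need updating.
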
Upgrade shape recommender to support 4 architecture types using strategy pattern:
- Text-generation (Llama, Mistral, Qwen, Falcon) - no changes to existing behavior
- Multimodal VLMs (LLaVA, Nemotron-VL, Qwen2-VL, InternVL, Phi3-V)
- Embedding models (BERT, RoBERTa, E5-Mistral, GTE, ModernBERT)
- Audio/ASR (Whisper all sizes)

Key changes:
- Add ParsedModelConfig with detect_architecture() as single routing point
- Add EmbeddingConfig, WhisperConfig, VisionConfig for new architectures
- Add memory estimators: VisionMemoryEstimator, EmbeddingMemoryEstimator, WhisperMemoryEstimator
- Implement strategy pattern with 4 concrete strategies in new strategies/ package
- Add StrategyFactory for architecture-to-strategy routing
- Remove text-generation-only gate from _get_model_config()
- Add architecture-specific vLLM flags (--limit-mm-per-prompt, --task embedding, etc.)

Models previously rejected with "not supported" errors now get proper recommendations.
Zero breaking changes - all existing text-generation models work identically.

Tested: 9/10 existing tests pass, 1 test has expected behavior change
(Whisper now succeeds via new entry point instead of being rejected)

Signed-off-by: Aryan Gosaliya <aryan.gosaliya@oracle.com>
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Feb 11, 2026
@github-actions

⚠️ This PR changed pyproject.toml file. ⚠️

  • PR Creator must update 📃 THIRD_PARTY_LICENSES.txt, if any 📚 library added/removed in pyproject.toml.
  • PR Approver must confirm 📃 THIRD_PARTY_LICENSES.txt updated, if any 📚 library added/removed in pyproject.toml.

@Aryanag2
Member Author

Closing to recreate with a clean commit history from the main branch.

@Aryanag2 Aryanag2 closed this Feb 11, 2026
@Aryanag2 Aryanag2 deleted the feature/shape-recommender-v2-multi-architecture branch February 11, 2026 18:28
