New Blueprint Project - Agentic Audio RAG with LangGraph #263
alexsifman wants to merge 124 commits into v2.0.0
…at/agentic-audio-rag
…at/agentic-audio-rag
…at/agentic-audio-rag
…Blueprints into feat/agentic-audio-rag
…l testing code from notebook
…local model instead of hf as default
- Updated README.md to clarify model storage options and local setup instructions.
- Modified config.yaml to set local model paths for Qwen and CLAP models.
- Refactored run-workflow.ipynb to initialize models using a configuration-driven approach.
- Enhanced model_selection.py to check for models in local datafabric before downloading.
- Added utility function to initialize audio models, supporting both local and remote loading.
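The local-first check described in this commit could look roughly like the sketch below. It is not the code from model_selection.py: the function name `resolve_model_source` and its arguments are hypothetical, and it assumes a local model counts as available when its directory exists and is non-empty.

```python
from pathlib import Path


def resolve_model_source(local_dir: str, remote_id: str) -> tuple[str, bool]:
    """Return (source, is_local), preferring a populated local directory.

    Hypothetical helper: `local_dir` stands in for a path from config.yaml,
    `remote_id` for a remote model identifier used as the fallback.
    """
    path = Path(local_dir)
    # Use the local copy only if the directory exists and contains files.
    if path.is_dir() and any(path.iterdir()):
        return str(path), True
    # Otherwise fall back to the remote identifier so the caller can download.
    return remote_id, False
```

The caller can branch on the returned flag to decide whether a download step is needed at all, which is what keeps the local path the default.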
… compatible langchain version
for more information, see https://pre-commit.ci
At the moment this blueprint takes a long time to deploy, not only because of the pip install step in the deployment script but also because the deployment transfers two large models into the deployment environment. The whole process takes longer than AIS tolerates. The AIS deployment scripts are also not entirely correct: they check container readiness instead of container "liveness". I'm working on a quick POC to fix this, and only after that will we be able to deploy this blueprint.
…Blueprints into feat/agentic-audio-rag
We need to wait for the code from this PR (https://github.azc.ext.hp.com/phoenix/phoenix-app-desktop/pull/2442) to be in a released binary before we can test the deployment of this blueprint.
…feat/agentic-audio-rag
- Introduced QwenOmniAgent class in a new module (src/qwen_agent.py) to unify the adapter for the Qwen2.5-Omni-7B model.
- Updated run-workflow notebook to use the new QwenOmniAgent instance.
- Removed the old _QwenAdapter class from model.py to streamline the codebase.
- Ensured compatibility between notebook and MLflow deployment by using the shared QwenOmniAgent implementation.
8b051c3 to ab73ae4
Notebooks updated, and the Streamlit UI and MLflow service are working flawlessly ⚡
This PR contains the Agentic Audio RAG blueprint, a RAG system that turns speech in audio/video files into searchable knowledge and lets you ask questions directly about the actual audio. A LangGraph-driven agent retrieves the most relevant timestamped audio segments, and an audio-native LLM listens to those clips to produce precise answers.
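To illustrate the retrieval step described above, here is a minimal sketch that ranks timestamped segments by cosine similarity to a query embedding. It is an assumption-laden toy, not the blueprint's implementation: the `Segment` type, `retrieve_segments`, and the two-dimensional vectors are hypothetical, and in the real system the embeddings would come from an audio model and the selected clips would be passed to the audio-native LLM by the LangGraph agent.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """A timestamped audio segment with a precomputed embedding (hypothetical type)."""
    start_s: float
    end_s: float
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_segments(query_embedding: list[float],
                      segments: list[Segment],
                      k: int = 2) -> list[Segment]:
    """Return the k segments most similar to the query, best match first."""
    ranked = sorted(segments,
                    key=lambda s: cosine(query_embedding, s.embedding),
                    reverse=True)
    return ranked[:k]


# Toy data: three 10-second segments with made-up 2-D embeddings.
segments = [
    Segment(0.0, 10.0, [1.0, 0.0]),
    Segment(10.0, 20.0, [0.0, 1.0]),
    Segment(20.0, 30.0, [0.9, 0.1]),
]
top = retrieve_segments([1.0, 0.0], segments, k=2)
```

Returning the segments with their `start_s` and `end_s` intact is what lets a downstream step cut the matching clips out of the original file before the model listens to them.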