This section shows how to deploy the Visual Search and QA (VSQA) Application using a Helm chart.
Before you begin, ensure that you have the following:
- Kubernetes* cluster set up and running.
- The cluster must support dynamic provisioning of Persistent Volumes (PV). Refer to the Kubernetes Dynamic Provisioning Guide for more details.
- `kubectl` installed on your system and configured with access to the Kubernetes cluster. See the Installation Guide.
- Helm installed on your system. See the Installation Guide.
Do the following to deploy VSQA using the Helm chart.
Use the following command to pull the Helm chart from Docker Hub:
```bash
helm pull oci://registry-1.docker.io/intel/metro-ai-suite-vsqa-chart
```
You may add `--version <version-no>` to specify a version number. Refer to the release notes for details on the latest version number to use for the sample application.
After pulling the chart, extract the `.tgz` file:
```bash
tar -xvf metro-ai-suite-vsqa-chart-<version-no>.tgz
```
This creates a directory named `metro-ai-suite-vsqa-chart` containing the chart files. Navigate to the extracted directory to access the charts:
```bash
cd metro-ai-suite-vsqa-chart
```
Clone the source repository:
```bash
git clone https://github.com/open-edge-platform/edge-ai-suites.git
```
Navigate to the chart directory:
```bash
cd edge-ai-suites/metro-ai-suite/visual-search-question-and-answering/deployment/helm-chart
```
Edit the `values.yaml` file to set the necessary environment variables. At minimum, set the models and proxy settings as required.
| Key | Description | Example Value |
|---|---|---|
| `global.proxy.http_proxy` | HTTP proxy, if required | `http://proxy-example.com:000` |
| `global.proxy.https_proxy` | HTTPS proxy, if required | `http://proxy-example.com:000` |
| `global.VLM_MODEL_NAME` | VLM model to be used by vlm-openvino-serving | `Qwen/Qwen2.5-VL-7B-Instruct` |
| `global.EMBEDDING_MODEL_NAME` | Embedding model to be used for feature extraction by multimodal-embedding-serving | `CLIP/clip-vit-h-14` |
| `global.registry` | Remote registry to pull images from. Default is blank. | `intel/` |
| `global.env.keeppvc` | Set to `true` to persist the storage. Default is `false`. | `false` |
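Assuming the keys in the table map onto nested YAML in `values.yaml` (the exact layout may differ between chart versions, so verify against the file you extracted), the edit might look like this:

```yaml
global:
  proxy:
    http_proxy: "http://proxy-example.com:000"    # your HTTP proxy, if required
    https_proxy: "http://proxy-example.com:000"   # your HTTPS proxy, if required
  VLM_MODEL_NAME: "Qwen/Qwen2.5-VL-7B-Instruct"
  EMBEDDING_MODEL_NAME: "CLIP/clip-vit-h-14"
  registry: ""          # blank by default; e.g. "intel/" to pull from a remote registry
  env:
    keeppvc: false      # set to true to persist the storage
```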
Navigate to the chart directory and build the Helm dependencies using the following command:
```bash
helm dependency update
```
Create a namespace for Milvus:
```bash
kubectl create namespace milvus
```
Install the latest Milvus Helm chart:
```bash
helm repo add milvus https://zilliztech.github.io/milvus-helm/
helm repo update
```
Deploy Milvus in a simplified standalone mode:
```bash
helm install my-milvus milvus/milvus -n milvus --set image.all.tag=v2.6.0 --set cluster.enabled=false --set etcd.replicaCount=1 --set minio.mode=standalone --set pulsar.enabled=false --set pulsarv3.enabled=false
```
Note: If you need customized settings for Milvus, refer to the official guide.
Check the pod status with `kubectl get po -n milvus`. Restarts are possible; as long as the three pods stabilize after a while, the deployment is successful.
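The check above can also be scripted. This sketch parses the status column of `kubectl get po` output and reports whether every pod is `Running`; sample output (with illustrative pod names) is inlined so it can be tried without a cluster — in practice, replace the heredoc with the real `kubectl get po -n milvus --no-headers` output:

```bash
#!/bin/sh
# Extract the STATUS column (3rd field) from sample `kubectl get po` output.
# Replace the heredoc with: kubectl get po -n milvus --no-headers
status_col=$(awk '{print $3}' <<'EOF'
my-milvus-etcd-0        1/1   Running   2     5m
my-milvus-minio-0       1/1   Running   0     5m
my-milvus-standalone-0  1/1   Running   1     5m
EOF
)
# If any line is not "Running", the deployment is not stable yet.
if echo "$status_col" | grep -qv '^Running$'; then
  echo "some pods are not Running yet"
else
  echo "all pods Running"
fi
```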
Create the host data directory:
```bash
mkdir -p $HOME/data
```
Make sure the host directories are available to the cluster nodes, and that the host paths under the `volumes.hostDataPath` section in the `values.yaml` file match the correct directories. In particular, the default path in `values.yaml` is `/home/user/data`, which corresponds to a host username of `user`.
Note: Supported media types are jpg, png, and mp4.
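As a quick sanity check, you can list any files in the data directory that are not one of the supported types before ingestion. The directory and filenames below are illustrative; point `find` at your actual data directory (e.g. `$HOME/data`):

```bash
#!/bin/sh
# Illustrative data directory; in practice, use your real one (e.g. $HOME/data).
mkdir -p /tmp/vsqa-data
touch /tmp/vsqa-data/cat.jpg /tmp/vsqa-data/dog.png \
      /tmp/vsqa-data/clip.mp4 /tmp/vsqa-data/notes.txt
# Print files whose extension is NOT one of the supported media types.
find /tmp/vsqa-data -type f \
  ! \( -name '*.jpg' -o -name '*.png' -o -name '*.mp4' \) -print
```

Any file it prints (here, `notes.txt`) would not be a supported media type and should be removed before ingestion.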
Create a namespace for the VSQA app:
```bash
kubectl create namespace vsqa
```
Install the chart:
```bash
helm install vsqa . --values values.yaml -n vsqa
```
Check the status of the deployed resources to ensure everything is running correctly:
```bash
kubectl get pods -n vsqa
kubectl get services -n vsqa
```
Ensure all pods are in the "Running" state before proceeding.
For simpler access, you can set up a port forward:
```bash
kubectl port-forward -n vsqa svc/visual-search-qa-app 17580:17580
```
Leave the session alive, then access http://localhost:17580 to view the application.
To uninstall, use the following command:
```bash
helm uninstall vsqa -n vsqa
helm uninstall my-milvus -n milvus
```
- Ensure that all pods are running and the services are accessible.
- If you encounter any issues during the deployment process, check the Kubernetes logs for errors:
  ```bash
  kubectl logs <pod-name> -n <your-namespace>
  ```
- If the data preparation pod shows an error while loading a large dataset, the dataset may be too large. Try breaking the dataset into smaller subsets and ingesting each of them instead.
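One way to break a dataset into smaller subsets is to move the files into numbered batch directories and ingest one directory at a time. This is a sketch only: the directory, filenames, and batch size below are illustrative, and the source does not prescribe a particular batch size.

```bash
#!/bin/sh
# Split files in DATA_DIR into subset directories of at most BATCH files
# each, so they can be ingested one subset at a time.
DATA_DIR=/tmp/vsqa-dataset   # illustrative; use your real data directory
BATCH=2                      # illustrative; pick a size your ingestion handles
mkdir -p "$DATA_DIR"
touch "$DATA_DIR/a.jpg" "$DATA_DIR/b.jpg" "$DATA_DIR/c.jpg" \
      "$DATA_DIR/d.jpg" "$DATA_DIR/e.jpg"
i=0
n=0
for f in "$DATA_DIR"/*.jpg; do
  # Start a new subset directory every BATCH files.
  if [ $((n % BATCH)) -eq 0 ]; then
    i=$((i + 1))
    mkdir -p "$DATA_DIR/subset-$i"
  fi
  mv "$f" "$DATA_DIR/subset-$i/"
  n=$((n + 1))
done
```

With five files and a batch size of 2, this produces `subset-1` and `subset-2` with two files each and `subset-3` with one; each subset can then be ingested separately.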