
Commit fe5263e

Merge branch 'open-edge-platform:main' into rm_datastore
2 parents 7bdd408 + 6661d38 commit fe5263e

File tree: 11 files changed (+181, -92 lines)


README.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ Key components of the **Edge AI Libraries**:
 | [Model Registry](microservices/model-registry) | Microservice | [Link](microservices/model-registry/docs/user-guide/get-started.md) | [API Reference](microservices/model-registry/docs/user-guide/api-docs/openapi.yaml) |
 | [Intel® Geti™](https://github.com/open-edge-platform/geti) | Tool | [Link](https://geti.intel.com/) | [Docs](https://docs.geti.intel.com) |
 | [Visual Pipeline and Performance Evaluation Tool](tools/visual-pipeline-and-platform-evaluation-tool) | Tool | [Link](tools/visual-pipeline-and-platform-evaluation-tool/docs/user-guide/get-started.md) | [Build](tools/visual-pipeline-and-platform-evaluation-tool/docs/user-guide/how-to-build-source.md) instructions |
-| [Chat Question and Answer](sample-applications/chat-question-and-answer) | Sample Application | [Link](sample-applications/chat-question-and-answer-core/docs/user-guide/get-started.md) | [Build](sample-applications/chat-question-and-answer/docs/user-guide/build-from-source.md) instructions |
+| [Chat Question and Answer](sample-applications/chat-question-and-answer) | Sample Application | [Link](sample-applications/chat-question-and-answer/docs/user-guide/get-started.md) | [Build](sample-applications/chat-question-and-answer/docs/user-guide/build-from-source.md) instructions |
 | [Chat Question and Answer Core](sample-applications/chat-question-and-answer-core) | Sample Application | [Link](sample-applications/chat-question-and-answer-core/docs/user-guide/get-started.md) | [Build](sample-applications/chat-question-and-answer-core/docs/user-guide/build-from-source.md) instructions |

sample-applications/chat-question-and-answer-core/docs/user-guide/build-from-source.md

Lines changed: 9 additions & 2 deletions
@@ -84,12 +84,19 @@ You should see entries for both `chatqna` and `chatqna-ui`.
 ## Running the Application Container
 After building the images for the `Chat Question-and-Answer Core` application, you can run the application container using `docker compose` by following these steps:
 
-1. Start the Docker containers with the previously built images:
+1. **Set Up Environment Variables**:
+   ```bash
+   export HUGGINGFACEHUB_API_TOKEN=<your-huggingface-token>
+   source scripts/setup_env.sh
+   ```
+   Configure the models to be used (LLM, Embeddings, Rerankers) in the `scripts/setup_env.sh` as needed. Refer to and use the same list of models as documented in [Chat Question-and-Answer](../../../chat-question-and-answer/docs/user-guide/get-started.md#supported-models).
+
+2. Start the Docker containers with the previously built images:
    ```bash
    docker compose -f docker/compose.yaml up
    ```
 
-2. Access the application:
+3. Access the application:
    - Open your web browser and navigate to `http://<host-ip>:5173` to view the application dashboard.
 
 ## Verification
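Once the compose stack from the steps above is up, a quick reachability check against the dashboard URL can stand in for opening a browser. A minimal sketch, assuming the UI answers plain HTTP on port 5173 and that `HOST_IP` is exported in the environment (both assumptions, not part of the documented steps):

```python
import os
import urllib.request

# Assumption: HOST_IP is exported by the setup scripts; fall back to localhost.
host_ip = os.environ.get("HOST_IP", "localhost")
url = f"http://{host_ip}:5173"

try:
    with urllib.request.urlopen(url, timeout=10) as resp:
        print(f"{url} responded with HTTP {resp.status}")
except OSError as exc:
    print(f"{url} is not reachable yet: {exc}")
```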

sample-applications/chat-question-and-answer/docs/user-guide/get-started.md

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ The sample application has been validated with a few models just to validate the
 ### LLM Models validated for each model server
 | Model Server | Models Validated |
 |--------------|-------------------|
-| `TEI` | `Intel/neural-chat-7b-v3-3`, `Qwen/Qwen2.5-7B-Instruct`, `microsoft/Phi-3.5-mini-instruct`, `meta-llama/Llama-3.1-8B-instruct`, `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` |
+| `vLLM` | `Intel/neural-chat-7b-v3-3`, `Qwen/Qwen2.5-7B-Instruct`, `microsoft/Phi-3.5-mini-instruct`, `meta-llama/Llama-3.1-8B-instruct`, `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` |
 | `OVMS` | `Intel/neural-chat-7b-v3-3`, `Qwen/Qwen2.5-7B-Instruct`, `microsoft/Phi-3.5-mini-instruct`, `meta-llama/Llama-3.1-8B-instruct`, `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` |
 | `TGI` | `Intel/neural-chat-7b-v3-3`, `Qwen/Qwen2.5-7B-Instruct`, `microsoft/Phi-3.5-mini-instruct`, `meta-llama/Llama-3.1-8B-instruct`, `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` |

sample-applications/chat-question-and-answer/docs/user-guide/overview-architecture.md

Lines changed: 4 additions & 4 deletions
@@ -33,7 +33,7 @@ ChatQ&A application is a combination of the core LangChain application logic tha
 ### Application Flow
 
 1. **Input Sources**:
-   - **Documents**: The document ingestion microservice supports ingesting from various document formats. Supported formats are word and pdf.
+   - **Documents**: The document ingestion microservice supports ingesting documents in various formats. Supported formats are word and pdf.
    - **Web pages**: Contents of accessible web pages can also be parsed and used as input for the RAG pipeline.
 2. **Create the context**
    - **Upload input documents and web links**: The UI microservice allows the developer to interact with the ChatQ&A backend. It provides the interface to upload the documents and weblinks on which the RAG pipeline will be executed. The documents are uploaded and stored in object store. MinIO is the database used for object store.
@@ -66,12 +66,12 @@ The application flow is illustrated in the flow diagram below. The diagram shows
 
 2. **Document ingestion microservice**:
    - **What it is**: Document ingestion microservice provides capability to ingest contents from documents and web links, create the necessary context, and retrieve the right context based on user query.
-   - **How it's used**: Document ingestion microservice provides a REST API endpoint that can be used to manage the contents. The ChatQ&A backend uses this API to access its capabilities.
-   - **Benefits**: The core part of the document ingestion microservice is the vector handling capability which is optimized for target deployment hardware. Selection of the vectorDB is based on performance considerations. Rest of the document ingestion microservice can be treated as sample reference implementaiton.
+   - **How it's used**: Document ingestion microservice provides a `documents` REST API endpoint that can be used to manage the contents. The ChatQ&A backend uses this API to access its capabilities.
+   - **Benefits**: The core part of the document ingestion microservice is the vector handling capability which is optimized for target deployment hardware. Selection of the vectorDB is based on performance considerations. Rest of the document ingestion microservice can be treated as sample reference implementation.
 
 3. **ChatQ&A backend microservice**:
    - **What it is**: ChatQ&A backend microservice is a LangChain based implementation of ChatQ&A RAG pipeline providing required handling of the user queries.
-   - **How it’s used**: A REST API endpoint is provided which is used by the UI front end to send user queries and trigger the RAG pipeline.
+   - **How it’s used**: A `streamlog` REST API endpoint is provided which is used by the UI front end to send user queries and trigger the RAG pipeline.
    - **Benefits**: The microservice provides a reference of how LangChain framework is used to implement ChatQ&A using Intel Edge AI inference microservices.
 
 4. **ChatQ&A UI**:
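The `documents` and `streamlog` endpoints named in this hunk can be pictured with a short client sketch. This is illustrative only: the host, ports, route paths, payload fields, and multipart field name below are assumptions, not the documented contract of the ingestion or backend microservices.

```python
import requests

# Hypothetical endpoints; host, ports, and paths are assumptions for illustration.
DOCUMENTS_URL = "http://localhost:8200/documents"   # document ingestion microservice
STREAMLOG_URL = "http://localhost:8100/streamlog"   # ChatQ&A backend microservice

# Upload a document so it can be chunked, embedded, and stored for retrieval.
with open("manual.pdf", "rb") as f:
    upload = requests.post(DOCUMENTS_URL, files={"files": f}, timeout=60)
    upload.raise_for_status()

# Send a user question and stream back the generated answer.
with requests.post(
    STREAMLOG_URL,
    json={"input": "What does the manual cover?"},
    stream=True,
    timeout=120,
) as answer:
    answer.raise_for_status()
    for chunk in answer.iter_content(chunk_size=None, decode_unicode=True):
        print(chunk, end="")
```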

sample-applications/chat-question-and-answer/docs/user-guide/overview.md

Lines changed: 0 additions & 2 deletions
@@ -47,8 +47,6 @@ Refer to the [Get Started](./get-started.md) page to get started with the sample
 
 2. **Generation [Q&A]**: This part allows the user to query the document database and generate responses. The LLM inference microservice, embedding inference microservice, and reranking microservice work together to provide accurate and efficient answers to user queries. When a user submits a question, the embedding model hosted by the chosen model serving (default is OVMS) transforms it into an embedding, enabling semantic comparison with stored document embeddings. The vector database searches for relevant embeddings, returning a ranked list of documents based on semantic similarity. The LLM Inference Microservice generates a context-aware response from the final set of documents. It is possible to use any supported models to run with the applications. Detailed documentation provides full information on validated models and models supported overall.
 
-Further details on the system architecture and customizable options are available [here](./overview-architecture.md).
-
 Detailed hardware and software requirements are available [here](./system-requirements.md).
 
 [This sample application is ready for deployment with Edge Orchestrator. Download the deployment package and follow the instructions](deploy-with-edge-orchestrator.md)

sample-applications/chat-question-and-answer/setup.sh

Lines changed: 7 additions & 0 deletions
@@ -13,6 +13,13 @@ export INDEX_NAME=intel-rag
 export EMBEDDING_ENDPOINT_URL=http://tei-embedding-service
 #Setup the host IP
 export HOST_IP=$(hostname -I | awk '{print $1}')
+# The above command does not work on EMT. Two options:
+# 1. Check with:
+# ip -o route get to 8.8.8.8 | sed -n 's/.*src \([0-9.]\+\).*/\1/p'
+# But this approach could also have an issue based on kind of
+# deployment (airgapped or not). Need to check for a better solution.
+# IP address of 8.8.8.8 is Google address.
+# 2. Eliminate the need for hostname.
 
 # UI ENV variables
 export APP_ENDPOINT_URL=http://$HOST_IP:8100
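As a side note on the EMT caveat recorded in this hunk, the same "which source address routes toward 8.8.8.8" idea can be expressed without `hostname -I`. A minimal sketch in Python, assuming outbound routing is configured (a UDP connect sends no packets, but it still fails without a default route, so the airgapped concern from the comment applies here too):

```python
import socket

def guess_host_ip(probe_addr: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Return the local source IP the routing table would use toward probe_addr."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket only consults the routing table; nothing is sent.
        s.connect((probe_addr, probe_port))
        return s.getsockname()[0]
    except OSError:
        # No usable route (e.g. airgapped deployment): fall back to loopback.
        return "127.0.0.1"
    finally:
        s.close()

print(guess_host_ip())
```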

tools/visual-pipeline-and-platform-evaluation-tool/Dockerfile.vippet

Lines changed: 1 addition & 1 deletion
@@ -32,6 +32,6 @@ RUN pip install -r requirements.txt
 
 ADD diagrams/ /home/dlstreamer/vippet/diagrams
 
-ADD app.py collect.py optimize.py pipeline.py device.py /home/dlstreamer/vippet/
+ADD app.py collect.py optimize.py pipeline.py device.py explore.py /home/dlstreamer/vippet/
 
 CMD ["python", "app.py"]

tools/visual-pipeline-and-platform-evaluation-tool/app.py

Lines changed: 3 additions & 0 deletions
@@ -14,6 +14,7 @@
 from optimize import OptimizationResult, PipelineOptimizer
 from pipeline import SmartNVRPipeline, Transportation2Pipeline
 from device import DeviceDiscovery
+from explore import GstInspector
 
 css_code = """
@@ -93,6 +94,7 @@
 # pipeline = Transportation2Pipeline()
 pipeline = SmartNVRPipeline()
 device_discovery = DeviceDiscovery()
+gst_inspector = GstInspector()
 
 # Download File
 def download_file(url, local_filename):
@@ -576,6 +578,7 @@ def on_run(
         constants=constants,
         param_grid=param_grid,
         channels=(recording_channels, inferencing_channels),
+        elements=gst_inspector.get_elements(),
     )
     collector.collect()
     time.sleep(3)
tools/visual-pipeline-and-platform-evaluation-tool/explore.py

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+import subprocess
+from threading import Lock
+
+class GstInspector:
+    """
+    A singleton class to inspect GStreamer elements using gst-inspect-1.0.
+    This class provides a method to retrieve the list of GStreamer elements
+    and their descriptions.
+
+    This is an example of the output from the command:
+
+        videoanalytics: gvaclassify: Object classification (requires GstVideoRegionOfInterestMeta on input)
+        videoanalytics: gvadetect: Object detection (generates GstVideoRegionOfInterestMeta)
+        videoanalytics: gvainference: Generic full-frame inference (generates GstGVATensorMeta)
+
+    Those elements will be returned in a list of tuples:
+
+        [
+            ("videoanalytics", "gvaclassify", "<description>"),
+            ("videoanalytics", "gvadetect", "<description>"),
+            ("videoanalytics", "gvainference", "<description>")
+        ]
+    """
+    _instance = None
+    _lock = Lock()
+
+    def __new__(cls, *args, **kwargs):
+        with cls._lock:
+            if cls._instance is None:
+                cls._instance = super(GstInspector, cls).__new__(cls)
+                cls._instance._initialize()
+            return cls._instance
+
+    def _initialize(self):
+        self.elements = self._get_gst_elements()
+
+    def _get_gst_elements(self):
+        try:
+            result = subprocess.run(
+                ["gst-inspect-1.0"],
+                stdout=subprocess.PIPE,
+                stderr=subprocess.PIPE,
+                text=True,
+                check=True
+            )
+            lines = result.stdout.splitlines()
+            elements = []
+            for line in lines:
+                if ": " in line:
+                    plugin, rest = line.split(": ", 1)
+                    if ": " in rest:
+                        element, description = rest.split(": ", 1)
+                        elements.append((plugin.strip(), element.strip(), description.strip()))
+
+            return sorted(elements)
+
+        except subprocess.CalledProcessError as e:
+            print(f"Error running gst-inspect-1.0: {e}")
+            return []
+
+    def get_elements(self):
+        return self.elements
+
+if __name__ == "__main__":
+    inspector = GstInspector()
+    for element in inspector.get_elements():
+        print(element)

tools/visual-pipeline-and-platform-evaluation-tool/optimize.py

File mode changed: 100644 → 100755

Lines changed: 3 additions & 18 deletions
@@ -31,13 +31,15 @@ def __init__(
         param_grid: Dict[str, List[str]],
         poll_interval: int = 1,
         channels: int | tuple[int, int] = 1,
+        elements: List[tuple[str, str, str]] = [],
     ):
 
         # Initialize class variables
         self.pipeline = pipeline
         self.constants = constants
         self.param_grid = param_grid
         self.poll_interval = poll_interval
+        self.elements = elements
 
         # Set the number of channels
         self.channels = (
@@ -65,29 +67,12 @@ def _iterate_param_grid(self, param_grid: Dict[str, List[str]]):
 
     def optimize(self):
 
-        # Run gst-inspect-1.0 to get the list of elements
-        process = Popen(["gst-inspect-1.0", "va"], stdout=PIPE, stderr=PIPE)
-        elements = process.communicate()[0].decode("utf-8").split("\n")
-
-        # Log the elements
-        self.logger.info("Elements:")
-        self.logger.info(pprint.pformat(elements))
-
-        # Find the available encoder
-        # Note that the selected encoder is the last one on the list.
-        # This is usually vah264lpenc if the encoder is available.
-        # Otherwise, fallback to the only available encoder, usually vah264enc.
-        encoder = [element for element in elements if "vah264enc" in element or "vah264lpenc" in element][-1]
-        encoder = encoder.split(":")[0].strip()
-
-        # Log the encoder
-        self.logger.info(f"Encoder: {encoder}")
 
         for params in self._iterate_param_grid(self.param_grid):
 
            # Evaluate the pipeline with the given parameters, constants, and channels
            _pipeline = self.pipeline.evaluate(
-                self.constants, params, self.regular_channels, self.inference_channels, encoder
+                self.constants, params, self.regular_channels, self.inference_channels, self.elements
            )
 
            # Log the command
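With this change, `optimize()` no longer shells out to `gst-inspect-1.0 va` to pick an encoder; it forwards the `(plugin, element, description)` tuples supplied via the new `elements` argument to `pipeline.evaluate()`. How the pipeline code consumes them is not shown in this commit, but a sketch of the equivalent selection over those tuples, mirroring the removed prefer-`vah264lpenc`-else-`vah264enc` logic, might look like this (the helper name is hypothetical):

```python
from typing import List, Tuple

def pick_h264_encoder(elements: List[Tuple[str, str, str]]) -> str:
    """Choose a VA-API H.264 encoder from (plugin, element, description) tuples.

    Mirrors the logic removed from optimize(): prefer the low-power
    vah264lpenc when present, otherwise fall back to vah264enc.
    Hypothetical helper, not part of this commit.
    """
    names = {element for _, element, _ in elements}
    for candidate in ("vah264lpenc", "vah264enc"):
        if candidate in names:
            return candidate
    raise RuntimeError("No VA-API H.264 encoder reported by gst-inspect-1.0")

# Example with tuples shaped like GstInspector.get_elements() output:
sample = [
    ("va", "vah264enc", "VA-API H.264 encoder"),
    ("va", "vah264lpenc", "VA-API low-power H.264 encoder"),
]
print(pick_h264_encoder(sample))  # -> vah264lpenc
```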
