Let's take the cloud-edge-collaborative-inference-for-llm scenario as an example and see how an algorithm developer can prepare and configure the test environment using the following configuration.
| Property | Required | Description |
|---|---|---|
| dataset | yes | The configuration of the dataset |
| metrics | yes | The metrics used for the test case's evaluation; Type: list |
For example:

```yaml
testenv:
  dataset:
    ...
  metrics:
    ...
```

The `dataset` section has the following properties:

| Property | Required | Description |
|---|---|---|
| train_url | yes | The URL address of the train dataset index; Type: string |
| test_url | yes | The URL address of the test dataset index; Type: string |
The dataset files can be provided in several formats: TXT, CSV, JSON, and JSONL. Here is how the data files should be prepared for each format:
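Index files like these are typically generated by a script rather than written by hand. As an illustration, the snippet below writes a JSONL index (the paths and labels are hypothetical, matching the examples that follow):

```python
import json

# Hypothetical records; in practice these would come from scanning a dataset folder.
records = [
    {"image": "/path/to/image1.jpg", "label": "dog"},
    {"image": "/path/to/image2.jpg", "label": "cat"},
]

# JSONL: one JSON object per line.
with open("index.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```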
For TXT format, each line typically represents a single data record or a path to a data file, optionally followed by its corresponding label separated by a space.

```text
/path/to/image1.jpg dog
/path/to/image2.jpg cat
```

For CSV format, the file should contain comma-separated values. It usually includes a header row, where one column holds the data (or the path to it) and another holds the label.
```csv
image_path,label
/path/to/image1.jpg,dog
/path/to/image2.jpg,cat
```

For JSON format, the file can be a JSON array of objects, or JSON Lines (JSONL), where each line is a valid JSON object.
```json
[
  {"image": "/path/to/image1.jpg", "label": "dog"},
  {"image": "/path/to/image2.jpg", "label": "cat"}
]
```

Or JSONL:

```jsonl
{"image": "/path/to/image1.jpg", "label": "dog"}
{"image": "/path/to/image2.jpg", "label": "cat"}
```

For example:
```yaml
dataset:
  train_index: "./dataset/mmlu-5-shot/train_data/data.json"
  test_index: "./dataset/mmlu-5-shot/test_data/metadata.json"
```

We have designed multiple metrics for edge-cloud collaborative inference, including:
| Metric | Description | Unit |
|---|---|---|
| Accuracy | Accuracy on the test dataset | - |
| Edge Ratio | Proportion of queries routed to the edge model | - |
| Time to First Token | Time taken to generate the first token | s |
| Internal Token Latency | Time taken to generate each subsequent token | s |
| Throughput | Token generation speed | tokens/s |
| Cloud Prompt Tokens | Number of prompt tokens consumed by the cloud model | - |
| Cloud Completion Tokens | Number of completion tokens generated by the cloud model | - |
| Edge Prompt Tokens | Number of prompt tokens consumed by the edge model | - |
| Edge Completion Tokens | Number of completion tokens generated by the edge model | - |
Each metric is calculated by a corresponding module in `examples/cloud-edge-collaborative-inference-for-llm/testenv`. For more details, please check that folder.
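For intuition, here is a rough sketch of how two of these metrics could be computed from per-query results. The record fields (`answer`, `target`, `where`) are hypothetical and do not reflect the actual module interface:

```python
def accuracy(results):
    """Fraction of queries whose predicted answer matches the target."""
    return sum(r["answer"] == r["target"] for r in results) / len(results)

def edge_ratio(results):
    """Proportion of queries routed to the edge model."""
    return sum(r["where"] == "edge" for r in results) / len(results)

# Four hypothetical query results: 3 answered correctly, 3 served on the edge.
results = [
    {"answer": "A", "target": "A", "where": "edge"},
    {"answer": "B", "target": "C", "where": "cloud"},
    {"answer": "D", "target": "D", "where": "edge"},
    {"answer": "A", "target": "A", "where": "edge"},
]
print(accuracy(results))    # 3 of 4 correct -> 0.75
print(edge_ratio(results))  # 3 of 4 on edge -> 0.75
```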
You can select multiple metrics in `examples/cloud-edge-collaborative-inference-for-llm/testenv/testenv.yaml`:
```yaml
# testenv.yaml
testenv:
  dataset:
    train_data: "./dataset/mmlu-5-shot/train_data/data.json"
    test_data_info: "./dataset/mmlu-5-shot/test_data/metadata.json"
  metrics:
    - name: "Accuracy"
      url: "./examples/cloud-edge-collaborative-inference-for-llm/testenv/accuracy.py"
    - name: "Edge Ratio"
      url: "./examples/cloud-edge-collaborative-inference-for-llm/testenv/edge_ratio.py"
    - name: "Cloud Prompt Tokens"
      url: "./examples/cloud-edge-collaborative-inference-for-llm/testenv/cloud_prompt_tokens.py"
    - name: "Cloud Completion Tokens"
      url: "./examples/cloud-edge-collaborative-inference-for-llm/testenv/cloud_completion_tokens.py"
    - name: "Edge Prompt Tokens"
      url: "./examples/cloud-edge-collaborative-inference-for-llm/testenv/edge_prompt_tokens.py"
    - name: "Edge Completion Tokens"
      url: "./examples/cloud-edge-collaborative-inference-for-llm/testenv/edge_completion_tokens.py"
    - name: "Time to First Token"
      url: "./examples/cloud-edge-collaborative-inference-for-llm/testenv/time_to_first_token.py"
    - name: "Throughput"
      url: "./examples/cloud-edge-collaborative-inference-for-llm/testenv/throughput.py"
    - name: "Internal Token Latency"
      url: "./examples/cloud-edge-collaborative-inference-for-llm/testenv/internal_token_latency.py"
```
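Before running the benchmark, it can be handy to sanity-check that every metric `url` in the configuration points to an existing file. A minimal sketch (the helper name and the inline config structure are illustrative; in practice the dict would come from parsing `testenv.yaml`, e.g. with PyYAML):

```python
import os

def check_metric_files(testenv_config):
    """Return the names of metrics whose `url` does not point to an existing file."""
    return [
        m["name"]
        for m in testenv_config["testenv"]["metrics"]
        if not os.path.isfile(m["url"])
    ]

# Example with the same structure as testenv.yaml (truncated to two metrics).
config = {
    "testenv": {
        "metrics": [
            {"name": "Accuracy", "url": "./testenv/accuracy.py"},
            {"name": "Edge Ratio", "url": "./testenv/edge_ratio.py"},
        ]
    }
}
print(check_metric_files(config))
```

Any names printed are metrics whose module path needs fixing before the benchmark can run.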