Dev #15

Merged
merged 73 commits into from May 9, 2025

Commits (73)
52e9074
[fix] yolo11n-seg mode config error
dianjixz Apr 3, 2025
a231d39
[update] llm-model-yolo11n-seg version
dianjixz Apr 3, 2025
6d62b0d
[update] Update llm_asr & llm_kws en docs
Apr 7, 2025
bed467e
[update] Delete the old version of the dynamic library
Apr 7, 2025
d6d8f3c
[update] main_kws & main_llm use their own python environment
Apr 7, 2025
f5c82e4
[update] main_vlm updates the image encoding model and uses its own p…
Apr 7, 2025
033313f
[update] Remove the system environment python package. Add internvl2.…
Apr 7, 2025
a0222ca
[update] add smolvlm-500M-ax630c model & tokenizer server
Apr 8, 2025
8ae949d
[update] update main_vlm version
Apr 8, 2025
c39e691
[fix] Fix tokenizer_smolvlm prompt bug.
Apr 8, 2025
a78367e
[update] main_camera add AXERA VIN && add rtsp && add webstream
dianjixz Apr 9, 2025
9842ffb
[update] llm_task add start and stop
dianjixz Apr 9, 2025
9f257bc
[update] main_camera doc
dianjixz Apr 9, 2025
ce0d5c2
Merge branch 'dev' of github.com:m5stack/StackFlow into dev
dianjixz Apr 9, 2025
3e367c1
[update] kws doc add enwake_audio
dianjixz Apr 10, 2025
b8108f3
[update] Compatible with OpenAI API calls
Apr 15, 2025
a6bc38b
[update] Update log printing. Update fields
Apr 15, 2025
4274e8c
[fix] Fix buffer data overwrite
Apr 16, 2025
f952bb6
[update] Added support for 650N. Enable bLoadModelUseCmm.
Apr 16, 2025
98adcc2
[update] depth_anything, melotts, yolo. Added support for 650N.
Apr 16, 2025
c63bb09
[fix] Fixed the issue that CMM cannot be released after the class Eng…
Apr 16, 2025
8d18ef3
[update] mode config add compile_flage
dianjixz Apr 16, 2025
724bb5b
Merge branch 'dev' of github.com:m5stack/StackFlow into dev
dianjixz Apr 16, 2025
8f673e4
[fix] Fix CMM cannot be released
Apr 16, 2025
91c5bbd
[update] change bsp define
dianjixz Apr 16, 2025
d8b0652
Merge branch 'dev' of github.com:m5stack/StackFlow into dev
dianjixz Apr 16, 2025
9276df6
[update] Release whisper-small model
Apr 16, 2025
3203fea
[update] Update ModuleLLM-OpenAI-Plugin version
Apr 17, 2025
52f1a48
[update] add benchmark test
Apr 17, 2025
c85e4d6
[update] add llm unit test.
Apr 17, 2025
1dd2b14
[update] StackFlow bin add version id
dianjixz Apr 18, 2025
2b169c7
Merge branch 'dev' of github.com:m5stack/StackFlow into dev
dianjixz Apr 18, 2025
23cc53d
[update] Update import package method
Apr 18, 2025
66a4770
[update] Update model name
Apr 21, 2025
783200e
[update] Update openai_api version
Apr 21, 2025
8f2643d
[update] update llm-asr doc
Apr 21, 2025
14d7e5b
[fix] llm_whisper
Apr 21, 2025
4b477ee
[update] Add melotts-en-us model
Apr 22, 2025
aa3441d
[update] Update package version
Apr 22, 2025
8e263d0
[update] Update other package version
Apr 22, 2025
efa978b
[update] update melotts doc
Apr 22, 2025
16dfe70
[update] update model version
Apr 22, 2025
43c3438
[fix] Fix non-utf-8 characters
Apr 23, 2025
dce7e4d
[update] Update OpenAI-Plugin
Apr 23, 2025
076ecac
[update] Update LLM VLM STT benchmark
Apr 24, 2025
3bdf822
[update] Update benchmark
Apr 24, 2025
d636d4d
[update] add pzmq pzmq_data class
dianjixz Apr 29, 2025
afaf857
Merge branch 'dev' of github.com:m5stack/StackFlow into dev
dianjixz Apr 29, 2025
1a90562
Optimize the g2p pipeline to handle polyphonic characters, mixed Chinese-English text, and similar cases
yuyun2000 Apr 30, 2025
2e40ae6
Remove Chinese comments; use log for logging output; format code
yuyun2000 Apr 30, 2025
3ce0205
Add Japanese and English model configurations
yuyun2000 Apr 30, 2025
3897870
Slightly increase speech rate for better listening quality
yuyun2000 Apr 30, 2025
5782f89
Handle unfamiliar English words
yuyun2000 Apr 30, 2025
ef3df9b
Merge pull request #11 from yuyun2000/opt/melotts
Abandon-ht Apr 30, 2025
04961ae
[update] update mode_melotts-en-default.json
Apr 30, 2025
1e85f45
[update] update llm-model-melotts-en-default & llm-model-melotts-ja-jp
Apr 30, 2025
4e301f1
[update] update whisper-tiny
May 6, 2025
2a8a744
[update] StackFlow add stackflow_data && pzmq add get_param set_param
dianjixz May 6, 2025
e3c70bc
Implement Sola algorithm for smoother audio transitions
yuyun2000 May 6, 2025
a151aff
Translate logs in Lexicon.hpp to English and add debug switch
yuyun2000 May 6, 2025
840f739
[update] update whisper-base
May 6, 2025
e1d7e6d
Merge pull request #12 from yuyun2000/opt/melotts
Abandon-ht May 6, 2025
40bfe39
[update] update whisper-small
May 7, 2025
6b285ed
[update] update benchmark
May 8, 2025
b43e73e
[add] add benchmark.yml
May 8, 2025
b1df925
Fix SOLA detail issue causing first frame problems
yuyun2000 May 9, 2025
cbb0afa
Optimize G2P process to skip inference for short audio clips
yuyun2000 May 9, 2025
74603be
[update] update qwen3-0.6B model
May 9, 2025
9e7342f
Fix code formatting and integrate SOLA algorithm into main.py
yuyun2000 May 9, 2025
3b20852
Merge pull request #13 from yuyun2000/opt/melotts
Abandon-ht May 9, 2025
835daf1
Fix SOLA algorithm implementation
yuyun2000 May 9, 2025
e5944f2
Merge pull request #14 from yuyun2000/opt/melotts
Abandon-ht May 9, 2025
6e503a1
[update] delete debug log
May 9, 2025

Changes from all commits

3 changes: 1 addition & 2 deletions .clang-format
@@ -163,5 +163,4 @@ StatementMacros:
- QT_REQUIRE_VERSION
TabWidth: 4
UseCRLF: false
UseTab: Never
...
UseTab: Never
18 changes: 18 additions & 0 deletions .github/workflows/benchmark.yml
@@ -0,0 +1,18 @@
name: Benchmark Test
on:
  workflow_dispatch:
  push:
    branches:
      - dev
jobs:
  build:
    runs-on: [self-hosted, linux, arm64]
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Start Benchmark Test
        run: |
          echo "This job runs on a self-hosted runner!"
          echo "Running benchmark test..."
          python3 benchmark/benchmodulellm.py
10 changes: 10 additions & 0 deletions benchmark/README.md
@@ -0,0 +1,10 @@
benchmodulellm can be used to test llm unit inference performance.

Only the llm unit definition files (model json) are required.

If no model is specified, the default list is benchmarked (a custom list can be passed as sketched below the usage example). More model networks may be added later.

Usage
```shell
python benchmodulellm.py --host 192.168.20.100 --port 10001 --test-items default.yaml
```
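
For reference, a custom test-items file might look like the sketch below. This is an illustrative example, not a file shipped in this PR: it assumes the same schema as benchmark/default.yaml (a top-level `items` list whose entries carry `model_name` and `type`) and restricts the run to a single llm model. The hypothetical filename `my-items.yaml` would then be passed via `--test-items my-items.yaml`.

```yaml
# my-items.yaml (hypothetical) -- same schema as benchmark/default.yaml
items:
  - model_name: qwen2.5-0.5B-p256-ax630c
    type: llm
```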
39 changes: 39 additions & 0 deletions benchmark/RESULTS.md
@@ -0,0 +1,39 @@
# Results

## ModuleLLM (AX630C)

### LLM
| model | ttft (ms) | avg-token/s | model version | llm version |
|---------------------------------|------------|-------------|---------------|-------------|
| qwen2.5-0.5B-prefill-20e | 359.8 | 10.32 | v0.2 | v1.8 |
| qwen2.5-0.5B-p256-ax630c | 1126.19 | 10.30 | v0.4 | v1.8 |
| qwen2.5-0.5B-Int4-ax630c | 442.95 | 12.52 | v0.4 | v1.8 |
| qwen2.5-coder-0.5B-ax630c | 361.81 | 10.28 | v0.2 | v1.8 |
| qwen2.5-1.5B-ax630c | 1029.41 | 3.59 | v0.3 | v1.8 |
| qwen2.5-1.5B-p256-ax630c | 3056.54 | 3.57 | v0.4 | v1.8 |
| qwen2.5-1.5B-Int4-ax630c | 1219.54 | 4.63 | v0.4 | v1.8 |
| deepseek-r1-1.5B-ax630c | 1075.04 | 3.57 | v0.3 | v1.8 |
| deepseek-r1-1.5B-p256-ax630c | 3056.86 | 3.57 | v0.4 | v1.8 |
| llama3.2-1B-prefill-ax630c | 891.00 | 4.48 | v0.2 | v1.8 |
| llama3.2-1B-p256-ax630c | 2601.11 | 4.49 | v0.4 | v1.8 |
| openbuddy-llama3.2-1B-ax630c | 891.02 | 4.52 | v0.2 | v1.8 |

`The input text used by the llm test is "hello!"`
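
As a rough reading of these numbers (an estimate, not a measured figure): assuming ttft covers prompt prefill and avg-token/s reflects decode throughput, a 100-token reply from qwen2.5-0.5B-prefill-20e would take roughly `359.8 ms + 100 / 10.32 token/s ≈ 10.0 s` end to end.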

### VLM
| model | ttft (ms) | avg-token/s | image encode (ms) | model version | vlm version |
|---------------------------------|------------|-------------|-------------------|---------------|-------------|
| internvl2.5-1B-364-ax630c | 1117.27 | 10.56 | 1164.61 | v0.4 | v1.7 |
| smolvlm-256M-ax630c | 185.75 | 30.16 | 799.11 | v0.4 | v1.7 |
| smolvlm-500M-ax630c | 365.69 | 13.14 | 838.30 | v0.4 | v1.7 |

`The image encoding test uses a jpg image with a resolution of 810×1080`

### STT
| model | encode (ms) | avg-decode (ms) | model version | whisper version |
|--------------------|-------------|-----------------|---------------|-----------------|
| whisper-tiny | 248.0 | 32.54 | v0.4 | v1.7 |
| whisper-base | 660.31 | 51.11 | v0.4 | v1.7 |
| whisper-small | 1606.08 | 148.92 | v0.4 | v1.7 |

`The STT test uses a 30-second English wav audio clip`
126 changes: 126 additions & 0 deletions benchmark/benchmodulellm.py
@@ -0,0 +1,126 @@
import argparse
import os
import sys

import yaml
import logging

from pathlib import Path

from utils import LLMClient

# Ensure the benchmark directory is on sys.path so `utils` can be imported.
FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)

def parse_opt(known=False):
    """
    Parse command-line options.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--host", type=str, default="127.0.0.1", help="ModuleLLM IP Address")
    parser.add_argument("--port", type=int, default=10001, help="ModuleLLM TCP Port")
    parser.add_argument("--test-items", type=str, default=ROOT / "default.yaml", help="test items YAML path")

    args = parser.parse_known_args()[0] if known else parser.parse_args()

    return args

def read_yaml(file_path):
    """
    Read a YAML file and return its 'items' list.
    """
    if not os.path.exists(file_path):
        logging.error(f"YAML file '{file_path}' does not exist.")
        sys.exit(1)

    try:
        with open(file_path, "r") as file:
            data = yaml.safe_load(file)
            if data is None:
                logging.warning(f"YAML file '{file_path}' is empty.")
                return {}

            logging.info(f"YAML file '{file_path}' read successfully.")

            if "items" in data:
                return data["items"]
            else:
                logging.warning("'items' not found in YAML file.")
                return []
    except Exception as e:
        logging.error(f"Failed to read YAML file '{file_path}': {e}")
        sys.exit(1)

def write_yaml(file_path, data):
    """
    Write data to a YAML file.
    """
    try:
        with open(file_path, "w") as file:
            yaml.safe_dump(data, file)
        logging.info(f"YAML file '{file_path}' written successfully.")
    except Exception as e:
        logging.error(f"Failed to write YAML file '{file_path}': {e}")
        sys.exit(1)

def categorize_and_deduplicate(items):
    """
    Categorize items by 'type' and remove duplicate 'model_name'.
    """
    categorized = {}
    for item in items:
        item_type = item.get("type")
        model_name = item.get("model_name")
        if not item_type or not model_name:
            continue

        if item_type not in categorized:
            categorized[item_type] = set()

        categorized[item_type].add(model_name)

    # Convert sets back to lists for easier usage (see the example grouping sketched after this file)
    return {key: list(value) for key, value in categorized.items()}

def main(opt):
    items = read_yaml(opt.test_items)
    if not items:
        logging.warning(f"No items found in YAML file '{opt.test_items}'.")
        return

    categorized_items = categorize_and_deduplicate(items)

    logging.info("Categorized items:")
    for item_type, models in categorized_items.items():
        logging.info(f"Type: {item_type}, Models: {models}")

        if item_type == "llm":
            logging.info("Initializing LLMClient...")
            llm_client = LLMClient(opt.host, opt.port)

            for model_name in models:
                logging.info(f"Testing model: {model_name}")
                input_text = "Tell me an adventure story."
                try:
                    result = llm_client.test(model_name, input_text)
                    logging.info(f"Test result for model '{model_name}': {result}")
                except Exception as e:
                    logging.error(f"Error testing model '{model_name}': {e}")

            del llm_client
            logging.info("LLMClient deleted successfully.")

    return categorized_items

if __name__ == "__main__":
    opt = parse_opt()
    main(opt)
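
For orientation (not code from this PR), the dictionary returned by categorize_and_deduplicate() for the default.yaml added below would have the shape sketched here; list order within each type is not guaranteed, since model names pass through a set first, and the current main() only exercises the "llm" group.

```python
# Illustrative shape of categorize_and_deduplicate()'s return value for default.yaml.
expected_groups = {
    "llm": ["qwen2.5-0.5B-p256-ax630c"],
    "vlm": ["internvl2.5-1B-364-ax630c"],
    "whisper": ["whisper-tiny", "whisper-base", "whisper-small"],
    "asr": [
        "sherpa-ncnn-streaming-zipformer-20M-2023-02-17",
        "sherpa-ncnn-streaming-zipformer-zh-14M-2023-02-23",
    ],
    "kws": [
        "sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01",
        "sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01",
    ],
    "melotts": ["melotts-zh-cn"],
    "tts": ["single_speaker_english_fast", "single_speaker_fast"],
    "yolo": ["yolo11n", "yolo11n-seg", "yolo11n-pose"],
}
```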
31 changes: 31 additions & 0 deletions benchmark/default.yaml
@@ -0,0 +1,31 @@
items:
  - model_name: qwen2.5-0.5B-p256-ax630c
    type: llm
  - model_name: internvl2.5-1B-364-ax630c
    type: vlm
  - model_name: whisper-tiny
    type: whisper
  - model_name: whisper-base
    type: whisper
  - model_name: whisper-small
    type: whisper
  - model_name: sherpa-ncnn-streaming-zipformer-20M-2023-02-17
    type: asr
  - model_name: sherpa-ncnn-streaming-zipformer-zh-14M-2023-02-23
    type: asr
  - model_name: sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01
    type: kws
  - model_name: sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01
    type: kws
  - model_name: melotts-zh-cn
    type: melotts
  - model_name: single_speaker_english_fast
    type: tts
  - model_name: single_speaker_fast
    type: tts
  - model_name: yolo11n
    type: yolo
  - model_name: yolo11n-seg
    type: yolo
  - model_name: yolo11n-pose
    type: yolo
3 changes: 3 additions & 0 deletions benchmark/utils/__init__.py
@@ -0,0 +1,3 @@
from .llm import LLMClient

__all__ = ["LLMClient"]