feat: implement smart chunking with custom Silero VAD #14

Signal46 · 2025-11-27T12:09:12Z

- Implements `SmartChunker` in `src/chunking.rs` to split audio on silence.
- Adds `src/vad.rs` with a custom, lightweight `SileroVad` wrapper using `ort` and `ndarray` to avoid dependency conflicts.
- Exposes `transcribe_with_smart_chunking` in the `TranscriptionEngine` trait.
- Adds `tests/smart_chunking.rs` to verify VAD and chunking logic.
- Updates `Cargo.toml` with necessary dependencies (`anyhow`, `ndarray`, `reqwest` for tests).
- Updates `.gitignore` to exclude `*.onnx` model files.
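
For readers who want a feel for the splitting idea, here is a minimal sketch, not the code in `src/chunking.rs`. It assumes the VAD has already produced one speech/silence flag per fixed-size frame; `Chunk`, `split_on_silence`, `frame_len`, and `max_chunk_len` are hypothetical names chosen for illustration. The chunk is cut at the most recent silent frame once it reaches the length limit, and falls back to a hard cut when no silence was seen.

```rust
/// A chunk of audio as a half-open sample range [start, end).
/// Hypothetical type for illustration; not taken from the PR.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    start: usize,
    end: usize,
}

/// Splits audio into chunks, preferring to cut where the VAD reported silence.
///
/// `is_speech[i]` is the (assumed) VAD decision for frame `i`, and every frame
/// covers `frame_len` samples. A chunk is closed once it spans at least
/// `max_chunk_len` samples: at the most recent silent frame if one exists,
/// otherwise at the end of the current frame (a hard cut).
fn split_on_silence(is_speech: &[bool], frame_len: usize, max_chunk_len: usize) -> Vec<Chunk> {
    let total = is_speech.len() * frame_len;
    let mut chunks = Vec::new();
    let mut start = 0usize; // first sample of the chunk being built
    let mut last_silence: Option<usize> = None; // start of the latest silent frame after `start`

    for (i, &speech) in is_speech.iter().enumerate() {
        let frame_start = i * frame_len;
        let frame_end = frame_start + frame_len;

        // Remember silent frames, but never cut at the very start of a chunk.
        if !speech && frame_start > start {
            last_silence = Some(frame_start);
        }

        if frame_end - start >= max_chunk_len {
            let cut = last_silence.unwrap_or(frame_end);
            chunks.push(Chunk { start, end: cut });
            start = cut;
            last_silence = None;
        }
    }

    // Whatever is left over becomes the final, shorter chunk.
    if start < total {
        chunks.push(Chunk { start, end: total });
    }
    chunks
}
```

Each returned range could then be sliced out of the sample buffer and transcribed on its own. Cutting at silence rather than at a fixed offset is what keeps words from being split across chunk boundaries.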


- Add progress callback parameter for real-time progress reporting
- Fix VAD sample rate input type from `f32` to `i64` (resolves the "Unexpected input data type" error)
- Update `TranscriptionEngine` trait signature
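
As a rough illustration of the progress callback, the sketch below shows one way a per-chunk loop could report progress. It is an assumption about the shape of such an API, not the trait signature from this PR: `process_chunks_with_progress`, `work`, and `on_progress` are hypothetical names, and progress is reported as a fraction between 0.0 and 1.0.

```rust
/// Runs `work` on every chunk and calls `on_progress` after each one.
///
/// Hypothetical helper for illustration. `on_progress` receives the fraction
/// of chunks finished so far, so the last call is always 1.0 when there is
/// at least one chunk, and the callback is never invoked for empty input.
fn process_chunks_with_progress<T, F, P>(
    chunks: &[T],
    mut work: F,
    mut on_progress: P,
) -> Vec<String>
where
    F: FnMut(&T) -> String,
    P: FnMut(f32),
{
    let total = chunks.len();
    let mut results = Vec::with_capacity(total);

    for (i, chunk) in chunks.iter().enumerate() {
        // In a real engine this is where one chunk would be transcribed.
        results.push(work(chunk));
        on_progress((i + 1) as f32 / total as f32);
    }
    results
}
```

Taking the callback as a generic `FnMut` lets the caller update a progress bar or push to a channel without the library needing to know about either.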

cjpais · 2025-11-28T00:42:41Z

First, I want to say thank you for this. I really appreciate you porting the code to make a PR here. I skimmed the code this morning. I think there are some minor things I will want to tweak, but I want to simmer on them for a few days while I think about the overall architecture of this library and how it will ultimately be used.

- Implement `decode_and_resample` to support various audio formats (MP3, M4A, etc.)
- Update `transcribe_file` to use the new decoder, enabling native support for non-WAV files
- Add `symphonia` and `rubato` dependencies

cjpais · 2025-12-04T05:19:12Z

This is on my todo list, but that list is long right now. I will review when I can!

Signal46 added 2 commits November 27, 2025 10:18

fix: Add progress callback and fix VAD input type

bf03820

- Add progress callback parameter for real-time progress reporting
- Fix VAD sample rate input type from `f32` to `i64` (resolves the "Unexpected input data type" error)
- Update `TranscriptionEngine` trait signature

feat: add audio decoding support via symphonia and rubato

9b77e03

- Implement `decode_and_resample` to support various audio formats (MP3, M4A, etc.)
- Update `transcribe_file` to use the new decoder, enabling native support for non-WAV files
- Add `symphonia` and `rubato` dependencies
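
To make the "resample" half of that step concrete, here is a dependency-free sketch of what has to happen to decoded samples before they reach a speech model: downmix to mono, then convert the sample rate. It deliberately swaps in plain linear interpolation instead of `rubato`, and it does no decoding at all (that is what `symphonia` is for in the PR), so treat it as an illustration of the data flow only. `downmix_to_mono` and `resample_linear` are hypothetical names.

```rust
/// Averages interleaved multi-channel samples into a single mono channel.
/// Any trailing samples that do not form a complete frame are dropped.
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    if channels == 0 {
        return Vec::new();
    }
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Resamples mono audio from `from_hz` to `to_hz` by linear interpolation.
///
/// This is a stand-in for a proper resampler: it applies no anti-aliasing
/// filter, so quality is lower than a windowed-sinc resampler when
/// downsampling.
fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    let step = from_hz as f64 / to_hz as f64; // input samples per output sample
    let last = input.len() - 1;

    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = pos.floor() as usize;
            let frac = (pos - idx as f64) as f32;
            let a = input[idx.min(last)];
            let b = input[(idx + 1).min(last)]; // clamp at the final sample
            a + (b - a) * frac
        })
        .collect()
}
```

For example, stereo 48 kHz audio would go through `downmix_to_mono(&samples, 2)` and then `resample_linear(&mono, 48_000, 16_000)`, assuming the model expects 16 kHz mono input.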

Signal46 mentioned this pull request Dec 1, 2025

Added feature of transcription of local files (WAV, MP3 and M4A) along with progressbar cjpais/Handy#381

Open
