Add support for Typhoon ASR Real-Time model with transcribe() function by Copilot · Pull Request #19 · PyThaiNLP/pythaiasr

Copilot · 2026-01-31T04:54:00Z

Adds scb10x/typhoon-asr-realtime model support to pythaiasr, along with a transcribe() function for real-time inference that returns timestamps and timing metrics.

Changes

Model integration: Added Typhoon ASR to supported models list. Uses NVIDIA NeMo toolkit instead of HuggingFace transformers (existing models unaffected).
Audio processing: Typhoon models require different handling:
- Accepts file paths or numpy arrays
- Resamples to 16 kHz and peak-normalizes the audio using np.max(np.abs(data))
- Uses temporary WAV files for NeMo's transcribe() API
Real-time inference function: Added new transcribe() function specifically for Typhoon ASR models that provides:
- Word-level timestamps (estimated, uniformly distributed)
- Processing time metrics
- Audio duration information
- Structured output as a dictionary instead of a string
- Device flexibility (CPU, CUDA, or auto)
Dependencies: Added optional [typhoon] extra with nemo-toolkit[asr]>=1.21.0 and librosa
Error handling: Raises informative ImportError when nemo-toolkit is missing, directing users to install via pip install pythaiasr[typhoon]
Tests: Added unit tests for model listing, import error handling, and the new transcribe() function. Integration tests available via TEST_TYPHOON_ASR=true env var.
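
The audio handling described above (resample to 16 kHz, peak-normalize, write a temporary WAV) could look roughly like the sketch below. This is not the code from the PR: the helper names prepare_audio and write_temp_wav are hypothetical, the EPSILON value is assumed (the PR only mentions that such a constant exists), and plain linear interpolation stands in for a proper resampler such as the one librosa provides.

```python
import os
import tempfile
import wave

import numpy as np

TARGET_SR = 16_000  # sample rate named in the PR description
EPSILON = 1e-8      # assumed value; the PR only says an EPSILON constant was added


def prepare_audio(data, sample_rate):
    """Return mono float32 audio at 16 kHz, peak-normalized to [-1, 1]."""
    data = np.asarray(data, dtype=np.float32)
    if data.ndim > 1:
        # Assumes shape (samples, channels); average the channels to mono.
        data = data.mean(axis=1)
    if sample_rate != TARGET_SR and data.size > 0:
        # Linear interpolation as a stand-in for a real resampler.
        n_out = int(round(data.size * TARGET_SR / sample_rate))
        old_t = np.arange(data.size) / sample_rate
        new_t = np.arange(n_out) / TARGET_SR
        data = np.interp(new_t, old_t, data).astype(np.float32)
    peak = np.max(np.abs(data)) if data.size else 0.0
    # max(...) guards against division by zero on silent input.
    return data / max(peak, EPSILON)


def write_temp_wav(data):
    """Write 16-bit PCM mono audio to a temporary WAV file and return its path."""
    pcm = (np.clip(data, -1.0, 1.0) * 32767).astype("<i2")
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wf.setframerate(TARGET_SR)
        wf.writeframes(pcm.tobytes())
    return path
```

The caller is responsible for deleting the temporary file once the model has read it.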

Usage

Basic ASR (string output)

from pythaiasr import asr

# File path or numpy array
result = asr("audio.wav", model="scb10x/typhoon-asr-realtime", device="cpu")

Real-time inference with detailed output

from pythaiasr import transcribe

# Basic transcription
result = transcribe("audio.wav", model="scb10x/typhoon-asr-realtime")
print(result['text'])
print(f"Processing time: {result['processing_time']:.2f}s")

# With word-level timestamps
result = transcribe("audio.wav", with_timestamps=True)
for ts in result['timestamps']:
    print(f"{ts['word']}: {ts['start']:.2f}s - {ts['end']:.2f}s")

Installation requires the typhoon extra:

pip install pythaiasr[typhoon]
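
For readers curious how the install hint might be produced when the extra is missing, here is a minimal sketch of an optional-dependency guard. The function name require_optional and the message wording are assumptions for illustration; the PR's actual implementation may differ.

```python
import importlib


def require_optional(module_name, extra):
    """Import an optional module, or raise ImportError with an install hint.

    Hypothetical helper: `extra` is the name of the pip extra to suggest.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this model. "
            f"Install it with: pip install pythaiasr[{extra}]"
        ) from exc
```

Deferring the import to the point where a Typhoon model is requested keeps the existing HuggingFace-based models usable without NeMo installed.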

Note: Timestamps are estimated by uniformly distributing words across the audio duration and may not reflect actual word timing in natural speech with pauses or variable rates.
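
To make the note above concrete, the following sketch shows one way uniform timestamp estimation can work. The function name estimate_timestamps and the assumption that the text is already split into a list of words are hypothetical. Only the 'word', 'start', and 'end' keys come from the usage example in this PR.

```python
def estimate_timestamps(words, duration):
    """Split `duration` seconds evenly across `words`.

    Every word gets the same length, so pauses and speaking-rate
    changes in the real audio are not reflected.
    """
    if not words or duration <= 0:
        return []
    step = duration / len(words)
    return [
        {"word": word, "start": i * step, "end": (i + 1) * step}
        for i, word in enumerate(words)
    ]
```

With four words and a 2-second clip, each word is assigned a 0.5-second slot regardless of how long it actually took to say, which is why the estimates drift on speech with long pauses.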

Original prompt

This section details the original issue you should resolve

<issue_title>Add support typhoon-asr-realtime</issue_title>
<issue_description>Typhoon ASR Real-Time is a next-generation, open-source Automatic Speech Recognition (ASR) model built specifically for real-world streaming applications in the Thai language. It is designed to deliver fast and accurate transcriptions while running efficiently on standard CPUs. This enables users to host their own ASR service, reducing costs and avoiding the need to send sensitive data to third-party cloud services. The model is trained on 10,000 hours of Thai audio transcriptions to help it generalize to any environment.

HuggingFace model: https://huggingface.co/typhoon-ai/typhoon-asr-realtime

GitHub: https://github.com/scb-10x/typhoon-asr</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes Add support typhoon-asr-realtime #18



wannaphong · 2026-01-31T05:16:59Z

@copilot Add ASR Real-Time Inference


Copilot · 2026-01-31T05:26:30Z

@copilot Add ASR Real-Time Inference

Added transcribe() function for real-time inference with Typhoon ASR. This provides word-level timestamps, processing time metrics, and audio duration information.

Example usage:

from pythaiasr import transcribe

result = transcribe("audio.wav", with_timestamps=True)
print(result['text'])
for ts in result['timestamps']:
    print(f"{ts['word']}: {ts['start']:.2f}s - {ts['end']:.2f}s")

Commit: 0ea8e24

Initial plan

b642583

Copilot AI assigned Copilot and wannaphong Jan 31, 2026

Copilot started work on behalf of wannaphong January 31, 2026 04:54

Copilot AI and others added 3 commits January 31, 2026 05:00

Add support for typhoon-asr-realtime model

a9878bd

Co-authored-by: wannaphong <8536487+wannaphong@users.noreply.github.com>

Add tests for typhoon-asr-realtime model

e97a2c1

Co-authored-by: wannaphong <8536487+wannaphong@users.noreply.github.com>

Fix code review issues: correct numpy normalization and test backup

b2250a9

Co-authored-by: wannaphong <8536487+wannaphong@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add support for Typhoon ASR Real-Time model~~ Add support for Typhoon ASR Real-Time model Jan 31, 2026

Copilot AI requested a review from wannaphong January 31, 2026 05:10

Copilot finished work on behalf of wannaphong January 31, 2026 05:10

Copilot started work on behalf of wannaphong January 31, 2026 05:17

Copilot AI and others added 2 commits January 31, 2026 05:23

Add transcribe() function for real-time inference with Typhoon ASR

e3c2ca0

Co-authored-by: wannaphong <8536487+wannaphong@users.noreply.github.com>

Address code review: add EPSILON constant, improve error messages, add timestamp warnings

0ea8e24

Co-authored-by: wannaphong <8536487+wannaphong@users.noreply.github.com>

Copilot AI changed the title ~~Add support for Typhoon ASR Real-Time model~~ Add support for Typhoon ASR Real-Time model with transcribe() function Jan 31, 2026

Copilot finished work on behalf of wannaphong January 31, 2026 05:27

Copilot wants to merge 6 commits into main from copilot/add-typhoon-asr-realtime-support

2 participants
