Please add timestamp support for Whisper models

Thanks for open-sourcing this project — it's been very helpful!

Currently the Whisper implementation uses `<|notimestamps|>` and filters out all special tokens, so `TimestampedResult.timestamps` is always `None`.

For segment-level timestamps, it seems like removing `<|notimestamps|>` and parsing `<|xx.xx|>` tokens would be a straightforward fix.
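To make the segment-level idea concrete, here is a rough sketch of the parsing step. It assumes the decoder output has already been detokenized to a string that still contains the special tokens; `Segment` and `parse_segments` are hypothetical names, not part of this project's API. Whisper timestamps are relative to the start of the current 30-second window, so an `offset` is added for later windows.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):  # hypothetical container, not an existing type
    start: float
    end: float
    text: str


_TIMESTAMP = re.compile(r"<\|(\d+\.\d+)\|>")  # e.g. <|2.40|>
_SPECIAL = re.compile(r"<\|[^|>]+\|>")        # e.g. <|en|>, <|endoftext|>


def parse_segments(decoded: str, offset: float = 0.0) -> list[Segment]:
    """Pair up timestamp tokens and collect the text between each pair.

    Text that is not enclosed by a start/end timestamp pair is dropped.
    """
    segments: list[Segment] = []
    start = None  # start time of the currently open segment, if any
    pos = 0       # index just past the previous timestamp token
    for match in _TIMESTAMP.finditer(decoded):
        t = float(match.group(1)) + offset
        # Text since the previous timestamp, minus other special tokens.
        text = _SPECIAL.sub("", decoded[pos:match.start()]).strip()
        if start is not None and text:
            segments.append(Segment(start, t, text))  # closing timestamp
            start = None
        else:
            start = t  # opening timestamp
        pos = match.end()
    return segments


example = (
    "<|startoftranscript|><|en|><|transcribe|>"
    "<|0.00|> Hello world.<|2.40|><|2.40|> How are you?<|4.80|><|endoftext|>"
)
for seg in parse_segments(example):
    print(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text}")
```

The result could then be mapped onto `TimestampedResult.timestamps` in whatever shape the project prefers.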

For word-level timestamps, I'm not sure what the best approach would be. Would re-exporting the ONNX model with cross-attention weights exposed as outputs and applying DTW (dynamic time warping) to them be a viable path, or is there a better way to achieve this?
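For reference, this is roughly what I mean by the DTW step, as a minimal NumPy sketch. It assumes a re-exported model could return a cross-attention matrix of shape `(tokens, frames)` with values in `[0, 1]`, which the current export does not provide. The function names, the `1 - attention` cost, and the plain three-step DTW are my own choices for illustration. The 0.02 s per frame comes from Whisper's encoder producing 1500 frames per 30-second window.

```python
import numpy as np


def dtw_path(cost: np.ndarray) -> list[tuple[int, int]]:
    """Return the minimum-cost monotonic path through a (tokens, frames) cost matrix."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost with a padded border
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the bottom-right corner to the top-left corner.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin((acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return path


def token_start_times(attention: np.ndarray, seconds_per_frame: float = 0.02) -> list[float]:
    """Map each token to the time of the first audio frame aligned to it."""
    path = dtw_path(1.0 - attention)  # high attention -> low cost (assumed cost)
    first_frame: dict[int, int] = {}
    for token, frame in path:
        first_frame.setdefault(token, frame)
    return [first_frame[t] * seconds_per_frame for t in range(attention.shape[0])]


# Toy example: 3 tokens, 6 frames, each token attending to two consecutive frames.
attention = np.full((3, 6), 0.1)
for tok in range(3):
    attention[tok, 2 * tok : 2 * tok + 2] = 0.9
print(token_start_times(attention))
```

Token start times would still need to be merged into word boundaries afterwards, and the choice of attention heads and layers to average is an open question here.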

If you could point me in the right direction, I'd be happy to give it a try and submit a PR.



Please Add timestamp support for Whisper models #116
