Open
Description
Is it possible to get both the start and end timestamps for every token? Currently we only get the start time of each token.
A scenario we commonly face: many of our speech samples contain long silences (from more than 0.5 seconds up to 3 seconds). Because we only have start times, a word followed by such a silence appears to have been spoken for up to 3 seconds, which gives misleading results in our duration analysis.
Another question: is it possible to get the logits in the result(s) along with the timestamps, so that we can apply other algorithms to them?
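To make the problem concrete, here is a minimal sketch of what happens when durations have to be inferred from start times alone. The `(text, start_time)` token format and the helper `durations_from_starts` are hypothetical, used only for illustration; they are not the library's actual output type or API.

```python
from typing import List, Tuple

# Hypothetical token format: (text, start time in seconds).
# This is an assumption for illustration, not the library's real output type.
Token = Tuple[str, float]


def durations_from_starts(tokens: List[Token], audio_end: float) -> List[Tuple[str, float]]:
    """Estimate each token's duration as the gap until the next token starts.

    Without end timestamps this is the only estimate available, so any
    silence after a token is silently counted as part of that token.
    """
    estimates = []
    for i, (text, start) in enumerate(tokens):
        # The last token has no successor, so fall back to the end of the audio.
        next_start = tokens[i + 1][1] if i + 1 < len(tokens) else audio_end
        estimates.append((text, round(next_start - start, 3)))
    return estimates


# "world" is followed by a long silence before "again" starts at 3.4 s.
tokens = [("hello", 0.0), ("world", 0.4), ("again", 3.4)]
for text, duration in durations_from_starts(tokens, audio_end=3.9):
    print(f"{text}: {duration} s")
```

In this made-up example "world" is reported as lasting 3.0 seconds, even though most of that interval is silence. With an end timestamp per token the silence could be separated from the spoken word.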