[Tracking] Asynchronous API extensions for VAD and KWS #2930

Open
@reuben

Description

For features based on Voice Activity Detection (VAD) and Keyword Spotting (KWS), we should ideally implement a callback-based API. These are latency-sensitive features, and applications have no way of knowing when or how often to query the VAD/KWS status without guessing. A callback-based API lets the library notify the application instead.

One simple example is automatic end-of-stream detection. We can implement this with Voice Activity Detection quite easily by looking at the probability of blank labels over time in the decoder loop (see e.g. arXiv:1611.09405). Exposing this functionality in our current API, on the other hand, is not so simple. Our API is entirely synchronous, and information flows are always initiated by the application, not the library, so there's no simple API extension that lets us propagate the end-of-stream information to applications.
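To make the idea concrete, here is a minimal sketch of what a blank-probability end-of-stream detector with a callback could look like. It is not the method from the cited paper and not existing DeepSpeech code: the class name, the `blank_threshold` and `patience` parameters, and their default values are all assumptions chosen for illustration, and it assumes the decoder loop can hand us one blank-label probability per timestep.

```python
from typing import Callable


class EndOfStreamDetector:
    """Hypothetical detector: fires once after `patience` consecutive
    blank-dominated timesteps that follow at least one non-blank timestep."""

    def __init__(self, on_end_of_stream: Callable[[int], None],
                 blank_threshold: float = 0.9, patience: int = 25) -> None:
        self.on_end_of_stream = on_end_of_stream
        self.blank_threshold = blank_threshold  # assumed value, needs tuning
        self.patience = patience                # assumed value, needs tuning
        self._heard_speech = False
        self._blank_run = 0
        self._fired = False
        self._timestep = 0

    def feed(self, blank_prob: float) -> None:
        """Called once per decoder timestep with the blank label probability."""
        self._timestep += 1
        if self._fired:
            return
        if blank_prob < self.blank_threshold:
            # A non-blank label is likely: treat this timestep as speech.
            self._heard_speech = True
            self._blank_run = 0
            return
        if not self._heard_speech:
            # Leading silence should not end the stream.
            return
        self._blank_run += 1
        if self._blank_run >= self.patience:
            self._fired = True
            self.on_end_of_stream(self._timestep)
```

The application only registers `on_end_of_stream` and never polls; the library decides when to call it. With the current synchronous API there is no place to hang such a callback, which is the gap this issue is about.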

Implementing an asynchronous, callback-based API is non-trivial. We need to be extra careful around reentrancy, for example. One option is to manage a worker thread (or threads) ourselves to make the API less error-prone. (This would also help with the problems of stalling the main thread and dropping recorded samples in naive single-threaded integrations of DeepSpeech, at the cost of removing low-level control from applications.)
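As a rough illustration of the library-managed worker thread option, the sketch below queues audio chunks from the application thread and runs processing and callbacks on a worker. `StreamWorker`, `process_chunk`, and `on_event` are hypothetical names, not part of the DeepSpeech API; `process_chunk` stands in for whatever inference step would produce a VAD/KWS event.

```python
import queue
import threading


class StreamWorker:
    """Hypothetical wrapper: feed() never blocks on inference, and
    callbacks run on the worker thread with no lock held."""

    def __init__(self, process_chunk, on_event):
        self._process_chunk = process_chunk  # returns an event or None
        self._on_event = on_event
        self._chunks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def feed(self, chunk):
        # Safe to call from any thread, including from inside on_event.
        self._chunks.put(chunk)

    def close(self):
        self._chunks.put(None)  # sentinel: stop after draining earlier chunks
        # Joining from the worker itself (close() called inside a callback)
        # would raise, so only join from other threads.
        if threading.current_thread() is not self._thread:
            self._thread.join()

    def _run(self):
        while True:
            chunk = self._chunks.get()
            if chunk is None:
                break
            event = self._process_chunk(chunk)
            if event is not None:
                self._on_event(event)
```

Because the callback is invoked without any internal lock held, calling `feed()` or `close()` from inside `on_event` does not deadlock in this sketch. A real implementation would still need to document which API calls are legal from within a callback.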

Binding an asynchronous API to higher-level languages is even more complicated, as each language has its own preferred syntax, ergonomics, and event loop details, and SWIG will not do much to spare us from writing code by hand for each language binding.
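To show the kind of hand-written glue a single binding might need, here is a sketch of bridging a callback fired on a library-owned thread into a Python `asyncio` event loop. `start_native_detector` is a stand-in that fakes a native library emitting two events from its own thread; the event names are made up, and a real binding would replace it with the actual callback registration.

```python
import asyncio
import threading


def start_native_detector(on_event):
    """Stand-in for a hypothetical native library that invokes
    `on_event` from a thread it owns."""
    def run():
        for name in ("speech_start", "end_of_stream"):
            on_event(name)
    threading.Thread(target=run, daemon=True).start()


async def collect_events(expected):
    loop = asyncio.get_running_loop()
    events = asyncio.Queue()

    def on_event(name):
        # Runs on the library thread. asyncio.Queue is not thread-safe,
        # so hop onto the event loop thread before touching it.
        loop.call_soon_threadsafe(events.put_nowait, name)

    start_native_detector(on_event)
    return [await events.get() for _ in range(expected)]
```

Calling `asyncio.run(collect_events(2))` yields the two events in the order they were emitted. Other languages would need an equivalent hop onto their own event loop or main thread, which is the per-binding work that SWIG does not generate.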

The good news is that these changes should all be doable as extensions of our current API, without breaking backwards compatibility. This means we can ship them in the middle of the 1.x cycle without too much hassle. This is a tracking issue for the overall work of figuring out how to add asynchronous APIs.
