SpeakerManager improvements #180
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,3 +1,4 @@ | ||||||||||
|
|
||||||||||
| # SpeakerManager API | ||||||||||
|
|
||||||||||
| Tracks and manages speaker identities across audio chunks for streaming diarization. | ||||||||||
|
|
@@ -73,6 +74,21 @@ let bob = Speaker(id: "bob", name: "Bob", currentEmbedding: bobEmbedding) | |||||||||
| speakerManager.initializeKnownSpeakers([alice, bob]) | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| If the database may already contain speakers with the same IDs, specify how the conflicts should be handled: | ||||||||||
| ```swift | ||||||||||
| let alice = Speaker(id: "alice", name: "Alice", currentEmbedding: aliceEmbedding) | ||||||||||
| let bob = Speaker(id: "bob", name: "Bob", currentEmbedding: bobEmbedding) | ||||||||||
| speakerManager.initializeKnownSpeakers([alice, bob], mode: .overwrite, preservePermanent: false) // replace any speakers with ID "alice" or "bob" with the new speakers, even if the old ones were marked as permanent. | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| > The `mode` argument determines how to handle new speakers whose IDs match existing ones. It is of type `SpeakerInitializationMode` and takes one of four values: | ||||||||||
| > - `.reset`: reset the speaker database and add the new speakers | ||||||||||
| > - `.merge`: merge new speakers whose IDs match with existing ones | ||||||||||
| > - `.overwrite`: overwrite existing speakers with the same IDs as the new ones | ||||||||||
| > - `.skip`: skip adding speakers whose IDs match existing ones | ||||||||||
| > | ||||||||||
| > The `preservePermanent` argument determines whether existing speakers marked as permanent should be preserved (i.e., not overwritten or merged). It is `true` by default. | ||||||||||
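
To make the interaction between `mode` and `preservePermanent` concrete, here is a minimal sketch of the collision handling on a plain dictionary. `KnownSpeaker`, `InitMode`, and `initializeSketch` are hypothetical stand-ins, not the library's types. Two details are assumptions: that merging accumulates `duration`, and that `.reset` keeps permanent speakers when `preservePermanent` is `true`. The real `SpeakerManager` may behave differently on both.

```swift
// Hypothetical stand-ins for illustration; these are not the library's types.
struct KnownSpeaker {
    let id: String
    var name: String
    var duration: Double = 0
    var isPermanent: Bool = false
}

enum InitMode { case reset, merge, overwrite, skip }

/// Returns the database that results from adding `incoming` to `existing`.
func initializeSketch(
    existing: [String: KnownSpeaker],
    incoming: [KnownSpeaker],
    mode: InitMode,
    preservePermanent: Bool = true
) -> [String: KnownSpeaker] {
    var db = existing
    if mode == .reset {
        // Assumption: a reset keeps permanent speakers only when preservePermanent is true.
        if preservePermanent {
            db = existing.filter { $0.value.isPermanent }
        } else {
            db = [:]
        }
    }
    for speaker in incoming {
        guard let old = db[speaker.id] else {
            db[speaker.id] = speaker  // no ID collision: plain insert
            continue
        }
        // A protected permanent speaker is neither overwritten nor merged.
        if old.isPermanent && preservePermanent { continue }
        switch mode {
        case .skip:
            continue
        case .reset, .overwrite:
            db[speaker.id] = speaker
        case .merge:
            // Assumption: merging keeps the old record and accumulates duration.
            var merged = old
            merged.duration += speaker.duration
            db[speaker.id] = merged
        }
    }
    return db
}
```

With `mode: .overwrite` and the default `preservePermanent`, a permanent "alice" in the sketch survives while a non-permanent "bob" is replaced.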
|
|
||||||||||
| **Use case:** When you have pre-recorded voice samples of known speakers and want to recognize them by name instead of numeric IDs. | ||||||||||
|
|
||||||||||
| #### upsertSpeaker | ||||||||||
|
|
@@ -91,16 +107,138 @@ speakerManager.upsertSpeaker( | |||||||||
| updateCount: 5, // optional | ||||||||||
| createdAt: Date(), // optional | ||||||||||
| updatedAt: Date(), // optional | ||||||||||
| isPermanent: false // optional | ||||||||||
| ) | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| **Behavior:** | ||||||||||
| - If speaker ID exists: updates the existing speaker's data | ||||||||||
| - If speaker ID is new: inserts as a new speaker | ||||||||||
| - Maintains ID uniqueness and tracks numeric IDs for auto-increment | ||||||||||
| - If `isPermanent` is true, then the new speaker or the existing speaker will become permanent. This means that the speaker will not be merged or removed without an override. | ||||||||||
|
|
||||||||||
| #### mergeSpeaker | ||||||||||
| ```swift | ||||||||||
| // merge speaker 1 into "alice" | ||||||||||
| speakerManager.mergeSpeaker("1", into: "alice") | ||||||||||
|
|
||||||||||
| // merge speaker 2 into speaker 3 under the name "Bob", regardless of whether speaker 2 is permanent. | ||||||||||
| speakerManager.mergeSpeaker("2", into: "3", mergedName: "Bob", stopIfPermanent: false) | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| **Behavior:** | ||||||||||
| - Unless `stopIfPermanent` is `false`, the merge will be stopped if the first speaker is permanent. | ||||||||||
| - Otherwise: Merges the first speaker into the destination speaker and removes the first speaker from the known speaker database. | ||||||||||
| - If `mergedName` is provided, the destination speaker will be renamed. Otherwise, its name will be preserved. | ||||||||||
|
|
||||||||||
| > Note: the `mergedName` argument is optional. | ||||||||||
| > Note: `stopIfPermanent` is `true` by default. | ||||||||||
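
The merge rules above can be sketched on a plain dictionary. `MergeCandidate` and `mergeSketch` are hypothetical names for illustration only. Accumulating `duration` stands in for whatever the library actually combines (for example, embeddings), so treat that step as an assumption.

```swift
// Hypothetical record type for illustration; not the library's Speaker class.
struct MergeCandidate {
    let id: String
    var name: String
    var duration: Double
    var isPermanent: Bool = false
}

/// Merges `sourceID` into `destinationID`. Returns false if the merge was stopped.
@discardableResult
func mergeSketch(
    _ sourceID: String,
    into destinationID: String,
    in db: inout [String: MergeCandidate],
    mergedName: String? = nil,
    stopIfPermanent: Bool = true
) -> Bool {
    guard sourceID != destinationID,
          let source = db[sourceID],
          let destination = db[destinationID] else { return false }
    // A permanent source speaker blocks the merge unless the caller overrides it.
    if stopIfPermanent && source.isPermanent { return false }

    var merged = destination
    merged.duration += source.duration  // assumption: merged statistics accumulate
    if let newName = mergedName {
        merged.name = newName           // rename only when a name is supplied
    }
    db[destinationID] = merged
    db[sourceID] = nil                  // the source leaves the database
    return true
}
```

Returning a `Bool` is a design choice of this sketch so that callers can tell a stopped merge from a completed one. The source does not say what `mergeSpeaker` itself returns.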
|
|
||||||||||
| #### removeSpeaker | ||||||||||
| Remove a speaker from the database. | ||||||||||
|
|
||||||||||
| ```swift | ||||||||||
| // remove speaker 1 | ||||||||||
| speakerManager.removeSpeaker("1") | ||||||||||
|
|
||||||||||
| // remove "alice" from the known speaker database, even if they are marked as permanent | ||||||||||
| speakerManager.removeSpeaker("alice", keepIfPermanent: false) | ||||||||||
| ``` | ||||||||||
| > Note: `keepIfPermanent` is `true` by default. | ||||||||||
|
|
||||||||||
| #### removeSpeakersInactive | ||||||||||
| Remove speakers that have been inactive since a certain date or for a certain duration. | ||||||||||
|
||||||||||
|
|
||||||||||
| ```swift | ||||||||||
| // remove speakers that have been inactive since `date` | ||||||||||
| speakerManager.removeSpeakersInactive(since: date) | ||||||||||
|
|
||||||||||
| // remove speakers that have been inactive for 10 seconds, even if they were marked as permanent | ||||||||||
|
||||||||||
| speakerManager.removeSpeakersInactive(for: 10.0, keepIfPermanent: false) | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| > Note: Both versions of the method have an optional `keepIfPermanent` argument that defaults to `true`. | ||||||||||
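
The two variants are closely related: a duration can be turned into a cutoff date and handled by the date-based version. The sketch below illustrates this under stated assumptions. `TrackedSpeaker` and `removingInactive` are hypothetical names, inactivity is assumed to be judged by an `updatedAt` timestamp, and the `now` parameter exists only to make the example deterministic.

```swift
import Foundation

// Hypothetical record type for illustration; not the library's Speaker class.
struct TrackedSpeaker {
    let id: String
    var updatedAt: Date
    var isPermanent: Bool = false
}

/// Keeps speakers updated at or after `cutoff`, plus protected permanent ones.
func removingInactive(
    _ db: [String: TrackedSpeaker],
    since cutoff: Date,
    keepIfPermanent: Bool = true
) -> [String: TrackedSpeaker] {
    return db.filter {
        $0.value.updatedAt >= cutoff || (keepIfPermanent && $0.value.isPermanent)
    }
}

/// Duration-based variant: converts the duration to a cutoff date and delegates.
func removingInactive(
    _ db: [String: TrackedSpeaker],
    for duration: TimeInterval,
    now: Date = Date(),
    keepIfPermanent: Bool = true
) -> [String: TrackedSpeaker] {
    let cutoff = now.addingTimeInterval(-duration)
    return removingInactive(db, since: cutoff, keepIfPermanent: keepIfPermanent)
}
```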
|
|
||||||||||
| #### removeSpeakers | ||||||||||
|
||||||||||
| Remove all speakers that match a given predicate. | ||||||||||
|
|
||||||||||
| ```swift | ||||||||||
| // remove all speakers with less than 5 seconds of speaking time | ||||||||||
| speakerManager.removeSpeakers( | ||||||||||
| where: { $0.duration < 5.0 }, | ||||||||||
| keepIfPermanent: false // also remove permanent speakers (optional) | ||||||||||
| ) | ||||||||||
|
|
||||||||||
| // Alternate syntax (does NOT remove permanent speakers) | ||||||||||
| speakerManager.removeSpeakers { | ||||||||||
| $0.duration < 5.0 | ||||||||||
| } | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| > Note: the predicate should take in a `Speaker` object and return a `Bool`. | ||||||||||
|
|
||||||||||
| #### makeSpeakerPermanent | ||||||||||
| Make the speaker permanent. | ||||||||||
|
|
||||||||||
| ```swift | ||||||||||
| speakerManager.makeSpeakerPermanent("alice") // mark "alice" as permanent | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| #### revokePermanence | ||||||||||
| Make the speaker not permanent. | ||||||||||
|
|
||||||||||
| ```swift | ||||||||||
| speakerManager.revokePermanence(from: "alice") // mark "alice" as not permanent | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| ### Speaker Retrieval | ||||||||||
|
|
||||||||||
| #### findSpeaker | ||||||||||
| Find the speaker that best matches an embedding vector, along with the cosine distance to that speaker, if a match is found. | ||||||||||
|
|
||||||||||
| ```swift | ||||||||||
| let (id, distance) = speakerManager.findSpeaker(with: embedding) | ||||||||||
| ``` | ||||||||||
| > Note: there is an optional `speakerThreshold` argument to use a threshold other than the default. | ||||||||||
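
For readers unfamiliar with the metric, the following is a minimal sketch of threshold-based matching by cosine distance. `cosineDistance` and `bestMatch` are hypothetical helpers, not the library's implementation. Treating a distance equal to the threshold as a match, and returning `nil` when nothing qualifies, are both assumptions.

```swift
/// Cosine distance: 1 - cos(angle between a and b). 0 means the same direction.
func cosineDistance(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "embeddings must have the same length")
    var dot: Float = 0
    var normA: Float = 0
    var normB: Float = 0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    let denominator = (normA * normB).squareRoot()
    // Assumption: a zero vector is treated as unrelated (distance 1).
    guard denominator > 0 else { return 1 }
    return 1 - dot / denominator
}

/// Returns the closest known speaker within `threshold`, or nil if none qualifies.
func bestMatch(
    for embedding: [Float],
    among known: [String: [Float]],
    threshold: Float
) -> (id: String, distance: Float)? {
    var best: (id: String, distance: Float)? = nil
    for (id, candidate) in known {
        let distance = cosineDistance(embedding, candidate)
        if distance <= threshold && distance < (best?.distance ?? Float.infinity) {
            best = (id: id, distance: distance)
        }
    }
    return best
}
```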
|
|
||||||||||
| #### findMatchingSpeakers | ||||||||||
| Find all speakers whose distance to an embedding vector is within `speakerThreshold`. | ||||||||||
|
|
||||||||||
| ```swift | ||||||||||
| for speaker in speakerManager.findMatchingSpeakers(with: embedding) { | ||||||||||
| print("ID: \(speaker.id), Distance: \(speaker.distance)") | ||||||||||
| } | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| > Note: there is an optional `speakerThreshold` argument to use a threshold other than the default. | ||||||||||
|
|
||||||||||
| #### findSpeakers | ||||||||||
| Find all speakers that meet a certain predicate. | ||||||||||
| ```swift | ||||||||||
| // two ways to find all speakers with > 5.0s of speaking time. | ||||||||||
| speakerManager.findSpeakers(where: { $0.duration > 5.0 }) | ||||||||||
| speakerManager.findSpeakers { $0.duration > 5.0 } | ||||||||||
| // Returns an array of IDs corresponding to speakers that meet the predicate. | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| > Note: the predicate should take in a `Speaker` object and return a `Bool`. | ||||||||||
|
|
||||||||||
| #### findMergeablePairs | ||||||||||
| Find all pairs of speakers that might be the same person. Specifically, find the pairs of speakers such that the cosine distance between them is less than the `speakerThreshold`. | ||||||||||
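
The pairwise search described above can be sketched as follows. `EnrolledSpeaker`, `pairDistance`, and `mergeablePairs` are hypothetical names. The choice of which speaker in a pair becomes the destination (a permanent speaker if there is one, otherwise the later one) is an assumption of this sketch and is not stated in the source.

```swift
// Hypothetical record type for illustration; not the library's Speaker class.
struct EnrolledSpeaker {
    let id: String
    let embedding: [Float]
    var isPermanent: Bool = false
}

/// Cosine distance between two equal-length embeddings.
func pairDistance(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "embeddings must have the same length")
    var dot: Float = 0
    var normA: Float = 0
    var normB: Float = 0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    let denominator = (normA * normB).squareRoot()
    return denominator > 0 ? 1 - dot / denominator : 1
}

/// Lists every unordered pair of speakers closer than `threshold`.
func mergeablePairs(
    in speakers: [EnrolledSpeaker],
    threshold: Float,
    excludeIfBothPermanent: Bool = true
) -> [(speakerToMerge: String, destination: String)] {
    var pairs: [(speakerToMerge: String, destination: String)] = []
    for i in speakers.indices {
        for j in speakers.indices where j > i {
            let a = speakers[i]
            let b = speakers[j]
            if excludeIfBothPermanent && a.isPermanent && b.isPermanent { continue }
            guard pairDistance(a.embedding, b.embedding) < threshold else { continue }
            // Assumption: a permanent speaker is kept as the merge destination.
            if a.isPermanent {
                pairs.append((speakerToMerge: b.id, destination: a.id))
            } else {
                pairs.append((speakerToMerge: a.id, destination: b.id))
            }
        }
    }
    return pairs
}
```

Each returned pair could then be passed to a merge call, checking first that neither ID was already removed by an earlier merge in the same pass.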
|
| speakerThreshold: 0.6, // optional | |
| excludeIfBothPermanent: true // optional |
Copilot
AI
Nov 6, 2025
Missing backticks and opening pipe character in markdown table row. For consistent table formatting, the method name should be wrapped in backticks and the row should start with `|`: ``| `findSpeakers(where:)` | [String] | ...``
| | Method | Returns | Description | | |
| | `findMergeablePairs(speakerThreshold:excludeIfBothPermanent:)` | [(speakerToMerge: String, destination: String)] | Find all pairs of very similar speakers | |