…custom vocab log probs API

- Resolved conflicts in offline-stream.cc to support both ys_log_probs (upstream) and token_log_probs (custom)
- Updated pubspec.yaml to use local path for development
- Preserved custom SherpaOnnxVocabLogProbs API for full vocabulary log probabilities
- Maintained compatibility with upstream's ys_log_probs field
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
📝 Walkthrough

Adds token- and vocabulary-level log-probability support across the codebase: new C API struct/functions for vocab log-probs, Dart FFI bindings and stream methods to fetch them, result types extended with token/ys/vocab probs, decoders updated to compute log-softmax distributions, and JSON/serialization wiring updated.
Sequence Diagram(s)

sequenceDiagram
participant App as Dart App
participant DartStream as Dart Online/Offline Stream
participant Bindings as SherpaOnnxBindings
participant Native as Native Lib (C/C++)
participant Heap as Native Heap
App->>DartStream: getVocabLogProbs()
DartStream->>Bindings: check get*VocabLogProbs ptr
Bindings-->>DartStream: function ptr
DartStream->>Native: call get*VocabLogProbs(stream)
activate Native
Native->>Heap: allocate SherpaOnnxVocabLogProbs + flattened array
Native-->>DartStream: Pointer<SherpaOnnxVocabLogProbs>
deactivate Native
DartStream->>DartStream: read numTokens, vocabSize, flatten -> Map<String,List<double>>
DartStream->>Bindings: destroyVocabLogProbs(ptr)
Bindings->>Native: free array + struct
Native-->>Bindings: done
DartStream-->>App: return Map or null
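The ownership contract in the diagram (the native side allocates the struct plus a flattened array; the Dart side must call the destroy function) can be illustrated with a minimal C++ sketch. This is an illustration only, not the actual c-api.cc code: VocabLogProbs, GetVocabLogProbs, and DestroyVocabLogProbs are stand-in names mirroring the API shape described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Mirrors the C-API struct shape described in the walkthrough.
struct VocabLogProbs {
  const float *log_probs;  // flattened [num_tokens][vocab_size]
  int32_t num_tokens;
  int32_t vocab_size;
};

// Flatten a per-token distribution into one heap block the caller owns.
const VocabLogProbs *GetVocabLogProbs(
    const std::vector<std::vector<float>> &src) {
  if (src.empty()) return nullptr;
  int32_t num_tokens = static_cast<int32_t>(src.size());
  int32_t vocab_size = static_cast<int32_t>(src[0].size());
  float *flat = new float[static_cast<size_t>(num_tokens) * vocab_size];
  for (int32_t t = 0; t < num_tokens; ++t) {
    std::copy(src[t].begin(), src[t].end(), flat + t * vocab_size);
  }
  return new VocabLogProbs{flat, num_tokens, vocab_size};
}

// The free path must mirror the allocation: delete[] the array, delete the struct.
void DestroyVocabLogProbs(const VocabLogProbs *p) {
  if (p == nullptr) return;
  delete[] p->log_probs;
  delete p;
}
```

Every getter call must be paired with exactly one destroy call, which is why the Dart wrappers invoke the destroy function before returning.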
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Summary of Changes

Hello @Dokotela, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant enhancement by exposing the full vocabulary log probabilities for each decoded token position across both online and offline speech recognition streams. This new capability is crucial for advanced applications such as confidence scoring, uncertainty estimation, and hypothesis fusion algorithms, providing a deeper insight into model predictions beyond just the most likely token.
Code Review
This pull request introduces a new API to access vocabulary log probabilities, which is a valuable feature for advanced use cases like model fusion and uncertainty estimation. The changes span across the C-API, core C++ logic, and Dart/Flutter bindings.
My review has identified a few areas for improvement:

- There are some inconsistencies in propagating the new vocab_log_probs data in different recognizer implementations.
- A critical copy-paste error was found in one of the decoders.
- Some opportunities for code cleanup and performance optimization were also noted.
Overall, this is a great addition. Addressing the feedback will improve the robustness and consistency of the new API.
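As background for the "model fusion" use case mentioned above: full per-token distributions allow two models' outputs over the same vocabulary to be interpolated in the probability domain. A generic sketch, not part of this PR; FuseLogProbs is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Linear interpolation of two log-prob distributions over the same vocab:
// p = w * p_a + (1 - w) * p_b, converted back to log space per entry.
std::vector<float> FuseLogProbs(const std::vector<float> &a,
                                const std::vector<float> &b, float w) {
  std::vector<float> out(a.size());
  for (size_t i = 0; i < a.size(); ++i) {
    double p = w * std::exp(a[i]) + (1.0 - w) * std::exp(b[i]);
    out[i] = static_cast<float>(std::log(p));
  }
  return out;
}
```

This kind of fusion is only possible when the API exposes the full distribution rather than just the top token's log probability.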
sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.cc
Actionable comments posted: 6
🧹 Nitpick comments (9)
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (2)
92-96: Consider removing the "ADD THIS" comment.

The inline comment // ADD THIS appears to be a development artifact and should be removed before merging.

- // Store full vocabulary distribution (already log-softmaxed)
+ // Store full vocabulary log-probability distribution
  std::vector<float> full_vocab_probs(vocab_size);
119-120: Remove development comment.

The comment // ADD THIS on line 120 should be removed.

  ans.token_log_probs = std::move(token_log_probs);
- ans.vocab_log_probs = std::move(vocab_log_probs);  // ADD THIS
+ ans.vocab_log_probs = std::move(vocab_log_probs);

sherpa-onnx/csrc/offline-moonshine-decoder.h (1)
17-19: Consider adding documentation for vocab_log_probs.

The token_log_probs field has a descriptive comment. Consider adding a similar comment for vocab_log_probs to document its shape and semantics.

  /// Token-level log probabilities (confidence scores)
  std::vector<float> token_log_probs;
+ /// Full vocabulary log-probability distribution per token [num_tokens][vocab_size]
  std::vector<std::vector<float>> vocab_log_probs;
676-689: Add documentation for the new API functions.

The struct definition has good inline documentation, but the three new functions lack documentation comments explaining:

- Return value semantics (caller owns the pointer; returns NULL if no vocab log probs are available)
- Memory management (must call SherpaOnnxDestroyVocabLogProbs to avoid a memory leak)

 SHERPA_ONNX_API typedef struct SherpaOnnxVocabLogProbs {
   const float *log_probs;  // Flattened 2D array [num_tokens][vocab_size]
   int32_t num_tokens;
   int32_t vocab_size;
 } SherpaOnnxVocabLogProbs;

+/// Get vocabulary log probabilities for an online stream.
+///
+/// @param stream A pointer returned by SherpaOnnxCreateOnlineStream().
+/// @return A pointer to vocab log probs, or NULL if not available.
+///         The user must invoke SherpaOnnxDestroyVocabLogProbs() to free
+///         the returned pointer to avoid memory leak.
 SHERPA_ONNX_API const SherpaOnnxVocabLogProbs *
 SherpaOnnxOnlineStreamGetVocabLogProbs(const SherpaOnnxOnlineStream *stream);

+/// Get vocabulary log probabilities for an offline stream.
+///
+/// @param stream A pointer returned by SherpaOnnxCreateOfflineStream().
+/// @return A pointer to vocab log probs, or NULL if not available.
+///         The user must invoke SherpaOnnxDestroyVocabLogProbs() to free
+///         the returned pointer to avoid memory leak.
 SHERPA_ONNX_API const SherpaOnnxVocabLogProbs *
 SherpaOnnxOfflineStreamGetVocabLogProbs(const SherpaOnnxOfflineStream *stream);

+/// Free a pointer returned by SherpaOnnxOnlineStreamGetVocabLogProbs() or
+/// SherpaOnnxOfflineStreamGetVocabLogProbs().
+///
+/// @param log_probs A pointer returned by the getter functions.
 SHERPA_ONNX_API void SherpaOnnxDestroyVocabLogProbs(
     const SherpaOnnxVocabLogProbs *log_probs);
445-473: Suggest consistent precision for log probability fields.

The serialization uses different precisions for log probabilities:

- ys_log_probs: precision 6 (line 454)
- token_log_probs: precision 4 (line 469)

Since both represent log probabilities, consider using consistent precision (e.g., 6 for both) to maintain uniformity in the JSON output.
Apply this diff for consistency:
  for (auto prob : token_log_probs) {
-   os << sep << std::fixed << std::setprecision(4) << prob;
+   os << sep << std::fixed << std::setprecision(6) << prob;
    sep = ", ";
  }
472-486: Trailing comma assumes words is always serialized.

Line 472 adds a trailing comma after token_log_probs, which is fine because words is always serialized next (lines 477-486). However, if in the future words serialization becomes conditional, this would produce invalid JSON when token_log_probs is present but words is empty.

Consider this pattern if words ever becomes optional:

  // Don't add trailing comma after token_log_probs
  os << "]";

  // Later, add comma before words if token_log_probs was present
  if (!token_log_probs.empty() && !words.empty()) {
    os << ", ";
  }
40-73: Consider using List<List<double>> instead of Map<String, List<double>> for better performance and semantics.

The current implementation has two concerns:

1. Data structure choice: Using a Map with string keys ("token_0", "token_1", etc.) for inherently sequential data is unusual. A List<List<double>> would be more natural, efficient, and consistent with the underlying C array structure.

2. Performance: The nested loop with per-element access is O(numTokens × vocabSize). For large vocabularies, using asTypedList for bulk copy would be significantly faster.

- Map<String, List<double>>? getVocabLogProbs() {
+ List<List<double>>? getVocabLogProbs() {
    final getFunc = SherpaOnnxBindings.getOnlineStreamVocabLogProbs;
    final destroyFunc = SherpaOnnxBindings.destroyVocabLogProbs;

    if (getFunc == null || destroyFunc == null) {
      return null;
    }

    final ptr = getFunc(this.ptr);
    if (ptr == nullptr) {
      return null;
    }

    final vocabLogProbs = ptr.ref;
    final numTokens = vocabLogProbs.numTokens;
    final vocabSize = vocabLogProbs.vocabSize;

-   final Map<String, List<double>> result = {};
-
-   for (int tokenIdx = 0; tokenIdx < numTokens; tokenIdx++) {
-     final List<double> tokenProbs = [];
-
-     for (int vocabIdx = 0; vocabIdx < vocabSize; vocabIdx++) {
-       final index = tokenIdx * vocabSize + vocabIdx;
-       final logProb = vocabLogProbs.logProbs[index];
-       tokenProbs.add(logProb);
-     }
-
-     result['token_$tokenIdx'] = tokenProbs;
-   }
+   final List<List<double>> result = [];
+   final allProbs = vocabLogProbs.logProbs.asTypedList(numTokens * vocabSize);
+
+   for (int tokenIdx = 0; tokenIdx < numTokens; tokenIdx++) {
+     final start = tokenIdx * vocabSize;
+     result.add(allProbs.sublist(start, start + vocabSize).toList());
+   }

    destroyFunc(ptr);
    return result;
  }
156-168: Consider reusing CalculateLogProb logic to reduce duplication.

The manual log-softmax computation here (lines 159-167) duplicates logic from the CalculateLogProb helper function. Consider extracting a shared helper to compute the full log-softmax distribution.

For example:

// Add a helper function to compute full log-softmax
static void ComputeLogSoftmax(const float *logits, int32_t vocab_size,
                              std::vector<float> &out) {
  out.resize(vocab_size);
  float max_logit = *std::max_element(logits, logits + vocab_size);
  double sum_exp = 0.0;
  for (int32_t i = 0; i < vocab_size; ++i) {
    sum_exp += std::exp(logits[i] - max_logit);
  }
  float log_sum = max_logit + std::log(sum_exp);
  for (int32_t i = 0; i < vocab_size; ++i) {
    out[i] = logits[i] - log_sum;
  }
}

// Then in the loop:
std::vector<float> full_vocab_probs;
ComputeLogSoftmax(current_logits, vocab_size, full_vocab_probs);
predicted_vocab_log_probs.push_back(std::move(full_vocab_probs));
171-209: Consider optimizing to reduce redundant passes.

The method makes multiple passes over the logits array:

- Line 179: find max_logit
- Lines 183-185: compute sum_exp
- Lines 191-193: fill full_distribution (if requested)
- Lines 200-206: find max token

Consider combining passes for better cache locality and performance:

std::pair<int32_t, float> GetMaxTokenIdWithConfidence(
    Ort::Value *logits, std::vector<float> *full_distribution = nullptr) const {
  auto meta = model_->GetModelMetadata();
  const float *p_logits = logits->GetTensorData<float>();

  // Find max for numerical stability
  float max_logit = *std::max_element(p_logits, p_logits + meta.vocab_size);

  // Single pass: compute sum_exp and find max token
  float sum_exp = 0.0f;
  int32_t max_token_id = 0;
  float max_log_prob = -std::numeric_limits<float>::infinity();

  for (int32_t i = 0; i < meta.vocab_size; ++i) {
    float exp_val = std::exp(p_logits[i] - max_logit);
    sum_exp += exp_val;
    float log_prob = p_logits[i] - max_logit;  // will subtract log_sum later
    if (log_prob > max_log_prob) {
      max_log_prob = log_prob;
      max_token_id = i;
    }
  }

  float log_sum = max_logit + std::log(sum_exp);
  max_log_prob = max_log_prob + max_logit - log_sum;  // correct the log_prob

  // Fill distribution if requested
  if (full_distribution != nullptr) {
    full_distribution->resize(meta.vocab_size);
    for (int32_t i = 0; i < meta.vocab_size; ++i) {
      (*full_distribution)[i] = p_logits[i] - log_sum;
    }
  }

  return {max_token_id, max_log_prob};
}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
- flutter/sherpa_onnx/lib/src/offline_recognizer.dart (6 hunks)
- flutter/sherpa_onnx/lib/src/offline_stream.dart (1 hunks)
- flutter/sherpa_onnx/lib/src/online_recognizer.dart (3 hunks)
- flutter/sherpa_onnx/lib/src/online_stream.dart (1 hunks)
- flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart (4 hunks)
- flutter/sherpa_onnx/pubspec.yaml (1 hunks)
- sherpa-onnx/c-api/c-api.cc (1 hunks)
- sherpa-onnx/c-api/c-api.h (1 hunks)
- sherpa-onnx/csrc/offline-ctc-decoder.h (1 hunks)
- sherpa-onnx/csrc/offline-moonshine-decoder.h (1 hunks)
- sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (4 hunks)
- sherpa-onnx/csrc/offline-recognizer-canary-impl.h (9 hunks)
- sherpa-onnx/csrc/offline-recognizer-ctc-impl.h (2 hunks)
- sherpa-onnx/csrc/offline-recognizer-moonshine-impl.h (1 hunks)
- sherpa-onnx/csrc/offline-recognizer-whisper-impl.h (2 hunks)
- sherpa-onnx/csrc/offline-stream.cc (1 hunks)
- sherpa-onnx/csrc/offline-stream.h (1 hunks)
- sherpa-onnx/csrc/offline-whisper-decoder.h (1 hunks)
- sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (4 hunks)
- sherpa-onnx/csrc/online-recognizer-transducer-nemo-impl.h (2 hunks)
- sherpa-onnx/csrc/online-recognizer.h (1 hunks)
- sherpa-onnx/csrc/online-transducer-decoder.cc (2 hunks)
- sherpa-onnx/csrc/online-transducer-decoder.h (1 hunks)
- sherpa-onnx/csrc/online-transducer-greedy-search-decoder.cc (2 hunks)
- sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.cc (3 hunks)
- sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.h (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-06T04:23:50.237Z
Learnt from: litongjava
Repo: k2-fsa/sherpa-onnx PR: 2440
File: sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/core/Core.java:4-6
Timestamp: 2025-08-06T04:23:50.237Z
Learning: The sherpa-onnx JNI library files are stored in Hugging Face repository at https://huggingface.co/csukuangfj/sherpa-onnx-libs under versioned directories like jni/1.12.7/, and the actual Windows JNI library filename is "sherpa-onnx-jni.dll" as defined in Core.java constants.
Applied to files:
- flutter/sherpa_onnx/pubspec.yaml
- flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart
📚 Learning: 2025-08-06T04:18:47.981Z
Learnt from: litongjava
Repo: k2-fsa/sherpa-onnx PR: 2440
File: sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/core/Core.java:4-6
Timestamp: 2025-08-06T04:18:47.981Z
Learning: In sherpa-onnx Java API, the native library names in Core.java (WIN_NATIVE_LIBRARY_NAME = "sherpa-onnx-jni.dll", UNIX_NATIVE_LIBRARY_NAME = "libsherpa-onnx-jni.so", MACOS_NATIVE_LIBRARY_NAME = "libsherpa-onnx-jni.dylib") are copied directly from the compiled binary filenames and should not be changed to match other libraries' naming conventions.
Applied to files:
- flutter/sherpa_onnx/pubspec.yaml
- flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart
🧬 Code graph analysis (11)
sherpa-onnx/csrc/online-recognizer.h (1)
sherpa-onnx/csrc/math.h (1)
float(52-72)
sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.cc (2)
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (1)
full_vocab_probs (92-92)

sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (1)

full_vocab_probs (158-158)
sherpa-onnx/csrc/offline-recognizer-moonshine-impl.h (1)
sherpa-onnx/csrc/offline-stream.cc (2)
r (257-257), r (257-257)
sherpa-onnx/c-api/c-api.h (1)
sherpa-onnx/c-api/c-api.cc (6)
SherpaOnnxOnlineStreamGetVocabLogProbs (757-780)
SherpaOnnxOnlineStreamGetVocabLogProbs (757-758)
SherpaOnnxOfflineStreamGetVocabLogProbs (782-806)
SherpaOnnxOfflineStreamGetVocabLogProbs (782-783)
SherpaOnnxDestroyVocabLogProbs (808-814)
SherpaOnnxDestroyVocabLogProbs (808-809)
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (1)
sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (1)
full_vocab_probs(158-158)
sherpa-onnx/csrc/offline-recognizer-ctc-impl.h (1)
sherpa-onnx/csrc/offline-stream.cc (2)
r (257-257), r (257-257)
sherpa-onnx/csrc/online-transducer-decoder.h (1)
sherpa-onnx/csrc/math.h (1)
float(52-72)
sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (1)
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (1)
full_vocab_probs(92-92)
sherpa-onnx/csrc/online-transducer-greedy-search-decoder.cc (2)
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (1)
full_vocab_probs (92-92)

sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (1)

full_vocab_probs (158-158)
sherpa-onnx/csrc/offline-stream.h (1)
sherpa-onnx/csrc/math.h (1)
float(52-72)
sherpa-onnx/csrc/offline-recognizer-canary-impl.h (4)
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (1)
full_vocab_probs (92-92)

sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (1)

full_vocab_probs (158-158)

sherpa-onnx/csrc/onnx-utils.cc (2)

View (188-227), View (188-188)

sherpa-onnx/csrc/offline-stream.cc (2)

r (257-257), r (257-257)
🪛 Cppcheck (2.18.0)
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc
[warning] 72-72: Invalid std (invalidFunctionArg)
sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc
[warning] 44-44: Invalid std (invalidFunctionArg)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: ubuntu-24.04 3.11
- GitHub Check: ubuntu-24.04 3.9
- GitHub Check: Release shared tts-ON
- GitHub Check: ubuntu-24.04 3.12
- GitHub Check: ubuntu-24.04 3.10
- GitHub Check: Release static tts-ON
- GitHub Check: Release shared tts-OFF
- GitHub Check: Debug static tts-ON
- GitHub Check: Release shared-OFF tts-OFF
- GitHub Check: Release shared-OFF tts-ON
- GitHub Check: ubuntu-24.04 3.8
- GitHub Check: Debug shared tts-OFF
- GitHub Check: Debug shared-OFF tts-ON
- GitHub Check: rknn shared ON
- GitHub Check: ubuntu-24.04 3.13
- GitHub Check: Release shared-ON tts-OFF
- GitHub Check: Debug shared-ON tts-OFF
- GitHub Check: Debug shared-OFF tts-OFF
- GitHub Check: Debug shared-ON tts-ON
- GitHub Check: Release shared-ON tts-ON
🔇 Additional comments (28)
sherpa-onnx/csrc/online-transducer-decoder.cc (1)
43-43: LGTM!

The copy and move assignment operators correctly propagate vocab_log_probs following the same pattern used for ys_probs, lm_probs, and context_scores.

Also applies to: 71-71
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (1)
65-83: Log-softmax implementation is correct but has minor redundancy.

The numerical stability approach (subtracting max_logit) is appropriate. However, the loop at lines 77-83 recomputes p[j] - log_sum, which was already computed for p[0] on line 75. Consider combining the max-finding with the log-prob computation in a single pass after computing log_sum.

The static analysis warning about std::log(sum_exp) at line 72 is a false positive since sum_exp is always positive when vocab_size > 0 (as exp() returns positive values).

sherpa-onnx/c-api/c-api.cc (2)
808-814: LGTM!

The destructor correctly handles the null case and properly frees both the inner array (log_probs) with delete[] and the struct itself with delete, matching the allocation pattern in the getter functions.
757-780: No robustness issue here.

The vocab_log_probs data structure is populated by decoders where each inner vector is created with exactly vocab_size elements from model output, which is a fixed property per model. Since all tokens use the same model with the same output shape, all inner vectors are guaranteed to have identical sizes by design. The std::copy operation is safe: if any vector had a different size, it would indicate data corruption in the decoder itself, which would be a more fundamental problem than a buffer overflow in this code.
30-30: LGTM!

The assignment correctly propagates token_log_probs from the decoder result to the recognition result, consistent with how other fields are copied.
sherpa-onnx/csrc/online-recognizer.h (1)
45-45: LGTM!

The vocab_log_probs field addition enables per-token vocabulary distribution access for online recognition. The 2D structure (tokens × vocab_size) is appropriate for storing full distributions.
sherpa-onnx/csrc/online-transducer-decoder.h (1)
33-36: LGTM!

The vocab_log_probs field addition is well-documented with clear shape specification and usage conditions. The 2D vector structure appropriately stores per-token vocabulary distributions.
sherpa-onnx/csrc/offline-stream.h (1)
52-54: LGTM!

The addition of token_log_probs (per-token confidence) and vocab_log_probs (full vocabulary distributions) to OfflineRecognitionResult enables comprehensive confidence scoring and uncertainty estimation as described in the PR objectives.
flutter/sherpa_onnx/lib/src/offline_stream.dart (1)
36-69: LGTM!

The implementation correctly:
- Validates function pointers and returned data
- Calculates flattened array indices (line 59)
- Cleans up native resources (line 67)
- Handles null cases gracefully
The Map<String, List> structure with "token_i" keys provides clear token identification.
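The flattened-index arithmetic referenced above (index = tokenIdx * vocabSize + vocabIdx) is plain row-major addressing and can be checked with a tiny sketch; FlatIndex is an illustrative helper, not project code.

```cpp
#include <cassert>
#include <cstdint>

// Row-major flattening of a [num_tokens][vocab_size] matrix into one array:
// token rows are laid out back-to-back, so a (token, vocab) pair maps to
// token_idx * vocab_size + vocab_idx.
int32_t FlatIndex(int32_t token_idx, int32_t vocab_idx, int32_t vocab_size) {
  return token_idx * vocab_size + vocab_idx;
}
```

The same arithmetic appears on both sides of the FFI boundary: the C++ side writes row t at offset t * vocab_size, and the Dart side reads it back with the identical formula.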
sherpa-onnx/csrc/offline-recognizer-ctc-impl.h (1)
34-62: LGTM!

The implementation correctly maintains alignment between r.tokens and r.token_log_probs. When tokens are filtered (e.g., SIL tokens at line 41), both the token and its corresponding log probability are skipped together, preserving the one-to-one correspondence.
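The skip-together invariant described here can be sketched in isolation (hypothetical names, not the actual ctc-impl code): whenever a token is dropped, its log-prob must be dropped in the same iteration so indices stay aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Drop filtered tokens (e.g. a SIL symbol) and their log-probs together so
// the two output vectors stay index-aligned.
void FilterTokens(const std::vector<std::string> &tokens,
                  const std::vector<float> &log_probs,
                  const std::string &sil,
                  std::vector<std::string> *out_tokens,
                  std::vector<float> *out_log_probs) {
  for (size_t i = 0; i < tokens.size(); ++i) {
    if (tokens[i] == sil) continue;  // skip the token AND its log-prob
    out_tokens->push_back(tokens[i]);
    if (i < log_probs.size()) out_log_probs->push_back(log_probs[i]);
  }
}
```

Copying log_probs wholesale and then filtering only tokens would break this invariant, which is exactly the misalignment flagged elsewhere in this review for the canary implementation.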
157-172: The token/log_prob alignment is correctly preserved through filtering.

The loop correctly maintains 1:1 correspondence between r.tokens and r.token_log_probs: when a token is filtered by !sym_table.Contains(i), the loop continues without appending to either vector, so both are skipped symmetrically. The guard on line 169 is defensive but unnecessary given the decoder's guarantee of equal-sized source vectors. No alignment issue exists.
16-24: LGTM!

The new token_log_probs and vocab_log_probs fields are correctly typed and appropriately placed in the result struct to capture per-token and full-vocabulary log probability distributions.
52-59: LGTM!

The temperature_scale parameter is correctly plumbed from config_ to the decoder constructor, consistent with the updated OnlineTransducerGreedySearchNeMoDecoder signature.
72-79: LGTM!

The templated constructor path correctly mirrors the primary constructor's handling of temperature_scale.
142-160: LGTM!

The per-token log probability export and full vocabulary distribution storage are correctly implemented with a single pass through temperature scaling and LogSoftmax. This serves as the correct pattern that other decoders (like the NeMo decoder) should follow.

flutter/sherpa_onnx/lib/src/offline_recognizer.dart (3)
610-613: LGTM!

The JSON parsing for token_log_probs correctly handles null values with a safe default empty list and properly converts numeric values to doubles.
843-869: LGTM!

The getResult method correctly handles tokenLogProbs in both the error path (empty result) and the success path (parsed from JSON), ensuring consistency.
597-604: All instantiations of OfflineRecognizerResult already provide the tokenLogProbs parameter. The three usages in the file (fromJson factory at line 607, empty result at line 848, and parsedJson construction at line 862) all include this parameter. No breaking changes were introduced.

flutter/sherpa_onnx/lib/src/online_recognizer.dart (2)
329-330: LGTM!

The ysProbs parameter is optional with a sensible default value, which avoids breaking existing code while supporting the new feature.
508-510: LGTM!

The parsing of ysProbs in getResult correctly handles null values with a safe default empty list.
18-45: LGTM with a note on the static analysis hint.

The CalculateLogProb function correctly implements log-softmax using the log-sum-exp trick for numerical stability.

Note: the static analysis tool flags line 44 as "Invalid std", but this appears to be a false positive. std::log is a valid function from <cmath> (included on line 8).
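For reference, the log-sum-exp trick relied on here can be shown as a standalone sketch (LogSoftmax below is illustrative, not the decoder's actual CalculateLogProb): subtracting the max logit before exponentiating keeps std::exp from overflowing while leaving the result mathematically unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Numerically stable log-softmax: log p_i = x_i - (max + log sum exp(x_j - max)).
std::vector<float> LogSoftmax(const std::vector<float> &logits) {
  float max_logit = *std::max_element(logits.begin(), logits.end());
  double sum_exp = 0.0;
  for (float v : logits) sum_exp += std::exp(v - max_logit);
  float log_sum = max_logit + static_cast<float>(std::log(sum_exp));
  std::vector<float> out(logits.size());
  for (size_t i = 0; i < logits.size(); ++i) out[i] = logits[i] - log_sum;
  return out;
}
```

Without the max subtraction, logits around 1000 would make std::exp return infinity; with it, the same inputs produce finite, correctly normalized log probabilities.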
204-208: LGTM!

The addition of the IsMultiLingual() guard before accessing id2lang is good defensive programming and prevents potential issues with non-multilingual models.
60-64: LGTM!

The early return for empty encoder output is good defensive programming and prevents potential crashes downstream.
146-152: LGTM!

The Convert method signature update to include token_log_probs is consistent with changes in other files, and the assignment is straightforward.
218-220: LGTM!

The addition of an early return when the feature vector is empty prevents potential issues downstream and aligns with the empty check in DecodeStream.

Also applies to: 235-237
flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart (3)
680-689: LGTM!

The SherpaOnnxVocabLogProbs struct is correctly defined for FFI with appropriate types. The flattened 2D array design (via Pointer<Float>) is the right approach for C interop.
1433-1445: LGTM!

The typedefs for vocab log probs functions are correctly defined with appropriate Native and Dart versions.
2325-2341: LGTM!

The initialization of vocab log probs bindings follows the existing pattern in the file and correctly uses lookupFunction to bind the native functions.
sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.cc
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sherpa-onnx/csrc/offline-recognizer-canary-impl.h (1)
150-173: Critical: Token log probabilities misalignment when tokens are skipped.

The method assigns token_log_probs wholesale at line 155, but then iterates through tokens and skips some (line 162: continue when !symbol_table_.Contains(token_id)). This breaks the correspondence between r.tokens and r.token_log_probs indices. The same issue affects vocab_log_probs alignment (set at line 131).

Apply this diff to fix the alignment:
  OfflineRecognitionResult Convert(
      const std::vector<int32_t> &tokens,
      const std::vector<float> &token_log_probs) const {
    OfflineRecognitionResult r;
    r.tokens.reserve(tokens.size());
-   r.token_log_probs = token_log_probs;

    std::string text;
    for (size_t idx = 0; idx < tokens.size(); ++idx) {
      int32_t token_id = tokens[idx];
      if (!symbol_table_.Contains(token_id)) {
        continue;
      }

      const auto &s = symbol_table_[token_id];
      text += s;
      r.tokens.push_back(s);
+
+     if (idx < token_log_probs.size()) {
+       r.token_log_probs.push_back(token_log_probs[idx]);
+     }
    }

    r.text = std::move(text);
    return r;
  }
Note: You'll also need to filter vocab_log_probs similarly in DecodeStream before assigning to r.vocab_log_probs at line 131, or ensure tokens are never skipped by guaranteeing all generated tokens exist in the symbol table.
🧹 Nitpick comments (4)
sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (1)
121-145: Log-softmax implementation is correct and addresses previous review feedback.

The current implementation computes log-softmax once per iteration and then extracts the log probability for the selected token, which addresses the redundant computation concern raised in the previous review.
Optional refactoring opportunities:
1. Max-finding consistency: lines 123-128 manually find the max logit, while lines 101-102 use std::max_element. Consider using std::max_element here as well for consistency:

   float max_logit = *std::max_element(current_logits, current_logits + vocab_size);

2. Allocation efficiency: full_vocab_probs is allocated in every loop iteration (line 122). For large vocabularies, consider pre-allocating it outside the loop:

   std::vector<float> full_vocab_probs(vocab_size);
   for (int32_t i = 0; i < num_possible_tokens; ++i) {
     // ... reuse full_vocab_probs in each iteration

Note on static analysis: the Cppcheck warning on line 134 about "Invalid std" is a false positive; std::log is a valid function from <cmath>.
65-96: Consider optimizing the duplicate log-softmax computation.

The log-sum-exp implementation is correct and numerically stable. However, the log-softmax is computed twice: once while finding the max token (lines 68-83) and again while storing the full vocabulary distribution (lines 92-96).
Apply this diff to compute log-softmax once:
- // Compute log-softmax and find max token with confidence
- float max_logit = *std::max_element(p, p + vocab_size);
-
- float sum_exp = 0.0f;
- for (int32_t j = 0; j < vocab_size; ++j) {
-   sum_exp += std::exp(p[j] - max_logit);
- }
- float log_sum = max_logit + std::log(sum_exp);
-
- int32_t max_token_id = 0;
- float max_log_prob = p[0] - log_sum;
-
- for (int32_t j = 1; j < vocab_size; ++j) {
-   float log_prob = p[j] - log_sum;
-   if (log_prob > max_log_prob) {
-     max_log_prob = log_prob;
-     max_token_id = j;
-   }
- }
-
- if (max_token_id == eos) {
-   break;
- }
- tokens.push_back(max_token_id);
- token_log_probs.push_back(max_log_prob);
-
- // Store full vocabulary distribution (already log-softmaxed)
- std::vector<float> full_vocab_probs(vocab_size);
- for (int32_t j = 0; j < vocab_size; ++j) {
-   full_vocab_probs[j] = p[j] - log_sum;
- }
- vocab_log_probs.push_back(std::move(full_vocab_probs));
+ // Compute log-softmax once for both max selection and storage
+ float max_logit = *std::max_element(p, p + vocab_size);
+
+ float sum_exp = 0.0f;
+ for (int32_t j = 0; j < vocab_size; ++j) {
+   sum_exp += std::exp(p[j] - max_logit);
+ }
+ float log_sum = max_logit + std::log(sum_exp);
+
+ // Compute log-softmax for all tokens and find max in single pass
+ std::vector<float> full_vocab_probs(vocab_size);
+ int32_t max_token_id = 0;
+ float max_log_prob = p[0] - log_sum;
+ full_vocab_probs[0] = max_log_prob;
+
+ for (int32_t j = 1; j < vocab_size; ++j) {
+   float log_prob = p[j] - log_sum;
+   full_vocab_probs[j] = log_prob;
+   if (log_prob > max_log_prob) {
+     max_log_prob = log_prob;
+     max_token_id = j;
+   }
+ }
+
+ if (max_token_id == eos) {
+   break;
+ }
+ tokens.push_back(max_token_id);
+ token_log_probs.push_back(max_log_prob);
+ vocab_log_probs.push_back(std::move(full_vocab_probs));
53-66: Consider adding defensive validation for native values.

The method reads numTokens and vocabSize from native code and immediately uses them in nested loops without validation. If native code returns negative or unexpectedly large values, this could cause issues.

Apply this diff to add defensive checks:

 final vocabLogProbs = vocabPtr.ref;
 final numTokens = vocabLogProbs.numTokens;
 final vocabSize = vocabLogProbs.vocabSize;
+
+if (numTokens < 0 || vocabSize < 0 || numTokens > 10000 || vocabSize > 100000) {
+  destroyFunc(vocabPtr);
+  return null;
+}

 final Map<String, List<double>> result = {};
101-119: Remove unused variable `tokens_generated`.

The variable `tokens_generated` is declared and incremented but never read. It serves no purpose in the current implementation.

Apply this diff:

```diff
 if (max_token_id != eos) {
-  int32_t tokens_generated = 0;
   for (int32_t i = decoder_input.size() + 1;
        i <= decoder_input.size() + num_tokens; ++i) {
     if (tokens.back() == eos) {
       break;
     }

     std::tie(logits, decoder_states) =
         RunDecoder(tokens.back(), i, std::move(decoder_states),
                    View(&enc_states), View(&enc_mask));

     std::vector<float> next_full_vocab_probs;
     auto [next_token_id, next_confidence] =
         GetMaxTokenIdWithConfidence(&logits, &next_full_vocab_probs);

     tokens.push_back(next_token_id);
     token_log_probs.push_back(next_confidence);
     vocab_log_probs.push_back(std::move(next_full_vocab_probs));
-    tokens_generated++;
   }
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- flutter/sherpa_onnx/lib/src/online_recognizer.dart (3 hunks)
- flutter/sherpa_onnx/lib/src/online_stream.dart (1 hunks)
- flutter/sherpa_onnx/pubspec.yaml (1 hunks)
- sherpa-onnx/csrc/offline-ctc-decoder.h (1 hunks)
- sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (4 hunks)
- sherpa-onnx/csrc/offline-recognizer-canary-impl.h (9 hunks)
- sherpa-onnx/csrc/offline-recognizer-moonshine-impl.h (1 hunks)
- sherpa-onnx/csrc/offline-recognizer-whisper-impl.h (2 hunks)
- sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (3 hunks)
- sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.cc (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- sherpa-onnx/csrc/offline-recognizer-moonshine-impl.h
- sherpa-onnx/csrc/offline-ctc-decoder.h
- flutter/sherpa_onnx/lib/src/online_recognizer.dart
- sherpa-onnx/csrc/offline-recognizer-whisper-impl.h
- flutter/sherpa_onnx/pubspec.yaml
🧰 Additional context used
🧬 Code graph analysis (4)
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (1)
- sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (1): `full_vocab_probs` (122-122)
sherpa-onnx/csrc/offline-recognizer-canary-impl.h (3)
- sherpa-onnx/csrc/offline-canary-model.cc (2): `tokens` (83-117), `tokens` (83-85)
- sherpa-onnx/csrc/offline-recognizer-ctc-impl.h (2): `Convert` (26-323), `` (193-199)
- sherpa-onnx/csrc/offline-stream.cc (2): `r` (257-257), `r` (257-257)
sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (1)
- sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (1): `full_vocab_probs` (92-92)
sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.cc (3)
- sherpa-onnx/csrc/online-stream.cc (8): `r` (68-68), `r` (68-68), `r` (72-74), `r` (72-72), `r` (93-93), `r` (93-93), `r` (95-97), `r` (95-95)
- sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (1): `full_vocab_probs` (122-122)
- sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (1): `full_vocab_probs` (92-92)
🪛 Cppcheck (2.18.0)
sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc
[warning] 72-72: Invalid std (invalidFunctionArg)
sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc
[warning] 134-134: Invalid std (invalidFunctionArg)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: Release shared tts-OFF
- GitHub Check: Debug static tts-OFF
- GitHub Check: Release shared tts-ON
- GitHub Check: Debug shared-OFF tts-ON
- GitHub Check: Debug shared-ON tts-ON
- GitHub Check: Debug shared-OFF tts-OFF
- GitHub Check: Release shared-ON tts-OFF
- GitHub Check: Debug shared-ON tts-OFF
- GitHub Check: Release shared-ON tts-ON
- GitHub Check: Release shared-OFF tts-OFF
- GitHub Check: Release shared-OFF tts-ON
- GitHub Check: swift (macos-latest)
- GitHub Check: rknn shared ON
- GitHub Check: rknn shared OFF
- GitHub Check: ubuntu-24.04 3.13
- GitHub Check: ubuntu-24.04 3.12
- GitHub Check: ubuntu-24.04 3.11
- GitHub Check: ubuntu-24.04 3.9
- GitHub Check: ubuntu-24.04 3.8
- GitHub Check: ubuntu-24.04 3.10
🔇 Additional comments (16)
sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.cc (2)
90-99: Looks good: single-pass temperature + LogSoftmax, and full-vocab log-probs capture.
This also appears to address the previously-reported duplicate “export per-token log scores” block (only one remains now).
138-139: Call site threading `temperature_scale_` into `DecodeOne()` is consistent.

Just ensure `temperature_scale_` is validated at config boundaries too (so all decoders behave consistently).

sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc (5)
8-9: LGTM!

The new includes are necessary for the log-softmax computation (`std::log`, `std::exp`) and max logit initialization (`std::numeric_limits`).
92-103: LGTM!

The variable renaming (`logits_tensor`, `current_token_id`) improves code clarity, and the initialization logic correctly identifies the first token using `std::max_element`.
108-110: LGTM!

The new data structures for tracking per-token log probabilities and full vocabulary distributions are properly declared and will be populated in the decoding loop.
147-176: LGTM!

The loop continuation logic correctly:
- Creates the next token input tensor with the selected token
- Forwards through the decoder with proper state propagation
- Updates the offset and checks context boundaries
- Computes the next token for the subsequent iteration
178-189: LGTM!

The result assembly correctly:

- Guards language lookup with `IsMultiLingual()` check
- Populates the predicted tokens
- Populates the new `token_log_probs` and `vocab_log_probs` fields using `std::move` for efficiency

sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc (4)
8-8: LGTM!

The `<cmath>` include is necessary for `std::log` and `std::exp` used in the log-softmax computation.
42-43: LGTM!

The vector declarations appropriately support the PR's objective of exposing per-token log-probabilities and full vocabulary distributions.
119-120: LGTM!

The result population correctly uses `std::move` to efficiently transfer the computed log-probability data to the result structure.
66-72: The cppcheck warning "Invalid std (invalidFunctionArg)" at line 72 is a false positive.

The `std::log(sum_exp)` call is valid: `sum_exp` is guaranteed to be strictly positive as it's a sum of `std::exp()` values. This log-sum-exp pattern is used consistently throughout the codebase without issues. No action is required.

sherpa-onnx/csrc/offline-recognizer-canary-impl.h (5)
60-64: LGTM! Good defensive guard.

The early return for empty encoder output prevents potential crashes or undefined behavior in downstream processing.
79-92: LGTM! Vocab log probs now properly collected.

This addresses the previous review feedback by capturing the full vocabulary distribution for the first token. The three parallel vectors (tokens, token_log_probs, vocab_log_probs) are correctly initialized.
123-128: LGTM! Proper cleanup of EOS token.

Correctly removes the EOS token and its associated confidence and vocab distribution from all three parallel vectors, maintaining alignment.
175-213: LGTM! Correct log-softmax implementation.

The method correctly computes log-softmax using the log-sum-exp trick for numerical stability. The optional full distribution capture and max token identification are both implemented correctly.
222-224: LGTM! Good defensive guard.

The early return for empty feature vectors prevents potential downstream errors and is consistent with the empty encoder output guard in `DecodeStream`.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @dart-api-examples/non-streaming-asr/pubspec.yaml:
- Around line 13-14: The pubspec.yaml currently uses a local path override for
the sherpa_onnx dependency which will break consumers; replace the local path
entry (sherpa_onnx: path: ../../flutter/sherpa_onnx) with the published
versioned dependency (e.g., sherpa_onnx: ^X.Y.Z) or otherwise remove the path
override so the package resolves from pub.dev, and if the new API isn’t
published yet, add a TODO to revert this path change once the updated
sherpa_onnx is released.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
dart-api-examples/non-streaming-asr/pubspec.yaml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-06T04:23:50.237Z
Learnt from: litongjava
Repo: k2-fsa/sherpa-onnx PR: 2440
File: sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/core/Core.java:4-6
Timestamp: 2025-08-06T04:23:50.237Z
Learning: The sherpa-onnx JNI library files are stored in Hugging Face repository at https://huggingface.co/csukuangfj/sherpa-onnx-libs under versioned directories like jni/1.12.7/, and the actual Windows JNI library filename is "sherpa-onnx-jni.dll" as defined in Core.java constants.
Applied to files:
dart-api-examples/non-streaming-asr/pubspec.yaml
CodeRabbit correctly identified that using a local path dependency would break for users cloning the repository. Reverted to ^1.12.20. Added TODO comment explaining that:

- This example uses MedASR API not yet published to pub.dev
- Users can temporarily use local path for testing
- Or wait until the updated package is released

The example will have errors until MedASR support is published.
- Updated all version numbers to 1.12.21
- Resolved conflicts in pubspec.yaml files
- Kept helpful TODO comment for MedASR example
- Kept descriptive comments for local development paths
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @dart-api-examples/non-streaming-asr/pubspec.yaml:
- Around line 13-15: The TODO comment incorrectly references "MedASR API";
update the comment to refer to the vocab log-probs functionality introduced in
this PR (e.g., getVocabLogProbs(), SherpaOnnxVocabLogProbs) or clarify if MedASR
is actually intended; change the text so it consistently mentions the
"Vocabulary Log Probabilities API" and notes the same instruction about
reverting to a local path for local testing or waiting for the package release.
🧹 Nitpick comments (2)
sherpa-onnx/c-api/c-api.cc (2)
782-805: Consider validating consistent inner vector dimensions.

The code assumes all inner vectors in `vocab_log_probs` have the same size as the first entry. If this invariant is violated, `std::copy` at Line 798 could write out-of-bounds or leave uninitialized data.

While decoders should always produce consistent dimensions, a defensive validation would prevent subtle memory corruption bugs.
♻️ Optional: Add size validation
```diff
 auto vocab_probs = new SherpaOnnxVocabLogProbs;
 vocab_probs->num_tokens = result.vocab_log_probs.size();
 vocab_probs->vocab_size = result.vocab_log_probs[0].size();
+
+// Validate all inner vectors have consistent size
+for (int32_t i = 1; i < vocab_probs->num_tokens; ++i) {
+  if (static_cast<int32_t>(result.vocab_log_probs[i].size()) !=
+      vocab_probs->vocab_size) {
+    delete vocab_probs;
+    return nullptr;
+  }
+}
+
 // Flatten the 2D vector into a 1D array
```
807-831: Same inner vector size assumption applies here.

This function has the same assumption as the online variant regarding consistent inner vector dimensions. Consider applying the same validation if added to `SherpaOnnxOnlineStreamGetVocabLogProbs`.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- dart-api-examples/non-streaming-asr/pubspec.yaml
- flutter/sherpa_onnx/pubspec.yaml
- sherpa-onnx/c-api/c-api.cc
- sherpa-onnx/c-api/c-api.h
🚧 Files skipped from review as they are similar to previous changes (1)
- sherpa-onnx/c-api/c-api.h
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-06T04:23:50.237Z
Learnt from: litongjava
Repo: k2-fsa/sherpa-onnx PR: 2440
File: sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/core/Core.java:4-6
Timestamp: 2025-08-06T04:23:50.237Z
Learning: The sherpa-onnx JNI library files are stored in Hugging Face repository at https://huggingface.co/csukuangfj/sherpa-onnx-libs under versioned directories like jni/1.12.7/, and the actual Windows JNI library filename is "sherpa-onnx-jni.dll" as defined in Core.java constants.
Applied to files:
flutter/sherpa_onnx/pubspec.yaml
📚 Learning: 2025-08-06T04:18:47.981Z
Learnt from: litongjava
Repo: k2-fsa/sherpa-onnx PR: 2440
File: sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/core/Core.java:4-6
Timestamp: 2025-08-06T04:18:47.981Z
Learning: In sherpa-onnx Java API, the native library names in Core.java (WIN_NATIVE_LIBRARY_NAME = "sherpa-onnx-jni.dll", UNIX_NATIVE_LIBRARY_NAME = "libsherpa-onnx-jni.so", MACOS_NATIVE_LIBRARY_NAME = "libsherpa-onnx-jni.dylib") are copied directly from the compiled binary filenames and should not be changed to match other libraries' naming conventions.
Applied to files:
flutter/sherpa_onnx/pubspec.yaml
🔇 Additional comments (2)
flutter/sherpa_onnx/pubspec.yaml (1)
34-56: LGTM! Helpful local development guidance.

The added comments consistently document how to switch to local path dependencies for testing across all platform packages. This is a good developer experience improvement.
sherpa-onnx/c-api/c-api.cc (1)
833-839: LGTM!

The destroy function correctly handles null input and properly frees both the inner array and the struct, matching the allocation pattern in the getter functions.
CodeRabbit correctly identified that the TODO comment should reference the vocab log probs API (getVocabLogProbs, SherpaOnnxVocabLogProbs) introduced in this PR, not other unrelated APIs.
Actionable comments posted: 0
🧹 Nitpick comments (1)
dart-api-examples/non-streaming-asr/pubspec.yaml (1)
13-18: LGTM - Helpful TODO comment for unpublished API.

The comment clearly explains the situation with the Vocabulary Log Probabilities API. Consider showing the exact local path syntax inline to make it more actionable for developers:
💡 Optional: Include the exact syntax for local testing
```diff
 # TODO: This example directory includes examples that use the Vocabulary Log
 # Probabilities API (getVocabLogProbs, SherpaOnnxVocabLogProbs) which is not
 # yet published to pub.dev.
-# Revert to local path (../../flutter/sherpa_onnx) for local testing,
+# For local testing, replace the sherpa_onnx line below with:
+#   sherpa_onnx:
+#     path: ../../flutter/sherpa_onnx
 # or wait until the updated package is released.
 sherpa_onnx: ^1.12.21
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
dart-api-examples/non-streaming-asr/pubspec.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: Release shared-ON tts-OFF
- GitHub Check: Release shared-OFF tts-OFF
- GitHub Check: Debug shared-ON tts-ON
- GitHub Check: swift (macos-15-intel)
- GitHub Check: ubuntu-24.04 3.8
- GitHub Check: ubuntu-24.04 3.9
- GitHub Check: ubuntu-24.04 3.13
- GitHub Check: ubuntu-24.04 3.12
- GitHub Check: ubuntu-24.04 3.11
- GitHub Check: ubuntu-24.04 3.10
- GitHub Check: rknn shared ON
- GitHub Check: rknn shared OFF
I think it's ready to merge. The failing tests don't seem to have anything to do with my PR (as far as I can tell), and I'm not sure why on this one instance the style_check didn't complete.
Resolves merge conflicts in 5 files, preserving our custom per-token log probability additions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Captures the max log probability when emitting non-blank, non-repeat tokens in the CTC decoder. This enables real per-word confidence scores for SenseVoice and MedASR (NeMo CTC), completing log prob support across all decoder types in our fork. Also propagates token_log_probs through ConvertSenseVoiceResult() so SenseVoice results include confidence scores in the final OfflineRecognitionResult. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
flutter/sherpa_onnx/lib/src/offline_recognizer.dart (1)
816-831: ⚠️ Potential issue | 🟠 Major

Free native config on all failure paths in the `OfflineRecognizer` factory.

`convertConfig()` allocates many native strings, but the current throw paths skip `freeConfig(c)` (both when bindings are uninitialized and when native creation returns `nullptr`), causing leaks on failure/retry paths.

💡 Proposed fix
```diff
 factory OfflineRecognizer(OfflineRecognizerConfig config) {
-  final c = convertConfig(config);
-
   if (SherpaOnnxBindings.createOfflineRecognizer == null) {
     throw Exception("Please initialize sherpa-onnx first");
   }
+  final c = convertConfig(config);
-  final ptr = SherpaOnnxBindings.createOfflineRecognizer?.call(c) ?? nullptr;
-
-  if (ptr == nullptr) {
-    throw Exception(
-        "Failed to create offline recognizer. Please check your config");
-  }
-
-  freeConfig(c);
-
-  return OfflineRecognizer._(ptr: ptr, config: config);
+  try {
+    final ptr = SherpaOnnxBindings.createOfflineRecognizer!.call(c);
+    if (ptr == nullptr) {
+      throw Exception(
+          "Failed to create offline recognizer. Please check your config");
+    }
+    return OfflineRecognizer._(ptr: ptr, config: config);
+  } finally {
+    freeConfig(c);
+  }
 }
```
Verify each finding against the current code and only fix it if needed. In `@flutter/sherpa_onnx/lib/src/offline_recognizer.dart` around lines 816 - 831, The factory OfflineRecognizer currently calls convertConfig() to allocate native resources but throws before calling freeConfig(c) on failure paths; wrap the native calls in a try/finally (or ensure freeConfig(c) is invoked before every throw) so that freeConfig(c) is always called regardless of errors; specifically, after obtaining c from convertConfig(), check SherpaOnnxBindings.createOfflineRecognizer and the ptr result inside a try block and call freeConfig(c) in the finally block (or call freeConfig(c) immediately before each Exception throw), referencing OfflineRecognizer, convertConfig, freeConfig, SherpaOnnxBindings.createOfflineRecognizer and ptr to locate the code to change.
🧹 Nitpick comments (1)
flutter/sherpa_onnx/lib/src/offline_recognizer.dart (1)
1060-1070: Prefer a single parsing path for `OfflineRecognizerResult`.

`getResult()` re-implements parsing logic already present in `OfflineRecognizerResult.fromJson()`. Reusing `fromJson()` avoids drift in defaults/conversions.

♻️ Proposed refactor
```diff
-  return OfflineRecognizerResult(
-      text: parsedJson['text'],
-      tokens: List<String>.from(parsedJson['tokens']),
-      tokenLogProbs: (parsedJson['token_log_probs'] as List?)
-              ?.map((e) => (e as num).toDouble())
-              .toList() ??
-          [],
-      timestamps: List<double>.from(parsedJson['timestamps']),
-      lang: parsedJson['lang'],
-      emotion: parsedJson['emotion'],
-      event: parsedJson['event']);
+  return OfflineRecognizerResult.fromJson(
+    parsedJson as Map<String, dynamic>,
+  );
```
Verify each finding against the current code and only fix it if needed. In `@flutter/sherpa_onnx/lib/src/offline_recognizer.dart` around lines 1060 - 1070, getResult() duplicates JSON parsing logic already implemented by OfflineRecognizerResult.fromJson(), risking drift; update getResult() to call OfflineRecognizerResult.fromJson(parsedJson) (or the static factory on OfflineRecognizerResult) and return that instance instead of reconstructing fields inline, ensuring you preserve any existing null/default handling in OfflineRecognizerResult.fromJson() and remove the manual List/String/num conversions in getResult().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@flutter/sherpa_onnx/lib/src/offline_recognizer.dart`:
- Around line 816-831: The factory OfflineRecognizer currently calls
convertConfig() to allocate native resources but throws before calling
freeConfig(c) on failure paths; wrap the native calls in a try/finally (or
ensure freeConfig(c) is invoked before every throw) so that freeConfig(c) is
always called regardless of errors; specifically, after obtaining c from
convertConfig(), check SherpaOnnxBindings.createOfflineRecognizer and the ptr
result inside a try block and call freeConfig(c) in the finally block (or call
freeConfig(c) immediately before each Exception throw), referencing
OfflineRecognizer, convertConfig, freeConfig,
SherpaOnnxBindings.createOfflineRecognizer and ptr to locate the code to change.
---
Nitpick comments:
In `@flutter/sherpa_onnx/lib/src/offline_recognizer.dart`:
- Around line 1060-1070: getResult() duplicates JSON parsing logic already
implemented by OfflineRecognizerResult.fromJson(), risking drift; update
getResult() to call OfflineRecognizerResult.fromJson(parsedJson) (or the static
factory on OfflineRecognizerResult) and return that instance instead of
reconstructing fields inline, ensuring you preserve any existing null/default
handling in OfflineRecognizerResult.fromJson() and remove the manual
List/String/num conversions in getResult().
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d16afe0b-5002-48fe-8770-c94d4fb5b1c6
📒 Files selected for processing (2)
- flutter/sherpa_onnx/lib/src/offline_recognizer.dart
- flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart
- Move binding null-check before convertConfig() and wrap native call in try/finally so freeConfig() always runs on failure paths
- Replace inline JSON parsing in getResult() with fromJson() to avoid drift and improve null safety

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Revert whitespace-only changes in sherpa_onnx_bindings.dart and offline_recognizer.dart, re-applying only actual logic changes
- Fix tab characters in online_recognizer.dart
- Strip trailing whitespace from C++ files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add defensive bounds validation to offline_stream getVocabLogProbs() matching the online version
- Filter vocab_log_probs per-token in Whisper Convert() to stay aligned with tokens after timestamp token filtering
- Propagate vocab_log_probs through online transducer Convert() so the field is no longer dead on OnlineRecognizerResult
- Use double accumulator in Moonshine log-softmax to match Canary and avoid precision loss with large vocabularies
- Restore direct returns in Canary ForwardEncoder/ForwardDecoder for guaranteed copy elision

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
During previous upstream merges, conflict resolution accidentally modified files unrelated to our vocab_log_probs feature (JNI, Java examples, Kotlin, CLI tools, and several C++ files). Reset all 33 files to match upstream/master so our branch only contains the per-token log probability additions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Restore blank line between variable declarations (whisper decoder)
- Restore original ForwardDecoder call formatting (canary)
- Restore original LOGE message text (canary)
- Remove navigation comment that wasn't in upstream (whisper decoder.h)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Canary decoder: revert loop index to i=1 (matching upstream and the reference Python script). The position resets for the generation phase; starting at decoder_input.size()+1 fed wrong positional encodings to the model.
2. Standard transducer: guard against temperature_scale_ <= 0 to prevent division by zero, matching the NeMo decoder's existing guard.
3. Moonshine: filter token_log_probs and vocab_log_probs inside the token loop (like Canary/Whisper do) so they stay aligned with r.tokens when unknown symbols are skipped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Remove misleading std::move on const ref for vocab_log_probs in transducer Convert(), which was silently copying anyway.
2. Restore tokens_buf handling in NeMo transducer Manager constructor that was lost during a merge.
3. Make OnlineRecognizerResult.fromJson null-safe for text, tokens, and timestamps (matching the offline version's pattern).
4. Wrap setConfig FFI call in try/finally to ensure freeConfig runs on exception (matching the factory constructor fix).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Restore hotwords_buf parameter in OnlineRecognizerConfig constructor
- Restore hotwords_buf handling in transducer Manager constructor
- Restore feature_dim from config instead of hardcoded 80
- Remove chunk_shift addition that wasn't in upstream
- Revert variable rename (tokens_input back to tokens) in whisper decoder
- Revert IsMultiLingual() guard addition in whisper decoder
- Restore unconditional ys_log_probs serialization (matching upstream)
- Restore removed comment in whisper timestamp tracking

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Match the offline recognizer pattern by using the existing fromJson factory instead of manual field extraction, ensuring consistent null-safe parsing across both online and offline paths. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
I don't know what a style check is, but otherwise I think this PR is ready.
csukuangfj left a comment
You have a lot of redundant code. Please try to use existing functions to simplify it.
```cpp
double sum_exp = 0.0;
for (int32_t j = 0; j < vocab_size; ++j) {
  sum_exp += std::exp(p[j] - max_logit);
}
float log_sum = max_logit + static_cast<float>(std::log(sum_exp));

// Compute log-softmax for all tokens and find max in single pass
std::vector<float> full_vocab_probs(vocab_size);
int32_t max_token_id = 0;
float max_log_prob = p[0] - log_sum;
full_vocab_probs[0] = max_log_prob;

for (int32_t j = 1; j < vocab_size; ++j) {
  float log_prob = p[j] - log_sum;
  full_vocab_probs[j] = log_prob;
  if (log_prob > max_log_prob) {
    max_log_prob = log_prob;
    max_token_id = j;
  }
}
```
Too complicated. Please use existing functions to simplify it.
```cpp
OfflineRecognitionResult empty_result;
s->SetResult(empty_result);
```
Just return is enough. Every stream has a default initialized result.
Vocabulary Log Probabilities API
Summary
I needed this for fusion from two different models, and it's worked OK for me so far. I think there was also another PR/issue asking about this. I tried to add an API for accessing full vocabulary log probabilities, for use with advanced confidence scoring, entropy-based uncertainty estimation, and hypothesis fusion algorithms (e.g., weighted ROVER). This exposes the complete probability distribution over the entire vocabulary for each decoded token position.
Changes
Core API (`sherpa-onnx/c-api/`)

New C API Functions:

- `SherpaOnnxOnlineStreamGetVocabLogProbs()` - full vocab log probs for online streams
- `SherpaOnnxOfflineStreamGetVocabLogProbs()` - full vocab log probs for offline streams
- `SherpaOnnxDestroyVocabLogProbs()` - memory management for vocab log probs

New Data Structure:
Model Support
Online Models:
Offline Models:
Implementation Details
Core Changes:
- Added `vocab_log_probs` field to `OfflineRecognitionResult` and `OnlineRecognizerResult` structures
- `offline-whisper-greedy-search-decoder.cc` - captures joiner output logits
- `offline-moonshine-greedy-search-decoder.cc` - captures model output distributions
- `online-transducer-greedy-search-decoder.cc` - captures joiner logits per frame
- `online-transducer-greedy-search-nemo-decoder.cc` - NeMo-specific implementation
- `offline-recognizer-canary-impl.h` - Canary model support
- `offline-recognizer-ctc-impl.h` - CTC model support

Flutter/Dart Bindings:
- Added `getVocabLogProbs()` methods to `OnlineStream` and `OfflineStream`
- FFI bindings in `sherpa_onnx_bindings.dart`

JSON Serialization:
- Added `token_log_probs` field to JSON output (for backward compatibility)
- Maintained compatibility with upstream's `ys_log_probs` field

Use - I'm using it for Confidence-Weighted ROVER Fusion mostly (but it can also be used for Entropy-Based Uncertainty Quantification).
Testing - I'm not sure how you run tests on this repo; I couldn't really find a test suite, so I'm not sure if I broke anything (in my local testing it didn't seem to, but I certainly didn't test it with all of the platforms and programming languages that you use).
Summary by CodeRabbit