Streaming callback onMessage sometimes contains stale/foreign text from a previous run, breaking JSON output (possible cross-run contamination after cancel/close) #449

@ishizuki-tech

Description

I’m seeing intermittent corruption in callback streaming where some onMessage(...) chunks include text that does not belong to the current request (it appears to come from a previous run). This is most visible with strict JSON-only prompts, because a single injected fragment breaks parsing.

This does not look like a simple “delta vs. accumulated output” semantics issue; it looks more like cross-run contamination (late callbacks, stale buffer/state reuse, or a request-isolation boundary being crossed).

Summary

Using callback streaming:

  • Conversation.sendMessageAsync(text, callback: MessageCallback)

…I intermittently receive onMessage(...) content containing unexpected fragments that do not match the current prompt/output and often resemble leftovers from a prior generation.

Result: strict JSON output becomes invalid (broken quotes/braces, stray prefixes/tails).

What I observe (intermittent)

  • onMessage(...) sometimes includes unexpected fragments:

    • unrelated to the current prompt/output
    • looks like leftovers (often JSON braces/quotes/partial lines)
  • The corruption is often partial, not a full replay:

    • stray prefix line
    • mid-stream injection
    • unexpected tail appended to otherwise valid JSON
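To make the failure concrete, here is a hand-written, hypothetical example of each shape (the values are invented, but the corruption patterns match what I observe):

```
expected:        {"intent":"order","qty":3}
stray prefix:    "done"}
                 {"intent":"order","qty":3}
mid-stream:      {"intent":"or"status":"ok"der","qty":3}
appended tail:   {"intent":"order","qty":3}{"par
```

A single fragment of any of these shapes is enough to make a strict JSON parser reject the whole response.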

Expected behavior

For each sendMessageAsync(text, callback) invocation:

  • All onMessage(...) content should belong to the current request only.
  • After cancel/close and a subsequent new request, prior-run output should never appear in the new stream.

High-level correlations

I don’t have a minimal repro yet, but it seems more likely when:

  • cancelProcess() / close() is called while tokens are still streaming
  • A new streaming request starts shortly after a canceled/closed run (retry/watchdog recovery)
  • Rapid teardown/recreate of the owning component (lifecycle transitions)

Notes (hypotheses)

  • Late callbacks after cancel/close (already queued events delivered later)
  • Stale buffer/state reuse across runs
  • Concurrency/order boundary issues at the callback layer

Questions

  1. Are requests strictly isolated at the callback layer (can old-run callbacks arrive after a new request begins)?
  2. After cancelProcess() / close(), is there a guarantee that no further onMessage(...) callbacks will be delivered?
  3. Is there a recommended isolation pattern (e.g., wait for terminal callback, recreate Conversation, etc.)?

Temporary client-side mitigation

Until this is clarified/fixed, I’m considering:

  • Per-run runId and ignore callbacks after finalization
  • Delay starting a new run until terminal callback or timeout
  • Hard reset by creating a new Conversation after cancellation/close
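The first two bullets can be sketched as a small guard class. Everything below is my own client-side scaffolding, not LiteRT-LM API: `begin`/`finish`/`accept`/`awaitTerminal` are names I made up to wrap whatever callbacks the library actually delivers, and the demo is single-threaded for clarity.

```kotlin
import java.util.concurrent.atomic.AtomicLong
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit

// Minimal sketch of the per-run guard (assumption: one run active at a time;
// begin() is not called concurrently with finish() of the previous run).
class StreamGuard {
    private val generation = AtomicLong(0)
    private var terminal = CountDownLatch(1)

    /** Start a new run; returns the id chunks must carry to be accepted. */
    fun begin(): Long {
        terminal = CountDownLatch(1)
        return generation.incrementAndGet()
    }

    /** Call from cancelProcess()/close() or the terminal callback. */
    fun finish() {
        generation.incrementAndGet() // late chunks of this run now fail accept()
        terminal.countDown()
    }

    /** Drop any chunk whose runId is not the current generation. */
    fun accept(runId: Long): Boolean = runId == generation.get()

    /** Block a retry until the previous run terminated (or timed out). */
    fun awaitTerminal(timeoutMs: Long): Boolean =
        terminal.await(timeoutMs, TimeUnit.MILLISECONDS)
}

fun main() {
    val guard = StreamGuard()
    val run1 = guard.begin()
    check(guard.accept(run1))        // live chunk: accepted
    guard.finish()                   // simulate cancel/close
    check(!guard.accept(run1))       // late chunk from run1: dropped
    check(guard.awaitTerminal(100))  // safe to start the retry now
    val run2 = guard.begin()
    check(guard.accept(run2))        // new run streams normally
    check(!guard.accept(run1))       // stale run1 chunk still dropped mid-run2
    println("guard ok")
}
```

In use, each onMessage(...) would first call accept(runId) with the id captured when that run started, and the retry/watchdog path would call awaitTerminal(...) before issuing the next sendMessageAsync. This only hides the contamination on the client; it does not address the underlying isolation question.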

Environment

  • LiteRT‑LM: com.google.ai.edge.litertlm:litertlm-android:0.9.0-alpha04

    • Version correlation: observed more frequently with com.google.ai.edge.litertlm:litertlm-android:0.8.0.
  • Android / device: Google Pixel 9a (API 36)

  • ABI: arm64-v8a

  • Model / backend / runtime config: same as Issue 1

  • Build environment: AGP 9.0.0, Kotlin 2.3.10 (Compose BOM 2026.02.00)

Labels

  • area:ai-behavior-bugs: Bugs where AI produces incorrect or undesirable outputs.
  • type:bug: Report of an error, flaw, or fault.