[session-manager] session_info and session_read return inconsistent results due to transient SDK API failures #4622

@B1INGO

Bug Description

session_info and session_read tools return inconsistent results for the same session ID when the OpenCode SDK HTTP API experiences transient failures. Specifically:

  1. session_info succeeds and returns session metadata
  2. session_read immediately fails with "Session not found" for the same session ID

Root Cause Analysis

The issue stems from the dual storage architecture combined with transient SDK API failures:

Storage Architecture

  • SDK Storage (SQLite DB): ~/.local/share/opencode/opencode.db - contains full session data
  • File Storage: ~/.local/share/opencode-data/opencode/storage/message/ - fallback storage

For sessions created purely in the SQLite backend (isSqliteBackend() === true), no file storage exists to fall back on.

Failure Chain

When the SDK API transiently fails:

  1. session_info → getSessionInfo() → SDK fails → fallback to getFileSessionInfo() → file doesn't exist → returns null → "Session not found"
  2. session_read → sessionExists() → SDK's session.list() fails → fallback to fileSessionExists() → file doesn't exist → returns false → "Session not found"

Each path fails whenever both of the following hold at the moment of the call:

  • SDK API is temporarily unavailable (timeout, network hiccup, race condition)
  • File storage has no data for pure-SQLite sessions
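
The chain above can be reproduced in isolation. The following is a minimal sketch, not the actual code from dist/index.js: makeSessionExists, sdkList, and the in-memory stubs are hypothetical stand-ins, and it assumes the SDK call rejects on a transient failure and that the fallback is triggered from a catch block.

```javascript
// Hypothetical stand-ins for the SDK client and file storage (assumptions, not the real code).
function makeSessionExists({ sdkList, fileSessionExists }) {
  return async function sessionExists(sessionId) {
    try {
      const sessions = await sdkList();
      return sessions.some((s) => s.id === sessionId);
    } catch {
      // SDK unavailable: fall back to file storage.
      // A pure-SQLite session has no files there, so this reports "missing".
      return fileSessionExists(sessionId);
    }
  };
}

// A session known only to the SQLite-backed SDK; file storage holds nothing.
const sqliteOnly = [{ id: "ses_example" }];
const noFiles = () => false;

// Same session, same storage; only the SDK's availability differs.
const healthy = makeSessionExists({
  sdkList: async () => sqliteOnly,
  fileSessionExists: noFiles,
});
const failing = makeSessionExists({
  sdkList: async () => { throw new Error("timeout"); },
  fileSessionExists: noFiles,
});
```

Here healthy("ses_example") resolves to true while failing("ses_example") resolves to false, even though the session data never changed. A transient error is reported as "Session not found" rather than as "SDK unavailable".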

Key Code Locations

In dist/index.js:

  • Line 107196-107208: sessionExists2() - existence check with SDK + file fallback
  • Line 107210-107223: readSessionMessages2() - message read with fallback
  • Line 107102-107132: getSdkSessionMessages() - SDK message fetch (no retry)
  • Line 107496-107521: session_read tool definition

Evidence

# Session exists in SQLite with full data
sqlite3 ~/.local/share/opencode/opencode.db "SELECT COUNT(*) FROM message WHERE session_id='ses_187699fafffe5TjZ0wrO97d21b'"
# Result: 157 messages, 485 parts

# But file storage is empty for this session
ls ~/.local/share/opencode-data/opencode/storage/message/ses_187699fafffe5TjZ0wrO97d21b
# Result: directory does not exist

Proposed Solution

Add retry logic for SDK API calls in session tools, since transient failures are expected in networked environments:

// In session_read tool (line ~107504)
execute: async (args, _context) => {
  const maxAttempts = 2;
  const retryDelay = 500; // ms

  let found = false;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Stop as soon as one existence check succeeds
    if (await resolvedDeps.sessionExists(args.session_id)) {
      found = true;
      break;
    }
    // Wait before the next attempt, but not after the last one
    if (attempt < maxAttempts - 1) {
      await new Promise(r => setTimeout(r, retryDelay));
    }
  }

  // Reuse the loop's result instead of issuing another (possibly failing) check
  if (!found) {
    return `Session not found: ${args.session_id}`;
  }
  // ... rest unchanged
}

Alternatively, add retry logic directly in getSdkSessionMessages() and getSdkSessionInfo() functions.
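
A rough sketch of what that alternative might look like is a small generic wrapper. withRetry is a hypothetical helper, not something that exists in the codebase. It assumes the wrapped SDK function rejects (throws) on a transient failure; if the function swallows errors and returns an empty result instead, this wrapper would not help as written.

```javascript
// Hypothetical helper: retries an async function that throws on transient failure.
async function withRetry(fn, { attempts = 3, delayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Pause between attempts, but not after the final one.
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

Usage might look like withRetry(() => getSdkSessionMessages(sessionId)), where the argument shape of getSdkSessionMessages() is assumed here. Putting the retry at the SDK layer would cover both session_info and session_read, and the file fallback would then run only after the retries are exhausted.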

Environment

  • OpenCode version: 1.15.12
  • oh-my-openagent: latest (installed via npm)
  • OS: Windows 11
  • Storage backend: SQLite (isSqliteBackend() === true)

Workaround

Wait a few seconds and retry the session_read call - the SDK API typically recovers quickly.

Related

This issue particularly affects sessions that were created in the SQLite backend without corresponding file storage. The fallback mechanism itself works correctly, but for pure-SQLite sessions there is nothing to fall back to when the SDK fails.
