
High service call volume when agents read/write through the deployed API #300

@khaliqgant


Problem

Every read and write an agent performs goes through the deployed Relayfile service as a raw HTTP request. The SDK has no client-side caching or request deduplication, and the FUSE mount's content cache expires after only 2 seconds. This results in far more service calls than necessary, driving up compute, network, and (indirectly) upstream provider API costs.

Three independent issues compound each other:


1. SDK has no read cache

RelayFileClient.readFile() (packages/sdk/typescript/src/client.ts:1362) issues a fresh HTTP request on every call with no TTL cache and no in-flight deduplication.

Primary multiplier — materializeChangeRecord (client.ts:1238): whenever an agent uses subscribe() or open(), every non-deleted file event triggers a readFile call to hydrate the change record with current content. If 10 events arrive for the same path in quick succession, 10 reads go to the server.

Secondary multiplier — agent re-reads: AI agent workflows routinely re-read the same file several times per session (read → reason → act → verify). Each re-read is a separate round-trip with no local copy available.

// Every call below = 1 HTTP GET, even for the same path within the same second
const a = await client.readFile({ workspaceId, path: "/linear/issues/AGE-12.json" });
const b = await client.readFile({ workspaceId, path: "/linear/issues/AGE-12.json" });
// two requests sent; b is identical to a

2. No in-flight request deduplication

If two concurrent operations both need the same file (common in multi-subscriber or fan-out agent patterns), two HTTP requests go out in parallel for identical data. Neither the SDK nor the FUSE mount deduplicates concurrent in-flight reads for the same {workspaceId}:{path}.
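
A minimal sketch of how concurrent reads for the same key could be coalesced. `InFlightDeduper` is a hypothetical helper, not something that exists in the SDK today; the key format follows the {workspaceId}:{path} convention above.

```typescript
// Hypothetical helper (not part of the current SDK): concurrent callers that
// ask for the same key share one pending Promise instead of each issuing
// their own request.
class InFlightDeduper<T> {
  private readonly pending = new Map<string, Promise<T>>();

  run(key: string, load: () => Promise<T>): Promise<T> {
    const existing = this.pending.get(key);
    if (existing) return existing; // join the request that is already in flight

    // Clear the slot once the load settles (success or failure) so that
    // later calls trigger a fresh load instead of reusing a finished one.
    const promise = load().then(
      (value) => {
        this.pending.delete(key);
        return value;
      },
      (error) => {
        this.pending.delete(key);
        throw error;
      },
    );
    this.pending.set(key, promise);
    return promise;
  }
}
```

With this shape, two subscribers calling `run("ws_1:/linear/issues/AGE-12.json", load)` at the same time would share one `load()` call. Deduplication alone does not help sequential re-reads; that needs a cache.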


3. FUSE content cache TTL is 2 seconds

When agents use the FUSE mount, file content is cached via fsState.putFile() (internal/mountfuse/fs.go:287), which uses s.attrTTL (default 2s) as the TTL:

// internal/mountfuse/fs.go:290-293
s.fileCache[file.Path] = cachedFile{
    file:      file,
    expiresAt: time.Now().Add(s.attrTTL), // 2 seconds
}

Sequential agent steps routinely take longer than 2 seconds between reads of the same file, so nearly every read results in a cache miss and a new HTTP call to the server. Directory listings (dirCache) use entryTTL (default 5s), which is also short for listing-heavy workflows.

The WebSocket invalidator (internal/mountfuse/wsinvalidate.go) already handles remote-change-driven invalidation, so a longer TTL is safe — stale entries are evicted the moment a change event arrives.


Impact

| Scenario | Requests today | Requests with fixes |
| --- | --- | --- |
| Agent reads same file 5× in a session | 5 HTTP GETs | 1 (TTL cache hit for remaining 4, if re-reads fall within the TTL) |
| 10 change events fire for one path | 10 readFile calls | as few as 1 (in-flight dedup + cache) |
| FUSE agent reads file every 10s | 1 fetch per read (every read misses the 2s TTL), ~3 fetches per 30s | 1 fetch per 30s, ~3× reduction |
| 2 concurrent subscribers, same file | 2 parallel GETs | 1 (deduplication) |
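
The FUSE row can be checked with a small simulation. This is only an arithmetic sketch: `countFetches` is a made-up helper, and it assumes a single file read at a fixed interval with no change events invalidating the entry in between.

```typescript
// Counts how many reads miss the cache (and so hit the server) when one file
// is read every `readIntervalS` seconds for `durationS` seconds with a
// content TTL of `ttlS` seconds.
function countFetches(readIntervalS: number, ttlS: number, durationS: number): number {
  let fetches = 0;
  let expiresAt = -Infinity; // nothing cached yet
  for (let t = 0; t < durationS; t += readIntervalS) {
    if (t >= expiresAt) {
      fetches += 1; // cache miss: fetch and restart the TTL
      expiresAt = t + ttlS;
    }
  }
  return fetches;
}

const today = countFetches(10, 2, 60); // 6: every read misses the 2s TTL
const proposed = countFetches(10, 30, 60); // 2: one fetch per 30s window
```

Over one minute that is 6 fetches versus 2, which matches the roughly 3× reduction claimed in the table.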

Proposed Solutions

Solution A — SDK read cache with event-driven invalidation (highest impact, no server changes)

Add an in-process FileReadCache to RelayFileClient, enabled by default.

Behaviour:

  • Cache readFile responses keyed by {workspaceId}:{path} with a configurable TTL (default 5 seconds).
  • Skip the cache for fork-scoped reads (forkId set), which have isolated state.
  • When any active change stream (subscribe / open) receives a file.updated, file.created, or file.deleted event, immediately evict that path from the cache — no waiting for TTL expiry.
  • In-flight deduplication: if a fetch for the same key is already in-flight, return the existing Promise instead of issuing a second request.
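
A rough sketch of the cache behaviour described above, assuming a plain Map-based LRU. The class shape, method names, and the `cachedRead` wrapper are illustrative assumptions, not the proposed final implementation; in-flight deduplication is left out to keep the sketch short.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the SDK's API.
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class FileReadCache<T> {
  // Map preserves insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, CacheEntry<T>>();

  constructor(
    private readonly ttlMs = 5000,
    private readonly maxEntries = 500,
    private readonly now: () => number = () => Date.now(),
  ) {}

  static key(workspaceId: string, path: string): string {
    return `${workspaceId}:${path}`;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    this.entries.delete(key);
    if (entry.expiresAt <= this.now()) return undefined; // expired, stay evicted
    this.entries.set(key, entry); // re-insert to mark as most recently used
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest); // LRU eviction
    }
  }

  // Intended to be called when a change stream reports
  // file.updated, file.created, or file.deleted for this key.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

async function cachedRead<T>(
  cache: FileReadCache<T>,
  input: { workspaceId: string; path: string; forkId?: string },
  fetchFile: () => Promise<T>,
): Promise<T> {
  if (input.forkId) return fetchFile(); // fork-scoped reads bypass the cache
  const key = FileReadCache.key(input.workspaceId, input.path);
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = await fetchFile();
  cache.set(key, value);
  return value;
}
```

One thing a real implementation would need to handle: if an invalidation event arrives while a fetch is still in flight, the result of that fetch may already be stale and should probably not be stored.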

API surface (additive, backwards-compatible):

export interface RelayFileReadCacheOptions {
  /** Cache TTL in ms. Default: 5000. */
  ttlMs?: number;
  /** Max cached entries before LRU eviction. Default: 500. */
  maxEntries?: number;
}

export interface RelayFileClientOptions {
  // ...existing fields...
  /**
   * Client-side file read cache.
   * Set to false to disable. Default: enabled.
   */
  readCache?: false | RelayFileReadCacheOptions;
}
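
One possible way the client could resolve the `readCache` option into concrete settings, shown only as a sketch. `resolveReadCacheOptions` is a hypothetical internal helper; the defaults are the ones documented in the interface above.

```typescript
// Redeclared here so the sketch is self-contained.
interface RelayFileReadCacheOptions {
  ttlMs?: number;
  maxEntries?: number;
}

// Hypothetical helper: returns null when caching is disabled, otherwise the
// options with documented defaults filled in.
function resolveReadCacheOptions(
  option: false | RelayFileReadCacheOptions | undefined,
): Required<RelayFileReadCacheOptions> | null {
  if (option === false) return null; // caller opted out explicitly
  return {
    ttlMs: option?.ttlMs ?? 5000,
    maxEntries: option?.maxEntries ?? 500,
  };
}
```

Under this scheme, omitting `readCache` enables the cache with defaults, `readCache: { ttlMs: 1000 }` overrides only the TTL, and `readCache: false` turns caching off.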

Files to change:

  • packages/sdk/typescript/src/client.ts: FileReadCache class, hook into readFile(), invalidation hook in RelayFileChangeStreamManager
  • packages/sdk/typescript/src/types.ts: export RelayFileReadCacheOptions
  • packages/sdk/typescript/src/index.ts: re-export new type

Solution B — FUSE content cache TTL decoupled and raised (easy win, FUSE mount path)

Decouple fileCache TTL from FUSE kernel attrTTL. Add a ContentTTL field to mountfuse.Config (default 30 seconds), used only by putFile. The kernel attribute TTL (AttrTTL) remains at 2 seconds, so FUSE Getattr still refreshes quickly without re-downloading content.

// internal/mountfuse/fs.go
type Config struct {
    // ...existing fields...
    // ContentTTL controls how long fetched file content is cached.
    // Defaults to 30s. Independent of AttrTTL (kernel attribute cache).
    ContentTTL time.Duration
}

Expose as RELAYFILE_MOUNT_CONTENT_TTL env var in cmd/relayfile-mount/main.go for operator tuning.

Files to change:

  • internal/mountfuse/fs.go: Config.ContentTTL, fsState.contentTTL, use in putFile
  • cmd/relayfile-mount/main.go: wire RELAYFILE_MOUNT_CONTENT_TTL env var

Solution C — Bulk read endpoint (server-side, higher effort)

Agents that need to read many files (e.g., a full workspace snapshot at session start) issue N sequential or parallel reads, each a separate HTTP round-trip. A POST /v1/workspaces/{workspaceId}/fs/bulk-read endpoint that accepts a list of paths and returns all responses in one payload would collapse N calls to 1.

Spec addition:

# openapi/relayfile-v1.openapi.yaml
/v1/workspaces/{workspaceId}/fs/bulk-read:
  post:
    requestBody:
      required: true
      content:
        application/json:
          schema:
            type: object
            required: [paths]
            properties:
              paths:
                type: array
                items:
                  type: string
    responses:
      "200":
        # OpenAPI requires a description on every response object.
        description: Contents of the requested files.
        content:
          application/json:
            schema:
              type: object
              properties:
                files:
                  type: array
                  items:
                    $ref: "#/components/schemas/FileReadResponse"
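
On the SDK side, a client method for this endpoint might look roughly like the sketch below. Everything here is hypothetical: the endpoint does not exist yet, and `PostJson` stands in for whatever HTTP helper the client already uses.

```typescript
// Hypothetical sketch: the endpoint and all names below are assumptions.
interface BulkReadResponse<F> {
  files: F[];
}

// Stand-in for the client's HTTP layer: POSTs a JSON body, resolves to parsed JSON.
type PostJson = (url: string, body: unknown) => Promise<unknown>;

async function bulkRead<F>(
  postJson: PostJson,
  baseUrl: string,
  workspaceId: string,
  paths: string[],
): Promise<F[]> {
  if (paths.length === 0) return []; // nothing to fetch, skip the round-trip
  const unique = Array.from(new Set(paths)); // do not request a path twice
  const url = `${baseUrl}/v1/workspaces/${encodeURIComponent(workspaceId)}/fs/bulk-read`;
  const payload = (await postJson(url, { paths: unique })) as BulkReadResponse<F>;
  return payload.files;
}
```

The spec would still need to decide how missing or unreadable paths are reported, and whether there is a cap on the number of paths per call.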

Files to change:

  • openapi/relayfile-v1.openapi.yaml
  • internal/httpapi/server.go
  • internal/relayfile/store.go
  • packages/sdk/typescript/src/client.ts: bulkRead() method

Solution D — Conditional GET / If-None-Match support (server-side, reduces bandwidth)

The server already tracks per-file revisions. Adding ETag response headers and honouring If-None-Match on GET /fs/file would let the client skip downloading content it already has. The server returns 304 Not Modified when the revision matches, saving payload transfer for large files.
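
A sketch of the client half of this flow, assuming the server behaves as proposed. `RevalidatingReader`, `HttpGet`, and `HttpResult` are hypothetical names; `HttpGet` stands in for the client's real HTTP layer.

```typescript
// Hypothetical sketch: assumes the server emits ETag and honours If-None-Match.
interface HttpResult {
  status: number;
  etag: string | null;
  body: string | null; // null for 304 responses
}

type HttpGet = (url: string, headers: Record<string, string>) => Promise<HttpResult>;

class RevalidatingReader {
  // Last known ETag and body per URL.
  private readonly known = new Map<string, { etag: string; body: string }>();

  constructor(private readonly httpGet: HttpGet) {}

  async read(url: string): Promise<string> {
    const cached = this.known.get(url);
    const headers: Record<string, string> = {};
    if (cached) headers["If-None-Match"] = cached.etag;

    const res = await this.httpGet(url, headers);
    if (res.status === 304 && cached) return cached.body; // unchanged: reuse local copy
    if (res.status !== 200 || res.body === null) {
      throw new Error(`unexpected status ${res.status}`);
    }
    if (res.etag) this.known.set(url, { etag: res.etag, body: res.body });
    else this.known.delete(url); // no validator, nothing to revalidate against
    return res.body;
  }
}
```

Each read still costs one round-trip; the saving is in payload transfer, which is why this ranks below the caching options.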

Files to change:

  • internal/httpapi/server.go: emit ETag: "{revision}" (entity tags are quoted strings), handle If-None-Match
  • packages/sdk/typescript/src/client.ts: send If-None-Match when a cached revision is known

Recommended order

| Priority | Solution | Effort | Server change? |
| --- | --- | --- | --- |
| 1 | A: SDK read cache + dedup | Medium | No |
| 2 | B: FUSE content TTL | Small | No |
| 3 | C: Bulk read endpoint | Large | Yes |
| 4 | D: Conditional GET / ETag | Medium | Yes |

Solutions A and B can ship independently and together address the dominant cost driver (redundant reads per agent session) without any server-side changes.
