[Platform][Cache] CachePlatform cannot cache content-object inputs (Whisper audio, Mistral OCR) #2159

@tacman

Summary

CachePlatform (the ai.platform.cache.* decorator, symfony/ai-cache-platform) provides "free idempotency" for re-running platform invocations. But it can only build a cache key for string, array, and MessageBag inputs — it throws on any other input type, including the first-party ContentInterface content objects (Audio, Document, DocumentUrl, Image, ImageUrl, …).

That means it cannot decorate any platform task whose top-level input is a single content object, which is the normal shape for audio transcription, OCR, and single-image vision — exactly the slow/expensive calls most worth caching.

The gap

CachePlatform::invoke() (src/platform/src/Bridge/Cache/CachePlatform.php):

$normalizedInput = match (true) {
    \is_string($input) => md5($input),
    \is_array($input) => json_encode($input),
    $input instanceof MessageBag => $input->getId()->toString(),
    default => throw new InvalidArgumentException(\sprintf('Unsupported input type: %s', get_debug_type($input))),
};

A ContentInterface instance falls through to default and throws (only when prompt_cache_key is set; otherwise the call silently passes through uncached).

Reproduction

Both of these already exist in the codebase today:

Whisper (audio transcription): examples/openai/audio-transcript.php invokes with an Audio content object directly:

$file = Audio::fromFile(__DIR__.'/audio.mp3');
$platform->invoke('whisper-1', $file); // <-- $file is a ContentInterface

Wrap that platform in CachePlatform and pass prompt_cache_key → InvalidArgumentException: Unsupported input type: Symfony\AI\Platform\Message\Content\Audio.

Mistral OCR (the bridge proposed in #2072) — same shape with DocumentUrl/Document:

$platform->invoke('mistral-ocr-latest', new DocumentUrl('https://…/document.pdf'), ['prompt_cache_key' => 'ocr']);
// InvalidArgumentException: Unsupported input type: …\Content\DocumentUrl

Confirmed by direct test against CachePlatform (string input caches fine; a DocumentUrl input throws).
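
For anyone who wants to see the failure mode without installing the library, here is a minimal standalone sketch. It copies the match from "The gap" into a plain function and drops the MessageBag arm, since that needs the real class. FakeDocumentUrl, normalizeInput() and the example.com URL are hypothetical stand-ins of mine, not library code, so the class name in the message differs from the real one.

```php
<?php

// Hypothetical stand-in for a content object such as DocumentUrl (NOT the real class).
final class FakeDocumentUrl
{
    public function __construct(public readonly string $url)
    {
    }
}

// Standalone copy of the key normalization shown above, without the MessageBag arm.
function normalizeInput(mixed $input): string
{
    return match (true) {
        \is_string($input) => md5($input),
        \is_array($input) => json_encode($input, \JSON_THROW_ON_ERROR),
        default => throw new \InvalidArgumentException(\sprintf('Unsupported input type: %s', get_debug_type($input))),
    };
}

// A string input yields a key.
$stringKey = normalizeInput('Hello');

// An object input falls through to the default arm and throws.
$error = null;
try {
    normalizeInput(new FakeDocumentUrl('https://example.com/document.pdf'));
} catch (\InvalidArgumentException $e) {
    $error = $e->getMessage();
}

echo $stringKey, \PHP_EOL;
echo $error, \PHP_EOL;
```

This needs PHP 8.1 or later (match, throw expressions, readonly properties).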

Proposed fix (surgical)

Content objects currently have no identity — ContentInterface is an empty marker. Introduce a small opt-in interface so an input can advertise a stable cache key, and have CachePlatform honor it:

interface CacheableInputInterface
{
    public function getCacheKey(): string;
}

// CachePlatform::invoke()
$normalizedInput = match (true) {
    \is_string($input) => md5($input),
    \is_array($input) => json_encode($input),
    $input instanceof MessageBag => $input->getId()->toString(),
    $input instanceof CacheableInputInterface => $input->getCacheKey(),
    default => throw new InvalidArgumentException(...),
};

Then implement it on the URL/file content classes:

  • DocumentUrl / ImageUrl → return the URL (natural content identity).
  • Document / File → hash('xxh128', $this->asBinary()) (content hash; File already exposes asBinary() and __serialize()).

This generalizes to every content-object task rather than special-casing OCR, and is additive (no BC break — CacheableInputInterface is opt-in).
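
To make the proposal concrete, here is a rough standalone sketch of how the two kinds of implementation could look. CacheableInputInterface is the interface proposed above and does not exist in the library yet. SketchDocumentUrl, SketchDocument and normalizeWithCacheable() are hypothetical stand-ins of mine, not the real content classes or the real CachePlatform code.

```php
<?php

// Proposed opt-in interface from this issue (not part of the library today).
interface CacheableInputInterface
{
    public function getCacheKey(): string;
}

// Hypothetical stand-in for DocumentUrl / ImageUrl: the URL itself is the key.
final class SketchDocumentUrl implements CacheableInputInterface
{
    public function __construct(private readonly string $url)
    {
    }

    public function getCacheKey(): string
    {
        return $this->url;
    }
}

// Hypothetical stand-in for Document / File: the key is a hash of the bytes.
final class SketchDocument implements CacheableInputInterface
{
    public function __construct(private readonly string $bytes)
    {
    }

    public function asBinary(): string
    {
        return $this->bytes;
    }

    public function getCacheKey(): string
    {
        // xxh128 is available in PHP's hash extension since 8.1.
        return hash('xxh128', $this->asBinary());
    }
}

// Simplified normalization with the new arm (MessageBag arm omitted here).
function normalizeWithCacheable(mixed $input): string
{
    return match (true) {
        \is_string($input) => md5($input),
        \is_array($input) => json_encode($input, \JSON_THROW_ON_ERROR),
        $input instanceof CacheableInputInterface => $input->getCacheKey(),
        default => throw new \InvalidArgumentException(\sprintf('Unsupported input type: %s', get_debug_type($input))),
    };
}

$urlKey = normalizeWithCacheable(new SketchDocumentUrl('https://example.com/document.pdf'));
$docKeyA = normalizeWithCacheable(new SketchDocument('same bytes'));
$docKeyB = normalizeWithCacheable(new SketchDocument('same bytes'));
$docKeyC = normalizeWithCacheable(new SketchDocument('other bytes'));
```

Two separate SketchDocument instances with identical bytes end up with the same key, while different bytes give a different key. The real classes may need more than this (for example, whether the key should also cover the MIME type is left open here).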

Note on identity semantics

MessageBag keys off getId(), which is a per-instance random Uuid::v7() — i.e. instance identity (same content, new instance → cache miss). The content classes above would instead use content identity (same URL/bytes → hit), which is what you want for idempotent re-runs of a pipeline over a document/audio archive. Worth a sentence in the docs so the two behaviors aren't surprising; happy to align with whatever the maintainers prefer.
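
The difference between the two identity models can be illustrated with a small standalone sketch. SketchBag and SketchImageUrl are hypothetical stand-ins of mine, and the random hex id only imitates the per-instance Uuid::v7() mentioned above; none of this is library code.

```php
<?php

// Instance identity: every new object gets a fresh random id,
// imitating how MessageBag keys off a per-instance UUID.
final class SketchBag
{
    private string $id;

    public function __construct(public readonly array $messages)
    {
        $this->id = bin2hex(random_bytes(16)); // stand-in for Uuid::v7()
    }

    public function getId(): string
    {
        return $this->id;
    }
}

// Content identity: the key is derived from the content (here, the URL).
final class SketchImageUrl
{
    public function __construct(public readonly string $url)
    {
    }

    public function getCacheKey(): string
    {
        return $this->url;
    }
}

// Same content, new instance: the ids differ, so a cache keyed on them misses.
$bagA = new SketchBag(['Hello']);
$bagB = new SketchBag(['Hello']);
$bagKeysMatch = $bagA->getId() === $bagB->getId();

// Same URL, new instance: the keys match, so a cache keyed on them hits.
$imgA = new SketchImageUrl('https://example.com/cat.png');
$imgB = new SketchImageUrl('https://example.com/cat.png');
$urlKeysMatch = $imgA->getCacheKey() === $imgB->getCacheKey();

var_dump($bagKeysMatch, $urlKeysMatch);
```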

Scope

Surgical and standalone — independent of the OCR bridge in #2072 (the gap already affects Whisper). I'm happy to send a PR if the direction is accepted.

Labels: Platform (Issues & PRs about the AI Platform component)