Inconsistent data serialization for images in MessageSerializer #2198

Description

@xprojects-de

When an image is attached to a UserPrompt with Image::fromFile($localPath), the message is stored in MariaDB via the MessageSerializer, and the stored message is later deserialized, an "invalid data format" error occurs.

With the MessageSerializer modified as follows, the round trip works without errors.

    public function denormalize(mixed $data, string $type, ?string $format = null, array $context = []): mixed
    {
        ... 

        $message = match ($dataType) {
            SystemMessage::class => new SystemMessage($content),
            AssistantMessage::class => new AssistantMessage(...self::denormalizeAssistantParts($data)),
            UserMessage::class => new UserMessage(...array_map(
                static fn(array $part): ContentInterface => match ($part['type']) {
                    File::class, Document::class, Image::class, Audio::class => $part['type']::fromDataUrl($part['content']),
                    Text::class => new Text($part['content']),
                    ImageUrl::class => new ImageUrl($part['content']),
                    DocumentUrl::class => new DocumentUrl($part['content']),
                    default => throw new LogicException(\sprintf('Unknown content type "%s".', $part['type'])),
                },
                $contentAsBase64,
            )),

            ...
    }

    public function normalize(mixed $data, ?string $format = null, array $context = []): array
    {
        ...

        return [
            $context['identifier'] ?? 'id' => $data->getId()->toRfc4122(),
            'type' => $data::class,
            'content' => $content,
            'contentAsBase64' => ($data instanceof UserMessage && [] !== $data->getContent()) ? array_map(
                static fn(ContentInterface $content) => [
                    'type' => $content::class,
                    'content' => match ($content::class) {
                        Text::class => $content->getText(),
                        File::class, Document::class, Image::class, Audio::class => $content->asDataUrl(),
                        ImageUrl::class,
                        DocumentUrl::class => $content->getUrl(),
                        default => throw new LogicException(\sprintf('Unknown content type "%s".', $content::class)),
                    },
                ],
                $data->getContent(),
            ) : [],
            'toolsCalls' => $toolsCalls,
            'parts' => $parts,
            'metadata' => $data->getMetadata()->all(),
            'addedAt' => (new \DateTimeImmutable())->getTimestamp(),
        ];
    }

This suggests that the format written by normalize() does not match the format that denormalize() expects when reading the data back.
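
To illustrate the kind of mismatch suspected here, the following is a minimal, self-contained sketch. It does not use the library's classes: `InlineBinary` is a hypothetical stand-in that only mirrors the method names `asDataUrl()` and `fromDataUrl()` seen in the snippets above, and `asBase64()` as well as the exact error text are assumptions. It shows that a value written as a data URL can be read back, while a value written as plain Base64 is rejected by a reader that expects a data URL. Whether this is the actual cause in the unmodified serializer is not confirmed.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a binary content class; NOT the library's implementation.
final class InlineBinary
{
    public function __construct(
        private readonly string $data,
        private readonly string $mimeType,
    ) {
    }

    // Assumed helper: the payload as plain Base64, without MIME type.
    public function asBase64(): string
    {
        return base64_encode($this->data);
    }

    // The payload as a data URL, e.g. "data:image/png;base64,....".
    public function asDataUrl(): string
    {
        return sprintf('data:%s;base64,%s', $this->mimeType, $this->asBase64());
    }

    // Parses a data URL; anything else is rejected.
    public static function fromDataUrl(string $dataUrl): self
    {
        if (1 !== preg_match('#^data:(?<mime>[^;,]+);base64,(?<data>.+)$#s', $dataUrl, $m)) {
            throw new InvalidArgumentException('Invalid data URL format.');
        }

        $decoded = base64_decode($m['data'], true);
        if (false === $decoded) {
            throw new InvalidArgumentException('Invalid Base64 payload.');
        }

        return new self($decoded, $m['mime']);
    }

    public function getData(): string
    {
        return $this->data;
    }

    public function getMimeType(): string
    {
        return $this->mimeType;
    }
}

// First eight bytes of a PNG file, used as sample binary content.
$original = new InlineBinary("\x89PNG\r\n\x1a\n", 'image/png');

// Symmetric pair: written as a data URL, read as a data URL.
$restored = InlineBinary::fromDataUrl($original->asDataUrl());
$roundTripOk = $restored->getData() === $original->getData()
    && $restored->getMimeType() === $original->getMimeType();

// Asymmetric pair: written as plain Base64, read as a data URL.
$mismatchError = null;
try {
    InlineBinary::fromDataUrl($original->asBase64());
} catch (InvalidArgumentException $e) {
    $mismatchError = $e->getMessage();
}
```

In this sketch the symmetric pair restores the same bytes and MIME type, and the asymmetric pair ends in the exception branch. The modified `normalize()` and `denormalize()` above use the symmetric pairing for File, Document, Image, and Audio.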
