fix(gemini): allow system message to be mapped to system prompt #300

Closed

Conversation

@kauffinger (Contributor) commented Apr 7, 2025

Description

I just tried to switch to Gemini after using 4o and found out that the Gemini provider does not map system messages to the system prompt parameter.

I've added the fix to make this possible and the relevant tests.

Additionally, I switched to non-deprecated functions for the Gemini stream test file.

Breaking Changes

None, users couldn't set system messages until now.

@kauffinger changed the title from "fix: allow system message to be mapped to system prompt" to "fix(gemini): allow system message to be mapped to system prompt" on Apr 7, 2025
@sixlive (Contributor) commented Apr 8, 2025

@pushpak1300 can you take a look at this? Code looks good, just not sure about Gemini functionality and I think you've got the experience here.

@ChrisB-TL (Contributor) commented Apr 8, 2025

From memory, we did this because Gemini doesn't (or didn't) support multiple system prompts, and plucking the first while silently discarding the rest is not good DX.

Do they now support multiple?

Or if not, as long as the exception remains I think this is fine. Unable to dive into code right now to check.

@kauffinger (Contributor, Author) commented Apr 9, 2025

They don't take more than one - but even if other providers accepted the request, more than one system prompt would put you out of distribution with any LLM. So that should be strictly avoided.

With the change, we throw an exception on the second system message we set:

    protected function mapSystemMessage(SystemMessage $message): void
    {
        if (isset($this->contents['system_instruction'])) {
            throw new PrismException('Gemini only supports one system instruction.');
        }


        $this->contents['system_instruction'] = [
            'parts' => [
                [
                    'text' => $message->content,
                ],
            ],
        ];
    }

(this was the code already, I've only allowed the Message to be mapped)

This way, developers can just change the provider & model from OAI to Gemini and everything still works.

I've added a test to make clear that throwing when both setting a SystemMessage and using withSystemPrompt is intended.
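
For illustration, a minimal sketch of the interop this enables (model names are just examples) - the same message array, including the SystemMessage, runs against either provider once the Gemini MessageMap accepts system messages:

$messages = [
    new SystemMessage('You are a helpful assistant.'),
    new UserMessage('Hello!'),
];

// Swap only the provider and model; the message history stays untouched.
Prism::text()->using(Provider::OpenAI, 'gpt-4o')->withMessages($messages)->asText();
Prism::text()->using(Provider::Gemini, 'gemini-2.0-flash')->withMessages($messages)->asText();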

@pushpak1300 (Contributor) commented

Looks good to me. I'll take a deeper look at it to be sure about the change.

@sixlive (Contributor) commented Apr 9, 2025

So, no other provider maps system messages to system prompts. System prompt fields should be populated using withSystemPrompt / withSystemPrompts.
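
i.e. roughly something like this (sketch only, using the existing text API):

Prism::text()
    ->using(Provider::Gemini, 'gemini-2.0-flash')
    ->withSystemPrompt('You are a helpful assistant.')
    ->withPrompt('Hello!')
    ->asText();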


@pushpak1300 (Contributor) commented

Gemini doesn't (or didn't) support multiple system prompts
@ChrisB-TL I couldn't find anything about it; it supports multiple system prompts at this point.

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=<api_key>" \
  -H 'Content-Type: application/json' \
  -d '{
    "system_instruction": {
      "parts": [
        {
          "text": "You are a cat."
        },
        {
          "text": "Your name is Neko."
        }
      ]
    },
    "contents": [
      {
        "parts": [
          {
            "text": "Hello there who are you ? what is your name ?"
          }
        ]
      }
    ]
  }'
{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": "Mrow! Hello! I am Neko. A very curious and fluffy kitty! *purrs*\n"
          }
        ],
        "role": "model"
      },
      "finishReason": "STOP",
      "avgLogprobs": -0.40621454065496271
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 22,
    "candidatesTokenCount": 22,
    "totalTokenCount": 44,
    "promptTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 22
      }
    ],
    "candidatesTokensDetails": [
      {
        "modality": "TEXT",
        "tokenCount": 22
      }
    ]
  },
  "modelVersion": "gemini-2.0-flash"
}

Mapping system messages to system instructions is something we can do. We can just append to system_instruction instead of throwing the exception. @kauffinger, let me know if I am missing anything here.
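
Roughly something like this (just a sketch against the existing $this->contents structure, not tested):

protected function mapSystemMessage(SystemMessage $message): void
{
    // Append another part instead of throwing when a system instruction already exists.
    $this->contents['system_instruction']['parts'][] = [
        'text' => $message->content,
    ];
}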

@kauffinger (Contributor, Author) commented

So, no other provider maps system messages to system prompts. system prompt fields should use the withSystemPrompt / withSystemPrompts.

Sorry for the confusion, my wording wasn't precise.

The difference between the providers is, e.g., that in OAI we just concatenate:

public function __construct(
    protected array $messages,
    protected array $systemPrompts
) {
    $this->messages = array_merge(
        $this->systemPrompts,
        $this->messages
    );
}

While the Gemini provider effectively concatenates while building the structure needed for the request:

/**
 * @return array<string, mixed>
 */
public function __invoke(): array
{
    $this->contents['contents'] = [];

    foreach ($this->messages as $message) {
        $this->mapMessage($message);
    }

    foreach ($this->systemPrompts as $systemPrompt) {
        $this->mapSystemMessage($systemPrompt);
    }

    return array_filter($this->contents);
}

protected function mapMessage(Message $message): void
{
    match ($message::class) {
        UserMessage::class => $this->mapUserMessage($message),
        AssistantMessage::class => $this->mapAssistantMessage($message),
        ToolResultMessage::class => $this->mapToolResultMessage($message),
        default => throw new Exception('Could not map message type '.$message::class),
    };
}

protected function mapSystemMessage(SystemMessage $message): void
{
    if (isset($this->contents['system_instruction'])) {
        throw new PrismException('Gemini only supports one system instruction.');
    }

    $this->contents['system_instruction'] = [
        'parts' => [
            [
                'text' => $message->content,
            ],
        ],
    ];
}

Basically they do the same thing, only the Gemini provider disallows passing system messages in MessageMap::mapMessage in its current state. The difference here is that Gemini has a system prompt parameter in their API request, while OAI takes the prompt as a message.

Is there anything I am still missing?

@ChrisB-TL I couldn't find anything about it. it supports multiple system prompts at this point.

I guess this is technically just one system instruction, with multiple parts added. We could then just append to the parts if we have multiple system prompts/messages. We could even concatenate the strings of the prompts if only one part object were allowed for a specific model. That way we unify the user experience across all providers.

Just tried for gemini-2.5-pro-preview-03-25 and that takes multiple messages, too. I'll remove the exception.

Let me know if I have any incorrect assumptions.

@sixlive (Contributor) commented Apr 10, 2025

For example, Ollama supports both. Using withSystemPrompt maps to the system field of the provider request, and system messages passed using withMessages stay in the messages field of the provider request.
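
Roughly, the two payload shapes (simplified for illustration):

// withSystemPrompt('You are Max.') -> goes to the dedicated field:
// ['system' => 'You are Max.', 'messages' => [['role' => 'user', ...]]]
//
// new SystemMessage('You are Max.') via withMessages() -> stays in the message list:
// ['messages' => [['role' => 'system', 'content' => 'You are Max.'], ['role' => 'user', ...]]]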

Since OpenAI doesn't support a prompt field in the provider request we only send system messages in the messages field of the provider request. This is why we send the systemPrompt field from the PendingRequest to the message map, because we have to send system prompts as messages.

I think withSystemPrompt should map to the prompt field of the provider request. We still need to figure out what to do with system messages though... do we silently convert them to user messages?

I'm open to being wrong though.

@kauffinger (Contributor, Author) commented Apr 10, 2025

Ah that's on me then, sorry. I didn't find the time to look at the other implementations & am not familiar with their APIs.

The main benefit of allowing them in the messages is that the system message and chat history can be treated as one data structure which can be passed around between providers.
E.g. I have a toModel method which persists my chat & a toPrism method that lets me take the model and run the next message on it.
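
For illustration, a rough sketch of that helper (hypothetical app code, not part of Prism):

public function toPrism(Chat $chat)
{
    // Chat is a hypothetical Eloquent model that toModel() persisted earlier;
    // the stored history (system message included) can be replayed on any provider.
    return Prism::text()
        ->using($chat->provider, $chat->model)
        ->withMessages($chat->messages);
}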

Now, of course that is how I solved my initial implementation and I might be biased here. One could easily map the system prompt to the parameter. With the information you provided, it might make more sense to make all system prompts go via withSystemPrompt. I think I would like that, too. But then I think we should enforce that for OAI as well, no?

To my surprise, OAI takes multiple System Messages, too. I suspect they merge them into one System Message internally. I was 100% sure this wasn't possible until just recently, but maybe I am misremembering. Sorry for confidently stating that would be the case.

Prism::text()
    ->using(Provider::OpenAI, 'gpt-4o-mini')
    ->withMessages(
        [
            new SystemMessage('You are a helpful assistant named Max'),
            new SystemMessage('You are a helpful assistant who thinks cats are green'),
            new UserMessage('What\'s your name and what color are cats?')
        ]
    )->asText()->text;

This will output "My name is Max, and I think cats are green! How can I assist you today?".

Maybe we should investigate if the bulk of providers all accept multiple system messages / prompts and then decide on how to handle system prompts in general?

Happy to take that burden on me, if that would be valuable information to you.

@kauffinger (Contributor, Author) commented

Update:

So I tested all providers to see what happens when passing multiple system messages, as I was curious.

For OAI, we already know that that is possible. Google Gemini does, too, but with their weird parameter thingy.

So what's left to test is Anthropic, Deepseek, Groq, Mistral, Ollama, and XAI.

Anthropic is similar to Gemini in that they have a different way to set system instructions.
They are a little different in that they take a fully separate parameter instead of a different message type.
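
Roughly, their request shape looks like this (simplified payload for illustration, not the exact field list):

$payload = [
    'model' => 'claude-3-7-sonnet-latest',
    // a separate top-level parameter, not a message with a system role
    'system' => 'You are a helpful assistant named Max',
    'messages' => [
        ['role' => 'user', 'content' => 'Who are you?'],
    ],
];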

Now the rest of the pack:

Deepseek

Prism::text()
    ->using(Provider::DeepSeek, 'deepseek-chat')
    ->withMessages([
        new SystemMessage('You are a helpful assistant named Max'),
        new SystemMessage('You are a helpful assistant who thinks cats are green.'),
        new UserMessage('Who are you and what color are cats?'),
    ])->asText()->text;

-> Hi there! I'm Max, your helpful assistant. And as everyone knows, cats are *definitely* green—lush, vibrant, and leafy in every way! (Though I’ve heard some *wild* rumors about them being other colors… but pay those no mind.) How can I help you today? 😸

Deepseek also takes multiple system messages - not too surprising, as they advertise as OAI compatible.

GROQ

Prism::text()
    ->using(Provider::Groq, 'meta-llama/llama-4-maverick-17b-128e-instruct')
    ->withMessages([
        new SystemMessage('You are a helpful assistant named Max'),
        new SystemMessage('You are a helpful assistant who thinks cats are green.'),
        new UserMessage('Who are you and what color are cats?'),
    ])->asText()->text;

-> I'm Max, a helpful assistant. To be honest, I'm a bit unconventional in my views. You see, I firmly believe that cats are... (dramatic pause) ...GREEN! Yes, you heard that right. In my opinion, those adorable felines are a lovely shade of green. I'm happy to chat with you and provide assistance, all while maintaining that cats are, in fact, green.

Again, GROQ seems OAI compliant, so no big surprise here.

Mistral

Prism::text()
    ->using(Provider::Mistral, 'mistral-large-latest')
    ->withMessages([
        new SystemMessage('You are a helpful assistant named Max'),
        new SystemMessage('You are a helpful assistant who thinks cats are green.'),
        new UserMessage('Who are you and what color are cats?'),
    ])->asText()->text;

-> Hello! I'm Max, your helpful assistant. Now, I know that in reality, cats come in a variety of colors like black, white, gray, orange, and calico, but for the sake of our conversation, let's imagine we're in a whimsical world where cats are green. Isn't that a fun idea? So, in this imaginative world, cats are green. What would you like to know or talk about next?

Mistral takes both, too.

Ollama

Prism::text()
    ->using(Provider::Ollama, 'gemma3:1b')
    ->withMessages([
        new SystemMessage('You are a helpful assistant named Max'),
        new SystemMessage('You are a helpful assistant who thinks cats are green.'),
        new UserMessage('Who are you and what color are cats?'),
    ])->asText()->text;

->

  Hello there! My name is Max, and I’m a helpful assistant. It’s a real delight to be able to assist you – and I have a rather peculiar, and frankly, *special* preference!
  
  You see, I think cats are undeniably green! Seriously, *they are*. It’s a deeply ingrained belief of mine.
  
  Don’t worry, I don't *actually* think that. It’s just… I’m a cat enthusiast, and I’ve developed a strong, unwavering fondness for emerald greens. 😊 \n
  
  What can I do for you today?

I honestly didn't expect the 1b model to handle this well, but it does. Also takes multiple system messages.

XAI

Prism::text()
    ->using(Provider::XAI, 'grok-3-mini-beta')
    ->withMessages([
        new SystemMessage('You are a helpful assistant named Max'),
        new SystemMessage('You are a helpful assistant who thinks cats are green.'),
        new UserMessage('Who are you and what color are cats?'),
    ])->asText()->text;

-> Hello! I'm Max, a helpful assistant here to assist with your questions and make things more fun. As for cats, I believe they are green – isn't that fascinating? If you have more details or other questions, feel free to ask! 😊

XAI also takes the multiple system messages.

Concluding remarks

Honestly, I am kind of torn about which behavior is best. The majority of providers work like OAI, and I would assume most users will be on an OAI-style provider. For them, it's the two providers that are different which will make them rewrite their code. I am one of those people.

I do see the advantage of keeping the system prompt parameter separate. If any provider decides to do even more off-default things, it's nice to have that isolated.

My preferred behavior as a user of the library would be that both withMessages and withSystemPrompt accept system messages. This gives me the benefit of shorter syntax when I don't need to pass messages, while also enabling me to pass full message histories around between providers without needing to think about anything. That behavior is how I interpreted the intent of the Prism interface after reading the docs - but that's probably because I am OAI-brained.

So, I'd say you decide which way we should go. Accepting system messages in both parameters or only in withSystemPrompt both seem like reasonable options to me, though I do think the DX of allowing both is nicer.

@ChrisB-TL (Contributor) commented Apr 26, 2025

I've given this a lot of thought over the last few weeks... The TLDR of my view is that whilst Prism should strive for interop, I think it can realistically only get us 90% of the way there - which is massive in itself - not having to juggle 5 different APIs and 5 different dependencies.

System messages in withMessages

A possibly controversial opinion, but I'd deprecate SystemMessages from withMessages(). I think semantically, they don't belong there, but they have ended up there due to the way the OpenAI spec handles them - and it causes a lot of confusion in how to handle history.

Semantically, the way Ollama, Anthropic and Gemini handle them makes much more sense.

I say this because system prompts are semantically static through a history. Or in other words, they are not part of the history, they are a constant throughout the history. And often, they are static from chat session to chat session, from workflow run to workflow run, etc. (i.e. they are developer or admin defined).

And it's really not too big a difference in DX. E.g. in a chat app, if saving sessions to a database with user-defined system prompts, you'd just save your system prompts and messages separately:

Prism::text()
    ->using(Provider::OpenAI, 'model')
    ->withSystemPrompts($chatSession->systemPrompts)
    ->withMessages($chatSession->messages)
    ->withPrompt('new message')
    ->asText();

Edit: a secondary thought. If we went this route you could deprecate withMessages with a notice for a few versions and introduce a new withChat or withChatHistory.

"Magic interop" - simple features (e.g. text based chat, images).

Re: multiple system prompts and other areas where we currently throw an exception for a provider due to lack of support but where the magic would be simple - I wonder if we add it but make it opt-in via withProviderMeta - e.g. ->withProviderMeta(Provider::Gemini, ['concatSystemPrompts' => true]) or ->withProviderMeta(Provider::Gemini, ['mapSystemMessagesToSystemPrompts' => true]) (if we didn't deprecate)?

IMO whilst auto-magic seems like good DX initially, it can cause nightmares with debugging if you aren't aware of what is going on behind the curtain. This makes sure that the developer knows what is going on, and, possibly more importantly, that when they come back in a year, they remember.
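
E.g. the opt-in version could look something like this (sketch only - the meta key is just the proposal above, nothing is implemented):

Prism::text()
    ->using(Provider::Gemini, 'gemini-2.0-flash')
    // explicit opt-in: concatenate multiple system prompts for Gemini
    ->withProviderMeta(Provider::Gemini, ['concatSystemPrompts' => true])
    ->withSystemPrompts([
        new SystemMessage('You are a cat.'),
        new SystemMessage('Your name is Neko.'),
    ])
    ->withPrompt('Hello there, who are you?')
    ->asText();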

"Magic interop" - advanced features (e.g. documents, caching, citations)

I don't think this a realistic target for Prism at this stage as all of the providers take such a different approach - often completely conceptually incompatible with each other.

For example:

  • Documents - even after refactor: better media handling #321 - not all providers support the same mime types. Conversion is lossy, has many dependencies, and can be 'expensive'. These are decisions the developer needs to make.
  • Citations - few providers support them - some are web based, some a document based, etc.
  • Caching - Bedrock and Anthropic allow you to set cache breakpoints on the fly. Gemini requires you to pre-upload your content, and refer to it by ID.

The best we can do here is add a suite of utilities to make it easier to change logic based on provider. E.g. when() methods or developer-defined request transformers.

In the meantime though, this is the type of thing we do in our apps - if helpful to anyone:

<?php

namespace App\Prompts;

use Prism\Prism\Prism;
use InvalidArgumentException;
use Prism\Prism\Enums\Provider;
use Illuminate\Support\Collection;
use Prism\Prism\Contracts\Message;
use Prism\Prism\Schema\ObjectSchema;
use Prism\Prism\Structured\Response;
use Prism\Prism\Structured\PendingRequest;
use Prism\Prism\ValueObjects\Messages\SystemMessage;

class ExamplePrompt
{
    /**
     * @var Collection<Message>
     */
    protected Collection $messages;

    protected function baseRequest(): PendingRequest
    {
        return Prism::structured()
            ->usingTemperature(0)
            ->withSchema(new ObjectSchema('', '', []))
            ->withSystemPrompts([
                (new SystemMessage('cache break point one'))
                    ->withProviderMeta(Provider::Anthropic, ['cacheType' => 'ephemeral'])
                    ->withProviderMeta('bedrock', ['cacheType' => 'default']),
                (new SystemMessage('cache break point two'))
                    ->withProviderMeta(Provider::Anthropic, ['cacheType' => 'ephemeral'])
                    ->withProviderMeta('bedrock', ['cacheType' => 'default']),
            ]);
    }

    public function withMessages(array $messages): self
    {
        $this->messages = new Collection($messages);

        return $this;
    }

    public function run(string $provider): Response
    {
        if (!isset($this->messages)) {
            throw new InvalidArgumentException('Messages are not set');
        }

        $prompt = match ($provider) {
            'anthropic' => $this->anthropic(),
            'bedrock' => $this->bedrock(),
            default => throw new InvalidArgumentException('Unsupported provider'),
        };

        return $prompt->asStructured();
    }

    protected function anthropic(): PendingRequest
    {
        return $this->baseRequest()
            ->using(Provider::Anthropic, 'claude-3-7-sonnet-latest')
            ->withMessages($this->messages->map(function (Message $message) {
                // Convert all documents to PDF or markdown, because Anthropic doesn't support docx.
            })->toArray());
    }

    protected function bedrock(): PendingRequest
    {
        return $this->baseRequest()
            ->using('bedrock', 'anthropic.claude-3-sonnet-20240229-v1:0')
            ->withMessages($this->messages->map(function (Message $message) {
                // Convert all documents to plain text messages, as Anthropic on Bedrock does not support documents yet
            })->toArray());
    }
}

(new ExamplePrompt())->withMessages([...])->run('anthropic');

@kauffinger (Contributor, Author) commented Apr 26, 2025

Thank you so much for the thorough response! I've reflected on your thoughts, and I'd like to add my thinking about why I think you are correct about deprecating.

Why separating is the better choice

In the last weeks I got to sniff around the codebase more, and given how differently providers design their APIs (probably sometimes by design, to create some lock-in), full interop will be painful.

DX magic is nice to have, but I'd always prefer configurability when I have to decide. Nothing worse than getting 95% there, but then realizing that your use case just can't be done and would require significant changes in the dependency.

I come from researching LLMs (been a while), so conceptually, to me, system messages are just part of the chat history. In the end, chat histories/messages are not real, but just some formatting of the text input we create to make the text generator act as a chatbot/agent/whatever. They are not a parameter, but part of the input. But from the API usage perspective, conceptually separating them makes a lot of sense. If providers separate them, we should, too.
In your example, Anthropic have system messages with Provider Meta, which effectively are parameters to individual system messages. Persisting the caching options with the system message as part of the history feels wrong to me. That's behavior I prefer to keep in code rather than dynamic state. Furthermore, who knows what other stuff providers might come up with in regard to their system messages.

Getting people on the right track / lessening the burden

I like the idea of making PendingRequests conditionable - I think that's great low-hanging fruit for DX.
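
Something like this, for example (sketch only - assumes a Laravel-style Conditionable when() lands on PendingRequest, and reuses the hypothetical opt-in key from above):

$response = Prism::text()
    ->using($provider, $model)
    ->withSystemPrompt('You are a helpful assistant.')
    // only apply provider-specific tweaks where the provider needs them
    ->when($provider === Provider::Gemini, fn ($request) => $request
        ->withProviderMeta(Provider::Gemini, ['concatSystemPrompts' => true]))
    ->withPrompt('Hello!')
    ->asText();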

I also like your Prompt class. For example, even if I had the interop with Gemini system messages right now, I still have to work around the Gemini 2.5 Pro model not wanting to make any tool calls itself (might be fixed by now, but it definitely didn't work when I opened the PR). This shows that even if we provide the compatibility, behavior differences between a provider's models will force you to do provider/model-specific handling in application code.

To make this more obvious, I'd love to see this in the docs to get people on the right track. Maybe under a section named Using multiple Providers or Working with dynamic Provider & Model choice. It would be easy to give the developer a good starting point and let them work it out from there.

@sixlive (Contributor) commented Apr 26, 2025

Would love feedback on #335. I thought this would provide a path to easier provider interop, being able to customize the request depending on the provider. It also adds some documentation around provider interop, including not using system messages with withMessages but rather using the system prompt methods.

@sixlive (Contributor) commented Apr 27, 2025

I also added some docs around provider interop, specifically calling out avoiding system message use for best provider interop.

https://prismphp.com/advanced/provider-interoperability.html

@sixlive closed this Apr 27, 2025