Agents with Vision or returning ImageContent from KernelFunction #11145
-
Hello there, I am working on some agentic workflow in which I would like to let agents access and process image content if the planner "thinks" it is useful to perform the task. So the idea would be to have some
Then I would register that plugin on a specific agent. That doesn't seem to work however, it seems like returning an Is there a way to have this work, ie to make agents that can properly use vision models inside the planning workflow ? |
Beta Was this translation helpful? Give feedback.
Replies: 2 comments 2 replies
-
Tagging @RogerBarreto |
Beta Was this translation helpful? Give feedback.
-
@floboc That's a very interesting question, thanks for bringing it in. So currently when the Plugin is invoked by the AIModel according to the function calling pattern the answer needs to go back to the model as a BUT, is it possible to be a bit creative here using Semantic Kernel where basically you can inject that generated image in the Chat History everytime you have one in your kernel context. Here's how I would do it:
|
Beta Was this translation helpful? Give feedback.
@floboc That's a very interesting question, thanks for bringing it in. So currently when the Plugin is invoked by the AIModel according to the function calling pattern the answer needs to go back to the model as a
message.role=tool
where there isn't on option to identify the function result in a multi modal way that the AI Model will recognize as image/audio.BUT, is it possible to be a bit creative here using Semantic Kernel where basically you can inject that generated image in the Chat History everytime you have one in your kernel context.
Here's how I would do it:
inject the
Kernel
into yourRenderPreview(Kernel kernel)
function.Once you get the imageContent created you can lever…