Checklist
Describe the issue
Version: 1.7.0 (this worked correctly before updating)
Provider: Groq (very similar behavior also observed with Gemini)
Description:
llmvision.image_analyzer injects its own hardcoded system message into every request, forcing the model to respond with
{"title": "...", "description": "..."}. This happens even with generate_title: false and response_format: text, and completely
overrides any custom output format requested in the user prompt.
My user prompt (passed correctly and in full) asks the model to answer in a specific "yes/no + one sentence reasoning" format for an automation. Instead, the model follows the injected system prompt and returns:
{"title": "No activity", "description": "The garden is empty."}
This worked as expected before updating to 1.7.0 — the same automation, unchanged, returned the model's response in the requested format.
This may be related to #598 / #599, which show similar title/description-related breakage, but here it's clearly traceable to this injected system prompt taking priority over the user's instructions.
Expected: With generate_title: false and response_format: text, no system prompt should be injected, and the model should respond directly to the user's message as it did before 1.7.0.
Reproduction steps
-
Call llmvision.image_analyzer with:
generate_title: false
response_format: text
expose_images: false
- A custom
message prompt that explicitly asks for a different output format, e.g.:
"Answer strictly in this format:
Line 1: yes or no
Line 2: one sentence explaining your reasoning"
- Any
image_entity
- Provider: Groq (Gemini will behave the same. I haven´t tested with other)
-
Check the debug logs for the outgoing "Request data" to the provider.
-
Observe that a system message has been injected before the user's
message, forcing JSON output with title/description fields,
completely unrelated to the custom prompt's requested format.
-
The model (correctly) follows the system prompt instead of the user
prompt, and response_text returns {"title": "...", "description": "..."} instead of the requested "yes/no + reasoning" text.
Debug logs
DEBUG [custom_components.llmvision] Service call data:
{'provider': '<provider_id>',
'message': 'This is a security camera image of a private garden with
lawn, tiled path, ornamental stones and a green privacy fence with a
white side gate.\n\nCONTEXT: People outside the fence on the
street/path are not a \nconcern. A person can be present near the
gate, fence or path inside the property.\nThe image may be bright or
backlit.\nBefore answering, consider:\n- Is there a face, limb or body
shape that could be a person inside \n the property, even partially
hidden or near the gate/fence?\n- If you are genuinely unsure whether
something is a person or part of the structure, answer YES — safety
comes first.\n- Only answer NO if you are confident no person is
present inside.\n\nAre there any human beings visible inside the
property?\n\nLine 1: yes or no\nLine 2: one sentence explaining your
reasoning\n',
'image_entity': ['camera.cam_entrance'],
'include_filename': True, 'target_width': 1280, 'max_tokens': 3000,
'generate_title': False, 'expose_images': False,
'response_format': 'text'}
DEBUG [custom_components.llmvision.providers] Provider initialized:
Groq(model=meta-llama/llama-4-scout-17b-16e-instruct, endpoint={...})
DEBUG [custom_components.llmvision.providers] Request data:
{'messages': [
{'role': 'system', 'content': 'Analyze the security camera image
and respond with ONLY a valid JSON object. No markdown, no code
blocks, no explanation - just the raw JSON. Output format:
{"title": "<2-5 word summary>", "description": "<1-2 factual
sentences in present tense>"}. If no people, vehicles, or animals
are present: title must be exactly "No activity".'},
{'role': 'user', 'content': [
{'type': 'text', 'text': '<my full custom prompt, same as above,
sent unmodified>'},
{'type': 'image_url', 'image_url': {'url': '<base64_image>'}}
]}
],
'model': 'meta-llama/llama-4-scout-17b-16e-instruct',
'max_completion_tokens': 3000, 'temperature': 0.3, 'top_p': 0.8}
DEBUG [custom_components.llmvision.providers] Posting to
https://api.groq.com/openai/v1/chat/completions
DEBUG [custom_components.llmvision.providers] Response data:
{'id': '<id>', 'object': 'chat.completion', 'created': <ts>,
'model': 'meta-llama/llama-4-scout-17b-16e-instruct',
'choices': [{'index': 0, 'message': {'role': 'assistant',
'content': '{"title": "No activity", "description": "The garden is
empty."}'}, 'logprobs': None, 'finish_reason': 'stop'}],
'usage': {...}}
DEBUG [custom_components.llmvision.providers] Provider: Groq, Model:
meta-llama/llama-4-scout-17b-16e-instruct, Response:
{"title": "No activity", "description": "The garden is empty."}
DEBUG [custom_components.llmvision.providers] Is Glimpse Model: False
INFO [custom_components.llmvision] Response:
{'response_text': '{"title": "No activity", "description": "The
garden is empty."}'}
Checklist
Describe the issue
Version: 1.7.0 (this worked correctly before updating)
Provider: Groq (very similar behavior also observed with Gemini)
Description:
llmvision.image_analyzerinjects its own hardcodedsystemmessage into every request, forcing the model to respond with{"title": "...", "description": "..."}. This happens even withgenerate_title: falseandresponse_format: text, and completelyoverrides any custom output format requested in the user prompt.
My user prompt (passed correctly and in full) asks the model to answer in a specific "yes/no + one sentence reasoning" format for an automation. Instead, the model follows the injected system prompt and returns:
{"title": "No activity", "description": "The garden is empty."}
This worked as expected before updating to 1.7.0 — the same automation, unchanged, returned the model's response in the requested format.
This may be related to #598 / #599, which show similar title/description-related breakage, but here it's clearly traceable to this injected system prompt taking priority over the user's instructions.
Expected: With
generate_title: falseandresponse_format: text, no system prompt should be injected, and the model should respond directly to the user's message as it did before 1.7.0.Reproduction steps
Call
llmvision.image_analyzerwith:generate_title: falseresponse_format: textexpose_images: falsemessageprompt that explicitly asks for a different output format, e.g.:"Answer strictly in this format:
Line 1: yes or no
Line 2: one sentence explaining your reasoning"
image_entityCheck the debug logs for the outgoing "Request data" to the provider.
Observe that a
systemmessage has been injected before the user'smessage, forcing JSON output with
title/descriptionfields,completely unrelated to the custom prompt's requested format.
The model (correctly) follows the system prompt instead of the user
prompt, and
response_textreturns{"title": "...", "description": "..."}instead of the requested "yes/no + reasoning" text.Debug logs