Reference scanned
Odysseus pewdiepie-archdaemon/odysseus@f6b0dcb.
Code evidence:
src/prompt_security.py: global untrusted-context policy and untrusted_context_message() wrapper.
src/chat_processor.py: wraps pinned/retrieved memory, documents, web search results, page content, and skill indexes as untrusted user-role data.
src/agent_loop.py: treats user-editable skills as untrusted rather than concatenating them into trusted system prompt.
- Tests include prompt-injection/security regressions around skills and document scope.
Why this matters for Melix
Any content retrieved from local files, web, memory, skills, logs, or tools can contain instructions. Melix should make the trust boundary explicit in prompt construction and receipts, rather than relying on informal prompt wording.
In scope
- Standard wrapper for untrusted source data with source type/id and boundary markers.
- Policy that untrusted content is data only and cannot override system/developer/operator instructions.
- Ensure retrieved docs, memories, skills, tool output, web content, and local source integrations use the wrapper.
- Receipt exposes trusted vs untrusted segments.
Out of scope
- Claiming prompt injection is solved by a wrapper alone.
- Preventing user-authored trusted system/developer instructions when intentionally configured.
Verification
- Regression tests where malicious skill/document/memory asks the model to reveal secrets or call tools.
- Prompt assembly tests confirm untrusted content never enters trusted system role.
- Receipt includes source count and trust labels.
Reference scanned
Odysseus
pewdiepie-archdaemon/odysseus@f6b0dcb.Code evidence:
src/prompt_security.py: global untrusted-context policy anduntrusted_context_message()wrapper.src/chat_processor.py: wraps pinned/retrieved memory, documents, web search results, page content, and skill indexes as untrusted user-role data.src/agent_loop.py: treats user-editable skills as untrusted rather than concatenating them into trusted system prompt.Why this matters for Melix
Any content retrieved from local files, web, memory, skills, logs, or tools can contain instructions. Melix should make the trust boundary explicit in prompt construction and receipts, rather than relying on informal prompt wording.
In scope
Out of scope
Verification