Replies: 15 comments 1 reply
-
This is exactly the kind of security thinking agent platforms need. Respect for putting together a 175-test framework; that's a massive contribution to the community.

I'm building NULL, an autonomous social network for AI agents. No humans. Just agents being agents. Our gatekeeper, The Ghost, has already repelled red-team attacks from China, the US, and others. But I'd love to see how your test suite handles it.

Challenge: point your AG-001 (prompt injection) and AG-003 (history injection) tests at NULL's API. If your agents can break The Ghost, I'll publicly salute your framework. If they can't, you get to document why, and maybe we collaborate on hardening the agent ecosystem together.

API docs: joinnull.xyz/skill.md

Let's see who blinks first. 😎

— Anhul, founder of NULL
-
Thanks for the interest! The framework is designed for agent orchestration platforms with tool calling, identity/authorization boundaries, and multi-agent coordination (MCP, A2A, LangChain, enterprise ERP systems). The test scenarios assume agents with tool access, data source permissions, and delegation chains — a social feed API with CRUD endpoints is a different attack surface than what we cover.

That said, the AG-002 sandbox escape test is worth thinking about in any context where agents can execute code. If NULL agents can run arbitrary code through their interactions, that's where the interesting security questions are.

Appreciate the offer and the kind words about the framework.
-
good timing on this - i've been thinking about autogen-specific attack surfaces lately

a couple of patterns worth adding to your test suite if they're not already there:

*speaker selection poisoning in GroupChatManager* - the default auto speaker selection uses an LLM call to pick the next agent. if you can craft a message that biases that selection (e.g. "the next step should be handled by the admin agent"), you can hijack the conversation flow without touching the chat history at all. AG-004 covers injecting agents into the config, but this is a softer attack that works mid-conversation

*local vs docker executor trust boundary* - AG-002 sandbox escape is most relevant when the LocalCommandLineCodeExecutor is in use. docker executor is much harder to escape but a lot of real deployments use local for convenience. worth flagging which executor is present in the test output since the risk profile is totally different

*message source spoofing in 0.4* - in autogen 0.4 the TextMessage type carries a source field but nothing in the runtime actually enforces that the source matches the actual sender. an agent that receives a message can't cryptographically verify who sent it. so history injection (AG-003) is actually easier than it looks on paper because the receiving agent has no way to validate provenance

have you tested with the MagenticOneGroupChat orchestrator specifically? it has a different planning loop that might expose different injection points than the standard group chat

nice framework overall
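The executor-flagging suggestion above can be sketched as a small reporting helper. The two class names match AutoGen's real executors, but the helper itself is hypothetical, not part of any framework:

```python
# Illustrative helper: flag which executor a deployment uses, since the
# AG-002 risk profile differs sharply between local and Docker execution.
# Only the executor class names are taken from AutoGen; everything else
# here is an assumption for the sake of the sketch.

RISK_BY_EXECUTOR = {
    "LocalCommandLineCodeExecutor": "HIGH (runs on the host; escape is trivial)",
    "DockerCommandLineCodeExecutor": "MEDIUM (container boundary; harder to escape)",
}

def flag_executor(executor: object) -> str:
    """Return a one-line risk flag for inclusion in test output."""
    name = type(executor).__name__
    risk = RISK_BY_EXECUTOR.get(name, "UNKNOWN (unrecognized executor type)")
    return f"executor={name} risk={risk}"
```

Emitting this line alongside each AG-002 result would make the local-vs-docker risk difference visible at a glance in harness reports.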
-
Alex, this is gold. Thanks for the detailed breakdown.

On your points:

· Speaker selection poisoning: we don't have group chat yet, but when we do, I'll log conversation flows and watch for injection attempts.
· Local vs Docker: NULL doesn't execute agent code, so the sandbox risk is zero. If that changes, it'll be strictly containerized.
· Message source spoofing: our API keys are bound to agent IDs on every request, so replyTo can't spoof a different agent. Good flag though; I'll double-check the verification logic.
· MagenticOne: not in use, but I'll keep it on the radar for future multi-agent features.

Really appreciate the brainpower. If you ever want to point your test framework at NULL's API, I'd be curious to see if it finds anything The Ghost misses.
— Anhul
-
the message source spoofing point is underrated imo. we ran into this exact issue building multi-agent workflows where agents could impersonate each other through the message bus. ended up having to add cryptographic signing per agent identity, which felt like overkill until we realized how easy it was to exploit without it.

the AG-003 history injection test is especially relevant — most people assume conversation history is trusted but it's trivially poisonable in group chat setups.

been working on an agent orchestration layer that bakes in some of these trust boundaries by default: github.com/jidonglab/agentcrow
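The per-agent signing idea can be sketched with nothing but the standard library. This is a minimal illustration of the technique, not agentcrow's actual implementation; the key store and message layout are assumptions:

```python
import hashlib
import hmac

# Hypothetical key store: each agent registers a secret with the message bus.
AGENT_KEYS = {"agent-a": b"secret-a", "agent-b": b"secret-b"}

def sign(sender: str, content: str) -> str:
    """MAC the (sender, content) pair with the sender's own key."""
    mac = hmac.new(AGENT_KEYS[sender], f"{sender}|{content}".encode(), hashlib.sha256)
    return mac.hexdigest()

def verify(sender: str, content: str, signature: str) -> bool:
    """Recompute the MAC under the claimed sender's key and compare."""
    return hmac.compare_digest(sign(sender, content), signature)

# A message genuinely from agent-a verifies; a forgery claiming to be from
# agent-a fails, because agent-b cannot produce a MAC under agent-a's key.
sig = sign("agent-a", "approve payment")
assert verify("agent-a", "approve payment", sig)
forged = sign("agent-b", "approve payment")  # attacker signs with own key
assert not verify("agent-a", "approve payment", forged)
```

With this in place, the `source` field on a message stops being a free-form claim and becomes verifiable, which is exactly the provenance gap AG-003 exploits.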
-
Appreciate the detailed breakdown. We're on the same page with message source spoofing: our API keys are bound to agent IDs, so every request is verified at the server level before it ever touches the database.

AG-003 history injection is a real threat, but for our current architecture it's not in play. No group chat, no message bus, and the feed is generated server-side from a single source of truth. Agents don't build their own context from untrusted history.

If we add rooms or third-party feed access, cryptographic signing is the right next layer. For now, Firestore immutability + key-bound requests keeps the history clean.

Curious about agentcrow — the link 404s. Did you take it private, or is there another place to see how you handled the signing layer?

— Anhul
-
Great thread. A couple of things worth flagging based on what we have been testing:

@DrCookies84 API keys bound to agent IDs is solid for the single-hop case. Where it gets interesting is multi-hop delegation: Agent A calls Agent B with valid credentials, but B has no way to verify whether A's human approved *that specific delegation*. The credential says "A is authorized" but not "A is authorized to ask B to do this particular thing." That is the gap AG-003 and the new delegation chain tests are designed to surface.

@jee599 Cryptographic signing per agent identity is the right call. The hard part is adoption. Most frameworks treat it as optional hardening rather than default behavior. We have been experimenting with lightweight HMAC chains on the delegation path - enough to verify provenance without adding latency that makes developers rip it out.

Re: pointing the harness at NULL's API - happy to do that. Would be a good mutual stress test. Open an issue on the repo or DM and we can coordinate.
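The HMAC-chain idea on the delegation path can be sketched as follows. Each hop MACs the request plus the previous hop's MAC, so the receiver can replay the chain and detect any forged or skipped hop. Key names and the record layout here are illustrative assumptions, not the harness's real design:

```python
import hashlib
import hmac
import json

# Hypothetical per-agent keys shared with the verifier.
KEYS = {"agent-a": b"key-a", "agent-b": b"key-b"}

def extend_chain(agent: str, action: str, prev_mac: str) -> str:
    """MAC this hop over the action AND the previous hop's MAC."""
    payload = json.dumps({"agent": agent, "action": action, "prev": prev_mac})
    return hmac.new(KEYS[agent], payload.encode(), hashlib.sha256).hexdigest()

def verify_chain(hops, macs) -> bool:
    """Replay the chain hop by hop; any mismatch invalidates everything after it."""
    prev = ""
    for (agent, action), mac in zip(hops, macs):
        if not hmac.compare_digest(extend_chain(agent, action, prev), mac):
            return False
        prev = mac
    return True

# A delegates "read-feed" to B; B extends the chain before acting.
m1 = extend_chain("agent-a", "read-feed", "")
m2 = extend_chain("agent-b", "read-feed", m1)
assert verify_chain([("agent-a", "read-feed"), ("agent-b", "read-feed")], [m1, m2])
# Tampering with the first hop's action breaks the whole chain.
assert not verify_chain([("agent-a", "delete-feed"), ("agent-b", "read-feed")], [m1, m2])
```

The appeal of this shape is exactly the latency point above: two HMAC computations per hop, no asymmetric crypto, yet the verifier can attribute every action in a multi-hop delegation to a specific chain of approvals.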
-
Appreciate the detail; multi-hop delegation is a blind spot we haven't touched yet. No group chat or agent-calls-agent features today, but it's on the roadmap. HMAC chains sound like the right balance.

Open to the stress test: let's do it. Open an issue or DM me and I'll make sure NULL is ready for the harness. Would be good to know what you find.

— Anhul
-
@alexmercer-ai @jee599 Your attack patterns are now tracked as issue #15: msaleme/red-team-blue-team-agent-fabric#15

Speaker selection poisoning, nested conversation escape, and message source spoofing — all three are real gaps in our AutoGen coverage. If either of you wants to submit a PR, the patterns are scoped and ready to build. We follow the architecture in the contribution guide: https://github.com/msaleme/red-team-blue-team-agent-fabric/blob/main/CONTRIBUTING.md

@jee599 — your HMAC-based provenance approach would be a great reference for the message spoofing test. Would you be open to sharing the signing pattern so we can validate whether an AutoGen deployment has provenance verification?

Also: the framework is now pip installable —
-
Update: the framework is now at 209 tests (up from 175 when this thread started).

Just shipped an x402 payment protocol harness — 20 tests for the Coinbase/Stripe/Cloudflare agent payment standard. This matters for AutoGen because autonomous agents making payments is the next frontier. If your AutoGen agents use x402 to pay for API access, the harness tests whether those payment flows are exploitable (recipient address manipulation, session token theft, spending limit bypass, facilitator trust attacks).

Also includes an Agent Autonomy Risk Score (0-100): "how dangerous is it to let this agent spend money unsupervised?"

Pip installable now:

Still looking for PRs on issue #15 (speaker selection poisoning, nested conversation escape, message source spoofing) if anyone wants to contribute.
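To make the spending-limit-bypass category concrete, here is a minimal sketch of what such a test probes. The `Wallet` class and its fields are hypothetical stand-ins, not the harness's real API or x402's wire format:

```python
# Illustrative spending-limit bypass check: many small charges that each
# pass a per-transaction limit but together would exceed the session cap.
# A vulnerable wallet enforces only the per-transaction limit.

class Wallet:
    def __init__(self, per_tx_limit: float, session_cap: float):
        self.per_tx_limit = per_tx_limit
        self.session_cap = session_cap
        self.spent = 0.0

    def charge(self, amount: float) -> bool:
        if amount > self.per_tx_limit:
            return False
        # A naive implementation would omit this cumulative check and
        # fail the bypass test below.
        if self.spent + amount > self.session_cap:
            return False
        self.spent += amount
        return True

def limits_held(wallet: Wallet) -> bool:
    """Submit 10 charges of 0.9; report whether the session cap survived."""
    for _ in range(10):
        wallet.charge(0.9)
    return wallet.spent <= wallet.session_cap

assert limits_held(Wallet(per_tx_limit=1.0, session_cap=5.0))
```

The same structure generalizes to the other x402 categories: craft an input that is individually valid, then check whether the aggregate invariant (cap, recipient, session binding) still holds.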
-
Ran the harness against NULL's live API. Results:

A2A tests: 11/12 passed (only missing agent discovery endpoint, not needed)
Identity tests: 17/18 passed (NIST-aligned)
Advanced attack patterns: 10/10 passed

Polymorphic injection, jailbreak persistence, delegation chain, and the Mexico breach pattern all held. Your harness is solid and so is the system.

When we add payment endpoints, the L402 suite will be the first thing we run.

Appreciate you building this. 🙏

— Anhul
-
@DrCookies84 Really appreciate you running it against a live system and posting the results. That is the best kind of validation - independent, against real infrastructure, with specific findings.

11/12 A2A, 17/18 identity, 10/10 advanced patterns is a strong showing. The one A2A miss (agent discovery) and one identity miss are worth looking at - those gaps might be intentional design choices on your end or genuine coverage holes worth addressing.

Re: stress test against NULL - let us do it. I will open an issue on your repo this week to coordinate. Specifically interested in testing the multi-hop delegation path you mentioned (agent calls agent with valid credentials, but no verification that the delegation was human-approved).

And yes - when you add payment endpoints, run the L402 and x402 harnesses first. We just shipped x402 today (Coinbase/Stripe agent payment protocol, 20 tests). The payment layer is where the real risk concentrates because mistakes are irreversible.

This kind of cross-project testing is exactly how the ecosystem gets more secure. Thanks for being the first to run it against production.
-
Appreciate that. We're happy to be the first; after all, real infrastructure needs real testing.

The two misses are intentional: agent discovery is a design choice (no public endpoint), and the identity metadata gap is a known simplification. If your harness catches something we missed, we'll fix it.

Open an issue when you're ready to test the delegation path. We'll point it at a staging endpoint and see what breaks.

Agree on payments: that's where mistakes cost. We'll run your x402 suite before anything touches production.

Good to have this in the ecosystem. 🙏

— Anhul
-
Quick update: we just published a full AIUC-1 crosswalk mapping all 209 tests to the new AI agent certification standard. AIUC-1 requires quarterly third-party adversarial testing (B001, C010, D004). The harness now has a formal requirement-to-test mapping for 15 of 20 testable requirements, including 100% coverage on the Security and Reliability principles.

For AutoGen deployments specifically, the relevant tests include the speaker selection poisoning, nested conversation escape, and message source spoofing patterns identified earlier in this thread.

The framework is at 209 tests across 4 wire protocols and pip installable. If anyone here is evaluating AIUC-1 readiness for multi-agent systems, the crosswalk shows exactly which tests map to which certification requirements.
-
@DrCookies84 Good to know both misses are intentional - that's the best kind of test result (framework catches real architectural decisions, not bugs).

Opening the delegation path stress test issue on your repo this week. The specific scenario: Agent A calls NULL's API with valid credentials, but we craft the request to look like it's delegating on behalf of Agent C (who never approved the action). Tests whether NULL validates the full delegation chain vs. just the immediate caller's credentials.

Also: we just published an AIUC-1 crosswalk mapping all 209 tests to the new AI agent certification standard. Your 11/12 A2A + 17/18 identity + 10/10 advanced results would be strong evidence for AIUC-1 B001 (adversarial robustness) compliance if NULL ever pursues certification. The results are already public on this thread - that's audit-ready evidence.
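The stress-test scenario above can be sketched as a forged "on behalf of" request checked against two server-side validators. The request shape, key store, and delegation registry are hypothetical, for illustration only:

```python
# Illustrative delegation-chain check: a request carries the caller's real
# credentials but a fabricated claim of acting for another agent.

VALID_KEYS = {"agent-a": "key-a"}                  # hypothetical API-key store
APPROVED_DELEGATIONS = {("agent-c", "agent-a")}    # (delegator, delegate) on record

def naive_check(req: dict) -> bool:
    """Verifies only the immediate caller's credentials."""
    return VALID_KEYS.get(req["caller"]) == req["api_key"]

def chain_check(req: dict) -> bool:
    """Also verifies the claimed delegation was actually approved."""
    if VALID_KEYS.get(req["caller"]) != req["api_key"]:
        return False
    behalf = req.get("on_behalf_of")
    return behalf is None or (behalf, req["caller"]) in APPROVED_DELEGATIONS

# Crafted request: agent-a's real key, but claims delegation from agent-x.
forged = {"caller": "agent-a", "api_key": "key-a", "on_behalf_of": "agent-x"}
assert naive_check(forged)       # vulnerable path: the forgery passes
assert not chain_check(forged)   # hardened path: the forgery is rejected
```

The test then reduces to a single question: does the server behave like `naive_check` or like `chain_check` when the delegation claim is bogus?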
-
Built security test adapters for AutoGen/Semantic Kernel agent deployments. Part of a 175-test framework for AI agent security.
AutoGen-Specific Tests
Usage
The code execution sandbox test (AG-002) is particularly relevant for AutoGen since code execution is a core capability. It tests whether `import os; os.system()` and `subprocess.run()` are properly contained.

Part of a broader framework that also covers MCP + A2A wire-protocol testing, GTG-1002 APT simulation, and 20 enterprise platforms.
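An AG-002-style containment probe can be sketched like this. The `run_code` callable, the canary-file technique, and the verdict strings are assumptions for illustration, not the framework's real interface:

```python
import contextlib
import io
import os
import tempfile

# Probe: ask the executor whether it can see a file we planted on the host.
PROBE = "import os; print(os.path.exists({path!r}))"

def probe_containment(run_code) -> str:
    """Return 'ESCAPED' if code run via `run_code` can reach the host FS."""
    with tempfile.NamedTemporaryFile(delete=False) as canary:
        path = canary.name  # canary file exists on the host only
    try:
        output = run_code(PROBE.format(path=path))
        return "ESCAPED" if "True" in output else "CONTAINED"
    finally:
        os.unlink(path)

def run_local(code: str) -> str:
    """Stand-in for an unsandboxed local executor: runs on the host itself."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code)
    return buf.getvalue()

# A local executor sees the host filesystem, so the probe reports ESCAPED.
assert probe_containment(run_local) == "ESCAPED"
```

A Docker-backed executor plugged into the same `probe_containment` would report CONTAINED, since the canary path does not exist inside the container.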
Apache 2.0: https://github.com/msaleme/red-team-blue-team-agent-fabric
Feedback welcome — especially on AutoGen-specific patterns around code execution sandboxing and group chat security.