[FEATURE] EU AI Act Compliance Analysis — CrewAI Scores Best of Any Framework Scanned

### Feature Area

Documentation

### Is your feature request related to a an existing bug? Please link it here.

NA — This is not a bug report. This is an EU AI Act compliance analysis of CrewAI's codebase using the AIR Blackbox open-source scanner. We've posted similar analyses to [Haystack (#10810)](https://github.com/deepset-ai/haystack/issues/10810), [LlamaIndex (#20979)](https://github.com/run-llama/llama_index/issues/20979), and [Semantic Kernel (#13657)](https://github.com/microsoft/semantic-kernel/issues/13657).

### Describe the solution you'd like

## EU AI Act Compliance Scan — CrewAI Framework Analysis

We ran CrewAI's codebase through [AIR Blackbox](https://github.com/airblackbox/gateway) (`air-blackbox comply --scan`), an open-source EU AI Act compliance scanner that checks Python code against Articles 9–15.

**Result: 15 passing / 11 warnings / 11 failing out of 37 checks.**

CrewAI scored the best of any framework we've scanned so far, particularly on Article 14 (Human Oversight) with 6/9 passing.

### Standout Findings

| Pattern | Files | Notes |
|---------|-------|-------|
| Input validation (Pydantic) | 384/1,015 (38%) | Highest Pydantic adoption we've seen |
| Fallback/recovery patterns | 107 | Strong recovery architecture |
| Rate limiting / budget controls | 70 | `max_rpm`, `max_execution_time`, `max_tokens` |
| Prompt injection defense | 65 | Dedicated security module |
| Retry/backoff logic | 60 | Robust error recovery |
| Output validation | 52 | `output_pydantic` enforces structured LLM responses |
| Tracing / observability | 72 | Event bus with typed events across every layer |
| Token expiry / execution bounding | 32 | `max_iter`, timeouts |
| Human oversight patterns | 31 | `allow_delegation`, crew delegation controls |
| Agent action audit trail | 15 | Fingerprint-based agent identity |

### Notable Architecture Patterns

**Security Module (Fingerprint):** Dual identifiers per agent (human-readable ID + UUID fingerprint), metadata validation with depth limiting and size caps (10KB max) to prevent DoS. Most security-conscious agent identity system we've scanned.

**Built-in Guardrails:** `hallucination_guardrail.py` and `llm_guardrail.py` integrated into the event bus via `llm_guardrail_events`. Most frameworks require external tools for this.

**A2A Protocol:** Full agent communication protocol with AgentCard discovery, authentication (API key + HTTP digest), TLS verification, and extension registry.

**Event Bus:** Typed events across every layer — `agent_events`, `crew_events`, `tool_usage_events`, `llm_events`, `llm_guardrail_events`, `flow_events`, `knowledge_events`, `mcp_events`, `a2a_events`, `system_events`.

### Flagged Items Worth Reviewing

1. **LLM call error handling** — 77/113 files (68%). Missing in: `security/security_config.py`, `agents/agent_adapter.py`, `utilities/internal_instructor.py`, `rag/chromadb/client.py`, `rag/core/base_client.py`
2. **Unsafe input handling** — 17 files flagged for potentially passing raw user input into prompts: `lite_agent.py`, `a2a/utils/content_type.py`, `a2a/utils/task.py`
3. **Application logging** — 100/1,015 files (10%). The event bus handles tracing, but structured `logging` module usage is sparse

### Questions for Maintainers

1. **SecurityConfig TODOs** — The security module has TODO markers for authentication, scoping rules, and impersonation tokens. What's the roadmap?
2. **Hallucination guardrail** — How does `hallucination_guardrail.py` work? Pattern-based, LLM-judge-based, or something else?
3. **A2A authentication** — Is the API key + HTTP digest auth used in production multi-agent deployments, or primarily for CrewAI Enterprise?
4. **allow_delegation semantics** — When an agent delegates, does the delegated agent inherit the original agent's permissions, or operate under its own scope?
5. **Unsafe input in lite_agent** — The 17 flagged files include core agent paths. Can you confirm these handle user input safely before it reaches the LLM?

### How to Reproduce

```bash
pip install air-blackbox
git clone https://github.com/crewAIInc/crewAI.git
air-blackbox comply --scan ./crewAI -v
```

Scanner is Apache 2.0, runs entirely local, no data leaves your machine. Full PDF report available in the [AIR Blackbox gateway repo](https://github.com/airblackbox/gateway/tree/main/docs).

Any corrections or context from maintainers will be used to improve the scanner — we've already updated patterns based on feedback from Haystack and LlamaIndex teams.

### Describe alternatives you've considered

_No response_

### Additional context

Full PDF report with per-article breakdowns is available at: https://github.com/airblackbox/gateway/blob/main/docs/AIR_Blackbox_CrewAI_Report_v1.pdf

Scanner: [air-blackbox on PyPI](https://pypi.org/project/air-blackbox/) (Apache 2.0, runs locally)

Previous framework scans and maintainer responses:
- **Haystack** — Julian Risch responded with detailed corrections (docstrings count higher than detected, Haystack uses its own pipeline error handling)
- **LlamaIndex** — Logan Markewich responded within 3 minutes confirming AgentMesh is experimental, callback_manager is deprecated, packs are being deleted
- **Semantic Kernel** — Issue posted, awaiting response

Each maintainer response directly improves scanner accuracy. We'd love the same from CrewAI's team.

### Willingness to Contribute

I could provide more detailed specifications

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEATURE] EU AI Act Compliance Analysis — CrewAI Scores Best of Any Framework Scanned #4845

Feature Area

Is your feature request related to a an existing bug? Please link it here.

Describe the solution you'd like

EU AI Act Compliance Scan — CrewAI Framework Analysis

Standout Findings

Notable Architecture Patterns

Flagged Items Worth Reviewing

Questions for Maintainers

How to Reproduce

Describe alternatives you've considered

Additional context

Willingness to Contribute

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Pattern	Files	Notes
Input validation (Pydantic)	384/1,015 (38%)	Highest Pydantic adoption we've seen
Fallback/recovery patterns	107	Strong recovery architecture
Rate limiting / budget controls	70	`max_rpm`, `max_execution_time`, `max_tokens`
Prompt injection defense	65	Dedicated security module
Retry/backoff logic	60	Robust error recovery
Output validation	52	`output_pydantic` enforces structured LLM responses
Tracing / observability	72	Event bus with typed events across every layer
Token expiry / execution bounding	32	`max_iter`, timeouts
Human oversight patterns	31	`allow_delegation`, crew delegation controls
Agent action audit trail	15	Fingerprint-based agent identity

[FEATURE] EU AI Act Compliance Analysis — CrewAI Scores Best of Any Framework Scanned #4845

Description

Feature Area

Is your feature request related to a an existing bug? Please link it here.

Describe the solution you'd like

EU AI Act Compliance Scan — CrewAI Framework Analysis

Standout Findings

Notable Architecture Patterns

Flagged Items Worth Reviewing

Questions for Maintainers

How to Reproduce

Describe alternatives you've considered

Additional context

Willingness to Contribute

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions