

🤖 Why AI Agents Fail (And How to Fix Them)


Research-backed solutions to the three critical failure modes that break AI agents in production: hallucinations, timeouts, and memory loss.



🎯 Learning Path: Understand → Prevent → Scale

This repository demonstrates research-backed techniques for preventing AI agent failures with working code examples.

| 🚨 Failure Mode | 💡 Solution Approach | 📊 Projects | ⏱️ Total Time |
| --- | --- | --- | --- |
| Hallucinations | Detection and mitigation through 4 techniques | 4 demos | 2 hours |
| Getting Stuck | Prevent context overflow, handle MCP timeouts, detect reasoning loops | 3 demos | 1.5 hours |
| Memory Loss | Persistent memory and context retrieval | Coming soon | - |

🎭 Stop AI Agent Hallucinations

The Problem: Agents fabricate statistics, choose wrong tools, ignore business rules, and claim success when operations fail.

The Solution: 4 research-backed techniques that detect, contain, and mitigate hallucinations before they cause damage.

📓 Hallucination Prevention Demos

| 📓 Demo | 🎯 Focus & Key Learning | ⏱️ Time | 📊 Level |
| --- | --- | --- | --- |
| 01 - Graph-RAG vs Traditional RAG | Structured data retrieval: compare RAG with Graph-RAG on 300 hotel FAQs, build a Neo4j knowledge graph with automatic entity extraction, eliminate statistical hallucinations | 30 min | Intermediate |
| 02 - Semantic Tool Selection | Intelligent tool filtering: filter 31 tools down to the 3 most relevant, reduce errors and token costs, swap tools dynamically | 45 min | Intermediate |
| 03 - Multi-Agent Validation Pattern | Cross-validation workflows: an Executor → Validator → Critic pattern catches hallucinations, orchestrated with Strands Swarm | 30 min | Intermediate |
| 04 - Neurosymbolic Guardrails for AI Agents | Symbolic validation: compare prompt engineering with symbolic rules for business-rule compliance the LLM cannot bypass | 20 min | Intermediate |

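The tool-filtering idea behind Demo 02 can be sketched in a few lines. This is a minimal stand-in, not the repo's code: the demo uses SentenceTransformers embeddings with FAISS nearest-neighbor search, whereas here plain word overlap stands in for semantic similarity, and the tool names are invented for illustration.

```python
def relevance(query_words, desc_words):
    # crude relevance proxy: count of shared words
    # (the demo instead embeds text and searches a FAISS index)
    return len(set(query_words) & set(desc_words))

def select_tools(query, tools, k=3):
    """Return the k tool names whose descriptions best match the query,
    so the agent only sees a small, relevant subset of its toolbox."""
    q = query.lower().split()
    ranked = sorted(
        tools,
        key=lambda t: relevance(q, t["description"].lower().split()),
        reverse=True,
    )
    return [t["name"] for t in ranked[:k]]

tools = [
    {"name": "book_room",   "description": "book a hotel room for a guest"},
    {"name": "cancel_room", "description": "cancel an existing hotel room booking"},
    {"name": "get_weather", "description": "current weather for a city"},
    {"name": "send_email",  "description": "send an email message"},
]
print(select_tools("cancel my hotel booking", tools, k=2))
# → ['cancel_room', 'book_room']
```

Only the selected subset is passed to the model, which both shrinks the prompt and removes distracting candidates that cause wrong-tool hallucinations.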
📊 Key Results

| 🎯 Technique | 📈 Improvement | 🔍 Metric |
| --- | --- | --- |
| Graph-RAG | Accuracy | Precise queries on 300 hotel FAQs via knowledge graph |
| Semantic Tool Selection | Reduced errors and token costs | Tool-selection hallucination detection (research-validated); token cost per query |
| Neurosymbolic Rules | Compliance | Business-rule enforcement the LLM cannot bypass |
| Multi-Agent Validation | Error detection | Invalid operations caught before reaching users |
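The neurosymbolic idea can be sketched as a hard check applied to the agent's proposed action after generation. The rule and threshold below are invented for illustration (they are not the demo's actual rules); the point is that the check runs outside the LLM, so no prompt wording can bypass it.

```python
MAX_DISCOUNT = 0.15  # hypothetical business rule, chosen for this sketch

def enforce_discount_rule(action):
    """Symbolic validation of an agent-proposed action.
    Runs in ordinary code after the LLM responds, so it cannot be
    talked around the way a prompt-engineered instruction can."""
    if action.get("type") == "apply_discount" and action.get("rate", 0) > MAX_DISCOUNT:
        return {
            "type": "rejected",
            "reason": f"discount {action['rate']:.0%} exceeds the {MAX_DISCOUNT:.0%} limit",
        }
    return action

print(enforce_discount_rule({"type": "apply_discount", "rate": 0.40}))
# rejected regardless of how the model was prompted
print(enforce_discount_rule({"type": "apply_discount", "rate": 0.10}))
# compliant action passes through unchanged
```

Contrast this with prompt engineering, where "never discount more than 15%" is merely a suggestion the model can ignore under adversarial input.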

→ Explore hallucination prevention demos


🔄 Stop Agents from Wasting Tokens

The Problem: Agents get stuck when context windows overflow with large data, MCP tools stop responding on slow APIs, or agents repeat the same tool calls without making progress — burning tokens and blocking workflows.

The Solution: 3 research-backed techniques that prevent context overflow, handle unresponsive APIs, and detect reasoning loops before they waste resources.
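The Memory Pointer Pattern from the context-overflow demo can be sketched as follows: store large payloads in an external store and pass only a short handle through the agent's context. This is an illustrative stand-in, not the repo's implementation; the in-memory dict and `mem://` handle format are assumptions for the sketch.

```python
import uuid

STORE = {}  # stand-in for external storage (a file, S3 bucket, or database)

def stash(data):
    """Keep a large payload out of the context window; return a short pointer."""
    handle = f"mem://{uuid.uuid4().hex[:8]}"
    STORE[handle] = data
    return handle

def resolve(handle):
    """Fetch the payload only when a tool actually needs it."""
    return STORE[handle]

big_result = "x" * 100_000    # e.g. a large tool output
pointer = stash(big_result)   # only 14 characters enter the agent's context
assert resolve(pointer) == big_result
print(pointer, len(pointer))
```

The model reasons over the cheap pointer, and downstream tools dereference it on demand, which is where the large token reductions come from.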

📓 Token Waste & Stuck Agent Demos

| 📓 Demo | 🎯 Focus & Key Learning | ⏱️ Time | 📊 Level |
| --- | --- | --- | --- |
| 01 - Context Window Overflow | Memory management: store large data outside the context with the Memory Pointer Pattern; 7x token reduction validated by IBM Research | 30 min | Intermediate |
| 02 - MCP Tools Not Responding | Async patterns: handle slow or unresponsive APIs with an async handleId, prevent 424 errors, return immediate responses | 20 min | Intermediate |
| 03 - Reasoning Loops | Loop detection: a DebounceHook blocks duplicate calls; clear SUCCESS/FAILED states stop retries; 7x fewer tool calls | 25 min | Intermediate |
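The duplicate-blocking idea behind the DebounceHook can be sketched as a small guard that remembers which (tool, arguments) pairs have already been issued. This is an illustrative stand-in, not the repo's DebounceHook or the Strands hooks API.

```python
class DebounceGuard:
    """Blocks a tool call when the identical (name, args) pair was already
    issued in this run, so a looping agent cannot burn tokens on repeats."""

    def __init__(self):
        self.seen = set()

    def allow(self, tool_name, args):
        # canonicalize the arguments so {"a": 1} and {"a": 1} always match
        key = (tool_name, tuple(sorted(args.items())))
        if key in self.seen:
            return False  # duplicate: block the call
        self.seen.add(key)
        return True

guard = DebounceGuard()
print(guard.allow("search", {"q": "hotel"}))  # True: first call goes through
print(guard.allow("search", {"q": "hotel"}))  # False: exact repeat is blocked
print(guard.allow("search", {"q": "flight"})) # True: different args are fine
```

In the real demo this check runs inside a Strands hook before tool dispatch; a blocked call is turned into an explicit FAILED result so the agent stops retrying.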

→ Explore token waste prevention demos


🧠 Your Agent Doesn't Remember You

(Coming soon)


🔧 Technologies Used

| 🔧 Technology | 🎯 Purpose | ⚡ Key Capabilities |
| --- | --- | --- |
| Strands Agents | AI agent framework | Dynamic tool swapping, multi-agent orchestration, conversation memory, hooks system |
| Amazon Bedrock | LLM access | Claude 3 Haiku/Sonnet for agent reasoning and tool calling |
| Neo4j | Graph database | Relationship-aware queries, precise aggregations, multi-hop traversal |
| FAISS | Vector search | Semantic similarity, tool filtering, efficient nearest-neighbor search |
| SentenceTransformers | Embeddings | Text embeddings for semantic tool selection and memory retrieval |

Prerequisites

Before You Begin:

  • Python 3.9+ installed locally
  • LLM access: OpenAI (default), Amazon Bedrock, Anthropic, or Ollama
  • OPENAI_API_KEY environment variable (for default setup)
  • AWS CLI configured if using Amazon Bedrock (aws configure)
  • Basic understanding of AI agents and tool calling

Model Configuration: All demos use OpenAI with GPT-4o-mini by default. You can swap to any provider supported by Strands — see Strands Model Providers for configuration.

AWS Credentials Setup (if using Amazon Bedrock): Follow the AWS credentials configuration guide to configure your environment.


🚀 Quick Start Guide

1. Clone Repository

```shell
git clone https://github.com/aws-samples/sample-why-agents-fail
cd sample-why-agents-fail
```

2. Start with Hallucinations

```shell
cd stop-ai-agent-hallucinations
```

3. Explore All Techniques

Each demo folder contains detailed README files and working code examples.


💰 Cost Estimation

| 💰 Service | 💵 Approximate Cost | 📊 Usage Pattern | 🔗 Pricing Link |
| --- | --- | --- | --- |
| OpenAI GPT-4o-mini | ~$0.15 per 1M input tokens | Agent reasoning and tool calling | OpenAI Pricing |
| Amazon Bedrock (Claude) | ~$0.25 per 1M input tokens | Alternative LLM provider | Bedrock Pricing |
| Neo4j (local) | Free | Graph database for demos | Neo4j Pricing |
| FAISS (local) | Free | Vector search library | FAISS GitHub |
| SentenceTransformers | Free | Local embeddings | SBERT Docs |

💡 All demos can run locally with minimal costs. OpenAI GPT-4o-mini is the most cost-effective option for testing.


📖 Additional Learning Resources



🤝 Contributing

Contributions are welcome! See CONTRIBUTING for more information.


Security

If you discover a potential security issue in this project, notify AWS/Amazon Security via the vulnerability reporting page. Please do not create a public GitHub issue.


📄 License

This library is licensed under the MIT-0 License. See the LICENSE file for details.
