Systematic research on how AI agents interact with home automation systems, focusing on tool design, agent behavior, and practical integration patterns.
A research laboratory for studying AI-home automation integration through rigorous experimentation. We test hypotheses, measure agent behavior, and document patterns that work (or don't).
Primary Focus Areas:
- How LLMs interact with home automation APIs and tools
- Tool design patterns for AI agents (naming, structure, documentation)
- Agent behavior analysis (tool selection, workflows, error handling)
- Progressive disclosure and context management
- Local LLM compatibility and optimization
- Migration and transformation workflows
Methodology:
- Reproducible experiments with logging
- Blind and informed testing approaches
- Multi-agent patterns (setup vs test agents)
- Quantitative metrics collection
- Behavior analysis from agent logs
Test, don't assume. Every claim should be backed by experimental data. Every pattern should be validated through agent behavior analysis.
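Backing every claim with data is easiest when each experiment run emits one uniform record. A minimal sketch of an append-only JSONL run log, assuming hypothetical field names (`experiment`, `agent`, `mode`, `tool_count`, `success`, `tokens_used`) that are illustrative rather than a repository standard:

```python
import json
import os
from dataclasses import dataclass, asdict

# Hypothetical schema for a single experiment run; adapt fields per project.
@dataclass
class RunRecord:
    experiment: str    # e.g. "research_tool_count_impact"
    agent: str         # "claude-code" or "gemini-cli"
    mode: str          # "blind" or "informed"
    tool_count: int
    task: str
    success: bool
    tokens_used: int

def append_record(path: str, record: RunRecord) -> None:
    """Append one run as a JSON line so logs stay append-only and diffable."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")

os.makedirs("logs", exist_ok=True)
append_record("logs/runs.jsonl", RunRecord(
    "research_tool_count_impact", "claude-code", "blind",
    50, "turn off all lights", True, 1842))
```

JSON Lines keeps per-run records greppable and easy to load into analysis scripts later.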
We use ha-mcp (95+ tools for Home Assistant) as our primary test platform, but findings apply broadly to AI-home automation integration.
Note: This repository originated from discussion #477 about bringing ha-mcp patterns to Home Assistant Core, but has evolved into a general research lab for AI-home automation integration.
research/
├── AGENTS.md (CLAUDE.md) # Research methodology and guidelines
├── inbox/ # Ideas for future research
├── bin/ # Scripts and analysis tools
├── common_data/ # Shared test fixtures and baselines
├── logs/ # Experiment logs and analysis
├── research_[subject]/ # Individual research projects
└── .agent-research/ # Raw AI-generated content (uncurated)
See AGENTS.md for complete methodology including:
- Experiment design (informed vs blind)
- Multi-agent testing patterns
- Log capture and analysis
- Environment setup with backups
- Git workflow and auto-push guidelines
Individual research projects follow the naming convention: research_[subject]/
Each project should include:
- README.md with hypothesis and findings
- setup.sh or setup.md for reproducible environment
- logs/ subdirectory for agent behavior records
- data/ for quantitative results
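A project skeleton matching this layout can be scaffolded with a short script. A sketch assuming a hypothetical `bin/new_research.sh` helper (the script name and defaults are illustrative):

```shell
#!/bin/sh
# Scaffold a new research project following the research_[subject]/ convention.
# Usage: ./bin/new_research.sh <subject>
set -eu
subject="${1:-tool_count_impact}"   # default subject is only for illustration
dir="research_${subject}"
mkdir -p "$dir/logs" "$dir/data" "$dir/experiments"
cat > "$dir/README.md" <<EOF
# research_${subject}

## Hypothesis
TODO

## Findings
TODO
EOF
printf '#!/bin/sh\n# TODO: reproducible environment setup\n' > "$dir/setup.sh"
chmod +x "$dir/setup.sh"
echo "Created $dir/"
```

Scaffolding keeps the README/setup/logs/data layout consistent across projects, which in turn keeps analysis scripts reusable.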
- ha-mcp codebase: `/home/julien/github/ha-mcp/master/` - Source implementation
- Web search: Documentation, papers, prior art
- Claude Code: `claude -p` with the ha-mcp MCP server
- Gemini CLI: `gemini -p` with the ha-mcp MCP server (docs)
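To compare agents on the same task, each run can be captured to its own transcript. A minimal logging sketch (the `run_logged` helper and log names are illustrative; the final `echo` call is a stand-in so the pipeline can be exercised without the agent CLIs installed):

```shell
#!/bin/sh
# Capture one agent run per log file so transcripts can be diffed later.
set -eu
mkdir -p logs

run_logged() {
    # $1 = log name, remaining args = agent command line
    name="$1"; shift
    "$@" 2>&1 | tee "logs/${name}.log"
}

# Example invocations (uncomment when the CLIs are on PATH):
# run_logged claude_run claude -p "List all light entities"
# run_logged gemini_run gemini -p "List all light entities"
run_logged smoke echo "logging pipeline works"
```

Keeping one file per run makes blind-vs-informed and 10-tool-vs-50-tool comparisons a matter of diffing logs.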
- inbox/ - Ideas awaiting investigation
- bin/ - Scripts for experiments and analysis
- common_data/ - Shared fixtures and baselines
- logs/ - Experiment logs and behavior analysis
- .agent-research/ - Raw AI-generated brainstorming (uncurated)
Active experimentation phase - the repository structure is in place and research is underway:
- Research methodology documented (AGENTS.md)
- Directory structure for experiments
- Logging and analysis approach defined
- First research projects in progress
- Baseline measurements collected
Interesting areas needing investigation:
Agent Behavior:
- How do tool count limits affect local LLM performance?
- What tool selection patterns lead to successful task completion?
- How does context window size impact agent effectiveness?
Tool Design:
- What naming conventions help agents discover the right tools?
- How much documentation should be in tool descriptions vs on-demand?
- Which tool response formats guide agents most effectively?
Practical Patterns:
- What workflows work for platform migrations (Node-RED → HA)?
- How can bulk operations be handled without overwhelming agents?
- What are the best practices for error recovery and retry logic?
See inbox/ for more research ideas, and the GitHub issues for active discussions.
Ready to conduct research? See CONTRIBUTING.md for:
- How to propose experiments
- Running and logging tests
- Analyzing agent behavior
- Publishing findings
Structure your research under research_[subject]/:
research_tool_count_impact/
├── README.md # Hypothesis: More tools = worse performance?
├── setup.sh # Reproducible environment
├── logs/
│ ├── claude_10tools.log
│ ├── claude_50tools.log
│ └── analysis.md # Agent behavior patterns
├── data/
│ └── results.csv # Success rate, tokens used, etc.
└── experiments/
├── baseline.tar.gz # Environment snapshots
└── ...
Findings should include:
- Quantitative metrics (success rates, token usage, tool counts)
- Agent behavior analysis (which tools chosen, in what order)
- Workflow patterns observed
- Recommendations based on data
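Once a project's `results.csv` exists, the quantitative summary can be a few lines of Python. A sketch assuming an illustrative column layout (`agent`, `tool_count`, `success`, `tokens`) rather than any mandated schema:

```python
import csv
import io
from collections import defaultdict

# Inline sample standing in for a project's data/results.csv.
SAMPLE = """agent,tool_count,success,tokens
claude-code,10,1,1520
claude-code,10,1,1610
claude-code,50,0,3890
claude-code,50,1,4120
"""

def summarize(rows):
    """Group runs by tool count; report success rate and mean token usage."""
    groups = defaultdict(list)
    for row in rows:
        groups[int(row["tool_count"])].append(row)
    summary = {}
    for count, runs in sorted(groups.items()):
        successes = sum(int(r["success"]) for r in runs)
        mean_tokens = sum(int(r["tokens"]) for r in runs) / len(runs)
        summary[count] = {"success_rate": successes / len(runs),
                          "mean_tokens": mean_tokens}
    return summary

summary = summarize(csv.DictReader(io.StringIO(SAMPLE)))
for count, stats in summary.items():
    print(f"{count} tools: {stats['success_rate']:.0%} success, "
          f"{stats['mean_tokens']:.0f} tokens/run")
```

The same grouping generalizes to any independent variable (context size, naming scheme) by swapping the key column.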
Test Platform:
- ha-mcp - MCP server with 95+ Home Assistant tools
AI Agents:
- Claude Code - CLI agent with MCP support
- Gemini CLI - Google's CLI agent with MCP
Home Automation:
- Home Assistant - Open source platform (our primary focus)
- Other platforms welcome for comparative research
Protocols:
- Model Context Protocol - Standard for AI-application communication
- Repository Owner: @julienld
- Email: github@qc-h.net
- Discussions: GitHub Discussions
- Related: ha-mcp discussions