AAE is a sequential multi-agent analytics pipeline. Each phase writes an auditable artifact that the next phase consumes.

| Phase | Agent | Primary Output |
|---|---|---|
| Audit | agents/auditor.py | data_health_report.json |
| Clean | agents/cleaner.py | cleaned_data.parquet, cleaning_report.json |
| Analyze | agents/analyzer.py | statistical_analysis_report.json |
| Architect | agents/architect.py | star schema Parquet, DAX, data dictionary, TMDL |
| Story | agents/storyteller.py | dashboard_stories.json |
| Assurance | agents/assurance.py | assurance_report.json |

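
The handoff between phases can be pictured with a minimal sketch. This is not the AAE implementation: the `audit`, `clean`, and `run_pipeline` functions and the JSON fields are hypothetical, and only the two artifact file names come from the phase list above. It shows one way a sequential runner might pass each phase's written artifact to the next phase.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List, Optional

# A phase takes the previous artifact path (None for the first phase)
# and an output directory, and returns the path of the artifact it wrote.
Phase = Callable[[Optional[Path], Path], Path]


def audit(previous: Optional[Path], out_dir: Path) -> Path:
    """Hypothetical first phase: writes a report without reading anything."""
    out = out_dir / "data_health_report.json"
    out.write_text(json.dumps({"phase": "Audit", "consumed": None}))
    return out


def clean(previous: Optional[Path], out_dir: Path) -> Path:
    """Hypothetical second phase: reads the upstream artifact, then writes its own."""
    if previous is None:
        raise ValueError("clean needs the artifact written by the previous phase")
    upstream = json.loads(previous.read_text())
    out = out_dir / "cleaning_report.json"
    out.write_text(json.dumps({"phase": "Clean", "consumed": upstream["phase"]}))
    return out


def run_pipeline(phases: List[Phase], out_dir: Path) -> List[Path]:
    """Run phases in order, handing each one the artifact of the phase before it."""
    artifact: Optional[Path] = None
    written: List[Path] = []
    for phase in phases:
        artifact = phase(artifact, out_dir)
        written.append(artifact)
    return written


with tempfile.TemporaryDirectory() as tmp:
    artifacts = run_pipeline([audit, clean], Path(tmp))
    names = [path.name for path in artifacts]
    last = json.loads(artifacts[-1].read_text())

print(names)
print(last)
```

Because every phase leaves a file on disk, a run can be inspected phase by phase after the fact, which is what makes the artifacts auditable.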
CSV and TSV files above the large-file threshold use agents/warehouse.py, which scans and transforms data in chunks and writes partitioned Parquet. This avoids loading large files into memory while preserving full-file analysis.
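
As a rough illustration of the chunked approach, the sketch below streams a CSV in fixed-size chunks and accumulates whole-file statistics without holding more than one chunk in memory. It is an assumption-laden stand-in, not the code in agents/warehouse.py: the `iter_chunks` and `scan_in_chunks` helpers, the chunk size, and the sample data are all hypothetical, and the partitioned Parquet write is omitted because it needs a third-party Parquet library.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterator, List, TextIO


def iter_chunks(handle: TextIO, chunk_rows: int) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most chunk_rows parsed rows from an open CSV handle."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunk_rows))
        if not chunk:
            return
        yield chunk


def scan_in_chunks(handle: TextIO, chunk_rows: int = 2) -> Dict[str, object]:
    """Accumulate full-file row and missing-value counts one chunk at a time."""
    total_rows = 0
    chunk_count = 0
    missing: Dict[str, int] = {}
    for chunk in iter_chunks(handle, chunk_rows):
        chunk_count += 1
        total_rows += len(chunk)
        for row in chunk:
            for column, value in row.items():
                if column is None:
                    # Overflow fields from a row with too many cells; ignored here.
                    continue
                if value is None or value.strip() == "":
                    missing[column] = missing.get(column, 0) + 1
    return {"rows": total_rows, "chunks": chunk_count, "missing": missing}


# Tiny in-memory stand-in for a large file (hypothetical data).
sample = io.StringIO("id,amount\n1,10\n2,\n3,30\n4,\n5,50\n")
summary = scan_in_chunks(sample, chunk_rows=2)
print(summary)
```

The running totals cover every row even though only one chunk is resident at a time, which is the property the paragraph above describes as preserving full-file analysis.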
Reports include decision traces for:
- KPI and date-column selection
- cleaning actions and skipped high-risk operations
- dimension and fact-table modeling decisions
- DAX measure semantics
- story evidence and decision readiness
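
To make the idea of a decision trace concrete, here is a minimal sketch of what one trace entry could look like when serialized into a report. The `DecisionTrace` class, its field names, the `decision_traces` key, and the example values are all assumptions made for illustration; the actual report schema is not specified here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DecisionTrace:
    """One recorded decision; every field name here is illustrative."""
    phase: str
    decision: str
    chosen: str
    rationale: str
    alternatives: List[str] = field(default_factory=list)


# Hypothetical trace for a date-column choice.
trace = DecisionTrace(
    phase="Audit",
    decision="date_column_selection",
    chosen="order_date",
    rationale="hypothetical example: only column that parsed fully as dates",
    alternatives=["ship_date"],
)

# Embed the trace in a report-shaped dictionary and serialize it.
report_fragment = {"decision_traces": [asdict(trace)]}
encoded = json.dumps(report_fragment, indent=2)
print(encoded)
```

Recording the rejected alternatives alongside the chosen option lets a reviewer see why a decision was made as well as what it was.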
AAE exports:
- fact and dimension tables as Parquet
- validated DAX measures
- data_dictionary.json and data_dictionary.csv
- Tabular Model Definition Language (TMDL) files under star_schema/tmdl/
Native .pbix generation is intentionally not claimed; the current handoff targets Power BI-ready semantic-model artifacts.