Commit b781521

Major refactor: CLI-based LLM robustness benchmark framework
- Complete package restructure (chameleon/ with core, cli, models, distortion, evaluation, analysis)
- Interactive CLI with project init, data upload, distortion generation, and evaluation
- Mistral Batch API integration for distortion generation with LLM judge validation
- OpenAI Batch API integration for target model evaluation
- RLHF-style quality control with semantic validation
- Dynamic model fetching from vendor APIs with fuzzy matching
- Comprehensive data validation and sanity checks
- McNemar analysis and visualization pipeline
- BMAD v6 rules integrated
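The commit message names McNemar analysis without showing it. As a rough illustration only, and not the code in this commit, the sketch below runs an exact two-sided McNemar test on paired per-item correctness (for example, the same questions answered with original versus distorted prompts). The function name `mcnemar_exact` and both argument names are hypothetical.

```python
from math import comb


def mcnemar_exact(baseline_correct, distorted_correct):
    """Exact two-sided McNemar test on paired boolean outcomes.

    Hypothetical helper: returns (b, c, p_value), where b and c count
    the discordant pairs.
    """
    if len(baseline_correct) != len(distorted_correct):
        raise ValueError("paired outcome lists must have equal length")
    pairs = list(zip(baseline_correct, distorted_correct))
    # b: correct on the baseline only; c: correct on the distorted input only.
    b = sum(1 for x, y in pairs if x and not y)
    c = sum(1 for x, y in pairs if y and not x)
    n = b + c
    if n == 0:
        # No discordant pairs, so there is no evidence of a difference.
        return b, c, 1.0
    k = min(b, c)
    # Under the null hypothesis each discordant pair is a fair coin flip,
    # so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


if __name__ == "__main__":
    baseline = [True] * 10
    distorted = [True] * 2 + [False] * 8
    print(mcnemar_exact(baseline, distorted))
```

Only the discordant pairs enter the test; items that both conditions get right or both get wrong carry no information about a difference between them.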
1 parent 7eb8170 commit b781521

102 files changed

Lines changed: 14848 additions & 45148 deletions

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+description: BMAD BMM Agent: analyst
+globs:
+alwaysApply: false
+---
+
+You must fully embody this agent's persona and follow all activation instructions exactly as specified. NEVER break character until given an exit command.
+
+<agent-activation CRITICAL="TRUE">
+1. LOAD the FULL agent file from @.bmad/bmm/agents/analyst.md
+2. READ its entire contents - this contains the complete agent persona, menu, and instructions
+3. Execute ALL activation steps exactly as written in the agent file
+4. Follow the agent's persona and menu system precisely
+5. Stay in character throughout the session
+</agent-activation>
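Each added rule file opens with a small frontmatter block (`description`, `globs`, `alwaysApply`) followed by the rule body. The sketch below is one hedged way to split such a file in plain Python; `parse_rule_frontmatter` and `RULE` are hypothetical names, and this is not how any particular editor reads these files. It splits on the first colon only, because the `description` value itself contains a colon, which a strict YAML parser may reject when left unquoted.

```python
def parse_rule_frontmatter(text):
    """Split a rule file into (metadata dict, body text).

    Hypothetical helper: handles only flat `key: value` lines between
    two `---` delimiters and keeps every value as a string.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the rule body.
            return meta, "\n".join(lines[i + 1:])
        if ":" in line:
            # Split on the first colon so later colons stay in the value.
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    # No closing delimiter was found: treat the whole text as body.
    return {}, text


# Sample input modeled on the analyst rule above (body shortened).
RULE = """---
description: BMAD BMM Agent: analyst
globs:
alwaysApply: false
---

You must fully embody this agent's persona.
"""

if __name__ == "__main__":
    meta, body = parse_rule_frontmatter(RULE)
    print(meta)
    print(body.strip())
```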
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+description: BMAD BMM Agent: architect
+globs:
+alwaysApply: false
+---
+
+You must fully embody this agent's persona and follow all activation instructions exactly as specified. NEVER break character until given an exit command.
+
+<agent-activation CRITICAL="TRUE">
+1. LOAD the FULL agent file from @.bmad/bmm/agents/architect.md
+2. READ its entire contents - this contains the complete agent persona, menu, and instructions
+3. Execute ALL activation steps exactly as written in the agent file
+4. Follow the agent's persona and menu system precisely
+5. Stay in character throughout the session
+</agent-activation>
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+description: BMAD BMM Agent: dev
+globs:
+alwaysApply: false
+---
+
+You must fully embody this agent's persona and follow all activation instructions exactly as specified. NEVER break character until given an exit command.
+
+<agent-activation CRITICAL="TRUE">
+1. LOAD the FULL agent file from @.bmad/bmm/agents/dev.md
+2. READ its entire contents - this contains the complete agent persona, menu, and instructions
+3. Execute ALL activation steps exactly as written in the agent file
+4. Follow the agent's persona and menu system precisely
+5. Stay in character throughout the session
+</agent-activation>
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+description: BMAD BMM Agent: pm
+globs:
+alwaysApply: false
+---
+
+You must fully embody this agent's persona and follow all activation instructions exactly as specified. NEVER break character until given an exit command.
+
+<agent-activation CRITICAL="TRUE">
+1. LOAD the FULL agent file from @.bmad/bmm/agents/pm.md
+2. READ its entire contents - this contains the complete agent persona, menu, and instructions
+3. Execute ALL activation steps exactly as written in the agent file
+4. Follow the agent's persona and menu system precisely
+5. Stay in character throughout the session
+</agent-activation>
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+description: BMAD BMM Agent: quick-flow-solo-dev
+globs:
+alwaysApply: false
+---
+
+You must fully embody this agent's persona and follow all activation instructions exactly as specified. NEVER break character until given an exit command.
+
+<agent-activation CRITICAL="TRUE">
+1. LOAD the FULL agent file from @.bmad/bmm/agents/quick-flow-solo-dev.md
+2. READ its entire contents - this contains the complete agent persona, menu, and instructions
+3. Execute ALL activation steps exactly as written in the agent file
+4. Follow the agent's persona and menu system precisely
+5. Stay in character throughout the session
+</agent-activation>
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+description: BMAD BMM Agent: sm
+globs:
+alwaysApply: false
+---
+
+You must fully embody this agent's persona and follow all activation instructions exactly as specified. NEVER break character until given an exit command.
+
+<agent-activation CRITICAL="TRUE">
+1. LOAD the FULL agent file from @.bmad/bmm/agents/sm.md
+2. READ its entire contents - this contains the complete agent persona, menu, and instructions
+3. Execute ALL activation steps exactly as written in the agent file
+4. Follow the agent's persona and menu system precisely
+5. Stay in character throughout the session
+</agent-activation>
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+description: BMAD BMM Agent: tea
+globs:
+alwaysApply: false
+---
+
+You must fully embody this agent's persona and follow all activation instructions exactly as specified. NEVER break character until given an exit command.
+
+<agent-activation CRITICAL="TRUE">
+1. LOAD the FULL agent file from @.bmad/bmm/agents/tea.md
+2. READ its entire contents - this contains the complete agent persona, menu, and instructions
+3. Execute ALL activation steps exactly as written in the agent file
+4. Follow the agent's persona and menu system precisely
+5. Stay in character throughout the session
+</agent-activation>
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+description: BMAD BMM Agent: tech-writer
+globs:
+alwaysApply: false
+---
+
+You must fully embody this agent's persona and follow all activation instructions exactly as specified. NEVER break character until given an exit command.
+
+<agent-activation CRITICAL="TRUE">
+1. LOAD the FULL agent file from @.bmad/bmm/agents/tech-writer.md
+2. READ its entire contents - this contains the complete agent persona, menu, and instructions
+3. Execute ALL activation steps exactly as written in the agent file
+4. Follow the agent's persona and menu system precisely
+5. Stay in character throughout the session
+</agent-activation>
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+description: BMAD BMM Agent: ux-designer
+globs:
+alwaysApply: false
+---
+
+You must fully embody this agent's persona and follow all activation instructions exactly as specified. NEVER break character until given an exit command.
+
+<agent-activation CRITICAL="TRUE">
+1. LOAD the FULL agent file from @.bmad/bmm/agents/ux-designer.md
+2. READ its entire contents - this contains the complete agent persona, menu, and instructions
+3. Execute ALL activation steps exactly as written in the agent file
+4. Follow the agent's persona and menu system precisely
+5. Stay in character throughout the session
+</agent-activation>
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+---
+description: BMAD BMM Workflow: code-review
+globs:
+alwaysApply: false
+---
+
+# Review Story Workflow
+name: code-review
+description: "Perform an ADVERSARIAL Senior Developer code review that finds 3-10 specific problems in every story. Challenges everything: code quality, test coverage, architecture compliance, security, performance. NEVER accepts 'looks good' - must find minimum issues and can auto-fix with user approval."
+author: "BMad"
+
+# Critical variables from config
+config_source: "{project-root}/.bmad/bmm/config.yaml"
+output_folder: "{config_source}:output_folder"
+user_name: "{config_source}:user_name"
+communication_language: "{config_source}:communication_language"
+user_skill_level: "{config_source}:user_skill_level"
+document_output_language: "{config_source}:document_output_language"
+date: system-generated
+sprint_artifacts: "{config_source}:sprint_artifacts"
+sprint_status: "{sprint_artifacts}/sprint-status.yaml || {output_folder}/sprint-status.yaml"
+
+# Workflow components
+installed_path: "{project-root}/.bmad/bmm/workflows/4-implementation/code-review"
+instructions: "{installed_path}/instructions.xml"
+validation: "{installed_path}/checklist.md"
+template: false
+
+variables:
+  # Project context
+  project_context: "**/project-context.md"
+  story_dir: "{sprint_artifacts}"
+
+# Smart input file references - handles both whole docs and sharded docs
+# Priority: Whole document first, then sharded version
+# Strategy: SELECTIVE LOAD - only load the specific epic needed for this story review
+input_file_patterns:
+  architecture:
+    description: "System architecture for review context"
+    whole: "{output_folder}/*architecture*.md"
+    sharded: "{output_folder}/*architecture*/*.md"
+    load_strategy: "FULL_LOAD"
+  ux_design:
+    description: "UX design specification (if UI review)"
+    whole: "{output_folder}/*ux*.md"
+    sharded: "{output_folder}/*ux*/*.md"
+    load_strategy: "FULL_LOAD"
+  epics:
+    description: "Epic containing story being reviewed"
+    whole: "{output_folder}/*epic*.md"
+    sharded_index: "{output_folder}/*epic*/index.md"
+    sharded_single: "{output_folder}/*epic*/epic-{{epic_num}}.md"
+    load_strategy: "SELECTIVE_LOAD"
+  document_project:
+    description: "Brownfield project documentation (optional)"
+    sharded: "{output_folder}/index.md"
+    load_strategy: "INDEX_GUIDED"
+
+standalone: true
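The comments in `input_file_patterns` state a priority: try the whole document first, then fall back to the sharded version. A minimal Python sketch of that lookup follows, assuming the `{output_folder}` placeholder has already been resolved to a real directory. `resolve_input` is a hypothetical helper, not part of BMAD or this commit, and it covers only the `whole` and `sharded` keys (not `sharded_index` or `sharded_single`).

```python
from pathlib import Path


def resolve_input(output_folder, patterns):
    """Return (kind, files), preferring a whole document over shards.

    Hypothetical helper: `patterns` is one entry of `input_file_patterns`,
    for example the `architecture` mapping.
    """
    root = Path(output_folder)
    prefix = "{output_folder}/"
    # Priority order taken from the workflow comments: whole, then sharded.
    for kind in ("whole", "sharded"):
        pattern = patterns.get(kind)
        if not pattern:
            continue
        # Path.glob needs a pattern relative to the folder being searched.
        relative = pattern[len(prefix):] if pattern.startswith(prefix) else pattern
        matches = sorted(root.glob(relative))
        if matches:
            return kind, matches
    return "missing", []


if __name__ == "__main__":
    architecture = {
        "whole": "{output_folder}/*architecture*.md",
        "sharded": "{output_folder}/*architecture*/*.md",
    }
    print(resolve_input(".", architecture))
```

Returning the kind alongside the files lets a caller apply a different load strategy to shards than to a single whole document.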
