Add Agent TOML & YAML files + README

DannylSyph3r · DannylSyph3r · commit a3686a7d984a · 2025-10-16T12:01:31.000+01:00
diff --git a/agents/undertaker/README.md b/agents/undertaker/README.md
@@ -0,0 +1,189 @@
+# The Undertaker - Dead Code Detection Agent
+
+Find unused functions, classes, variables, imports, and unreachable code with confidence scoring across your repository.
+
+## Overview
+
+The Undertaker is a dead code detection agent that uses static analysis to identify unused code elements. It assigns each finding a deterministic confidence score so you can remove dead code safely while keeping false positives low.
+
+## Features
+
+- **Comprehensive Detection**: Identifies unused functions, classes, methods, variables, imports, types, enums, and unreachable code
+- **Multi-Language Support**: Analyzes multiple programming languages
+- **Confidence Scoring**: Deterministic scoring (50-100%) based on reference counts and export status
+- **Safe Analysis**: Read-only static analysis that never modifies your code
+- **Unreachable Code Detection**: Finds code after return/throw/break statements
+- **Export-Aware**: Distinguishes between private and exported/public code
+- **Detailed Reporting**: JSON output with actionable findings and reasoning
+
+## Quick Start
+
+### Basic Usage
+
+```bash
+# Run default analysis with 70% confidence threshold
+qodo undertaker
+
+# Use custom confidence threshold
+qodo undertaker --min_confidence=80
+
+# Include test files in analysis
+qodo undertaker --include_tests=true
+
+# Use both options together
+qodo undertaker --min_confidence=85 --include_tests=true
+```
+
+## Configuration
+
+The agent accepts the following parameters:
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `min_confidence` | number | 70 | Minimum confidence threshold (50-100). Only results meeting or exceeding this threshold are included. |
+| `include_tests` | boolean | false | Whether to include test files in the dead code analysis. |
+
+## How It Works
+
+### Analysis Process
+
+1. **Project Discovery**: Scans source files while excluding generated and vendor directories
+2. **Definition Detection**: Identifies function/class/variable/import definitions using language-specific patterns
+3. **Reference Counting**: Counts actual usage of each identifier across the codebase (excluding its definition)
+4. **Export Analysis**: Determines whether code is exported/public, which affects confidence scoring
+5. **Unreachable Code Detection**: Finds code after terminating statements (return/throw/break)
+6. **Confidence Scoring**: Applies deterministic rules to generate confidence scores
+
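The reference-counting step (step 3) can be illustrated with a minimal Python sketch. It assumes a whole-word regex search over in-memory source text; the agent itself is described as using ripgrep, and the helper name `count_references` and the sample files are hypothetical.

```python
import re

def count_references(identifier, sources, definition):
    """Count whole-word occurrences of `identifier`, skipping its definition line.

    sources: dict mapping a file path to that file's text.
    definition: (path, 1-based line number) where the identifier is defined.
    """
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    total = 0
    for path, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if (path, lineno) == definition:
                continue  # the definition itself is not a usage
            total += len(pattern.findall(line))
    return total

# Hypothetical two-file project used only for illustration.
sources = {
    "utils.py": "def used():\n    return 1\n\ndef unused():\n    return 2\n",
    "main.py": "from utils import used\n\nprint(used())\n",
}

used_refs = count_references("used", sources, ("utils.py", 1))
unused_refs = count_references("unused", sources, ("utils.py", 4))
print(used_refs, unused_refs)
```

The word boundaries keep `used` from matching inside `unused`. Note that this sketch counts the import line as a reference; whether the agent does the same is not stated in the README.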
+### Confidence Scoring
+
+The agent uses the following rules to calculate confidence scores:
+
+| Condition | Confidence | Tier |
+|-----------|-----------|------|
+| No references + not exported | 100% | Very High |
+| No references + exported | 90% | Very High |
+| 1 reference + not exported | 75% | High |
+| 1 reference + exported | 70% | High |
+| 2+ references | 50-60% | Medium |
+| Unreachable code | 100% | Very High |
+
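The table above can be read as a small decision function. The following Python sketch is one way to encode it; the function names are hypothetical, and returning a flat 60 for two or more references is an assumption, since the table only gives an upper bound for that row.

```python
def confidence_score(reference_count, is_exported, unreachable=False):
    """Map the README's scoring rules to a number between 50 and 100."""
    if unreachable:
        return 100  # unreachable code is always scored 100
    if reference_count == 0:
        return 90 if is_exported else 100
    if reference_count == 1:
        return 70 if is_exported else 75
    # Assumption: the table only says "60% or lower" here, so use the upper bound.
    return 60

def confidence_tier(score):
    """Map a score to the tier names used in `confidence_counts`."""
    if score >= 90:
        return "very_high"
    if score >= 70:
        return "high"
    return "medium"

print(confidence_score(0, True), confidence_tier(confidence_score(0, True)))
```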
+## Output Format
+
+The agent writes its results to `dead_code_analysis.json` and returns the same JSON object, which has the following structure:
+
+```json
+{
+  "summary": {
+    "total_files_scanned": 42,
+    "total_dead_code_items": 5,
+    "confidence_counts": {
+      "very_high": 3,
+      "high": 1,
+      "medium": 1
+    },
+    "estimated_lines_removable": 127
+  },
+  "dead_code_items": [
+    {
+      "identifier": "unusedFunction",
+      "type": "function",
+      "location": "src/utils.ts:42-55",
+      "confidence_score": 100,
+      "reference_count": 0,
+      "is_exported": false,
+      "reasoning": "Function is not referenced anywhere in the codebase and is not exported"
+    }
+  ],
+  "warnings": [],
+  "success": true
+}
+```
+
+## Interpreting Results
+
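+- **Very High Confidence (90-100%)**: Strong candidates for removal. These elements have no references or are unreachable; for exported items (90%), confirm that no external consumer depends on them.
+- **High Confidence (70-89%)**: Likely safe to remove. These elements have exactly one reference outside their definition.
+- **Medium Confidence (50-69%)**: Exercise caution. These elements have two or more references but may still be dead code. They are excluded by the default `min_confidence=70`, so lower the threshold to see them, and review before removing.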
+
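As a rough illustration of how the tiers relate to the report, the Python sketch below recomputes `confidence_counts` from the individual items. The report content is made-up sample data shaped like the Output Format example, and the helper names are hypothetical.

```python
import json
from collections import Counter

# Made-up sample shaped like the documented output; not real agent output.
report = json.loads("""
{
  "dead_code_items": [
    {"identifier": "unusedFunction", "confidence_score": 100},
    {"identifier": "rarelyUsed", "confidence_score": 75},
    {"identifier": "maybeDead", "confidence_score": 60}
  ]
}
""")

def tier_for(score):
    """Tier boundaries as documented: 90-100, 70-89, 50-69."""
    if score >= 90:
        return "very_high"
    if score >= 70:
        return "high"
    return "medium"

def recount_tiers(report):
    """Rebuild the per-tier counts from the item list."""
    counts = Counter(tier_for(item["confidence_score"]) for item in report["dead_code_items"])
    return {tier: counts.get(tier, 0) for tier in ("very_high", "high", "medium")}

tier_counts = recount_tiers(report)
print(tier_counts)
```

Comparing this recount against `summary.confidence_counts` is a cheap sanity check on a report before acting on it.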
+## Use Cases
+
+### Clean Up Your Codebase
+Remove unused code that accumulates over time as features are refactored or deprecated.
+
+### Pre-Refactoring Analysis
+Identify what can be safely removed before major refactoring efforts.
+
+### Code Review
+Use in your CI/CD pipeline to flag potential dead code during code reviews.
+
+### Dependency Reduction
+Identify unused exports that can be made private or removed entirely.
+
+## Tools Used
+
+- **Git**: Version control operations
+- **Filesystem**: Directory and file traversal
+- **Ripgrep**: Efficient pattern matching and searching
+
+## Error Handling
+
+The agent handles errors gracefully:
+- If tools fail, analysis continues with warnings
+- Falls back to filesystem reading if pattern matching fails
+- Returns `success: true` with warnings rather than failing entirely
+- Handles cross-platform compatibility issues automatically
+
+## Technical Details
+
+The agent uses ripgrep for efficient repository-wide searching with language-specific patterns. It filters out comments and strings to minimize false positives and deduplicates results for accuracy.
+
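To show why filtering comments and strings matters, here is a minimal Python sketch that blanks them out before counting references. It assumes C-style syntax (`//`, `/* */`, quoted strings) and is only an approximation; the agent's actual filtering is not specified, and `strip_comments_and_strings` is a hypothetical name.

```python
import re

# Strings are listed before comments so that "//" inside a string is not
# mistaken for the start of a comment.
TOKEN = re.compile(r'''
    "(?:\\.|[^"\\])*"      # double-quoted string
  | '(?:\\.|[^'\\])*'      # single-quoted string
  | //[^\n]*               # line comment
  | /\*.*?\*/              # block comment
''', re.VERBOSE | re.DOTALL)

def strip_comments_and_strings(text):
    """Replace every string literal and comment with a single space."""
    return TOKEN.sub(" ", text)

def count_word(identifier, text):
    return len(re.findall(r"\b" + re.escape(identifier) + r"\b", text))

code = 'call(oldHelper); // oldHelper is legacy\nlog("oldHelper failed");\n'

raw_count = count_word("oldHelper", code)
filtered_count = count_word("oldHelper", strip_comments_and_strings(code))
print(raw_count, filtered_count)
```

Without filtering, the mentions in the comment and the log message would inflate the reference count and lower the confidence score.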
+## Limitations
+
+- Static analysis only - cannot detect runtime dead code
+- May produce false positives if code is referenced dynamically (for example through reflection or string-based lookups)
+- External library references may be missed if not directly imported in source files
+- Consider running multiple times with different `min_confidence` values for comprehensive analysis
+
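The dynamic-reference limitation is easiest to see with a small example. In this Python sketch (all names hypothetical), a handler is called through a string lookup, so a textual search finds no reference to it even though it runs.

```python
import re

# Hypothetical module source: handlers are selected by name at runtime.
SOURCE = '''
def handle_create():
    return "created"

def handle_delete():
    return "deleted"

def dispatch(action):
    return globals()["handle_" + action]()
'''

def textual_references(identifier, text):
    """Whole-word occurrences of the identifier in the source text."""
    return len(re.findall(r"\b" + re.escape(identifier) + r"\b", text))

namespace = {}
exec(SOURCE, namespace)  # load the sample module into its own namespace

# Subtract one for the definition itself.
static_refs = textual_references("handle_delete", SOURCE) - 1
dynamic_result = namespace["dispatch"]("delete")
print(static_refs, dynamic_result)
```

A reference count of zero would score `handle_delete` at 100% under the rules above, which is why high-confidence findings in code that uses this kind of dispatch still deserve a manual check.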
+## Examples
+
+### Example 1: Basic Analysis
+
+```bash
+qodo undertaker
+```
+
+Scans the repository and returns all dead code with confidence >= 70%.
+
+### Example 2: High Confidence Only
+
+```bash
+qodo undertaker --min_confidence=90
+```
+
+Returns only the most reliable dead code detections (90-100% confidence).
+
+### Example 3: Including Tests
+
+```bash
+qodo undertaker --include_tests=true
+```
+
+Includes test files in the analysis, which is useful for identifying unused test helpers or fixtures.
+
+## Integration
+
+The JSON output can be easily integrated into:
+- CI/CD pipelines for automated reporting
+- Code review tools
+- Custom analysis scripts
+
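As one possible shape for a CI integration, the Python sketch below turns a report into a pass/fail exit code. The `gate` function and its `max_very_high` parameter are hypothetical; only the report fields come from the documented output format.

```python
def gate(report, max_very_high=0):
    """Return a process exit code: 0 to pass, 1 to fail.

    Fails when the analysis did not succeed or when more than
    `max_very_high` items score 90 or above.
    """
    if not report.get("success", False):
        return 1
    very_high = [item for item in report["dead_code_items"] if item["confidence_score"] >= 90]
    for item in very_high:
        print(f'{item["location"]}: {item["identifier"]} ({item["confidence_score"]}%)')
    return 1 if len(very_high) > max_very_high else 0

# Made-up sample shaped like the documented output.
sample = {
    "success": True,
    "dead_code_items": [
        {"identifier": "unusedFunction", "location": "src/utils.ts:42-55", "confidence_score": 100},
    ],
}
exit_code = gate(sample)
print(exit_code)
```

In a pipeline, the report would typically be loaded with `json.load` from the file the agent writes and the result passed to `sys.exit`.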
+## Best Practices
+
+1. Start with the default `min_confidence=70` threshold
+2. Review "Very High" confidence items first as candidates for removal
+3. Use version control to safely remove dead code in isolated commits
+4. Run the agent periodically to maintain code quality
+
+## Support
+
+For issues or questions about the Undertaker agent, please refer to the project documentation or create an issue in the repository.
diff --git a/agents/undertaker/agent.toml b/agents/undertaker/agent.toml
@@ -0,0 +1,164 @@
+# The Undertaker - Dead Code Detection Agent
+version = "1.0"
+
+[commands.undertaker]
+description = "Reliable dead code detection agent that identifies unused functions, classes, variables, imports, and unreachable code with confidence scoring"
+
+instructions = """
+You are a dead code detection agent. Find unused code elements in the repository using static analysis.
+
+CORE MISSION:
+Identify unused functions, classes, variables, imports, types, enums, methods, and unreachable code.
+Provide confidence scoring based on reference counts and export status.
+Be reliable and dependable within reasonable limits.
+Output clean JSON results for integration.
+
+ANALYSIS PROCESS:
+1. Project Discovery: Scan source files, excluding generated/vendor folders, and skip test files unless include_tests is true. Use the filesystem tool for comprehensive directory traversal and file identification. Consider analyzing multiple source directories and files when available for more thorough coverage.
+2. Definition Detection: Find function/class/variable/import definitions using language-specific patterns. Use ripgrep for efficient pattern matching across multiple files. Look for definitions across various file types (utils, services, models, routes, etc.).
+3. Reference Counting: For each identifier, count actual usage across the codebase (excluding its definition). Use ripgrep with precise patterns to count references and exclude false positives.
+4. Export Analysis: Detect if code is exported/public, which affects confidence scoring.
+5. Unreachable Code: Find code after return/throw/break statements in the same block.
+6. Confidence Scoring: Apply deterministic confidence rules based on usage patterns.
+
+CONFIDENCE RULES:
+No references + not exported = 100% confidence (very_high: 90-100).
+No references + exported = 90% confidence (very_high: 90-100).
+1 reference + not exported = 75% confidence (high: 70-89).
+1 reference + exported = 70% confidence (high: 70-89).
+2+ references = 60% or lower (medium: 50-69).
+Unreachable code = 100% confidence (very_high: 90-100).
+
+IMPORTANT: When calculating summary counts, ensure confidence_tier matches confidence_score ranges:
+very_high: scores 90-100
+high: scores 70-89
+medium: scores 50-69
+
+TECHNICAL REQUIREMENTS:
+Use ripgrep for efficient repository-wide searching.
+Filter out comments, strings, and false positives.
+Handle multiple programming languages with appropriate patterns.
+Deduplicate results and merge when appropriate.
+Only include items with confidence >= min_confidence threshold.
+
+OUTPUT REQUIREMENTS:
+First, you MUST use the filesystem tool to write the final JSON output to a file named 'dead_code_analysis.json'.
+Second, you MUST return the same valid JSON matching the output_schema to standard output.
+Include summary statistics and detailed findings.
+After writing the file and returning the JSON, stop all operations. Do not perform any additional analysis or processing.
+
+ERROR HANDLING:
+If tools fail, add warnings but continue analysis.
+Fall back to filesystem reading if ripgrep patterns fail.
+Mark success=true with warnings rather than failing entirely.
+Handle cross-platform compatibility issues gracefully.
+
+FOCUS ON RELIABILITY:
+Prioritize working correctly over handling every edge case.
+Use simple, proven patterns over complex regex.
+Provide useful results even with partial data.
+Be dependable for common dead code scenarios.
+"""
+
+# Arguments for customizing the analysis
+arguments = [
+  { name = "min_confidence", type = "number", required = false, default = 70, description = "Minimum confidence threshold (50-100)" },
+  { name = "include_tests", type = "boolean", required = false, default = false, description = "Whether to include test files in analysis" }
+]
+
+# Tools the agent can use
+tools = ["git", "filesystem", "ripgrep"]
+
+# Use plan strategy for multi-step analysis
+execution_strategy = "plan"
+
+# Simplified but comprehensive output schema
+output_schema = """
+{
+  "type": "object",
+  "required": ["summary", "dead_code_items", "success"],
+  "properties": {
+    "summary": {
+      "type": "object",
+      "description": "Summary statistics of the dead code analysis.",
+      "required": ["total_files_scanned", "total_dead_code_items", "confidence_counts", "estimated_lines_removable"],
+      "properties": {
+        "total_files_scanned": {
+          "type": "number",
+          "description": "Number of source files analyzed."
+        },
+        "total_dead_code_items": {
+          "type": "number",
+          "description": "Total dead code items found."
+        },
+        "confidence_counts": {
+          "type": "object",
+          "description": "Breakdown of dead code items by confidence tier.",
+          "properties": {
+            "very_high": { "type": "number" },
+            "high": { "type": "number" },
+            "medium": { "type": "number" }
+          }
+        },
+        "estimated_lines_removable": {
+          "type": "number",
+          "description": "Estimated lines that can be safely removed."
+        }
+      }
+    },
+    "dead_code_items": {
+      "type": "array",
+      "description": "List of dead code items found, sorted by confidence.",
+      "items": {
+        "type": "object",
+        "required": ["identifier", "type", "location", "confidence_score", "reference_count", "is_exported", "reasoning"],
+        "properties": {
+          "identifier": {
+            "type": "string",
+            "description": "Name of the code element."
+          },
+          "type": {
+            "type": "string",
+            "enum": ["function", "class", "method", "variable", "interface", "type", "enum", "import", "unreachable_code", "file"],
+            "description": "Type of code element."
+          },
+          "location": {
+            "type": "string",
+            "description": "File path and line range, e.g., 'src/utils.js:10-15'."
+          },
+          "confidence_score": {
+            "type": "number",
+            "minimum": 50,
+            "maximum": 100,
+            "description": "Confidence this is dead code (50-100)."
+          },
+          "reference_count": {
+            "type": "number",
+            "description": "Number of times referenced in the codebase (excluding definition)."
+          },
+          "is_exported": {
+            "type": "boolean",
+            "description": "Whether the code is exported/public."
+          },
+          "reasoning": {
+            "type": "string",
+            "description": "Explanation for why this is considered dead code."
+          }
+        }
+      }
+    },
+    "warnings": {
+      "type": "array",
+      "items": { "type": "string" },
+      "description": "Warnings encountered during analysis."
+    },
+    "success": {
+      "type": "boolean",
+      "description": "Whether the analysis completed successfully."
+    }
+  }
+}
+"""
+
+# Success condition for CI/CD integration
+exit_expression = "success"
diff --git a/agents/undertaker/agent.yaml b/agents/undertaker/agent.yaml