Refactor Initiative Analyzer to Use Initiative-Centric Processing #56

@PlagueHO

Technical Debt: Architectural Improvement Opportunity

Current State

The initiative analyzer currently uses an item-centric approach where each backlog item is individually analyzed against all initiatives. This creates several inefficiencies:

  • API Inefficiency: N API calls for N backlog items
  • Context Dilution: LLM must consider all initiatives for each single item
  • Suboptimal Goal Alignment: Process doesn't directly match the stated goal of "enriching initiatives with relevant backlog items"
  • Limited Batch Analysis: No opportunity for comparative analysis within initiative context

Proposed Solution

Implement an initiative-centric approach with the following architecture:

Core Algorithm Change

# Current: Item → Initiatives mapping (one API call per backlog item)
for backlog_item in backlog_items:
    best_initiative = analyze_against_all_initiatives(backlog_item)

# Proposed: Initiative → Items mapping (one API call per initiative per chunk)
for initiative in initiatives:
    relevant_items = []  # accumulate matches across chunks instead of overwriting
    for chunk in chunk_backlog_items(backlog_items, size=20):
        relevant_items.extend(analyze_chunk_for_initiative(initiative, chunk))
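
The batching helper named in the loop above could look roughly like the following. This is a minimal sketch, assuming backlog items are held in an in-memory sequence; the type signature and the error handling are illustrative choices, not part of the proposal.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunk_backlog_items(items: Sequence[T], size: int = 20) -> Iterator[list[T]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        # The final slice may be shorter than `size`.
        yield list(items[start:start + size])


# Usage: 45 items with size=20 produce chunks of 20, 20 and 5.
chunks = list(chunk_backlog_items(list(range(45)), size=20))
print([len(c) for c in chunks])  # [20, 20, 5]
```

Because the function is a generator, chunks are produced lazily and the `size` check runs when iteration starts.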

Benefits

  1. Strategic Alignment: Directly matches goal of enriching initiatives
  2. API Efficiency: Reduces API calls by ~80% through batching (an estimate; the actual saving depends on the number of initiatives and the chunk size)
  3. Enhanced Context: LLM focuses on single initiative context
  4. Richer Analysis: Comparative analysis within initiative scope
  5. Better Token Utilization: Optimal use of context windows
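
To make the API efficiency claim concrete, the call counts of the two approaches can be compared directly. The sketch below assumes one API call per backlog item in the current approach and one call per initiative per chunk in the proposed one; the function names are hypothetical.

```python
import math


def item_centric_calls(num_items: int) -> int:
    """Current approach: one call per backlog item."""
    return num_items


def initiative_centric_calls(num_items: int, num_initiatives: int, chunk_size: int = 20) -> int:
    """Proposed approach: one call per (initiative, chunk) pair."""
    return num_initiatives * math.ceil(num_items / chunk_size)


def call_reduction(num_items: int, num_initiatives: int, chunk_size: int = 20) -> float:
    """Fraction of calls saved; negative if the proposed approach needs more calls."""
    before = item_centric_calls(num_items)
    after = initiative_centric_calls(num_items, num_initiatives, chunk_size)
    return 1 - after / before


print(item_centric_calls(200))            # 200
print(initiative_centric_calls(200, 4))   # 40
print(initiative_centric_calls(200, 25))  # 250
```

Under these assumptions, a chunk size of 20 yields the ~80% saving when there are about 4 initiatives. Once the number of initiatives exceeds the chunk size, the proposed approach makes more calls than the current one, so the chunk size is worth tuning against the real initiative count.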

Implementation Tasks

  • Create chunk_backlog_items() function for batching
  • Implement analyze_chunk_for_initiative() with structured output
  • Design new system prompt for initiative-centric analysis
  • Update data models to handle batch results
  • Implement result aggregation and deduplication
  • Add comprehensive testing for the new approach
  • Benchmark performance against the current implementation
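
The aggregation and deduplication task could be approached as sketched below. This assumes each chunk result is a list of dictionaries keyed like the structured output (`backlog_item_title`, `relevance_score`), that duplicates are identified by title, and that the highest-scoring entry wins; all three are assumptions for illustration, and `aggregate_results` is a hypothetical name.

```python
from typing import Iterable


def aggregate_results(chunk_results: Iterable[list[dict]]) -> list[dict]:
    """Merge per-chunk results for one initiative, keeping the best entry per title."""
    best: dict[str, dict] = {}
    for result in chunk_results:
        for item in result:
            title = item["backlog_item_title"]
            current = best.get(title)
            # Keep the entry with the higher relevance score when titles collide.
            if current is None or item["relevance_score"] > current["relevance_score"]:
                best[title] = item
    # Most relevant items first.
    return sorted(best.values(), key=lambda i: i["relevance_score"], reverse=True)


chunk_a = [
    {"backlog_item_title": "Add SSO", "relevance_score": 70},
    {"backlog_item_title": "Fix login bug", "relevance_score": 40},
]
chunk_b = [{"backlog_item_title": "Add SSO", "relevance_score": 85}]

merged = aggregate_results([chunk_a, chunk_b])
print([(i["backlog_item_title"], i["relevance_score"]) for i in merged])
# [('Add SSO', 85), ('Fix login bug', 40)]
```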

JSON Schema for Structured Output

{
  "type": "object",
  "properties": {
    "relevant_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "backlog_item_title": {"type": "string"},
          "relevance_score": {"type": "integer", "minimum": 0, "maximum": 100},
          "impact_analysis": {"type": "string"},
          "strategic_value": {"type": "string"},
          "implementation_synergies": {"type": "string"},
          "confidence_reasoning": {"type": "string"}
        }
      }
    }
  }
}
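
A response shaped like the schema above could be parsed and checked with the standard library alone, as in this sketch. It assumes the model returns a JSON string, and it treats `backlog_item_title` and `relevance_score` as required even though the schema does not mark any field as required; `RelevantItem` and `parse_relevant_items` are hypothetical names.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RelevantItem:
    backlog_item_title: str
    relevance_score: int
    impact_analysis: str = ""
    strategic_value: str = ""
    implementation_synergies: str = ""
    confidence_reasoning: str = ""


_KNOWN_FIELDS = {f.name for f in fields(RelevantItem)}


def parse_relevant_items(raw: str) -> list[RelevantItem]:
    """Parse a JSON response and enforce the schema's 0-100 score range."""
    payload = json.loads(raw)
    items = []
    for entry in payload.get("relevant_items", []):
        title = entry.get("backlog_item_title")
        score = entry.get("relevance_score")
        if not isinstance(title, str):
            raise ValueError("backlog_item_title must be a string")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
            raise ValueError(f"relevance_score out of range for {title!r}: {score!r}")
        # Ignore keys the schema does not define.
        items.append(RelevantItem(**{k: v for k, v in entry.items() if k in _KNOWN_FIELDS}))
    return items


raw = '{"relevant_items": [{"backlog_item_title": "Add SSO", "relevance_score": 85}]}'
print(parse_relevant_items(raw)[0].relevance_score)  # 85
```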

Risk Mitigation

  • Implement both approaches initially for A/B comparison
  • Ensure backward compatibility with existing CSV formats
  • Add configuration flag to switch between approaches
  • Test comprehensively with real data sets
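
The configuration flag for switching between approaches might be modeled as an enum with a small dispatcher, as sketched here. The mode names, the item-centric default (chosen to preserve existing behavior), and the analyzer callables are all assumptions for illustration.

```python
from enum import Enum
from typing import Callable, Mapping, Optional


class AnalysisMode(Enum):
    ITEM_CENTRIC = "item-centric"
    INITIATIVE_CENTRIC = "initiative-centric"


def parse_mode(value: Optional[str],
               default: AnalysisMode = AnalysisMode.ITEM_CENTRIC) -> AnalysisMode:
    """Map a config string to a mode; fall back to the default when unset."""
    if value is None or not value.strip():
        return default
    # Enum lookup by value raises ValueError for unknown modes.
    return AnalysisMode(value.strip().lower())


def run_analysis(mode: AnalysisMode,
                 analyzers: Mapping[AnalysisMode, Callable[[list, list], object]],
                 backlog_items: list,
                 initiatives: list) -> object:
    """Dispatch to whichever analyzer is registered for the selected mode."""
    return analyzers[mode](backlog_items, initiatives)


# Usage with stand-in analyzers; real ones would call the LLM.
analyzers = {
    AnalysisMode.ITEM_CENTRIC: lambda items, inits: "ran item-centric",
    AnalysisMode.INITIATIVE_CENTRIC: lambda items, inits: "ran initiative-centric",
}
print(run_analysis(parse_mode("initiative-centric"), analyzers, [], []))
# ran initiative-centric
```

Registering both analyzers behind one dispatcher also supports the A/B comparison: the same inputs can be run through each mode and the outputs compared.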

Impact Assessment

  • Performance: ~80% reduction in API calls (estimate, to be confirmed by benchmarking)
  • Quality: More focused, context-aware analysis
  • Maintainability: Clearer separation of concerns
  • Scalability: Better handling of large datasets

Implementation Priority

High - This change significantly improves both efficiency and analytical quality, directly supporting the primary use case.
