
Critical Fix: Token Limit Error

Problem Identified

The agent was executing tools successfully but hitting a token limit error from the LLM API:

"Requested token count exceeds the model's maximum context length of 8192 tokens. 
You requested a total of 8478 tokens"

Root Cause

  1. Verbose prompt template - The original prompt ran to roughly 1,000 tokens of detailed instructions
  2. Multiple LLM calls - Each tool call appends another Thought/Action/Observation exchange to the context window
  3. Context accumulation - By the 3rd-4th iteration, the accumulated conversation history exceeded 8192 tokens
  4. High max_iterations - max_iterations=30 permitted far more loops than the 6 tools require, making context overflow inevitable
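The accumulation can be sketched with a back-of-the-envelope model. All numbers here are illustrative assumptions, not measurements from the service; the point is only that a fixed per-iteration cost on top of a large base prompt exhausts an 8192-token window within a few iterations:

```python
# Rough context-growth model. PROMPT_TOKENS and PER_ITERATION are
# illustrative guesses, not measured values from resume_agent_service.py.
CONTEXT_LIMIT = 8192
PROMPT_TOKENS = 1000      # the original verbose template (~1000 tokens)
PER_ITERATION = 2100      # assumed Thought/Action/Observation cost per tool call

def tokens_after(iterations: int) -> int:
    """Approximate context size after N ReAct iterations."""
    return PROMPT_TOKENS + iterations * PER_ITERATION

# Under these assumptions the 8192-token window overflows on iteration 4:
overflow_at = next(i for i in range(1, 31) if tokens_after(i) > CONTEXT_LIMIT)
```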

Solution Applied

File: backend/app/services/agents/resume_agent_service.py

Change 1: Simplified Prompt Template

Reduced from ~1000 tokens to ~200 tokens:

# BEFORE: Very verbose with detailed instructions
template = """You are a professional career coach and resume analyst...
[1000+ tokens of detailed instructions]
"""

# AFTER: Concise and direct
template = """Analyze resume and return JSON with: skill_inventory, experience_timeline, skill_gaps, improvement_roadmap.

Tools: {tools}

Format:
Question: {input}
Thought: [your reasoning]
Action: [tool name]
Action Input: [input]
Observation: [result]
Final Answer: [JSON only]

CRITICAL: 
- Use tools in order: resume_parser → skill_extractor → experience_analyzer → skill_gap_analyzer → roadmap_generator → analysis_formatter
- Final Answer MUST be ONLY valid JSON, no other text
- Copy JSON from analysis_formatter directly

{agent_scratchpad}"""
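A quick way to sanity-check a template's size before deploying is the common ~4 characters per token heuristic for English text. This is a rough approximation only; the exact count depends on the model's tokenizer:

```python
def approx_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English prose.
    A real tokenizer will differ, but this is close enough for budgeting."""
    return max(1, len(text) // 4)

# e.g. a ~4000-character verbose template lands near 1000 tokens,
# while an ~800-character template lands near 200.
```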

Change 2: Optimized Iteration Limit

# BEFORE: Too high, causes context overflow
max_iterations=30

# AFTER: Just enough for 6 tools + buffer
max_iterations=8

Why This Works

  • Shorter prompt = fewer tokens per LLM call
  • Fewer iterations = less context accumulation
  • Clear instructions = agent doesn't need verbose explanations
  • 8 iterations = 6 tools + 2 buffer for final answer
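The iteration budget in the last bullet can be derived directly from the tool list (a sketch; the tool names are the six from the prompt above, and the buffer size is the fix's chosen allowance for the final-answer turn plus one spare):

```python
# Derive max_iterations from the pipeline rather than hard-coding a large number.
TOOLS = [
    "resume_parser", "skill_extractor", "experience_analyzer",
    "skill_gap_analyzer", "roadmap_generator", "analysis_formatter",
]
BUFFER = 2  # one turn for the Final Answer plus one spare for a retry

max_iterations = len(TOOLS) + BUFFER  # 6 tools + 2 buffer = 8
```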

Expected Behavior After Fix

  1. Agent executes all 6 tools without token overflow
  2. Agent completes within 8 iterations
  3. Agent returns valid JSON
  4. Analysis stored with status = success
  5. Frontend displays AI analysis

Testing

  1. Restart backend
  2. Upload resume
  3. Check backend logs for:
    • NO "token count exceeds" errors
    • All 6 tools executing
    • "Agent output successfully validated and parsed"
  4. Check database: status = success
  5. Frontend displays analysis

Files Modified

  • backend/app/services/agents/resume_agent_service.py
    • _get_prompt_template() - Simplified prompt (reduced from ~1000 to ~200 tokens)
    • _execute_agent_analysis() - Reduced max_iterations from 30 to 8