The agent was executing tools successfully but hitting a token limit error from the LLM API:
"Requested token count exceeds the model's maximum context length of 8192 tokens.
You requested a total of 8478 tokens"
- Verbose prompt template - The original prompt was very long with detailed instructions
- Multiple LLM calls - Each tool call adds to the context window
- Context accumulation - By the 3rd-4th iteration, the conversation history exceeded 8192 tokens
- High max_iterations - Setting max_iterations=30 allowed too many loops, causing context overflow
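The accumulation can be sanity-checked with a back-of-envelope calculation. The per-iteration token cost below is an illustrative assumption (resume-parser observations are large), not a measured value:

```python
# Back-of-envelope estimate of when the ReAct loop overflows the 8192-token
# context window. TOKENS_PER_ITERATION is an assumed figure, not measured.
PROMPT_TOKENS = 1000         # verbose template (~1000 tokens, per the notes)
TOKENS_PER_ITERATION = 2000  # Thought + Action + Observation per loop (assumed)
LIMIT = 8192

def first_overflow_iteration(prompt_tokens: int, per_iter: int, limit: int) -> int:
    """Return the first iteration whose request would exceed the context limit."""
    i = 1
    while prompt_tokens + i * per_iter <= limit:
        i += 1
    return i

print(first_overflow_iteration(PROMPT_TOKENS, TOKENS_PER_ITERATION, LIMIT))  # → 4
```

With these assumptions the 4th call overflows (1000 + 4 × 2000 = 9000 > 8192), which lines up with the failure appearing around the 3rd–4th iteration.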
File: backend/app/services/agents/resume_agent_service.py
Reduced from ~1000 tokens to ~200 tokens:
```python
# BEFORE: Very verbose with detailed instructions
template = """You are a professional career coach and resume analyst...
[1000+ tokens of detailed instructions]
"""
```

```python
# AFTER: Concise and direct
template = """Analyze resume and return JSON with: skill_inventory, experience_timeline, skill_gaps, improvement_roadmap.
Tools: {tools}
Format:
Question: {input}
Thought: [your reasoning]
Action: [tool name]
Action Input: [input]
Observation: [result]
Final Answer: [JSON only]
CRITICAL:
- Use tools in order: resume_parser → skill_extractor → experience_analyzer → skill_gap_analyzer → roadmap_generator → analysis_formatter
- Final Answer MUST be ONLY valid JSON, no other text
- Copy JSON from analysis_formatter directly
{agent_scratchpad}"""
```

```python
# BEFORE: Too high, causes context overflow
max_iterations=30

# AFTER: Just enough for 6 tools + buffer
max_iterations=8
```

- Shorter prompt = fewer tokens per LLM call
- Fewer iterations = less context accumulation
- Clear instructions = agent doesn't need verbose explanations
- 8 iterations = 6 tools + 2 buffer for final answer
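The iteration budget can be illustrated with a toy ReAct-style loop. This is a stand-in sketch, not the framework's actual executor: each iteration appends to the scratchpad, so capping iterations caps context growth, and 6 tool calls plus one final-answer turn fit comfortably in 8.

```python
# Toy ReAct-style loop (a sketch, not the real agent executor): capping
# iterations caps how much the scratchpad - and hence the context - can grow.
TOOL_ORDER = [
    "resume_parser", "skill_extractor", "experience_analyzer",
    "skill_gap_analyzer", "roadmap_generator", "analysis_formatter",
]

def run_agent(llm_step, max_iterations=8):
    scratchpad = []
    for _ in range(max_iterations):
        step = llm_step(scratchpad)  # stand-in for one LLM call
        if "final_answer" in step:
            return step["final_answer"]
        scratchpad.append(f"Action: {step['action']}\nObservation: {step['observation']}")
    return None  # iteration budget exhausted without a final answer

def fake_llm(scratchpad):
    """Deterministic stand-in: call each tool once, then answer."""
    n = len(scratchpad)
    if n < len(TOOL_ORDER):
        return {"action": TOOL_ORDER[n], "observation": f"{TOOL_ORDER[n]} ok"}
    return {"final_answer": '{"skill_inventory": []}'}

print(run_agent(fake_llm))  # answers on the 7th iteration, within the budget of 8
```

With `max_iterations=6` the same run would exhaust the budget before the final-answer turn, which is why the 2-iteration buffer matters.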
- Agent executes all 6 tools without token overflow
- Agent completes within 8 iterations
- Agent returns valid JSON
- Analysis stored with status = success
- Frontend displays AI analysis
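The "valid JSON" criterion can be sketched as a small check on the Final Answer; the required top-level keys come from the simplified prompt, and the helper name here is hypothetical:

```python
import json

# Sketch of validating the agent's Final Answer: it must be pure JSON and
# contain the four top-level keys named in the simplified prompt.
REQUIRED_KEYS = {"skill_inventory", "experience_timeline", "skill_gaps", "improvement_roadmap"}

def validate_final_answer(text: str) -> dict:
    """Parse the Final Answer and verify all required top-level keys exist."""
    data = json.loads(text)  # raises ValueError if anything besides JSON is present
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data

sample = ('{"skill_inventory": [], "experience_timeline": [], '
          '"skill_gaps": [], "improvement_roadmap": []}')
print(sorted(validate_final_answer(sample)))
```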
- Restart backend
- Upload resume
- Check backend logs for:
  - NO "token count exceeds" errors
  - All 6 tools executing
  - "Agent output successfully validated and parsed"
- Check database: status = success
- Frontend displays analysis
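The log portion of this checklist can also be automated. A minimal sketch, assuming the log is available as plain text (the sample string and function name are hypothetical):

```python
# Hypothetical log scan mirroring the manual checklist; the expected
# substrings come from the verification steps above.
TOOLS = ("resume_parser", "skill_extractor", "experience_analyzer",
         "skill_gap_analyzer", "roadmap_generator", "analysis_formatter")

def scan_logs(log_text: str) -> dict:
    return {
        "no_overflow": "token count exceeds" not in log_text,
        "all_tools_ran": all(tool in log_text for tool in TOOLS),
        "parsed_ok": "Agent output successfully validated and parsed" in log_text,
    }

sample = " ".join(TOOLS) + " Agent output successfully validated and parsed"
print(scan_logs(sample))  # all three checks pass on this sample
```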
backend/app/services/agents/resume_agent_service.py:
- _get_prompt_template() - Simplified prompt (reduced from ~1000 to ~200 tokens)
- _execute_agent_analysis() - Reduced max_iterations from 30 to 8