Add: beginner-friendly file analyzer example by itsroshanharry · Pull Request #380 · hud-evals/hud-python

itsroshanharry · 2026-03-22T13:09:29Z

Description

Add a beginner-friendly file analyzer example to help new users understand HUD SDK core concepts.

Problem

Currently, the examples jump from very basic (00_agent_env.py - just one tool call) to advanced production-ready agents (01_codex, 02_opencode). There's no intermediate example for beginners learning the SDK.

Solution

This example fills that gap by demonstrating:

Creating environments with custom tools (@env.tool())
Defining evaluation scenarios (@env.scenario())
Running agents with different models
Understanding automatic tool usage
Reward-based evaluation

Why This Is Valuable

Fills a gap: Provides progression path from basic to advanced
Beginner-friendly: Uses simple, relatable tools (file operations)
Educational: Extensive comments and helpful output explaining what happened
Complete: Shows full workflow from setup to evaluation
Flexible: Includes command-line options for different models

Example Output

======================================================================
FILE ANALYZER AGENT
======================================================================

🤖 Model: gpt-4o-mini
📋 Task: Analyze README.md file
🔧 Tools: list_files, read_file, count_words

======================================================================

✅ Task completed!
📊 Reward: 1.0
🔢 Steps taken: 0

📝 Agent's response:
The **README.md** file exists, and here are the details:
- **Word Count:** 70 words
- **Lines:** 14
- **Characters:** 1000
- **Average Word Length:** 14.3 characters

Testing

Tested with gpt-4o-mini (default)
Tested with gpt-4o
Verified all tools work correctly
Confirmed evaluation scenario works
Checked code formatting with ruff format
Verified linting with ruff check
Added documentation to examples/README.md

Checklist

Code follows project style guidelines
Added documentation to examples/README.md
Tested the example works
Includes clear usage instructions
Has helpful comments and docstrings
Includes shebang and proper imports
Follows existing example patterns

Additional Context

This is my first contribution to the HUD SDK. I created this example while learning the framework and thought it would help other beginners understand the core concepts. The example is self-contained and doesn't modify any existing code.

I'm excited to contribute to the project and help make it more accessible to new users!

Note

Low Risk
Adds a new standalone example script and documentation without modifying library/runtime code; risk is limited to example usability (e.g., path handling and tool output).

Overview
Adds a new beginner example examples/05_file_analyzer_agent.py that defines simple file tools (list_files, read_file, count_words), an analyze-readme scenario with basic reward scoring, and a CLI to run the agent against different models.

Updates examples/README.md to document how to run the new example (including --model and --verbose) and position it as an intermediate learning step.
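As a rough illustration of the `--model` and `--verbose` options mentioned above, the CLI could be wired up along these lines. This is a minimal sketch, not the code from the PR: the `build_parser` helper and the help strings are hypothetical, and the `gpt-4o-mini` default is taken from the Testing section.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the example's two options (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Run the file analyzer agent example.")
    # Default model assumed from the PR's Testing section
    parser.add_argument("--model", default="gpt-4o-mini", help="Model to run the agent with.")
    # A boolean flag: present means True, absent means False
    parser.add_argument("--verbose", action="store_true", help="Print extra detail while running.")
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv
args = build_parser().parse_args(["--model", "gpt-4o", "--verbose"])
print(f"Model: {args.model} (verbose={args.verbose})")
```

Using `action="store_true"` keeps `--verbose` a plain switch, so running the script with no arguments falls back to the default model with quiet output.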

Written by Cursor Bugbot for commit c8923aa. This will update automatically on new commits.

- New example demonstrating core HUD SDK concepts
- Shows environment creation with custom tools
- Demonstrates scenario-based evaluation
- Includes clear documentation and usage examples
- Fills gap between basic (00) and advanced (01/02) examples
- Perfect starting point for new users learning the SDK

- Fix average word length calculation to exclude whitespace
- Change 'Steps taken' to 'Tool calls' for accuracy
- Add comment explaining the calculation
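The average word length fix can be sketched as follows. This is an assumption about the shape of the calculation, not the PR's actual `count_words` tool: the `text_stats` name and the returned keys are hypothetical. The point is that dividing total characters by word count lets spaces and newlines inflate the average, while summing only the characters inside words does not.

```python
def text_stats(text: str) -> dict:
    """Return basic statistics for a piece of text (hypothetical helper)."""
    words = text.split()
    # Count only characters that belong to words, so whitespace is excluded
    letters = sum(len(word) for word in words)
    # Avoid dividing by zero for empty or whitespace-only input
    average = letters / len(words) if words else 0.0
    return {
        "words": len(words),
        "lines": len(text.splitlines()),
        "characters": len(text),
        "avg_word_length": round(average, 1),
    }


sample = "hello big world\nbye"
stats = text_stats(sample)
# Naive version for comparison: total characters (including whitespace) per word
naive_average = len(sample) / stats["words"]
print(stats, naive_average)
```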

- Add directory traversal protection to list_files and read_file tools
- Validate all file paths are within working directory before access
- Fix tool calls metric to count actual MCP calls from trace
- Prevent negative count when agent errors early

Addresses Bugbot security and accuracy concerns
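One common way to implement the path validation described in this commit is to resolve the requested path and check that it still sits under the working directory. The sketch below uses only `pathlib`; the `resolve_inside` helper and the choice of `PermissionError` are assumptions for illustration and may differ from what the PR does.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it.

    Hypothetical helper: the PR's tools may validate paths differently.
    """
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise PermissionError(f"Path escapes working directory: {user_path}") from None
    return candidate
```

Resolving before comparing matters: a plain string prefix check would accept `subdir/../../secret.txt`, and an absolute `user_path` would silently replace the base when joined.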

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


- Add guard to prevent AttributeError when agent returns no content
- Handle edge case where response is None before calling .lower()

Addresses Bugbot review comment
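The guard described in this commit might look roughly like the following. The scoring rule here (reward 1.0 when a keyword appears in the response) is a hypothetical stand-in, since the PR's actual reward logic is not shown in this conversation; only the None check before `.lower()` reflects the commit message.

```python
from typing import Optional


def score_response(response: Optional[str], keyword: str = "readme") -> float:
    """Return a reward for the agent's answer (hypothetical scoring rule)."""
    # Guard first: an agent that errors early may return no content at all,
    # and calling .lower() on None would raise AttributeError
    if not response:
        return 0.0
    return 1.0 if keyword in response.lower() else 0.0


print(score_response(None))
print(score_response("The **README.md** file exists"))
```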

cursor Bot reviewed Mar 22, 2026


Comment thread examples/05_file_analyzer_agent.py Outdated

Comment thread examples/05_file_analyzer_agent.py Outdated

Fix: address Bugbot review comments

6aed6ea


cursor Bot reviewed Mar 22, 2026


Comment thread examples/05_file_analyzer_agent.py

Comment thread examples/05_file_analyzer_agent.py Outdated

cursor Bot reviewed Mar 22, 2026


Comment thread examples/05_file_analyzer_agent.py Outdated

Fix: add None check for agent response in scenario

c8923aa


itsroshanharry wants to merge 4 commits into hud-evals:main from itsroshanharry:feature/add-file-analyzer-example
