Cli media size by tallate · Pull Request #812 · inclusionAI/AWorld

tallate · 2026-03-09T11:56:45Z

Restrict supported file types for media_comprehension, e.g. pdf
Limit image size for media_comprehension; ensure extracted frames are small enough via preprocessing
Fix execution error in /cost command
Reduce the number of restricted interactive commands

gemini-code-assist · 2026-03-09T11:57:04Z

Summary of Changes

This pull request significantly enhances the CLI's history tracking capabilities by introducing a structured, JSONL-based history system that records and aggregates token usage. Concurrently, it refines the media_comprehension agent's operational guidelines, particularly for image processing, by enforcing size limits and compression to optimize performance. Additionally, agent prompts and the terminal tool have been updated to restrict interactive commands, promoting more robust and automated execution.

Highlights

CLI History Management: Introduced a new JSONLHistory class for the CLI, enabling structured storage of command history and detailed token usage statistics, including per-model aggregation and session-specific filtering.
Media Comprehension Agent Enhancements: Added a media_comprehension sub-agent description to the aworld_agent and updated its prompt to include explicit image size checks and compression logic for files exceeding 50KB before processing.
Interactive Command Restrictions: Updated developer and evaluator agent prompts, as well as the terminal tool's logic, to explicitly disallow and better identify interactive commands, guiding users towards non-interactive alternatives.
Multimedia File Size Validation: Implemented a size validation mechanism in cast_search_tool to prevent reading excessively large multimedia files, ensuring efficient processing by the media_comprehension agent.

Changelog

aworld-cli/src/aworld_cli/history.py
- Added JSONLHistory class for structured history storage and token statistics.
aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/aworld_agent.py
- Added description for the new media_comprehension sub-agent.
aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/developer/prompt.txt
- Added a new instruction prohibiting the use of interactive commands.
aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/evaluator/prompt.txt
- Added a new instruction prohibiting the use of interactive commands.
aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/media_comprehension/media_comprehension.py
- Clarified the types of files that the media_comprehension agent cannot process.
aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/media_comprehension/prompt.txt
- Updated the image processing workflow to include mandatory size checks and compression for images exceeding 50KB.
- Provided a Python script example for image compression.
aworld/experimental/cast/tools/cast_search_tool.py
- Introduced multimedia file type detection and enforced a configurable size limit for multimedia files during _read_file operations.
- Added a helper function _resolve_file_path for resolving file paths.
examples/gaia/mcp_collections/tools/terminal.py
- Refined the regular expressions for detecting interactive commands.
- Improved the error message for forbidden commands to specify the exact command.
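
The multimedia size check described for cast_search_tool.py can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: the extension set, the 50 KB default, and the names `is_multimedia` and `check_readable` are all hypothetical.

```python
import os
from typing import Tuple

# Hypothetical values: the PR summary does not list the real extensions or limit.
MULTIMEDIA_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".mp3", ".wav", ".mp4", ".mov"}
MAX_MULTIMEDIA_BYTES = 50 * 1024


def is_multimedia(path: str) -> bool:
    """Classify a file as multimedia purely by its extension."""
    return os.path.splitext(path)[1].lower() in MULTIMEDIA_EXTENSIONS


def check_readable(path: str, max_bytes: int = MAX_MULTIMEDIA_BYTES) -> Tuple[bool, str]:
    """Return (ok, reason); reject multimedia files larger than max_bytes."""
    if not is_multimedia(path):
        return True, "not a multimedia file"
    size = os.path.getsize(path)
    if size > max_bytes:
        return False, f"{path} is {size} bytes, over the {max_bytes}-byte limit"
    return True, "within limit"
```

Returning a reason string alongside the flag lets the caller tell the agent why a read was refused instead of failing silently.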

Activity

A new JSONLHistory class was added to aworld-cli/src/aworld_cli/history.py to manage CLI command history and token usage statistics.
The aworld_agent.py file was updated to include a description for a new media_comprehension sub-agent.
Prompts for developer and evaluator agents were modified to explicitly prohibit interactive commands.
The media_comprehension agent's prompt was updated with detailed instructions for handling image files, including size checks and compression.
The cast_search_tool.py file was updated to include logic for validating multimedia file sizes before reading them.
The terminal.py tool was refined to improve the detection and reporting of interactive commands.

gemini-code-assist

Code Review

This pull request introduces functionality for handling media files, including size checks and compression, and updates command history logging. It also refines prompts and terminal command validation. My review focuses on performance optimizations for file handling in the history module, improving error logging, reducing code duplication, and addressing potential issues with the updated interactive command blocklist. I've also pointed out a minor typo in the prompt files and a compatibility concern in a Python script within a prompt.

Note: Security Review is unavailable for this PR.

I am having trouble creating individual review comments, so my feedback is listed below.

aworld-cli/src/aworld_cli/history.py (156-157)

The IOError is caught but ignored with a pass statement. This can lead to silent data loss if writing to the history file fails. It's better to log this error, similar to how it's done in load_history_strings.

        except IOError as e:
            logger.warning(f"Failed to write to history file {self.filename}: {e}")
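
For context, here is a minimal sketch of how that handler might sit inside a JSONL append routine. The function name `append_record`, its boolean return value, and the logger name are assumptions for illustration; the actual JSONLHistory method is not shown in this review.

```python
import json
import logging

logger = logging.getLogger("history_sketch")  # hypothetical logger name


def append_record(filename: str, record: dict) -> bool:
    """Append one JSON object per line; log failures instead of dropping them silently."""
    try:
        with open(filename, "a", encoding="utf-8") as f:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
        return True
    except OSError as e:  # IOError is an alias of OSError in Python 3
        logger.warning("Failed to write to history file %s: %s", filename, e)
        return False
```

Returning a flag is optional; the point is only that the failure leaves a trace in the log.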

examples/gaia/mcp_collections/tools/terminal.py (107-110)

The commands less, more, top, htop, and interactive shells (python -i, bash -i) have been removed from the interactive_command_patterns blocklist. These commands are inherently interactive and will likely hang or fail when executed in a non-interactive environment with stdin redirected from /dev/null. This could lead to unexpected behavior or timeouts. It's safer to keep them in the blocklist to provide clear and immediate feedback to the agent that such commands are not allowed.
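
A blocklist covering the commands named above could look something like the sketch below. The patterns and the function name `find_interactive` are hypothetical and are not the contents of interactive_command_patterns in terminal.py. The check is deliberately coarse: it would also reject non-interactive uses such as `top -b -n 1`.

```python
import re
from typing import Optional

# Hypothetical patterns for the commands mentioned in the review.
_INTERACTIVE_PATTERNS = [
    re.compile(r"^\s*(?:less|more|top|htop)\b"),
    re.compile(r"^\s*(?:python3?|bash)\s+-i\b"),
]


def find_interactive(command: str) -> Optional[str]:
    """Return the first offending segment of a command line, or None if none match."""
    # Split on pipes and command separators so `cat f | less` is caught too.
    for segment in re.split(r"\|\||&&|[|;]", command):
        for pattern in _INTERACTIVE_PATTERNS:
            if pattern.search(segment):
                return segment.strip()
    return None
```

Returning the matched segment makes it easy to name the exact forbidden command in the error message.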

aworld-cli/src/aworld_cli/history.py (96-97)

The current implementation reads the entire history file into memory with f.readlines() to find a record for aggregation. This can be very inefficient and consume a lot of memory if the history file grows large. Consider optimizing this by reading only a portion of the file, for example, the last N lines, to find a recent match.
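
One way to read only a portion of the file is to seek near the end and parse just that tail. The sketch below assumes each line is a JSON object with a `session_id` field; that field name, the function name, and the 64 KB window are illustrative guesses, not details taken from history.py.

```python
import json
import os
from typing import Optional


def find_recent_record(filename: str, session_id: str,
                       max_bytes: int = 64 * 1024) -> Optional[dict]:
    """Search only the last max_bytes of a JSONL file for a session's newest record."""
    with open(filename, "rb") as f:
        f.seek(0, os.SEEK_END)
        start = max(0, f.tell() - max_bytes)
        f.seek(start)
        chunk = f.read()
    lines = chunk.split(b"\n")
    if start > 0:
        lines = lines[1:]  # the first piece is most likely a record cut in half
    for raw in reversed(lines):  # newest records are at the end
        if not raw.strip():
            continue
        try:
            record = json.loads(raw.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue
        if isinstance(record, dict) and record.get("session_id") == session_id:
            return record
    return None
```

The trade-off is that a match older than the window is not found, so the caller needs a fallback such as writing a fresh record.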

aworld-cli/src/aworld_cli/history.py (175-177)

This method reads the entire history file into memory to retrieve the last limit records. For large history files, this is inefficient. A more performant approach would be to read the file from the end, especially since you only need the newest records. You could use a deque to store the last limit lines found while reading backwards.
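
A simpler variant of the deque idea is sketched below. Note that it still scans the file from the start rather than reading backwards as suggested; what it avoids is holding every line in memory, since a `deque` with `maxlen` keeps only the newest `limit` lines. The function name `last_records` is hypothetical.

```python
import json
from collections import deque
from typing import List


def last_records(filename: str, limit: int) -> List[dict]:
    """Return up to `limit` newest records, oldest first, with bounded memory."""
    with open(filename, "r", encoding="utf-8") as f:
        tail = deque(f, maxlen=limit)  # older lines are discarded as newer ones arrive
    records = []
    for line in tail:
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # skip corrupt lines rather than failing the whole read
    return records
```

For very large files, a true backward read in fixed-size blocks would also cut the I/O, at the cost of more bookkeeping.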

aworld-cli/src/aworld_cli/history.py (237-285)

There is significant code duplication between the if by_model: block and the else: block for handling token statistics. Both blocks iterate over models and update stats. This logic can be refactored into a helper function to improve maintainability and reduce redundancy. The helper could take the stats dictionary, the record's token_stats, and the timestamp ts as arguments.
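
Such a helper might look like the sketch below. The shape of `token_stats` (a mapping from model name to `input_tokens` and `output_tokens` counts), the `"total"` key, and the function name `accumulate` are assumptions; the real record layout in history.py may differ.

```python
from typing import Dict


def accumulate(stats: Dict[str, dict], token_stats: Dict[str, dict],
               ts: float, by_model: bool = True) -> None:
    """Fold one record's token usage into stats, per model or into a single total."""
    for model, usage in token_stats.items():
        key = model if by_model else "total"
        entry = stats.setdefault(
            key,
            {"input_tokens": 0, "output_tokens": 0, "first_ts": ts, "last_ts": ts},
        )
        entry["input_tokens"] += usage.get("input_tokens", 0)
        entry["output_tokens"] += usage.get("output_tokens", 0)
        entry["first_ts"] = min(entry["first_ts"], ts)
        entry["last_ts"] = max(entry["last_ts"], ts)
```

With this in place, both branches reduce to a single call that differs only in the `by_model` flag.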

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/developer/prompt.txt (14)

There's a typo in non-interactivealternatives. It should be non-interactive alternatives with a space.

**Do Not Use Interactive commands. You may use non-interactive alternatives (e.g. --yes, -y, CI=1, DEBIAN_FRONTEND=noninteractive) or different tools.

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/evaluator/prompt.txt (30)

There's a typo in non-interactivealternatives. It should be non-interactive alternatives with a space.

- **Do Not Use Interactive commands. You may use non-interactive alternatives (e.g. --yes, -y, CI=1, DEBIAN_FRONTEND=noninteractive) or different tools.

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/media_comprehension/prompt.txt (56)

The use of Image.Resampling.LANCZOS requires Pillow version 9.1.0 or newer. For better compatibility with older environments, it's safer to use the legacy Image.LANCZOS. This avoids a potential AttributeError if an older version of Pillow is installed.

        img.resize(new_size, Image.LANCZOS).save(out_path, 'JPEG', quality=70, optimize=True)
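
An alternative that works with either Pillow layout is to look the filter up at runtime. The helper below is a hypothetical sketch: `resolve_lanczos` is not part of Pillow, and the two `SimpleNamespace` objects are stand-ins for the `PIL.Image` module so the example runs without Pillow installed.

```python
from types import SimpleNamespace


def resolve_lanczos(image_module):
    """Return the LANCZOS filter from Image.Resampling if present, else Image.LANCZOS."""
    resampling = getattr(image_module, "Resampling", None)
    if resampling is not None:
        return resampling.LANCZOS
    return image_module.LANCZOS


# Stand-ins for the newer and older module layouts (values are placeholders).
new_style = SimpleNamespace(Resampling=SimpleNamespace(LANCZOS="lanczos-new"))
old_style = SimpleNamespace(LANCZOS="lanczos-legacy")

print(resolve_lanczos(new_style))
print(resolve_lanczos(old_style))
```

In the prompt's script this would be used as `img.resize(new_size, resolve_lanczos(Image))` after `from PIL import Image`.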

tallate added 5 commits March 9, 2026 16:16

a00ed66  [cli]: 1. history
6c8ff92  [cli]: 1. forbid media_comprehension to process pdf or other doc
         2. forbid developer and evaluator to execute interactive commands
22b06e1  Merge branch 'cli_stream_fix_media_comp' of https://github.com/inclus…
         …ionAI/AWorld into cli_activeskill # Conflicts: # aworld-cli/src/aworld_cli/history.py
0efc744  Merge branch 'cli_stream_fix_media_comp' into cli_activeskill
072a034  [cli]: 1. limit image size to read

JasonTse1989 approved these changes Mar 9, 2026

JasonTse1989 merged commit fcaead7 into main Mar 9, 2026
1 check passed

gemini-code-assist bot reviewed Mar 9, 2026

JasonTse1989 merged 5 commits into main from cli_media_size
