
Cli media size #812

Merged
JasonTse1989 merged 5 commits into main from cli_media_size
Mar 9, 2026

Conversation

@tallate
Collaborator

@tallate tallate commented Mar 9, 2026

  1. Restrict supported file types for media_comprehension, e.g. pdf
  2. Limit image size for media_comprehension; ensure extracted frames are small enough via preprocessing
  3. Fix execution error in /cost command
  4. Reduce the number of restricted interactive commands

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the CLI's history tracking capabilities by introducing a structured, JSONL-based history system that records and aggregates token usage. Concurrently, it refines the media_comprehension agent's operational guidelines, particularly for image processing, by enforcing size limits and compression to optimize performance. Additionally, agent prompts and the terminal tool have been updated to restrict interactive commands, promoting more robust and automated execution.

Highlights

  • CLI History Management: Introduced a new JSONLHistory class for the CLI, enabling structured storage of command history and detailed token usage statistics, including per-model aggregation and session-specific filtering.
  • Media Comprehension Agent Enhancements: Added a media_comprehension sub-agent description to the aworld_agent and updated its prompt to include explicit image size checks and compression logic for files exceeding 50KB before processing.
  • Interactive Command Restrictions: Updated developer and evaluator agent prompts, as well as the terminal tool's logic, to explicitly disallow and better identify interactive commands, guiding users towards non-interactive alternatives.
  • Multimedia File Size Validation: Implemented a size validation mechanism in cast_search_tool to prevent reading excessively large multimedia files, ensuring efficient processing by the media_comprehension agent.
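
The size validation described in the last highlight could look roughly like the sketch below. This is not the code from cast_search_tool; the extension set, the 50 KB default (borrowed from the image compression threshold mentioned above), and the names `is_multimedia` and `check_readable` are all assumptions made for illustration.

```python
import os
from typing import Tuple

# Hypothetical values: the PR summary does not list the extensions or the limit.
MULTIMEDIA_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".mp3", ".wav", ".mp4", ".mov"}
MAX_MULTIMEDIA_BYTES = 50 * 1024


def is_multimedia(path: str) -> bool:
    """Classify a file as multimedia purely by its (case-insensitive) extension."""
    return os.path.splitext(path)[1].lower() in MULTIMEDIA_EXTENSIONS


def check_readable(path: str, limit: int = MAX_MULTIMEDIA_BYTES) -> Tuple[bool, str]:
    """Return (ok, message); refuse multimedia files larger than `limit` bytes."""
    if not is_multimedia(path):
        return True, "not a multimedia file; no size limit applied"
    size = os.path.getsize(path)
    if size > limit:
        return False, f"{path} is {size} bytes, over the {limit}-byte limit; compress it first"
    return True, f"{path} is {size} bytes, within the limit"
```

Returning a message instead of raising lets the caller hand the reason back to the agent, which can then decide to compress the file before retrying.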


Changelog
  • aworld-cli/src/aworld_cli/history.py
    • Added JSONLHistory class for structured history storage and token statistics.
  • aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/aworld_agent.py
    • Added description for the new media_comprehension sub-agent.
  • aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/developer/prompt.txt
    • Added a new instruction prohibiting the use of interactive commands.
  • aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/evaluator/prompt.txt
    • Added a new instruction prohibiting the use of interactive commands.
  • aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/media_comprehension/media_comprehension.py
    • Clarified the types of files that the media_comprehension agent cannot process.
  • aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/media_comprehension/prompt.txt
    • Updated the image processing workflow to include mandatory size checks and compression for images exceeding 50KB.
    • Provided a Python script example for image compression.
  • aworld/experimental/cast/tools/cast_search_tool.py
    • Introduced multimedia file type detection and enforced a configurable size limit for multimedia files during _read_file operations.
    • Added a helper function _resolve_file_path for resolving file paths.
  • examples/gaia/mcp_collections/tools/terminal.py
    • Refined the regular expressions for detecting interactive commands.
    • Improved the error message for forbidden commands to specify the exact command.
Activity
  • A new JSONLHistory class was added to aworld-cli/src/aworld_cli/history.py to manage CLI command history and token usage statistics.
  • The aworld_agent.py file was updated to include a description for a new media_comprehension sub-agent.
  • Prompts for developer and evaluator agents were modified to explicitly prohibit interactive commands.
  • The media_comprehension agent's prompt was updated with detailed instructions for handling image files, including size checks and compression.
  • The cast_search_tool.py file was updated to include logic for validating multimedia file sizes before reading them.
  • The terminal.py tool was refined to improve the detection and reporting of interactive commands.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@JasonTse1989 JasonTse1989 merged commit fcaead7 into main Mar 9, 2026
1 check passed
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces functionality for handling media files, including size checks and compression, and updates command history logging. It also refines prompts and terminal command validation. My review focuses on performance optimizations for file handling in the history module, improving error logging, reducing code duplication, and addressing potential issues with the updated interactive command blocklist. I've also pointed out a minor typo in the prompt files and a compatibility concern in a Python script within a prompt.

Note: Security Review is unavailable for this PR.

I am having trouble creating individual review comments. Click here to see my feedback.

aworld-cli/src/aworld_cli/history.py (156-157)

high

The IOError is caught but ignored with a pass statement. This can lead to silent data loss if writing to the history file fails. It's better to log this error, similar to how it's done in load_history_strings.

        except IOError as e:
            logger.warning(f"Failed to write to history file {self.filename}: {e}")
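
For context, a fuller sketch of the write path with the error surfaced might look like this. It is an illustration only: `append_record` is a hypothetical stand-in, not a method of the JSONLHistory class in the PR, and the boolean return value is an added assumption.

```python
import json
import logging

logger = logging.getLogger(__name__)


def append_record(filename: str, record: dict) -> bool:
    """Append one JSON object per line; log and report a failed write instead of hiding it."""
    try:
        with open(filename, "a", encoding="utf-8") as f:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
        return True
    except OSError as e:  # IOError is an alias of OSError in Python 3
        logger.warning("Failed to write to history file %s: %s", filename, e)
        return False
```

The caller can ignore the return value, but the warning still leaves a trace when history is lost.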

examples/gaia/mcp_collections/tools/terminal.py (107-110)

high

The commands less, more, top, htop, and interactive shells (python -i, bash -i) have been removed from the interactive_command_patterns blocklist. These commands are inherently interactive and will likely hang or fail when executed in a non-interactive environment with stdin redirected from /dev/null. This could lead to unexpected behavior or timeouts. It's safer to keep them in the blocklist to provide clear and immediate feedback to the agent that such commands are not allowed.
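
A minimal sketch of a blocklist that still covers these commands is shown below. The patterns and the name `find_interactive` are assumptions for illustration and are not the contents of interactive_command_patterns in terminal.py. The patterns only match at a command position (start of the string or after `|`, `;`, or `&`), so an argument such as the word "more" in `echo more text` is not flagged.

```python
import re
from typing import Optional

# Hypothetical patterns covering the commands named in the comment above.
_COMMAND_START = r"(?:^|[|;&])\s*"
INTERACTIVE_PATTERNS = [
    re.compile(_COMMAND_START + r"(?P<cmd>less|more|top|htop)(?=\s|$)"),
    re.compile(_COMMAND_START + r"(?P<cmd>(?:python[0-9.]*|bash)\s+-i)(?=\s|$)"),
]


def find_interactive(command: str) -> Optional[str]:
    """Return the offending command text, or None if nothing interactive is found."""
    for pattern in INTERACTIVE_PATTERNS:
        match = pattern.search(command)
        if match:
            return match.group("cmd")
    return None
```

Returning the matched text makes it easy to name the exact command in the error message. The check is deliberately coarse: it also rejects `top -b`, which runs in batch mode, so a real blocklist may want finer rules.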

aworld-cli/src/aworld_cli/history.py (96-97)

medium

The current implementation reads the entire history file into memory with f.readlines() to find a record for aggregation. This can be very inefficient and consume a lot of memory if the history file grows large. Consider optimizing this by reading only a portion of the file, for example, the last N lines, to find a recent match.
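
One way to read only the tail of the file is to seek to the end and pull in fixed-size blocks until enough newlines have been seen. The sketch below assumes a UTF-8 file with one record per line; `tail_lines` and its `block_size` parameter are hypothetical names, not part of history.py.

```python
import os
from typing import List


def tail_lines(path: str, n: int, block_size: int = 4096) -> List[str]:
    """Return the last `n` lines of a text file by reading blocks from the end."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Walk backwards until more than n newlines are buffered or the start is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    # Split on bytes first so a partially read first line is dropped before decoding.
    return [line.decode("utf-8") for line in data.splitlines()[-n:]]
```

Memory use is bounded by the size of the last `n` lines plus one block, regardless of how large the history file grows.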

aworld-cli/src/aworld_cli/history.py (175-177)

medium

This method reads the entire history file into memory to retrieve the last limit records. For large history files, this is inefficient. A more performant approach would be to read the file from the end, especially since you only need the newest records. You could use a deque to store the last limit lines found while reading backwards.
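
A simpler variant of the deque idea is sketched below. Note that it differs from the suggestion in one respect: it streams the file forwards rather than backwards, so it still scans every line, but it never holds more than `limit` lines in memory. The name `last_records` and the choice to skip corrupt lines are assumptions.

```python
import json
from collections import deque
from typing import List


def last_records(path: str, limit: int) -> List[dict]:
    """Return up to `limit` of the newest JSONL records, oldest first."""
    with open(path, "r", encoding="utf-8") as f:
        recent = deque(f, maxlen=limit)  # streams the file and keeps only the tail
    records = []
    for line in recent:
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # skip corrupt lines rather than fail the whole read
    return records
```

If the full scan becomes a bottleneck, the deque can be combined with a true read-from-the-end strategy.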

aworld-cli/src/aworld_cli/history.py (237-285)

medium

There is significant code duplication between the if by_model: block and the else: block for handling token statistics. Both blocks iterate over models and update stats. This logic can be refactored into a helper function to improve maintainability and reduce redundancy. The helper could take the stats dictionary, the record's token_stats, and the timestamp ts as arguments.
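
The refactor could be sketched as follows. The record layout is an assumption, since the PR page does not show it: each record is taken to carry a `timestamp` and a `token_stats` dict that maps a model name to `input_tokens` and `output_tokens` counts. The names `accumulate` and `summarize` are hypothetical.

```python
from typing import Any, Dict, Iterable

TOTAL_KEY = "total"  # bucket used when statistics are not split by model


def accumulate(stats: Dict[str, Dict[str, Any]], token_stats: Dict[str, Dict[str, int]],
               ts: Any, by_model: bool) -> None:
    """Fold one record's token usage into `stats`, keyed per model or under one total."""
    for model, usage in token_stats.items():
        key = model if by_model else TOTAL_KEY
        bucket = stats.setdefault(
            key, {"input_tokens": 0, "output_tokens": 0, "entries": 0, "last_ts": None}
        )
        bucket["input_tokens"] += int(usage.get("input_tokens", 0))
        bucket["output_tokens"] += int(usage.get("output_tokens", 0))
        bucket["entries"] += 1
        if ts is not None and (bucket["last_ts"] is None or ts > bucket["last_ts"]):
            bucket["last_ts"] = ts


def summarize(records: Iterable[Dict[str, Any]], by_model: bool = False) -> Dict[str, Dict[str, Any]]:
    """Aggregate token usage over all records with a single shared code path."""
    stats: Dict[str, Dict[str, Any]] = {}
    for record in records:
        accumulate(stats, record.get("token_stats", {}), record.get("timestamp"), by_model)
    return stats
```

With this shape, the `by_model` flag only changes the key that a bucket is stored under, so the two branches collapse into one loop.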

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/developer/prompt.txt (14)

medium

There's a typo in non-interactivealternatives. It should be non-interactive alternatives with a space.

**Do Not Use Interactive commands. You may use non-interactive alternatives (e.g. --yes, -y, CI=1, DEBIAN_FRONTEND=noninteractive) or different tools.

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/evaluator/prompt.txt (30)

medium

There's a typo in non-interactivealternatives. It should be non-interactive alternatives with a space.

- **Do Not Use Interactive commands. You may use non-interactive alternatives (e.g. --yes, -y, CI=1, DEBIAN_FRONTEND=noninteractive) or different tools.

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/media_comprehension/prompt.txt (56)

medium

The use of Image.Resampling.LANCZOS requires Pillow version 9.1.0 or newer. For better compatibility with older environments, it's safer to use the legacy Image.LANCZOS. This avoids a potential AttributeError if an older version of Pillow is installed.

        img.resize(new_size, Image.LANCZOS).save(out_path, 'JPEG', quality=70, optimize=True)
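
If both old and new Pillow versions need to be supported, another option is to resolve the filter at runtime instead of hard-coding either spelling. The helper below is a hypothetical sketch and not part of the prompt script; it takes the `Image` module as an argument so that it can be exercised without Pillow installed.

```python
def resolve_lanczos(image_module):
    """Return the LANCZOS filter from `Image.Resampling` when present, else the legacy constant."""
    resampling = getattr(image_module, "Resampling", None)
    if resampling is not None:
        return resampling.LANCZOS  # Pillow 9.1.0 and newer
    return image_module.LANCZOS  # older Pillow releases


# Intended use with Pillow (assumes `new_size` and `out_path` are defined as in the script):
#   from PIL import Image
#   img.resize(new_size, resolve_lanczos(Image)).save(out_path, 'JPEG', quality=70, optimize=True)
```

This keeps the script working on older installations while preferring the newer enum where it exists.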
