Commit 0efc744

Merge branch 'cli_stream_fix_media_comp' into cli_activeskill
2 parents 22b06e1 + 6c8ff92 commit 0efc744

5 files changed: +12 -7 lines changed

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/aworld_agent.py

Lines changed: 1 addition & 0 deletions
@@ -65,6 +65,7 @@ def _build_beijing_date_line() -> str:
 * `developer`: a sub-agent that can develop apps/code/html/website and laterimprove this developed apps/code/html/website according to the suggestions from the `evaluator`, by using terminal and other professional tools.
 * `evaluator`: a sub-agent that can evaluate the apps/code/html/website's (developed by the `developer`) performance, user experience, and so on, and present professional suggestions to the `developer` for the apps/code/html/website improvement.
 * `terminal`: A tool set that can execute terminal commands. **Path restriction:** Do not `cd` to other directories; always operate from the current working directory. When operating on files, always use explicit relative or absolute paths. **Timeout requirement:** You MUST always set a reasonable `timeout` (in seconds) when calling the terminal tool; do not rely on defaults for long-running commands—choose an appropriate timeout based on the expected duration (e.g., 60–120 seconds for builds, 30–60 for quick commands).
+* `media_comprehension`: a sub-agent that specially for understanding images, audio, and video files. Cannot process: documents (.pdf, e.g. report.pdf), spreadsheets (.xlsx/.csv, e.g. data.xlsx), presentations (.pptx, e.g. slides.pptx), code (.py/.js/.ts, e.g. main.py), archives (.zip/.tar/.rar, e.g. backup.zip), executables (.exe/.bin, e.g. app.exe), databases (.db/.sqlite, e.g. users.db), structured data (.json/.xml/.yaml, e.g. config.json), web pages (.html/.htm, e.g. index.html).
 
 ## 4. Available Skills
 * Please be aware that if you need to have access to a particular skill to help you to complete the task, you MUST use the appropriate `SKILL_tool` to activate the skill, which returns you the exact skill content.

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/developer/prompt.txt

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ You analyze target codebases, identify key modules and entry points, and apply p
 **You must only use one tool call per turn.** Do not chain commands.
 **Prohibit one-shot large file creation:** You must never create or modify large files in a single operation. Large files must always be output in segments. Each segment must not exceed 50 lines or 5,000 characters. This applies to both Mode 1 (creation via `terminal`) and Mode 2 (modification via `CAST_CODER.search_replace`).
 **Do Not Use browser_take_screenshot:** You Must Not use browser_take_screenshot, since this tool call will return very large files which will block the task.
+**Do Not Use Interactive commands. You may use non-interactivealternatives (e.g. --yes, -y, CI=1, DEBIAN_FRONTEND=noninteractive) or different tools.
 
 ## 🔄 Core Workflow: Operating Modes
 You will operate in one of two modes, determined by the user's request. You must identify the correct mode at the beginning of the task and follow its specific workflow.

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/evaluator/prompt.txt

Lines changed: 2 additions & 1 deletion
@@ -26,4 +26,5 @@ You are equipped with multiple assistants. It is your job to know which to use a
 - **Honest Capability Assessment:** If a user's request is beyond the combined capabilities of your available assistants, you must terminate the task and clearly explain to the user why it cannot be completed.
 - **Working Directory:** Always treat the current directory as your working directory for all actions: run shell commands from it, and use it (or paths under it) for any temporary or output files when such operations are permitted (e.g. non-code tasks). You MUST NOT redirect work or temporary files to /tmp; Always use the current directory so outputs stay with the user's context.
 - **Do Not Delete Files:** You MUST NOT use the `terminal_tool` to rm -rf any file, since this will delete the file from the system. except the ms-playwrightmodule installation case.
-- **Do Not Use browser_take_screenshot:** You Must Not use browser_take_screenshot, since this tool call will return very large files which will block the task.
+- **Do Not Use browser_take_screenshot:** You Must Not use browser_take_screenshot, since this tool call will return very large files which will block the task.
+- **Do Not Use Interactive commands. You may use non-interactivealternatives (e.g. --yes, -y, CI=1, DEBIAN_FRONTEND=noninteractive) or different tools.
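
The prompt rule added above asks agents to prefer non-interactive alternatives, and the `terminal` tool description asks for an explicit timeout. A minimal sketch of what such a call might look like from Python follows. The helper name `run_noninteractive` is hypothetical and not part of this commit; it only illustrates setting `CI=1` and `DEBIAN_FRONTEND=noninteractive`, closing stdin, and bounding the run time with the standard `subprocess` module.

```python
from __future__ import annotations

import os
import subprocess
import sys


def run_noninteractive(args: list[str], timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run a command so that it cannot wait on user input (hypothetical helper)."""
    # Copy the current environment and add the hints named in the prompt rule.
    env = dict(os.environ, CI="1", DEBIAN_FRONTEND="noninteractive")
    return subprocess.run(
        args,
        env=env,
        stdin=subprocess.DEVNULL,  # any attempt to read input hits EOF at once
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired when exceeded
    )


if __name__ == "__main__":
    result = run_noninteractive(
        [sys.executable, "-c", "import os; print(os.environ['CI'])"], timeout=30
    )
    print(result.returncode, result.stdout.strip())
```

Whether a given tool honors `CI` or `DEBIAN_FRONTEND` depends on that tool; closing stdin and setting a timeout are the parts that apply to any command.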

aworld-cli/src/aworld_cli/inner_plugins/smllc/agents/media_comprehension/media_comprehension.py

Lines changed: 2 additions & 0 deletions
@@ -61,6 +61,8 @@ async def async_policy(self, observation: Observation, info: Dict[str, Any] = {}
 - Images: Recognize, describe, and interpret visual content.
 - Audio: Transcribe speech and analyze audio content.
 - Video: Understand video content, analyze scenes, and perform multimodal comprehension.
+
+Cannot process (do NOT delegate to this agent): Documents (.pdf, e.g. report.pdf), spreadsheets (.xlsx/.csv, e.g. data.xlsx), presentations (.pptx, e.g. slides.pptx), code/scripts (.py/.js/.ts, e.g. main.py), archives (.zip/.tar/.rar, e.g. backup.zip), executables (.exe/.bin, e.g. app.exe), databases (.db/.sqlite, e.g. users.db), structured data (.json/.xml/.yaml, e.g. config.json), web pages (.html/.htm, e.g. index.html).
 """
 )
 def build_media_comprehension_swarm():
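
The exclusion list above lives only in prompt text, so the model enforces it, not code. As an illustration of the same rule, here is a small sketch of an extension check a caller could apply before delegating. The names `UNSUPPORTED_EXTENSIONS` and `is_excluded_from_media_comprehension` are hypothetical and do not appear in the commit; the extensions are copied from the prompt.

```python
from pathlib import PurePath

# Extensions the prompt says media_comprehension cannot process.
UNSUPPORTED_EXTENSIONS = frozenset({
    ".pdf",                      # documents
    ".xlsx", ".csv",             # spreadsheets
    ".pptx",                     # presentations
    ".py", ".js", ".ts",         # code/scripts
    ".zip", ".tar", ".rar",      # archives
    ".exe", ".bin",              # executables
    ".db", ".sqlite",            # databases
    ".json", ".xml", ".yaml",    # structured data
    ".html", ".htm",             # web pages
})


def is_excluded_from_media_comprehension(path: str) -> bool:
    """Return True if the file's final extension is on the prompt's exclusion list."""
    return PurePath(path).suffix.lower() in UNSUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["report.pdf", "photo.PNG", "slides.PPTX", "clip.mp4"]:
        print(name, is_excluded_from_media_comprehension(name))
```

The check looks only at the final suffix, so a name such as `backup.tar.gz` is not matched; the prompt does not list `.gz`, and this sketch does not extend the list.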

examples/gaia/mcp_collections/tools/terminal.py

Lines changed: 6 additions & 6 deletions
@@ -104,10 +104,7 @@ def __init__(self, arguments: ActionArguments) -> None:
         # Interactive-only commands (block stdin, cannot run non-interactively)
         self.interactive_command_patterns = [
             r"(?:^|\s)(vim|vi|nano|emacs)(?:\s|$)",
-            r"(?:^|\s)(less|more)(?:\s|$)",
-            r"(?:^|\s)(top|htop)(?:\s|$)",
-            r"(?:^|\s)(ftp|telnet)(?:\s|$)",
-            r"(?:^|\s)(python3?|bash)\s+-i\b",
+            r"(?:^|\s)(ftp|telnet)(?:\s|$)"
         ]
 
         # Get current platform info
@@ -147,9 +144,12 @@ def _check_interactive_command(self, command: str) -> tuple[bool, str | None]:
             Tuple of (is_allowed, reason_if_forbidden)
         """
         for pattern in self.interactive_command_patterns:
-            if re.search(pattern, command, re.IGNORECASE):
+            m = re.search(pattern, command, re.IGNORECASE)
+            if m:
+                forbidden_cmd = m.group(1)
                 return False, (
-                    "Interactive commands are not allowed. Use non-interactive alternatives "
+                    f"Interactive commands are not allowed (forbidden: {forbidden_cmd}). "
+                    "Use non-interactive alternatives "
                     "(e.g. --yes, -y, CI=1, DEBIAN_FRONTEND=noninteractive) or different tools."
                 )
         return True, None
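
To see the effect of the narrowed pattern list, the check can be reproduced outside the class. The sketch below copies the two remaining patterns and the matching logic from the hunks above into a standalone function. The module-level names are chosen for illustration, and the message is shortened to its first sentence.

```python
from __future__ import annotations

import re

# Pattern list as it stands after this commit (copied from the diff above).
INTERACTIVE_COMMAND_PATTERNS = [
    r"(?:^|\s)(vim|vi|nano|emacs)(?:\s|$)",
    r"(?:^|\s)(ftp|telnet)(?:\s|$)",
]


def check_interactive_command(command: str) -> tuple[bool, str | None]:
    """Return (is_allowed, reason_if_forbidden) for a shell command string."""
    for pattern in INTERACTIVE_COMMAND_PATTERNS:
        m = re.search(pattern, command, re.IGNORECASE)
        if m:
            # group(1) is the command name that matched, e.g. "vim".
            return False, f"Interactive commands are not allowed (forbidden: {m.group(1)})."
    return True, None


if __name__ == "__main__":
    for cmd in ["vim notes.txt", "less README.md", "python -i", "cat vimrc", "echo 'use vi here'"]:
        print(repr(cmd), "->", check_interactive_command(cmd))
```

With these patterns, `less`, `top`, and `python -i` are no longer rejected by the tool itself, so the prompt rule is the only guard for them. The match is on whitespace-delimited tokens anywhere in the string, so a quoted argument such as `'use vi here'` is also rejected, while `vimrc` is not.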
