Skip to content

Add code operators refer to sandbox#1030

Open
ChongjinHuang wants to merge 2 commits intoLazyAGI:mainfrom
ChongjinHuang:hcj/sandbox
Open

Add code operators refer to sandbox#1030
ChongjinHuang wants to merge 2 commits intoLazyAGI:mainfrom
ChongjinHuang:hcj/sandbox

Conversation

@ChongjinHuang
Copy link
Contributor

📌 PR 内容 / PR Description

  • 添加与沙箱有关的代码验证算子,后续等待修改。

✅ 变更类型 / Type of Change

  • 修复 Bug / Bug fix (non-breaking change that fixes an issue)
  • 新功能 / New feature (non-breaking change that adds functionality)
  • 重构 / Refactor (no functionality change, code structure optimized)
  • 重大变更 / Breaking change (fix or feature that would cause existing functionality to change)
  • 文档更新 / Documentation update (changes to docs only)
  • 性能优化 / Performance optimization

⚠️ 注意事项 / Additional Notes

@gemini-code-assist
Copy link

Summary of Changes

Hello @ririqing0614, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces new capabilities for evaluating generated code within a sandboxed environment. It adds a CodeSandboxSampleEvaluator to validate code snippets and a robust PythonExecutor that manages code execution, including process isolation, timeout handling, and capturing output like text and images. This foundational work aims to provide a secure and controlled way to test and verify code generated by models.

Highlights

  • New Code Sandbox Evaluator: A CodeSandboxSampleEvaluator class has been added to code_sandbox_op.py. This class is designed to evaluate generated code within a sandbox environment, providing status and logs for execution results.
  • Robust Python Executor: A PythonExecutor class has been introduced in python_executor.py. This class handles the execution of Python code, including features like process isolation, timeout management, and capturing stdout/figures.
  • Flexible Runtime Environments: The PythonExecutor supports different runtime environments (GenericRuntime, ImageRuntime, DateRuntime, ColorObjectRuntime) to manage global variables and specific functionalities (e.g., image handling, date operations) during code execution.
  • Process Isolation for Security and Stability: The PythonExecutor can run code in a separate, persistent worker process using PersistentWorker, enhancing security by isolating execution and preventing issues from crashing the main application.
Changelog
  • lazyllm/tools/data/operators/code_sandbox_op.py
    • Added CodeSandboxSampleEvaluator for sandboxed code validation.
    • Integrated PythonExecutor for code execution.
  • lazyllm/tools/data/operators/python_executor.py
    • Added PythonExecutor class for executing Python code with process isolation and timeout.
    • Implemented PersistentWorker for managing isolated worker processes.
    • Defined various runtime classes (GenericRuntime, ImageRuntime, DateRuntime, ColorObjectRuntime) for different execution contexts.
    • Included utility functions for image encoding/decoding and matplotlib figure capturing.
Activity
  • The pull request was created by ririqing0614.
  • The title indicates the addition of code operators referring to a sandbox.
  • The description notes that the changes add code verification operators related to the sandbox, with further modifications pending.
  • The change type is marked as a 'New feature'.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a code sandbox execution feature, which is a valuable addition. The use of process isolation for security is a good design choice. I've identified a few areas for improvement, including a critical bug in image data handling, some opportunities to improve resource management, and suggestions to enhance code maintainability and correctness. All original comments remain valid and are included for your review.

Comment on lines +369 to +370
init_captured_figures.append(base64.b64encode(
BytesIO(image.tobytes()).getvalue()).decode('utf-8'))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There is a bug in how images are being re-encoded to base64. image.tobytes() returns raw pixel data, not a formatted image file (like PNG or JPEG). When this base64 string is decoded later, it will not be a renderable image. You should save the PIL image to a memory buffer in a specific format to get the correct byte representation.

For example:

with BytesIO() as buf:
    image.save(buf, format='PNG')
    b64_str = base64.b64encode(buf.getvalue()).decode('utf-8')
init_captured_figures.append(b64_str)


return image

except (base64.binascii.Error, Exception) as e:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catching the generic Exception is too broad. It can hide bugs and make debugging harder by catching unexpected errors. It's better to catch only the specific exceptions you expect from the try block, such as base64.binascii.Error and Image.UnidentifiedImageError from Pillow.

Suggested change
except (base64.binascii.Error, Exception) as e:
except (base64.binascii.Error, Image.UnidentifiedImageError) as e:


def exec_code(self, code_piece: str) -> None:
# Security check
if regex.search(r'(\s|^)?(input|os\.system|subprocess)\(', code_piece):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The regex used for the security check is a weak blacklist that can be easily bypassed (e.g., using __import__). While process isolation is the main security layer, this weak check might provide a false sense of security. It would be better to either remove it or replace it with a more robust sandboxing approach that restricts available built-ins and modules within the exec scope.

Comment on lines +455 to +500
else:
# Non-isolation mode (for backward compatibility)
runtime = runtime_class(messages) if runtime_class else self.runtime_class(messages)

try:
if get_answer_from_stdout:
program_io = io.StringIO()
with redirect_stdout(program_io):
runtime.exec_code('\n'.join(code))
program_io.seek(0)
result = program_io.read()
elif answer_symbol:
runtime.exec_code('\n'.join(code))
result = runtime._global_vars.get(answer_symbol, '')
elif answer_expr:
runtime.exec_code('\n'.join(code))
result = runtime.eval_code(answer_expr)
else:
if len(code) > 1:
runtime.exec_code('\n'.join(code[:-1]))
result = runtime.eval_code(code[-1])
else:
runtime.exec_code('\n'.join(code))
result = ''

# Check for captured figures
captured_figures = runtime.captured_figures
if captured_figures:
result = {
'text': result,
'images': captured_figures
} if result else {'images': captured_figures}
else:
result = {'text': result} if result else {}

report = 'Done'

except Exception as e:
error_msg = f'{type(e).__name__}: {str(e)}'
result = {
'error': error_msg,
'traceback': traceback.format_exc()
}
report = f'Error: {error_msg}'

return result, report

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The code in this else block (for non-isolated execution) is very similar to the logic inside PersistentWorker._handle_execute. This duplication increases maintenance overhead, as any bug fixes or changes will need to be applied in two places. Consider refactoring to reuse the same execution logic for both isolated and non-isolated paths.

Comment on lines +542 to +545
except TimeoutError as error:
LOG.warning(error)
all_exec_results.append(('', 'Timeout Error'))
timeout_cnt += 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This except TimeoutError block appears to be unreachable. The self.execute() method handles timeouts internally by catching queue.Empty and returns a dictionary with an error status, rather than raising a TimeoutError. This block seems to be a remnant of a previous implementation and should be removed to avoid confusion.

Comment on lines +580 to +584
def __del__(self):
'''Clean up resources.'''
if self.persistent_worker:
self.persistent_worker.terminate()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Relying on __del__ for resource cleanup is unreliable as its execution is not guaranteed. A better approach is to provide an explicit terminate() method and implement the context manager protocol (__enter__/__exit__). This allows for deterministic cleanup of the worker process using a with statement.

I'd suggest replacing __del__ with a terminate() method and advising users to call it explicitly or use a with block.

Suggested change
def __del__(self):
'''Clean up resources.'''
if self.persistent_worker:
self.persistent_worker.terminate()
def terminate(self):
'''Clean up resources.'''
if self.persistent_worker:
self.persistent_worker.terminate()
self.persistent_worker = None

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant