
Python code execution tool (System/Venv/Docker) #1371


Draft
wants to merge 9 commits into base: main
Conversation

marklysze
Collaborator

@marklysze marklysze commented Mar 17, 2025

Why are these changes needed?

Tool for code execution that utilises an environment object to determine where execution takes place: the local Python environment (aka System), a virtual environment (aka Venv), or a Docker container.

This will replace the need for the code execution capabilities currently in ConversableAgent. It will be useful for building agents for tasks such as code development and code testing.

Related issue number

N/A

Checks

@marklysze marklysze added the enhancement New feature or request label Mar 17, 2025
@marklysze marklysze added this to ag2 Mar 17, 2025
@marklysze
Collaborator Author

marklysze commented Mar 17, 2025

Sample code:

from autogen import ConversableAgent, register_function
from autogen.tools.experimental import PythonLocalExecutionTool

# Initialize the tool
python_executor = PythonLocalExecutionTool(
    use_venv=True,
    venv_path="/app/ag2/ms_code/.venv_ms_testing",  # Use an existing virtual environment; otherwise one is created per call and destroyed on completion
    timeout=60,
)

llm_config = {"model": "gpt-4o-mini", "api_type": "openai"}

# Create an agent that can use the tool
code_runner = ConversableAgent(
    name="code_runner",
    system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
    llm_config=llm_config,
)

question_agent = ConversableAgent(
    name="question_agent",
    system_message="You are a developer AI agent. Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. Keep refining the code until it works.",
    llm_config=llm_config,
)

register_function(
    python_executor,
    caller=question_agent,
    executor=code_runner,
    description="Run Python code",
)

result = code_runner.initiate_chat(
    recipient=question_agent,
    message="""
    Write a Python code with incorrect syntax. Always install numpy and pandas.
    """,
    max_turns=5,
)

@davorrunje davorrunje self-requested a review March 17, 2025 08:17
@davorrunje
Collaborator

davorrunje commented Mar 17, 2025

We can use asyncer to support async file handling and remove the optional dependency to aiofiles. E.g.

from asyncer import asyncify

def read_file(name: str) -> str:
    with open(name) as f:
        return f.read()


async def a_read_file(name: str):
    content = await asyncify(read_file)(name=name)
    print(content)

@marklysze
Collaborator Author

We can use asyncer to support async file handling and remove the optional dependency to aiofiles. E.g.

from asyncer import asyncify

def read_file(name: str) -> str:
    with open(name) as f:
        return f.read()


async def a_read_file(name: str):
    content = await asyncify(read_file)(name=name)
    print(content)

Great, thanks!

@davorrunje
Collaborator

davorrunje commented Mar 17, 2025

I think we should use context managers to define things like the Python environment and working directories. They need lifecycle management, and that should be made explicit rather than hidden in the implementation details inside ConversableAgent.

This is how it could look:

from autogen import ConversableAgent, LLMConfig
from autogen.environments import PythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonLocalExecutionTool

# create virtual env and manage its lifecycle
with PythonEnvironment(python_version="3.11", dependencies=["numpy>2.1,<3", "pandas"]):

    # create working directories
    with WorkingDirectory.create_tmp() as wd1, WorkingDirectory.create_tmp() as wd2:
        # Initialize the tool
        python_executor = PythonLocalExecutionTool(
            use_venv=True,
            timeout=60,
            # the default working directory is from the outer scope (wd2), but you can change it explicitly
            working_directory=wd1,
            # it is using the python environment from the outer scope, but we could change it
            # python_environment=...
        )

        with LLMConfig(model="gpt-4o-mini", api_type="openai"):

            # Create an agent that can use the tool
            code_runner = ConversableAgent(
                name="code_runner",
                system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
                tools=python_executor,
            )

            question_agent = ConversableAgent(
                name="question_agent",
                system_message="You are a developer AI agent. Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. Keep refining the code until it works.",
            )

        # this will be done automatically in the global run function
        python_executor.register_for_execution(question_agent)
       
        result = code_runner.initiate_chat(
            recipient=question_agent,
            message="""
            Write a Python code with incorrect syntax. Always install numpy and pandas.
            """,
            max_turns=5,
        )
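The explicit-lifecycle idea above can be sketched as a plain context manager. The class below is a hypothetical minimal illustration (not the actual AG2 `WorkingDirectory` API): setup happens in `__enter__`, teardown in `__exit__`, so cleanup is deterministic rather than hidden in agent internals.

```python
import os
import shutil
import tempfile

class TempWorkingDirectory:
    """Hypothetical minimal sketch of lifecycle management via a context manager."""

    def __enter__(self) -> str:
        self._prev = os.getcwd()
        self.path = tempfile.mkdtemp(prefix="ag2_wd_")
        os.chdir(self.path)  # make the temp dir the current working directory
        return self.path

    def __exit__(self, *exc) -> bool:
        os.chdir(self._prev)  # restore the previous working directory
        shutil.rmtree(self.path, ignore_errors=True)  # deterministic cleanup
        return False

with TempWorkingDirectory() as wd:
    saved = wd
    with open(os.path.join(wd, "script.py"), "w") as f:
        f.write("print('hi')")

print(os.path.isdir(saved))  # False: directory removed on exit
```

The same `__enter__`/`__exit__` shape applies to a Python environment object, where setup would create a venv or start a container and teardown would destroy it.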

@marklysze
Collaborator Author

Updated: there are now VenvPythonEnvironment and SystemPythonEnvironment for Python environments, and WorkingDirectory for the working folder.

SystemPythonEnvironment example with context managers:

from autogen import ConversableAgent, LLMConfig, register_function

# Import the environment, working directory, and code execution tool
from autogen.environments import SystemPythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonLocalExecutionTool

with SystemPythonEnvironment(executable="/usr/local/bin/python") as sys_py_env:
    with WorkingDirectory(path="/tmp/ag2_working_dir/") as wd:
        # Create our code execution tool, using the environment and working directory from the above context managers
        python_executor = PythonLocalExecutionTool(
            timeout=60,
            # If not using the context managers above, you can set the working directory and python environment here
            # working_directory=wd,
            # python_environment=sys_py_env,
        )

with LLMConfig(model="gpt-4o", api_type="openai"):

    # code_runner has the code execution tool available to execute
    code_runner = ConversableAgent(
        name="code_runner",
        system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
        human_input_mode="NEVER",
    )

    # question_agent has the code execution tool available to its LLM
    question_agent = ConversableAgent(
        name="question_agent",
        system_message=("You are a developer AI agent. "
            "Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. "
            "Keep refining the code until it works."
        ),
    )

# Register the python execution tool with the agents
register_function(
    python_executor,
    caller=question_agent,
    executor=code_runner,
    description="Run Python code",
)

result = code_runner.initiate_chat(
    recipient=question_agent,
    message=("Write Python code to print the current Python version followed by the numbers 1 to 11. "
             "Make a syntax error in the first version and fix it in the second version."
    ),
    max_turns=5,
)

print(f"Result: {result.summary}")
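For intuition, the local execution that a tool like this performs can be sketched roughly as follows. This is assumed mechanics under the hood, not the tool's actual implementation; the result dictionary simply mirrors the `success`/`stdout`/`stderr`/`returncode` shape seen in the tool responses in this thread.

```python
import os
import subprocess
import sys
import tempfile

def run_python(code: str, timeout: int = 60, python: str = sys.executable) -> dict:
    """Write the code to a temporary script and execute it, capturing the result (sketch only)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        script = f.name
    try:
        proc = subprocess.run(
            [python, script], capture_output=True, text=True, timeout=timeout
        )
        return {
            "success": proc.returncode == 0,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
            "returncode": proc.returncode,
        }
    except subprocess.TimeoutExpired:
        return {"success": False, "stdout": "", "stderr": "timeout", "returncode": -1}
    finally:
        os.unlink(script)  # remove the temporary script regardless of outcome

print(run_python("print(1 + 1)"))
```

Swapping the `python` argument is what distinguishes a System environment (current interpreter) from a Venv environment (the venv's interpreter).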

Venv environment example:

from autogen import ConversableAgent, LLMConfig, register_function

# Import the environment, working directory, and code execution tool
from autogen.environments import VenvPythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonLocalExecutionTool

# Create a new virtual environment using a Python version
# Change this to match a version you have installed
venv = VenvPythonEnvironment(python_version="3.11")

# Create a temporary directory
working_dir = WorkingDirectory.create_tmp()

# Create our code execution tool
python_executor = PythonLocalExecutionTool(
    working_directory=working_dir,
    python_environment=venv,
)

with LLMConfig(model="gpt-4o", api_type="openai"):

    # code_runner has the code execution tool available to execute
    code_runner = ConversableAgent(
        name="code_runner",
        system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
        human_input_mode="NEVER",
    )

    # question_agent has the code execution tool available to its LLM
    question_agent = ConversableAgent(
        name="question_agent",
        system_message=("You are a developer AI agent. "
            "Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. "
            "Keep refining the code until it works."
        ),
    )

# Register the python execution tool with the agents
register_function(
    python_executor,
    caller=question_agent,
    executor=code_runner,
    description="Run Python code",
)

result = code_runner.initiate_chat(
    recipient=question_agent,
    message=("Write a Python program to write a poem to a file. "
             "Follow up with another program to read the poem from the file and print it."
    ),
    max_turns=5,
)

print(f"Result: {result.summary}")
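Creating a throwaway venv and resolving its interpreter can be done with the standard library alone; the snippet below is a rough sketch of what `VenvPythonEnvironment` is assumed to do internally, not its actual code.

```python
import subprocess
import sys
import tempfile
import venv
from pathlib import Path

# build a throwaway virtual environment in a temp directory
venv_dir = Path(tempfile.mkdtemp(prefix="ag2_venv_"))
venv.create(venv_dir, with_pip=False)

# resolve the venv's interpreter path (layout differs on Windows vs POSIX)
python_bin = venv_dir / ("Scripts/python.exe" if sys.platform == "win32" else "bin/python")

# run code with the venv's interpreter rather than the current one
out = subprocess.run(
    [str(python_bin), "-c", "import sys; print(sys.version_info[:2])"],
    capture_output=True,
    text=True,
)
print(out.stdout)
```

Selecting a specific `python_version` (as in the example above) would additionally require locating an installed interpreter of that version before creating the venv.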

@marklysze
Collaborator Author

@davorrunje I've updated and incorporated environments. Please have a look over it. Tests are still missing; I'd like some guidance on how to test this.

@marklysze marklysze changed the title Local Python code execution tool Python code execution tool (System/Venv/Docker) Mar 27, 2025
@marklysze
Collaborator Author

marklysze commented Mar 27, 2025

DockerPythonEnvironment example:

from autogen import ConversableAgent, LLMConfig, register_function

# Import the environment, working directory, and code execution tool
from autogen.environments import DockerPythonEnvironment, WorkingDirectory
from autogen.tools.experimental import PythonCodeExecutionTool

with DockerPythonEnvironment(image="python:3.11-slim", pip_packages=["numpy", "pandas", "matplotlib"]) as docker_env:
    with WorkingDirectory(path="/tmp/ag2_working_dir/") as wd:
        # Create our code execution tool, using the environment and working directory from the above context managers
        python_executor = PythonCodeExecutionTool(
            timeout=60,
            # If not using the context managers above, you can set the working directory and python environment here
            # working_directory=wd,
            # python_environment=docker_env,
        )

    with LLMConfig(model="gpt-4o", api_type="openai"):
        # code_runner has the code execution tool available to execute
        code_runner = ConversableAgent(
            name="code_runner",
            system_message="You are a code executor agent, when you don't execute code write the message 'TERMINATE' by itself.",
            human_input_mode="NEVER",
        )

        # question_agent has the code execution tool available to its LLM
        question_agent = ConversableAgent(
            name="question_agent",
            system_message=(
                "You are a developer AI agent. "
                "Send all your code suggestions to the python_executor tool where it will be executed and result returned to you. "
                "Keep refining the code until it works."
            ),
        )

    # Register the python execution tool with the agents
    register_function(
        python_executor,
        caller=question_agent,
        executor=code_runner,
        description="Run Python code",
    )

    result = code_runner.initiate_chat(
        recipient=question_agent,
        message=(
            "Write Python code to print the current Python version followed by the numbers 1 to 11. "
            "Make a syntax error in the first version and fix it in the second version."
        ),
        max_turns=5,
    )

    print(f"Result: {result.summary}")

Output

2025-03-28 06:13:11,416 - INFO - Docker version: Docker version 27.5.1, build 9f9e405
2025-03-28 06:13:13,980 - INFO - Pulled Docker image: python:3.11-slim
2025-03-28 06:13:13,980 - INFO - Starting Docker container: ag2_docker_env_8e729dc8
2025-03-28 06:13:14,217 - INFO - Started Docker container: ag2_docker_env_8e729dc8 (e7691423b8cf1857f5b07adea2a4b479d9377b9400d350fa324037c3a03fc8dd)
2025-03-28 06:13:14,218 - INFO - Installing pip packages: numpy pandas matplotlib
2025-03-28 06:13:25,157 - INFO - Successfully installed pip packages
code_runner (to question_agent):

Write Python code to print the current Python version followed by the numbers 1 to 11. Make a syntax error in the first version and fix it in the second version.

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
question_agent (to code_runner):

***** Suggested tool call (call_GvddnvooTw8sONm9lsDUTiBJ): python_execute_code *****
Arguments: 
{"code_execution_request":{"code":"import platform\nprint('Python version:', platform.python_version())\nfor i in range(1, 12):\n  printi(i)","libraries":[]}}
************************************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION python_execute_code...
Call ID: call_GvddnvooTw8sONm9lsDUTiBJ
Input arguments: {'code_execution_request': {'code': "import platform\nprint('Python version:', platform.python_version())\nfor i in range(1, 12):\n  printi(i)", 'libraries': []}}
code_runner (to question_agent):

***** Response from calling tool (call_GvddnvooTw8sONm9lsDUTiBJ) *****
{'success': False, 'stdout': 'Python version: 3.11.11\n', 'stderr': 'Traceback (most recent call last):\n  File "/workspace/script.py", line 4, in <module>\n    printi(i)\n    ^^^^^^\nNameError: name \'printi\' is not defined. Did you mean: \'print\'?\n', 'returncode': 1}
**********************************************************************

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
2025-03-28 06:13:29,210 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
question_agent (to code_runner):

There was a NameError due to using 'printi' instead of 'print'. Let's correct that and execute the code again.
***** Suggested tool call (call_VvDlLL6tIvnQB1FHZ7FroUaL): python_execute_code *****
Arguments: 
{"code_execution_request":{"code":"import platform\nprint('Python version:', platform.python_version())\nfor i in range(1, 12):\n  print(i)","libraries":[]}}
************************************************************************************

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING FUNCTION python_execute_code...
Call ID: call_VvDlLL6tIvnQB1FHZ7FroUaL
Input arguments: {'code_execution_request': {'code': "import platform\nprint('Python version:', platform.python_version())\nfor i in range(1, 12):\n  print(i)", 'libraries': []}}
code_runner (to question_agent):

***** Response from calling tool (call_VvDlLL6tIvnQB1FHZ7FroUaL) *****
{'success': True, 'stdout': 'Python version: 3.11.11\n1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n', 'stderr': '', 'returncode': 0}
**********************************************************************

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...
2025-03-28 06:13:30,605 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
question_agent (to code_runner):

The corrected code executed successfully and printed the current Python version followed by the numbers 1 to 11:

'''
Python version: 3.11.11
1
2
3
4
5
6
7
8
9
10
11
'''

--------------------------------------------------------------------------------
2025-03-28 06:13:31,356 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
code_runner (to question_agent):

TERMINATE

--------------------------------------------------------------------------------
Please give feedback to code_runner. Press enter or type 'exit' to stop the conversation: exit

>>>>>>>> TERMINATING RUN (34bd4162-7dfd-47d1-98ad-3f4072b51ba0): Termination message condition on agent 'question_agent' met and no human input provided

>>>>>>>> TERMINATING RUN (178595eb-15b8-417f-a0e0-f3e91765d766): Termination message condition on agent 'code_runner' met
Result: 
2025-03-28 06:13:35,669 - INFO - Stopping Docker container: ag2_docker_env_8e729dc8
2025-03-28 06:13:45,883 - INFO - Removing Docker container: ag2_docker_env_8e729dc8
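The container lifecycle visible in the log above (start container, install pip packages, execute, stop, remove) maps onto standard Docker CLI commands. The sketch below only constructs the command lists rather than running them, so it is illustrative of the assumed mechanics, not the tool's implementation.

```python
import uuid

def docker_lifecycle(image: str, pip_packages: list) -> list:
    """Build the Docker CLI commands for the lifecycle seen in the log (sketch only)."""
    name = f"ag2_docker_env_{uuid.uuid4().hex[:8]}"
    return [
        # start a long-lived container to exec code into
        ["docker", "run", "-d", "--name", name, image, "sleep", "infinity"],
        # install requested packages inside the container
        ["docker", "exec", name, "pip", "install", *pip_packages],
        # teardown on context-manager exit
        ["docker", "stop", name],
        ["docker", "rm", name],
    ]

cmds = docker_lifecycle("python:3.11-slim", ["numpy", "pandas", "matplotlib"])
for c in cmds:
    print(" ".join(c))
```

Executing a script would then be one more `docker exec` (or `docker cp` plus `exec`) between the install and stop steps.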


codecov bot commented Mar 27, 2025

Codecov Report

Attention: Patch coverage is 26.07656% with 309 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
autogen/environments/docker_python_environment.py 12.57% 139 Missing ⚠️
autogen/environments/venv_python_environment.py 17.02% 78 Missing ⚠️
autogen/environments/working_directory.py 36.36% 28 Missing ⚠️
autogen/environments/system_python_environment.py 37.83% 23 Missing ⚠️
autogen/environments/python_environment.py 51.16% 21 Missing ⚠️
...perimental/code_execution/python_code_execution.py 39.39% 20 Missing ⚠️

Files with missing lines Coverage Δ
autogen/environments/__init__.py 100.00% <100.00%> (ø)
autogen/tools/experimental/__init__.py 100.00% <100.00%> (ø)
...ogen/tools/experimental/code_execution/__init__.py 100.00% <100.00%> (ø)
...perimental/code_execution/python_code_execution.py 39.39% <39.39%> (ø)
autogen/environments/python_environment.py 51.16% <51.16%> (ø)
autogen/environments/system_python_environment.py 37.83% <37.83%> (ø)
autogen/environments/working_directory.py 36.36% <36.36%> (ø)
autogen/environments/venv_python_environment.py 17.02% <17.02%> (ø)
autogen/environments/docker_python_environment.py 12.57% <12.57%> (ø)

... and 70 files with indirect coverage changes


@marklysze
Collaborator Author

Note: I want to refactor the Docker configuration parameters into two new classes, DockerExistingContainerConfig and DockerNewContainerConfig, as the DockerPythonEnvironment parameter list is long.
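The proposed split could look roughly like the dataclasses below. The class names come from the comment above; the fields are illustrative guesses only, not the final API.

```python
from dataclasses import dataclass, field

@dataclass
class DockerNewContainerConfig:
    """Hypothetical: parameters for creating and managing a fresh container."""
    image: str = "python:3.11-slim"
    pip_packages: list = field(default_factory=list)
    container_name_prefix: str = "ag2_docker_env_"

@dataclass
class DockerExistingContainerConfig:
    """Hypothetical: parameters for attaching to an already-running container."""
    container_name: str

new_cfg = DockerNewContainerConfig(pip_packages=["numpy"])
existing_cfg = DockerExistingContainerConfig(container_name="my_container")
```

Accepting one config object or the other would also make the create-vs-attach choice explicit at the type level instead of via a long flat parameter list.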

@davorrunje davorrunje self-assigned this Apr 7, 2025