Skip to content

Difficulty Capturing Network Logs (TLS/HTTPS LLM Calls) for Python Programs in Docker Containers When AgentSight Runs on Host #23

@dyoung23

Description

@dyoung23

Hi team,
I'm using AgentSight to monitor AI agents running Python programs in Docker containers, but I'm encountering an issue with capturing network logs, specifically TLS-encrypted HTTPS calls (e.g., to LLM APIs like OpenAI or Claude). This seems related to Docker namespace isolation, as the monitoring works fine when AgentSight and the Python program are in the same environment, but fails in a mixed setup.
Environment:

Host OS: [e.g., Ubuntu 22.04, Linux kernel 5.15.0-25-generic]
Docker version: [e.g., 24.0.7]
AgentSight version: latest
Python version in container: 3.9
The Python programs use HTTPS for LLM calls (e.g., call OpenAI via requests library with OpenSSL).

Use Case:
In my setup, I have multiple temporary or long-running Docker containers that run Python AI agents. Temporary ones are executed via commands like:
textdocker run --rm -it my-image:latest python main.py
Long-running ones have fixed startup configurations that cannot be modified (no additional mounts, privileges, or network=host flags).
I need to monitor LLM calls (intent stream) and system actions (action stream) across these multiple containers without modifying their configs.
Steps to Reproduce:

Run AgentSight on the host:textagentsight trace -c "python" --server(This filters for Python processes and starts the server for logs/analysis.)
Launch a Docker container with a Python program that makes HTTPS LLM calls:textdocker run --rm -it my-image:latest python main.py(The Python script includes HTTPS requests, e.g., to an LLM API.)
Observe the AgentSight output/logs.

Expected Behavior:
AgentSight should capture both:

System actions (e.g., file operations via kernel events like openat2).
Network logs (intent stream: decrypted TLS traffic from SSL_read/SSL_write, including LLM prompts/responses).

This would allow full correlation as described in the paper (boundary tracing for semantic gap bridging).
Actual Behavior:

System/file operations are captured successfully (process lineage and kernel events work).
Network logs (TLS/HTTPS LLM calls) are not captured when AgentSight is on the host and Python is in Docker.
However, if I deploy AgentSight inside the same Docker container as the Python program (e.g., in a shared container with modified startup flags like --privileged, --pid=host, --network=host), then network logs are captured normally.

This suggests the issue is with uprobes not attaching properly across container namespaces (e.g., mount/user namespaces affecting OpenSSL symbol resolution or network stack visibility).
Question:
How can I configure AgentSight to monitor network logs for Python programs across multiple Docker containers when it's running on the host (or in a separate container)? Ideally without modifying the target containers' startup configs. Are there any flags, config options, or workarounds (e.g., dynamic uprobe path resolution for containers) to handle this? If this is a known limitation, could it be addressed in a future release?
Thanks for your help and for this great tool! I'd be happy to provide more logs or test patches.

Image Image

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions