
core:services:kraken: Fix container log stream timeout #3766

Merged
patrickelectric merged 1 commit into bluerobotics:master from joaomariolago:slow-down-logs-dicovery
Feb 4, 2026

Conversation

@joaomariolago
Collaborator

@joaomariolago joaomariolago commented Feb 4, 2026

Closes #3763

Summary by Sourcery

Reduce the frequency of extension log synchronization by integrating it into the existing cleaner task loop instead of running a separate high-frequency background task.

Enhancements:

  • Add a reusable helper method to perform extension log synchronization with error handling.
  • Remove the dedicated extension log background task in favor of running log synchronization as part of the cleaner task cycle.

Summary by Sourcery

Allow configuring Docker client timeouts for Kraken harbor operations and use a non-timing-out client for container log streaming to prevent premature log stream termination.

Bug Fixes:

  • Prevent container log streaming from timing out by using a Docker client configured with no timeout for log retrieval.

Enhancements:

  • Extend the Docker context manager to support an optional custom HTTP client timeout via a dedicated aiohttp session.

@sourcery-ai

sourcery-ai bot commented Feb 4, 2026


Reviewer's Guide

Integrates a configurable-timeout Docker client context to prevent container log stream timeouts and uses it specifically for log retrieval, while keeping existing Docker usage unchanged elsewhere.

Sequence diagram for container log retrieval with a non-expiring timeout

sequenceDiagram
    actor Service
    participant ContainerModel
    participant DockerCtx
    participant Docker
    participant aiohttp_ClientSession
    participant aiohttp_ClientTimeout

    Service->>ContainerModel: get_container_log_by_name(container_name)
    activate ContainerModel
    ContainerModel->>DockerCtx: __init__(timeout=0)
    activate DockerCtx
    DockerCtx->>Docker: Docker(session=True)
    activate Docker
    Docker-->>DockerCtx: docker_instance
    DockerCtx->>aiohttp_ClientTimeout: aiohttp_ClientTimeout(total=None, sock_read=None)
    activate aiohttp_ClientTimeout
    aiohttp_ClientTimeout-->>DockerCtx: timeout_instance
    DockerCtx->>aiohttp_ClientSession: aiohttp_ClientSession(connector=docker_instance.connector, timeout=timeout_instance)
    activate aiohttp_ClientSession
    aiohttp_ClientSession-->>DockerCtx: session_instance
    DockerCtx->>Docker: set session = session_instance
    Docker-->>DockerCtx: session_configured
    DockerCtx-->>ContainerModel: docker_instance as context
    deactivate DockerCtx

    ContainerModel->>Docker: get_container_logs(container_name)
    Docker-->>ContainerModel: Async log stream (no total or sock_read timeout)
    ContainerModel-->>Service: yield log lines
    deactivate ContainerModel

    Service-->>DockerCtx: exit async context
    DockerCtx->>Docker: close()
    Docker-->>DockerCtx: closed
    DockerCtx-->>Service: context closed

Class diagram for updated DockerCtx and container log retrieval

classDiagram
    class DockerCtx {
        - Docker _client
        + DockerCtx(timeout Optional_int)
        + __aenter__() Docker
        + __aexit__(exc_type type, exc value, traceback object) None
    }

    class Docker {
        + connector aiohttp_BaseConnector
        + session aiohttp_ClientSession
        + Docker(session bool)
    }

    class aiohttp_ClientSession {
        + connector aiohttp_BaseConnector
        + timeout aiohttp_ClientTimeout
        + aiohttp_ClientSession(connector aiohttp_BaseConnector, timeout aiohttp_ClientTimeout)
    }

    class aiohttp_ClientTimeout {
        + total Optional_int
        + sock_read Optional_int
        + aiohttp_ClientTimeout(total Optional_int, sock_read Optional_int)
    }

    class ContainerModel {
        + get_container_log_by_name(container_name str) AsyncGenerator_str_None
        + get_raw_container_by_name(client Docker, container_name str) DockerContainer
    }

    DockerCtx --> Docker : creates
    Docker --> aiohttp_ClientSession : uses
    aiohttp_ClientSession --> aiohttp_ClientTimeout : uses
    ContainerModel ..> DockerCtx : uses_for_log_stream
    ContainerModel ..> Docker : uses_via_DockerCtx

File-Level Changes

Change: Add optional timeout support to DockerCtx for customizing aiohttp client session timeouts used by aiodocker.
Details:
  • Extend DockerCtx constructor to accept an optional timeout parameter.
  • When no timeout is provided, preserve the previous behavior by constructing Docker with default settings.
  • When a timeout is provided, force creation of an aiohttp session-backed Docker client.
  • Construct a custom aiohttp.ClientSession with connector from Docker and a ClientTimeout that maps timeout=0 to no overall or sock_read timeout, otherwise using the given timeout value.
Files: core/services/kraken/harbor/contexts.py

Change: Use a non-timing-out Docker client when streaming container logs to avoid log stream timeout issues.
Details:
  • Change get_container_log_by_name to create its DockerCtx with timeout=0.
  • Ensure log streaming relies on the custom session timeout behavior while other callers of DockerCtx remain unaffected.
Files: core/services/kraken/harbor/container.py
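
The timeout mapping described for DockerCtx can be sketched in isolation. This is only an illustration of the rule stated above (None keeps the default client, 0 means no limit, any other value is used as given); the helper name `client_timeout_kwargs` is hypothetical and is not part of the PR. The returned dictionary uses the `total` and `sock_read` field names of `aiohttp.ClientTimeout`.

```python
from typing import Optional


def client_timeout_kwargs(timeout: Optional[float]) -> Optional[dict]:
    """Translate a DockerCtx-style timeout into ClientTimeout keyword arguments.

    Hypothetical helper, written only to illustrate the mapping:
      None -> None (caller keeps the default client and session)
      0    -> no overall limit and no socket-read limit
      n    -> n seconds for both limits
    """
    if timeout is None:
        return None
    # 0 is the "never time out" marker used by the log streaming path.
    limit = None if timeout == 0 else timeout
    return {"total": limit, "sock_read": limit}
```

With this rule, the log streaming path (timeout=0) ends up with both limits disabled, while callers that pass nothing keep the previous behavior.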

Assessment against linked issues

Issue Objective Addressed Explanation
#3763 Modify Kraken’s log ingestion/container log handling so it no longer causes long and repeating CPU usage surges in BlueOS 1.5.


@joaomariolago joaomariolago force-pushed the slow-down-logs-dicovery branch 3 times, most recently from b4181ca to 1f8a680, February 4, 2026 21:53
@joaomariolago joaomariolago changed the title core:services:kraken: Reduce ext log job interval core:services:kraken: Fix containerr log stream timeout Feb 4, 2026
* Make sure Kraken does not keep reattaching to stream logs due to the
  internal aiohttp 5-minute timeout
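
The effect described in the commit message can be reproduced with the standard library alone. The sketch below is an assumption-laden stand-in, not Kraken code: `fake_log_stream` and `read_logs` are hypothetical names, and `asyncio.wait_for` plays the role of the session's total timeout. It shows that an overall time budget ends a healthy, never-ending stream, which is what forces a consumer to reattach.

```python
import asyncio
from typing import AsyncIterator, List, Optional


async def fake_log_stream(interval: float) -> AsyncIterator[str]:
    """Stand-in for a container log stream that never ends on its own."""
    n = 0
    while True:
        await asyncio.sleep(interval)
        n += 1
        yield f"log line {n}"


async def read_logs(total: Optional[float], max_lines: int) -> List[str]:
    """Collect up to max_lines, stopping early once the total time budget is spent."""
    lines: List[str] = []

    async def consume() -> None:
        async for line in fake_log_stream(interval=0.01):
            lines.append(line)
            if len(lines) >= max_lines:
                return

    try:
        # timeout=None waits indefinitely, like a session without a total timeout.
        await asyncio.wait_for(consume(), timeout=total)
    except asyncio.TimeoutError:
        # The stream itself was fine; only the overall budget ran out.
        pass
    return lines
```

Calling `asyncio.run(read_logs(total=0.05, max_lines=1000))` returns only the few lines that arrived before the budget expired, while `total=None` lets the reader run until it has everything it asked for.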
@rafaellehmkuhl
Member

Tried just the harbor/ changes on Tony's Towfish ROV (where the problem was constantly happening) and there are no more CPU usage surges (after 25 minutes of testing).

@joaomariolago joaomariolago force-pushed the slow-down-logs-dicovery branch from 1f8a680 to 13c2ee0, February 4, 2026 23:18
@joaomariolago joaomariolago changed the title core:services:kraken: Fix containerr log stream timeout core:services:kraken: Fix container log stream timeout Feb 4, 2026
@rafaellehmkuhl
Member

@joaomariolago I believe we can link that PR to #3763, right?

@joaomariolago joaomariolago marked this pull request as ready for review February 4, 2026 23:47
@patrickelectric patrickelectric merged commit e161f06 into bluerobotics:master Feb 4, 2026
7 checks passed

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • Consider avoiding timeout=0 as a magic value for "no timeout" in DockerCtx (e.g., use a named constant or a dedicated sentinel) and document that behavior in the class docstring so callers understand the semantics.
  • The assignment self._client.session = self._client.session = aiohttp.ClientSession(...) in DockerCtx.__init__ looks accidental; simplify this to a single assignment to make the intent clear.
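
One possible shape for the first suggestion, replacing the magic `timeout=0` with a named sentinel, is sketched below. All names here (`NO_TIMEOUT`, `_NoTimeout`, `resolve_timeout`) are hypothetical, and the 300-second default is an assumption that mirrors the 5-minute aiohttp timeout mentioned in the commit message; none of this is taken from the merged code.

```python
import enum
from typing import Optional, Union


class _NoTimeout(enum.Enum):
    """Single-member enum used as a type-checkable sentinel."""

    NO_TIMEOUT = enum.auto()


NO_TIMEOUT = _NoTimeout.NO_TIMEOUT
TimeoutArg = Union[None, float, _NoTimeout]


def resolve_timeout(timeout: TimeoutArg, default: float = 300.0) -> Optional[float]:
    """Return the effective limit in seconds, or None for "never time out".

    None       -> fall back to `default`
    NO_TIMEOUT -> None (no limit)
    number > 0 -> that many seconds
    """
    if timeout is NO_TIMEOUT:
        return None
    if timeout is None:
        return default
    if timeout <= 0:
        # Reject 0 so it cannot silently mean "no timeout" again.
        raise ValueError("timeout must be positive; use NO_TIMEOUT to disable it")
    return float(timeout)
```

A caller would then write something like `resolve_timeout(NO_TIMEOUT)` for log streaming, which reads more clearly at the call site than a bare 0.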
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider avoiding `timeout=0` as a magic value for "no timeout" in `DockerCtx` (e.g., use a named constant or a dedicated sentinel) and document that behavior in the class docstring so callers understand the semantics.
- The assignment `self._client.session = self._client.session = aiohttp.ClientSession(...)` in `DockerCtx.__init__` looks accidental; simplify this to a single assignment to make the intent clear.

## Individual Comments

### Comment 1
<location> `core/services/kraken/harbor/contexts.py:16-20` </location>
<code_context>
+        if timeout is None:
+            self._client: Docker = Docker()
+        else:
+            # aiodocker will not create a session if is different from None
+            self._client: Docker = Docker(session=True)  # type: ignore
+            # We insert a new session with desired timeout
+            self._client.session = self._client.session = aiohttp.ClientSession(
+                connector=self._client.connector,
+                timeout=aiohttp.ClientTimeout(
+                    total=None if timeout == 0 else timeout,
</code_context>

<issue_to_address>
**issue (bug_risk):** Closing or reusing the initially created session should be explicit to avoid potential resource leaks.

Using `Docker(session=True)` creates an internal `ClientSession` that is then replaced with a new one, leaving the original potentially unclosed. To avoid leaking resources, either construct the desired `ClientSession` first and pass it into `Docker`, or explicitly close the initial session before overwriting `self._client.session`.
</issue_to_address>
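
The second option in this comment, closing a session explicitly before replacing it, can be shown with stand-in classes. `FakeSession` and `FakeClient` are hypothetical and do not model aiodocker; whether `Docker(session=True)` really creates an initial session is not verified here (the code comment in the diff suggests it does not). The sketch only demonstrates the general pattern.

```python
import asyncio
from typing import Tuple


class FakeSession:
    """Stand-in for an HTTP client session that must be closed explicitly."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class FakeClient:
    """Stand-in for a client object that owns exactly one session."""

    def __init__(self) -> None:
        self.session = FakeSession("initial")

    async def replace_session(self, new_session: FakeSession) -> None:
        # Close the session being replaced before the last reference to it is dropped.
        old = self.session
        self.session = new_session
        if old is not new_session and not old.closed:
            await old.close()

    async def close(self) -> None:
        await self.session.close()


async def demo() -> Tuple[bool, bool]:
    client = FakeClient()
    initial = client.session
    custom = FakeSession("custom-timeout")
    await client.replace_session(custom)
    await client.close()
    return initial.closed, custom.closed
```

Running `asyncio.run(demo())` reports both sessions as closed, so nothing is left open when the replacement happens.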



Development

Successfully merging this pull request may close these issues.

Something in BlueOS 1.5 is causing long and repeating surges in CPU usage
