docbuild metadata saturates 100% of CPU #188

@tomschr

Description

Situation

My previous tests used only a small subset of the whole documentation set. With sles/*/en-us, however, CPU usage went up to 100% even on a machine with 16 CPU cores. This looks suspicious, and I think there is a problem in the code.

Use Case

Make the subcommand work on the full documentation set while keeping the system responsive.

Background

If we have, for example, 50 deliverables, process_deliverable attempts to launch 50 git, daps, and temporary clone tasks concurrently. This leads to a classic "thundering herd" problem, where a massive number of tasks are spawned simultaneously, with several negative consequences:

  • CPU Saturation
    Even with many cores, trying to run hundreds of CPU-intensive tasks at once will overwhelm the system, causing CPU usage to spike to 100%.
  • Memory Exhaustion
    Each of those tasks consumes a significant amount of memory. Launching too many can exhaust the system's RAM.
  • I/O Bottlenecks
    The simultaneous creation of hundreds of temporary directories and repository clones can create a bottleneck on the disk's I/O, slowing everything down.
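
The fan-out can be reproduced without git or daps. The following is a minimal sketch, assuming a hypothetical `fake_deliverable` coroutine that stands in for process_deliverable and only sleeps. It records how many coroutines are in flight at once when all of them are started with `asyncio.gather`. The names and the sleep duration are illustrative and not taken from the docbuild code.

```python
import asyncio


async def fake_deliverable(state: dict) -> None:
    """Hypothetical stand-in for process_deliverable; it only sleeps."""
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    try:
        await asyncio.sleep(0.01)  # placeholder for git/daps subprocess work
    finally:
        state["active"] -= 1


async def run_unbounded(count: int) -> int:
    """Start all tasks at once and return the peak number in flight."""
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(fake_deliverable(state) for _ in range(count)))
    return state["peak"]


if __name__ == "__main__":
    # With no limit in place, the peak equals the number of deliverables.
    print(asyncio.run(run_unbounded(50)))
```

In this sketch every task reaches its first await before any of them finishes, so the peak matches the number of deliverables. With real subprocesses, each of those in-flight tasks would correspond to at least one running git or daps process.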

Possible Implementation

The most idiomatic way to control resource usage from within an asyncio application is to cap the number of tasks that run concurrently. Spawning hundreds of git and daps processes at once, even asynchronously, creates significant overhead for the OS scheduler and can easily consume all available CPU cores.

Some ideas:

  • Use an asyncio.Semaphore as a gate, ensuring only a certain number of process_deliverable tasks are active at any given time. A sensible limit is the number of CPU cores on the machine, or a value from the app config.
  • Use an asyncio.Queue and implement a producer/worker pattern. With a queue, you only create a small, fixed number of long-lived worker tasks. This can be more memory-efficient if the work items themselves are small, as you don't have hundreds of suspended coroutine objects waiting.
  • Go through the file and identify blocking commands.
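
The first two ideas can be sketched side by side. This is a minimal sketch under stated assumptions, not the docbuild implementation: `fake_deliverable` is a hypothetical stand-in for process_deliverable that only sleeps, the `state` dictionary exists purely to measure peak concurrency, and the queue variant assumes all work items are known up front, so an empty queue means the work is done.

```python
import asyncio
import os


async def fake_deliverable(item: int, state: dict) -> int:
    """Hypothetical stand-in for process_deliverable; it only sleeps."""
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    try:
        await asyncio.sleep(0.01)  # placeholder for git/daps subprocess work
        return item
    finally:
        state["active"] -= 1


async def run_with_semaphore(items: list, limit: int) -> tuple:
    """Create one task per item, but let only `limit` run the body at a time."""
    state = {"active": 0, "peak": 0}
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item: int) -> int:
        async with semaphore:  # waits here while `limit` tasks hold the gate
            return await fake_deliverable(item, state)

    results = await asyncio.gather(*(guarded(item) for item in items))
    return list(results), state["peak"]


async def run_with_workers(items: list, limit: int) -> tuple:
    """Feed items through a queue to `limit` long-lived worker tasks."""
    state = {"active": 0, "peak": 0}
    queue = asyncio.Queue()
    for item in items:
        queue.put_nowait(item)
    results = []

    async def worker() -> None:
        while True:
            try:
                item = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # all items were queued up front, so empty means done
            results.append(await fake_deliverable(item, state))

    await asyncio.gather(*(worker() for _ in range(limit)))
    return results, state["peak"]


if __name__ == "__main__":
    limit = os.cpu_count() or 1  # assumption: default to the core count
    items = list(range(50))
    print(asyncio.run(run_with_semaphore(items, limit)))
    print(asyncio.run(run_with_workers(items, limit)))
```

Both variants keep the number of active bodies at or below `limit`. The semaphore variant still creates one coroutine per item and returns results in input order; the worker variant creates only `limit` tasks and collects results in completion order.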

Some ideas outside of Python:

  • taskset pins a process to a specific set of CPU cores:

    taskset -c 0,1 python -m docbuild metadata ...
  • cpulimit (openSUSE package name: cpulimit) monitors and throttles a process to keep its CPU usage below a certain percentage:

    # Limit the process to 200% CPU (i.e., two full cores)
    cpulimit --limit=200 -- python -m docbuild metadata ...
