Skip to content

Tui memory leak#6722

Merged
wxtim merged 2 commits intocylc:8.4.xfrom
oliver-sanders:tui-memory-leak
Apr 11, 2025
Merged

Tui memory leak#6722
wxtim merged 2 commits intocylc:8.4.xfrom
oliver-sanders:tui-memory-leak

Conversation

@oliver-sanders
Copy link
Copy Markdown
Member

@oliver-sanders oliver-sanders commented Apr 10, 2025

Fix a slow memory leak in Tui:

  • Urwid has a cache (CanvasCache) that caches the render of each widget to speed up future render cycles.
  • Tui is a dynamic tool, it creates a new TuiNode for each update.
  • The old TuiNode and its child nodes were not being cleared out of the urwid cache preventing Python's garbage clearance from doing its job.
  • This change explicitly clears the urwid cache for each Tui update.
  • Preserving the cache post-update isn't really a performance boost for Tui as pretty much every node on the screen has changed anyway (besides the application header and footer). So this is likely of little consequence performance wise. The cache is still active between updates so things like expand/collapse operations will still benefit.
  • There is an open issue in urwid for a similar issue with the cache so we may be able to drop this workaround one day.

How to test:

  • The memray profiler didn't seem to pick up on this issue (I guess caches are unlikely to be noticed by an internal sampling profiler).
  • I found easier to spot using an external profiler tool like the Python memory_profiler, but you can also spot it by counting objects using pympler.muppy.
  • Start a workflow in paused mode.
  • Launch Tui and open it on the given workflow.
  • Find the lead process ID and attach your profiler tool (note the subprocess will have the PPID of the lead process), e.g. mprof attach PID.
  • Leave it for ~60 seconds and stop your profiler, e.g. quit Tui and run mprof graph $PWD/<data-file>.
  • It should reveal a steady ratcheting of memory before this change and a flat line after (nothing is changing in the workflow, so the line should be quite flat).

Example output from memory_profiler before:

before

And after:

after

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included - hard to test for memory leaks.
  • Changelog entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@oliver-sanders oliver-sanders added bug Something is wrong :( efficiency For notable efficiency improvements labels Apr 10, 2025
@oliver-sanders oliver-sanders added this to the 8.4.x milestone Apr 10, 2025
@oliver-sanders oliver-sanders self-assigned this Apr 10, 2025
Comment on lines +513 to +517
if not (self.tree_walker):
self.tree_walker = urwid.TreeWalker(topnode)
self.listbox.body = self.tree_walker
else:
self.tree_walker.set_focus(topnode)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, there is a nicer way to do this now.

importlib_metadata>=5.0; python_version < "3.12"
# NOTE: exclude two urwid versions that were not compatible with Tui
urwid==2.*,!=2.6.2,!=2.6.3
urwid>=2.2,!=2.6.2,!=2.6.3,<3
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, drop support for urwid <2.2

* Urwid has a cache (CanvasCache) that caches the render of each widget
  to speed up future render cycles.
* Tui is a dynamic tool, it creates a new TuiNode for each update.
* The old TuiNode and its child nodes were not being cleared out of the
  urwid cache preventing Python's garbage clearance from doing its job.
* This change explicitly clears the urwid cache for each Tui update.
* Preserving the cache post-update isn't really a performance boost for
  Tui as pretty much every node on the screen has changed anyway
  (besides the application header and footer). So this is likely of
  little consequence performance wise. The cache is still active between
  updates so things like expand/collapse operations will still benefit.
Copy link
Copy Markdown
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed with a quick profiler run as per your instructions. Nice.

@oliver-sanders oliver-sanders requested a review from wxtim April 11, 2025 08:28
@oliver-sanders
Copy link
Copy Markdown
Member Author

@wxtim, could you give this a quick sanity check.

@wxtim wxtim merged commit 46f933f into cylc:8.4.x Apr 11, 2025
37 checks passed
@oliver-sanders oliver-sanders deleted the tui-memory-leak branch April 11, 2025 10:43
@MetRonnie MetRonnie modified the milestones: 8.4.x, 8.4.3 Apr 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something is wrong :( efficiency For notable efficiency improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants