mem: free intermediate arrays during YoloX inference #483
Open
KRRT7 wants to merge 2 commits into Unstructured-IO:main from
Delete origin_img, img/ort_inputs, and output at the points where they become dead instead of letting them linger until function return. The biggest win is origin_img — the full-resolution numpy copy of the input PIL image — which stays alive through ONNX inference in the current code. Savings are proportional to image size.
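As a rough illustration of the pattern this commit describes, here is a minimal sketch, not the actual image_processing() code. The names FakeSession, preprocess_sketch, and image_processing_sketch, the "images" input name, and the stride-based downsampling are all hypothetical stand-ins chosen so the example runs without ONNX Runtime.

```python
import numpy as np


class FakeSession:
    """Hypothetical stand-in for an inference session; not the ONNX Runtime API object."""

    def run(self, output_names, input_feed):
        # Pretend inference: return one fixed-size output array.
        return [np.ones((1, 5, 4), dtype=np.float32)]


def preprocess_sketch(img):
    """Hypothetical preprocessing: downsample by 4 and convert to CHW float32.

    astype() copies, so the result does not keep the full-resolution array alive.
    """
    small = img[::4, ::4].astype(np.float32)
    return small.transpose(2, 0, 1), 0.25


def image_processing_sketch(image, session):
    origin_img = np.array(image)  # full-resolution copy of the input
    img, ratio = preprocess_sketch(origin_img)
    del origin_img  # dead after preprocessing; release it before inference

    ort_inputs = {"images": img[None, ...]}
    del img  # the batch in ort_inputs still references the underlying buffer
    output = session.run(None, ort_inputs)
    del ort_inputs  # the input batch is dead once inference returns

    predictions = output[0][0] / ratio  # new array, independent of output
    del output
    return predictions


page = np.zeros((2200, 1700, 3), dtype=np.uint8)
preds = image_processing_sketch(page, FakeSession())
print(preds.shape)
```

Note that del only drops the local name. Memory is released at that point only if no other reference (for example a view returned by preprocessing, or a reference held by the caller) still points at the same buffer.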
Free origin_img, img/ort_inputs, and output at the points where they become dead in image_processing(), instead of letting them linger until function return.
The biggest win is origin_img (the full-resolution numpy copy of the input PIL image), which currently stays alive through the entire ONNX session.run() call. Savings are proportional to image size: larger pages (higher DPI renders) carry a bigger unused array through inference.
Benchmark
Measured with memray (memray run + memray stats --json), 3 iterations per approach, on Apple M3 Max / Python 3.12. ONNX inference workspace simulated as a 35 MiB allocation.
At the default 200 DPI render resolution (1700×2200 for US Letter), this frees ~11 MiB of dead weight before ONNX inference. Zero behavior change, just earlier cleanup of arrays that are never read again.
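The ~11 MiB figure can be sanity-checked with a little arithmetic. The sketch below assumes the array is 8-bit RGB (3 bytes per pixel), which the description does not state explicitly; the helper name rgb_array_mib is hypothetical.

```python
MIB = 1024 * 1024


def rgb_array_mib(width_in: float, height_in: float, dpi: int, channels: int = 3) -> float:
    """Size in MiB of a uint8 image array for a page rendered at the given DPI."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * channels / MIB


# US Letter (8.5 x 11 in) at 200 DPI is 1700 x 2200 pixels.
letter_200 = rgb_array_mib(8.5, 11, 200)
print(f"{letter_200:.1f} MiB")  # 10.7 MiB, consistent with the ~11 MiB above
```

Because the pixel count grows with the square of the DPI, the same page rendered at a higher resolution carries a proportionally larger dead array through inference.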
Reproduce
benchmarks/bench_free_intermediates.py
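The benchmark script itself is not reproduced in this conversation. As a loose approximation of the measurement idea, the sketch below swaps memray for the standard-library tracemalloc module and uses bytearray buffers in place of numpy arrays; the function name simulated_peak and the overall structure are assumptions, not the contents of bench_free_intermediates.py.

```python
import tracemalloc

MIB = 1024 * 1024
IMAGE_BYTES = 1700 * 2200 * 3  # US Letter at 200 DPI, assuming 8-bit RGB
WORKSPACE_BYTES = 35 * MIB     # simulated inference workspace, as in the description


def simulated_peak(free_early: bool) -> int:
    """Return peak traced bytes for one simulated inference call."""
    tracemalloc.start()
    origin_img = bytearray(IMAGE_BYTES)  # stand-in for the full-resolution array
    if free_early:
        del origin_img  # release before the inference workspace is allocated
    workspace = bytearray(WORKSPACE_BYTES)  # stand-in for session.run()
    del workspace
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


baseline = simulated_peak(free_early=False)
early = simulated_peak(free_early=True)
saved_mib = (baseline - early) / MIB
print(f"peak reduced by about {saved_mib:.1f} MiB")
```

tracemalloc only sees allocations made through Python's allocators, so this is a coarser view than memray's; it is meant to show why freeing the image before the workspace allocation lowers the peak, not to replace the reported numbers.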