Cpu memory graph break #3886
There are some changes that do not conform to Python style guidelines:
--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/_compiler.py 2025-11-04 20:05:23.825034+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/_compiler.py 2025-11-04 20:05:55.253944+00:00
@@ -876,15 +876,14 @@
# This is done to release CPU memory.
for attr in dir(gm):
if attr.startswith("_frozen_param"):
delattr(gm, attr)
-
-
from torch_tensorrt.dynamo.conversion._ConverterRegistry import DYNAMO_CONVERTERS
+
DYNAMO_CONVERTERS.disallowed_targets = set()
-
+
for name, _ in partitioned_module.named_children():
submodule = getattr(partitioned_module, name)
# filter on the GraphModule
if not isinstance(submodule, torch.fx.graph_module.GraphModule):
continue
Do you have a test case or something to demonstrate this feature?
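As a starting point for such a test, the attribute cleanup quoted in the lint diff above can be exercised in isolation. The sketch below is only illustrative: `FakeGraphModule`, `Blob`, and `release_frozen_params` are hypothetical stand-ins (not part of Torch-TensorRT), chosen so the check runs without torch installed. It uses a weak reference to confirm that a deleted `_frozen_param` object is actually released.

```python
import gc
import weakref


class FakeGraphModule:
    """Hypothetical stand-in for torch.fx.GraphModule (illustration only)."""


class Blob:
    """Hypothetical stand-in for a large CPU weight tensor."""

    def __init__(self, num_bytes: int) -> None:
        self.data = bytearray(num_bytes)


def release_frozen_params(gm: object) -> int:
    """Delete every attribute whose name starts with `_frozen_param`.

    Mirrors the loop in the diff; returns how many attributes were removed.
    """
    removed = 0
    # Snapshot the names first so we do not mutate while iterating.
    for attr in list(dir(gm)):
        if attr.startswith("_frozen_param"):
            delattr(gm, attr)
            removed += 1
    return removed


gm = FakeGraphModule()
gm._frozen_param0 = Blob(1024)
gm._frozen_param1 = Blob(1024)
gm.bias = Blob(16)  # unrelated attribute that must survive the cleanup

# A weak reference lets us observe whether the object was really freed.
probe = weakref.ref(gm._frozen_param0)
removed = release_frozen_params(gm)
gc.collect()
```

A real test would presumably build an actual `GraphModule` and measure process memory before and after compilation; this only checks the deletion logic.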
logger = logging.getLogger(__name__)
NON_BREAKABLE_OP_LISTS = [
["addmm", "addmm"],
Just a note for implementation later.
- this should use an actual subgraph definition
- it should use pytorch op targets not strings
- `addmm` should be decomposed, right? So the graph we want is `mm` -> `add`
- There should be a user-facing API to modify this list, similar to what we have for passes
def calculate_num_of_break(self, subgraphs: List[Subgraph]) -> int:
def calculate_size_budget(
Should there be an API to define this manually?
Yeah I think so. For now you can just hardcode and play with it
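One possible shape for such a manual override is sketched below. The function name `resolve_size_budget`, its parameters, and the fallback heuristic (a fixed fraction of available memory) are all assumptions for illustration; the PR's actual `calculate_size_budget` signature is not shown here.

```python
from typing import Optional


def resolve_size_budget(
    available_bytes: int,
    user_budget: Optional[int] = None,
    fraction: float = 0.5,
) -> int:
    """Pick the per-subgraph size budget in bytes.

    An explicit `user_budget` wins; otherwise fall back to a fraction of the
    available memory (the 0.5 default is an arbitrary placeholder).
    """
    if user_budget is not None:
        if user_budget <= 0:
            raise ValueError("user_budget must be a positive number of bytes")
        return user_budget
    return int(available_bytes * fraction)


GiB = 1024 ** 3
auto_budget = resolve_size_budget(8 * GiB)
manual_budget = resolve_size_budget(8 * GiB, user_budget=2 * GiB)
```

Hardcoding, as suggested for now, corresponds to always passing `user_budget`.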
This function breaks the subgraphs into smaller subgraphs at the specified frequency to save CPU memory.
"""
op_to_break = "add."
op_to_break = "addmm."
Why is this hardcoded?
This function is not called. I just left it there for experimentation and will delete it later.