-
Notifications
You must be signed in to change notification settings - Fork 16
Call BranchMorpher after dw deduplication #331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
pytato/transform/__init__.py
Outdated
|
||
class EqualBranchesReuser(CopyMapper): | ||
""" | ||
A mapper that replaces equal segments of graphs with identical objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add more context to the docstring, contrastng with CachedMapper
and specifically mentioning that this is about equality after the mapper applies, and describe the resulting complexity reduction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! A few more notes/thoughts below.
A mapper that reuses the same object instances for equal segments of | ||
graphs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A mapper that reuses the same object instances for equal segments of | |
graphs. | |
A mapper that reuses the same object instances for segments of | |
graphs that become equal after processing by subclasses of this mapper, | |
where the equality test happens immediately after a new node is created. |
:class:`pytato.Array` nodes. However, they differ at the point where | ||
two array expressions are compared. :class:`CopyMapper` compares array | ||
expressions before the expressions are mapped i.e. repeatedly comparing | ||
equal array expressions but unequal instances, and because of this it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
equal array expressions but unequal instances, and because of this it | |
different instances representing equal array expressions, and, because of this, it |
hand, :class:`PostMapEqualNodeReuser` has linear complexity in the | ||
number of nodes in the number of array expressions as the larger mapped | ||
expressions already contain same instances for the predecessors, | ||
resulting in a cheaper equality comparison overall. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you want to say instead here is that this avoids building potentially large repeated subexpexpressions in the first place, rather than having to (expensively) merge them after the fact.
return array_or_names | ||
# many paths in the DAG might be semantically equivalent after DWs are | ||
# deduplicated => morph them | ||
return PostMapEqualNodeReuser()(array_or_names) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, wait. If it's used like this, I would argue that that's equivalent to just using a CopyMapper
. I figured the data-wrapper-deduplicator should inherit from this to avoid building large identical subtrees in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added test_post_map_equal_node_reuser_intestine
to investigate whether this is equivalent to a CopyMapper
, and at least in that example, it appears to be. It remains to figure out where the difference in execution time comes from.
@@ -1782,8 +1829,11 @@ def cached_data_wrapper_if_present(ary: ArrayOrNames) -> ArrayOrNames: | |||
len(data_wrapper_cache), | |||
data_wrappers_encountered - len(data_wrapper_cache)) | |||
|
|||
return array_or_names | |||
# many paths in the DAG might be semantically equivalent after DWs are | |||
# deduplicated => morph them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still: "morph" isn't good terminology.
Some timings on the graphs (~36k nodes) implemented in https://github.com/illinois-ceesd/drivers_y2-isolator:
With this patch:
Without this patch: