-
Notifications
You must be signed in to change notification settings - Fork 27
hier comparae #910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hier comparae #910
Conversation
📝 WalkthroughWalkthroughThe changes in this pull request involve several modifications across various files in the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
closes #864 |
@@ -68,6 +69,10 @@ def __init__( | |||
|
|||
self.update() | |||
|
|||
def __deepcopy__(self, memo): | |||
no_copy_keys = ["data_files"] # do not copy these things |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deepcopy crashes without this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (14)
pyphare/pyphare/pharein/load_balancer.py (2)
53-54
: LGTM: Simplified error message with improved readabilityThe error message has been simplified by removing the f-string and split across two lines, improving readability. This change is consistent with the other modifications in the file and maintains the clarity of the message.
For consistency with the other error messages in this method, consider moving the entire message to a single line:
raise RuntimeError("LoadBalancer cannot be registered as no simulation exists")
Line range hint
36-56
: Summary: Consistent simplification of error messagesThe changes in this file involve simplifying error messages by removing f-strings. These modifications improve code consistency and slightly simplify the error messages without altering the functionality of the
LoadBalancer
class. The changes are well-implemented and maintain the clarity of the error messages.Consider applying this simplification pattern to other parts of the codebase for consistency. If there are many similar changes across multiple files, it might be worth creating a coding guideline for error message formatting to ensure consistency throughout the project.
pyphare/pyphare/core/phare_utilities.py (3)
131-134
: LGTM! Consider adding type hints for improved readability.The refactoring of
assert_fp_any_all_close
toany_fp_tol
is a good improvement. It makes the function more flexible and reusable by returning a dictionary of tolerances instead of performing an assertion.Consider adding type hints to improve readability and maintainability:
def any_fp_tol(a: np.ndarray, b: np.ndarray, atol: float = 1e-16, rtol: float = 0, atol_fp32: Optional[float] = None) -> Dict[str, float]: # ... function body ...
137-138
: LGTM! Consider adding a docstring for clarity.The new
fp_any_all_close
function is a good addition, providing a flexible way to compare arrays using the tolerances fromany_fp_tol
.Consider adding a docstring to explain the function's purpose and parameters:
def fp_any_all_close(a, b, atol=1e-16, rtol=0, atol_fp32=None): """ Check if two arrays are element-wise equal within a tolerance. This function uses `any_fp_tol` to determine appropriate tolerances, then calls `np.allclose` with these tolerances. Parameters: a, b : array_like Input arrays to compare. atol : float, optional The absolute tolerance parameter (see `numpy.allclose`). rtol : float, optional The relative tolerance parameter (see `numpy.allclose`). atol_fp32 : float, optional The absolute tolerance to use if either input contains 32-bit floats. Returns: bool Returns True if the arrays are equal within the given tolerance; False otherwise. """ return np.allclose(a, b, **any_fp_tol(a, b, atol, rtol, atol_fp32))
141-142
: LGTM! Consider adding an explanatory comment.The redefinition of
assert_fp_any_all_close
is a good improvement. It leveragesnp.testing.assert_allclose
for more robust assertion checking while maintaining consistent tolerance handling withany_fp_tol
.Consider adding a brief comment explaining the change and its benefits:
def assert_fp_any_all_close(a, b, atol=1e-16, rtol=0, atol_fp32=None): # Using np.testing.assert_allclose for more robust assertion checking # while maintaining consistent tolerance handling with any_fp_tol np.testing.assert_allclose(a, b, **any_fp_tol(a, b, atol, rtol, atol_fp32))pyphare/pyphare/pharesee/hierarchy/fromh5.py (1)
27-27
: Approve the addition of thehier
parameter with suggestions for improvement.The addition of the
hier
parameter improves the function's flexibility. However, consider the following improvements:
- Add type hints to clarify the expected types of the parameters and return value.
- Update the function's docstring to explain the purpose and usage of the new
hier
parameter.Here's a suggested implementation with type hints and an updated docstring:
from typing import List, Optional from .hierarchy import PatchHierarchy def get_all_available_quantities_from_h5( filepath: str, time: float = 0, exclude: List[str] = None, hier: Optional[PatchHierarchy] = None ) -> PatchHierarchy: """ Retrieve all available quantities from an H5 file. Args: filepath (str): Path to the H5 file. time (float, optional): Timestamp to retrieve data from. Defaults to 0. exclude (List[str], optional): List of quantities to exclude. Defaults to None. hier (Optional[PatchHierarchy], optional): Existing hierarchy to update. If None, a new hierarchy is created. Defaults to None. Returns: PatchHierarchy: The updated or newly created patch hierarchy. """ if exclude is None: exclude = ["tags"] time = format_timestamp(time) path = Path(filepath) for h5 in path.glob("*.h5"): if h5.parent == path and h5.stem not in exclude: hier = hierarchy_fromh5(str(h5), time, hier) return hierThis implementation addresses the mutable default argument issue and provides clear type hints and documentation.
🧰 Tools
🪛 Ruff
27-27: Do not use mutable data structures for argument defaults
Replace with
None
; initialize within function(B006)
pyphare/pyphare/pharesee/particles.py (2)
80-84
: Approve changes with a minor suggestionThe additions to the
__eq__
method are good improvements:
- The size comparison (lines 80-84) is an efficient way to quickly determine inequality for differently sized particle sets.
- The print statement for type mismatch (line 96) enhances debugging capabilities.
These changes maintain existing functionality while adding useful optimizations and debug information.
Consider using a logging framework instead of print statements for better control over debug output in production environments.
Also applies to: 96-96
303-305
: Approve changes with a minor optimization suggestionThe modifications to the
single_patch_per_level_per_pop_from
function are good improvements:
- The added check
if key in particles
(line 304) improves the robustness of the data aggregation process by avoiding potential KeyError exceptions.- This change maintains existing functionality while adding a necessary safety check.
Consider optimizing the loop slightly by removing
.keys()
from the outer loop condition:-for key in patch.patch_datas.keys(): +for key in patch.patch_datas: if key in particles: particles[key] += [patch[key].dataset]This aligns with Python best practices and the static analysis hint (SIM118).
🧰 Tools
🪛 Ruff
303-303: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
pyphare/pyphare/pharesee/hierarchy/hierarchy.py (1)
72-75
: LGTM: New__deepcopy__
method implemented correctly.The
__deepcopy__
method is implemented correctly, following Python's deep copy protocol. It appropriately uses thedeep_copy
function and excludes thedata_files
attribute from being copied.Consider adding a brief docstring to explain why
data_files
is excluded from the deep copy. This would improve code maintainability and clarify the intention behind this design decision.pyphare/pyphare/pharesee/hierarchy/hierarchy_utils.py (5)
309-309
: Update the docstring forflat_finest_field
to includeneghosts
parameter.The
flat_finest_field
function now includes a new parameterneghosts=1
. Please update the function's docstring to explain the purpose and usage of theneghosts
parameter to enhance code clarity and maintainability.
Line range hint
332-344
: Addneghosts
parameter description to theflat_finest_field_1d
docstring.The function
flat_finest_field_1d
now includes the parameterneghosts
, but the docstring does not mention it. Please update the docstring to include a description ofneghosts
and how it affects the function's behavior to improve code readability.
Line range hint
594-636
: Optimize dictionary key comparisons inhierarchy_compare
.In the
hierarchy_compare
function, you compare dictionary keys usingdict.keys()
for equality checks, such as on line 601 and 604. In Python, comparing the keys directly can be done more efficiently by comparing the dictionaries as sets.Apply this diff to improve key comparisons:
- if this.time_hier.keys() != that.time_hier.keys(): + if set(this.time_hier) != set(that.time_hier):Similarly, on line 622:
- if patch_ref.patch_datas.keys() != patch_cmp.patch_datas.keys(): + if set(patch_ref.patch_datas) != set(patch_cmp.patch_datas):This change makes the code more idiomatic and may offer performance benefits in large dictionaries.
🧰 Tools
🪛 Ruff
613-613: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
628-630
: Enhance error messages for data mismatches.When a data mismatch is detected in
hierarchy_compare
, the error message could be more informative to aid debugging. Including additional context such as the time index, level index, and patch index would help identify the exact location of the mismatch.Consider updating the message:
- msg = f"data mismatch: {type(patch_data_ref).__name__} {patch_data_key}" + msg = ( + f"Data mismatch in {type(patch_data_ref).__name__} '{patch_data_key}' " + f"at time index {tidx}, level {level_idx}, patch index {patch_idx}." + )This provides a clearer understanding of where the mismatch occurs.
638-680
: Add a docstring tosingle_patch_for_LO
and ensure robust error handling.The function
single_patch_for_LO
lacks a docstring explaining its purpose, parameters, and return value. Adding a docstring improves code readability and maintainability. Additionally, consider adding error handling for unexpected conditions to make the function more robust.Please add a docstring like:
def single_patch_for_LO(hier, qties=None, skip=None): """ Creates a single-level hierarchy with one combined patch. :param hier: The original hierarchy to process. :param qties: Optional list of quantities to include. :param skip: Optional list of quantities to skip. :return: A new hierarchy with a single patch at the lowest level. This function deep copies the provided hierarchy and consolidates all patches at level 0 into a single patch, combining their data. It is useful for data extraction or manipulation where a single continuous dataset is required. """Also, within the function, ensure that all possible data types are handled appropriately, and consider adding error messages or exceptions for unexpected data types.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (9)
- pyphare/pyphare/core/phare_utilities.py (1 hunks)
- pyphare/pyphare/pharein/examples/job.py (0 hunks)
- pyphare/pyphare/pharein/init.py (1 hunks)
- pyphare/pyphare/pharein/load_balancer.py (2 hunks)
- pyphare/pyphare/pharesee/hierarchy/fromh5.py (1 hunks)
- pyphare/pyphare/pharesee/hierarchy/hierarchy.py (2 hunks)
- pyphare/pyphare/pharesee/hierarchy/hierarchy_utils.py (7 hunks)
- pyphare/pyphare/pharesee/hierarchy/patchdata.py (5 hunks)
- pyphare/pyphare/pharesee/particles.py (3 hunks)
💤 Files with no reviewable changes (1)
- pyphare/pyphare/pharein/examples/job.py
🧰 Additional context used
🪛 Ruff
pyphare/pyphare/pharesee/hierarchy/fromh5.py
27-27: Do not use mutable data structures for argument defaults
Replace with
None
; initialize within function(B006)
pyphare/pyphare/pharesee/hierarchy/hierarchy_utils.py
624-624: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
pyphare/pyphare/pharesee/particles.py
303-303: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
🪛 GitHub Check: CodeQL
pyphare/pyphare/pharesee/hierarchy/hierarchy_utils.py
[notice] 7-7: Cyclic import
Import of module pyphare.pharesee.hierarchy.hierarchy begins an import cycle.
🔇 Additional comments (14)
pyphare/pyphare/pharein/init.py (1)
9-10
:⚠️ Potential issueRevise error handling for module import
The added error check doesn't align with the behavior of
importlib.import_module()
. This function raises anImportError
when it fails to import a module, rather than returningNone
. The current implementation will never reach the new conditional check.Consider replacing the new lines with a try-except block to catch
ImportError
:- jobmodule = importlib.import_module(jobname) # lgtm [py/unused-local-variable] - if jobmodule is None: - raise RuntimeError("failed to import job") + try: + jobmodule = importlib.import_module(jobname) # lgtm [py/unused-local-variable] + except ImportError as e: + raise RuntimeError(f"Failed to import job '{jobname}': {str(e)}")This change will:
- Correctly catch import failures.
- Provide more detailed error information, including the job name and the specific import error.
- Maintain the desired behavior of raising a
RuntimeError
.To ensure this change doesn't conflict with existing error handling, let's check for other uses of
importlib.import_module()
in the codebase:✅ Verification successful
Confirm error handling for module import
The conditional check for
jobmodule
beingNone
is unnecessary becauseimportlib.import_module()
raises anImportError
on failure instead of returningNone
. Please update the code to use a try-except block as suggested to handle import errors appropriately.
- File:
pyphare/pyphare/pharein/init.py
- Lines: 9-10
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for other uses of importlib.import_module() rg --type python "importlib\.import_module" -C 5Length of output: 81
Script:
#!/bin/bash # Search for other uses of importlib.import_module() rg --type py "importlib\.import_module" -C 5Length of output: 1993
pyphare/pyphare/pharein/load_balancer.py (2)
36-36
: LGTM: Simplified error messageThe error message has been simplified by removing the f-string. This change is consistent with the other modifications in the file and maintains the clarity of the message.
56-56
: LGTM: Simplified error messageThe error message has been simplified by removing the f-string. This change is consistent with the other modifications in the file and maintains the clarity of the message.
pyphare/pyphare/core/phare_utilities.py (1)
131-142
: Overall, these changes improve the floating-point comparison utilities.The modifications to the floating-point comparison functions in this file are well-implemented and provide several benefits:
- Improved flexibility with the new
any_fp_tol
function.- Better consistency in tolerance handling across different comparison functions.
- Utilization of numpy's robust
allclose
andassert_allclose
functions.- Maintained backward compatibility with the
assert_fp_any_all_close
function.These changes should lead to more reliable and flexible floating-point comparisons throughout the PHARE project.
pyphare/pyphare/pharesee/particles.py (1)
99-101
: Approve addition of__ne__
methodThe addition of the
__ne__
method is a good practice:
- It follows Python's convention for defining
__ne__
in terms of__eq__
.- This ensures consistent behavior for inequality comparisons.
- The implementation is correct and concise.
pyphare/pyphare/pharesee/hierarchy/hierarchy.py (2)
1-3
: LGTM: Import statements reorganized and new import added.The reorganization of import statements improves code readability. The addition of
deep_copy
fromphare_utilities
is likely related to the new__deepcopy__
method implementation.Also applies to: 9-10
72-75
: Verify usage of deep copy operations onPatchHierarchy
instances.The addition of the
__deepcopy__
method enhances the functionality of thePatchHierarchy
class. To ensure this change doesn't introduce any unintended side effects, it's advisable to verify all occurrences where deep copies ofPatchHierarchy
instances are created throughout the codebase.pyphare/pyphare/pharesee/hierarchy/patchdata.py (6)
27-27
: Refactoring__deepcopy__
to usephut.deep_copy
The change to use
phut.deep_copy
in the__deepcopy__
method enhances code maintainability by leveraging a centralized deep copy utility.
83-87
: Addingcompare
method for tolerant field comparisonThe implementation of the
compare
method allows for robust comparison ofFieldData
instances, accounting for floating-point precision via theatol
parameter and usingphut.fp_any_all_close
. This is a good practice when dealing with numerical data.
89-90
: Overriding__eq__
to utilize thecompare
methodBy redefining
__eq__
to rely on thecompare
method, equality checks betweenFieldData
instances become consistent and take numerical tolerance into account.
91-92
: Implementing__ne__
for consistent inequality comparisonDefining
__ne__
as the logical negation of__eq__
ensures that inequality comparisons betweenFieldData
instances function as expected.
123-125
: Implementing__setitem__
for item assignmentThe addition of
__setitem__
enables direct assignment to slices of the dataset, enhancing the usability of theFieldData
class.
237-238
: Overriding__eq__
to use the updatedcompare
methodBy updating
__eq__
to use the revisedcompare
method, equality comparisons inParticleData
will function correctly with the improved dataset comparison logic.pyphare/pyphare/pharesee/hierarchy/hierarchy_utils.py (1)
322-322
:⚠️ Potential issueUse
NotImplementedError
instead ofRuntimeError
.In the
flat_finest_field
function, raising aRuntimeError
with the message"Not implemented"
can be misleading. It's more appropriate to raise aNotImplementedError
to indicate that the method is intended to be implemented in the future.Apply this diff to update the exception type:
- raise RuntimeError("Not implemented") + raise NotImplementedError("Not implemented")Likely invalid or redundant comment.
def compare(self, that, *args, **kwargs): | ||
"""args/kwargs may include atol for consistency with field::compare""" | ||
return self.name == that.name and self.dataset == that.dataset | ||
|
||
def __eq__(self, that): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential issue in ParticleData.compare
: Dataset comparison may not behave as intended
In the ParticleData.compare
method, using self.dataset == that.dataset
may not produce the expected result if dataset
is a NumPy array, as this comparison returns an array of booleans. This could lead to unintended behavior in the logical and
operation.
Consider using numpy.array_equal
or numpy.allclose
for array comparison. Additionally, since the docstring mentions atol
for consistency with FieldData.compare
, you might want to include atol
in the comparison to handle floating-point precision.
Apply this diff to address the issue:
def compare(self, that, *args, **kwargs):
"""args/kwargs may include atol for consistency with field::compare"""
+ atol = kwargs.get('atol', 1e-16)
- return self.name == that.name and self.dataset == that.dataset
+ return self.name == that.name and np.allclose(self.dataset, that.dataset, atol=atol)
Remember to import NumPy as np
if it's not already imported:
+import numpy as np
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def compare(self, that, *args, **kwargs): | |
"""args/kwargs may include atol for consistency with field::compare""" | |
return self.name == that.name and self.dataset == that.dataset | |
def __eq__(self, that): | |
def compare(self, that, *args, **kwargs): | |
"""args/kwargs may include atol for consistency with field::compare""" | |
atol = kwargs.get('atol', 1e-16) | |
return self.name == that.name and np.allclose(self.dataset, that.dataset, atol=atol) | |
def __eq__(self, that): |
failed: List[Tuple[str, Any, Any]] = field(default_factory=lambda: []) | ||
|
||
def __bool__(self): | ||
return self.ok | ||
return not self.failed | ||
|
||
def __repr__(self): | ||
for msg, ref, cmp in self: | ||
print(msg) | ||
try: | ||
if type(ref) is FieldData: | ||
phut.assert_fp_any_all_close(ref[:], cmp[:], atol=1e-16) | ||
except AssertionError as e: | ||
print(e) | ||
return self.failed[0][0] | ||
|
||
def __call__(self, reason, ref=None, cmp=None): | ||
self.failed.append((reason, ref, cmp)) | ||
return self | ||
|
||
def __getitem__(self, idx): | ||
return (self.failed[idx][1], self.failed[idx][2]) | ||
|
||
def __iter__(self): | ||
return self.failed.__iter__() | ||
|
||
def __reversed__(self): | ||
return reversed(self.failed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct the __repr__
method in the EqualityReport
class.
The __repr__
method currently prints messages and returns one of them, which is not standard behavior for __repr__
. The purpose of __repr__
is to return a string that would recreate the object when evaluated (or at least give a detailed representation). Additionally, printing within __repr__
is not recommended.
Apply this diff to correct the __repr__
method:
- def __repr__(self):
- for msg, ref, cmp in self:
- print(msg)
- try:
- if type(ref) is FieldData:
- phut.assert_fp_any_all_close(ref[:], cmp[:], atol=1e-16)
- except AssertionError as e:
- print(e)
- return self.failed[0][0]
+ def __repr__(self):
+ return f"EqualityReport(failed={self.failed})"
This change ensures that __repr__
returns a string representation of the EqualityReport
object without side effects.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
failed: List[Tuple[str, Any, Any]] = field(default_factory=lambda: []) | |
def __bool__(self): | |
return self.ok | |
return not self.failed | |
def __repr__(self): | |
for msg, ref, cmp in self: | |
print(msg) | |
try: | |
if type(ref) is FieldData: | |
phut.assert_fp_any_all_close(ref[:], cmp[:], atol=1e-16) | |
except AssertionError as e: | |
print(e) | |
return self.failed[0][0] | |
def __call__(self, reason, ref=None, cmp=None): | |
self.failed.append((reason, ref, cmp)) | |
return self | |
def __getitem__(self, idx): | |
return (self.failed[idx][1], self.failed[idx][2]) | |
def __iter__(self): | |
return self.failed.__iter__() | |
def __reversed__(self): | |
return reversed(self.failed) | |
failed: List[Tuple[str, Any, Any]] = field(default_factory=lambda: []) | |
def __bool__(self): | |
return not self.failed | |
def __repr__(self): | |
return f"EqualityReport(failed={self.failed})" | |
def __call__(self, reason, ref=None, cmp=None): | |
self.failed.append((reason, ref, cmp)) | |
return self | |
def __getitem__(self, idx): | |
return (self.failed[idx][1], self.failed[idx][2]) | |
def __iter__(self): | |
return self.failed.__iter__() | |
def __reversed__(self): | |
return reversed(self.failed) |
some of #888 isolated for roch
Summary by CodeRabbit
New Features
FieldData
andParticleData
classes.Bug Fixes
LoadBalancer
class.Refactor
PatchHierarchy
class with a new deep copy method.Documentation