Distributing typing information (#127)#135
Conversation
WalkthroughThe pull request updates packaging and type annotations throughout the project. In Changes
Poem
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
Should typing information be included inline and also with these stubs @mmcdermott? |
Codecov ReportAttention: Patch coverage is
... and 3 files with indirect coverage changes 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 10
🧹 Outside diff range and nitpick comments (7)
src/aces/query.pyi (1)
8-8: LGTM! Consider adding a docstring for better documentation.The function signature is well-defined with clear type hints. However, to improve the documentation, consider adding a docstring to describe the purpose of the function, its parameters, and the return value.
Here's a suggested docstring to add above the function signature:
def query(cfg: TaskExtractorConfig, predicates_df: pl.DataFrame) -> pl.DataFrame: """ Query and process data based on the given configuration and predicates. Args: cfg (TaskExtractorConfig): Configuration for task extraction. predicates_df (pl.DataFrame): DataFrame containing predicates for querying. Returns: pl.DataFrame: Resulting DataFrame after applying the query and processing. """ ...src/aces/extract_subtree.pyi (1)
6-11: LGTM: Well-structured function signature with clear type annotations.The function signature is well-defined with appropriate type annotations, which enhances maintainability and IDE support. The parameter names are descriptive, improving readability.
Consider adding a docstring to provide more context about the function's purpose and parameter descriptions. This would further improve the usability of the stub file. For example:
def extract_subtree( subtree: Node, subtree_anchor_realizations: pl.DataFrame, predicates_df: pl.DataFrame, subtree_root_offset: timedelta = timedelta(0), ) -> pl.DataFrame: """ Extract a subtree based on the provided parameters. Args: subtree (Node): The subtree to extract. subtree_anchor_realizations (pl.DataFrame): DataFrame containing anchor realizations. predicates_df (pl.DataFrame): DataFrame containing predicates. subtree_root_offset (timedelta, optional): Offset for the subtree root. Defaults to timedelta(0). Returns: pl.DataFrame: The extracted subtree data. """ ...src/aces/utils.pyi (1)
16-16: Consider a more descriptive name for log_tree function.While the
log_treefunction signature is clear, the function name could be more descriptive. Consider renaming it to something likelog_node_treeorlog_tree_structureto better convey its purpose.src/aces/utils.py (1)
Line range hint
1-70: Reconsider the approach to distributing typing information.The removal of type hints from these functions seems to be part of the PR objective "Distributing typing information". However, this approach may not be ideal for several reasons:
- It reduces code clarity and self-documentation in the main implementation file.
- It may lead to inconsistencies between the implementation and any separate type definitions.
- It goes against the trend in the Python community towards more type annotations in code.
Instead of removing type hints from the main implementation, consider:
- Keeping type hints in the main implementation file for clarity and immediate understanding.
- If the goal is to provide separate type definitions, consider using stub files (
.pyi) in addition to, not instead of, in-line type hints. This allows for more detailed or alternative type definitions without sacrificing the benefits of in-line hints.- Ensure that any separate type definitions (if used) are consistent with the in-line hints and are kept up-to-date with changes in the implementation.
This approach would maintain code quality in the main implementation while still allowing for the distribution of additional typing information if needed.
src/aces/predicates.pyi (1)
1-7: Organize import statements for improved readability.
- Group imports in the order: standard library imports, third-party imports, local imports.
- Include necessary imports from the
typingmodule due to added type hints.- Remove any unused imports to keep the code clean.
Apply this diff to reorganize and update imports:
from pathlib import Path +from typing import Any, Dict, List, Optional import polars as pl from omegaconf import DictConfig from .config import TaskExtractorConfigsrc/aces/config.pyi (2)
31-31: Confirm the necessity of the empty__post_init__method inDerivedPredicateConfig.The
__post_init__method is declared but contains..., indicating no current implementation. If it's not performing any initialization, it might be unnecessary.Consider removing the
__post_init__method if it's not needed, or provide an implementation if future initialization steps are anticipated.
90-95: Simplify type annotations in theloadmethod ofTaskExtractorConfig.In the
loadmethod, the class typeTaskExtractorConfigis quoted in the type annotations. Since we're within the class definition, and forward references are not needed here, the quotes can be omitted.Update the type annotations for clarity:
- cls: Type["TaskExtractorConfig"], + cls: Type[TaskExtractorConfig], ... - ) -> "TaskExtractorConfig": ... + ) -> TaskExtractorConfig: ...This makes the annotations more straightforward and avoids unnecessary use of forward references.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (10)
- pyproject.toml (1 hunks)
- src/aces/aggregate.pyi (1 hunks)
- src/aces/config.pyi (1 hunks)
- src/aces/constraints.pyi (1 hunks)
- src/aces/extract_subtree.pyi (1 hunks)
- src/aces/predicates.pyi (1 hunks)
- src/aces/query.pyi (1 hunks)
- src/aces/types.pyi (1 hunks)
- src/aces/utils.py (2 hunks)
- src/aces/utils.pyi (1 hunks)
🔇 Additional comments not posted (26)
src/aces/query.pyi (2)
1-8: Overall, the type stub file is well-structured and provides clear type hints.The file successfully defines the type information for the
queryfunction, which appears to be a key component in a task extraction process. The use of Polars DataFrames suggests efficient data handling. To further improve the file:
- Verify the usage of imported modules as suggested earlier.
- Consider adding a docstring to the
queryfunction for better documentation.These minor improvements will enhance the overall quality and maintainability of the code.
1-6: Verify the usage of imported modulesThe following imports are not directly used in this type stub file:
OptionalfromtypingNodefrombigtreeWhile they might be used in the implementation file, it's good practice to keep imports minimal in stub files.
Please run the following script to check if these imports are used in the corresponding implementation file:
If these imports are not used in the implementation file, consider removing them from the stub file.
✅ Verification successful
Unused imports confirmed
The imports
OptionalfromtypingandNodefrombigtreeare not used in either the stub filesrc/aces/query.pyior the implementation filesrc/aces/query.py. It is recommended to remove these imports from the stub file to keep the codebase clean and maintainable.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check if unused imports in the stub file are used in the implementation file # Test: Search for usage of Optional and Node in the implementation file implementation_file=$(echo "src/aces/query.pyi" | sed 's/\.pyi$/.py/') echo "Checking usage in $implementation_file" rg --type python "Optional|Node" "$implementation_file"Length of output: 254
Script:
#!/bin/bash # Description: Verify if 'Optional' and 'Node' are used in the implementation file implementation_file="src/aces/query.py" if [ -f "$implementation_file" ]; then echo "Checking usage of 'Optional' and 'Node' in $implementation_file" rg "Optional|Node" "$implementation_file" else echo "Implementation file $implementation_file does not exist." echo "Please ensure that the implementation file is present." fiLength of output: 252
src/aces/extract_subtree.pyi (2)
1-4: LGTM: Imports are appropriate and well-structured.The imports are relevant to the function being defined:
timedeltais used in the function signature.polars(aspl) is used for DataFrame operations.Nodefrombigtreeis used for tree-like data structures.These choices suggest a focus on performance and appropriate data structures for the task at hand.
1-11: Overall, excellent stub file implementation.This stub file is well-structured and provides clear type information for the
extract_subtreefunction. The use of type annotations, appropriate imports, and a well-defined function signature contributes to improved maintainability and developer experience. The only suggested improvement is the addition of a docstring for enhanced documentation.src/aces/constraints.pyi (5)
6-8: LGTM! Well-defined function signature.The
check_constraintsfunction signature is well-typed and clearly defines its input parameters and return type. The use of type hints is appropriate and will aid in type checking and code understanding.
9-9: LGTM! Well-defined function signature.The
check_static_variablesfunction signature is well-typed and clearly defines its input parameters and return type. The use of type hints is appropriate and will aid in type checking and code understanding.
1-9: Overall, the constraints.pyi file looks good.This file provides clear type hints for the
check_constraintsandcheck_static_variablesfunctions, which will improve type checking and code understanding. The function signatures are well-defined and use appropriate type hints.A few suggestions for improvement:
- Verify the usage of the
loggerimport from thelogurulibrary.- Confirm the usage of the
ANY_EVENT_COLUMNimport from the local.typesmodule.If these imports are not used in the actual implementation, consider removing them to keep the type hints file concise and focused.
4-4: LGTM! Verify ANY_EVENT_COLUMN usage.The import from a local module is good for maintaining a modular structure.
However,
ANY_EVENT_COLUMNis not visibly used in this file. Let's verify its usage:#!/bin/bash # Description: Verify the usage of ANY_EVENT_COLUMN in the constraints module. # Test: Search for ANY_EVENT_COLUMN usage in the constraints module rg --type python 'ANY_EVENT_COLUMN' src/aces/constraints.pyIf the script doesn't return any results, consider removing the unused import.
1-2: LGTM! Verify logger usage.The imports look good. The Polars library is correctly imported and used in the function signatures.
However, the
loggerimport is not visibly used in this file. Let's verify its usage:If the script doesn't return any results, consider removing the unused import.
src/aces/utils.pyi (5)
1-11: LGTM: Import statements are well-organized and appropriate.The import statements are properly structured, separating standard library imports from third-party imports. They cover all the necessary modules for the utility functions declared in this file.
13-13: LGTM: parse_timedelta function signature is well-defined.The function signature for
parse_timedeltais clear and uses appropriate type hints. It correctly handles an optional input and returns atimedeltaobject.
14-15: LGTM: capture_output context manager is well-defined.The
capture_outputcontext manager is correctly decorated and has an appropriate return type hint for a generator that yields aStringIOobject.
17-17: LGTM: hydra_loguru_init function signature is well-defined.The
hydra_loguru_initfunction signature is clear and uses appropriate type hints. It correctly specifies a filename input and indicates that it doesn't return a value.
1-17: Overall, the utils.pyi stub file is well-structured and provides clear type information.This stub file effectively defines the type information for the utility functions in the
aces.utilsmodule. The use of type hints and the structure of the file are appropriate for a.pyistub file. The only suggestion is to consider renaming thelog_treefunction for better clarity, but this is a minor improvement.pyproject.toml (3)
32-33: LGTM: Enabling package data inclusionThe addition of
include-package-data = truein the[tool.setuptools]section is correct and aligns with the PR objective. This configuration ensures that package data, including typing information files, will be included when building the package.
32-36: Summary: Proper configuration for distributing typing informationThe changes in this file correctly set up the project for distributing typing information:
- Enabling
include-package-dataensures all package data is included during build.- Specifying
*.pyiandpy.typedfiles as package data ensures type information is distributed.These changes align well with the PR objective and follow best practices for Python packaging. Once the package name is verified and corrected if necessary, these modifications will effectively enable the distribution of typing information with your package.
35-36: Verify the package name and approve type information inclusionThe inclusion of
*.pyiandpy.typedfiles as package data is correct and aligns with the PR objective of distributing typing information. This ensures that type hint stub files and the marker file for type checking support are included in the package.However, please verify if
your_packageis the correct package name. It appears to be a placeholder and should be replaced with the actual package name (likely "es-aces" based on the project name).To confirm the correct package name, you can run the following command:
If the result doesn't match "your_package", please update the package name accordingly.
✅ Verification successful
Package name verified and type information inclusion approved
- Replace
your_packagewithes-acesin the[tool.setuptools.package-data]section.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the correct package name in pyproject.toml # Test: Search for the package name in pyproject.toml rg --type toml 'name\s*=\s*"([^"]+)"' pyproject.tomlLength of output: 194
src/aces/aggregate.pyi (1)
7-10: 🛠️ Refactor suggestionEnsure type annotations are compatible with supported Python versions
The use of the
|operator for type unions (e.g.,TemporalWindowBounds | tuple[...]andtimedelta | None) requires Python 3.10 or later. If the project needs to support earlier Python versions, consider usingUnionandOptionalfrom thetypingmodule to maintain compatibility.You can verify the project's Python version compatibility by checking the
pyproject.tomlorsetup.py. If earlier versions need to be supported, apply this diff:+from typing import Union, Optional -def aggregate_temporal_window( - predicates_df: pl.DataFrame, - endpoint_expr: TemporalWindowBounds | tuple[bool, timedelta, bool, timedelta | None], -) -> pl.DataFrame: ... +def aggregate_temporal_window( + predicates_df: pl.DataFrame, + endpoint_expr: Union[TemporalWindowBounds, tuple[bool, timedelta, bool, Optional[timedelta]]], +) -> pl.DataFrame: ...Run the following script to check the Python version specified in
pyproject.toml:src/aces/predicates.pyi (1)
20-20: Confirm the accessibility ofTaskExtractorConfig.The function
get_predicates_dfusesTaskExtractorConfigas a parameter type. Verify thatTaskExtractorConfigis properly defined and accessible from.config. This ensures that type checking and code navigation work correctly.src/aces/types.pyi (4)
6-11: Global variables are declared appropriately.The global variables
PRED_CNT_TYPE,START_OF_RECORD_KEY,END_OF_RECORD_KEY, andANY_EVENT_COLUMNare correctly declared with appropriate type annotations.
12-24:TemporalWindowBoundsdata class is well-defined.The
TemporalWindowBoundsclass and its methods are correctly declared with appropriate type annotations and method signatures.
25-36:ToEventWindowBoundsdata class is well-defined.The
ToEventWindowBoundsclass and its methods are correctly declared with appropriate type annotations and method signatures.
1-5:⚠️ Potential issueAdd missing import for
dataclasses.The
dataclassesmodule is used but not imported. Please add the import statement fordataclassesto ensure the decorators function correctly.Apply this diff to fix the issue:
from datetime import timedelta +import dataclasses from typing import Any, Iterator import polars as plLikely invalid or redundant comment.
src/aces/config.pyi (3)
51-61: Check return types of property methods inWindowConfig.The property methods
start_endpoint_exprandend_endpoint_exprreturnOptional[Union[ToEventWindowBounds, TemporalWindowBounds]]. Ensure that these types are correctly defined and imported.Confirm that
ToEventWindowBoundsandTemporalWindowBoundsare properly defined:#!/bin/bash # Description: Verify the definitions of ToEventWindowBounds and TemporalWindowBounds. # Search for the type definitions in the codebase. rg --type py --files-with-matches 'class ToEventWindowBounds' rg --type py --files-with-matches 'class TemporalWindowBounds'
45-48: Verify the implementations of_check_referenceand_parse_boundarymethods inWindowConfig.The class methods
_check_referenceand_parse_boundaryare declared with..., suggesting their implementations should be provided elsewhere.Ensure these methods are implemented in the codebase:
#!/bin/bash # Description: Verify the existence of method implementations for _check_reference and _parse_boundary. # Search for the implementations in `.py` files excluding stub files. rg --type py --glob '!*.pyi' -l 'def _check_reference' rg --type py --glob '!*.pyi' -l 'def _parse_boundary'
20-21: Ensure implementations ofMEDS_eval_exprandESGPT_eval_exprexist.The methods
MEDS_eval_exprandESGPT_eval_exprinPlainPredicateConfigare declared with..., indicating that their implementations should exist elsewhere.Run the following script to confirm that these methods are implemented in the codebase:
✅ Verification successful
Method Implementations Confirmed
The implementations of
MEDS_eval_exprandESGPT_eval_exprexist insrc/aces/config.py.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the existence of method implementations for MEDS_eval_expr and ESGPT_eval_expr. # Search for the implementations in `.py` files excluding stub files. rg --type py --glob '!*.pyi' -l 'def MEDS_eval_expr' rg --type py --glob '!*.pyi' -l 'def ESGPT_eval_expr'Length of output: 258
Script:
#!/bin/bash # Description: Verify the existence of method implementations for MEDS_eval_expr and ESGPT_eval_expr. # Search for the implementation of MEDS_eval_expr in `.py` files excluding stub files. rg --type py --glob '!*.pyi' -l 'def MEDS_eval_expr' # Search for the implementation of ESGPT_eval_expr in `.py` files excluding stub files. rg --type py --glob '!*.pyi' -l 'def ESGPT_eval_expr'Length of output: 148
mmcdermott
left a comment
There was a problem hiding this comment.
@justin13601 this looks good to me, though i don't know how the .pyi files usually work / should be formatted. Note that any remaining, non-resolved coderabbit comments are things that I agree with and recommend should be addressed before you push the PR.
Yes, I think so, but google around or check the google style guide to see if there are other recommendations in the community. |
|
@justin13601 can we merge or close this? Not sure which would be more appropriate, but I'd like to get it closed out rather than remaining indefinitely open. Thanks! |
Yep will try to merge this weekend! |
|
@justin13601 re-pinging on this. |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/aces/__main__.py(1 hunks)src/aces/config.py(10 hunks)src/aces/expand_shards.py(1 hunks)src/aces/types.py(4 hunks)src/aces/utils.py(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/aces/utils.py
🧰 Additional context used
🪛 GitHub Check: codecov/patch
src/aces/__main__.py
[warning] 134-134: src/aces/main.py#L134
Added line #L134 was not covered by tests
🔇 Additional comments (12)
src/aces/expand_shards.py (1)
74-74: LGTM! The return type annotation is accurate.The
-> Nonereturn type annotation correctly indicates that themain()function doesn't return any value.src/aces/__main__.py (1)
134-134: LGTM! The return type annotation is accurate.The
-> Nonereturn type annotation correctly indicates that themain()function doesn't return any value.🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 134-134: src/aces/main.py#L134
Added line #L134 was not covered by testssrc/aces/types.py (5)
8-10: LGTM! The imports are necessary for type annotations.The added imports provide the required types:
Iteratorfromcollections.abcfor iterator type hintsAnyfromtypingfor generic type hints
64-65: LGTM! The return type annotation is accurate.The
-> Iterator[Any]return type annotation correctly indicates that the method returns an iterator that yields values of any type.
68-69: LGTM! The parameter and return type annotations are accurate.The type annotations correctly specify:
key: intfor the index parameter-> Anyfor the return type since the method can return values of any type
71-71: LGTM! The return type annotations are accurate.The
-> Nonereturn type annotations correctly indicate that these__post_init__methods don't return any value.Also applies to: 210-210
231-232: LGTM! The type annotations are accurate.The type annotations in
ToEventWindowBoundsclass correctly mirror those inTemporalWindowBounds:
__iter__returnsIterator[Any]__getitem__acceptskey: intand returnsAnyAlso applies to: 235-236
src/aces/config.py (5)
12-12: LGTM! The import is necessary for type annotations.The
Anyimport fromtypingis required for the type annotations in this file.
651-651: LGTM! The return type annotation is accurate.The
-> Nonereturn type annotation correctly indicates that_check_referencedoesn't return any value.
1131-1135: LGTM! The type annotations are accurate.The type annotations correctly specify:
cls: TaskExtractorConfigfor the class parameterconfig_path: str | Pathfor the config path parameterpredicates_path: str | Path = Nonefor the optional predicates path parameter-> TaskExtractorConfigfor the return type
1316-1316: LGTM! The return type annotations are accurate.The
-> Nonereturn type annotations correctly indicate that these methods don't return any value:
_initialize_predicates_initialize_windows__post_init__Also applies to: 1363-1363, 1510-1510
1523-1523: LGTM! The property return type annotations are accurate.The return type annotations correctly specify:
plain_predicatesreturnsdict[str, PlainPredicateConfig]derived_predicatesreturnsOrderedDict[str, DerivedPredicateConfig]Also applies to: 1528-1528
Summary by CodeRabbit
Chores
Refactor