Added doctest helpers to namespace and used ruff for linting / pre-commit over other older tools.#179
Added doctest helpers to namespace and used ruff for linting / pre-commit over other older tools.#179justin13601 merged 13 commits intomainfrom
Conversation
WalkthroughThis set of changes focuses on modernizing development workflows, refining code style and documentation, and streamlining test and CLI infrastructure. The pre-commit configuration is overhauled to consolidate Python formatting and linting under Ruff, with several tools removed and others updated. The test infrastructure is updated to simplify imports and setup, including the removal of Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI (cli)
participant Main (main)
participant Core Modules
User->>CLI (cli): Run aces-cli [args]
CLI (cli)->>CLI (cli): Print usage if no args
CLI (cli)->>Main (main): Call main(cfg)
Main (main)->>Core Modules: Import and execute core logic
Possibly related PRs
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov Report❌ Patch coverage is
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
docs/source/usage.md (3)
9-10: Convert inline note to admonition for consistency
The inline "Note:" near the installation section can be updated to a blockquote admonition (> [!NOTE]) to match the formatting used elsewhere in this document.Apply:
- **Note:** If you choose to use the ESGPT standard, please install ESGPT first before installing ACES. This ensures compatibility with the `polars` version required by ACES. + > [!NOTE] + > If you choose to use the ESGPT standard, please install ESGPT first before installing ACES. This ensures compatibility with the `polars` version required by ACES.
71-72: Add missing comma for readability
Consider adding a comma before "and" to improve clarity:- ... a prediction is made and can be set to `start` or `end` of a particular window. + ... a prediction is made, and can be set to `start` or `end` of a particular window.🧰 Tools
🪛 LanguageTool
[uncategorized] ~71-~71: Possible missing comma found.
Context: ... the timestamp at which a prediction is made and can be set tostart> orendof...(AI_HYDRA_LEO_MISSING_COMMA)
242-243: Capitalize the first word in admonition text
The admonition starts with a lowercase "you". For consistency, capitalize it:- > you may have to run this command for every terminal - please visit [Hydra's Documentation]... + > You may have to run this command for every terminal - please visit [Hydra's Documentation]...src/aces/types.py (1)
147-152: Clarify class docstring for event-based windows
The docstring forToEventWindowBoundsstill reads "temporal window bounds" but should reflect that this class represents event-bound windows.Apply:
- """Named tuple to represent temporal window bounds. + """Named tuple to represent event-bound window bounds.pyproject.toml (1)
65-83: Configure Ruff linter with target version, line length, and selected rule sets.
Centralizes linting and formatting under Ruff, matching.pre-commit-config.yaml. The selected codes (E,F,B, etc.) cover a broad range of checks—consider periodically reviewing ignored rules to catch new issues.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (29)
.github/workflows/tests.yaml(1 hunks).pre-commit-config.yaml(3 hunks)README.md(4 hunks)conftest.py(1 hunks)docs/source/conf.py(1 hunks)docs/source/configuration.md(5 hunks)docs/source/notebooks/examples.ipynb(0 hunks)docs/source/notebooks/tutorial_esgpt.ipynb(1 hunks)docs/source/notebooks/tutorial_meds.ipynb(1 hunks)docs/source/usage.md(3 hunks)pyproject.toml(2 hunks)src/aces/__init__.py(1 hunks)src/aces/aggregate.py(3 hunks)src/aces/config.py(37 hunks)src/aces/constraints.py(2 hunks)src/aces/extract_subtree.py(2 hunks)src/aces/predicates.py(6 hunks)src/aces/query.py(3 hunks)src/aces/run.py(3 hunks)src/aces/types.py(13 hunks)src/aces/utils.py(1 hunks)tests/test_aggregate_hypothesis.py(1 hunks)tests/test_e2e.py(0 hunks)tests/test_expand_shards.py(1 hunks)tests/test_help_message.py(1 hunks)tests/test_meds.py(3 hunks)tests/test_other_meds.py(0 hunks)tests/test_override_meds.py(1 hunks)tests/utils.py(4 hunks)
💤 Files with no reviewable changes (3)
- docs/source/notebooks/examples.ipynb
- tests/test_other_meds.py
- tests/test_e2e.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/aces/predicates.py (1)
src/aces/config.py (2)
PlainPredicateConfig(35-238)TaskExtractorConfig(859-1689)
src/aces/run.py (1)
src/aces/query.py (1)
query(19-197)
🪛 LanguageTool
docs/source/usage.md
[uncategorized] ~71-~71: Possible missing comma found.
Context: ... the timestamp at which a prediction is made and can be set to start > or end of...
(AI_HYDRA_LEO_MISSING_COMMA)
docs/source/configuration.md
[uncategorized] ~164-~164: Loose punctuation mark.
Context: ...REFERENCING = $REFERENCED -> $PREDICATE, $REFERENCING = $REFERENCED <- $PREDICA...
(UNLIKELY_OPENING_PUNCTUATION)
🪛 GitHub Check: codecov/patch
src/aces/config.py
[warning] 1650-1650: src/aces/config.py#L1650
Added line #L1650 was not covered by tests
🔇 Additional comments (55)
src/aces/__init__.py (1)
5-5: Add blank line after module docstring
The blank line inserted after the closing triple-quote enhances readability and aligns with PEP 257’s recommendation of having one blank line between the module docstring and the first import.docs/source/notebooks/tutorial_esgpt.ipynb (1)
89-89: Remove explicit read mode inopen()
Relying on the default read mode by usingopen(config_path)simplifies the code without changing behavior.docs/source/notebooks/tutorial_meds.ipynb (1)
90-90: Simplify YAML file opening call
Dropping the explicit"r"mode argument (open(config_path)) leverages the default read mode, cleaning up the code while preserving functionality.src/aces/extract_subtree.py (1)
290-290: Use tuple unpacking for endpoint expressions
Changing from concatenation (endpoint_expr + (subtree_root_offset,)) to unpacking ((*endpoint_expr, subtree_root_offset)) is functionally equivalent and more idiomatic for combining tuples.conftest.py (1)
1-33: Introduce autouse fixture for doctest namespace
The_setup_doctest_namespacefixture automatically injects common utilities (e.g.,caplog,json,pl,yaml,MagicMock,patch,datetime,timedelta,tempfile) into the doctest environment, streamlining examples by removing boilerplate imports.src/aces/constraints.py (2)
19-19: Spelling correction in docstringThe typo "satsified" was correctly fixed to "satisfied".
98-100: Improved validation logicGood refactoring to consolidate the constraint validation logic into a single if-statement with a logical OR. This makes the code more concise while maintaining the same behavior.
docs/source/conf.py (1)
7-8: Import organization improvementMoving the
sphinx.ext.apidocimport statement to the top of the file with other imports follows Python best practices for import organization. This improves code readability and maintainability.tests/test_expand_shards.py (1)
11-13: Improved assertion formattingThe assertion has been reformatted from a multi-line expression to a cleaner style with a trailing comma and parenthesized error message. This improves readability while maintaining the same functionality.
tests/test_override_meds.py (1)
372-372: Removed unnecessary lint suppression commentRemoved the
# noqa: E501comment that was previously used to suppress line length warnings. This aligns with the PR objective of using ruff for linting, which likely handles this differently.tests/test_help_message.py (1)
9-11: Approve single-line assertion formatting.The refactored assertion consolidates the multi-line check into a concise single-line format without altering logic or expected output.
src/aces/utils.py (1)
14-14: Approve explicit optional type hint.Updating the signature to
str | Noneclarifies thattime_strmay beNoneand leverages Python 3.10+ union syntax consistently with the rest of the codebase..github/workflows/tests.yaml (1)
29-29: Verify extras groups alignment.Ensure that the extras
devandtestsare defined inpyproject.tomland include the intended dependencies for development and testing.docs/source/configuration.md (1)
53-57: Approve standardized[!NOTE]admonitions.The conversion from inline bold
**Note**to the GitHub-flavored blockquote[!NOTE]format is consistent and enhances readability without altering content.Also applies to: 86-90, 114-117, 159-163, 168-174, 204-210
tests/test_aggregate_hypothesis.py (1)
101-104: Approve inline conditional refactor.Refactoring the timestamp filters to inline conditional expressions maintains original behavior while improving readability.
docs/source/usage.md (2)
23-23: Documentation content formatting is correct
This paragraph clearly explains the task configuration example and has been integrated correctly.
64-70: Admonition block is well-formed and informative
The[!NOTE]block standardizes the explanation of configuration file structure and index_timestamp usage.README.md (3)
10-10: Approve GitHub Actions badge extension update
The badge URL has been corrected from.ymlto.yamlfor the tests workflow, and the link now points totests.yaml.
22-24: Approve updates section entries
The new release notes and formatting for dates in the "Updates" section are accurate and consistent with the project timeline.
63-66: Admonition block is well-formed
The[!NOTE]block now clearly highlights the dependency installation order, matching the style used elsewhere.tests/test_meds.py (2)
610-612: Assertion formatting is consistent and clear
The multi-line assertion with a parenthesized message enhances readability and aligns with PEP8.
624-626: Assertion formatting is consistent and clear
The update formats the assertion message across multiple lines for clarity and maintainability.src/aces/types.py (4)
25-56: Doctest examples cleaned up appropriately
The removal of# doctest: +NORMALIZE_WHITESPACEdirectives in theTemporalWindowBoundsdocstring does not affect functionality and simplifies the examples.
75-125: Polars rolling kwargs examples updated correctly
The doctest blocks forpolars_gp_rolling_kwargsare clear, and the formatting improvements enhance readability without altering behavior.
146-214: Event window bounds doctests are correct
TheToEventWindowBoundsdocstring examples have been simplified and remain accurate for testing edge cases.
309-312: Refactor end_event assignment is concise and correct
The single-line conditional forend_eventmaintains the same logic in a more readable form.src/aces/query.py (4)
63-66: Doctest examples now usecaplogfor log capture
Switching tocaplog.at_level(logging.INFO)makes the examples more idiomatic for pytest and leverages the injected fixture.
86-95: Static variable filtering doctest updated
The refactored examples correctly demonstrate checking log messages viacaplog.text.
102-105: Trigger event absence doctest updated
Usingcaplogin the doctest maintains consistency and clarity in the test examples.
163-165: Consolidatelogger.infointo a single line
The refactored f-string for label extraction logging is concise and functionally identical.src/aces/run.py (2)
4-5: Good refactoring: moved imports to the module level.This change follows Python best practices by placing imports at the module level instead of inside functions, improving readability and potentially reducing runtime overhead since these imports are no longer executed each time the function is called.
Also applies to: 6-9, 15-17
180-188: Well-structured CLI entry point.The new
cli()function nicely encapsulates the command-line interface logic, separating it from the core functionality inmain(). This improves modularity and maintainability. The function provides helpful usage information when no arguments are provided, enhancing the user experience.tests/utils.py (3)
29-29: Improved type annotations.Changing from
strtostr | Nonefor themsgandpredicate_filesparameters provides more accurate type information, which enhances static type checking and code clarity.Also applies to: 81-81
123-123: Nice code optimizations.Two good improvements here:
- Using
data_dir.resolve()!sinstead ofstr(data_dir.resolve())is more idiomatic- Replacing
list(wrote_files.values())[0]withnext(iter(wrote_files.values()))avoids unnecessary list creation, making it more efficientBoth changes maintain functionality while improving code quality.
Also applies to: 125-125
148-150: Better assertion formatting.Reformatting the assertion to use a single-line condition with a multi-line error message improves readability while maintaining the same functionality.
src/aces/aggregate.py (3)
974-977: Good Boolean logic simplification.The conditions for determining
right_inclusiveandleft_inclusivewere elegantly simplified using membership checks (inandnot in), making the code more readable and concise without changing functionality.Also applies to: 982-985
1005-1007: Nice conditional consolidation.Combining these separate conditions using an
oroperator simplifies the code while maintaining the same logic, reducing cognitive load for readers and making maintenance easier.
1099-1101: Effective condition combination.Similar to the previous change, this consolidation of conditions with an
oroperator reduces branching and improves code readability without affecting the behavior.src/aces/config.py (2)
12-13: Improved import organization with TYPE_CHECKING.Moving the
OrderedDictimport under aTYPE_CHECKINGguard is an excellent optimization that reduces runtime overhead while maintaining static type checking capabilities, following Python best practices for type annotations.Also applies to: 28-30
1649-1650:Details
❓ Verification inconclusive
Simplified conditional logic needs test coverage.
This change improves the code by checking if a child is not already in the parent's children before adding it, preventing potential duplicate entries. However, this line isn't covered by tests according to static analysis.
Consider adding test coverage for this edge case to ensure this defensive programming logic works as expected:
🏁 Script executed:
#!/bin/bash # Check which test files might be related to this functionality fd -e py _initialize_windows tests/ # Look for any tests that might be creating window configurations with parent-child relationships rg "window.*parent" tests/Length of output: 66
Ensure parent-child duplication logic is tested
I wasn’t able to find any existing tests that hit the new guard in src/aces/config.py (lines 1649–1650):
if node.parent and child not in node.parent.children: node.parent.children += (child,)Please add unit tests for this edge case, for example:
- Verify a child is appended only once when
node.parentis set- Verify nothing happens if
node.parentisNone- Verify no duplicates when
childis already inparent.childrenYou can add these in a new file (e.g.
tests/test_config.py) or alongside related tests in your suite.🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 1650-1650: src/aces/config.py#L1650
Added line #L1650 was not covered by testspyproject.toml (4)
41-41: Update CLI entry point tocli.
Theaces-cliscript now points toaces.run:cli, reflecting the refactored entry logic. Ensure thatcli()is exported fromsrc/aces/run.pyand that any CI or packaging tests invoking the old entry point are updated.
45-47: Reorganize optional dependencies intodev,tests, andprofilinggroups.
Separating development, testing, and profiling extras improves clarity and keeps installations lean. Confirm that CI workflows install both[dev,tests]and that documentation mentions the new extra names.
62-64: Add coverage report exclusions.
Excluding debug logs, import errors, and type-checking blocks helps focus coverage metrics on business logic. Verify that coverage thresholds in CI reflect these exclusions.
86-91: Extend per-file ignores for__init__.pyand naming conventions.
Ignoring unused imports in__init__.pyis standard, and the naming ignore patterns align with project-specific constants. Looks good.src/aces/predicates.py (4)
10-10: ImportPlainPredicateConfigandTaskExtractorConfig.
Enables type hints and functionality in this module. Verify thatTaskExtractorConfigis used only forget_predicates_dfand doesn’t introduce unnecessary dependencies.
187-188: Use unpacking for thecolumnslist and improve logging.
Replacing["subject_id", "timestamp"] + predicateswith["subject_id", "timestamp", *predicates]enhances readability. The f-string now uses{data_path.resolve()!s}for concise path formatting.
284-284: Apply unpacking when selecting predicate columns in MEDS loader.
Switching from list concatenation to["subject_id", "timestamp", *predicate_cols]keeps selection syntax consistent with other modules.
429-435: Add doctest for missingEventStreamimport in ESGPT loader.
Documenting theImportErrorpath whenEventStreamis absent improves example coverage and developer guidance..pre-commit-config.yaml (7)
1-2: Bump default Python version to 3.12 for pre-commit.
Aligns with the project’s target of Python ≥3.10 and ensures hooks run under the intended interpreter. Confirm CI runners supportpython3.12.
21-29: Consolidate Python linting and formatting under Ruff.
Replacing Black, isort, autoflake, pyupgrade, and flake8 hooks withruff-formatandruff --fixstreamlines toolchain and reduces maintenance. Ensurepyproject.toml’s Ruff settings cover all needed rules.
38-43: Upgrade Prettier to v4.0.0-alpha.8 for YAML formatting.
New version may alter formatting defaults; run through existing YAML files to catch any unexpected diffs.
45-50: Upgrade ShellCheck to v0.10.0.1.
Keeps shell script linting up-to-date. Verify no new warnings/errors in CI after bump.
51-66: Updatemdformatto v0.7.22 and add plugins.
Includingmdformat-ruff,mdformat-gfm-alerts,mdformat-myst, andmdformat-tocenhances markdown formatting. Ensure these plugins are installed in dev environments and that docs render as expected.
67-75: Upgradecodespellto v2.4.1 with refined skip patterns.
Skipping notebooks, large assets, and ignored words (ehr,crate) optimizes spelling checks. Confirm no critical terms are inadvertently skipped.
82-90: Switch notebook linting tonbqa-ruffandnbqa-ruff-format.
Leverages Ruff for both linting and formatting of Jupyter notebooks with autofix. Run sample notebooks to verify hooks behave as intended.
This should be a purely stylistic change, and not affect functionality.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Documentation
Chores