Skip to content

feat: add helper functions to parse input files #2918

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 28 commits into from
Mar 11, 2025

Conversation

fgvieira
Copy link
Contributor

@fgvieira fgvieira commented Jun 16, 2024

Add helper functions to parse input files. See example in snakemake/snakemake-wrappers#2725.

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

Release Notes

  • New Features

    • Added integration of Xonsh scripts into Snakemake workflows.
    • Introduced a new rule for executing Python scripts within a Conda environment.
    • Enhanced report generation capabilities, including self-contained HTML and ZIP archives.
    • Expanded input handling capabilities with new methods for parsing inputs and extracting checksums.
  • Bug Fixes

    • Improved checksum extraction and validation processes in existing workflows.
  • Documentation

    • Expanded documentation to include details on Xonsh integration and report generation features.
    • Clarified usage of conda environments and apptainer integration within Snakemake.
  • Tests

    • Added new test cases for Xonsh script execution and Conda deployment scenarios.
    • Updated existing tests to validate checksum functionality and improve input handling.

Copy link

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

Copy link

sonarqubecloud bot commented Aug 7, 2024

@johanneskoester
Copy link
Contributor

@fgvieira since PR #2927 is now merged and released, I think this PR can be revived and finalized?

@johanneskoester johanneskoester self-assigned this Jan 15, 2025
Copy link
Contributor

coderabbitai bot commented Jan 15, 2025

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (3)
  • CHANGELOG.md is excluded by !CHANGELOG.md
  • CHANGELOG.md is excluded by !CHANGELOG.md
  • CHANGELOG.md is excluded by !CHANGELOG.md

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

The pull request introduces a new section in the Snakemake documentation for integrating Xonsh scripts, alongside updates to various modules and test files. Key changes include the addition of functions parse_input and extract_checksum to enhance input handling, modifications to existing function signatures, and the introduction of new rules and tests to validate these functionalities. The documentation now provides examples for using Xonsh scripts and clarifies the usage of conda environments and report generation in Snakemake workflows.

Changes

File Change Summary
docs/snakefiles/rules.rst Added documentation for integrating Xonsh scripts with examples.
snakemake/ioutils/__init__.py Added imports for parse_input and extract_checksum to global context.
snakemake/ioutils/input.py Updated parse_input() function signature to require infile and parser parameters.
tests/test_ioutils/Snakefile Updated test Snakefile with new input handling and checksum extraction.
tests/test_ioutils/samples.md5 Added new checksum entries for various TSV files.
tests/test_ioutils/expected-results/* Updated expected result files to reflect new checksum handling.
docs/snakefiles/deployment.rst Clarified usage of conda environments and apptainer integration with the run directive.
docs/snakefiles/reporting.rst Enhanced report generation documentation with new sections on results inclusion and report types.
snakemake/cli.py Modified help text for the --report argument to clarify report generation options.
snakemake/script/__init__.py Added XonshScript class for executing Xonsh scripts.
snakemake/workflow.py Introduced check_may_use_software_deployment(method) function for validating deployment directives.
test-environment.yml Added setuptools, cwltool, and cwl-utils to top-level dependencies.
tests/test_conda_run/Snakefile Added new rule random_python_conda_script for testing purposes.
tests/test_script_xsh/Snakefile Introduced new rules for Xonsh script workflow testing.
tests/test_script_xsh/scripts/test.xsh Added output functionality to the Xonsh script.

Sequence Diagram

sequenceDiagram
    participant User as Snakemake Workflow
    participant ParseInput as parse_input()
    participant ExtractChecksum as extract_checksum()
    participant InputFile as Input File

    User->>ParseInput: Call with input file and parser
    ParseInput->>InputFile: Open and read file
    alt No custom parser
        ParseInput-->>User: Return file content
    else Custom parser provided
        ParseInput->>ExtractChecksum: Apply parser function
        ExtractChecksum->>InputFile: Read specific data
        ExtractChecksum-->>ParseInput: Return parsed value
        ParseInput-->>User: Return parsed result
    end
Loading

Possibly related PRs

  • feat: xonsh support #3310: The changes in the main PR and the retrieved PR are directly related as both introduce new sections in the Snakemake documentation for integrating Xonsh scripts, along with modifications to support Xonsh script execution in the codebase.

Suggested reviewers

  • johanneskoester

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
snakemake/ioutils/input.py (1)

1-9: Add error handling for file operations.

The function should handle potential IOErrors when opening/reading the file and provide meaningful error messages.

 def parse_input(infile=None, parser=None, **kwargs):
     def inner(wildcards, input, output):
-        with open(infile, "r") as fh:
-            if parser is None:
-                return fh.read().strip()
-            else:
-                return parser(fh, **kwargs)
+        try:
+            with open(infile, "r") as fh:
+                if parser is None:
+                    return fh.read().strip()
+                else:
+                    return parser(fh, **kwargs)
+        except IOError as e:
+            raise WorkflowError(f"Error reading input file {infile}: {str(e)}")
     return inner
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7a75043 and 2e37b32.

📒 Files selected for processing (7)
  • docs/snakefiles/rules.rst (2 hunks)
  • snakemake/ioutils/__init__.py (2 hunks)
  • snakemake/ioutils/input.py (1 hunks)
  • tests/test_ioutils/Snakefile (4 hunks)
  • tests/test_ioutils/expected-results/c/1.txt (1 hunks)
  • tests/test_ioutils/expected-results/results/switch~someswitch.column~sample.txt (1 hunks)
  • tests/test_ioutils/samples.md5 (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • tests/test_ioutils/expected-results/c/1.txt
  • tests/test_ioutils/expected-results/results/switchsomeswitch.columnsample.txt
  • tests/test_ioutils/samples.md5
🧰 Additional context used
📓 Path-based instructions (2)
snakemake/ioutils/__init__.py (1)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

snakemake/ioutils/input.py (1)

Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

🪛 Ruff (0.8.2)
snakemake/ioutils/input.py

15-15: Do not assign a lambda expression, use a def

Rewrite fix_file_name as a def

(E731)

⏰ Context from checks skipped due to timeout of 90000ms (31)
  • GitHub Check: testing (10, 3.12)
  • GitHub Check: testing (10, 3.11)
  • GitHub Check: testing (9, 3.12)
  • GitHub Check: testing (9, 3.11)
  • GitHub Check: testing (8, 3.12)
  • GitHub Check: testing (8, 3.11)
  • GitHub Check: testing (7, 3.12)
  • GitHub Check: testing (7, 3.11)
  • GitHub Check: testing (6, 3.12)
  • GitHub Check: testing (6, 3.11)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing (5, 3.12)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing (5, 3.11)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing (4, 3.12)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing (4, 3.11)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing (3, 3.12)
  • GitHub Check: testing-windows (5)
  • GitHub Check: testing (3, 3.11)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing (2, 3.12)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing (2, 3.11)
  • GitHub Check: testing (1, 3.12)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing (1, 3.11)
  • GitHub Check: testing-windows (1)
  • GitHub Check: apidocs
🔇 Additional comments (3)
snakemake/ioutils/__init__.py (1)

8-8: LGTM!

The registration of new functions follows the established pattern correctly.

Also applies to: 25-26

tests/test_ioutils/Snakefile (1)

11-13: LGTM!

The test coverage is comprehensive and demonstrates proper usage of the new functionality.

Also applies to: 22-29

docs/snakefiles/rules.rst (1)

550-592: LGTM!

The documentation is clear, well-structured, and includes helpful examples.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/snakefiles/rules.rst (1)

573-593: Enhance the extract_checksum Function Documentation

The documentation for extract_checksum succinctly states its purpose and signature. To further improve clarity, consider:

  • Stating explicitly the expected format or structure of the infile (for example, if it is a CSV, a text file with MD5 checksums, etc.) and what exact form the file parameter should take.
  • Explaining that the function returns a string representing the extracted checksum.
  • Noting that when used in conjunction with parse_input, any additional keyword arguments (e.g. file=output.tsv in the usage example) are forwarded to extract_checksum.

These additions would provide the reader with a fuller understanding of how to use the function and what to expect from its output.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e37b32 and dcb5eaa.

📒 Files selected for processing (1)
  • docs/snakefiles/rules.rst (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (18)
  • GitHub Check: testing (9, 3.12, bash)
  • GitHub Check: testing (9, 3.11, bash)
  • GitHub Check: testing (6, 3.12, bash)
  • GitHub Check: testing (5, 3.12, dash)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing (3, 3.11, bash)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing (2, 3.12, dash)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing (2, 3.12, bash)
  • GitHub Check: testing-windows (5)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing (1, 3.12, dash)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing-windows (1)
🔇 Additional comments (1)
docs/snakefiles/rules.rst (1)

550-572: Clarify the parse_input Function’s Signature and Behavior

The newly added documentation for parse_input is clear in its intent. However, consider adding the following clarifications:

  • Specify that the third parameter (referred to as kwargs in the signature) is optional and is used to pass extra keyword arguments directly to the parser.
  • Emphasize that parse_input returns a function which, when evaluated later by Snakemake, invokes the given parser with the supplied parameters.
  • As noted in previous reviews (), please ensure the implementation/documentation reminds users to validate the first argument—if it is not a callable, parse_input should raise a descriptive error directing them to the updated documentation (see PR 2927).

These clarifications will help users understand both the required types for each parameter and the deferred evaluation behavior of the helper function.

@fgvieira fgvieira moved this from Ready to In progress in Snakemake Hackathon March 2025 Mar 10, 2025
@fgvieira fgvieira moved this from In progress to In review in Snakemake Hackathon March 2025 Mar 10, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
snakemake/ioutils/input.py (1)

12-30: ⚠️ Potential issue

Multiple improvements needed for extract_checksum function.

  1. Replace lambda with a proper function per static analysis
  2. Import WorkflowError class
  3. Use proper exception chaining with from syntax
  4. Add more comprehensive error handling
+from snakemake.exceptions import WorkflowError
+
 def extract_checksum(infile, **kwargs):
+    """Extract checksum from a CSV file.
+    
+    Args:
+        infile: Path to CSV file containing checksums
+        **kwargs: Additional arguments, must include 'file' key
+        
+    Returns:
+        str: The extracted checksum
+        
+    Raises:
+        WorkflowError: If pandas is not installed, file not found, or checksum extraction fails
+    """
+    def fix_file_name(x):
+        return x.removeprefix("./")
+
     try:
         import pandas as pd
-
-        fix_file_name = lambda x: x.removeprefix("./")
-        return (
-            pd.read_csv(
-                infile,
-                sep="  ",
-                header=None,
-                engine="python",
-                converters={1: fix_file_name},
-            )
-            .set_index(1)
-            .loc[fix_file_name(kwargs.get("file"))]
-            .item()
-        )
     except ImportError:
-        raise WorkflowError("Pandas is required to extract checksum from file.")
+        raise WorkflowError("Pandas is required to extract checksum from file.") from None
+        
+    if "file" not in kwargs:
+        raise WorkflowError("'file' parameter is required for extract_checksum function")
+        
+    try:
+        df = pd.read_csv(
+            infile,
+            sep="  ",
+            header=None,
+            engine="python",
+            converters={1: fix_file_name},
+        )
+        return df.set_index(1).loc[fix_file_name(kwargs["file"])].item()
+    except FileNotFoundError:
+        raise WorkflowError(f"Checksum file not found: {infile}") from None
+    except KeyError:
+        raise WorkflowError(f"File {kwargs['file']} not found in checksum file") from None
+    except Exception as e:
+        raise WorkflowError(f"Error extracting checksum: {str(e)}") from e
🧰 Tools
🪛 Ruff (0.8.2)

16-16: Do not assign a lambda expression, use a def

Rewrite fix_file_name as a def

(E731)


30-30: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


30-30: Undefined name WorkflowError

(F821)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd491af and 2eba028.

📒 Files selected for processing (1)
  • snakemake/ioutils/input.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • snakemake/ioutils/input.py
🪛 Ruff (0.8.2)
snakemake/ioutils/input.py

16-16: Do not assign a lambda expression, use a def

Rewrite fix_file_name as a def

(E731)


30-30: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


30-30: Undefined name WorkflowError

(F821)

⏰ Context from checks skipped due to timeout of 90000ms (40)
  • GitHub Check: testing (10, 3.12, dash)
  • GitHub Check: testing (10, 3.12, bash)
  • GitHub Check: testing (10, 3.11, bash)
  • GitHub Check: testing (9, 3.12, dash)
  • GitHub Check: testing (9, 3.12, bash)
  • GitHub Check: testing (9, 3.11, bash)
  • GitHub Check: testing (8, 3.12, dash)
  • GitHub Check: testing (8, 3.12, bash)
  • GitHub Check: testing (8, 3.11, bash)
  • GitHub Check: testing (7, 3.12, dash)
  • GitHub Check: testing (7, 3.12, bash)
  • GitHub Check: testing (7, 3.11, bash)
  • GitHub Check: testing (6, 3.12, dash)
  • GitHub Check: testing (6, 3.12, bash)
  • GitHub Check: testing (6, 3.11, bash)
  • GitHub Check: testing (5, 3.12, dash)
  • GitHub Check: testing (5, 3.12, bash)
  • GitHub Check: testing (5, 3.11, bash)
  • GitHub Check: testing (4, 3.12, dash)
  • GitHub Check: testing (4, 3.12, bash)
  • GitHub Check: testing (4, 3.11, bash)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing (3, 3.12, dash)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing (3, 3.12, bash)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing (3, 3.11, bash)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing (2, 3.12, dash)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing (2, 3.12, bash)
  • GitHub Check: testing-windows (5)
  • GitHub Check: testing (2, 3.11, bash)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing (1, 3.12, dash)
  • GitHub Check: testing (1, 3.12, bash)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing (1, 3.11, bash)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing-windows (1)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (2)
snakemake/script/__init__.py (1)

1600-1736: 💡 Verification agent

🧩 Analysis chain

Verify Xonsh is properly installed in the conda environment for this feature.

The new Xonsh script functionality will require the xonsh shell to be installed in the user's environment. Please ensure this dependency is documented for users who want to use this feature.


🏁 Script executed:

#!/bin/bash
# Check if xonsh is mentioned in any conda environment files or documentation
echo "Searching for xonsh in conda environment files:"
find workflow -name "*.yaml" -o -name "*.yml" | xargs grep -l "xonsh" || echo "xonsh not found in conda environment files"

echo "Searching for xonsh in documentation:"
find docs -type f -name "*.rst" -o -name "*.md" | xargs grep -l "xonsh" || echo "xonsh not mentioned in documentation files"

Length of output: 565


Update Required: Document the Xonsh Dependency

The new Xonsh script functionality requires the xonsh shell to be installed in the user's environment. However, our search did not find any references to xonsh in the conda environment YAML files or in the documentation. Please ensure that:

  • The dependency on xonsh is clearly mentioned in the installation or setup instructions (e.g., in the README or other docs).
  • The conda environment configuration is updated (if applicable) to include the xonsh package.

Once these documentation updates are added, users will be properly guided on how to install xonsh to utilize the new feature.

🧰 Tools
🪛 Ruff (0.8.2)

1675-1675: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


1676-1679: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


1725-1733: Use dict.get() without default value

Remove default value

(SIM910)

tests/tests_using_conda.py (1)

332-346: ⚠️ Potential issue

Duplicate test functions detected

The test functions test_get_log_both, test_get_log_stderr, test_get_log_stdout, and test_get_log_complex are duplicated at the end of the file. This will cause issues when running tests, as pytest will discover each function twice.

Remove the duplicate test functions:

-def test_get_log_both():
-    run(dpath("test_get_log_both"))
-
-
-
-def test_get_log_stderr():
-    run(dpath("test_get_log_stderr"))
-
-
-
-def test_get_log_stdout():
-    run(dpath("test_get_log_stdout"))
-
-
-
-def test_get_log_complex():
-    run(dpath("test_get_log_complex"))
🧹 Nitpick comments (3)
docs/snakefiles/rules.rst (2)

550-571: Clarify the parse_input function signature and expected behavior.
The new documentation for parse_input is clear and provides a good usage example. However, the signature is noted as

parse_input(input_item, parser, kwargs)

while the example only passes a string as the first argument (e.g. "samples"). It would improve clarity to explicitly state what types are accepted for input_item – for instance, that it may be a key referencing an input file or a callable—and what will happen if a non-callable is provided (per the suggestions in PR #2927). Consider adding a note or a brief explanation that an error will be raised with a reference to the updated docs section if the first argument is not of the expected type.


573-593: Review and enhance the extract_checksum function documentation.
The documentation clearly explains that extract_checksum parses an input file and returns the checksum for a given filename. The usage example is helpful. To further aid users, consider clarifying how the file parameter is used in cases where multiple matches might occur or if no matching filename is found. This additional detail would prevent potential ambiguities in usage and guide users to implement proper error handling in their parser implementations.

snakemake/workflow.py (1)

1812-1813: Fix the boolean comparison style.

The current comparison against False is discouraged in Python. Use a simple truthiness check instead.

-if not ruleinfo.template_engine and ruleinfo.container_img != False:
+if not ruleinfo.template_engine and ruleinfo.container_img:
🧰 Tools
🪛 Ruff (0.8.2)

1812-1812: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks

Replace with ruleinfo.container_img

(E712)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8fd3e33 and e406351.

⛔ Files ignored due to path filters (2)
  • pyproject.toml is excluded by !pyproject.toml
  • tests/test_script_xsh/expected-results/test.out is excluded by !**/*.out
📒 Files selected for processing (30)
  • docs/snakefiles/rules.rst (1 hunks)
  • docs/snakefiles/rules.rst (2 hunks)
  • docs/snakefiles/rules.rst (3 hunks)
  • docs/snakefiles/deployment.rst (4 hunks)
  • docs/snakefiles/reporting.rst (4 hunks)
  • snakemake/cli.py (3 hunks)
  • snakemake/script/__init__.py (4 hunks)
  • snakemake/workflow.py (4 hunks)
  • docs/snakefiles/rules.rst (2 hunks)
  • snakemake/cli.py (1 hunks)
  • test-environment.yml (1 hunks)
  • tests/tests_using_conda.py (1 hunks)
  • docs/snakefiles/rules.rst (1 hunks)
  • docs/snakefiles/deployment.rst (2 hunks)
  • docs/snakefiles/reporting.rst (5 hunks)
  • docs/snakefiles/rules.rst (1 hunks)
  • snakemake/cli.py (1 hunks)
  • snakemake/script/__init__.py (3 hunks)
  • snakemake/workflow.py (1 hunks)
  • test-environment.yml (2 hunks)
  • tests/test_conda_python_3_7_script/Snakefile (1 hunks)
  • tests/test_conda_python_3_7_script/test_script_python_3_7.py (1 hunks)
  • tests/test_conda_run/Snakefile (1 hunks)
  • tests/test_conda_run/expected-results/test.txt (1 hunks)
  • tests/test_conda_run/test_python_env.yaml (1 hunks)
  • tests/test_conda_run/test_script_run.py (1 hunks)
  • tests/test_script_xsh/Snakefile (1 hunks)
  • tests/test_script_xsh/envs/xonsh.yaml (1 hunks)
  • tests/test_script_xsh/scripts/test.xsh (1 hunks)
  • tests/tests_using_conda.py (1 hunks)
✅ Files skipped from review due to trivial changes (8)
  • tests/test_script_xsh/scripts/test.xsh
  • tests/test_conda_run/expected-results/test.txt
  • tests/test_conda_run/test_script_run.py
  • tests/test_conda_run/test_python_env.yaml
  • tests/test_conda_python_3_7_script/test_script_python_3_7.py
  • docs/snakefiles/deployment.rst
  • tests/test_conda_python_3_7_script/Snakefile
  • tests/test_script_xsh/envs/xonsh.yaml
🚧 Files skipped from review as they are similar to previous changes (4)
  • docs/snakefiles/rules.rst
  • docs/snakefiles/rules.rst
  • docs/snakefiles/rules.rst
  • docs/snakefiles/rules.rst
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...

**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • tests/tests_using_conda.py
  • snakemake/workflow.py
  • snakemake/cli.py
  • snakemake/script/__init__.py
🪛 Ruff (0.8.2)
snakemake/workflow.py

127-127: snakemake.api imported but unused

Remove unused import: snakemake.api

(F401)


1812-1812: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks

Replace with ruleinfo.container_img

(E712)

snakemake/script/__init__.py

97-100: Use ternary operator anchor = f"#{urllib.parse.quote(self._anchor)}" if self._anchor else "" instead of if-else-block

Replace if-else-block with anchor = f"#{urllib.parse.quote(self._anchor)}" if self._anchor else ""

(SIM108)

⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: testing-windows (10)
  • GitHub Check: testing-windows (9)
  • GitHub Check: testing-windows (8)
  • GitHub Check: testing-windows (7)
  • GitHub Check: testing-windows (6)
  • GitHub Check: testing-windows (5)
  • GitHub Check: testing-windows (4)
  • GitHub Check: testing-windows (3)
  • GitHub Check: testing-windows (2)
  • GitHub Check: testing-windows (1)
🔇 Additional comments (30)
tests/test_script_xsh/Snakefile (1)

1-13: Well-structured test Snakefile for Xonsh script integration

This is a clean and minimal test case for the new Xonsh script integration feature. The file follows Snakemake conventions with a proper all rule targeting the expected output, and a rule specifying both the conda environment and script to be executed.

tests/test_conda_run/Snakefile (1)

1-13: Good test case with informative comments

This rule demonstrates how to execute a Python script with a conda environment using the run directive. The comments clearly explain that this approach is mainly for testing purposes and provide guidance on the recommended approach for production workflows (using the script directive).

snakemake/cli.py (3)

14-14: LGTM: Appropriate import addition

Added import for the caching module that will be used to reference environment variables.


48-48: LGTM: Updated import source

Changed the import source for the Batch class to be from snakemake.settings.types.


479-479: LGTM: Improved variable reference in help text

Replaced hardcoded environment variable name with a reference to caching.LOCATION_ENVVAR, making the code more maintainable by centralizing the definition of this environment variable.

test-environment.yml (2)

7-7: LGTM: Added setuptools dependency

Added setuptools to the dependencies list, which is a common requirement for Python packaging.


61-62: LGTM: Moved packages from pip to conda dependencies

Moved cwltool and cwl-utils from the pip section to direct conda dependencies. This is generally better practice as it allows conda to handle dependency resolution, potentially leading to better compatibility across packages.

snakemake/workflow.py (4)

1779-1787: Good refactoring to centralize deployment method validation logic.

The addition of this function improves code maintainability by centralizing the logic for checking if software deployment methods can be used with template engines, replacing the previously scattered inline checks.


1794-1794: Consistent use of the new validation function for envmodules.

The code now calls the centralized validation function instead of using inline checks, improving consistency and maintainability.


1800-1800: Consistent use of the new validation function for conda.

The code now calls the centralized validation function instead of using inline checks, improving consistency and maintainability.


1808-1808: Consistent use of the new validation function for container/singularity.

The code now calls the centralized validation function instead of using inline checks, improving consistency and maintainability.

docs/snakefiles/deployment.rst (2)

288-293: Updated conda documentation to include the run directive.

This change correctly clarifies that conda environments can also be used with the run directive, while explaining its special case nature since it has access to the Snakefile and must execute in the same process as Snakemake itself.


464-467: Updated apptainer documentation to include the run directive.

This change correctly clarifies that apptainer integration can also be used with the run directive, while explaining the same special case constraints as with conda environments.

snakemake/script/__init__.py (4)

1600-1603: Good implementation of XonshScript class.

The new XonshScript class extends the functionality of Snakemake to support Xonsh scripts by leveraging inheritance from PythonScript. The implementation is minimal but sufficient, overriding only the execute_script method to use the Xonsh shell.


1667-1668: Added language recognition for Xonsh scripts.

This change correctly updates the get_language function to recognize files with the .xsh extension as Xonsh scripts, enabling their proper handling.


1732-1732: Registered XonshScript as the handler for Xonsh scripts.

The update to the language-to-executor class mapping properly associates the new XonshScript class with the 'xonsh' language identifier.


1736-1736: Updated error message to include Xonsh scripts.

The error message for unsupported script types now correctly includes Xonsh (.xsh) as a supported format, ensuring users receive accurate information about the available options.

docs/snakefiles/reporting.rst (7)

7-10: Improved introduction to reporting capabilities!

The updated introduction provides a clearer description of Snakemake's reporting capabilities, highlighting the self-contained nature of reports and explaining the distinction between smaller HTML reports and more complex ZIP archives.


13-17: Well-structured new section heading

The new section "Including results in a report" with proper RST formatting helps organize the documentation logically.


105-105: Minor formatting correction

The heading level indicator has been corrected from underlining with dashes to using carets, which maintains proper hierarchy in the RST document structure.


142-143: Consistent heading formatting

The RST formatting for the section on determining category, subcategory, and labels dynamically has been properly updated with carets to maintain consistent heading levels.


149-150: Proper section heading formatting

The heading level for "Linking between items" has been correctly formatted with carets to maintain proper document hierarchy.


152-153: Consistent subsection formatting

The subsection headings "From captions" and "From results" use consistent formatting with quotation marks as underlining, maintaining proper RST document structure.

Also applies to: 164-165


257-258: Well-structured new sections with proper RST formatting

The additions of new sections for rendering reports, including self-contained HTML files, ZIP archives, partial reports, and custom layouts, have been properly formatted with appropriate heading levels and RST syntax.

Also applies to: 265-269, 284-285, 286-290, 304-306, 318-320, 321-327, 329-331

tests/tests_using_conda.py (6)

1-5: Proper file header with attribution information

The file includes appropriate author, copyright, email, and license information.


6-24: Comprehensive imports for test functionality

The imports cover all necessary modules for testing conda functionality, including standard library modules, pytest, and Snakemake-specific modules.


26-28: Well-designed helper for Windows permission error handling

The xfail_permissionerror_on_win function provides a clean way to mark tests that might fail with permission errors on Windows.


31-39: Comprehensive test coverage for Conda functionality

The test functions cover a wide range of Conda-related functionalities, including script execution, environment creation, custom prefixes, and integration with other tools like Apptainer.

Also applies to: 41-49, 51-54, 56-59, 66-81, 83-90, 93-96, 98-107, 109-117, 119-125, 127-135, 137-155, 157-167, 169-172, 174-182, 184-187, 189-193, 195-211, 213-221, 223-227, 229-236, 238-242, 244-252, 254-258, 260-263, 265-271, 273-280, 282-290, 292-295, 297-307, 314-316, 318-320, 322-324, 326-328, 330-332


61-65: Helpful TODO comment for Windows-specific issue

The TODO comment provides valuable information about a specific Windows permission error that needs to be addressed.


309-312: Informative comment about test dependencies

The comment clearly explains that the following tests don't explicitly depend on Conda but will fail if 'conda info --json' doesn't work as expected.

Copy link

@johanneskoester johanneskoester merged commit 63e45a7 into snakemake:main Mar 11, 2025
49 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in Snakemake Hackathon March 2025 Mar 11, 2025
@fgvieira fgvieira deleted the input_helper_func branch March 11, 2025 13:51
johanneskoester pushed a commit that referenced this pull request Mar 14, 2025
🤖 I have created a release *beep* *boop*
---


##
[9.0.0](v8.30.0...v9.0.0)
(2025-03-14)


### ⚠ BREAKING CHANGES

* Logging refactor & add LoggerPluginInterface
([#3107](#3107))

### Features

* [#3412](#3412) - keep
shadow folder of failed job if --keep-incomplete flag is set.
([#3430](#3430))
([22978c3](22978c3))
* add flag --report-after-run to automatically generate the report after
a successfull workflow run
([#3428](#3428))
([b0a7f03](b0a7f03))
* add flatten function to IO utils
([#3424](#3424))
([67fa392](67fa392))
* add helper functions to parse input files
([#2918](#2918))
([63e45a7](63e45a7))
* Add option to print redacted file names
([#3089](#3089))
([ba4d264](ba4d264))
* add support for validation of polars dataframe and lazyframe
([#3262](#3262))
([c7473a6](c7473a6))
* added support for rendering dag with mermaid js
([#3409](#3409))
([7bf8381](7bf8381))
* adding --replace-workflow-config to fully replace workflow configs
(from config: directive) with --configfile, instead of merging them
([#3381](#3381))
([47504a0](47504a0))
* Dynamic module name
([#3401](#3401))
([024dc32](024dc32))
* Enable saving and reloading IOCache object
([#3386](#3386))
([c935953](c935953))
* files added in rule params with workflow.source_path will be available
in used containers
([#3385](#3385))
([a6e45bf](a6e45bf))
* Fix keep_local in storage directive and more freedom over remote
retrieval behaviour
([#3410](#3410))
([67b4739](67b4739))
* inherit parameters of use rule and extend/replace individual items
them when using 'with' directive
([#3365](#3365))
([93e4b92](93e4b92))
* Logging refactor & add LoggerPluginInterface
([#3107](#3107))
([86f1d6e](86f1d6e))
* Maximal file size for checksums
([#3368](#3368))
([b039f8a](b039f8a))
* Modernize package configuration using Pixi
([#3369](#3369))
([77992d8](77992d8))
* multiext support for named input/output
([#3372](#3372))
([05e1378](05e1378))
* optionally auto-group jobs via temp files in case of remote execution
([#3378](#3378))
([cc9bba2](cc9bba2))


### Bug Fixes

* `--delete-all-output` ignores `--dry-run`
([#3265](#3265))
([23fef82](23fef82))
* 3342 faster touch runs and warning messages for non-existing files
([#3398](#3398))
([cd9c3c3](cd9c3c3))
* add default value to max-jobs-per-timespan
([#3043](#3043))
([2959abe](2959abe))
* checkpoints inside modules are overwritten
([#3359](#3359))
([fba3ac7](fba3ac7))
* Convert Path to IOFile
([#3405](#3405))
([c58684c](c58684c))
* Do not perform storage object cleanup with --keep-storage-local-copies
set ([#3358](#3358))
([9a6d14b](9a6d14b))
* edgecases of source deployment in case of remote execution
([#3396](#3396))
([5da13be](5da13be))
* enhance error message formatting for strict DAG-building mode
([#3376](#3376))
([a1c39ee](a1c39ee))
* fix bug in checkpoint handling that led to exceptions in case
checkpoint output was missing upon rerun
([#3423](#3423))
([8cf4a2f](8cf4a2f))
* force check all required outputs
([#3341](#3341))
([495a4e7](495a4e7))
* group job formatting
([#3442](#3442))
([f0b10a3](f0b10a3))
* in remote jobs, upload storage in topological order such that
modification dates are preserved (e.g. in case of group jobs)
([#3377](#3377))
([eace08f](eace08f))
* only skip eval when resource depends on input
([#3374](#3374))
([4574c92](4574c92))
* Prevent execution of conda in apptainer when not explicitly requested
in software deployment method
([#3388](#3388))
([c43c5c0](c43c5c0))
* print filenames with quotes around them in RuleException
([#3269](#3269))
([6baeda5](6baeda5))
* Re-evaluation of free resources
([#3399](#3399))
([6371293](6371293))
* ReadTheDocs layout issue due to src directory change
([#3419](#3419))
([695b127](695b127))
* robustly escaping quotes in generated bash scripts (v2)
([#3297](#3297))
([#3389](#3389))
([58720bd](58720bd))
* Show apptainer image URL in snakemake report
([#3407](#3407))
([45f0450](45f0450))
* Update ReadTheDocs configuration for documentation build to use Pixi
([#3433](#3433))
([3f227a6](3f227a6))


### Documentation

* Add pixi setup instructions to general use tutorial
([#3382](#3382))
([115e81b](115e81b))
* fix contribution section heading levels, fix docs testing setup order
([#3360](#3360))
([051dc53](051dc53))
* fix link to github.com/snakemake/poetry-snakemake-plugin
([#3436](#3436))
([ec6d97c](ec6d97c))
* fix quoting
([#3394](#3394))
([b40f599](b40f599))
* fix rerun-triggers default
([#3403](#3403))
([4430e23](4430e23))
* fix typo 'safe' -&gt; 'save'
([#3384](#3384))
([7755861](7755861))
* mention code formatting in the contribution section
([#3431](#3431))
([e8682b7](e8682b7))
* remove duplicated 'functions'.
([#3356](#3356))
([7c595db](7c595db))
* update broken links documentation
([#3437](#3437))
([e3d0d88](e3d0d88))
* Updating contributing guidelines with new pixi dev setup
([#3415](#3415))
([8e95a12](8e95a12))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: snakemake-bot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

2 participants