feat: new inphared-db wrapper #1550

Open
wants to merge 3 commits into
base: master

Conversation

@nacnoriko nacnoriko commented Jul 20, 2023

Description

Adding a new wrapper to download the current inphared database.

QC

  • I confirm that:

For all wrappers added by this PR,

  • there is a test case which covers any introduced changes,
  • input: and output: file paths in the resulting rule can be changed arbitrarily,
  • either the wrapper can only use a single core, or the example rule contains a threads: x statement with x being a reasonable default,
  • rule names in the test case are in snake_case and somehow tell what the rule is about or match the tool's purpose or name (e.g., map_reads for a step that maps reads),
  • all environment.yaml specifications follow the respective best practices,
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:),
  • all fields of the example rules in the Snakefiles and their entries are explained via comments (input:/output:/params: etc.),
  • stderr and/or stdout are logged correctly (log:), depending on the wrapped tool,
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to (see here; this also means that using any Python tempfile default behavior works),
  • the meta.yaml contains a link to the documentation of the respective tool or command,
  • Snakefiles pass the linting (snakemake --lint),
  • Snakefiles are formatted with snakefmt,
  • Python wrapper scripts are formatted with black.
  • Conda environments use a minimal number of channels, in recommended ordering. E.g. for bioconda, use (conda-forge, bioconda, nodefaults), as conda-forge should have highest priority and defaults channels are usually not needed because most packages are in conda-forge nowadays.
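
The checklist item on temporary files can be illustrated with a minimal Python sketch. It assumes a wrapper that first writes into a scratch directory and only then moves the finished file to the requested output path. The function name `write_via_tempdir` is hypothetical and not part of this PR.

```python
import shutil
import tempfile
from pathlib import Path


def write_via_tempdir(data: bytes, output: str) -> Path:
    """Write data into a scratch directory first, then move it to `output`.

    Hypothetical helper for illustration only.
    """
    out_path = Path(output)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # With default arguments, TemporaryDirectory() creates its folder under
    # tempfile.gettempdir() and removes it when the with-block exits.
    with tempfile.TemporaryDirectory() as tmpdir:
        scratch = Path(tmpdir) / out_path.name
        scratch.write_bytes(data)
        # Move the finished file to the final destination.
        shutil.move(str(scratch), str(out_path))
    return out_path
```

Relying on the `tempfile` defaults means the location follows the user's `TMPDIR` setting without any extra handling in the wrapper.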

Summary by CodeRabbit

  • New Features
    • Added configuration and metadata files to support downloading sequence files from the Inphared database.
    • Introduced Snakemake workflows and rules for automated retrieval of genome and chromosome FASTA files, including support for older releases.
    • Provided example configuration files for customizable data access.
    • Implemented wrapper scripts for streamlined downloading and integration with Snakemake workflows.
  • Tests
    • Added test workflows and configuration files to validate the new data retrieval processes.

@fgvieira
Collaborator

Is the config file really necessary?
What is the old_wrapper.py file?

Member

@dlaehnemann dlaehnemann left a comment

This looks good already. But of course, I have a bunch of suggestions. Maybe we can work on those together during the symposium or even the workshop.


rule get_inphareddb:
output:
expand("{date}{suffix}", date=config["date"], suffix=config["suffix"])
Member

One thing we aim for with these wrappers is that they work with arbitrary file names. So the config[] entries should not determine the output file name in this example Snakefile, but should rather be passed on to the wrapper.py file (via snakemake.params.date, for example).

Suggested change
expand("{date}{suffix}", date=config["date"], suffix=config["suffix"])
"resources/inphared.fasta"

To add the values in the config variables back into the file name, users of the wrapper should then add Python code around it. But I also like the idea of directly showcasing how to do that here. So maybe we could have two versions of calling the wrapper here: one with a fixed file name (as I suggest here), and one that contains the config[] entries.
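
As a rough sketch of the "Python code around it" mentioned above, the two variants could look as follows. The `config` dict is a hypothetical stand-in for what Snakemake would load from config.yaml. Treating "2Jul2023" as the `date` value is an assumption based on the keys used elsewhere in this PR, and the helper name `inphared_output` is made up for illustration.

```python
# Hypothetical stand-in for the dict Snakemake builds from config.yaml.
# Assumption: "2Jul2023" is the value of the `date` key.
config = {"date": "2Jul2023", "suffix": "_refseq_genomes.fa"}


def inphared_output(config: dict, outdir: str = "resources") -> str:
    """Return an output path that embeds the configured date and suffix."""
    return f"{outdir}/{config['date']}{config['suffix']}"


# Version 1: a fixed file name, independent of the config values.
fixed_output = "resources/inphared.fasta"

# Version 2: a file name derived from the config values.
versioned_output = inphared_output(config)
```

In a Snakefile, either string could then be used as the `output:` of the rule, while the wrapper itself only ever sees the final path.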

@@ -0,0 +1,4 @@
name: inphared-db
description: Download sequence file from the Inphared database (https://github.com/RyanCook94/inphared/blob/main/README.md), and store them in a single .fasta file. Please check the current database available at the above link and adjust the config file.
Member

Suggested change
description: Download sequence file from the Inphared database (https://github.com/RyanCook94/inphared/blob/main/README.md), and store them in a single .fasta file. Please check the current database available at the above link and adjust the config file.
description: Download sequence file from the [inphared database](https://github.com/RyanCook94/inphared/blob/main/README.md), and store them in a single .fasta file. Please check the above link for available database version and adjust the config file.

output:
expand("{date}{suffix}", date=config["date"], suffix=config["suffix"])
params:
prefix = config["prefix"],
Member

Maybe rename this to url:, as that might make its purpose a bit clearer?

Suggested change
prefix = config["prefix"],
url = config["url"],

Obviously, the same rename then applies in the config.yaml file.

@@ -0,0 +1,29 @@
rule get_genome:
Member

This file can be deleted here, right? It's simply a copy-paste leftover, if I understand this correctly.

@@ -0,0 +1,80 @@
__author__ = "Johannes Köster"
Member

This file can simply be deleted here, right?

@@ -0,0 +1,30 @@
rule get_genome:
Member

This file can simply be deleted here, right?

Comment on lines +8 to +9
shell:
"curl {params.prefix}{params.date}{params.suffix} -o {params.date}{params.suffix}"
Member

Some little things here:

  1. You have to reference anything from the Snakemake rule via the snakemake object. So for example snakemake.params.date.
  2. We should only use the full output file name here, so that users can put in there whatever they want.
  3. We have to call the shell() function here, as this is not the body of an actual rule, but rather a plain Python script.
  4. We have to use the f"" construct here, so that format strings fill in the variables in this context. Here, we are not dealing with Snakemake wildcards, but rather with Python variables. The {} syntax is the same, which makes this very confusing...
  5. I would manually put in the separator between URL and file name here, just for a slightly clearer structure of the download link. Then, we can remove the trailing / in the url in config.yaml.
Suggested change
shell:
"curl {params.prefix}{params.date}{params.suffix} -o {params.date}{params.suffix}"
shell(f"curl {snakemake.params.url}/{snakemake.params.date}{snakemake.params.suffix} -o {snakemake.output}")
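
The five points above can be sketched as a small, testable helper that only assembles the command string. The function name `build_curl_command` is hypothetical, and quoting with `shlex.quote` plus stripping a trailing slash from the URL are additions for robustness, not something requested in this review. The commented lines at the end show how a wrapper.py might use it, assuming Snakemake injects the `snakemake` object as usual.

```python
import shlex


def build_curl_command(url: str, date: str, suffix: str, output: str) -> str:
    """Assemble the curl call for one database file (hypothetical helper)."""
    # Put the separator between URL and file name in by hand, and tolerate
    # a trailing slash in the configured URL.
    source = f"{url.rstrip('/')}/{date}{suffix}"
    # Quote both arguments so unusual characters survive the shell.
    return f"curl {shlex.quote(source)} -o {shlex.quote(output)}"


# Possible use inside wrapper.py (not runnable outside of Snakemake):
#   from snakemake.shell import shell
#   log = snakemake.log_fmt_shell(stdout=True, stderr=True)
#   cmd = build_curl_command(
#       snakemake.params.url,
#       snakemake.params.date,
#       snakemake.params.suffix,
#       snakemake.output[0],
#   )
#   shell(cmd + " " + log)
```

Keeping the string assembly in a plain function makes the download link easy to check without network access.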

Comment on lines +8 to +9
prefix:
"https://millardlab-inphared.s3.climb.ac.uk/"
Member

To keep this in sync with the suggestions elsewhere, we would change this to:

Suggested change
prefix:
"https://millardlab-inphared.s3.climb.ac.uk/"
url:
"https://millardlab-inphared.s3.climb.ac.uk"

"2Jul2023"

suffix:
"_refseq_genomes.fa"
Member

Is there a smaller reference FASTA file (for example, a small subset of viruses) that could be used in the example? Ideally, this example gets executed regularly in the CI tests, so it should download as little as possible.

Also, we should still add an actual test run for this wrapper.

Contributor

github-actions bot commented Feb 1, 2024

This PR was marked as stale because it has been open for 6 months with no activity.

@github-actions github-actions bot added the Stale label Feb 1, 2024
Contributor

coderabbitai bot commented May 13, 2025

📝 Walkthrough

Several new files were added under bio/reference/inphared-db/ to support downloading and managing bioinformatics reference data. These include Conda environment and metadata configuration files, a Python wrapper for downloading files, and multiple Snakemake workflow files for orchestrating data retrieval, including test workflows and legacy support scripts.

Changes

| File(s) | Change Summary |
| --- | --- |
| bio/reference/inphared-db/environment.yaml | Added a Conda environment file specifying channels ("conda-forge", "nodefaults") and a dependency on "curl". |
| bio/reference/inphared-db/meta.yaml | Added a metadata file defining the "inphared-db" database, including its name, description, and author information. |
| bio/reference/inphared-db/wrapper.py | Added a Python module with metadata and a shell command to download a file using parameters for prefix, date, and suffix. |
| bio/reference/inphared-db/old_wrapper.py | Added a Python script to download sequence data from Ensembl FTP servers, handling parameters for species, release, build, datatype, chromosome, and branch, with logic for URL construction, file selection, and error handling. |
| bio/reference/inphared-db/test/Snakefile, bio/reference/inphared-db/test/config.yaml | Added a test Snakemake workflow and its configuration file to define and parameterize a rule for downloading reference data using the wrapper. |
| bio/reference/inphared-db/test/old_release.smk | Added a Snakemake workflow defining rules to download genome and chromosome FASTA files for Saccharomyces cerevisiae from an older Ensembl release, with logging and caching. |
| bio/reference/inphared-db/test/old_snakefile.smk | Added a Snakemake workflow with rules for downloading full genome and single chromosome FASTA files for Saccharomyces cerevisiae, supporting different releases and builds, with logging and caching. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Snakemake
    participant Wrapper (wrapper.py)
    participant Remote Server

    User->>Snakemake: Launch workflow (Snakefile)
    Snakemake->>Wrapper (wrapper.py): Run shell command with params (prefix, date, suffix)
    Wrapper (wrapper.py)->>Remote Server: curl {prefix}{date}{suffix}
    Remote Server-->>Wrapper (wrapper.py): Send file data
    Wrapper (wrapper.py)-->>Snakemake: File saved locally
    Snakemake-->>User: Workflow completes, output available
sequenceDiagram
    participant User
    participant Snakemake
    participant Old Wrapper (old_wrapper.py)
    participant Ensembl FTP

    User->>Snakemake: Launch workflow (old_snakefile.smk/old_release.smk)
    Snakemake->>Old Wrapper (old_wrapper.py): Pass parameters (species, release, build, etc.)
    Old Wrapper (old_wrapper.py)->>Ensembl FTP: Check for file existence (curl)
    alt File exists
        Ensembl FTP-->>Old Wrapper (old_wrapper.py): File found
        Old Wrapper (old_wrapper.py)->>Ensembl FTP: Download and decompress file
        Ensembl FTP-->>Old Wrapper (old_wrapper.py): Send file data
        Old Wrapper (old_wrapper.py)-->>Snakemake: File saved locally
    else File not found
        Old Wrapper (old_wrapper.py)-->>Snakemake: Print error, exit with failure
    end
    Snakemake-->>User: Workflow completes or fails with error

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (9)
bio/reference/inphared-db/meta.yaml (1)

2-2: Fix trailing spaces and improve description format.

The description has trailing spaces at the end of the line, which is flagged by the YAML linter. Also, consider using the Markdown link format as previously suggested.

-description: Download sequence file from the Inphared database (https://github.com/RyanCook94/inphared/blob/main/README.md), and store them in a single .fasta file. Please check the current database available at the above link and adjust the config file. 
+description: Download sequence file from the [inphared database](https://github.com/RyanCook94/inphared/blob/main/README.md), and store them in a single .fasta file. Please check the above link for available database version and adjust the config file.
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 2-2: trailing spaces

(trailing-spaces)

bio/reference/inphared-db/wrapper.py (1)

8-9: ⚠️ Potential issue

Fix syntax errors in shell command implementation.

The current implementation has syntax errors and doesn't properly use the Snakemake API. The shell command should use Python syntax, not Snakemake rule syntax.

Based on previous feedback, implement these changes:

  1. Use the shell() function directly
  2. Access parameters through the snakemake object
  3. Use f-strings for interpolation
  4. Use the output file specified by the Snakemake rule
  5. Add separator between URL components for clarity
-    shell:
-        "curl {params.prefix}{params.date}{params.suffix} -o {params.date}{params.suffix}"
+shell(f"curl {snakemake.params.url}/{snakemake.params.date}{snakemake.params.suffix} -o {snakemake.output}")

Note: This assumes that the config.yaml has been updated to use url instead of prefix (without trailing slash) as suggested in previous reviews.

🧰 Tools
🪛 Ruff (0.8.2)

8-8: SyntaxError: Unexpected indentation


8-9: SyntaxError: Expected an expression


9-9: SyntaxError: Unexpected indentation

🪛 GitHub Actions: Code quality

[error] 8-8: Black formatting check failed: Cannot parse file at line 8. File could not be formatted.

bio/reference/inphared-db/test/config.yaml (2)

8-9: Update prefix to url without trailing slash.

As suggested in previous reviews, rename the prefix key to url and remove the trailing slash.

-prefix:
-    "https://millardlab-inphared.s3.climb.ac.uk/"
+url:
+    "https://millardlab-inphared.s3.climb.ac.uk"

4-6: Consider using a smaller reference file for testing.

The current reference file (_refseq_genomes.fa) may be large, which could be problematic for CI testing. As suggested in a previous review, find a smaller subset of viruses to use as a test reference.

Is there a smaller test file available from the inphared database that could be used for CI testing purposes? This would improve test execution times and reduce resource usage.

bio/reference/inphared-db/test/Snakefile (2)

5-5: Use arbitrary file names instead of direct config references

One of the best practices for these wrappers is to work with arbitrary file names. The config[] entries should not be used directly in the output path in this example Snakefile, but rather in the wrapper.py file.

Consider showing both approaches - one with a fixed filename and one with config values:

-    output:
-        expand("{date}{suffix}", date=config["date"], suffix=config["suffix"])    
+    output:
+        "resources/inphared.fasta"

Users can add Python code to incorporate config variables in their own workflow if needed.


7-7: Rename parameter from 'prefix' to 'url' for clarity

Renaming this parameter would make its purpose clearer to users of the wrapper.

-        prefix = config["prefix"], 
+        url = config["url"], 

Remember to update the corresponding entry in the config.yaml file as well.

bio/reference/inphared-db/test/old_release.smk (1)

1-14: Remove this file if it's not needed for the inphared-db wrapper

This file appears to be unrelated to the inphared-db wrapper functionality and might be a copy-paste leftover as noted in previous review comments.

If this file is not needed for the inphared-db wrapper, it should be removed to prevent confusion.

bio/reference/inphared-db/test/old_snakefile.smk (1)

1-14: Remove this file if it's not needed for the inphared-db wrapper

This file appears to be unrelated to the inphared-db wrapper functionality and might be a leftover file as noted in previous review comments.

If this file is not needed for the inphared-db wrapper, it should be removed to prevent confusion.

bio/reference/inphared-db/old_wrapper.py (1)

1-5: Remove this file if it's not needed for the inphared-db wrapper

This file appears to be handling Ensembl downloads rather than inphared database downloads, which seems inconsistent with the PR title. Previous review comments suggested this file might be unnecessary.

If this file is not needed for the inphared-db wrapper, it should be removed to prevent confusion.

🧹 Nitpick comments (2)
bio/reference/inphared-db/old_wrapper.py (2)

8-8: Remove unused import

The itertools.product import is never used in this script.

-from itertools import product
🧰 Tools
🪛 Ruff (0.8.2)

8-8: itertools.product imported but unused

Remove unused import: itertools.product

(F401)


47-52: Simplify nested if statements

The nested if statements can be combined for better readability.

-if chromosome:
-    if not datatype == "dna":
-        raise ValueError(
-            "invalid datatype, to select a single chromosome the datatype must be dna"
-        )
+if chromosome and datatype != "dna":
+    raise ValueError(
+        "invalid datatype, to select a single chromosome the datatype must be dna"
+    )
🧰 Tools
🪛 Ruff (0.8.2)

47-48: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)
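
To double-check that the combined condition behaves like the nested one, here is a small self-contained sketch. The function names are hypothetical, and the check is lifted out of old_wrapper.py purely for illustration.

```python
MESSAGE = "invalid datatype, to select a single chromosome the datatype must be dna"


def check_nested(chromosome, datatype):
    """Original form: two nested if statements."""
    if chromosome:
        if not datatype == "dna":
            raise ValueError(MESSAGE)


def check_combined(chromosome, datatype):
    """Suggested form: a single condition joined with `and`."""
    if chromosome and datatype != "dna":
        raise ValueError(MESSAGE)


def raises(check, chromosome, datatype) -> bool:
    """Report whether `check` rejects the given combination."""
    try:
        check(chromosome, datatype)
    except ValueError:
        return True
    return False
```

Both variants only raise when a chromosome is requested together with a datatype other than "dna".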

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 668257f and d193345.

📒 Files selected for processing (8)
  • bio/reference/inphared-db/environment.yaml (1 hunks)
  • bio/reference/inphared-db/meta.yaml (1 hunks)
  • bio/reference/inphared-db/old_wrapper.py (1 hunks)
  • bio/reference/inphared-db/test/Snakefile (1 hunks)
  • bio/reference/inphared-db/test/config.yaml (1 hunks)
  • bio/reference/inphared-db/test/old_release.smk (1 hunks)
  • bio/reference/inphared-db/test/old_snakefile.smk (1 hunks)
  • bio/reference/inphared-db/wrapper.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.

  • bio/reference/inphared-db/wrapper.py
  • bio/reference/inphared-db/old_wrapper.py
**/wrapper.py: Do not complain about use of undefined variable called snakemake.

  • bio/reference/inphared-db/wrapper.py
🪛 YAMLlint (1.35.1)
bio/reference/inphared-db/meta.yaml

[error] 2-2: trailing spaces

(trailing-spaces)

🪛 Ruff (0.8.2)
bio/reference/inphared-db/wrapper.py

8-8: SyntaxError: Unexpected indentation


8-9: SyntaxError: Expected an expression


9-9: SyntaxError: Unexpected indentation

bio/reference/inphared-db/old_wrapper.py

8-8: itertools.product imported but unused

Remove unused import: itertools.product

(F401)


11, 12, 13, 19, 20, 22, 29, 30 (each line separately): Undefined name snakemake

(F821)


47-48: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)

🪛 GitHub Actions: Code quality
bio/reference/inphared-db/wrapper.py

[error] 8-8: Black formatting check failed: Cannot parse file at line 8. File could not be formatted.

bio/reference/inphared-db/test/Snakefile

[error] 1-1: WorkflowError: configfile 'config.yaml' is defined but not present or accessible at /home/runner/work/snakemake-wrappers/snakemake-wrappers/config.yaml.

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: testing
🔇 Additional comments (2)
bio/reference/inphared-db/environment.yaml (1)

1-5: Environment configuration looks appropriate.

The environment configuration correctly specifies conda-forge with nodefaults and includes the curl dependency, which is necessary for downloading from remote URLs.

bio/reference/inphared-db/test/config.yaml (1)

1-9: Verify if this configuration file is necessary.

One of the PR comments questioned if this configuration file is truly necessary. Consider whether these parameters could be directly specified in the Snakefile rules or if this level of configuration provides valuable flexibility to users.

Could you clarify the purpose of having a separate config file rather than hardcoding these parameters in the wrapper or allowing them to be passed directly as rule parameters?

@@ -0,0 +1,12 @@
configfile: "config.yaml"
Contributor

⚠️ Potential issue

Fix the pipeline failure by ensuring config.yaml exists

The pipeline is failing because the config.yaml file is missing or inaccessible. Make sure the config file exists in the same directory as this Snakefile and has the correct permissions.

#!/bin/bash
# Check if config.yaml exists in the same directory as this Snakefile
if [ -f "bio/reference/inphared-db/test/config.yaml" ]; then
  echo "config.yaml exists"
else
  echo "config.yaml does not exist in the expected location"
fi
🧰 Tools
🪛 GitHub Actions: Code quality

[error] 1-1: WorkflowError: configfile 'config.yaml' is defined but not present or accessible at /home/runner/work/snakemake-wrappers/snakemake-wrappers/config.yaml.
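
The path in the error message points to the repository root rather than to the test folder. One possible reading (an assumption, not confirmed anywhere in this thread) is that the check invoked Snakemake from the repository root and that a relative `configfile:` path was resolved against the working directory rather than against the Snakefile's folder. The hypothetical helper below only illustrates that resolution rule. It does not call Snakemake.

```python
from pathlib import Path


def resolve_configfile(configfile: str, workdir: str) -> Path:
    """Resolve a configfile path against the working directory.

    Hypothetical helper: absolute paths are kept as they are, relative
    paths are joined onto `workdir`.
    """
    path = Path(configfile)
    return path if path.is_absolute() else Path(workdir) / path


# Run from the repository root, the relative path misses the test folder.
repo_root = "/home/runner/work/snakemake-wrappers/snakemake-wrappers"
from_root = resolve_configfile("config.yaml", repo_root)

# Run from the test folder, the same relative path finds the file.
test_dir = repo_root + "/bio/reference/inphared-db/test"
from_test_dir = resolve_configfile("config.yaml", test_dir)
```

Under that reading, the file may well exist next to the Snakefile, and the failure would depend on the directory the check is started from.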

release="75",
chromosome="I",
log:
"logs/get_genome.log",
Contributor

🛠️ Refactor suggestion

Use separate log files for different rules

Both get_genome and get_chromosome rules use the same log file, which could cause conflicts if both rules run concurrently.

-    log:
-        "logs/get_genome.log",
+    log:
+        "logs/get_chromosome.log",

release="98",
log:
"logs/get_genome.log",
cache: "mit-software" # save space and time with between workflow caching (see docs)
Contributor

⚠️ Potential issue

Fix typo in cache parameter

There appears to be a typo in the cache parameter. This rule uses "mit-software" while the other rule uses "omit-software".

-    cache: "mit-software"  # save space and time with between workflow caching (see docs)
+    cache: "omit-software"  # save space and time with between workflow caching (see docs)

chromosome="I", # optional: restrict to chromosome
# branch="plants", # optional: specify branch
log:
"logs/get_genome.log",
Contributor

🛠️ Refactor suggestion

Use separate log files for different rules

Both get_genome and get_chromosome rules use the same log file, which could cause conflicts if both rules run concurrently.

-    log:
-        "logs/get_genome.log",
+    log:
+        "logs/get_chromosome.log",

3 participants