feat: update more keywords and highlighting #42

Merged: 4 commits, Apr 3, 2025

Conversation

Hocnonsense (Contributor) commented Apr 1, 2025

This makes Snakefiles more colorful in VS Code.

However, when these keywords appear inside function calls, they lose their color...

image

Summary by CodeRabbit

  • New Features
    • Expanded language support with enhanced syntax highlighting and keyword recognition for improved editing of workflow definitions.
  • Refactor
    • Reorganized configuration schemas and updated matching criteria for more robust rule parsing and a streamlined user experience.
  • Style
    • Standardized text formatting across language definitions for consistent visual presentation.

coderabbitai bot commented Apr 1, 2025

Walkthrough

This pull request refines and expands the handling of Snakemake syntax. It updates regex patterns to recognize additional keywords and parameters across multiple language configuration files, enhances syntax highlighting rules with new patterns and repository entries, and overhauls configuration schemas by introducing new sections and properties. One JSON configuration file is removed and replaced with an updated structure. These modifications streamline keyword recognition and improve the structured parsing of Snakemake constructs.

Changes

| Files | Change Summary |
| --- | --- |
| languages/snakemake.json, src/yaml/snakemake.language.yaml | Expanded regex in onEnterRules to include additional keywords (e.g., module, envvars, include, workdir); string literals converted from single to double quotes. |
| src/yaml/snakemake.syntax.yaml, syntaxes/snakemake.tmLanguage.json | Enhanced syntax highlighting by adding new patterns and repository entries for modules, user rules, run parameters, module parameters, classes, objects, and shell blocks. |
| src/js/keywords-regex.json, src/keywords.yaml | Overhauled configuration schema: removed old definitions and introduced new sections (e.g., modules, rulerunparams, moduleparams, classes, objects, ruleargs) along with expanded keyword and function listings. |

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant Editor
  participant Parser
  participant Highlighter

  User->>Editor: Types in a Snakemake file
  Editor->>Parser: Triggers onEnterRules for input processing
  Parser->>Parser: Applies expanded regex to match new keywords
  Parser->>Highlighter: Identifies relevant syntax elements
  Highlighter->>Editor: Applies enhanced syntax highlighting

Possibly related PRs

Suggested labels

good first issue


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
languages/snakemake.json (1)

96-101: Expanded beforeText Regex in OnEnterRules
The updated regular expression now contains a comprehensive list of Snakemake keywords and parameters, which should enhance the auto-indentation behavior when editing Snakefiles. However, note that several keywords (e.g. conda, singularity, container, containerized, wildcard_constraints, and name) are repeated. Although duplicate entries do not hinder functionality, removing them may simplify future maintenance.
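
The duplicate entries mentioned above could be dropped at the point where the alternation is assembled. The following is a minimal sketch in Python, assuming the keywords are available as a plain list. The function name `build_alternation`, the sample list, and the shape of the `before_text` pattern are hypothetical and are not taken from the repository; the sketch also uses Python's `re` module rather than the regex engine VS Code uses.

```python
import re


def build_alternation(keywords):
    """Join keywords into a regex alternation, dropping repeats but keeping order."""
    unique = list(dict.fromkeys(keywords))  # dicts preserve insertion order
    return "|".join(re.escape(k) for k in unique)


# Hypothetical list containing the kind of repeats the review comment mentions.
keywords = ["conda", "container", "name", "conda", "wildcard_constraints", "name"]
alternation = build_alternation(keywords)

# Assumed shape of an onEnterRules-style beforeText check: keyword, colon, end of line.
before_text = re.compile(rf"^\s*({alternation})\s*:\s*$")

print(alternation)
print(bool(before_text.match("    conda:")))
```

Deduplicating at build time keeps the generated pattern short without changing which lines it matches.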

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24d7afc and 2a17f79.

📒 Files selected for processing (7)
  • languages/snakemake.json (1 hunks)
  • package.json (0 hunks)
  • src/js/build_regex.js (0 hunks)
  • src/js/keywords-regex.json (0 hunks)
  • src/yaml/snakemake.snippets.yaml (0 hunks)
  • src/yaml/snakemake.syntax.yaml (0 hunks)
  • syntaxes/snakemake.tmLanguage.json (3 hunks)
💤 Files with no reviewable changes (5)
  • package.json
  • src/js/build_regex.js
  • src/yaml/snakemake.syntax.yaml
  • src/yaml/snakemake.snippets.yaml
  • src/js/keywords-regex.json
🔇 Additional comments (21)
syntaxes/snakemake.tmLanguage.json (21)

9-17: Enhanced Quote Handling Patterns Added
The new inclusion entries for #quotessmall, #quotesmid, and #quotesbig in the main patterns array will help improve syntax highlighting for different quote styles. Please verify that the recursive inclusion of these patterns (e.g. including #quotessmall within itself) is intentional and does not adversely affect performance or cause unintended matches.


24-35: New Module and Rule Alias Patterns Introduced
The additions for #modules, #useruleas, #userulefromas, and #userulesfrom extend the language support for module-related syntax and user-defined rule aliases. Ensure that these new patterns correctly capture the intended constructs without conflicting with existing rules.


39-56: Extended Rule & Module Parameters and Entity Inclusions
The new inclusions for #rulerunparams, #moduleparams, #classes, #object, #ruleargargs, and #ruleargs broaden the scope of recognized Snakemake constructs. Their regex patterns appear well structured—please validate them against a representative sample of Snakefiles to confirm accurate highlighting.


65-97: Robust Repository Definition for quotessmall
The repository block for "quotessmall" is detailed with nested inclusions (e.g. for quotes and class/object patterns), which should enhance the handling of nested quote blocks. Keep an eye on potential performance issues arising from deep recursion in complex files.


98-130: Detailed Repository for quotesmid
The "quotesmid" block mirrors the structure of "quotessmall", ensuring consistent handling of mid-level quotes. Consider adding inline documentation if future maintenance reveals that the nested structure becomes challenging to manage.


131-163: Comprehensive Repository for quotesbig
The "quotesbig" repository entry is consistent with its smaller counterparts. Verify that the begin ("\\{") and end ("\\}") markers do not conflict with other JSON or code block patterns in the target files.
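
Since the three quote entries include one another, the amount of recursion depends on how deeply brackets nest in a file. The sketch below estimates that depth, assuming that quotessmall, quotesmid, and quotesbig correspond to `()`, `[]`, and `{}` (only the `{`/`}` pair is stated in the comments above). The function name `max_nesting_depth` is hypothetical, and the scanner deliberately ignores string literals and comments.

```python
# Assumed mapping of the three repository entries to bracket pairs.
PAIRS = {"(": ")", "[": "]", "{": "}"}


def max_nesting_depth(text):
    """Return the deepest level of bracket nesting found in text."""
    stack = []    # closing brackets we are still waiting for
    deepest = 0
    for ch in text:
        if ch in PAIRS:
            stack.append(PAIRS[ch])
            deepest = max(deepest, len(stack))
        elif stack and ch == stack[-1]:
            stack.pop()
    return deepest


print(max_nesting_depth("expand('{s}.txt', s=[1, (2)])"))
```

Running a check like this over a few large Snakefiles gives a rough sense of how deep the recursive includes would actually go in practice.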


164-171: Expanded Configuration Keywords in configs Section
The regex in the "configs" section now covers a broader set of keywords (e.g. envvars, include, workdir, etc.). This expansion should improve detection of configuration blocks.


172-182: Rules Section Updated for Checkpoints and Rule Declarations
Restricting this section to detect only checkpoint and rule keywords is appropriate, especially if constructs like subworkflow are now handled by the modules section.


183-193: Modules Section Implementation is Clear
The new "modules" block, which matches both module and subworkflow keywords, is a logical addition. Confirm that the capture groups yield the desired entity names in all cases.


194-210: User-Defined Rule Alias (useruleas) Pattern Introduced
The regex for matching patterns like use rule <name> as <alias> appears correct and enhances clarity when aliasing rules. Testing with edge cases is advised.


211-233: Enhanced userulefromas Pattern for Extended Aliasing
The new pattern covering use rule <rule> from <module> as <alias> improves the expressiveness of rule references. Verify that the capture groups are in the correct order to avoid misinterpretation.


234-250: Concise userulesfrom Pattern for Direct Rule Referencing
This pattern simplifies the syntax for references formatted as use rule <rule> from <module>. It is succinct and appears well-targeted for its purpose.
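
One way to check capture-group order across the three `use rule` forms is to express them with named groups and inspect the result. This is only a sketch in Python's `re` module, not the patterns in the grammar; the group names `rule`, `module`, and `alias` and the helper `parse_use_rule` are hypothetical.

```python
import re

# Hypothetical combined pattern for: use rule <rule> [from <module>] [as <alias>] [with:]
USE_RULE = re.compile(
    r"^\s*use\s+rule\s+(?P<rule>[\w*]+(?:\s*,\s*[\w*]+)*)"
    r"(?:\s+from\s+(?P<module>\w+))?"
    r"(?:\s+as\s+(?P<alias>[\w*]+))?"
    r"(?:\s+with)?\s*:?\s*$"
)


def parse_use_rule(line):
    """Return the named parts of a use-rule statement, or None if it does not match."""
    match = USE_RULE.match(line)
    return match.groupdict() if match else None


print(parse_use_rule("use rule * from other_workflow as other_*"))
print(parse_use_rule("use rule a as b with:"))
```

Named groups make a swapped module/alias capture visible immediately, which is harder to spot with numbered groups.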


251-258: Consistent Rule Parameters Capture Block
The "ruleparams" block leverages a comprehensive list of keywords to recognize rule parameters. No issues detected here.


259-266: New Runner Parameters Block (rulerunparams) Added
Incorporating keywords like run, shell, script, etc., this block will help in correctly highlighting the execution stage of rules. Validate that these patterns do not accidentally overlap with other constructs.


267-274: Module Parameters Section Enhances Keyword Recognition
The "moduleparams" block now supports parameters such as snakefile, meta_wrapper, and others. The regex appears well-considered.


275-282: Accurate Class Names Recognition in classes Block
The matching for Path and WorkflowError is clear and should suffice for current needs. Revisit if additional classes require highlighting later.


283-290: Effective Object Type Matching
The "object" block appropriately captures keywords like snakemake, rules, workflow, etc. Testing against typical Snakefiles is recommended to ensure accuracy.


291-300: Robust Pattern for Rule Argument Attributes
The regex under "ruleargargs" uses dot notation effectively to capture variable attributes. This approach should be resilient for most use cases.


302-309: Concise Rule Arguments Matching
The "ruleargs" block captures the intended keywords for rule arguments succinctly.


310-317: Enhanced Built-in Functions Recognition
The updated list of functions (e.g. shell, github, gitlab, etc.) supports a more comprehensive syntax highlighting experience.


318-326: New Shell Block Captures Multi-line Shell Commands
The "shell_block" section, using triple quotes for delimiters, should robustly handle shell commands embedded in Snakefiles.

@Hocnonsense Hocnonsense added the "help wanted" label (Extra attention is needed) Apr 2, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/yaml/snakemake.language.yaml (1)

38-40: Enhanced onEnterRules Regex Pattern
The regex for the onEnterRules has been significantly expanded to include many additional keywords (e.g. workdir, configfile, pepfile, etc.). Please double-check that all added keywords trigger the intended action and that the pattern’s complexity does not impact performance.

src/yaml/snakemake.syntax.yaml (1)

7-24: Expanded Patterns for Snakemake Syntax
The patterns array has been significantly expanded to include additional entries such as #quotessmall, #quotesmid, #quotesbig, #modules, #useruleas, #userulefromas, #userulesfrom, #rulerunparams, and #moduleparams. This comprehensive inclusion will improve recognition of various constructs.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a6938fb and 1561535.

📒 Files selected for processing (5)
  • src/js/keywords-regex.json (1 hunks)
  • src/keywords.yaml (1 hunks)
  • src/yaml/snakemake.language.yaml (3 hunks)
  • src/yaml/snakemake.syntax.yaml (3 hunks)
  • syntaxes/snakemake.tmLanguage.json (3 hunks)
🔇 Additional comments (48)
src/yaml/snakemake.language.yaml (4)

2-2: Consistent Comment String Formatting
The change to using double quotes for the lineComment value improves consistency with the rest of the file.


6-8: Brackets Configuration Updated
The brackets list now uses double quotes consistently. This promotes uniformity in string literals, and the entries are clear and correct.


11-13: AutoClosingPairs Consistency
The autoClosingPairs array now consistently uses double quotes and lists all expected bracket pairs along with additional language-specific pairs.


25-27: SurroundingPairs Updated
The surroundingPairs section has been updated to use consistent double quotes. This makes the configuration clearer and easier to maintain.

src/js/keywords-regex.json (1)

1-12: Comprehensive Update of Keyword Regex Configuration
The JSON object now includes expanded properties such as "modules", "rulerunparams", "moduleparams", "classes", "objects", and "ruleargs". These updates align with the changes in other syntax and configuration files, improving the structured parsing of Snakemake constructs. Please verify these regex patterns capture all intended keywords as expected.

src/yaml/snakemake.syntax.yaml (16)

27-67: Enhanced Repository Patterns for Quotes and Configuration
The repository section now defines detailed patterns for quotessmall, quotesmid, and quotesbig, and also includes references to #classes, #object, and rule argument patterns. Double-check that the recursive includes do not incur performance issues on large files.


68-75: Configs Syntax Pattern Validation
The regex under the configs section effectively uses the {{configs}} placeholder to capture a wide range of configuration keywords. Ensure that the substitution happens correctly at runtime.
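
The placeholder substitution could be verified with a small expansion step like the one below. This is a sketch under assumptions: the file list names a build script (src/js/build_regex.js), but its behavior is not shown in this conversation, so the function `expand_placeholders`, the dictionary layout, and the template are hypothetical Python stand-ins.

```python
import re


def expand_placeholders(template, keyword_lists):
    """Replace each {{name}} in template with an alternation of that keyword list."""

    def replace(match):
        name = match.group(1)
        if name not in keyword_lists:
            # Fail loudly instead of leaving a literal {{name}} in the pattern.
            raise KeyError(f"no keyword list named {name!r}")
        return "|".join(re.escape(k) for k in keyword_lists[name])

    return re.sub(r"\{\{(\w+)\}\}", replace, template)


# Keyword lists taken from the sections named in this review.
keyword_lists = {
    "rules": ["checkpoint", "rule"],
    "modules": ["module", "subworkflow"],
}
template = r"^\s*({{modules}})\b"
expanded = expand_placeholders(template, keyword_lists)
print(expanded)
```

Raising on an unknown placeholder catches a misspelled section name before it reaches the generated grammar.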


76-86: Rules Pattern Update
The rules pattern correctly captures a leading keyword (either checkpoint or rule) and optionally a rule name before the colon. This implementation is clear and maintainable.
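
The behavior described here can be illustrated with a small Python sketch. The expression below is an approximation written for this example, not the pattern from the YAML file, and the helper name `parse_rule_header` is hypothetical.

```python
import re

# Approximation: leading whitespace, keyword, optional rule name, then a colon.
RULE_HEADER = re.compile(r"^(\s*)(checkpoint|rule)(?:\s+([A-Za-z_]\w*))?\s*:")


def parse_rule_header(line):
    """Return (keyword, name) for a rule or checkpoint header, else None."""
    match = RULE_HEADER.match(line)
    if match is None:
        return None
    return match.group(2), match.group(3)


print(parse_rule_header("rule align_reads:"))
print(parse_rule_header("rule:"))
print(parse_rule_header("ruleorder: a > b"))
```

The last call shows why the colon must follow the keyword or name directly: a longer keyword such as ruleorder, which this review lists under configs, should not be picked up as a rule declaration.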


87-97: Modules Pattern Addition
The new modules pattern for capturing module and subworkflow keywords (optionally with a following name) is a good extension for improved syntax recognition.


98-109: Useruleas Pattern Refinement
The pattern for matching constructs like “use rule … as … with” is detailed and clearly defines capture groups for each keyword segment. Verify this pattern against real-world usage cases.


110-121: Userulefromas Pattern Confirmation
This regex covers the case of importing rules with aliasing (including wildcards), and it appears robust.


122-131: Userulesfrom Pattern Validation
This simpler pattern for “use rule … from …” is straightforward and correctly captures the expected groups.


132-140: Rule Parameters Pattern
The use of the {{ruleparams}} placeholder in the regex for rule parameters is clear and adequately captures the increased set of keywords.


141-149: Rulerunparams Pattern Addition
The new regex for runtime parameters (rulerunparams) is well integrated and uses an intuitive capture structure.


150-157: Moduleparams Pattern Addition
The module-specific parameters now have their own regex pattern, ensuring consistency with the overall design.


158-162: Classes Pattern Definition
The classes pattern, which captures keywords like those defined in {{classes}}, is concise and effective for highlighting class names.


163-167: Objects Pattern Definition
The objects pattern correctly targets workflow objects based on the {{objects}} list, ensuring proper syntax highlighting.


168-173: Rule Argument Accessor Pattern
The pattern to capture object properties within rule arguments (e.g. x.y) is implemented clearly and should aid in detailed highlighting.
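
As an illustration of this accessor matching, the sketch below pairs a rule argument with the attribute that follows the dot. It uses Python's `re` module and a shortened keyword list (input, output, params, which appear in the ruleargs section of this review); the names `RULE_ARGS` and `ATTRIBUTE_ACCESS` are hypothetical.

```python
import re

# Shortened list of rule arguments; the full list lives in the keywords file.
RULE_ARGS = ["input", "output", "params"]

# Group 1 is the rule argument, group 2 is the attribute after the dot.
ATTRIBUTE_ACCESS = re.compile(rf"\b({'|'.join(RULE_ARGS)})\.([A-Za-z_]\w*)")

command = "bwa mem {input.ref} {input.reads} > {output.bam}"
pairs = ATTRIBUTE_ACCESS.findall(command)
print(pairs)
```

The leading `\b` keeps identifiers such as `my_input.ref` from being treated as rule argument access.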


174-178: Rule Arguments Pattern Simplicity
The standalone rule arguments pattern is straightforward and effective at matching the expected tokens.


179-183: Functions Pattern Update
The functions regex now supports a larger set of built-in functions and uses a negative lookahead to avoid matching assignments. This is a strong improvement over previous versions.
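
The effect of the negative lookahead can be seen in a short Python sketch. The function list is a small subset taken from the names mentioned in this review, and the expression is an approximation for illustration rather than the grammar's actual pattern. Note that this simple form also skips a name followed by `==`.

```python
import re

# Subset of built-in names; a name followed by "=" is treated as a variable.
FUNCTION_NAME = re.compile(r"\b(github|gitlab|gitfile|glob_wildcards|shell)\b(?!\s*=)")


def highlighted_functions(line):
    """Return the built-in function names that would be highlighted in line."""
    return FUNCTION_NAME.findall(line)


print(highlighted_functions("samples = glob_wildcards('data/{s}.fq')"))
print(highlighted_functions("github = None"))
```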


184-189: Shell Block Pattern for Multi-line Commands
The new shell_block section clearly defines boundaries for shell commands using triple double-quotes. Please verify that the pattern handles various indentation levels and line break scenarios in actual Snakefiles.
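
A quick way to exercise different indentation levels is to run a stand-in pattern over a sample Snakefile. The sketch below is hedged: TextMate grammars use separate begin and end rules, whereas this example emulates them with a single Python regex in DOTALL mode, and it assumes the block is introduced by `shell:`. The names `SHELL_BLOCK` and `shell_commands` are hypothetical.

```python
import re
import textwrap

# Single-regex stand-in for a begin/end rule delimited by triple double-quotes.
SHELL_BLOCK = re.compile(r'^[ \t]*shell:\s*"""(.*?)"""', re.DOTALL | re.MULTILINE)


def shell_commands(snakefile_text):
    """Return the dedented body of every triple-quoted shell block."""
    return [
        textwrap.dedent(body).strip()
        for body in SHELL_BLOCK.findall(snakefile_text)
    ]


SNAKEFILE = '''
rule shallow:
    shell: """echo one"""

rule deep:
    input: "a.txt"
    shell:
        """
        sort {input} > sorted.txt
        wc -l sorted.txt
        """
'''

for command in shell_commands(SNAKEFILE):
    print(repr(command))
```

The sample covers both a one-line block and a block whose delimiters sit on their own indented lines, the two layouts most likely to differ in highlighting.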

src/keywords.yaml (10)

1-24: Updated Configs List in Keywords
The configs list has been revised to include new entries such as workdir, configfile, pepfile, pepschema, report, ruleorder, and many others. This expanded list will improve the parser’s ability to recognize all relevant configuration keywords.


25-28: Rules Section Remains Stable
The rules section still only includes checkpoint and rule, which is appropriate given the current design.


29-32: New Modules Section
The addition of a modules section listing module and subworkflow is a valuable extension that aligns with the updated configuration schema.


33-67: Expanded Rule Parameters List
The ruleparams section now includes an expanded set of keywords (e.g. threads, log, message, shadow, etc.), which enhances the expressiveness of the syntax. Make sure these changes are reflected in the corresponding regex patterns used elsewhere.


68-76: New Rulerunparams Section
Introducing the rulerunparams section with values like run, shell, script, notebook, and others provides clear categorization of runtime parameters.


77-85: New Moduleparams Section
The moduleparams section is now defined to capture module-specific parameters. This ensures that settings like snakefile and meta_wrapper are singled out for proper handling.


86-89: Classes Definition Update
The new classes list (Path, WorkflowError) improves clarity for class-based constructs in Snakemake and is consistent with changes across the project.


90-99: Enhanced Objects List
The objects section now covers a broader set of workflow-related terms such as snakemake, rules, workflow, etc., which will support more precise syntax highlighting.


100-109: Rule Arguments Definition
The ruleargs section is clear and well-organized, listing necessary elements like input, output, params, etc., and supporting proper parsing of rule-related properties.


110-144: Expanded Functions List
The functions section now includes many additional entries such as github, gitlab, gitfile, from_queue, glob_wildcards, and more. This significantly improves the coverage of built-in functions and utilities in Snakemake.

syntaxes/snakemake.tmLanguage.json (17)

9-63: Enhanced Patterns Array in tmLanguage JSON
The main patterns array now includes additional entries for advanced constructs (e.g. #quotessmall, #quotesmid, #quotesbig, #modules, #useruleas, #userulefromas, #userulesfrom, #rulerunparams, #moduleparams, #classes, #object, #ruleargargs, and #ruleargs). This expansion offers granular control over syntax highlighting and is consistent with other configuration updates.


64-163: Detailed Repository Entries for Quotes and Classes
The repository section now clearly defines nested patterns for quotessmall, quotesmid, and quotesbig, including multiple levels of includes (e.g. references to #classes, #object, etc.). Please verify that the nested structure does not negatively impact editor performance for large files.


164-171: Updated Configs Regex in Repository
The configs repository entry now uses an expanded regex that captures an extended list of configuration keywords. Make sure the regex remains readable and maintainable as changes continue.


172-182: Rules Regex in Repository
The rules entry regex correctly captures the leading whitespace, the keyword (either checkpoint or rule), and an optional rule name. This is a clear and concise implementation.


183-193: Modules Regex Update
The addition of the modules regex for capturing module and subworkflow keywords is effective and consistent with the changes in the YAML configuration.


194-213: Useruleas Repository Entry Check
The regex for the useruleas entry, which matches the “use rule … as … with” construct, is detailed and well segmented. Testing with real examples is recommended.


214-236: Userulefromas Repository Entry
The repository entry for userulefromas effectively captures the more complex pattern of importing rules with aliasing and wildcards.


237-253: Userulesfrom Repository Entry
This simpler pattern for “use rule … from …” is straightforward and correctly implemented.


254-261: Ruleparams Repository Regex
The ruleparams regex entry now accommodates a comprehensive list of keywords. Ensure that this list remains in sync with the definitions in the YAML configuration.
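
A sync check along these lines could flag drift between the keyword list and the generated regex. This is a minimal sketch that assumes the regex alternation is a flat `a|b|c` string with no nested groups; the function name `keyword_drift` and the sample values are hypothetical.

```python
def keyword_drift(regex_alternation, keyword_list):
    """Compare a flat a|b|c alternation with a keyword list and report differences."""
    in_regex = set(regex_alternation.split("|"))
    in_list = set(keyword_list)
    return {
        "missing_from_regex": sorted(in_list - in_regex),
        "missing_from_list": sorted(in_regex - in_list),
    }


# Sample values only; real inputs would come from the YAML and JSON files.
report = keyword_drift("threads|log|message", ["threads", "log", "shadow"])
print(report)
```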


262-269: Rulerunparams Repository Regex
The new rulerunparams entry accurately captures runtime parameters. Its structure is consistent with the YAML changes.


270-277: Moduleparams Repository Regex
The moduleparams repository entry is correctly implemented to capture module-specific parameters.


278-285: Classes Regex in Repository
The regex for classes neatly matches the provided keywords (e.g. Path and WorkflowError) and uses a negative lookahead to avoid assignment conflicts.


286-293: Objects Repository Regex
The objects regex captures the intended workflow-related terms and is consistent with the overall design.


294-304: Ruleargargs Repository Regex
The pattern for capturing property access within rule arguments (e.g. input.foo) is implemented clearly and will aid in granular highlighting.


305-312: Ruleargs Repository Regex
This regex matches rule arguments without dot notation in a straightforward manner. No issues detected.


313-320: Functions Repository Regex Update
The updated functions regex includes a broader set of built-in function names and uses a negative lookahead to avoid misclassification. This is a solid improvement.


321-329: Shell Block Repository Regex
The shell_block entry clearly defines a start and end for multi-line shell command blocks using triple double-quotes. Please test with various indentation and formatting scenarios to ensure robustness.

@johanneskoester johanneskoester merged commit 8d455d5 into master Apr 3, 2025
4 checks passed
@johanneskoester johanneskoester deleted the fix/highlights branch April 3, 2025 12:44
@github-project-automation github-project-automation bot moved this from In review to Done in Snakemake Hackathon March 2025 Apr 3, 2025
@Hocnonsense Hocnonsense removed the "help wanted" label (Extra attention is needed) Apr 7, 2025
2 participants