Add rule converter scripts for multi-IDE format generation. #14
Pull request overview
This pull request adds a comprehensive rule conversion system to transform unified markdown security rules into multiple IDE-specific formats (Cursor, Windsurf, Copilot, Agent Skills, and Antigravity). The implementation provides validation scripts, format converters, and shared utilities for language mappings and frontmatter parsing.
Changes:
- Added Python conversion scripts that parse unified markdown rules with YAML frontmatter and generate IDE-specific formats
- Implemented validation scripts for rule structure and version consistency across project files
- Created shared utility modules for frontmatter parsing, language-to-extension mappings, and tag validation
Reviewed changes
Copilot reviewed 15 out of 17 changed files in this pull request and generated 26 comments.
| File | Description |
|---|---|
| src/validate_versions.py | Version consistency validation across pyproject.toml, plugin.json, marketplace.json, and SKILL.md |
| src/validate_unified_rules.py | Validates unified rule files for correct YAML frontmatter and structure |
| src/utils.py | Shared utilities for frontmatter parsing, tag validation, and version reading |
| src/tag_mappings.py | Centralized list of known security rule tags |
| src/language_mappings.py | Language-to-extension mappings and glob pattern conversion functions |
| src/formats/base.py | Abstract base class defining the interface for all IDE format implementations |
| src/formats/cursor.py | Generates .mdc files for Cursor IDE with YAML frontmatter |
| src/formats/windsurf.py | Generates .md files for Windsurf IDE with trigger-based rules |
| src/formats/copilot.py | Generates .instructions.md files for GitHub Copilot |
| src/formats/agentskills.py | Generates .md files following the Agent Skills standard |
| src/formats/antigravity.py | Generates .md files for Google Antigravity with glob patterns |
| src/formats/__init__.py | Package initialization exposing all format classes |
| src/converter.py | Core conversion logic handling rule parsing and multi-format generation |
| src/convert_to_ide_formats.py | Main entry point script for converting rules to IDE bundles |
| .gitignore | Added Python-specific ignores and IDE-generated bundle exclusions |
| # Use trigger: always_on for rules that should always apply | ||
| if rule.always_apply: | ||
| yaml_lines.append("trigger: always_on") | ||
| else: | ||
| yaml_lines.append("trigger: glob") | ||
| yaml_lines.append(f"globs: {globs}") |
When rule.always_apply is false and rule.languages is empty, the globs value may be empty or "**/*". The output would then contain a globs: line with an empty or universal pattern, which is probably not the intended behavior for the Windsurf format.
| # Use trigger: always_on for rules that should always apply | |
| if rule.always_apply: | |
| yaml_lines.append("trigger: always_on") | |
| else: | |
| yaml_lines.append("trigger: glob") | |
| yaml_lines.append(f"globs: {globs}") | |
| # Normalize globs to handle empty or universal patterns safely | |
| normalized_globs = (globs or "").strip() | |
| # Use trigger: always_on for rules that should always apply, or when | |
| # no specific globs are provided (empty or universal patterns). | |
| if rule.always_apply or not normalized_globs or normalized_globs in ('**/*', '"**/*"', "'**/*'"): | |
| yaml_lines.append("trigger: always_on") | |
| else: | |
| yaml_lines.append("trigger: glob") | |
| yaml_lines.append(f"globs: {normalized_globs}") |
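The normalization proposed in this suggestion can be sketched as a standalone function. This is a minimal sketch, not the converter's actual code: `windsurf_trigger_lines`, `UNIVERSAL_GLOBS`, and the parameters are hypothetical names, and treating a universal glob as `always_on` is the reviewer's proposal rather than confirmed Windsurf behavior.

```python
from typing import List, Optional

# Hypothetical constant: spellings of the universal glob, quoted or not.
UNIVERSAL_GLOBS = {"**/*", '"**/*"', "'**/*'"}


def windsurf_trigger_lines(always_apply: bool, globs: Optional[str]) -> List[str]:
    """Return the trigger-related frontmatter lines for one rule."""
    normalized = (globs or "").strip()
    # An empty or universal glob carries no filtering information,
    # so fall back to always_on instead of emitting a bare "globs:" line.
    if always_apply or not normalized or normalized in UNIVERSAL_GLOBS:
        return ["trigger: always_on"]
    return ["trigger: glob", f"globs: {normalized}"]
```

Returning the lines instead of appending to `yaml_lines` keeps the decision testable in isolation; the caller can extend its own list with the result.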
| has_languages = "languages" in frontmatter and frontmatter["languages"] | ||
| always_apply = frontmatter.get("alwaysApply", False) | ||
| if always_apply and has_languages: |
This validation logic rejects rules that combine alwaysApply=true with languages, but the converter's parse_rule method at line 147 sets languages = [] for such rules. The check at line 39 would therefore never trigger when the languages key is present but the list is empty. Consider checking for an empty list as well.
| has_languages = "languages" in frontmatter and frontmatter["languages"] | |
| always_apply = frontmatter.get("alwaysApply", False) | |
| if always_apply and has_languages: | |
| has_languages_key = "languages" in frontmatter | |
| has_languages = has_languages_key and bool(frontmatter["languages"]) if has_languages_key else False | |
| always_apply = frontmatter.get("alwaysApply", False) | |
| if always_apply and has_languages_key: |
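The distinction this comment draws, between a `languages` key that is absent and one that is present but empty, can be illustrated with a small check. This is a sketch under assumptions: `check_always_apply` is a hypothetical function, and the frontmatter is assumed to be an already-parsed dictionary.

```python
from typing import Any, Dict, List


def check_always_apply(frontmatter: Dict[str, Any]) -> List[str]:
    """Return validation errors for the alwaysApply/languages combination."""
    errors: List[str] = []
    always_apply = bool(frontmatter.get("alwaysApply", False))
    has_key = "languages" in frontmatter
    has_values = bool(frontmatter.get("languages"))
    if always_apply and has_key:
        # Test for the key itself, so an explicit empty list is reported too.
        detail = "a non-empty" if has_values else "an empty"
        errors.append(f"alwaysApply is true but {detail} 'languages' list is present")
    return errors
```

Testing for the key rather than for truthiness is what makes `languages: []` visible to the validator.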
| if rule.languages: | ||
| # Format as YAML list | ||
| yaml_lines.append("languages:") | ||
| for lang in rule.languages: | ||
| yaml_lines.append(f"- {lang}") |
The languages list is manually formatted as YAML (lines 66-68), but if a language name contains special YAML characters (colons, quotes, etc.), it won't be properly escaped. Consider using yaml.safe_dump for the entire languages list to ensure proper escaping.
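The comment recommends yaml.safe_dump. As a dependency-free alternative, the sketch below swaps in a different technique: each item is quoted with `json.dumps`, which yields a double-quoted string that YAML parsers generally accept for typical language names. `format_languages` is a hypothetical helper, not part of the PR.

```python
import json
from typing import List


def format_languages(languages: List[str]) -> List[str]:
    """Render a 'languages' block with every item double-quoted."""
    lines = ["languages:"]
    for lang in languages:
        # json.dumps escapes quotes and backslashes and wraps the value in
        # double quotes, so colons or '#' inside the name stay literal.
        lines.append(f"- {json.dumps(lang)}")
    return lines
```

For example, `format_languages(["python", "c#: legacy"])` quotes both items, so the colon in the second name is not read as a mapping separator.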
| # Add version | ||
| yaml_lines.append(f"version: {self.version}") |
The version value is inserted directly into YAML without proper escaping. If the version string contains special YAML characters, this could create invalid YAML. Consider using proper YAML escaping for the version value to ensure robustness.
| # Add version | |
| yaml_lines.append(f"version: {self.version}") | |
| # Add version (use YAML-safe formatting) | |
| version_line = self._format_yaml_field("version", str(self.version)) | |
| if version_line: | |
| yaml_lines.append(version_line) |
| yaml_lines.append("trigger: always_on") | ||
| else: | ||
| yaml_lines.append("trigger: glob") | ||
| yaml_lines.append(f"globs: {globs}") |
When rule.always_apply is false and rule.languages is empty, the globs value may be empty or "**/*". The Antigravity output would then contain a globs: line with an empty or universal pattern.
| pattern = pattern.strip().lower() | ||
| # Check for file extensions and patterns | ||
| for ext, lang in EXTENSION_TO_LANGUAGE.items(): | ||
| if ext.lower() in pattern: | ||
| languages.add(lang) | ||
| break # One match per pattern is enough | ||
The language check at line 129 calls ext.lower(), but the EXTENSION_TO_LANGUAGE dictionary keys are already lowercase (they come from the LANGUAGE_TO_EXTENSIONS values). The matching should also handle wildcard entries (e.g., "Dockerfile*" from line 50), which the current substring test may not match correctly.
| pattern = pattern.strip().lower() | |
| # Check for file extensions and patterns | |
| for ext, lang in EXTENSION_TO_LANGUAGE.items(): | |
| if ext.lower() in pattern: | |
| languages.add(lang) | |
| break # One match per pattern is enough | |
| # Normalize pattern for case-insensitive matching | |
| pattern = pattern.strip().lower() | |
| # Check for file extensions and patterns | |
| for ext, lang in EXTENSION_TO_LANGUAGE.items(): | |
| # Normalize extension once for comparison | |
| ext_lower = ext.lower() | |
| if "*" in ext: | |
| # Wildcard-style "extensions" (e.g., 'dockerfile*'): | |
| # treat as substring patterns within the glob. | |
| if ext_lower in pattern: | |
| languages.add(lang) | |
| break # One match per pattern is enough | |
| else: | |
| # Normal file extensions (e.g., '.py'): | |
| # require that the glob pattern end with the extension | |
| # (possibly after a '*'), to avoid partial matches. | |
| if pattern.endswith(ext_lower) or pattern.endswith(f"*{ext_lower}"): | |
| languages.add(lang) | |
| break # One match per pattern is enough |
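The difference between substring matching and suffix matching can be shown with a small self-contained sketch. The mapping below is a hypothetical subset, `languages_for_glob` is an illustrative name, and, unlike the suggestion above, the sketch strips the trailing `*` from wildcard entries before comparing; that deviation is an assumption, not the PR's behavior.

```python
from typing import Dict, Set

# Hypothetical subset of the mapping; the real table is in language_mappings.py.
EXTENSION_TO_LANGUAGE: Dict[str, str] = {
    ".py": "python",
    ".js": "javascript",
    ".json": "json",
    "dockerfile*": "docker",
}


def languages_for_glob(pattern: str) -> Set[str]:
    """Infer languages from one glob pattern (at most one match)."""
    pattern = pattern.strip().lower()
    found: Set[str] = set()
    for ext, lang in EXTENSION_TO_LANGUAGE.items():
        ext = ext.lower()
        if "*" in ext:
            # Wildcard entry: compare its literal part as a substring.
            matched = ext.rstrip("*") in pattern
        else:
            # Plain extension: require a suffix match to avoid partial hits.
            matched = pattern.endswith(ext)
        if matched:
            found.add(lang)
            break  # One match per pattern is enough
    return found
```

With suffix matching, `**/*.pyc` no longer counts as Python and `**/*.json` is not mistaken for JavaScript. A single `endswith(ext)` also covers patterns ending in `*` followed by the extension, so a second `endswith` test would be redundant.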
| # Adding rule_id to the beginning of the content | ||
| rule_id = Path(filename).stem | ||
| markdown_content = f"rule_id: {rule_id}\n\n{markdown_content}" |
The rule_id is extracted from the filename and prepended to the content at line 168. However, if the content already starts with "rule_id:", this would create a duplicate. Consider checking if the content already has a rule_id before adding it, or document that this should never happen.
| # Adding rule_id to the beginning of the content | |
| rule_id = Path(filename).stem | |
| markdown_content = f"rule_id: {rule_id}\n\n{markdown_content}" | |
| # Adding rule_id to the beginning of the content, if not already present | |
| rule_id = Path(filename).stem | |
| stripped_content = markdown_content.lstrip() | |
| if not stripped_content.startswith("rule_id:"): | |
| markdown_content = f"rule_id: {rule_id}\n\n{markdown_content}" |
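The guard in this suggestion makes the prepend step idempotent. A minimal sketch, assuming a hypothetical helper named `prepend_rule_id` that takes the filename and content as plain strings:

```python
from pathlib import Path


def prepend_rule_id(filename: str, markdown_content: str) -> str:
    """Prefix the content with a rule_id line unless one already leads it."""
    rule_id = Path(filename).stem
    # Ignore leading whitespace so an indented or blank-padded header counts.
    if markdown_content.lstrip().startswith("rule_id:"):
        return markdown_content
    return f"rule_id: {rule_id}\n\n{markdown_content}"
```

Applying the function twice gives the same result as applying it once, which matters if a generated file is ever fed back through the converter.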
| yaml_dump = yaml.safe_dump( | ||
| {field_name: value}, | ||
| default_flow_style=False, | ||
| allow_unicode=True, | ||
| width=float("inf") | ||
| ) | ||
| return yaml_dump.strip() |
The _format_yaml_field method uses yaml.safe_dump, which produces output like "field_name: value\n". Other lines, such as "globs: {globs}", are plain f-strings. Mixing the two gives inconsistent formatting: fields that go through yaml.safe_dump are properly escaped, while f-string values containing special characters can produce YAML syntax errors.
| yaml_dump = yaml.safe_dump( | |
| {field_name: value}, | |
| default_flow_style=False, | |
| allow_unicode=True, | |
| width=float("inf") | |
| ) | |
| return yaml_dump.strip() | |
| # Use yaml.safe_dump to escape the scalar value, then format the field manually. | |
| yaml_dump = yaml.safe_dump( | |
| value, | |
| default_flow_style=False, | |
| allow_unicode=True, | |
| width=float("inf"), | |
| ) | |
| yaml_value = yaml_dump.strip() | |
| return f"{field_name}: {yaml_value}" |
| yaml_lines.append(desc) | ||
| # Add globs and version | ||
| yaml_lines.append(f"globs: {globs}") |
The globs value is inserted directly into the YAML without proper escaping. If globs contain YAML special characters (like colons in "**/*.py:text"), this could create invalid YAML. Consider using proper YAML escaping or quoting for the globs value, similar to how description is handled with _format_yaml_field.
| yaml_lines.append(f"globs: {globs}") | |
| globs_field = self._format_yaml_field("globs", globs) | |
| if globs_field: | |
| yaml_lines.append(globs_field) |
| yaml_lines.append("trigger: always_on") | ||
| else: | ||
| yaml_lines.append("trigger: glob") | ||
| yaml_lines.append(f"globs: {globs}") |
The globs value is inserted directly into YAML without proper escaping. If globs contain special YAML characters, this could create invalid YAML syntax. Consider using proper YAML escaping for the globs value.
| yaml_lines.append(f"globs: {globs}") | |
| globs_field = self._format_yaml_field("globs", globs) | |
| if globs_field: | |
| yaml_lines.append(globs_field) |
…ting-index-page Revise terminology and enhance clarity in documentation
Adds Python scripts (src/) for converting unified markdown rules into IDE-specific formats (Cursor, Windsurf, Copilot, Agent Skills, Antigravity). Includes validation scripts for rule structure and version consistency, plus shared utilities for frontmatter parsing and language mappings.