
Add rule converter scripts for multi-IDE format generation. #14

Merged
aacarter1 merged 3 commits into main from feat/add-rule-converter-scripts on Jan 29, 2026

Conversation

@thomas-bartlett
Contributor

Adds Python scripts (src/) for converting unified markdown rules into IDE-specific formats (Cursor, Windsurf, Copilot, Agent Skills, Antigravity). Includes validation scripts for rule structure and version consistency, plus shared utilities for frontmatter parsing and language mappings.

Contributor

@aacarter1 aacarter1 left a comment


Looks good thanks!

@aacarter1 aacarter1 merged commit f2525df into main Jan 29, 2026
2 checks passed
Contributor

Copilot AI left a comment


Pull request overview

This pull request adds a comprehensive rule conversion system to transform unified markdown security rules into multiple IDE-specific formats (Cursor, Windsurf, Copilot, Agent Skills, and Antigravity). The implementation provides validation scripts, format converters, and shared utilities for language mappings and frontmatter parsing.

Changes:

  • Added Python conversion scripts that parse unified markdown rules with YAML frontmatter and generate IDE-specific formats
  • Implemented validation scripts for rule structure and version consistency across project files
  • Created shared utility modules for frontmatter parsing, language-to-extension mappings, and tag validation
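The first bullet describes rules as markdown files that open with a YAML frontmatter block. As a rough illustration of that layout (not the PR's actual parser, which is not shown in this conversation), the sketch below splits a rule into its frontmatter text and its markdown body. The function name `split_frontmatter` and the sample rule are hypothetical.

```python
from typing import Tuple


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Return (frontmatter_text, body).

    The frontmatter is '' when the text does not start with a '---' block
    or when the block is never closed.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Everything between the two '---' markers is frontmatter.
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    return "", text  # unterminated block: treat the whole text as body


# Hypothetical rule file content, for illustration only.
sample = "---\ndescription: Example rule\nalwaysApply: true\n---\n# Body\n"
frontmatter_text, body = split_frontmatter(sample)
```

The frontmatter text would then be handed to a YAML parser; that step is left out here to keep the sketch dependency-free.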

Reviewed changes

Copilot reviewed 15 out of 17 changed files in this pull request and generated 26 comments.

| File | Description |
| --- | --- |
| src/validate_versions.py | Version consistency validation across pyproject.toml, plugin.json, marketplace.json, and SKILL.md |
| src/validate_unified_rules.py | Validates unified rule files for correct YAML frontmatter and structure |
| src/utils.py | Shared utilities for frontmatter parsing, tag validation, and version reading |
| src/tag_mappings.py | Centralized list of known security rule tags |
| src/language_mappings.py | Language-to-extension mappings and glob pattern conversion functions |
| src/formats/base.py | Abstract base class defining the interface for all IDE format implementations |
| src/formats/cursor.py | Generates .mdc files for Cursor IDE with YAML frontmatter |
| src/formats/windsurf.py | Generates .md files for Windsurf IDE with trigger-based rules |
| src/formats/copilot.py | Generates .instructions.md files for GitHub Copilot |
| src/formats/agentskills.py | Generates .md files following the Agent Skills standard |
| src/formats/antigravity.py | Generates .md files for Google Antigravity with glob patterns |
| src/formats/__init__.py | Package initialization exposing all format classes |
| src/converter.py | Core conversion logic handling rule parsing and multi-format generation |
| src/convert_to_ide_formats.py | Main entry point script for converting rules to IDE bundles |
| .gitignore | Added Python-specific ignores and IDE-generated bundle exclusions |


Comment thread src/formats/windsurf.py
Comment on lines +46 to +51
# Use trigger: always_on for rules that should always apply
if rule.always_apply:
    yaml_lines.append("trigger: always_on")
else:
    yaml_lines.append("trigger: glob")
    yaml_lines.append(f"globs: {globs}")

Copilot AI Jan 29, 2026

When rule.always_apply is false and rule.languages is empty, the globs value might be empty or "**/*". This could result in generating globs: with an empty or universal glob, which might not be the intended behavior for Windsurf format.

Suggested change
- # Use trigger: always_on for rules that should always apply
- if rule.always_apply:
-     yaml_lines.append("trigger: always_on")
- else:
-     yaml_lines.append("trigger: glob")
-     yaml_lines.append(f"globs: {globs}")
+ # Normalize globs to handle empty or universal patterns safely
+ normalized_globs = (globs or "").strip()
+ # Use trigger: always_on for rules that should always apply, or when
+ # no specific globs are provided (empty or universal patterns).
+ if rule.always_apply or not normalized_globs or normalized_globs in ('**/*', '"**/*"', "'**/*'"):
+     yaml_lines.append("trigger: always_on")
+ else:
+     yaml_lines.append("trigger: glob")
+     yaml_lines.append(f"globs: {normalized_globs}")

Copilot uses AI. Check for mistakes.
Comment on lines +36 to +39
has_languages = "languages" in frontmatter and frontmatter["languages"]
always_apply = frontmatter.get("alwaysApply", False)

if always_apply and has_languages:

Copilot AI Jan 29, 2026

This validation logic prevents rules with alwaysApply=true from having languages, but the converter's parse_rule method at line 147 sets languages = [] for such rules. This means the validation in line 39 would never trigger since the languages list would be empty, not present. Consider checking for an empty list as well.

Suggested change
- has_languages = "languages" in frontmatter and frontmatter["languages"]
- always_apply = frontmatter.get("alwaysApply", False)
- if always_apply and has_languages:
+ has_languages_key = "languages" in frontmatter
+ has_languages = has_languages_key and bool(frontmatter["languages"]) if has_languages_key else False
+ always_apply = frontmatter.get("alwaysApply", False)
+ if always_apply and has_languages_key:
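
One reading of this comment is that the check should key off the presence of `languages` rather than its truthiness, so that an explicit empty list is also flagged. A minimal sketch under that assumption, with the frontmatter already parsed into a dict; `check_always_apply` and the error message are hypothetical and not taken from the PR:

```python
from typing import Any, Dict, List


def check_always_apply(frontmatter: Dict[str, Any]) -> List[str]:
    """Return a list of error messages (empty when the frontmatter is consistent)."""
    errors: List[str] = []
    always_apply = bool(frontmatter.get("alwaysApply", False))
    # Flag the key whenever it is present, even if its value is an empty list.
    if always_apply and "languages" in frontmatter:
        errors.append("alwaysApply rules must not declare 'languages'")
    return errors


# An empty list is still reported because the key is present.
empty_list_errors = check_always_apply({"alwaysApply": True, "languages": []})
```

Whether an empty `languages` list should count as an error is a project decision; the sketch only shows how the two interpretations differ in code.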

Comment on lines +64 to +68
if rule.languages:
    # Format as YAML list
    yaml_lines.append("languages:")
    for lang in rule.languages:
        yaml_lines.append(f"- {lang}")

Copilot AI Jan 29, 2026

The languages list is manually formatted as YAML (lines 66-68), but if a language name contains special YAML characters (colons, quotes, etc.), it won't be properly escaped. Consider using yaml.safe_dump for the entire languages list to ensure proper escaping.
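
This comment comes without a suggested change. The sketch below shows one dependency-free way to quote each list item. It swaps in `json.dumps` for the `yaml.safe_dump` call the comment recommends, on the assumption that a JSON string is also accepted as a YAML double-quoted scalar for ordinary text. The helper names `yaml_quoted` and `format_languages` are hypothetical.

```python
import json
from typing import Iterable, List


def yaml_quoted(value: str) -> str:
    """Quote a string JSON-style; assumed here to be readable as a YAML double-quoted scalar."""
    return json.dumps(value)


def format_languages(languages: Iterable[str]) -> List[str]:
    """Build the 'languages:' block with every item quoted."""
    lines = ["languages:"]
    lines.extend(f"- {yaml_quoted(lang)}" for lang in languages)
    return lines


# Items containing a colon or quotes no longer leak into the YAML structure.
block = format_languages(["c++", "a: b", 'say "hi"'])
```

The trade-off is that every item is quoted, including plain ones such as `python`; `yaml.safe_dump` would quote only where needed.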

Comment thread src/formats/windsurf.py
Comment on lines +58 to +59
# Add version
yaml_lines.append(f"version: {self.version}")

Copilot AI Jan 29, 2026

The version value is inserted directly into YAML without proper escaping. If the version string contains special YAML characters, this could create invalid YAML. Consider using proper YAML escaping for the version value to ensure robustness.

Suggested change
- # Add version
- yaml_lines.append(f"version: {self.version}")
+ # Add version (use YAML-safe formatting)
+ version_line = self._format_yaml_field("version", str(self.version))
+ if version_line:
+     yaml_lines.append(version_line)

    yaml_lines.append("trigger: always_on")
else:
    yaml_lines.append("trigger: glob")
    yaml_lines.append(f"globs: {globs}")

Copilot AI Jan 29, 2026

When rule.always_apply is false and rule.languages is empty, the globs value might be empty or "**/*". This could result in generating globs: with an empty or universal glob in the Antigravity format.
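
No suggested change accompanies this comment. Since the same concern was raised for the Windsurf format, one option is a small shared helper. The sketch below is an assumption about how that could look, not code from the PR; `is_universal_glob`, `trigger_lines`, and the `UNIVERSAL_GLOBS` set are hypothetical names.

```python
from typing import List, Optional

# Patterns treated as "applies to every file" in this sketch.
UNIVERSAL_GLOBS = {"", "**/*"}


def is_universal_glob(globs: Optional[str]) -> bool:
    """True when the glob string is empty or matches all files (quotes ignored)."""
    normalized = (globs or "").strip().strip("\"'")
    return normalized in UNIVERSAL_GLOBS


def trigger_lines(always_apply: bool, globs: Optional[str]) -> List[str]:
    """Choose the trigger lines, falling back to always_on for universal globs."""
    if always_apply or is_universal_glob(globs):
        return ["trigger: always_on"]
    # globs is a non-empty, non-universal string at this point.
    return ["trigger: glob", f"globs: {globs.strip()}"]


scoped = trigger_lines(False, "**/*.py")
unscoped = trigger_lines(False, '"**/*"')
```

Falling back to `always_on` is only one possible policy; raising a validation error for a rule with neither languages nor `alwaysApply` would be an equally reasonable choice.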

Comment thread src/language_mappings.py
Comment on lines +125 to +132
pattern = pattern.strip().lower()

# Check for file extensions and patterns
for ext, lang in EXTENSION_TO_LANGUAGE.items():
    if ext.lower() in pattern:
        languages.add(lang)
        break  # One match per pattern is enough


Copilot AI Jan 29, 2026

The language check at line 129 uses ext.lower() but the EXTENSION_TO_LANGUAGE dictionary keys are already lowercase (from LANGUAGE_TO_EXTENSIONS values). The pattern matching should also handle wildcards in extensions (e.g., "Dockerfile*" from line 50) properly, but the current implementation may not correctly match these patterns.

Suggested change
- pattern = pattern.strip().lower()
- # Check for file extensions and patterns
- for ext, lang in EXTENSION_TO_LANGUAGE.items():
-     if ext.lower() in pattern:
-         languages.add(lang)
-         break  # One match per pattern is enough
+ # Normalize pattern for case-insensitive matching
+ pattern = pattern.strip().lower()
+ # Check for file extensions and patterns
+ for ext, lang in EXTENSION_TO_LANGUAGE.items():
+     # Normalize extension once for comparison
+     ext_lower = ext.lower()
+     if "*" in ext:
+         # Wildcard-style "extensions" (e.g., 'dockerfile*'):
+         # treat as substring patterns within the glob.
+         if ext_lower in pattern:
+             languages.add(lang)
+             break  # One match per pattern is enough
+     else:
+         # Normal file extensions (e.g., '.py'):
+         # require that the glob pattern end with the extension
+         # (possibly after a '*'), to avoid partial matches.
+         if pattern.endswith(ext_lower) or pattern.endswith(f"*{ext_lower}"):
+             languages.add(lang)
+             break  # One match per pattern is enough

Comment thread src/converter.py
Comment on lines +166 to +168
# Adding rule_id to the beginning of the content
rule_id = Path(filename).stem
markdown_content = f"rule_id: {rule_id}\n\n{markdown_content}"

Copilot AI Jan 29, 2026

The rule_id is extracted from the filename and prepended to the content at line 168. However, if the content already starts with "rule_id:", this would create a duplicate. Consider checking if the content already has a rule_id before adding it, or document that this should never happen.

Suggested change
- # Adding rule_id to the beginning of the content
- rule_id = Path(filename).stem
- markdown_content = f"rule_id: {rule_id}\n\n{markdown_content}"
+ # Adding rule_id to the beginning of the content, if not already present
+ rule_id = Path(filename).stem
+ stripped_content = markdown_content.lstrip()
+ if not stripped_content.startswith("rule_id:"):
+     markdown_content = f"rule_id: {rule_id}\n\n{markdown_content}"
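
The guard in this suggestion makes the step idempotent, which is easy to check in isolation. The sketch below wraps the same logic in a standalone function; `prepend_rule_id` and the sample file name are hypothetical.

```python
from pathlib import Path


def prepend_rule_id(filename: str, markdown_content: str) -> str:
    """Prefix the content with 'rule_id: <file stem>' unless it already starts with one."""
    if markdown_content.lstrip().startswith("rule_id:"):
        return markdown_content
    rule_id = Path(filename).stem
    return f"rule_id: {rule_id}\n\n{markdown_content}"


# Hypothetical rule file name, for illustration only.
once = prepend_rule_id("rules/example-rule.md", "# Title\n")
twice = prepend_rule_id("rules/example-rule.md", once)
```

Note that the guard only detects an existing prefix; it does not verify that the existing `rule_id` matches the file name.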

Comment thread src/formats/base.py
Comment on lines +125 to +131
yaml_dump = yaml.safe_dump(
    {field_name: value},
    default_flow_style=False,
    allow_unicode=True,
    width=float("inf")
)
return yaml_dump.strip()

Copilot AI Jan 29, 2026

The _format_yaml_field method uses yaml.safe_dump which will produce output like "field_name: value\n". When this is joined with other lines that may be simple strings like "globs: {globs}", it could create inconsistent formatting. Some fields will be properly escaped by yaml.safe_dump while others are simple f-strings, potentially creating YAML syntax errors if the f-string values contain special characters.

Suggested change
- yaml_dump = yaml.safe_dump(
-     {field_name: value},
-     default_flow_style=False,
-     allow_unicode=True,
-     width=float("inf")
- )
- return yaml_dump.strip()
+ # Use yaml.safe_dump to escape the scalar value, then format the field manually.
+ yaml_dump = yaml.safe_dump(
+     value,
+     default_flow_style=False,
+     allow_unicode=True,
+     width=float("inf"),
+ )
+ yaml_value = yaml_dump.strip()
+ return f"{field_name}: {yaml_value}"

Comment thread src/formats/cursor.py
yaml_lines.append(desc)

# Add globs and version
yaml_lines.append(f"globs: {globs}")

Copilot AI Jan 29, 2026

The globs value is inserted directly into the YAML without proper escaping. If globs contain YAML special characters (like colons in "**/*.py:text"), this could create invalid YAML. Consider using proper YAML escaping or quoting for the globs value, similar to how description is handled with _format_yaml_field.

Suggested change
- yaml_lines.append(f"globs: {globs}")
+ globs_field = self._format_yaml_field("globs", globs)
+ if globs_field:
+     yaml_lines.append(globs_field)

Comment thread src/formats/windsurf.py
    yaml_lines.append("trigger: always_on")
else:
    yaml_lines.append("trigger: glob")
    yaml_lines.append(f"globs: {globs}")

Copilot AI Jan 29, 2026

The globs value is inserted directly into YAML without proper escaping. If globs contain special YAML characters, this could create invalid YAML syntax. Consider using proper YAML escaping for the globs value.

Suggested change
- yaml_lines.append(f"globs: {globs}")
+ globs_field = self._format_yaml_field("globs", globs)
+ if globs_field:
+     yaml_lines.append(globs_field)

thschaffr pushed a commit to thschaffr/project-codeguard that referenced this pull request Jan 30, 2026
…ting-index-page

Revise terminology and enhance clarity in documentation
@thomas-bartlett thomas-bartlett deleted the feat/add-rule-converter-scripts branch February 5, 2026 17:44

Labels

enhancement New feature or request


3 participants