Add rule converter scripts for multi-IDE format generation. by thomas-bartlett · Pull Request #14 · cosai-oasis/project-codeguard

thomas-bartlett · 2026-01-29T19:56:32Z

Adds Python scripts (src/) for converting unified markdown rules into IDE-specific formats (Cursor, Windsurf, Copilot, Agent Skills, Antigravity). Includes validation scripts for rule structure and version consistency, plus shared utilities for frontmatter parsing and language mappings.

aacarter1

Looks good thanks!

Copilot

Pull request overview

This pull request adds a comprehensive rule conversion system to transform unified markdown security rules into multiple IDE-specific formats (Cursor, Windsurf, Copilot, Agent Skills, and Antigravity). The implementation provides validation scripts, format converters, and shared utilities for language mappings and frontmatter parsing.

Changes:

- Added Python conversion scripts that parse unified markdown rules with YAML frontmatter and generate IDE-specific formats
- Implemented validation scripts for rule structure and version consistency across project files
- Created shared utility modules for frontmatter parsing, language-to-extension mappings, and tag validation

Reviewed changes

Copilot reviewed 15 out of 17 changed files in this pull request and generated 26 comments.

Show a summary per file

File	Description
src/validate_versions.py	Version consistency validation across pyproject.toml, plugin.json, marketplace.json, and SKILL.md
src/validate_unified_rules.py	Validates unified rule files for correct YAML frontmatter and structure
src/utils.py	Shared utilities for frontmatter parsing, tag validation, and version reading
src/tag_mappings.py	Centralized list of known security rule tags
src/language_mappings.py	Language-to-extension mappings and glob pattern conversion functions
src/formats/base.py	Abstract base class defining the interface for all IDE format implementations
src/formats/cursor.py	Generates .mdc files for Cursor IDE with YAML frontmatter
src/formats/windsurf.py	Generates .md files for Windsurf IDE with trigger-based rules
src/formats/copilot.py	Generates .instructions.md files for GitHub Copilot
src/formats/agentskills.py	Generates .md files following the Agent Skills standard
src/formats/antigravity.py	Generates .md files for Google Antigravity with glob patterns
src/formats/__init__.py	Package initialization exposing all format classes
src/converter.py	Core conversion logic handling rule parsing and multi-format generation
src/convert_to_ide_formats.py	Main entry point script for converting rules to IDE bundles
.gitignore	Added Python-specific ignores and IDE-generated bundle exclusions

Copilot · 2026-01-29T20:03:52Z

+        # Use trigger: always_on for rules that should always apply
+        if rule.always_apply:
+            yaml_lines.append("trigger: always_on")
+        else:
+            yaml_lines.append("trigger: glob")
+            yaml_lines.append(f"globs: {globs}")


When rule.always_apply is false and rule.languages is empty, the globs value might be empty or "**/*". This could result in generating globs: with an empty or universal glob, which might not be the intended behavior for Windsurf format.

Suggested change

-        # Use trigger: always_on for rules that should always apply
-        if rule.always_apply:
-            yaml_lines.append("trigger: always_on")
-        else:
-            yaml_lines.append("trigger: glob")
-            yaml_lines.append(f"globs: {globs}")
+        # Normalize globs to handle empty or universal patterns safely
+        normalized_globs = (globs or "").strip()
+        # Use trigger: always_on for rules that should always apply, or when
+        # no specific globs are provided (empty or universal patterns).
+        if rule.always_apply or not normalized_globs or normalized_globs in ('**/*', '"**/*"', "'**/*'"):
+            yaml_lines.append("trigger: always_on")
+        else:
+            yaml_lines.append("trigger: glob")
+            yaml_lines.append(f"globs: {normalized_globs}")

Copilot · 2026-01-29T20:03:53Z

+        has_languages = "languages" in frontmatter and frontmatter["languages"]
+        always_apply = frontmatter.get("alwaysApply", False)
+
+        if always_apply and has_languages:


This validation logic prevents rules with alwaysApply=true from having languages, but the converter's parse_rule method at line 147 sets languages = [] for such rules. This means the validation in line 39 would never trigger since the languages list would be empty, not present. Consider checking for an empty list as well.

Suggested change

-        has_languages = "languages" in frontmatter and frontmatter["languages"]
-        always_apply = frontmatter.get("alwaysApply", False)
-
-        if always_apply and has_languages:
+        has_languages_key = "languages" in frontmatter
+        has_languages = has_languages_key and bool(frontmatter["languages"]) if has_languages_key else False
+        always_apply = frontmatter.get("alwaysApply", False)
+
+        if always_apply and has_languages_key:
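
The distinction the comment above draws, between a `languages` key that is missing and one that is present but empty, can be sketched as follows. This is a minimal illustration, not the project's validator: `languages_state` and `check_always_apply` are hypothetical helper names, and treating an empty list as a conflict is an assumption taken from the suggested change.

```python
from typing import Optional


def languages_state(frontmatter: dict) -> str:
    """Classify the `languages` key as absent, empty, or populated."""
    if "languages" not in frontmatter:
        return "absent"
    return "populated" if frontmatter["languages"] else "empty"


def check_always_apply(frontmatter: dict) -> Optional[str]:
    """Return an error message if an always-applied rule also declares languages.

    Assumption: any `languages` key, even an empty list, counts as a conflict.
    """
    always_apply = bool(frontmatter.get("alwaysApply", False))
    state = languages_state(frontmatter)
    if always_apply and state != "absent":
        return f"alwaysApply rule declares a languages key ({state})"
    return None


# A truthiness test alone cannot tell "absent" from "empty"; the state helper can.
print(check_always_apply({"alwaysApply": True, "languages": []}))
print(check_always_apply({"alwaysApply": True}))
```

Checking for the key itself, rather than the truthiness of its value, is what lets the validator flag the empty-list case.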

Copilot · 2026-01-29T20:03:54Z

+        if rule.languages:
+            # Format as YAML list
+            yaml_lines.append("languages:")
+            for lang in rule.languages:
+                yaml_lines.append(f"- {lang}")


The languages list is manually formatted as YAML (lines 66-68), but if a language name contains special YAML characters (colons, quotes, etc.), it won't be properly escaped. Consider using yaml.safe_dump for the entire languages list to ensure proper escaping.
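
No suggested change accompanies this comment, so the following is only a sketch of what dumping the whole list through PyYAML might look like. `format_languages` is a hypothetical helper, not a function from the PR, and the sample language names are made up to exercise quoting.

```python
import yaml  # PyYAML, the library already used by the converter


def format_languages(languages):
    """Return the YAML lines for a `languages:` block, letting PyYAML quote items."""
    if not languages:
        return []
    dumped = yaml.safe_dump(
        {"languages": list(languages)},
        default_flow_style=False,  # block style: one "- item" per line
        allow_unicode=True,
        width=float("inf"),        # avoid line wrapping of long values
    )
    return dumped.rstrip("\n").split("\n")


# Hypothetical inputs: a colon and a YAML 1.1 boolean word both need quoting.
lines = format_languages(["python", "c: embedded", "yes"])
print("\n".join(lines))

# Round-trip check: the parsed block matches the original list.
assert yaml.safe_load("\n".join(lines)) == {"languages": ["python", "c: embedded", "yes"]}
```

The returned lines can be appended to `yaml_lines` in place of the manual `- {lang}` loop.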

Copilot · 2026-01-29T20:03:54Z

+        # Add version
+        yaml_lines.append(f"version: {self.version}")


The version value is inserted directly into YAML without proper escaping. If the version string contains special YAML characters, this could create invalid YAML. Consider using proper YAML escaping for the version value to ensure robustness.

Suggested change

-        # Add version
-        yaml_lines.append(f"version: {self.version}")
+        # Add version (use YAML-safe formatting)
+        version_line = self._format_yaml_field("version", str(self.version))
+        if version_line:
+            yaml_lines.append(version_line)
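
One concrete way an unquoted version can go wrong is type resolution: with PyYAML, a plain scalar that looks like a number is loaded as a number. The sketch below illustrates this under the assumption that versions may have only two components; `format_version` is a hypothetical stand-in for `_format_yaml_field`, whose full body is not shown in this review.

```python
import yaml  # PyYAML


def format_version(version) -> str:
    """Emit a `version:` line, letting PyYAML quote the value when needed."""
    return yaml.safe_dump({"version": str(version)}, default_flow_style=False).strip()


naive_line = "version: 1.10"          # what the f-string would emit
safe_line = format_version("1.10")    # what safe_dump emits

# The naive line is resolved as a float, so the trailing zero is lost.
print(type(yaml.safe_load(naive_line)["version"]).__name__)
# The dumped line round-trips as the original string.
print(yaml.safe_load(safe_line)["version"])
```

Three-component versions such as `1.0.0` are not affected by this, since they do not match the float pattern.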

Copilot · 2026-01-29T20:03:56Z

+            yaml_lines.append("trigger: always_on")
+        else:
+            yaml_lines.append("trigger: glob")
+            yaml_lines.append(f"globs: {globs}")


When rule.always_apply is false and rule.languages is empty, the globs value might be empty or "**/*". This could result in generating globs: with an empty or universal glob in the Antigravity format.
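
This comment carries no suggested change. A possible guard, mirroring the one proposed for the Windsurf format, is sketched below. `trigger_lines` and `UNIVERSAL_GLOBS` are hypothetical names, and falling back to `always_on` for empty or universal globs is an assumption about the desired Antigravity behavior, not something the PR confirms.

```python
from typing import List

# Glob values treated as "applies everywhere" (assumption for this sketch).
UNIVERSAL_GLOBS = {"", "**/*"}


def trigger_lines(always_apply: bool, globs: str) -> List[str]:
    """Choose the trigger lines for a rule's frontmatter."""
    stripped = (globs or "").strip()
    bare = stripped.strip("'\"")  # compare without surrounding quotes
    if always_apply or bare in UNIVERSAL_GLOBS:
        return ["trigger: always_on"]
    return ["trigger: glob", f"globs: {stripped}"]


print(trigger_lines(False, ""))
print(trigger_lines(False, "**/*.py"))
```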

Copilot · 2026-01-29T20:04:01Z

+        pattern = pattern.strip().lower()
+
+        # Check for file extensions and patterns
+        for ext, lang in EXTENSION_TO_LANGUAGE.items():
+            if ext.lower() in pattern:
+                languages.add(lang)
+                break  # One match per pattern is enough
+


The language check at line 129 uses ext.lower() but the EXTENSION_TO_LANGUAGE dictionary keys are already lowercase (from LANGUAGE_TO_EXTENSIONS values). The pattern matching should also handle wildcards in extensions (e.g., "Dockerfile*" from line 50) properly, but the current implementation may not correctly match these patterns.

Suggested change

-        pattern = pattern.strip().lower()
-
-        # Check for file extensions and patterns
-        for ext, lang in EXTENSION_TO_LANGUAGE.items():
-            if ext.lower() in pattern:
-                languages.add(lang)
-                break  # One match per pattern is enough
+        # Normalize pattern for case-insensitive matching
+        pattern = pattern.strip().lower()
+
+        # Check for file extensions and patterns
+        for ext, lang in EXTENSION_TO_LANGUAGE.items():
+            # Normalize extension once for comparison
+            ext_lower = ext.lower()
+            if "*" in ext:
+                # Wildcard-style "extensions" (e.g., 'dockerfile*'):
+                # treat as substring patterns within the glob.
+                if ext_lower in pattern:
+                    languages.add(lang)
+                    break  # One match per pattern is enough
+            else:
+                # Normal file extensions (e.g., '.py'):
+                # require that the glob pattern end with the extension
+                # (possibly after a '*'), to avoid partial matches.
+                if pattern.endswith(ext_lower) or pattern.endswith(f"*{ext_lower}"):
+                    languages.add(lang)
+                    break  # One match per pattern is enough
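
The partial-match problem that the suffix check addresses can be seen with a small mapping. The dictionary below is a made-up subset for illustration and is not the project's actual `EXTENSION_TO_LANGUAGE`; the two helper functions are likewise hypothetical.

```python
from typing import Optional

# Hypothetical subset; shorter extensions come first, as they could in a real mapping.
EXTENSION_TO_LANGUAGE = {
    ".c": "c",
    ".cpp": "cpp",
    ".js": "javascript",
    ".json": "json",
}


def match_by_substring(pattern: str) -> Optional[str]:
    """Original approach: the first extension found anywhere in the glob wins."""
    pattern = pattern.strip().lower()
    for ext, lang in EXTENSION_TO_LANGUAGE.items():
        if ext in pattern:
            return lang
    return None


def match_by_suffix(pattern: str) -> Optional[str]:
    """Suggested approach: the glob must end with the extension."""
    pattern = pattern.strip().lower()
    for ext, lang in EXTENSION_TO_LANGUAGE.items():
        if pattern.endswith(ext):
            return lang
    return None


for glob in ("**/*.cpp", "**/*.json"):
    print(glob, match_by_substring(glob), match_by_suffix(glob))
```

With substring matching, `**/*.cpp` is attributed to `c` and `**/*.json` to `javascript`, because the shorter extension is a prefix of the longer one; the suffix check resolves both correctly.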

Copilot · 2026-01-29T20:04:01Z

+        # Adding rule_id to the beginning of the content
+        rule_id = Path(filename).stem
+        markdown_content = f"rule_id: {rule_id}\n\n{markdown_content}"


The rule_id is extracted from the filename and prepended to the content at line 168. However, if the content already starts with "rule_id:", this would create a duplicate. Consider checking if the content already has a rule_id before adding it, or document that this should never happen.

Suggested change

-        # Adding rule_id to the beginning of the content
-        rule_id = Path(filename).stem
-        markdown_content = f"rule_id: {rule_id}\n\n{markdown_content}"
+        # Adding rule_id to the beginning of the content, if not already present
+        rule_id = Path(filename).stem
+        stripped_content = markdown_content.lstrip()
+        if not stripped_content.startswith("rule_id:"):
+            markdown_content = f"rule_id: {rule_id}\n\n{markdown_content}"

Copilot · 2026-01-29T20:04:02Z

+            yaml_dump = yaml.safe_dump(
+                {field_name: value},
+                default_flow_style=False,
+                allow_unicode=True,
+                width=float("inf")
+            )
+            return yaml_dump.strip()


The _format_yaml_field method uses yaml.safe_dump which will produce output like "field_name: value\n". When this is joined with other lines that may be simple strings like "globs: {globs}", it could create inconsistent formatting. Some fields will be properly escaped by yaml.safe_dump while others are simple f-strings, potentially creating YAML syntax errors if the f-string values contain special characters.

Suggested change

-            yaml_dump = yaml.safe_dump(
-                {field_name: value},
-                default_flow_style=False,
-                allow_unicode=True,
-                width=float("inf")
-            )
-            return yaml_dump.strip()
+            # Use yaml.safe_dump to escape the scalar value, then format the field manually.
+            yaml_dump = yaml.safe_dump(
+                value,
+                default_flow_style=False,
+                allow_unicode=True,
+                width=float("inf"),
+            )
+            yaml_value = yaml_dump.strip()
+            return f"{field_name}: {yaml_value}"

Copilot · 2026-01-29T20:04:02Z

+            yaml_lines.append(desc)
+
+        # Add globs and version
+        yaml_lines.append(f"globs: {globs}")


The globs value is inserted directly into the YAML without proper escaping. If globs contain YAML special characters (like colons in "**/*.py:text"), this could create invalid YAML. Consider using proper YAML escaping or quoting for the globs value, similar to how description is handled with _format_yaml_field.

Suggested change

-        yaml_lines.append(f"globs: {globs}")
+        globs_field = self._format_yaml_field("globs", globs)
+        if globs_field:
+            yaml_lines.append(globs_field)

Copilot · 2026-01-29T20:04:02Z

+            yaml_lines.append("trigger: always_on")
+        else:
+            yaml_lines.append("trigger: glob")
+            yaml_lines.append(f"globs: {globs}")


The globs value is inserted directly into YAML without proper escaping. If globs contain special YAML characters, this could create invalid YAML syntax. Consider using proper YAML escaping for the globs value.

Suggested change

-            yaml_lines.append(f"globs: {globs}")
+            globs_field = self._format_yaml_field("globs", globs)
+            if globs_field:
+                yaml_lines.append(globs_field)

Add rule converter scripts for multi-IDE format generation.

43ce783

thomas-bartlett requested review from aacarter1, Copilot, ramraaj25 and santosomar January 29, 2026 19:56

thomas-bartlett self-assigned this Jan 29, 2026

thomas-bartlett added the enhancement New feature or request label Jan 29, 2026

Copilot started reviewing on behalf of thomas-bartlett January 29, 2026 19:56

thomas-bartlett added 2 commits January 29, 2026 14:57

Merge branch 'main' into feat/add-rule-converter-scripts

3d3cb14

Add pyproject.toml and uv.lock files.

03c7210

aacarter1 approved these changes Jan 29, 2026

aacarter1 merged commit f2525df into main Jan 29, 2026
2 checks passed

Copilot AI reviewed Jan 29, 2026

thschaffr pushed a commit to thschaffr/project-codeguard that referenced this pull request Jan 30, 2026

Merge pull request cosai-oasis#14 from project-codeguard/feature/upda…

4f30bb4

…ting-index-page Revise terminology and enhance clarity in documentation

thomas-bartlett deleted the feat/add-rule-converter-scripts branch February 5, 2026 17:44

aacarter1 merged 3 commits into main from feat/add-rule-converter-scripts

3 participants
