[Bug Report] extractScriptSetup regex matches <script setup> inside HTML comments #70

@YunYouJun

Description

extractScriptSetup in src/core/markdown.ts uses a regex that does not account for HTML comment boundaries. When rendered markdown HTML contains <script setup> inside an <!-- ... --> comment, the regex incorrectly extracts it as a real Vue SFC block.

Two bugs

Bug 1: Commented <script setup> is extracted

When a markdown file contains:

Some text

<!--
<script setup lang="ts">
import { ref } from "vue";
</script>
-->

More text

markdown-it correctly renders this as a single HTML comment. But extractScriptSetup still matches the <script setup> inside the comment and extracts it.

Bug 2: Greedy regex corrupts content when both real and commented scripts exist

When a markdown file contains both a real <script setup> and a commented one:

<script setup lang="ts">
const count = ref(0);
</script>

Some text

<!--
<script setup lang="ts">
import { ref } from "vue";
</script>
-->

The greedy ([\s\S]*) in the regex matches from the first <script setup> opening to the last </script> closing, swallowing the template HTML content in between. The extracted script code becomes:

const count = ref(0);
</script>
<p>Some text</p>
<!--
<script setup lang="ts">
import { ref } from "vue";

Root Cause

// src/core/markdown.ts
const scriptSetupRE = /<\s*script([^>]*)\bsetup\b([^>]*)>([\s\S]*)<\/script>/g

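This regex has two problems. It has no notion of HTML comment boundaries (<!-- ... -->), so it matches a <script setup> that sits inside a comment. And its body group ([\s\S]*) is greedy, so it runs to the last </script> in the input rather than the nearest one.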
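To make the greedy behaviour concrete, here is a small comparison between the regex above and a lazy variant. The lazy pattern (lazyRE) is only an illustration and is not taken from the project. It stops the corruption described in Bug 2, but on its own it still extracts the commented block, so it does not address Bug 1.

```javascript
// Greedy pattern copied from the issue, plus a lazy variant for comparison.
// lazyRE is a hypothetical illustration, not the project's code.
const greedyRE = /<\s*script([^>]*)\bsetup\b([^>]*)>([\s\S]*)<\/script>/g
const lazyRE = /<\s*script([^>]*)\bsetup\b([^>]*)>([\s\S]*?)<\/script>/g

// Same shape as the Bug 2 input: one real block, then one commented block.
const html = [
  '<script setup lang="ts">',
  'const count = ref(0);',
  '</script>',
  '<p>text</p>',
  '<!--',
  '<script setup lang="ts">',
  'import { ref } from "vue";',
  '</script>',
  '-->',
].join('\n')

const greedyMatches = [...html.matchAll(greedyRE)]
const lazyMatches = [...html.matchAll(lazyRE)]

console.log(greedyMatches.length) // 1: a single match spanning both blocks
console.log(lazyMatches.length) // 2: one match per block, including the commented one
console.log(greedyMatches[0][3].includes('<p>')) // true: template HTML ends up in the script code
console.log(lazyMatches[0][3].includes('<p>')) // false
```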

Minimal Reproduction

// repro.mjs — run with: node repro.mjs
const scriptSetupRE = /<\s*script([^>]*)\bsetup\b([^>]*)>([\s\S]*)<\/script>/g

function extractScriptSetup(html) {
  const scripts = []
  html = html.replace(scriptSetupRE, (_, attr1, attr2, code) => {
    scripts.push({ code, attr: `${attr1} ${attr2}`.trim() })
    return ''
  })
  return { html, scripts }
}

// Bug 1: commented script is extracted
const html1 = `<p>text</p>\n<!--\n<script setup lang="ts">\nimport { ref } from "vue";\n</script>\n-->\n<p>more</p>`
const r1 = extractScriptSetup(html1)
console.log('Bug 1 — scripts extracted:', r1.scripts.length, '(expected 0)')

// Bug 2: greedy regex corrupts content
const html2 = `<script setup lang="ts">\nconst count = ref(0);\n</script>\n<p>text</p>\n<!--\n<script setup lang="ts">\nimport { ref } from "vue";\n</script>\n-->`
const r2 = extractScriptSetup(html2)
console.log('Bug 2 — script code includes HTML:', r2.scripts[0]?.code.includes('<p>'), '(expected false)')

Output:

Bug 1 — scripts extracted: 1 (expected 0)
Bug 2 — script code includes HTML: true (expected false)

Suggested Fix

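Strip HTML comments from the input before applying scriptSetupRE, or modify the regex/extraction logic to skip matches that are inside <!-- ... --> blocks. In either case, making the body group lazy (([\s\S]*?)) keeps each match within a single <script> block.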
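One possible shape for the second option is sketched below. It is a sketch under assumptions, not the project's implementation: the names commentOrScriptSetupRE and extractScriptSetupSkippingComments are hypothetical, and the approach is a single regex pass rather than a real HTML parser. The pattern matches either a whole HTML comment or a <script setup> block with a lazy body. Because the scan goes left to right, a comment is consumed as one unit and anything inside it is never seen as a script.

```javascript
// Hypothetical comment-aware variant of extractScriptSetup (a sketch, not the project's code).
// First alternative: an entire HTML comment. Second: a <script setup> block with a lazy body.
const commentOrScriptSetupRE =
  /<!--[\s\S]*?-->|<\s*script([^>]*)\bsetup\b([^>]*)>([\s\S]*?)<\/script>/g

function extractScriptSetupSkippingComments(html) {
  const scripts = []
  const out = html.replace(commentOrScriptSetupRE, (match, attr1, attr2, code) => {
    // When the comment alternative matched, the script capture groups are undefined.
    if (code === undefined)
      return match // leave the comment in place, untouched
    scripts.push({ code, attr: `${attr1} ${attr2}`.trim() })
    return ''
  })
  return { html: out, scripts }
}

// The two inputs from the reproduction above.
const commentedOnly = `<p>text</p>\n<!--\n<script setup lang="ts">\nimport { ref } from "vue";\n</script>\n-->\n<p>more</p>`
const realAndCommented = `<script setup lang="ts">\nconst count = ref(0);\n</script>\n<p>text</p>\n<!--\n<script setup lang="ts">\nimport { ref } from "vue";\n</script>\n-->`

const fixed1 = extractScriptSetupSkippingComments(commentedOnly)
const fixed2 = extractScriptSetupSkippingComments(realAndCommented)

console.log(fixed1.scripts.length) // 0
console.log(fixed2.scripts.length) // 1
console.log(fixed2.scripts[0].code.includes('<p>')) // false
```

This version keeps comments in the output HTML instead of stripping them, so the only change to the rendered content is the removal of real <script setup> blocks. It would still be confused by a literal </script> or --> inside a script's string content, which is a limitation of the regex-based approach in general.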

Impact

This affects any project where users write <script setup> inside HTML comments in markdown files (e.g., to temporarily disable code, or to document SFC usage examples).

Related downstream issue: YunYouJun/valaxy#558
