Neutralize Hugo shortcode delimiters in generated code examples #11206

Open
rgharris wants to merge 1 commit into master from rgharris/hugo-shortcode-fix

@rgharris (Contributor) commented May 29, 2026

Stop Hugo from misreading shortcode delimiters inside generated fenced code examples, which was aborting the entire site build.

resourcedocsgen now neutralizes Hugo shortcode delimiters ({{% and {{<) that appear inside fenced code blocks when emitting .md files. Hugo parses shortcodes even inside code fences, so an example snippet like {{%s|token}} (Dynatrace credential-vault syntax plus a Java String.format placeholder) was read as a call to a shortcode named "s", failing the build with 'template for shortcode "s" not found'.

The rewrite is restricted to fence interiors, so legitimate hand-authored shortcodes ({{< chooser >}} / {{% choosable %}}, which only wrap fences from the outside) are untouched. Inline code spans and 4-space-indented code blocks are intentionally outside this change because the current generator surface emits fenced examples.

Neutralization inserts a single space after {{. It needs no delimiter pairing (the offending text often has no matching %}}) and is idempotent. Tradeoff: the inserted space changes the rendered example, so a copied snippet carries an extra space (Bearer {{ %s|token}}). That is harmless to the Java String.format call, but Dynatrace would not accept the resulting vault reference if pasted verbatim. We accept this cost in preference to a failing build.
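
For readers who want to see the single-space rewrite in isolation, here is a minimal sketch. It reuses the regex quoted later in this thread (`\{\{([%<])`); the helper name `disarm` is hypothetical and the snippet is not copied from output.go.

```go
package main

import (
	"fmt"
	"regexp"
)

// hugoShortcodeDelim matches the two Hugo shortcode delimiters `{{%` and `{{<`.
// The pattern is the one quoted in the review; everything else is illustrative.
var hugoShortcodeDelim = regexp.MustCompile(`\{\{([%<])`)

// disarm (hypothetical name) inserts one space after `{{` wherever it is
// directly followed by `%` or `<`. No matching `%}}` or `>}}` is required.
func disarm(line string) string {
	return hugoShortcodeDelim.ReplaceAllString(line, "{{ $1")
}

func main() {
	in := `String.format("Bearer {{%s|token}}", id)`
	once := disarm(in)
	twice := disarm(once)

	fmt.Println(once)
	// After the first pass `{{` is followed by a space, so a second pass
	// finds nothing to rewrite.
	fmt.Println("idempotent:", once == twice)
}
```

Because the rewritten text no longer contains `{{%` or `{{<`, running the pass again on already-emitted content leaves it unchanged.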

Surfaced by #11191 (auto-filed workflow failure for the dynatrace@v0.35.2 bump, PR #11190). The offending {{%s|token}} examples are new in that version (0 occurrences in v0.34.0, 8 in v0.35.2), so the build fails deterministically until this lands on master.

Closes #11205

@rgharris rgharris requested a review from a team May 29, 2026 18:30

@claude (Bot) commented May 29, 2026

Registry Review — PR #11206

Scope: tools/resourcedocsgen/pkg/output.go (+58), tools/resourcedocsgen/pkg/output_test.go (+93, new). Module-local change in resourcedocsgen; no Hugo/CI/infra impact.

Summary

Solid, surgical fix for the Hugo shortcode-parsing failure described in #11205. The approach — neutralize {{%/{{< only inside fenced code blocks, preserving authored shortcodes outside fences — is well-targeted and the test coverage is thorough (nested fences, tilde fences, indented delimiters, idempotency, prose passthrough).

Findings

1. hugoShortcodeOpener name is slightly narrower than what it matches (output.go:29)

The regex \{\{([%<]) also matches Hugo shortcode closers like {{</foo>}} and {{%/foo%}}. That's fine — they need disarming too — but the name implies opener-only. Consider renaming for accuracy (and updating the doc comment on line 28):

// hugoShortcodeDelim matches Hugo's shortcode delimiters `{{%` and `{{<`,
// covering both openers ({{% foo %}}, {{< foo >}}) and closers ({{% /foo %}}, {{< /foo >}}).
var hugoShortcodeDelim = regexp.MustCompile(`\{\{([%<])`)

Non-blocking — purely a clarity nit.

2. Inline code spans are not neutralized (output.go:33-54)

The function only handles fenced code blocks. Inline code like `use {{% foo %}} here` in prose will still be shortcode-parsed by Hugo. Not a problem for the current Dynatrace failure (the offending {{%s|token}} snippets are all in fenced Java blocks), and chasing inline spans here would expand scope. Worth a one-line comment near the function doc noting the scope, so a future reader hitting an inline-code variant doesn't spend time wondering whether this was intentional:

// neutralizeHugoShortcodes disarms Hugo shortcode openers inside fenced code
// blocks, where Hugo would resolve them before Markdown rendering. Inline code
// spans are intentionally not handled — extend here if a real failure surfaces.
func neutralizeHugoShortcodes(contents []byte) []byte {

3. Indented (4-space) code blocks are not handled (output.go:33-54)

CommonMark also recognizes 4-space-indented blocks as code. They're not handled here. As with inline spans, this isn't blocking — resourcedocsgen's output uses fenced blocks — but mentioning it in the doc comment (or the PR description's "tradeoffs" section) helps future debugging if a provider switches conventions.

4. CRLF inputs work, worth a quick mental check

strings.Split(contents, "\n") leaves a trailing \r on each line under CRLF, which fenceRun tolerates because the close-fence check uses strings.TrimSpace(rest). Good as-is; no change is needed. The resilience is incidental, though, and not asserted by tests. If you wanted belt-and-suspenders, a single CRLF test case would lock it in.
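
The CRLF behavior described above can be reproduced with the standard library alone. This is a sketch, not the PR's code: `splitLines` and `closesFence` are hypothetical stand-ins for the split step and the close-fence check.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines splits on "\n" only, so CRLF input keeps a trailing "\r" on
// every line that was terminated by "\r\n".
func splitLines(s string) []string {
	return strings.Split(s, "\n")
}

// closesFence is a hypothetical stand-in for the close-fence check: the text
// after the fence run must be blank once whitespace is trimmed.
// strings.TrimSpace treats "\r" as whitespace, which is why CRLF input works.
func closesFence(line, fence string) bool {
	if !strings.HasPrefix(line, fence) {
		return false
	}
	rest := strings.TrimPrefix(line, fence)
	return strings.TrimSpace(rest) == ""
}

func main() {
	lines := splitLines("```\r\ncode\r\n```\r\n")
	fmt.Printf("%q\n", lines)

	// The closing fence line is "```\r"; the stray "\r" is trimmed away.
	fmt.Println("closing fence recognized:", closesFence(lines[2], "```"))
}
```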

Looks good

  • Idempotency is tested (output_test.go:54-58) — important since EmitFile could in principle be called on already-processed content.
  • Nested fences via longer outer markers are tested (output_test.go:60-66, output_test.go:78-82) — that's the CommonMark-correct behavior.
  • The opt-in by .md suffix in EmitFile (output.go:88) is the right scope.
  • Copyright header on the new test file uses 2026 (output_test.go:1) — matches current year.
  • Code is idiomatic Go, errors handled, regex compiled once as a package var.

Suggested test additions (optional)

  • A CRLF input case to lock in the incidental support.
  • A case where the closing fence is missing (EOF inside a fence) — current behavior (continue neutralizing to EOF) is correct but untested.

Mention @claude for additional reviews or fixes.

@rgharris rgharris force-pushed the rgharris/hugo-shortcode-fix branch from 0aef8cf to 2e4dca4 Compare May 29, 2026 18:36

@rgharris (Contributor, Author) commented

@claude review again

@claude (Bot) commented May 29, 2026

Claude finished @rgharris's task in 2m 34s.

Re-review — PR #11206

Re-read both files at HEAD 2e4dca4 and walked the fence-tracking logic through every test case by hand. The earlier review's actionable findings are already addressed in this commit:

  • The variable is named hugoShortcodeDelim and the comment names it as a "shortcode delimiter" — covers both openers and closers (output.go:28-29).
  • The function doc explicitly calls out inline code spans and 4-space indented blocks as intentionally out of scope (output.go:31-33).

No new issues surfaced on a fresh read. A few small observations worth recording, none blocking:

Logic spot-checks (all correct)

  • Fence-state machine (output.go:39-54): n >= fenceLen plus strings.TrimSpace(rest) == "" matches CommonMark closing-fence rules; shorter inner fences and tilde-vs-backtick siblings stay inside the outer fence. Test cases at output_test.go:64-69 and 76-79 lock this in.
  • Triple-brace {{{id}: regex \{\{([%<]) requires the third char to be % or <, so Python f-strings stay intact. output_test.go:49-52 confirms.
  • Closers: {{< /foo >}} inside a fence is rewritten the same way as openers (the < is captured as $1); they need disarming too, so this is correct behavior.
  • Idempotency: replacing {{% with {{ % guarantees a second pass can't re-match. output_test.go:58-62 covers it.
  • Concurrency: regexp.Regexp methods are safe for concurrent use, so the package-level compiled regex is fine.
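
For anyone reading the thread without the diff open, the fence-state machine checked above might look roughly like the sketch below. It is reconstructed from the names and conditions mentioned in this review (hugoShortcodeDelim, fenceRun, neutralizeHugoShortcodes, n >= fenceLen, strings.TrimSpace(rest) == ""); the bodies are assumptions and may differ from the actual output.go.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var hugoShortcodeDelim = regexp.MustCompile(`\{\{([%<])`)

// fenceRun (assumed shape) reports the fence character, the run length, and
// the rest of the line when the line starts, after at most three spaces of
// indentation, with three or more backticks or tildes. Otherwise n is 0.
func fenceRun(line string) (ch byte, n int, rest string) {
	trimmed := strings.TrimLeft(line, " ")
	if len(line)-len(trimmed) > 3 || trimmed == "" {
		return 0, 0, ""
	}
	ch = trimmed[0]
	if ch != '`' && ch != '~' {
		return 0, 0, ""
	}
	for n < len(trimmed) && trimmed[n] == ch {
		n++
	}
	if n < 3 {
		return 0, 0, ""
	}
	return ch, n, trimmed[n:]
}

// neutralizeHugoShortcodes rewrites `{{%` and `{{<` only on lines that sit
// inside a fenced code block; text outside fences is returned unchanged.
func neutralizeHugoShortcodes(contents []byte) []byte {
	lines := strings.Split(string(contents), "\n")
	var fenceCh byte
	fenceLen := 0 // 0 means "not inside a fence"
	for i, line := range lines {
		ch, n, rest := fenceRun(line)
		switch {
		case fenceLen == 0:
			if n > 0 { // opening fence
				fenceCh, fenceLen = ch, n
			}
		case ch == fenceCh && n >= fenceLen && strings.TrimSpace(rest) == "":
			fenceLen = 0 // closing fence: same char, at least as long, blank rest
		default:
			// Inside the fence, including shorter or different-char fence lines.
			lines[i] = hugoShortcodeDelim.ReplaceAllString(line, "{{ $1")
		}
	}
	return []byte(strings.Join(lines, "\n"))
}

func main() {
	doc := "{{< chooser >}}\n```java\nString.format(\"Bearer {{%s|token}}\", id)\n```\n{{< /chooser >}}\n"
	fmt.Print(string(neutralizeHugoShortcodes([]byte(doc))))
}
```

In this sketch the chooser shortcodes outside the fence pass through untouched, while the Java line inside the fence gains the single space. An unclosed fence keeps the state "inside" until EOF, so neutralization continues to the end of the input.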

Optional follow-ups (not blocking)

  • CRLF resilience is incidental: strings.Split(contents, "\n") leaves a trailing \r on each line, and fenceRun's strings.TrimSpace(rest) happens to swallow it on the close-fence check. Works today, but a one-line CRLF test case would lock it in.
  • Unclosed fence at EOF: behavior (keep neutralizing to EOF) is correct but untested. A short test would document the intent.
  • Allocation profile: string(contents) + Split + Join + []byte(...) is fine for doc-sized inputs; no action needed, just noting it for posterity in case this is ever called on very large bundles.

Verdict

LGTM as-is. The scope (fence interiors only, opt-in via .md suffix in EmitFile) is right-sized for the deterministic Dynatrace failure, the trade-off in the PR description (extra space in pasted snippets) is the honest one to make, and the test coverage hits the meaningful edge cases (nested fences, tilde fences, indented delimiters, idempotency, prose passthrough). Tests weren't executed locally — sandbox blocked go test — but the implementation is small enough to verify by inspection, and CI on this PR will exercise it.

  • Re-read both changed files at HEAD 2e4dca4
  • Re-verify prior findings against current code
  • Walk fence-tracking logic by hand across test cases
  • Post re-review
    · branch rgharris/hugo-shortcode-fix

@github-actions (Contributor) commented

Your site preview for commit 0aef8cf is ready! 🎉

http://registry--origin-pr-11206-0aef8cfa.s3-website.us-west-2.amazonaws.com/registry.

@github-actions (Contributor) commented

Your site preview for commit 2e4dca4 is ready! 🎉

http://registry--origin-pr-11206-2e4dca42.s3-website.us-west-2.amazonaws.com/registry.

Linked issue (may be closed by merging this pull request): Hugo shortcode delimiters in generated code examples break the site build