fix(vale): track full line range per hunk in CI annotation parser by dhtclk · Pull Request #6188 · ClickHouse/clickhouse-docs

dhtclk · 2026-05-08T19:42:43Z

Summary

scripts/vale/changed_lines_to_json.py was only recording the start line of each @@ hunk header, so any error-level Vale finding past the first added line of a hunk got filtered out by vale_annotations.py and silently passed CI.

This was caught while reviewing #6183 — Vale flagged two ClickHouse.Headings errors (lines 44 and 56), but CI was green because both lines fell outside the [26, 43] changed_lines list the script produced.

Before

match = re.search(r"^@@ -[0-9]+(?:,[0-9]+)? \+([0-9]+)(?:,[0-9]+)? @@", line)
if match:
    line_number = int(match.group(1))
    changed_lines.append(line_number)   # only the start line

For PR #6183's @@ -41,0 +43,26 @@ hunk this produced [43] instead of [43..68].

After

match = re.search(r"^@@ -[0-9]+(?:,[0-9]+)? \+([0-9]+)(?:,([0-9]+))? @@", line)
if match:
    start = int(match.group(1))
    count = int(match.group(2)) if match.group(2) is not None else 1
    changed_lines.extend(range(start, start + count))

Now [43, 44, 45, …, 68]. Pure deletions (+A,0) yield an empty range as expected.

Validation against PR #6183

	Before fix	After fix
`changed_lines` for the PR file	`[26, 43]`	27 entries (`[26, 43..68]` — matches `+27` additions)
Annotation script exit code	`0` (pass)	`1` (fail)
Errors surfaced to GitHub annotations	0	2 (sentence-case violations on lines 44 + 56)

Blast radius

This bug affected any error-level Vale finding past the first added line of every hunk, for as long as this script has existed. A separate sweep of recently-merged PRs may be worth doing to identify escapes — out of scope here.

Test plan

New unit tests at scripts/vale/test/test_changed_lines_to_json.py cover six cases: single-line addition (omitted count), multi-line addition, pure deletion, modified hunk, multiple hunks concatenated, and hunk header with trailing context.
Verified the new tests pass against the fix (6/6).
Verified the new tests fail against the buggy version (5/6 fail; the single-line case coincidentally matched the buggy output).
End-to-end validation against PR chore(byoc): add GCP priviledge documentation #6183: ran the fixed changed_lines_to_json.py + vale_annotations.py against the PR's diff and confirmed CI would now exit with code 1 and surface both heading errors.

Run locally:

python3 scripts/vale/test/test_changed_lines_to_json.py -v

🤖 Generated with Claude Code

scripts/vale/changed_lines_to_json.py only recorded the start line of each `@@` hunk header, so vale_annotations.py filtered Vale findings on every other added line out of the diff. Result: error-level Vale findings past the first added line of each hunk silently passed CI (e.g. heading-casing errors on lines 44 and 56 of PR #6183). Expand each hunk into the full added range using the count from the unified=0 hunk header. Pure deletions (`+A,0`) yield an empty range, as expected. Adds unit tests for the parser covering single-line, multi-line, deletion, modified, multi-hunk, and trailing-context cases. Five of the six tests fail against the previous behaviour. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-05-08T19:42:49Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
clickhouse-docs	Ready	Preview, Comment	May 8, 2026 7:56pm
clickhouse-docs-jp	Building	Preview, Comment	May 8, 2026 7:56pm

3 Skipped Deployments

Project	Deployment	Actions	Updated (UTC)
clickhouse-docs-ko	Ignored	Preview	May 8, 2026 7:56pm
clickhouse-docs-ru	Ignored	Preview	May 8, 2026 7:56pm
clickhouse-docs-zh	Ignored	Preview	May 8, 2026 7:56pm

dhtclk requested a review from a team as a code owner May 8, 2026 19:42

Blargian approved these changes May 8, 2026

View reviewed changes

vercel Bot deployed to Preview – clickhouse-docs May 8, 2026 19:56 View deployment

dhtclk merged commit 3f8508b into main May 8, 2026
16 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(vale): track full line range per hunk in CI annotation parser#6188

fix(vale): track full line range per hunk in CI annotation parser#6188
dhtclk merged 1 commit intomainfrom
fix/vale-changed-lines-full-range

dhtclk commented May 8, 2026

Uh oh!

vercel Bot commented May 8, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dhtclk commented May 8, 2026

Summary

Before

After

Validation against PR #6183

Blast radius

Test plan

Uh oh!

vercel Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vercel Bot commented May 8, 2026 •

edited

Loading