Skip to content

Refresh check spelling#10106

Open
jsoref wants to merge 7 commits intorancher-sandbox:mainfrom
jsoref:refresh-check-spelling
Open

Refresh check spelling#10106
jsoref wants to merge 7 commits intorancher-sandbox:mainfrom
jsoref:refresh-check-spelling

Conversation

@jsoref
Copy link
Copy Markdown
Contributor

@jsoref jsoref commented Apr 7, 2026

This takes advantage of a new feature in v0.0.26: config.json for defining configuration using json instead of in workflows.

jsoref added 7 commits April 7, 2026 13:12
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
@jsoref jsoref force-pushed the refresh-check-spelling branch from 1b8eab1 to a12c271 Compare April 7, 2026 17:13
@@ -1,4 +1,5 @@
emoji
esc'd
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still working on properly fixing it -- I've added some initial code to help with this and will write a test (but probably not today)

@@ -0,0 +1,29 @@
{
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file can contain configuration as long as it's delegated by $INPUTS via load-config-from.pr-base-keys.

Comment on lines +111 to +112
sudo apt-get update
sudo apt-get install -y cpanminus
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(DOCKER_HOST=unix:///Users/$USER/.rd/docker.sock act --container-daemon-socket - -j spelling 2>&1) 

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, they don't have the equivalent of DEBIAN_FRONTEND=noninteractive set (and whatever else that causes auto-detect to fall over)? Having that actually in the commit message would be nice, but that's sort of cherry on top here.

Comment on lines -66 to +69
git clone --branch "$version" --depth 1 "$repo" "$checkout" >&2
git -c advice.detachedHead=false clone --branch "$version" --depth 1 "$repo" "$checkout" >&2
else
git -C "$checkout" fetch origin "$version" >&2
git -C "$checkout" checkout "$version" >&2
git -c advice.detachedHead=false -C "$checkout" checkout "$version" >&2
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reduces some noise at the start.


INPUTS=$(yq --output-format=json <<EOF
suppress_push_for_open_pull_request: 1
checkout: true
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't necessary as the workflow checks out the repository (or the repository exists because someone has it and is running this script).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I just copied it from the workflow without thinking too hard.

cspell:dart/dart.txt
load-config-from: |
{
"pr-base-keys": [
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually sure this is the right choice for this repository. I need to think about this a bit more... I think that the equivalent might actually be to use pr-trusted-keys.

The current code spelling.sh basically uses whatever is in this file which is effectively the merge head as opposed to the base...

Anyway, this change is more to talk about whether it would be desirable to use the JSON config which v0.0.26 has or rely on yq to produce json.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think pr-trusted-keys makes more sense to us, because we don't allow any permissions in this workflow anyway so it should be no different; it just makes making PRs changing check-spelling configuration easier.

I don't think using the external JSON file is materially better (because we still need the bulk of the config in this file); if we can move all of it over there, then it would make more sense, instead of having configuration in two places instead. (Having just one line to load the other file, of course, is an exception.) Otherwise we already could have the whole thing in a different file, and only patch in use_sarif via jq/yq. Of course, if using the external file makes things much easier for version bumps, that's a good reason to do that; but in that case, please document why (in a GitHub PR comment is fine).

(P.S. the documentation needs an update now that v0.0.26 is out.)

Comment on lines -822 to -829
# Homoglyph (Cyrillic) should be `A`/`B`/`C`/`E`/`H`/`I`/`I`/`J`/`K`/`M`/`O`/`P`/`S`/`T`/`Y`
# It's possible that your content is intentionally mixing Cyrillic and Latin scripts, but if it isn't, you definitely want to correct this.
(?<=[A-Z]{2})[АВСЕНІӀЈКМОРЅТУ]|[АВСЕНІӀЈКМОРЅТУ](?=[A-Z]+(?:\b|[a-z]+)|[a-z]+(?:[^a-z]|$))

# Homoglyph (Cyrillic) should be `a`/`b`/`c`/`e`/`o`/`p`/`x`/`y`
# It's possible that your content is intentionally mixing Cyrillic and Latin scripts, but if it isn't, you definitely want to correct this.
[авсеорху](?=[A-Za-z]{2,})|(?<=[A-Za-z]{2})[авсеорху]|(?<=[A-Za-z])[авсеорху](?=[A-Za-z])

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check-spelling will now complain about the presence of homoglyphs where letters should be if they substitute for a word in the dictionary.

If you're worried about individual homoglyphs, you'd still want this, in which case it'd make sense to pull in two more rules: https://github.com/check-spelling/spell-check-this/blob/6bdcf01505c96f16f6016cc97fd6a045baca8577/.github/actions/spelling/line_forbidden.patterns#L1006-L1010

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we're not particularly worried about homoglyphs (more than typical). Saw that in the release notes, but didn't realize we already had something here. Thanks!

@jsoref jsoref marked this pull request as ready for review April 7, 2026 17:25
Comment on lines -822 to -829
# Homoglyph (Cyrillic) should be `A`/`B`/`C`/`E`/`H`/`I`/`I`/`J`/`K`/`M`/`O`/`P`/`S`/`T`/`Y`
# It's possible that your content is intentionally mixing Cyrillic and Latin scripts, but if it isn't, you definitely want to correct this.
(?<=[A-Z]{2})[АВСЕНІӀЈКМОРЅТУ]|[АВСЕНІӀЈКМОРЅТУ](?=[A-Z]+(?:\b|[a-z]+)|[a-z]+(?:[^a-z]|$))

# Homoglyph (Cyrillic) should be `a`/`b`/`c`/`e`/`o`/`p`/`x`/`y`
# It's possible that your content is intentionally mixing Cyrillic and Latin scripts, but if it isn't, you definitely want to correct this.
[авсеорху](?=[A-Za-z]{2,})|(?<=[A-Za-z]{2})[авсеорху]|(?<=[A-Za-z])[авсеорху](?=[A-Za-z])

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we're not particularly worried about homoglyphs (more than typical). Saw that in the release notes, but didn't realize we already had something here. Thanks!

Comment on lines +111 to +112
sudo apt-get update
sudo apt-get install -y cpanminus
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, they don't have the equivalent of DEBIAN_FRONTEND=noninteractive set (and whatever else that causes auto-detect to fall over)? Having that actually in the commit message would be nice, but that's sort of cherry on top here.


INPUTS=$(yq --output-format=json <<EOF
suppress_push_for_open_pull_request: 1
checkout: true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I just copied it from the workflow without thinking too hard.

cspell:dart/dart.txt
load-config-from: |
{
"pr-base-keys": [
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think pr-trusted-keys makes more sense to us, because we don't allow any permissions in this workflow anyway so it should be no different; it just makes making PRs changing check-spelling configuration easier.

I don't think using the external JSON file is materially better (because we still need the bulk of the config in this file); if we can move all of it over there, then it would make more sense, instead of having configuration in two places instead. (Having just one line to load the other file, of course, is an exception.) Otherwise we already could have the whole thing in a different file, and only patch in use_sarif via jq/yq. Of course, if using the external file makes things much easier for version bumps, that's a good reason to do that; but in that case, please document why (in a GitHub PR comment is fine).

(P.S. the documentation needs an update now that v0.0.26 is out.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants