feat: Korean language-aware compression #462

Open
gruming wants to merge 4 commits into JuliusBrussee:main from gruming:korean-language-aware-compression

gruming commented May 28, 2026

Summary

Caveman's compression rules are English-specific (drop articles, prefer short English synonyms), so Korean responses break: garbled fragments, forced or awkward translations of technical terms, and unnatural phrasing. This PR makes caveman respond in the user's language and compress it natively.

When responding in Korean it now:

  • uses 음슴체/개조식 (Korean's natural terse register) instead of broken grunt
  • drops particles (은/는/이/가/을/를) only when grammatical role stays clear
  • keeps technical terms in English: the first mention is written as 한글(English), later mentions use a single form; ultra drops the gloss
  • never translates/transliterates code, function/API names, or error strings
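The first-mention gloss rule above can be illustrated with a minimal sketch. This is not code from the PR (the change itself is prompt text in SKILL.md); the function name `gloss_terms` and the glossary mapping are hypothetical, and the sketch assumes the "single form" used after the first mention is the English term.

```python
def gloss_terms(terms, glossary, ultra=False):
    """Render technical-term mentions in order of appearance.

    terms:    English technical terms as they occur in the response.
    glossary: hypothetical English -> Korean mapping (illustration only).
    ultra:    when True, never emit the Korean gloss.
    """
    seen = set()
    rendered = []
    for term in terms:
        korean = glossary.get(term)
        if ultra or korean is None or term in seen:
            # Later mentions, unknown terms, and ultra mode: English only.
            # (Assumption: the post-first-mention "single form" is English.)
            rendered.append(term)
        else:
            # First mention: Korean followed by the English term in parentheses.
            rendered.append(f"{korean}({term})")
        seen.add(term)
    return rendered


glossary = {"cache": "캐시", "index": "인덱스"}
mentions = ["cache", "index", "cache"]

print(gloss_terms(mentions, glossary))
print(gloss_terms(mentions, glossary, ultra=True))
```

Code, function/API names, and error strings would bypass this step entirely, since the rules say they are never translated or transliterated.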

Design

  • Language is orthogonal to intensity. lite/full/ultra are unchanged; the new rules just describe how each intensity manifests in Korean. No new mode names (no korean-*).
  • Auto-detected, not a toggle. Caveman follows the user's input language, so it "just works" after install. wenyan-* stays an explicit Classical Chinese override and wins over the auto-detected language (wenyan = chosen style; Korean = following the user's actual language).
  • English behavior is unchanged: every hunk is additive, and existing English rules/examples are untouched. Backward compatible.
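The precedence described in this section (language orthogonal to intensity, wenyan-* overriding auto-detection) can be sketched as follows. In the PR the model itself follows the user's language from the prompt, so this is only an illustration of the decision order; `is_korean`, `resolve_style`, the Hangul-ratio threshold, and the `wenyan-<intensity>` mode naming are assumptions, not part of the change.

```python
INTENSITIES = ("lite", "full", "ultra")


def is_korean(text, threshold=0.3):
    """Heuristic: share of alphabetic characters that are Hangul."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    hangul = sum(
        1
        for ch in letters
        if "\uac00" <= ch <= "\ud7a3"   # Hangul syllables
        or "\u1100" <= ch <= "\u11ff"   # Hangul Jamo
        or "\u3130" <= ch <= "\u318f"   # Hangul compatibility Jamo
    )
    return hangul / len(letters) >= threshold


def resolve_style(user_text, mode="full"):
    """Return (language, intensity) for a response.

    An explicit wenyan-* mode wins over the detected language;
    otherwise the language follows the user's input.
    """
    if mode.startswith("wenyan-"):
        language = "classical-chinese"
        intensity = mode[len("wenyan-"):]
    else:
        # Simplification: this sketch only separates Korean from English.
        language = "korean" if is_korean(user_text) else "english"
        intensity = mode
    if intensity not in INTENSITIES:
        raise ValueError(f"unknown intensity: {intensity!r}")
    return language, intensity


print(resolve_style("인덱스가 뭐야?", "ultra"))
print(resolve_style("인덱스가 뭐야?", "wenyan-full"))
print(resolve_style("What is an index?"))
```

Because intensity is resolved independently of language, no korean-* mode names are needed.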

Changes

  • skills/caveman/SKILL.md — new ## Language + ## Korean sections, Korean lite/full/ultra examples, description note. (Single source of truth; the plugins/ mirror + dist/ are left for CI sync per CONTRIBUTING.)
  • src/rules/caveman-activate.md — language-aware line for the always-on rule body.
  • README.md — short Korean section + What You Get row.

Test plan

Ran the edited SKILL.md as a system prompt through claude -p (mirrors the eval harness setup):

  • Korean concept question -> natural 음슴체, English technical terms preserved
  • Destructive op (DROP COLUMN) -> auto-clarity drops to plain prose for the warning, SQL preserved
  • ultra -> max compression, English terms only, no 병기 (side-by-side Korean/English) gloss
  • English prompt -> unchanged English caveman, no Korean leakage
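The checks above were run by hand. A minimal sketch of how the same expectations might be asserted on captured output is shown below. The helper names and the sample strings are hypothetical, not actual model output from the PR, and the gloss pattern assumes the 한글(English) form described in the summary.

```python
import re

HANGUL = re.compile(r"[\uac00-\ud7a3]")
# Hangul word immediately followed by a parenthesised Latin-script term.
GLOSS = re.compile(r"[\uac00-\ud7a3]+\s?\([A-Za-z][^)]*\)")


def has_korean_leakage(output):
    """True if an English-prompt response contains any Hangul syllable."""
    return HANGUL.search(output) is not None


def preserves_literals(output, literals):
    """True if every code/SQL literal appears verbatim in the response."""
    return all(literal in output for literal in literals)


def has_gloss(output):
    """True if the response contains a 한글(English) style gloss."""
    return GLOSS.search(output) is not None


# Hypothetical sample outputs, for illustration only.
full_ko = "인덱스(index) 추가함. 조회 빨라짐."
ultra_ko = "index 추가. 조회 빠름."
english = "Add index. Query fast."
warning_ko = "주의: ALTER TABLE users DROP COLUMN email 실행 시 데이터가 삭제됩니다."

print(has_gloss(full_ko), has_gloss(ultra_ko))
print(has_korean_leakage(english))
print(preserves_literals(warning_ko, ["DROP COLUMN"]))
```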
