Add Korean Lunar Calendar (음력) support #2375
Open
dsblank wants to merge 11 commits into
Adds a new CAL_CHINESE_LUNAR (value 7) calendar type alongside the existing Gregorian, Julian, Hebrew, French Republican, Persian, Islamic, and Swedish calendars. Conversion between Chinese Lunar dates and the internal SDN (Serial Date Number) representation is implemented in gcalendar.py using a compact year-info table (17 bits per year, covering 1900-2099) derived from the lunardate package (GPL-2, Fung F. Lee, Ricky Yeung, LI Daobing). Leap months are encoded as month + 100 (e.g. month 104 = intercalary 4th month). Month names use pinyin romanisation (Zhengyue...Shier'yue) so they can be translated for any locale. Parsing accepts both pinyin names and YYYY-MM[-DD] numeric notation with full leap-month support. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
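The month + 100 leap-month convention described above can be illustrated with a pair of small helpers. This is a minimal sketch only: `encode_month` and `decode_month` are hypothetical names chosen for illustration, not functions from gcalendar.py.

```python
LEAP_OFFSET = 100  # leap months are stored as month + 100, per the commit message


def encode_month(month, is_leap=False):
    """Return the stored month value (hypothetical helper, not the Gramps API)."""
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    return month + LEAP_OFFSET if is_leap else month


def decode_month(stored):
    """Split a stored month value into (month, is_leap)."""
    is_leap = stored > LEAP_OFFSET
    month = stored - LEAP_OFFSET if is_leap else stored
    if not 1 <= month <= 12:
        raise ValueError(f"invalid stored month: {stored}")
    return month, is_leap
```

Under this sketch, the intercalary 4th month is stored as 104 and decodes back to month 4 with the leap flag set.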
When Gramps runs under a Simplified (zh_CN) or Traditional (zh_TW) Chinese locale, Chinese Lunar calendar dates are now displayed as native 年/月/日 strings (e.g. 1976年八月8日) instead of pinyin romanisation. Leap months use the correct prefix — 闰 for Simplified, 閏 for Traditional (e.g. 1976年闰八月8日 for month 108). The parsers now recognise Chinese character month names (正月, 二月 ... 十二月) and their leap forms as input, in addition to the pinyin names already supported by the base English parser. Calendar keywords 农历 / 阴历 / 旧历 (Simplified) and 農曆 / 陰曆 / 舊曆 (Traditional) are also accepted as the Chinese Lunar calendar specifier. Base DateParser gains dedicated _cltext/_cltext2 regexes built from chinese_lunar_to_int so that month-name parsing works for any locale. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
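The native 年/月/日 rendering can be sketched as follows. `format_zh_lunar` and the two lookup tables are hypothetical names used for illustration; the actual zh_CN and zh_TW displayers in the PR may be structured differently.

```python
# Month names 正月 ... 十二月, as listed in the commit message.
ZH_MONTHS = [
    "正月", "二月", "三月", "四月", "五月", "六月",
    "七月", "八月", "九月", "十月", "十一月", "十二月",
]
# Leap prefix per locale: 闰 for Simplified, 閏 for Traditional.
LEAP_PREFIX = {"zh_CN": "闰", "zh_TW": "閏"}


def format_zh_lunar(year, stored_month, day, locale="zh_CN"):
    """Render a lunar date; stored_month > 100 marks a leap month (assumed helper)."""
    is_leap = stored_month > 100
    month = stored_month - 100 if is_leap else stored_month
    prefix = LEAP_PREFIX[locale] if is_leap else ""
    return f"{year}年{prefix}{ZH_MONTHS[month - 1]}{day}日"
```

With these assumptions, `format_zh_lunar(1976, 108, 8)` yields the 1976年闰八月8日 form quoted above, and passing `locale="zh_TW"` switches the prefix to 閏.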
Introduces chinese_sexagenary_year(year) in gcalendar.py, which maps any Gregorian-aligned Chinese year to its 干支 name (e.g. 1984 → 甲子, 2024 → 甲辰) using the standard (year - 4) % 10 / % 12 formula with the ten Heavenly Stems (天干) and twelve Earthly Branches (地支). The zh_CN and zh_TW date displayers gain a third format option "干支年格式" (index 2). When selected, Chinese Lunar dates render as e.g. 甲子年八月8日 or 甲辰年闰八月8日 instead of a numeric year. Three new unit tests cover: known year values (1984 甲子, 2024 甲辰, 1900 庚子, 1949 己丑, 2025 乙巳), the 60-year cycle property, and that all 60 stem-branch combinations appear exactly once per cycle. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
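The (year - 4) % 10 / % 12 formula can be written out in a few lines. This is a sketch under the assumption that the stems and branches are indexed from 甲 and 子 respectively; `sexagenary_year` is an illustrative name, not the signature of chinese_sexagenary_year in the PR.

```python
STEMS = "甲乙丙丁戊己庚辛壬癸"        # ten Heavenly Stems (天干)
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"  # twelve Earthly Branches (地支)


def sexagenary_year(year):
    """Return the 干支 name for a Gregorian-aligned Chinese year."""
    offset = year - 4  # year 4 falls on 甲子, the start of a cycle
    return STEMS[offset % 10] + BRANCHES[offset % 12]
```

Because 10 and 12 have a least common multiple of 60, the pair of indices repeats every 60 years and each of the 60 combinations occurs once per cycle, which is what the unit tests described above check.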
In the base _display_chinese_lunar, passing a remapped date_val (month 108 → 8) to _display_calendar caused the ISO string to show month 08 instead of 108, breaking parser round-trips. Fixed by falling back to display_iso whenever the month is a leap month (> 100), which preserves the raw encoding for all non-zh-locale formats. Locale-specific handlers (zh_CN, zh_TW) override this method and render leap months natively (闰八月 / 閏八月) so they are unaffected. Also adds Chinese Lunar months 1-12 and the known 1976 leap 8th month (108) to the non-Gregorian calendar loop in date_test.py. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
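The round-trip problem can be reproduced with a toy formatter and parser. The sketch below assumes a zero-padded YYYY-MM-DD string in which a leap month keeps its raw three-digit value; `to_iso` and `from_iso` are hypothetical stand-ins, not the display_iso method or the parser from the PR.

```python
import re

# Month may have up to three digits so that raw leap values such as 108 fit.
_ISO = re.compile(r"^(\d{1,4})-(\d{1,3})(?:-(\d{1,2}))?$")


def to_iso(year, stored_month, day):
    """Format using the raw stored month, so 108 stays 108 (assumed format)."""
    return f"{year:04d}-{stored_month:02d}-{day:02d}"


def from_iso(text):
    """Parse YYYY-MM[-DD] back into (year, stored_month, day); day 0 if absent."""
    m = _ISO.match(text)
    if m is None:
        raise ValueError(f"not an ISO-style date: {text!r}")
    year, month, day = m.groups()
    return int(year), int(month), int(day or 0)
```

Formatting the remapped month 8 instead of the raw 108 produces a string that parses back as the regular 8th month, which is the information loss the fix avoids.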
Replace the 200-entry (1900-2099) SDN lookup table with a 9600-entry table covering years 400-9999, generated from the tyme4py library (MIT licence, https://github.com/6tail/tyme4py). The table stores the same 17-bit encoding per year (12 regular month sizes, leap month index, leap month size). The SDN anchor is derived by working backward from the verified reference point Lunar 1600/1/1 = Gregorian 1600-02-14, so the 1600-9999 range is verifiably exact. Years 400-1599 are also accurate; years 1-399 are excluded because tyme4py's astronomical reconstruction for that period uses a different leap-month schedule than the historically recorded Han-dynasty calendar, introducing ~59 days of systematic drift. The generation script scripts/gen_chinese_lunar_table.py can be run to regenerate the table from tyme4py if the upstream library is updated. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
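A 17-bit year record with these three fields could be packed and unpacked as sketched below. The bit layout is an assumption made for illustration (bits 0-11 for the twelve regular month sizes, bits 12-15 for the leap month index, bit 16 for the leap month size); the actual ordering in the PR's table is not stated in the commit message, and the function names are hypothetical.

```python
def pack_year(month_is_long, leap_month=0, leap_is_long=False):
    """Pack one year into 17 bits (assumed layout, see lead-in)."""
    if len(month_is_long) != 12:
        raise ValueError("expected 12 regular months")
    if not 0 <= leap_month <= 12:
        raise ValueError("leap_month must be 0 (none) or 1..12")
    bits = 0
    for i, is_long in enumerate(month_is_long):
        if is_long:
            bits |= 1 << i            # bit i set: month i+1 has 30 days
    bits |= leap_month << 12          # 4 bits: which month is repeated
    bits |= int(leap_is_long) << 16   # 1 bit: leap month has 30 days
    return bits


def unpack_year(bits):
    """Return (regular month sizes, leap month index, leap month size)."""
    sizes = [30 if (bits >> i) & 1 else 29 for i in range(12)]
    leap_month = (bits >> 12) & 0xF
    leap_size = (30 if (bits >> 16) & 1 else 29) if leap_month else 0
    return sizes, leap_month, leap_size


def year_length(bits):
    """Total days in the lunar year described by one record."""
    sizes, _, leap_size = unpack_year(bits)
    return sum(sizes) + leap_size
```

Summing `year_length` over consecutive records from a known anchor date is one way an SDN for the first day of any tabulated year could be derived, in the spirit of the anchor described above.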
Adds the Korean Lunar calendar as calendar type 8, following the same pattern as the Chinese Lunar calendar. The Korean and Chinese lunar calendars share identical astronomical rules and dates; the only differences are the display names (Korean month names and 간지 year names).
New files:
- gramps/gen/datehandler/_date_ko.py: Korean parser and displayer, registered for ko/ko_KR locales. Month names use traditional Korean (정월…십이월), leap months are prefixed with 윤, and format 2 shows the 60-year 간지 (干支) sexagenary cycle (갑자…계해).
- gramps/gen/lib/test/korean_calendar_test.py: SDN round-trip, 간지 name, and base handler tests.
- gramps/gen/datehandler/test/date_ko_test.py: Korean parser and displayer tests.
Modified files: gcalendar.py (korean_lunar_sdn, korean_lunar_ymd, korean_ganji_year), date.py (CAL_KOREAN_LUNAR = 8), _datestrings.py (Korean Lunar calendar name and romanised month names), _datedisplay.py (_display_korean_lunar), _dateparser.py (korean_lunar_to_int, _parse_korean_lunar), __init__.py, editdate.py, po/POTFILES.skip.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Five bugs caused 1854 test failures for lang='ko':
- calendar_to_int used short names ("율리우스") while the translations
produce the full calendar names with the 력 suffix ("율리우스력",
"히브리력", "이슬람력", "프랑스 혁명력", "페르시안력", "스웨덴 달력").
Added both forms so either spelling is accepted.
- All modifier and quality msgstr in ko.po were empty, causing display
to fall back to English ("before ", "after ", "about ", "from ",
"to ", "estimated ", "calculated "). Added English fallbacks to
modifier_to_int and quality_to_int so the parser accepts them.
- The span regex used "부터...까지" but the Korean span display
format is "<start>에서 <stop>까지". Replaced with an infix
"에서...까지" pattern using named groups start/stop.
- The range regex used "사이...와" but the Korean range display format
is "<start>에서 <stop>사이". Replaced with an infix "에서...사이"
pattern.
- The ko.po span and range format templates put {date_quality} at the
end after 까지/사이, so the parser could never strip quality before
parsing the span. Moved {date_quality} to the front of both
templates to match how every other locale orders the fields.
- Collapsed a 3-line _klmon_str assignment to one line to satisfy
Black's line-length formatting.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
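The infix span and range patterns with named groups start/stop can be sketched like this. The regexes and the `split_compound` helper are illustrative assumptions, not the expressions used in DateParserKO, and the sample date strings are hypothetical.

```python
import re

# "<start>에서 <stop>까지" is a span; "<start>에서 <stop>사이" is a range.
SPAN = re.compile(r"(?P<start>.+?)에서\s+(?P<stop>.+?)\s*까지$")
RANGE = re.compile(r"(?P<start>.+?)에서\s+(?P<stop>.+?)\s*사이$")


def split_compound(text):
    """Return (kind, start, stop) for a span or range, or None if neither matches."""
    text = text.strip()
    for kind, pattern in (("span", SPAN), ("range", RANGE)):
        m = pattern.match(text)
        if m:
            return kind, m.group("start").strip(), m.group("stop").strip()
    return None
```

A string using the old 부터...까지 wording would not match either pattern, which mirrors why the parser and the displayer have to agree on the same infix form.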
Fill in po/ko.po modifier msgstr values (이전/이후/약/부터/까지) and quality msgstr values (추정/계산) so the date displayer outputs Korean rather than falling back to English. Round-trip parsing is preserved because DateParserKO.modifier_to_int and quality_to_int already list these Korean keywords.
Po file changes should be submitted through Weblate, not via pull requests directly to the repository.
…abel.
The Korean locale translates the span display format as
"{nonstd_calendar_and_ny}{date_start}에서 {date_stop}까지 {date_quality}",
placing the (calendar) label at the start and the quality word at the end.
After strip() the leading whitespace was gone, so the old _cal regex
"(.*)\s+\(cal\)(.*)" (requiring \s+ before the opening parenthesis) never
matched, causing the entire date to be stored as text-only.
Two regex fixes in _dateparser.init_strings():
* _cal, _calny, _calny_iso, _ny, _ny_iso: change (.*)\s+ to (.*?)\s* so
that a "(CalendarName)" at position 0 (no leading whitespace) is
recognised. Using non-greedy (.*?) keeps the leftmost-match behaviour
identical to the old greedy approach for the common case where the
calendar label appears at the end.
* _qual: extend the pattern from "(.* ?)QUAL\s+(.+)" to
"(.* ?)QUAL(?:\s+(.+)|\s*$)" so quality words that appear at the very
end of the string (nothing after them) are also stripped.
match_quality updated to handle the new optional group(3) (None when
quality is at the end).
These changes resolve 1344 CI failures in
DateHandlerTest.test_span for lang='ko'.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
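The effect of both regex changes can be checked in isolation. In this sketch the calendar label "Julian" and the quality words "estimated|calculated" are stand-ins for the locale-specific alternations that init_strings() builds; only the surrounding pattern shapes are taken from the commit message.

```python
import re

CAL = "Julian"  # stand-in for the calendar-name alternation
OLD_CAL = re.compile(r"(.*)\s+\((" + CAL + r")\)(.*)", re.IGNORECASE)
NEW_CAL = re.compile(r"(.*?)\s*\((" + CAL + r")\)(.*)", re.IGNORECASE)

QUAL = "estimated|calculated"  # stand-in for the quality-word alternation
OLD_QUAL = re.compile(r"(.* ?)(" + QUAL + r")\s+(.+)", re.IGNORECASE)
NEW_QUAL = re.compile(r"(.* ?)(" + QUAL + r")(?:\s+(.+)|\s*$)", re.IGNORECASE)

# Calendar label at position 0: only the new pattern matches.
leading = "(Julian) 1900"
print(OLD_CAL.match(leading), NEW_CAL.match(leading).groups())

# Quality word at the very end: only the new pattern matches, group(3) is None.
trailing = "1900 estimated"
print(OLD_QUAL.match(trailing), NEW_QUAL.match(trailing).groups())
```

For the common case of a trailing label such as "1900 (Julian)", both the old and the new calendar patterns capture the same groups, so existing locales are unaffected in this sketch.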
In Korean, temporal modifiers meaning "before" and "after" follow the date: "2000년 이전" (before 2000), "1949년 이후" (after 1949). Previously both were in modifier_to_int (prefix), so "2000년 이전" was treated as unrecognised text and the display fell back to English "before"/"after".
- Move 이전/이후 from modifier_to_int to modifier_after_to_int in DateParserKO. English fallbacks (before/after) remain in modifier_to_int so stored dates with English text still parse.
- Override _modifier_after regex in init_strings to use \s* so that both "2000년 이전" (space) and "2000년이전" (no space) are accepted.
- Add __init__ to DateDisplayKO to patch _mod_str with leading-space postfix strings for 이전/이후 and prefix "약 " for MOD_ABOUT, and to restore korean_lunar Hangul month names overwritten by DateDisplay.__init__.
- Update test_modifier_ijeon/ihu to check modifier_after_to_int.
- Add TestKoreanModifierParsing covering parser, display word order, English fallbacks, and round-trips.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
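Postfix modifier matching with an optional space can be sketched as below. The dictionary values "before"/"after" are stand-ins for the Gramps modifier constants, and `parse_postfix_modifier` is a hypothetical helper rather than the overridden _modifier_after regex itself.

```python
import re

# Korean postfix modifiers mapped to stand-in labels (not the Gramps constants).
MODIFIER_AFTER = {"이전": "before", "이후": "after"}

# \s* accepts both "2000년 이전" (space) and "2000년이전" (no space).
_MODIFIER_AFTER_RE = re.compile(
    r"(?P<date>.+?)\s*(?P<mod>" + "|".join(MODIFIER_AFTER) + r")$"
)


def parse_postfix_modifier(text):
    """Return (modifier, date_text), or None when no postfix modifier is present."""
    m = _MODIFIER_AFTER_RE.match(text.strip())
    if m is None:
        return None
    return MODIFIER_AFTER[m.group("mod")], m.group("date")
```

The non-greedy date group lets the modifier be peeled off the end of the string while the remaining text is handed on for ordinary date parsing.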
Summary
Adds the Korean Lunar calendar (음력) as calendar type 8, following the same pattern as the Chinese Lunar calendar added in #2354.
The Korean and Chinese lunar calendars share identical astronomical rules and produce the same dates. The differences are display names only: Korean month names and 간지 year names.
New files
- gramps/gen/datehandler/_date_ko.py (ko/ko_KR locales)
- gramps/gen/lib/test/korean_calendar_test.py
- gramps/gen/datehandler/test/date_ko_test.py
Modified files
- gcalendar.py: korean_lunar_sdn, korean_lunar_ymd, korean_ganji_year
- date.py: CAL_KOREAN_LUNAR = 8, updated CALENDARS, converter lists and name lists
- _datestrings.py: "Korean Lunar" calendar name; romanised month names (Jeongwol … Sibiwol) for non-Korean locales
- _datedisplay.py: korean_lunar attribute; _display_korean_lunar base method
- _dateparser.py: korean_lunar_to_int class variable; _parse_korean_lunar method
- datehandler/__init__.py: imports _date_ko
- gui/editors/editdate.py: adds CAL_KOREAN_LUNAR to month-name map
- po/POTFILES.skip: excludes _date_ko.py (no translatable strings)
Dependencies
This PR is built on top of #2354 (Chinese Lunar Calendar). Once #2354 is merged to master this branch will be rebased and the diff will show only the Korean-specific changes.
Test plan
(korean_calendar_test.py, date_ko_test.py)
🤖 Generated with Claude Code
Date Display: US (English) vs KO (Korean)