Reject non-ASCII digits in vWeekday ordwk (BYDAY/BYWEEKDAY/WKST) #1488
Open
Labib-Bin-Salam wants to merge 1 commit into
Conversation
The WEEKDAY_RULE regular expression used ``\d`` for the ``ordwk`` (relative) part. In Python's default Unicode mode ``\d`` also matches non-ASCII decimal digits such as the Arabic-Indic digits for 12, and ``int`` then parses them, so a value like "<Arabic-Indic 12>MO" was silently accepted with relative == 12 instead of being rejected. RFC 5545, section 3.3.10 allows only ASCII DIGIT here (ordwk = 1*2DIGIT).

Match [0-9] instead, so that only the ASCII digits the grammar permits ever reach int(). This mirrors the earlier vMonth non-ASCII digit fix.

Added a regression test covering Arabic-Indic and extended-Arabic digits, and a change log entry. Existing valid values (e.g. 2MO, -1SU, +3WE, 53SU) are unaffected.

AI usage disclosure (per CONTRIBUTING, "Responsible AI use"):
- Model: Claude Opus 4.8 (claude-opus-4-8), run via Claude Code.
- How used: assisted in identifying this sibling of the merged vMonth fix, drafting the one-line regex change, the regression test, and the change log entry. All output was reviewed, executed, and verified by the author (targeted prop and recurrence suites plus ruff pass locally).
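The matching behaviour described in the commit message can be checked with the standard library alone. The snippet below is a minimal sketch, not code from this PR: the pattern names and the `digit_report` helper are made up for illustration, and the patterns cover only the digit part, not the full WEEKDAY_RULE.

```python
import re

# Written with escapes so the non-ASCII digits cannot be mistaken for ASCII "12".
ARABIC_INDIC_12 = "\u0661\u0662"     # ARABIC-INDIC DIGIT ONE, TWO
EXTENDED_ARABIC_12 = "\u06f1\u06f2"  # EXTENDED ARABIC-INDIC DIGIT ONE, TWO

# Hypothetical stand-ins for the digit part of the rule.
UNICODE_DIGITS = re.compile(r"\d{1,2}")        # str patterns use Unicode matching by default
ASCII_CLASS = re.compile(r"[0-9]{1,2}")        # explicit ASCII character class
ASCII_FLAG = re.compile(r"\d{1,2}", re.ASCII)  # the flag restricts \d to [0-9]


def digit_report(text: str) -> dict:
    """Summarise how each pattern and int() treat ``text``."""
    return {
        "unicode_d": UNICODE_DIGITS.fullmatch(text) is not None,
        "ascii_class": ASCII_CLASS.fullmatch(text) is not None,
        "ascii_flag": ASCII_FLAG.fullmatch(text) is not None,
        "int_value": int(text),  # int() accepts any Unicode decimal digits
    }


for sample in ("12", ARABIC_INDIC_12, EXTENDED_ARABIC_12):
    print(ascii(sample), digit_report(sample))
```

Compiling with `re.ASCII` would be another way to restrict `\d`, but the flag applies to the whole pattern; the commit instead changes only the character class.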
Contributor
Documentation build overview
4 files changed: 404.html, contribute/index.html, _modules/icalendar/parser/string.html, _modules/icalendar/prop/recur/weekday.html
angatha
reviewed
Jun 21, 2026
    r"""Non-ASCII digits must not be accepted in the ``ordwk`` (relative) part.

    ``\d`` is ``True`` for non-ASCII digits (e.g. Arabic-Indic ones), and
    ``int`` parses them, so an Arabic-Indic ``"12MO"`` was silently accepted
Collaborator
"12MO" is the valid version, did you mean "١٢MO"?
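One possible way to keep the two spellings apart in a test or docstring is to write the non-ASCII digits with named Unicode escapes, so the source stays ASCII and the intent is visible even when fonts or bidirectional rendering make the characters hard to tell apart. This is only a sketch of that idea, not what the PR does; the variable names are illustrative.

```python
import unicodedata

# Named escapes make it explicit which digits are meant.
arabic_indic_12mo = "\N{ARABIC-INDIC DIGIT ONE}\N{ARABIC-INDIC DIGIT TWO}MO"
ascii_12mo = "12MO"

# The two strings look similar in prose but are different values.
assert arabic_indic_12mo != ascii_12mo
assert ascii_12mo.isascii()
assert not arabic_indic_12mo.isascii()

# unicodedata.name() recovers the character names for error messages or logs.
digit_names = [unicodedata.name(ch) for ch in arabic_indic_12mo[:2]]
print(digit_names)
```

Note that `\N{...}` escapes are not interpreted inside a raw string such as an `r"""` docstring; there they would appear literally, which can still serve as readable documentation.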
Linked issue
No related issue; reporting directly (same as #1465).
Description
vWeekday silently accepts non-ASCII digits in the ordwk (relative) part of a BYDAY/BYWEEKDAY/WKST value.

Repro: vWeekday("١٢MO") (Arabic-Indic "12"), or parse a calendar containing RRULE:FREQ=MONTHLY;BYDAY=١٢MO.

Cause: WEEKDAY_RULE matches the relative part with (?P<relative>[\d]{0,2}). In Python's default Unicode mode, \d also matches non-ASCII decimal digits (Arabic-Indic ١٢, extended-Arabic ۱۲, Devanagari, …). int() then parses those, so a malformed ordwk is silently accepted and normalized to a valid relative number. RFC 5545 §3.3.10 allows only ASCII DIGIT here (ordwk = 1*2DIGIT).

Fix: match [0-9] instead of \d so that only ASCII digits ever reach int(). After the fix, vWeekday("١٢MO") raises ValueError like any other malformed weekday, while every valid ASCII value (MO, 2MO, -1SU, +3WE, 53SU) is unchanged.

This is the direct sibling of the recently merged vMonth fix (#1465), which fixed the same class of non-ASCII-digit bug in BYMONTH.

Checklist
/news, following the instructions in Change log entry format.

Additional information
- Added test_non_ascii_ordwk_digits_rejected (Arabic-Indic and extended-Arabic digits, single- and double-digit) next to the existing vWeekday tests; it fails before the change and passes after.
- ruff check and ruff format --check are clean on the changed files; the targeted prop/ (887) and recurrence/RRULE (444) suites pass locally. (The unrelated test_timezone_identification / test_with_doctest failures reproduce identically on a clean checkout in this environment and are not affected by this change.)
- AI usage disclosure: Claude Opus 4.8 (claude-opus-4-8), run via Claude Code, was used to locate this sibling of the vMonth fix and to draft the regex change, the regression test, and the change log entry. All output was reviewed, executed, and verified by me, and I take responsibility for it. The model/version and usage are also recorded in the commit message, and the disclosure is included in the change log entry.

📚 Documentation preview 📚: https://icalendar--1488.org.readthedocs.build/en/1488/
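The before/after behaviour described above can be sketched with a simplified stand-in for the weekday rule. Everything here is an assumption made for illustration: `BEFORE`, `AFTER`, and `parse_weekday` are hypothetical names, and only the `relative` group is taken from the description; the real WEEKDAY_RULE and vWeekday may differ in every other respect.

```python
import re

DAYS = "SU|MO|TU|WE|TH|FR|SA"

# Hypothetical simplified patterns; only the <relative> group mirrors the PR text.
BEFORE = re.compile(rf"(?P<signal>[+-]?)(?P<relative>[\d]{{0,2}})(?P<weekday>{DAYS})")
AFTER = re.compile(rf"(?P<signal>[+-]?)(?P<relative>[0-9]{{0,2}})(?P<weekday>{DAYS})")


def parse_weekday(value: str, pattern: re.Pattern) -> tuple:
    """Return (relative, weekday), or raise ValueError if the value is malformed."""
    match = pattern.fullmatch(value)
    if match is None:
        raise ValueError(f"Expected weekday abbreviation, got: {value!r}")
    sign = -1 if match["signal"] == "-" else 1
    digits = match["relative"]
    relative = sign * int(digits) if digits else None
    return relative, match["weekday"]


def is_rejected(value: str, pattern: re.Pattern) -> bool:
    """True if parsing the value with the given pattern raises ValueError."""
    try:
        parse_weekday(value, pattern)
    except ValueError:
        return True
    return False


# Non-ASCII ordwk digits: single- and double-digit, two scripts.
BAD_VALUES = ["\u0662MO", "\u0661\u0662MO", "\u06f2MO", "\u06f1\u06f2MO"]
GOOD_VALUES = ["MO", "2MO", "-1SU", "+3WE", "53SU"]

for value in BAD_VALUES:
    print(ascii(value), "before:", is_rejected(value, BEFORE), "after:", is_rejected(value, AFTER))
for value in GOOD_VALUES:
    print(value, parse_weekday(value, AFTER))
```

In this sketch the old pattern lets the non-ASCII values through and normalizes them, while the `[0-9]` pattern rejects them and leaves the listed ASCII values parsing as before.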