gh-83461: Don't allow datetime parsing to accept non-ASCII digits #131008
StanFromIreland
commented
Mar 9, 2025
- Issue: Don't allow datetime parsing to accept non-Ascii digits #83461
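For context, here is a minimal sketch of the general mechanism behind the issue. It relies only on documented behavior of `re` and `int()` and is not taken from the code in this PR: on `str` patterns, `\d` matches any Unicode decimal digit unless `re.ASCII` is set, and `int()` converts such digits without complaint. The sample string is an illustrative assumption.

```python
import re

arabic_indic = "\u0662\u0660\u0662\u0665"  # "٢٠٢٥", Arabic-Indic digits

# On str patterns, \d matches any Unicode decimal digit by default.
assert re.fullmatch(r"\d{4}", arabic_indic) is not None

# int() also accepts Unicode decimal digits, so the match converts cleanly.
year = int(arabic_indic)
print(year)  # 2025

# With re.ASCII, \d is limited to [0-9] and the same input is rejected.
assert re.fullmatch(r"\d{4}", arabic_indic, flags=re.ASCII) is None
```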
The original issue was marked as a security bug, but I haven't looked at the entire thread, so we might consider it a simple bug fix.
No, it needs to be relabeled: -versions -type-security +type-bug
Is there a need to update the C implementation?
C calls the Python implementation.
Co-authored-by: Bénédikt Tran <[email protected]>
I do wonder, though: what about languages whose input contains non-ASCII strings, such as Japanese?
Misc/NEWS.d/next/Library/2025-03-09-11-01-00.gh-issue-83461.auwd13.rst
I think the ASCII flag restriction is too broad here as it applies to all formats, not just the digit part.
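To illustrate the concern, here is a small sketch, not the code from this PR: `re.ASCII` changes `\w` as well as `\d`, so it also affects the parts of a pattern that match text, whereas an explicit `[0-9]` class narrows only the digit part. The pattern and sample strings below are hypothetical.

```python
import re

word = "März"            # a non-ASCII month name, as a locale might produce
digits = "\u0661\u0662"  # "١٢", Arabic-Indic digits

# re.ASCII changes \w as well as \d, so text-matching parts are affected too.
assert re.fullmatch(r"\w+", word) is not None
assert re.fullmatch(r"\w+", word, flags=re.ASCII) is None

# An explicit [0-9] class restricts only the digit part of a pattern.
pattern = re.compile(r"(?P<day>[0-9]{1,2}) (?P<month>\w+)")
assert pattern.fullmatch("12 März") is not None
assert pattern.fullmatch(digits + " März") is None
```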
Please carefully address the following suggestions and please make sure that typos lines are correctly formatted.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Note: I'll review this PR further once I'm back, because it will be easier on a laptop than on mobile (so on Wednesday/Thursday).
Last nit and LGTM (ok on my side).
Anything else left to do, @pganssle?
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be put in the comfy chair! |
I have made the requested changes; please review again
I think my comments on how to organize the documentation were not clear enough, so rather than another round of back-and-forth, I went ahead and pushed some changes to the documentation directly.
Also, I realized that the summary of "reject non-ASCII digits" isn't quite right, because there are actually locales where codes like
>>> from datetime import datetime
>>> import locale; locale.setlocale(locale.LC_ALL, 'fa_IR.utf8')
>>> print(datetime.now().strftime('%x'))
۲۵/۰۳/۲۶
>>> print(datetime.now().strftime("%c"))
چهارشنبه ۲۶ مارس ۲۵، ۱۰:۳۷:۵۱
And in fact in 3.14 we fixed
TBH, I'm a little apprehensive about so widely advertising this change, because part of the rationale for doing it is that support for this kind of thing was patchy in the first place, so it's better for it to either all work or not work at all. Fixing a bug like that doesn't seem like the kind of thing that needs a What's New entry and a permanent log of the change in the
Given that we're trying to improve locale support in other ways, I could imagine us adding support for these things more explicitly in the future, so I wonder if we want to set the expectation that this will never work.
@StanFromIreland @picnixz Would one or both of you mind taking a look at my changes and seeing if you agree?
@serhiy-storchaka As the person who has been working on the locale improvements, any thoughts on this subject?
I think that formats like
Currently, a heuristic is used to determine the format for
First, we need to add official support of the
Actually, I started to work on
Oh, I wasn't aware of this, so thank you for the corrections.
Maybe in Lib/_strptime.py, you can add a small comment saying that the O* formats are locale-specific? (just above `for d in 'dmyHIMS'`)
@@ -563,6 +563,9 @@ Functions
When used with the :func:`strptime` function, ``%U`` and ``%W`` are only used in
calculations when the day of the week and the year are specified.

(5)
The :func:`strptime` function does not accept non-ASCII digits for numeric values.
Should we mention the "non-locale-specific numeric format codes" here or, since it's not officially supported, we can be a bit lazy?
@pganssle Do you want to do this too? Or should I?
@pganssle friendly ping :-)