gh-83461: Don't allow datetime parsing to accept non-ASCII digits #131008

Open
wants to merge 20 commits into
base: main
Changes from 6 commits
6 changes: 5 additions & 1 deletion Doc/library/datetime.rst
@@ -2432,7 +2432,8 @@ Class attributes:

:class:`date`, :class:`.datetime`, and :class:`.time` objects all support a
``strftime(format)`` method, to create a string representing the time under the
control of an explicit format string.
control of an explicit format string. A :exc:`ValueError` will be raised if digits
are not ASCII.

Conversely, the :meth:`date.strptime`, :meth:`datetime.strptime` and
:meth:`time.strptime` class methods create an object from a string
@@ -2611,6 +2612,9 @@ differences between platforms in handling of unsupported format specifiers.
.. versionadded:: 3.12
``%:z`` was added.

.. versionchanged:: next
Non-ASCII digits are now rejected by ``strptime`` for numerical directives.

Technical Detail
^^^^^^^^^^^^^^^^
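
Note for callers affected by the ``versionchanged`` entry above: input that contains non-ASCII decimal digits can be normalized before it is parsed. The following is a minimal sketch, not part of this PR; ``to_ascii_digits`` is a hypothetical helper name, and it assumes that mapping every Unicode decimal digit to its ASCII equivalent is acceptable for the application.

```python
import unicodedata
from datetime import datetime


def to_ascii_digits(text: str) -> str:
    """Replace every Unicode decimal digit in *text* with its ASCII equivalent.

    Hypothetical helper for illustration; it is not part of the datetime module.
    """
    return "".join(
        str(unicodedata.decimal(ch)) if ch.isdecimal() else ch
        for ch in text
    )


# The first character is ARABIC-INDIC DIGIT TWO (U+0662).
normalized = to_ascii_digits("٢025-03-09")
parsed = datetime.strptime(normalized, "%Y-%m-%d")
print(normalized, parsed.date())
```

Because the string handed to ``strptime`` contains only ASCII digits, this should behave the same before and after the change.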

10 changes: 7 additions & 3 deletions Doc/library/time.rst
@@ -594,9 +594,10 @@ Functions
:func:`strftime`; it defaults to ``"%a %b %d %H:%M:%S %Y"`` which matches the
formatting returned by :func:`ctime`. If *string* cannot be parsed according
to *format*, or if it has excess data after parsing, :exc:`ValueError` is
raised. The default values used to fill in any missing data when more
accurate values cannot be inferred are ``(1900, 1, 1, 0, 0, 0, 0, 1, -1)``.
Both *string* and *format* must be strings.
raised. :exc:`ValueError` is raised if digits are not ASCII. The default
values used to fill in any missing data when more accurate values cannot be
inferred are ``(1900, 1, 1, 0, 0, 0, 0, 1, -1)``. Both *string* and *format*
must be strings.

For example:

@@ -616,6 +617,9 @@ Functions
and thus does not necessarily support all directives available that are not
documented as supported.

.. versionchanged:: next
Non-ASCII digits are now rejected by ``strptime``.


.. class:: struct_time

14 changes: 14 additions & 0 deletions Doc/whatsnew/3.14.rst
@@ -504,6 +504,11 @@ datetime
* Add :meth:`datetime.time.strptime` and :meth:`datetime.date.strptime`.
(Contributed by Wannes Boeykens in :gh:`41431`.)

* The :meth:`datetime.date.strptime`, :meth:`datetime.datetime.strptime` and
:meth:`datetime.time.strptime` methods now only accept ASCII digits, and will
raise a :exc:`ValueError` if non-ASCII digits are specified.
(Contributed by Stan Ulbrych in :gh:`131008`.)

decimal
-------

@@ -878,6 +883,15 @@ sys.monitoring
* Two new events are added: :monitoring-event:`BRANCH_LEFT` and
:monitoring-event:`BRANCH_RIGHT`. The ``BRANCH`` event is deprecated.


time
----

* :func:`time.strptime` now only accepts ASCII digits, and will raise a
:exc:`ValueError` if non-ASCII digits are specified.
(Contributed by Stan Ulbrych in :gh:`131008`.)


threading
---------

26 changes: 13 additions & 13 deletions Lib/_strptime.py
@@ -286,23 +286,23 @@ def __init__(self, locale_time=None):
base = super()
mapping = {
# The " [1-9]" part of the regex is to make %c from ANSI C work
'd': r"(?P<d>3[0-1]|[1-2]\d|0[1-9]|[1-9]| [1-9])",
'd': r"(?P<d>3[0-1]|[1-2][0-9]|0[1-9]|[1-9]| [1-9])",
'f': r"(?P<f>[0-9]{1,6})",
'H': r"(?P<H>2[0-3]|[0-1]\d|\d)",
'H': r"(?P<H>2[0-3]|[0-1][0-9]|[0-9])",
'I': r"(?P<I>1[0-2]|0[1-9]|[1-9]| [1-9])",
'G': r"(?P<G>\d\d\d\d)",
'j': r"(?P<j>36[0-6]|3[0-5]\d|[1-2]\d\d|0[1-9]\d|00[1-9]|[1-9]\d|0[1-9]|[1-9])",
'G': r"(?P<G>[0-9]{4})",
'j': r"(?P<j>36[0-6]|3[0-5][0-9]|[1-2][0-9][0-9]|0[1-9][0-9]|00[1-9]|[1-9][0-9]|0[1-9]|[1-9])",
'm': r"(?P<m>1[0-2]|0[1-9]|[1-9])",
'M': r"(?P<M>[0-5]\d|\d)",
'S': r"(?P<S>6[0-1]|[0-5]\d|\d)",
'U': r"(?P<U>5[0-3]|[0-4]\d|\d)",
'M': r"(?P<M>[0-5][0-9]|[0-9])",
'S': r"(?P<S>6[0-1]|[0-5][0-9]|[0-9])",
'U': r"(?P<U>5[0-3]|[0-4][0-9]|[0-9])",
'w': r"(?P<w>[0-6])",
'u': r"(?P<u>[1-7])",
'V': r"(?P<V>5[0-3]|0[1-9]|[1-4]\d|\d)",
'V': r"(?P<V>5[0-3]|0[1-9]|[1-4][0-9]|[0-9])",
# W is set below by using 'U'
'y': r"(?P<y>\d\d)",
'Y': r"(?P<Y>\d\d\d\d)",
'z': r"(?P<z>[+-]\d\d:?[0-5]\d(:?[0-5]\d(\.\d{1,6})?)?|(?-i:Z))",
'y': r"(?P<y>[0-9]{2})",
'Y': r"(?P<Y>[0-9]{4})",
'z': r"(?P<z>[+-][0-9][0-9]:?[0-5][0-9](:?[0-5][0-9](\.[0-9]{1,6})?)?|(?-i:Z))",
'A': self.__seqToRE(self.locale_time.f_weekday, 'A'),
'a': self.__seqToRE(self.locale_time.a_weekday, 'a'),
'B': self.__seqToRE(self.locale_time.f_month[1:], 'B'),
@@ -313,8 +313,8 @@ def __init__(self, locale_time=None):
'Z'),
'%': '%'}
for d in 'dmyHIMS':
mapping['O' + d] = r'(?P<%s>\d\d|\d| \d)' % d
mapping['Ow'] = r'(?P<w>\d)'
mapping['O' + d] = r'(?P<%s>[0-9][0-9]|[0-9]| [0-9])' % d
mapping['Ow'] = r'(?P<w>[0-9])'
mapping['W'] = mapping['U'].replace('U', 'W')
base.__init__(mapping)
base.__setitem__('X', self.pattern(self.locale_time.LC_time))
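
Background for the ``\d`` to ``[0-9]`` replacements in this file: for ``str`` patterns, ``\d`` matches any Unicode decimal digit, while ``[0-9]`` matches ASCII digits only. The sketch below uses simplified stand-ins for the ``%Y`` pattern (no named group, and the names ``UNICODE_YEAR`` and ``ASCII_YEAR`` are illustrative, not taken from ``_strptime.py``).

```python
import re

# Simplified stand-ins for the %Y pattern before and after this change.
UNICODE_YEAR = re.compile(r"\d\d\d\d")
ASCII_YEAR = re.compile(r"[0-9]{4}")

# The first character is ARABIC-INDIC DIGIT TWO (U+0662).
sample = "٢025"

old_match = UNICODE_YEAR.fullmatch(sample)
new_match = ASCII_YEAR.fullmatch(sample)

print(old_match is not None)  # True: \d accepts the non-ASCII digit
print(new_match is not None)  # False: [0-9] rejects it
print(int(sample))            # 2025: int() also accepts Unicode decimal digits
```

Since ``int()`` accepts such digits too, a string matched by the old pattern converts to a valid number without any error. The ``re.ASCII`` flag would also restrict ``\d``, but it additionally changes ``\w``, ``\s`` and case-insensitive matching, so explicit character classes are the narrower change.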
5 changes: 4 additions & 1 deletion Lib/test/datetimetester.py
@@ -2916,6 +2916,9 @@ def test_strptime(self):
with self.assertRaises(ValueError): strptime("-000", "%z")
with self.assertRaises(ValueError): strptime("z", "%z")

# Test that only ASCII digits are accepted.
with self.assertRaises(ValueError): strptime('٢025-03-09', '%Y-%m-%d')

def test_strptime_single_digit(self):
# bpo-34903: Check that single digit dates and times are allowed.

@@ -4036,7 +4039,7 @@ def test_strptime_tz(self):
self.assertEqual(strptime("UTC", "%Z").tzinfo, None)

def test_strptime_errors(self):
for tzstr in ("-2400", "-000", "z"):
for tzstr in ("-2400", "-000", "z", "٢"):
with self.assertRaises(ValueError):
self.theclass.strptime(tzstr, "%z")

4 changes: 4 additions & 0 deletions Lib/test/test_time.py
@@ -352,6 +352,10 @@ def test_strptime_leap_year(self):
r'.*day of month without a year.*'):
time.strptime('02-07 18:28', '%m-%d %H:%M')

def test_strptime_non_ascii(self):
with self.assertRaises(ValueError):
time.strptime('٢025', '%Y')

def test_asctime(self):
time.asctime(time.gmtime(self.t))
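
For code that must reject non-ASCII digits on interpreters that predate this change, a guard in front of ``time.strptime`` gives similar behaviour. This is a rough sketch with a hypothetical helper name, ``strptime_ascii_only``. It is stricter than the PR: it rejects non-ASCII decimal digits anywhere in the input, not only where a numerical directive would consume them.

```python
import time


def strptime_ascii_only(string: str, fmt: str) -> time.struct_time:
    """Parse *string* with time.strptime, rejecting non-ASCII decimal digits first.

    Hypothetical helper for illustration; it is not part of the time module.
    """
    for ch in string:
        if ch.isdecimal() and not ch.isascii():
            raise ValueError(f"non-ASCII digit {ch!r} in {string!r}")
    return time.strptime(string, fmt)


print(strptime_ascii_only("2025", "%Y").tm_year)
```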

@@ -0,0 +1,2 @@
:meth:`datetime.date.strptime`, :meth:`datetime.datetime.strptime` and
:meth:`datetime.time.strptime` now only accept ASCII digits.