Commit bd6845d

fixup! Stop incorrectly RFC 2047 encoding non-ASCII email addresses

- Incorporate PR review feedback
- Improve docs
1 parent 819c0bc commit bd6845d

7 files changed

+41
-17
lines changed

Doc/library/email.errors.rst

+9
@@ -59,6 +59,15 @@ The following exception classes are defined in the :mod:`email.errors` module:
    headers.
 
 
+.. exception:: InvalidMailboxError()
+
+   Raised when serializing a message with an address header that contains
+   a mailbox incompatible with the policy in use.
+   (See :attr:`email.policy.EmailPolicy.utf8`.)
+
+   .. versionadded:: 3.14
+
+
 .. exception:: MessageDefect()
 
    This is the base class for all defects found when parsing email messages.

Doc/library/email.policy.rst

+11-6
@@ -411,12 +411,17 @@ added matters. To illustrate::
       formatted in this way may be passed to SMTP servers that support
       the ``SMTPUTF8`` extension (:rfc:`6531`).
 
-      .. versionchanged:: 3.13
-         If ``False``, the generator will raise a ``ValueError`` if any email
-         address contains non-ASCII characters. To send to a non-ASCII domain
-         with ``utf8=False``, encode the domain using the third-party
-         :pypi:`idna` module or :mod:`encodings.idna`. No RFC allows a non-ASCII
-         username ("localpart") in an email address with ``utf8=False``.
+      When ``False``, the generator will raise an
+      :exc:`~email.errors.InvalidMailboxError` if any address header includes
+      a mailbox ("addr-spec") with non-ASCII characters. To use a mailbox with
+      an internationalized domain name, first encode the domain using the
+      third-party :pypi:`idna` or :pypi:`uts46` module or with
+      :mod:`encodings.idna`. It is not possible to use a non-ASCII username
+      ("local-part") in a mailbox when ``utf8=False``.
+
+      .. versionchanged:: 3.14
+         Raises :exc:`~email.errors.InvalidMailboxError`. (Earlier versions
+         incorrectly applied :rfc:`2047` to non-ASCII addr-specs.)
 
    .. attribute:: refold_source
 
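
Reviewer note: the documentation hunk above says an internationalized domain has to be encoded before it can be used with ``utf8=False``. A minimal sketch of that step, assuming the stdlib ``idna`` codec (IDNA 2003) is acceptable for the domain in question. ``to_ascii_mailbox`` is a hypothetical helper, not part of the :mod:`email` package, and the address is made up for illustration.

```python
from email import policy
from email.message import EmailMessage


def to_ascii_mailbox(addr_spec: str) -> str:
    """Hypothetical helper: IDNA-encode the domain part of an addr-spec.

    Uses the stdlib "idna" codec (IDNA 2003); the third-party packages
    named in the docs may give different results for some domains.
    """
    local_part, _, domain = addr_spec.rpartition("@")
    # A non-ASCII local-part cannot be repaired by encoding, so fail early
    # with UnicodeEncodeError instead of producing an unusable address.
    local_part.encode("ascii")
    return f"{local_part}@{domain.encode('idna').decode('ascii')}"


msg = EmailMessage(policy=policy.SMTP)  # policy.SMTP leaves utf8 at False
msg["To"] = to_ascii_mailbox("user@münchen.example")
wire = msg.as_bytes()  # serializes cleanly: the mailbox is pure ASCII
```

The local-part check is deliberate: as the updated docs state, only the domain can be made ASCII-safe, so a non-ASCII username has to be rejected or sent under a ``utf8=True`` policy.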

Lib/email/_header_value_parser.py

+1-1
@@ -2843,7 +2843,7 @@ def _refold_parse_tree(parse_tree, *, policy):
     # Non-ASCII addr-spec came from parsed message; leave unchanged.
     want_encoding = False
 else:
-    raise ValueError(
+    raise errors.InvalidMailboxError(
         "Non-ASCII address requires policy with utf8=True:"
         " '{}'".format(part)
     )

Lib/email/errors.py

+4
@@ -33,6 +33,10 @@ class HeaderWriteError(MessageError):
     """Error while writing headers."""
 
 
+class InvalidMailboxError(MessageError, ValueError):
+    """A mailbox was not compatible with the policy in use."""
+
+
 # These are parsing defects which the parser was able to work around.
 class MessageDefect(ValueError):
     """Base class for a message defect."""
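
Reviewer note: the new class inherits from both ``MessageError`` and ``ValueError``, which presumably keeps existing ``except ValueError`` handlers working. A small sketch of that property, using a local stand-in class because the real ``email.errors.InvalidMailboxError`` only exists on interpreters that include this change. ``caught_by`` is a hypothetical helper for illustration.

```python
from email.errors import MessageError


# Stand-in with the same bases as the class added in this commit.
class InvalidMailboxError(MessageError, ValueError):
    """A mailbox was not compatible with the policy in use."""


def caught_by(handler_type) -> bool:
    """Report whether `except handler_type` catches the stand-in error."""
    try:
        raise InvalidMailboxError(
            "Non-ASCII address requires policy with utf8=True: 'x'"
        )
    except handler_type:
        return True
    except Exception:
        return False


old_style = caught_by(ValueError)     # handlers written for the earlier ValueError
new_style = caught_by(MessageError)   # handlers for email-package errors
unrelated = caught_by(KeyError)       # an unrelated handler does not match
```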

Lib/test/test_email/test_generator.py

+6-4
@@ -304,11 +304,13 @@ def test_non_ascii_addr_spec_raises(self):
 with self.subTest(address=address):
     msg = EmailMessage()
     msg['To'] = address
-    expected_error = re.escape(
-        "Non-ASCII address requires policy with utf8=True:"
-        " '{}'".format(msg['To'].addresses[0].addr_spec)
+    addr_spec = msg['To'].addresses[0].addr_spec
+    expected_error = (
+        fr"(?i)(?=.*non-ascii)(?=.*utf8.*True)(?=.*{re.escape(addr_spec)})"
     )
-    with self.assertRaisesRegex(ValueError, expected_error):
+    with self.assertRaisesRegex(
+        email.errors.InvalidMailboxError, expected_error
+    ):
         g.flatten(msg)
 
 
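
Reviewer note: the test now uses a chain of lookaheads rather than an exact escaped message, so it appears to check only that the key facts are present, in any order. A short sketch of how that pattern behaves. The sample messages and address are invented for illustration, and ``message_matches`` is a hypothetical helper.

```python
import re


def message_matches(message: str, addr_spec: str) -> bool:
    """Apply the same lookahead pattern the updated test builds."""
    pattern = (
        fr"(?i)(?=.*non-ascii)(?=.*utf8.*True)(?=.*{re.escape(addr_spec)})"
    )
    # assertRaisesRegex uses re.search, so do the same here.
    return re.search(pattern, message) is not None


addr = "jörg@example.com"

# The wording currently raised by _refold_parse_tree.
current = message_matches(
    "Non-ASCII address requires policy with utf8=True: 'jörg@example.com'",
    addr,
)
# A reworded message with the same facts in a different order still matches.
reworded = message_matches(
    "Address 'jörg@example.com' is non-ASCII; use a policy where utf8 is True",
    addr,
)
# A message that omits the offending address does not match.
missing_addr = message_matches(
    "Non-ASCII address requires policy with utf8=True", addr
)
```

Each ``(?=.*…)`` lookahead is evaluated from the same starting position, which is why the order of the three fragments in the message does not matter.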

@@ -1,3 +1,5 @@
-Stop incorrectly using RFC 2047 "encoded words" for email addresses with
-non-ASCII characters when email.generator is called using a policy with
-``utf8=False``.
+The :mod:`email` module no longer incorrectly encodes non-ASCII characters
+in email addresses using :rfc:`2047` encoding. Under a policy with ``utf8=True``
+this means the addresses will be correctly passed through. Under a policy with
+``utf8=False``, attempting to serialize a message with non-ASCII email addresses
+will now result in an :exc:`~email.errors.InvalidMailboxError`.

@@ -1,3 +1,5 @@
-Stop incorrectly using RFC 2047 "encoded words" for email addresses with
-non-ASCII characters when email.generator is called using a policy with
-``utf8=False``.
+The :mod:`email` module no longer incorrectly encodes non-ASCII characters
+in email addresses using :rfc:`2047` encoding. Under a policy with ``utf8=True``
+this means the addresses will be correctly passed through. Under a policy with
+``utf8=False``, attempting to serialize a message with non-ASCII email addresses
+will now result in an :exc:`~email.errors.InvalidMailboxError`.
