Commit e760911

Add email.header.decode_header_to_string() convenience function
This function takes an email header, possibly with portions encoded according to RFC2047, and converts it to a standard Python string.

It is intended to provide a sane, Pythonic replacement for `email.header.decode_header()`, which has two major problems:

1. May return either bytes or str (bpo-22833/gh-67022), an inconsistent and error-prone interface
2. Exposes details of an email header value's encoding which most users will not care about or want to deal with.

Many users likely just want to decode an email header value to a Python string. It turns out that `email.header` already contained most of the code necessary to do this, and providing `decode_header_to_string` as a documented wrapper function points users in the right direction.
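The first problem named in the commit message can be seen with the existing `decode_header()` alone. The snippet below is a minimal illustration, not part of the commit; the variable names are chosen here for clarity.

```python
from email.header import decode_header

# A header without RFC 2047 encoded words comes back as a str chunk...
plain = decode_header('unencoded_string')

# ...while a header containing an encoded word comes back as bytes chunks,
# each paired with the charset name (or None for the unencoded part).
mixed = decode_header('bar =?utf-8?B?ZsOzbw==?=')

print(type(plain[0][0]).__name__)  # str
print(type(mixed[0][0]).__name__)  # bytes
```

Callers therefore have to check the chunk type and decode each chunk themselves, which is the burden the new function is meant to remove.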
1 parent 17168be commit e760911

4 files changed, +70 -13 lines changed

Doc/library/email.header.rst

+35 -12
@@ -173,6 +173,38 @@ Here is the :class:`Header` class description:
 The :mod:`email.header` module also provides the following convenient functions.


+.. function:: decode_header_to_string(header)
+
+   Decode a message header value to a Unicode string, including handling
+   portions encoded according to :rfc:`2047`.
+
+   An :exc:`email.errors.HeaderParseError` may be raised when
+   certain decoding errors occur (e.g. a base64 decoding exception).
+
+   Here are examples:
+
+      >>> from email.header import decode_header_to_string
+      >>> decode_header_to_string('=?iso-8859-1?q?p=F6stal?=')
+      'p\xf6stal'
+      >>> decode_header_to_string('unencoded_string')
+      'unencoded_string'
+      >>> decode_header_to_string('bar =?utf-8?B?ZsOzbw==?=')
+      'bar f\xf3o'
+
+
+.. function:: make_header(decoded_seq, maxlinelen=None, header_name=None, continuation_ws=' ')
+
+   Create a :class:`Header` instance from a sequence of pairs as returned by
+   :func:`decode_header`.
+
+   :func:`decode_header` takes a header value string and returns a sequence of
+   pairs of the format ``(decoded_string, charset)`` where *charset* is the name of
+   the character set.
+
+   This function takes one of those sequence of pairs and returns a
+   :class:`Header` instance. Optional *maxlinelen*, *header_name*, and
+   *continuation_ws* are as in the :class:`Header` constructor.
+

 .. function:: decode_header(header)

@@ -202,16 +234,7 @@ The :mod:`email.header` module also provides the following convenient functions.
       >>> decode_header('bar =?utf-8?B?ZsOzbw==?=')
       [(b'bar ', None), (b'f\xc3\xb3o', 'utf-8')]

+   .. note::

-.. function:: make_header(decoded_seq, maxlinelen=None, header_name=None, continuation_ws=' ')
-
-   Create a :class:`Header` instance from a sequence of pairs as returned by
-   :func:`decode_header`.
-
-   :func:`decode_header` takes a header value string and returns a sequence of
-   pairs of the format ``(decoded_string, charset)`` where *charset* is the name of
-   the character set.
-
-   This function takes one of those sequence of pairs and returns a
-   :class:`Header` instance. Optional *maxlinelen*, *header_name*, and
-   *continuation_ws* are as in the :class:`Header` constructor.
+      This function exists for backwards compatibility only. For
+      new code we recommend using :func:`email.header.decode_header_to_string`.
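The documentation above notes that :exc:`email.errors.HeaderParseError` may be raised on certain decoding errors. The following is a minimal sketch of one way a caller might guard against that. It uses only `decode_header()` and `make_header()`, since `decode_header_to_string()` exists only in this commit. The helper name `header_to_text` and the fall-back-to-raw-value policy are assumptions made for illustration.

```python
from email.errors import HeaderParseError
from email.header import decode_header, make_header


def header_to_text(value):
    """Hypothetical helper: decode a header value, or return it unchanged."""
    try:
        return str(make_header(decode_header(value)))
    except HeaderParseError:
        # Decoding failed (for example, a bad base64 payload); keep the raw text.
        return value


print(header_to_text('=?iso-8859-1?q?p=F6stal?='))
# Deliberately malformed base64 payload; the helper returns a str either way.
print(header_to_text('=?utf-8?b?a?='))
```

Returning the raw value is only one possible policy; logging the failure or re-raising may suit other applications better.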

Lib/email/header.py

+20
@@ -74,6 +74,9 @@ def decode_header(header):

     An email.errors.HeaderParseError may be raised when certain decoding error
     occurs (e.g. a base64 decoding exception).
+
+    This function exists for backwards compatibility only. For new code, we
+    recommend using decode_header_to_string instead.
     """
     # If it is a Header object, we can just return the encoded chunks.
     if hasattr(header, '_chunks'):
@@ -155,6 +158,23 @@ def decode_header(header):
     return collapsed


+
+def decode_header_to_string(header):
+    """Decode a message header into a string.
+
+    header may be a string that may or may not contain RFC2047 encoded words,
+    or it may be a Header object; in the latter case, this is equivalent to
+    str(header).
+
+    An email.errors.HeaderParseError may be raised when certain decoding error
+    occurs (e.g. a base64 decoding exception).
+    """
+
+    if not isinstance(header, Header):
+        header = make_header(decode_header(header))
+    return str(header)
+
+

 def make_header(decoded_seq, maxlinelen=None, header_name=None,
                 continuation_ws=' '):
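For readers on a Python version without this commit, the same behaviour can be approximated with the existing public functions. The sketch below mirrors the body of the new function; the name `decode_header_to_string_sketch` is hypothetical and is used here to avoid implying that it is the standard library API.

```python
from email.header import Header, decode_header, make_header


def decode_header_to_string_sketch(header):
    """Hypothetical stand-in that mirrors the function added in this commit."""
    if not isinstance(header, Header):
        # decode_header() splits the value into (chunk, charset) pairs;
        # make_header() reassembles them into a Header object.
        header = make_header(decode_header(header))
    # str() on a Header yields the fully decoded Unicode text.
    return str(header)


print(decode_header_to_string_sketch('=?iso-8859-1?q?p=F6stal?='))
print(decode_header_to_string_sketch('unencoded_string'))
print(decode_header_to_string_sketch('bar =?utf-8?B?ZsOzbw==?='))
```

Passing an existing `Header` instance skips the decode step and simply returns `str(header)`, matching the docstring above.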

Lib/test/test_email/test_email.py

+12 -1
@@ -19,7 +19,7 @@
 import email.policy

 from email.charset import Charset
-from email.header import Header, decode_header, make_header
+from email.header import Header, decode_header, decode_header_to_string, make_header
 from email.parser import Parser, HeaderParser
 from email.generator import Generator, DecodedGenerator, BytesGenerator
 from email.message import Message
@@ -2476,6 +2476,17 @@ def test_unencoded_utf8(self):
         self.assertEqual(decode_header(s),
             [('header with unexpected non ASCII caract\xe8res', None)])

+    def test_decode_header_to_string_from_string(self):
+        s = '=?windows-1252?q?=22M=FCller_T=22?=\r\n <[email protected]>'
+        self.assertEqual(str(make_header(decode_header(s))),
+                         decode_header_to_string(s))
+
+    def test_decode_header_to_string_from_header_obj(self):
+        s = '\xeatre'
+        h = Header(s)
+        self.assertEqual(str(h),
+                         decode_header_to_string(h))
+

 # Test the MIMEMessage class
 class TestMIMEMessage(TestEmailBase):
@@ -1 +1,4 @@
 The inconsistent return types of :func:`email.header.decode_header` are now documented.
+
+:func:`email.header.decode_header_to_string` is provided as a less error-prone and
+more straightforward alternative for it.
