Incorrect Parsing of Quoted Strings #95

@ChaosData

Hi,

There are a few issues I've observed related to the handling of quoted strings.
It appears that during local-part parsing, the local-part is first split on its . characters unconditionally, and each resulting piece is then parsed on its own, using roughly the logic that should be applied to the local-part as a whole. As a result, although . falls within the %d35-91 subrange of qtext as defined in RFC 5322 (Section 3.2.4, "Quoted Strings"), the validator does not allow it within a quoted string: the split happens first and leaves each piece with a "dangling" double quote character. The same behavior also leads the validator to incorrectly accept certain invalid strings. So the valid "a.b"@example.tld is rejected, while the invalid "a"."b"@example.tld is accepted.
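To make the suspected failure mode concrete, here is a minimal Python sketch of a split-first validator. It is a hypothetical stand-in written for illustration, not the library's actual code; the names `split_first_local_part`, `ATOM_RE`, and `QUOTED_RE` are assumptions of mine.

```python
import re

# Hypothetical reconstruction of the suspected behavior, NOT the library's code.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
ATOM_RE = re.compile(rf"{ATEXT}+")
# Quoted string without unescaped whitespace: qtext (minus obs-qtext) or quoted-pair.
QUOTED_RE = re.compile(r'"(?:[\x21\x23-\x5b\x5d-\x7e]|\\[\x21-\x7e \t])*"')


def split_first_local_part(local: str) -> bool:
    """Split on every '.', then validate each piece as an atom or a quoted string."""
    pieces = local.split(".")
    return all(ATOM_RE.fullmatch(p) or QUOTED_RE.fullmatch(p) for p in pieces)


# '"a.b"' is split into '"a' and 'b"', each with a dangling quote.
print(split_first_local_part('"a.b"'))    # False: the valid form is rejected
# '"a"."b"' is split into two well-formed quoted strings.
print(split_first_local_part('"a"."b"'))  # True: the invalid form is accepted
```

Under these assumptions, both misclassifications described above fall out of the single decision to split before looking at the quotes.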

Additionally, neither comments nor folding whitespace (RFC 5322, Section 3.2.2, "Folding White Space and Comments") appears to be properly handled. While I'm not too keen on the whole nested ((comment)) comment structure, one interesting consequence is that whitespace within quoted strings is not accepted unless each space character is escaped using the backslash-prepended quoted-pair syntax (Section 3.2.1, "Quoted characters"). Even disregarding the obs-qtext subrange of qtext, unescaped whitespace is still supported within quoted strings through the definition of quoted-string:

    quoted-string = [CFWS]
                    DQUOTE *([FWS] qcontent) [FWS] DQUOTE
                    [CFWS]

This results in the rejection of valid addresses such as "Fred Bloggs"@example.com (sourced from RFC 3696, Section 3, "Restrictions on email addresses"). The same RFC also gives examples of other valid addresses that are similarly rejected, such as Fred\ Bloggs@example.com, but my understanding is that quoted-pair sequences outside of quotes are only allowed in the obsolete obs-* forms, so that may not be a big issue.
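As a rough illustration of the quoted-string rule, the following Python sketch translates it into a regular expression. It is a simplification under stated assumptions: the surrounding [CFWS] is dropped, obs-qtext is excluded, and FWS is reduced to runs of spaces and tabs with no line folding. The name `is_quoted_string` is hypothetical.

```python
import re

WSP = r"[ \t]"                       # simplified FWS: spaces and tabs, no CRLF folding
QTEXT = r"[\x21\x23-\x5b\x5d-\x7e]"  # printable ASCII except '"' and '\'
QUOTED_PAIR = r"\\[\x21-\x7e \t]"    # backslash followed by VCHAR or WSP
QCONTENT = rf"(?:{QTEXT}|{QUOTED_PAIR})"

# DQUOTE *([FWS] qcontent) [FWS] DQUOTE, without the surrounding [CFWS]
QUOTED_STRING_RE = re.compile(rf'"(?:{WSP}*{QCONTENT})*{WSP}*"')


def is_quoted_string(text: str) -> bool:
    """Return True if text is a quoted-string under the simplifications above."""
    return QUOTED_STRING_RE.fullmatch(text) is not None


print(is_quoted_string('"Fred Bloggs"'))    # True: unescaped space is allowed
print(is_quoted_string('"Fred\\ Bloggs"'))  # True: escaped space is allowed too
print(is_quoted_string('"a"b"'))            # False: bare '"' is not qtext
```

The point of the sketch is only that the optional [FWS] before each qcontent is what admits the unescaped space in "Fred Bloggs".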

From my understanding of the spec, and assuming comments are not supported, the parser should first check whether the local-part starts with a double quote character, and on that basis parse it either as a quoted string or as a dot-atom whose atoms are delimited internally by . characters.
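The check proposed above could look something like the following Python sketch. It assumes no support for comments or the obs-* forms and treats folding whitespace as plain spaces and tabs; `is_valid_local_part` and the two patterns are illustrative names, not part of the library.

```python
import re

ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# dot-atom-text: one or more atoms joined by single '.' characters
DOT_ATOM_RE = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")
# quoted-string without CFWS or obs-qtext; FWS simplified to spaces and tabs
QUOTED_STRING_RE = re.compile(
    r'"(?:[ \t]*(?:[\x21\x23-\x5b\x5d-\x7e]|\\[\x21-\x7e \t]))*[ \t]*"'
)


def is_valid_local_part(local: str) -> bool:
    """Choose the grammar from the first character, then match the whole string."""
    if local.startswith('"'):
        return QUOTED_STRING_RE.fullmatch(local) is not None
    return DOT_ATOM_RE.fullmatch(local) is not None


for candidate in ['"a.b"', '"a"."b"', '"Fred Bloggs"', "a.b", "a..b"]:
    print(candidate, is_valid_local_part(candidate))
```

Because the quote check happens before any splitting, "a.b" is matched as a single quoted string, while "a"."b" fails because a bare double quote cannot appear inside one.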
