The first path segment is unexpectedly interpreted as an authority after normalization #85

Open
@lo48576

Description

  • scheme:/..///bar has scheme="scheme", authority=None, path="/..///bar".
    However, after normalization, the serialized result is scheme://bar, which parses as scheme="scheme", authority="bar".
  • Let t1 be the IRI ..///bar resolved against scheme:.
    t1 should have scheme="scheme" and authority=None (since ..///bar does not contain an authority).
    However, the resulting string is scheme://bar, which has authority="bar".

And some more examples:

$ python
Python 3.9.9 (main, Jan 10 2022, 18:52:39)
[GCC 11.2.1 20211127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from rfc3986 import uri_reference
>>> b = uri_reference('scheme:')
>>> r1 = uri_reference('..///bar')
>>> t1 = r1.resolve_with(b)
>>> t1
URIReference(scheme='scheme', authority=None, path='//bar', query=None, fragment=None)
>>> t1.unsplit()
'scheme://bar'
>>> r2 = uri_reference('/..///bar')
>>> r2.resolve_with(b)
URIReference(scheme='scheme', authority=None, path='//bar', query=None, fragment=None)
>>> uri_reference('scheme:/..///bar').normalize()
URIReference(scheme='scheme', authority=None, path='//bar', query=None, fragment=None)
>>> uri_reference('scheme:/..///bar').normalize().unsplit()
'scheme://bar'

I'm not sure how this should be handled.
Collapsing the leading // of the path is not explicitly allowed by RFC 3986, so I think normalization and resolution cannot produce valid output here and should fail in this case.
(But RFC 3986 does not seem to state that they can fail!)
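
To make the ambiguity concrete without depending on this library, here is a minimal standard-library sketch. It uses the component-splitting regular expression from RFC 3986 Appendix B and the recomposition steps from section 5.3 (query and fragment omitted). The helper names split, recompose and strict_recompose are hypothetical and are not part of rfc3986. strict_recompose is only one possible way to "fail in this case"; it relies on the rule in RFC 3986 section 3.3 that a path cannot begin with // when there is no authority.

```python
import re

# Component-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split(uri):
    """Return (scheme, authority, path); undefined components are None."""
    m = URI_RE.match(uri)
    return m.group(2), m.group(4), m.group(5)


def recompose(scheme, authority, path):
    """Recompose components as in RFC 3986 section 5.3 (no query/fragment)."""
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    return result + path


def strict_recompose(scheme, authority, path):
    """Hypothetical variant: refuse output that would re-parse differently."""
    if authority is None and path.startswith("//"):
        # Without an authority, a leading "//" in the path would be read
        # back as the start of an authority component.
        raise ValueError("path %r would be parsed as an authority" % path)
    return recompose(scheme, authority, path)
```

With these helpers, recompose("scheme", None, "//bar") returns "scheme://bar", and split("scheme://bar") gives back authority "bar" with an empty path, so the round trip does not preserve the components. strict_recompose("scheme", None, "//bar") raises ValueError instead.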

This can be caused by normalization during resolution, so #84 may also be affected by this issue.
