The first path segment is unexpectedly interpreted as an authority after normalization #85
Open
Description
scheme:/..///bar
has scheme="scheme"
, authority=None
, path=/..///bar
.
However, after normalization, it has scheme="scheme"
, authority="bar"
.- Consider t1 as an IRI
..///bar
resolved againstscheme:
.
t1 should have scheme="scheme"
and authority=None
(since..///bar
does not contain authority).
However, resulting string isscheme://bar
, it has authority=bar
.
And some more examples:
$ python
Python 3.9.9 (main, Jan 10 2022, 18:52:39)
[GCC 11.2.1 20211127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from rfc3986 import uri_reference
>>> b = uri_reference('scheme:')
>>> r1 = uri_reference('..///bar')
>>> t1 = r1.resolve_with(b)
>>> t1
URIReference(scheme='scheme', authority=None, path='//bar', query=None, fragment=None)
>>> t1.unsplit()
'scheme://bar'
>>> r2 = uri_reference('/..///bar')
>>> r2.resolve_with(b)
URIReference(scheme='scheme', authority=None, path='//bar', query=None, fragment=None)
>>> uri_reference('scheme:/..///bar').normalize()
URIReference(scheme='scheme', authority=None, path='//bar', query=None, fragment=None)
>>> uri_reference('scheme:/..///bar').normalize().unsplit()
'scheme://bar'
I'm not sure how this should handled.
Collapsing the //
at the beginning is not explicitly allowed by RFC 3986, so I think the normalization and the resolution cannot produce valid output and should fail in this case.
(But RFC 3986 does not seem to state that they can fail!)
This can caused by normalization during resolution, so #84 may also be affected by this issue.
Metadata
Assignees
Labels
No labels