Description
If people type URLs as browsers display them, and then attempt to parse that, they silently get a slightly confusing, although technically correct, result:
>>> link = hyperlink.DecodedURL.from_text("example.com/some-path?some=query")
>>> link.scheme
''
>>> link.host
''
>>> link.path
('example.com', 'some-path')
>>> link.replace(scheme='https')
DecodedURL(url=URL.from_text('https:example.com/some-path?some=query'))
I realize that this is the correct treatment per what putting that string into an a href=
would do, but also, if I wanted hyperlink
to process user input, it's not the thing that I'd expect. What's even more annoying is the fix required here once the URL is structured is somewhat non-intuitive: link.replace(scheme='https', host=link.path[0], path=link.path[1:])
, rather than what a naive user might assume of link.replace(scheme="https")
.
I would probably not propose from_text
actually behave differently but I wonder if there's some way to make this type of usage idiomatic in a way that people will discover when they want this type of "do the thing the location bar would" behavior. Would we want something like hyperlink.parser(with_implicit_scheme="https").parse('example.com')
?