The parser validates invalid URLs #97

@jubnl

Description

The parser should be stricter: some URLs that the library accepts as valid should be rejected. See the code and output below.

Steps to reproduce

from pprint import pprint

import giturlparse

if __name__ == "__main__":

    urls = [
        "https://github(com../testing2/jubnl/test",
        "https://github$com/testing2/ jubnl/test ",
        "https://git/test",
        "https://git...com/jubnl",
    ]

    for url in urls:
        parsed = giturlparse.parse(url)
        print(f"Initial url: '{url}'")
        print(f"Is url valid: {parsed.valid}")
        if parsed.valid:
            print("Parsed urls:")
            pprint(parsed.urls)

C:\Users\user\PycharmProjects\multiproc\.venv\Scripts\python.exe C:\Users\user\PycharmProjects\multiproc\main.py
Initial url: 'https://github(com../testing2/jubnl/test'
Is url valid: True
Parsed urls:
{'git': 'git://github(com../testing2/jubnl/test.git',
 'https': 'https://github(com../testing2/jubnl/test.git',
 'ssh': 'git@github(com..:testing2/jubnl/test.git'}
Initial url: 'https://github$com/testing2/ jubnl/test '
Is url valid: True
Parsed urls:
{'git': 'git://github$com/testing2/ jubnl/test .git',
 'https': 'https://github$com/testing2/ jubnl/test .git',
 'ssh': 'git@github$com:testing2/ jubnl/test .git'}
Initial url: 'https://git/test'
Is url valid: True
Parsed urls:
Traceback (most recent call last):
  File "C:\Users\user\PycharmProjects\multiproc\main.py", line 107, in <module>
    pprint(parsed.urls)
           ^^^^^^^^^^^
  File "C:\Users\user\PycharmProjects\multiproc\.venv\Lib\site-packages\giturlparse\result.py", line 102, in urls
    return {protocol: self.format(protocol) for protocol in self._platform_obj.PROTOCOLS}
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\PycharmProjects\multiproc\.venv\Lib\site-packages\giturlparse\result.py", line 102, in <dictcomp>
    return {protocol: self.format(protocol) for protocol in self._platform_obj.PROTOCOLS}
                      ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\PycharmProjects\multiproc\.venv\Lib\site-packages\giturlparse\result.py", line 73, in format
    return self._platform_obj.FORMATS[protocol] % items
           ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'http'

Process finished with exit code 1

Versions

Python 3.11.4
giturlparse 0.12.0

Windows 11

Expected behaviour

The parser should not report these URLs as valid.
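
To illustrate the kind of stricter check meant here, below is a minimal sketch of a pre-validation step using only the standard library. `is_plausible_git_url` is a hypothetical helper, not part of giturlparse, and its rules are assumptions: no whitespace anywhere, a known scheme, a hostname made of well-formed DNS labels, and at least two path segments (owner and repository). It only covers scheme-based URLs; scp-style addresses such as git@host:owner/repo.git would need separate handling.

```python
import re
from urllib.parse import urlsplit

# One DNS label: letters, digits and hyphens, 1-63 chars, no leading/trailing hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

# Assumption: the schemes worth accepting for this illustration.
_SCHEMES = {"http", "https", "git", "ssh"}


def is_plausible_git_url(url: str) -> bool:
    """Hypothetical pre-check; not part of giturlparse."""
    # Reject any whitespace, including leading/trailing spaces.
    if any(ch.isspace() for ch in url):
        return False
    try:
        parts = urlsplit(url)
        host = parts.hostname or ""
    except ValueError:
        return False
    if parts.scheme not in _SCHEMES or not host:
        return False
    # Every dot-separated label must be well formed; empty labels
    # (as in "git...com") fail because a label needs at least one char.
    if not all(_LABEL.fullmatch(label) for label in host.split(".")):
        return False
    # Assumption: a repository URL needs at least owner/repo in its path.
    segments = [s for s in parts.path.split("/") if s]
    return len(segments) >= 2


if __name__ == "__main__":
    for candidate in [
        "https://github(com../testing2/jubnl/test",
        "https://github$com/testing2/ jubnl/test ",
        "https://git/test",
        "https://git...com/jubnl",
        "https://github.com/owner/repo.git",
    ]:
        print(f"{candidate!r}: {is_plausible_git_url(candidate)}")
```

With these rules the four URLs from the reproduction are rejected, while an ordinary https://github.com/owner/repo.git style URL is accepted. Whether single-label hosts or one-segment paths should be rejected is a design choice; the sketch rejects the latter only.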

Actual behaviour

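The parser reports the first three URLs as valid (the script crashes before reaching the fourth). For 'https://git/test', accessing parsed.urls additionally raises KeyError: 'http'.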
