@saibotma commented Oct 2, 2025

This PR fixes several issues with URL detection and enhances support for various URL formats across both parsing modes.

Changes

  • Fix consecutive periods bug (looseUrl mode):
    Strings containing consecutive periods (e.g., test..example.com) are no longer parsed as a single URL.
    Only the valid domain part (example.com) is extracted.

  • Enhanced URL format support (both modes):

    • localhost URLs with ports (e.g., localhost:3000)
    • IP address URLs (e.g., 192.168.1.1:8080)
    • URLs with custom ports (e.g., example.com:8080)
    • Punycode/internationalized domains (e.g., xn--n3h.com)
  • Improved trailing punctuation handling (both modes):
    URL boundaries are now detected so that trailing punctuation marks (., ,, !, ?, ;, :) are excluded from parsed URLs.

  • Comprehensive test coverage:
    Added test cases covering the new URL formats and edge cases for both parsing modes.
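
To illustrate the trailing punctuation behaviour described above, here is a minimal sketch. It is not the code from this PR: `stripTrailingPunctuation` and `_trailingPunctuation` are hypothetical names, and the actual implementation handles the boundary inside the URL regexes rather than as a separate step.

```dart
// Hypothetical illustration only; not the regex used in lib/src/url.dart.
// Matches one or more of the listed punctuation marks at the end of a string.
final _trailingPunctuation = RegExp(r'[.,!?;:]+$');

/// Returns [candidate] without trailing `.`, `,`, `!`, `?`, `;` or `:`.
String stripTrailingPunctuation(String candidate) =>
    candidate.replaceFirst(_trailingPunctuation, '');

void main() {
  const samples = [
    'example.com:8080.',
    'localhost:3000,',
    'https://example.com/path?',
    '192.168.1.1:8080',
  ];
  for (final sample in samples) {
    print('$sample -> ${stripTrailingPunctuation(sample)}');
  }
}
```

Note that the port separator in `example.com:8080` is preserved, because only punctuation at the very end of the candidate is removed.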

Technical Details

Updated both regex patterns in lib/src/url.dart:

  • Standard URL regex now properly handles trailing punctuation and extended URL formats.
  • Loose URL regex was rewritten to support localhost, IP addresses, ports, and punycode domains while rejecting false positives caused by consecutive periods.
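
For readers who want a feel for what such a loose pattern involves, the following is a rough sketch under stated assumptions. It is not the `_looseUrlRegex` from this PR: the pattern, the `looseHost` name, and the `findLooseUrls` helper are all hypothetical, and the real regex also covers paths, queries, and other details omitted here.

```dart
// Hypothetical sketch; NOT the pattern from lib/src/url.dart.
// Accepts localhost, dotted IPv4-like hosts, or domain names whose labels
// are separated by single periods, each with an optional numeric port.
final looseHost = RegExp(
  r'\b(?:localhost'
  r'|(?:\d{1,3}\.){3}\d{1,3}'
  r'|(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z][a-z0-9-]*[a-z0-9]'
  r')(?::\d{1,5})?',
  caseSensitive: false,
);

/// Returns every substring of [text] that the sketch pattern accepts.
List<String> findLooseUrls(String text) =>
    [for (final match in looseHost.allMatches(text)) match.group(0)!];

void main() {
  const samples = [
    'localhost:3000',
    '192.168.1.1:8080',
    'example.com:8080.',
    'xn--n3h.com',
    'test..example.com',
    'awdaw....aw',
  ];
  for (final sample in samples) {
    print('$sample -> ${findLooseUrls(sample)}');
  }
}
```

Because every label must start and end with a letter or digit and labels are joined by exactly one period, `test..example.com` yields only `example.com` in this sketch, and `awdaw....aw` yields no match.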

Fixes

floMars and others added 4 commits September 30, 2025 15:09
…rl mode

- Updated _looseUrlRegex to ensure periods only appear between valid character groups
- Added validation to reject matches when prefix ends with a period
- Added test cases to verify consecutive periods are not matched as URLs

Fixes issue where patterns like 'awdaw....aw', 'awdaw...wad...wadw', and 'test..example.com' were incorrectly identified as valid URLs when looseUrl option was enabled.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
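
The commit message above mentions rejecting matches whose prefix ends with a period. A minimal sketch of that kind of post-match check could look like the following; `_candidate` and `matchesNotAfterPeriod` are hypothetical names, and the simplified pattern is an assumption, not the one in the commit.

```dart
// Hypothetical sketch of a "prefix must not end with a period" check.
// The candidate pattern is deliberately simplified for illustration.
final _candidate = RegExp(r'[a-z0-9-]+(?:\.[a-z0-9-]+)+', caseSensitive: false);

/// Yields candidate matches in [text], skipping any match that is
/// immediately preceded by a period.
Iterable<String> matchesNotAfterPeriod(String text) sync* {
  for (final match in _candidate.allMatches(text)) {
    final prefix = text.substring(0, match.start);
    if (prefix.endsWith('.')) continue; // e.g. "test.." before "example.com"
    yield match.group(0)!;
  }
}

void main() {
  const samples = [
    'see example.com',
    'test..example.com',
    'awdaw....aw',
    'awdaw...wad...wadw',
  ];
  for (final sample in samples) {
    print('$sample -> ${matchesNotAfterPeriod(sample).toList()}');
  }
}
```

Under this stricter check, `test..example.com` produces no match at all, which differs from extracting `example.com`; which behaviour the merged code has should be confirmed against the tests in the PR.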
@saibotma marked this pull request as ready for review October 2, 2025 10:48
@saibotma changed the title from "Fix: Prevent consecutive periods from being matched as URLs in looseUrl mode" to "Improve URL parsing for both normal and looseUrl modes" Oct 2, 2025