Wrap net/url to fix URI handling #6357

Open
sdassow wants to merge 39 commits into fyne-io:develop from sdassow:fix/file-uri-escaping

Conversation

@sdassow sdassow commented Jun 10, 2026

Contributor

Description:

Handle URIs with the standard library's net/url package. URNs are supported by detecting the urn scheme and passing the path through unchanged.
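
For orientation, here is a minimal sketch of how net/url itself splits a URN compared with a file URI. It only illustrates standard-library behaviour; the helper name `describe` is hypothetical, and how this PR's ParseURI maps the result onto its own accessors is not shown here.

```go
package main

import (
	"fmt"
	"net/url"
)

// describe is a hypothetical helper that reports how net/url splits a URI.
func describe(raw string) (scheme, opaque, path string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	return u.Scheme, u.Opaque, u.Path, nil
}

func main() {
	inputs := []string{
		"urn:isbn:0451450523",            // no slash after the scheme: the rest is kept in Opaque
		"file:///home/user/file%231.txt", // hierarchical: the path is percent-decoded into Path
	}
	for _, raw := range inputs {
		scheme, opaque, path, err := describe(raw)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%q: scheme=%q opaque=%q path=%q\n", raw, scheme, opaque, path)
	}
}
```

The point of the sketch is that a URN ends up in the Opaque field rather than in Path, which is why a wrapper has to treat the urn scheme as a special case.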

Checklist:

  • Tests included.
  • Lint and formatter run with no errors.
  • Tests all pass.

@sdassow sdassow marked this pull request as draft June 10, 2026 12:23
@sdassow sdassow marked this pull request as ready for review June 10, 2026 12:34
coveralls commented Jun 10, 2026


Coverage Status

coverage: 60.014% (-0.1%) from 60.155% — sdassow:fix/file-uri-escaping into fyne-io:develop

Comment thread storage/repository/uri.go Outdated
@MaxGyver83

Member

This implementation uses a statically generated table for performance reasons.

Is this much faster than escaped := url.URL{Scheme: u.scheme, Path: u.path}? Does this speed-up really matter in a real app?
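
To make the comparison concrete, here is a small sketch of the standard-library route mentioned above. The helper name `fileURI` is hypothetical; the only assumption is that the URI is built from a scheme and a raw path, as in the snippet in the question.

```go
package main

import (
	"fmt"
	"net/url"
)

// fileURI is a hypothetical helper: it builds a file URI from a raw path and
// leaves the percent-encoding to url.URL.
func fileURI(p string) string {
	u := url.URL{Scheme: "file", Path: p}
	return u.String()
}

func main() {
	fmt.Println(fileURI("/home/user/file.txt"))
	fmt.Println(fileURI("/home/user/file#1.txt")) // '#' is encoded as %23
}
```

With this approach no escape table has to be maintained in the repository package at all.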

Comment thread storage/repository/dont_escape.go Outdated
false, // 0x20
false, // 0x21
false, // 0x22
false, // 0x23

Member

In a lookup table, comments showing the printable characters (like #) would be much more helpful than the hex codes.

@toaster toaster left a comment

Member

As discussed in Slack, I would prefer to use the stdlib’s URL instead of re-inventing the wheel here.

I’d also like to get another opinion on this topic.

For the actual implementation:

  • You should explicitly test all the bytes and maybe a couple of unicode characters.

Optional:

  • I suggest using a shouldEscape[c] approach instead of dontEscape[c].
  • You might simplify the generator implementation.

Comment thread storage/repository/file_path_escape.go Outdated
func filePathEscape(path string) string {
	length := len(path)
	for _, c := range []byte(path) {
		if !dontEscape[c] {

Member

I think an inverted list would be easier to read here and not worse later:

if escape[c] { … }
…
if escape[c] { /* escape */ } else { … }

Comment on lines +10 to +12
assert.Equal(t, "/home/user/file.txt", filePathEscape("/home/user/file.txt"))
assert.Equal(t, "/home/user/file%231.txt", filePathEscape("/home/user/file#1.txt"))
}

Member

This test is not sufficient. You can remove characters from the escape list without breaking the test.
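
A sketch of what an exhaustive check could look like, assuming an inverted shouldEscape table. The names `escapePath`, `isSafe` and `checkAllBytes` are hypothetical, and treating ASCII letters and digits as safe is an assumption; only the punctuation set is taken from the generator under review. The expected set is written out independently of the table, so removing a character from either side makes the check fail.

```go
package main

import "fmt"

// shouldEscape is the inverted table: true means the byte is percent-encoded.
var shouldEscape [256]bool

func init() {
	for i := range shouldEscape {
		shouldEscape[i] = true
	}
	// Assumption of this sketch: ASCII letters and digits are never escaped.
	for c := 'a'; c <= 'z'; c++ {
		shouldEscape[c] = false
	}
	for c := 'A'; c <= 'Z'; c++ {
		shouldEscape[c] = false
	}
	for c := '0'; c <= '9'; c++ {
		shouldEscape[c] = false
	}
	// Punctuation kept unescaped in the generator under review.
	for _, r := range "$&+-.:=@_~\\/" {
		shouldEscape[r] = false
	}
}

// escapePath percent-encodes every byte flagged in shouldEscape.
func escapePath(p string) string {
	const hex = "0123456789ABCDEF"
	out := make([]byte, 0, len(p))
	for i := 0; i < len(p); i++ {
		c := p[i]
		if shouldEscape[c] {
			out = append(out, '%', hex[c>>4], hex[c&0x0F])
		} else {
			out = append(out, c)
		}
	}
	return string(out)
}

// isSafe is the independent oracle: it does not consult the table.
func isSafe(c byte) bool {
	switch {
	case 'a' <= c && c <= 'z', 'A' <= c && c <= 'Z', '0' <= c && c <= '9':
		return true
	}
	switch c {
	case '$', '&', '+', '-', '.', ':', '=', '@', '_', '~', '\\', '/':
		return true
	}
	return false
}

// checkAllBytes returns every byte whose escaping disagrees with the oracle.
func checkAllBytes() []byte {
	var bad []byte
	for i := 0; i < 256; i++ {
		c := byte(i)
		want := fmt.Sprintf("%%%02X", c)
		if isSafe(c) {
			want = string([]byte{c})
		}
		if escapePath(string([]byte{c})) != want {
			bad = append(bad, c)
		}
	}
	return bad
}

func main() {
	fmt.Println(escapePath("/home/user/file#1.txt"))
	fmt.Println("mismatching bytes:", len(checkAllBytes()))
}
```

In a real test file the loop in `checkAllBytes` would sit inside a test function, with a few multi-byte UTF-8 strings added as separate cases.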

Comment thread storage/repository/dont_escape_gen.go Outdated
Comment on lines +31 to +43
dontEscape['$'] = true
dontEscape['&'] = true
dontEscape['+'] = true
dontEscape['-'] = true
dontEscape['.'] = true
dontEscape[':'] = true
dontEscape['='] = true
dontEscape['@'] = true
dontEscape['_'] = true
dontEscape['~'] = true

dontEscape['\\'] = true
dontEscape['/'] = true

Member

for _, r := range "$&+-.:=@_~\\/" {
    dontEscape[r] = true
}

sdassow commented Jun 11, 2026

Contributor Author

As discussed in Slack, I would prefer to use the stdlib’s URL instead of re-inventing the wheel here.

[snip]

Just for reference, I looked into this: we were using net/url for parsing until d95d323 but otherwise still maintain our own implementation. We could be using url.URL instead, which I think is what you're suggesting.

I'll try and see how far I get before spending more time on the details of the current PR. The feedback is good and appreciated, and I will apply it if it turns out to still be needed.

sdassow commented Jun 11, 2026

Contributor Author

This implementation uses a statically generated table for performance reasons.

Is this much faster than escaped := url.URL{Scheme: u.scheme, Path: u.path}? Does this speed-up really matter in a real app?

The speed-up (probable, since I've done this before) likely won't make a real-world difference in a Fyne app; it is more a side effect than the goal. The comment was there to explain why this particular implementation was chosen (static lookup table, no function call overhead).

I'm now looking into whether we could wrap url.URL instead, as Tilo suggested.

@sdassow sdassow changed the title from "Add fix and test for file URIs with special characters" to "Wrap net/url to fix URI handling" Jun 11, 2026

@toaster toaster left a comment

Member

Cool, just some minor things:

  • consider type uri url.URL (constructed via uri{…})
  • use direct prop access

Comment thread storage/repository/uri.go
Comment on lines 35 to 37
 type uri struct {
-	scheme    string
-	authority string
-	path      string
-	query     string
-	fragment  string
+	url.URL
 }

Member

Why not type uri url.URL?

@sdassow sdassow Jun 11, 2026

Contributor Author

Unfortunately, the public fields of url.URL conflict with the names of the accessor methods in our API.

Comment thread storage/repository/uri.go

 func (u *uri) Extension() string {
-	return path.Ext(u.path)
+	return path.Ext(u.URL.Path)

Member

Whether you actually embed or simply do type uri url.URL does not matter; this is accessible as u.Path in both cases.

The same applies to all other u.URL.* accesses.

Contributor Author

I've changed the few that don't actually need it, but this one does due to the Path method/field conflict.

@sdassow sdassow requested a review from toaster June 23, 2026 11:41
@sdassow sdassow requested review from MaxGyver83 and andydotxyz June 23, 2026 11:57

@toaster toaster left a comment

Member

  • someone else should check the decisions regarding urn and Windows file URI in ParseURI
  • check whether uri#String() can simply be removed
  • add exceptional cases to getUserHost test
  • improve comprehensibility of hostname regexp by extracting host component pattern into a constant, i.e., by giving it a name

Comment thread storage/repository/generic_test.go Outdated
{"foo@bar", "foo", "bar"},
{"@bar", "", "bar"},
{"foo:bar@baz", "foo:bar", "baz"},
{"foo:bar@", "foo:bar", ""},

Member

You should add exceptional cases here, too, e.g. foo:bar:baz@nowhere.
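
As an illustration of what such exceptional rows could look like, here is a table-driven sketch. `splitUserHost` is a hypothetical stand-in that splits at the last "@"; it is not the PR's getUserHost, whose behaviour for these inputs is not shown in this thread, so the expected values for the extra rows are assumptions of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// splitUserHost is a hypothetical stand-in for the helper under test.
// It splits an authority at the last "@" into user info and host.
func splitUserHost(authority string) (user, host string) {
	i := strings.LastIndex(authority, "@")
	if i < 0 {
		return "", authority
	}
	return authority[:i], authority[i+1:]
}

func main() {
	cases := []struct{ in, user, host string }{
		// Rows from the test under review.
		{"foo@bar", "foo", "bar"},
		{"@bar", "", "bar"},
		{"foo:bar@baz", "foo:bar", "baz"},
		{"foo:bar@", "foo:bar", ""},
		// Exceptional rows (expected values are assumptions of this sketch).
		{"foo:bar:baz@nowhere", "foo:bar:baz", "nowhere"},
		{"a@b@c", "a@b", "c"},
		{"nowhere", "", "nowhere"},
		{"", "", ""},
	}
	for _, c := range cases {
		user, host := splitUserHost(c.in)
		ok := user == c.user && host == c.host
		fmt.Printf("%-22q user=%q host=%q ok=%v\n", c.in, user, host, ok)
	}
}
```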

Member

Generally, I don’t like the idea of testing unexported functions, because that way you do not guarantee that the exported functions work as expected.

Comment thread storage/repository/parse.go Outdated
"fyne.io/fyne/v2"
)

var rxHostName = regexp.MustCompile(`^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$`)

Member

I think you should extract the host name component pattern into a constant:

const hostNameComponentPattern = "[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"

var rxHostName = regexp.MustCompile("^" + hostNameComponentPattern + `(?:\.` + hostNameComponentPattern + ")*$")
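
A quick way to sanity-check the extracted pattern is a small program like the one below. The constant and the regexp are taken from the suggestion above; `isHostName` and `label` are hypothetical helpers added for the sketch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// One DNS label: starts and ends with an alphanumeric, 1 to 63 characters.
const hostNameComponentPattern = "[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"

// A host name is one label followed by any number of dot-separated labels.
var rxHostName = regexp.MustCompile("^" + hostNameComponentPattern + `(?:\.` + hostNameComponentPattern + ")*$")

// isHostName is a hypothetical wrapper around the regexp.
func isHostName(s string) bool {
	return rxHostName.MatchString(s)
}

// label is a hypothetical helper that builds a label of n letters.
func label(n int) string {
	return strings.Repeat("a", n)
}

func main() {
	for _, h := range []string{
		"example.com", "foo-bar.example", "-bad.com", "bad-.com", "a..b",
		label(63), label(64),
	} {
		fmt.Printf("%d chars, %q -> %v\n", len(h), h, isHostName(h))
	}
}
```

Naming the component also makes the length limit visible: one leading character plus at most 61 inner characters plus one trailing character gives the 63-character label maximum.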

Comment thread storage/repository/uri.go Outdated
 	}
-	return s.String()
+	return u.URL.String()
 }

Member

Isn’t this method superfluous now?
I think removing it would promote the embedded URL's String() method directly to uri.
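
A small sketch of that promotion, assuming uri embeds url.URL by value as in the struct under review. `newFileURI` is a hypothetical constructor used only for this example.

```go
package main

import (
	"fmt"
	"net/url"
)

// uri declares no String method of its own.
type uri struct {
	url.URL
}

// newFileURI is a hypothetical constructor for the example.
func newFileURI(p string) *uri {
	return &uri{URL: url.URL{Scheme: "file", Path: p}}
}

func main() {
	u := newFileURI("/home/user/file#1.txt")
	// This assignment compiles only because (*url.URL).String is promoted
	// into the method set of *uri.
	var s fmt.Stringer = u
	fmt.Println(s.String())
}
```

Note that the promoted method has a pointer receiver, so it is part of the method set of *uri but not of uri.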

sdassow commented Jun 25, 2026

Contributor Author

Great feedback, thanks! All addressed.

  • someone else should check the decisions regarding urn and Windows file URI in ParseURI

Yes, please!

@sdassow sdassow requested a review from toaster June 25, 2026 10:21
@andydotxyz

Member

If urn is called out as a special case I think it needs tests covering that...

@andydotxyz andydotxyz left a comment

Member

great thanks
