Commit 68ee8e5

yarikoptic and claude committed
Add patch to URL-escape key paths in HTTP remote URLs
keyUrls in Remote/Git.hs constructs URLs for fetching content from HTTP git remotes by simple string concatenation of the repo URL and the annex object path. When the key contains characters that keyFile encodes using git-annex's internal escaping (&c for colons, % for slashes), the resulting URL contains bare % and & characters. A bare % is invalid in a URI path, since % must be followed by two hex digits per RFC 3986, and the parser rejects the URL as "invalid url".

This affects URL-backend keys like URL--yt:https://www.youtube.com/watch?v=... where keyFile produces paths containing &c (for :) and %% (for //), resulting in unparseable URLs. SHA256E and other hash-based keys are unaffected since their serialized forms contain only URI-safe characters.

The fix applies escapeURIString (from Network.URI, already imported) to percent-encode the path components while preserving / as a path separator. This is the same approach used by Remote/S3.hs and Remote/WebDAV/DavLocation.hs.

See https://git-annex.branchable.com/bugs/fails_to_get_from_apache2_server_URL_backend_file/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent d18a02c commit 68ee8e5

1 file changed

Lines changed: 18 additions & 0 deletions

@@ -0,0 +1,18 @@
+diff --git a/Remote/Git.hs b/Remote/Git.hs
+index 6b7dc77d98..4faaea082d 100644
+--- a/Remote/Git.hs
++++ b/Remote/Git.hs
+@@ -482,7 +482,12 @@ inAnnex' repo rmt st@(State connpool duc _ _ _ _) key
+ keyUrls :: GitConfig -> Git.Repo -> Remote -> Key -> [String]
+ keyUrls gc repo r key = map tourl locs'
+   where
+-    tourl l = Git.repoLocation repo ++ "/" ++ l
++    tourl l = Git.repoLocation repo ++ "/" ++ escapeURIString escchar l
++    -- Escape characters that are not allowed unescaped in a URI
++    -- path component, but don't escape '/' since the location
++    -- is a path with multiple components.
++    escchar '/' = True
++    escchar c = isUnescapedInURIComponent c
+     -- If the remote is known to not be bare, try the hash locations
+     -- used for non-bare repos first, as an optimisation.
+     locs
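
The escaping rule from the patch can be tried in isolation. The following is a minimal sketch, not part of the commit: it assumes the network-uri package is available, and the names toUrl, exampleBase, exampleLoc, naiveParses and escapedParses, as well as the example repository URL and object location, are hypothetical and chosen only for illustration. Only escchar mirrors the predicate added by the patch.

```haskell
import Data.Maybe (isJust)
import Network.URI (escapeURIString, isUnescapedInURIComponent, parseURI)

-- Predicate handed to escapeURIString: True means "leave this character
-- as it is". '/' is kept because the location has several path components.
escchar :: Char -> Bool
escchar '/' = True
escchar c = isUnescapedInURIComponent c

-- Hypothetical helper: join a repository URL and an annex object location,
-- percent-encoding the location the way the patched tourl does.
toUrl :: String -> FilePath -> String
toUrl base loc = base ++ "/" ++ escapeURIString escchar loc

-- Hypothetical repository URL and keyFile-style location, for illustration.
exampleBase :: String
exampleBase = "http://example.com/repo"

exampleLoc :: FilePath
exampleLoc = "annex/objects/URL--yt&chttps&c%%example.com%watch"

-- Plain concatenation leaves a bare '%' in the path, so parsing fails.
naiveParses :: Bool
naiveParses = isJust (parseURI (exampleBase ++ "/" ++ exampleLoc))

-- After escaping, every '%' is followed by two hex digits and parsing succeeds.
escapedParses :: Bool
escapedParses = isJust (parseURI (toUrl exampleBase exampleLoc))
```

Loaded in GHCi, naiveParses should evaluate to False and escapedParses to True, while a location made only of alphanumerics, '-', '.' and '/' passes through toUrl unchanged, which matches the commit message's note that hash-based keys are unaffected.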
