Improve metadata request handling #285

ZachNagengast · 2025-11-03T20:32:32Z

This PR attempts to fix a couple of issues with the httpHead request. They show up mainly in our project's unit tests, but the fixes are useful in general.

- URLSession appears to have an internal limit on how many sessions can be instantiated in a short amount of time. New sessions are necessary for the httpGet requests because the sessionConfig can change when using background mode (potential for improvement here). For httpHead, however, we use the default config with a single redirect delegate, so the session can be a shared singleton for the process.
- With the addition of httpHead checks we increased the volume of API requests to HF, which may have led to some rate limiting without an HF token. Metadata responses are now cached with a 1 minute TTL, specifically to reduce repeated requests to the same endpoint within a short timeframe.
  - The caching is directly inspired by swift-package-manager's metadata caching.
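
For illustration only, here is a minimal sketch of a cache with a default TTL that purges expired entries on every lookup, as described above. The type name `TTLCacheSketch`, its method names, and the injectable clock are hypothetical; this is not the PR's `MetadataCache`, just one way such a cache could look.

```swift
import Foundation

/// Hypothetical TTL cache sketch; not the PR's `MetadataCache` implementation.
final class TTLCacheSketch<Key: Hashable, Value> {
    private struct Entry {
        let value: Value
        let expiresAt: Date
    }

    private var entries: [Key: Entry] = [:]
    private let lock = NSLock()
    private let defaultTTL: TimeInterval
    private let now: () -> Date

    /// `now` is injectable so expiry can be exercised without sleeping (an assumption of this sketch).
    init(defaultTTL: TimeInterval, now: @escaping () -> Date = { Date() }) {
        self.defaultTTL = defaultTTL
        self.now = now
    }

    /// Stores a value that expires after `ttl` seconds, or after the default TTL.
    func set(_ value: Value, for key: Key, ttl: TimeInterval? = nil) {
        lock.lock()
        defer { lock.unlock() }
        entries[key] = Entry(value: value, expiresAt: now().addingTimeInterval(ttl ?? defaultTTL))
    }

    /// Returns the value if it has not expired. Every call drops all expired
    /// entries, not only the one for the requested key.
    func get(_ key: Key) -> Value? {
        lock.lock()
        defer { lock.unlock() }
        let current = now()
        entries = entries.filter { $0.value.expiresAt > current }
        return entries[key]?.value
    }
}
```

A process-wide instance could then be created once, for example `TTLCacheSketch<URL, HTTPURLResponse>(defaultTTL: 60)`, and consulted before each HEAD request.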

This also cleans up the interface of the httpHead function to return only the HTTPURLResponse instead of (Data, HTTPURLResponse), since it's an internal function and the Data was ignored at every call site. The .gitignore change excludes the build folder for the transformers-cli example.
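
As a rough sketch of the shape described here (one process-wide session with a redirect delegate, and a HEAD helper that returns only the response), something like the following could work. The names `RelativeRedirectDelegate` and `MetadataSessionSketch`, and the rule used to decide whether a `Location` value is relative, are assumptions for illustration and may differ from the code in HubApi.swift.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Hypothetical delegate: follows a redirect only when its Location header is relative.
final class RelativeRedirectDelegate: NSObject, URLSessionTaskDelegate {
    /// Treats a non-empty location without scheme and host as relative (an assumption of this sketch).
    static func isRelative(location: String) -> Bool {
        guard !location.isEmpty, let components = URLComponents(string: location) else {
            return false
        }
        return components.scheme == nil && components.host == nil
    }

    func urlSession(
        _ session: URLSession,
        task: URLSessionTask,
        willPerformHTTPRedirection response: HTTPURLResponse,
        newRequest request: URLRequest,
        completionHandler: @escaping (URLRequest?) -> Void
    ) {
        guard let location = response.value(forHTTPHeaderField: "Location"),
              Self.isRelative(location: location)
        else {
            // Passing nil stops the redirect; the caller receives the 3xx response itself.
            completionHandler(nil)
            return
        }
        completionHandler(request)
    }
}

enum MetadataSessionSketch {
    /// One session for the whole process, intended for HEAD requests only.
    static let shared = URLSession(
        configuration: .default,
        delegate: RelativeRedirectDelegate(),
        delegateQueue: nil
    )

    /// Performs a HEAD request and returns only the HTTP response.
    static func httpHead(for url: URL) async throws -> HTTPURLResponse {
        var request = URLRequest(url: url)
        request.httpMethod = "HEAD"
        return try await withCheckedThrowingContinuation { continuation in
            shared.dataTask(with: request) { _, response, error in
                if let http = response as? HTTPURLResponse {
                    continuation.resume(returning: http)
                } else {
                    continuation.resume(throwing: error ?? URLError(.badServerResponse))
                }
            }.resume()
        }
    }
}
```

Because the session is a static constant, every caller reuses the same URLSession instead of creating a new one per request.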

Interested in feedback on the TTL time and any interface changes, as well as general comments.

pcuenca · 2025-11-04T13:29:54Z

Sources/Hub/HubApi.swift

+    /// Session actor for metadata requests with redirect handling.
+    ///
+    /// Static to share a single URLSession across all HubApi instances, preventing resource
+    /// exhaustion when many instances are created. Persists for process lifetime.


If we go this way, it might be useful to document that this is only being used for HEAD requests.

pcuenca · 2025-11-04T13:36:44Z

Sources/Hub/HubApi.swift

+    /// Cache for metadata responses with configurable expiration.
+    ///
+    /// Shared cache across all HubApi instances for better hit rates and lower memory usage.
+    /// Reduces redundant network requests for file metadata. Default TTL is 1 minute (60 seconds).
+    /// Entries auto-expire based on TTL on next get call for any key.
+    internal static let metadataCache: MetadataCache = .init(defaultTTL: 1 * 60)


A 1 minute cache sounds reasonable, but I wonder if this will introduce issues or edge cases that will be difficult to debug. It also introduces some complexity in exchange for solving a very specific problem: allowing unauthenticated CI runs to proceed. Have you explored how difficult it would be to fix this problem as part of the testing suite, or would that be impractical?

Thinking out loud: The Foundation URL Loading System has built-in response caching via URLCache. If we're getting cache misses, there are two possibilities. Either:

- The cache isn't big enough
- The server isn't sending the right headers

The default cache is tiny (IIRC, <1MB in-memory, 10MB on disk), so setting a larger, explicit size in the URLSessionConfiguration is a good first step.
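
A minimal sketch of that first step, assuming a helper named `makeCachingConfiguration` (hypothetical) and placeholder capacities that are not recommendations from this thread:

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Builds a session configuration with an explicitly sized URLCache.
/// The default capacities are placeholder values chosen for illustration.
func makeCachingConfiguration(
    memoryCapacity: Int = 16 * 1024 * 1024,
    diskCapacity: Int = 128 * 1024 * 1024
) -> URLSessionConfiguration {
    let configuration = URLSessionConfiguration.default
    // Passing nil for the directory lets the system pick a default location.
    configuration.urlCache = URLCache(
        memoryCapacity: memoryCapacity,
        diskCapacity: diskCapacity,
        directory: nil
    )
    return configuration
}
```

A session created with `URLSession(configuration: makeCachingConfiguration())` would then use this cache instead of `URLCache.shared`.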

Unfortunately, the API isn't sending the headers needed to get server-negotiated cache behavior:

    $ curl -I https://huggingface.co/api/models/bert-base-uncased
    Access-Control-Allow-Origin: https://huggingface.co
    Access-Control-Expose-Headers: X-Repo-Commit,X-Request-Id,X-Error-Code,X-Error-Message,X-Total-Count,ETag,Link,Accept-Ranges,Content-Range,X-Linked-Size,X-Linked-ETag,X-Xet-Hash
    Access-Control-Max-Age: 86400
    Connection: close
    Content-Length: 4464
    Content-Type: application/json; charset=utf-8
    Cross-Origin-Opener-Policy: same-origin
    Date: Tue, 04 Nov 2025 15:38:16 GMT
    Etag: W/"1170-c8f41jvZ6In8xh7lkgxKmGX+dKI"
    Ratelimit: "api";r=498;t=176
    Ratelimit-Policy: "fixed window";"api";q=500;w=300
    Referrer-Policy: strict-origin-when-cross-origin
    Vary: Origin
    Via: 1.1 4d89e7f6870714b602988e2ed1135996.cloudfront.net (CloudFront)
    X-Amz-Cf-Id: 5f5unAR82VYVA3CC1Vo5g6NCp1SYkkeqt78EX_zO9ugLTbYNhBcjUA==
    X-Amz-Cf-Pop: IAD55-P8
    X-Cache: Miss from cloudfront
    X-Powered-By: huggingface-moon
    X-Request-Id: Root=1-690a1de8-66e423c063a9090f15c12908

We get ETag which we can use for validation, but not Cache-Control, Expires, or Last-Modified. So we'd need to configure request.cachePolicy = .returnCacheDataElseLoad.
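
To make the two options concrete, here is a small sketch of both request shapes: one that prefers any cached response, and one that revalidates a known ETag with `If-None-Match`. The function names are hypothetical, and whether URLCache actually stores HEAD responses is an assumption that would need to be verified.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// HEAD request that returns cached data when present, regardless of its age.
func makeCachedHeadRequest(for url: URL) -> URLRequest {
    var request = URLRequest(url: url)
    request.httpMethod = "HEAD"
    request.cachePolicy = .returnCacheDataElseLoad
    return request
}

/// HEAD request that skips the local cache and asks the server to validate a
/// previously seen ETag. A 304 response would mean the resource is unchanged.
func makeValidatingHeadRequest(for url: URL, etag: String) -> URLRequest {
    var request = URLRequest(url: url)
    request.httpMethod = "HEAD"
    request.cachePolicy = .reloadIgnoringLocalCacheData
    request.setValue(etag, forHTTPHeaderField: "If-None-Match")
    return request
}
```

The first form never expires entries on its own, which is why some explicit invalidation would still be needed alongside it.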

It'd be great to avoid implementing our own caching mechanism, but I know a lot of devs end up doing that because of how mysterious URLCache can be in practice.

Good points. This is probably an unnecessary addition to this PR; I just figured it would be useful because I was working on the SPM project's metadata cache recently. Caching the HEAD response in particular is interesting because it is often used specifically to determine whether a cache should be used or not. Any caching at all for this HEAD request could definitely lead to a model/file being out of date.

One hybrid option we could take is using the URLCache and .returnCacheDataElseLoad but clearing the cache manually on some interval with an override.

Something like this:

    override func cachedResponse(for request: URLRequest) -> CachedURLResponse? {
        let key = cacheKey(for: request)

        // Check if expired
        if isExpired(key) {
            super.removeCachedResponse(for: request)
            return nil
        }

        // Return from underlying cache if still valid
        return super.cachedResponse(for: request)
    }
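
The override above leaves `cacheKey(for:)` and `isExpired(_:)` undefined. One possible way to fill them in is sketched below, under several assumptions: the `ExpiryTracker` helper, the method-plus-URL key, and the fixed 60 second TTL are all hypothetical, and URLSession may go through the task-based URLCache overloads rather than these request-based ones, which would need checking.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Hypothetical helper that remembers when each key was stored.
struct ExpiryTracker {
    var ttl: TimeInterval
    private var storedAt: [String: Date] = [:]

    init(ttl: TimeInterval) { self.ttl = ttl }

    mutating func markStored(_ key: String, at date: Date = Date()) {
        storedAt[key] = date
    }

    mutating func forget(_ key: String) {
        storedAt[key] = nil
    }

    /// Unknown keys count as expired, so entries without a timestamp are not trusted.
    func isExpired(_ key: String, at date: Date = Date()) -> Bool {
        guard let stored = storedAt[key] else { return true }
        return date.timeIntervalSince(stored) >= ttl
    }
}

/// URLCache subclass that drops entries older than the tracker's TTL on lookup.
final class ExpiringURLCache: URLCache {
    private var tracker = ExpiryTracker(ttl: 60) // assumed 1 minute TTL
    private let lock = NSLock()

    /// Assumed key format: HTTP method plus absolute URL.
    private func cacheKey(for request: URLRequest) -> String {
        "\(request.httpMethod ?? "GET") \(request.url?.absoluteString ?? "")"
    }

    override func storeCachedResponse(_ cachedResponse: CachedURLResponse, for request: URLRequest) {
        lock.lock()
        tracker.markStored(cacheKey(for: request))
        lock.unlock()
        super.storeCachedResponse(cachedResponse, for: request)
    }

    override func cachedResponse(for request: URLRequest) -> CachedURLResponse? {
        let key = cacheKey(for: request)
        lock.lock()
        let expired = tracker.isExpired(key)
        if expired { tracker.forget(key) }
        lock.unlock()

        if expired {
            super.removeCachedResponse(for: request)
            return nil
        }
        return super.cachedResponse(for: request)
    }
}
```

Keeping the expiry bookkeeping in a separate value type makes the TTL logic testable without touching the network or the underlying cache storage.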

Perhaps this would be better suited for its own PR, though.

pcuenca · 2025-11-04T13:38:32Z

Sources/Hub/HubApi.swift

    /// Performs an HTTP HEAD request to retrieve metadata without downloading content.
    ///
-    /// Allows relative redirects but ignores absolute ones for LFS files.
+    /// Uses a shared URLSession with custom redirect handling that only allows relative redirects


Ah, the use of a shared session is documented here 👍 I would still mention it when the var is defined.

ZachNagengast added 5 commits October 29, 2025 13:41

Use cache and static urlsession for metadata requests

ef5aa33

Formatting

cb1e06c

Update .gitignore

18ef7a3

Fix comment

60ae68a

Cleanup httpHead interface

bac92b3

ZachNagengast requested review from mattt and pcuenca November 3, 2025 20:32

Formatting

2ad0cb8

pcuenca reviewed Nov 4, 2025

Remove metadata request caching

8ca9805

- Better handled with a follow up PR to prevent unintended edge cases

mattt approved these changes Nov 5, 2025
