Fetch Ruff from an Astral mirror#18286

Open
zsol wants to merge 3 commits into main from zsol/jj-wvpyrmkopmqr

Conversation

Member

@zsol commented Mar 4, 2026

Apply the hardcoded Astral mirror pattern to URLs coming from the NDJSON, then use the same fallback-first-then-retry loop in download_and_unpack_with_retry as for Python downloads. The loop is extracted as shared logic, since we'll need it in at least one more place: #18358.
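
As a rough illustration of the mirror step described above, here is a minimal sketch that rewrites a release URL into an ordered list of candidates. The two prefixes, the function name `candidate_urls`, and the choice to try the mirror first are assumptions made for this example, not values taken from the PR.

```rust
// Hypothetical prefixes: neither constant is taken from the PR.
const UPSTREAM_PREFIX: &str = "https://github.com/astral-sh/ruff/releases/download/";
const MIRROR_PREFIX: &str = "https://mirror.example/ruff/";

/// Return the URLs to try, in order: the mirrored URL first (when the
/// upstream pattern matches), then the original URL as the fallback.
fn candidate_urls(original: &str) -> Vec<String> {
    let mut urls = Vec::new();
    if let Some(rest) = original.strip_prefix(UPSTREAM_PREFIX) {
        // Keep the path below the prefix and swap only the host part.
        urls.push(format!("{MIRROR_PREFIX}{rest}"));
    }
    urls.push(original.to_string());
    urls
}
```

A URL that does not match the pattern yields a single candidate, so the caller's loop degrades to plain retry behavior.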

@zsol force-pushed the zsol/jj-wvpyrmkopmqr branch from 575dc76 to b3580dd on March 6, 2026 20:16
@zsol force-pushed the zsol/jj-wvpyrmkopmqr branch from b3580dd to f4e8df4 on March 6, 2026 20:31
@zsol changed the title from "Fetch ruff from an Astral mirror" to "Fetch Ruff from an Astral mirror" on Mar 6, 2026
@zsol force-pushed the zsol/jj-wvpyrmkopmqr branch 3 times, most recently from 14daf6a to 425c3c6, on March 6, 2026 20:39
@zsol marked this pull request as ready for review on March 6, 2026 20:39
@zsol requested a review from konstin on March 9, 2026 10:19
@konstin added the enhancement label (New feature or improvement to existing functionality) on Mar 10, 2026
@zsol force-pushed the zsol/jj-wvpyrmkopmqr branch from 425c3c6 to 96aabc9 on March 10, 2026 14:48
Comment on lines +358 to +364
    Self::Download { .. } => true,
    Self::RetriedError { err, .. } => err.should_try_next_url(),
    Self::Extract { source } => {
        retryable_on_request_failure(source) == Some(Retryable::Transient)
    }
    _ => false,
}
Member

IIRC we call retryable_on_request_failure on the whole error type, or at least that's what we usually do to catch all cases. In this file, we have e.g.:

let reader = response
    .bytes_stream()
    .map_err(std::io::Error::other)

which would bubble into the wrong path.

Member Author

Are you saying we don't need to investigate the uv-bin-install-level Error at all and can simply delegate to retryable_on_request_failure from should_try_next_url?
Or should we do something like:

match self {
  Self::Download { .. } => true,
  Self::RetriedError { err, .. } => err.should_try_next_url(),
  _ => retryable_on_request_failure(self) == Some(Retryable::Transient),
}

(I don't really understand how the example you gave will bubble into the wrong path - AFAICT it will match Self::Extract, where source will be something that eventually wraps std::io::Error::other which itself wraps the reqwest/h2/... error, but that will be extracted by retryable_on_request_failure, no?)
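
To make the chain-walking question above concrete, here is a small sketch of finding a transport error that has been wrapped with std::io::Error::other. `TransportError`, `ExtractError`, and `contains_transport_error` are hypothetical stand-ins, not the types or helpers in uv. One detail worth noting: the standard library's io::Error forwards `source()` to the wrapped error's own source, so the walker steps into the wrapped error with `get_ref()` instead.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical stand-in for a transport error such as one from reqwest or h2.
#[derive(Debug)]
struct TransportError;

impl fmt::Display for TransportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "transport error")
    }
}

impl Error for TransportError {}

/// Hypothetical wrapper playing the role of an `Extract { source }` variant.
#[derive(Debug)]
struct ExtractError {
    source: io::Error,
}

impl fmt::Display for ExtractError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to extract archive")
    }
}

impl Error for ExtractError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Walk the error chain looking for a `TransportError`.
fn contains_transport_error(err: &(dyn Error + 'static)) -> bool {
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        if e.is::<TransportError>() {
            return true;
        }
        // For an `io::Error`, step into the wrapped error via `get_ref()`;
        // for everything else, follow the regular `source()` link.
        current = match e.downcast_ref::<io::Error>().and_then(|io_err| io_err.get_ref()) {
            Some(inner) => {
                let inner: &(dyn Error + 'static) = inner;
                Some(inner)
            }
            None => e.source(),
        };
    }
    false
}
```

Under these assumptions, an `ExtractError` built from `io::Error::other(TransportError)` is detected, while one built from a plain `io::Error` is not.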

Member Author

@zsol Mar 11, 2026

Alternatively, we could simply change the stream reader to always wrap its error with Error::Download instead of (or in addition to?) io::Error::other, and then do a very simple source iteration in should_try_next_url similar to what retryable_on_request_failure does - but now it only needs to check for Error::Download. How does that sound?

Edit: actually response already wraps its errors like that, so we don't need to change the download side.

Member

> Are you saying we don't need to investigate the uv-bin-install-level Error at all and simply delegate to retryable_on_request_failure from should_try_next_url?

I think so; at least, I can't come up with an example where retryable_on_request_failure doesn't work.

Member Author

I got the LLM to conjure an error case that isn't currently classified as retryable but for which we should still fall back to the next URL. I added a test case for this (an invalid chunk size in a chunked-encoded HTTP response).

I made this work by inventing a new Error::Streaming that wraps all errors that occur while reading the HTTP stream, and specifically looking for it while walking the error chain.

Wdyt?
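
The approach described above could be sketched roughly as follows. The enum `DownloadError` and its variants are hypothetical and only mirror the idea of a Streaming wrapper; this is not the actual uv error type, and the real implementation may differ.

```rust
use std::error::Error;
use std::fmt;

type BoxError = Box<dyn Error + Send + Sync + 'static>;

/// Hypothetical error type for this sketch.
#[derive(Debug)]
enum DownloadError {
    /// Any failure while reading the HTTP response body.
    Streaming(BoxError),
    /// An outer wrapper, e.g. from the unpacking layer.
    Extract(BoxError),
    /// A failure unrelated to the network, e.g. a checksum mismatch.
    Other(String),
}

impl fmt::Display for DownloadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Streaming(_) => write!(f, "failed to read response body"),
            Self::Extract(_) => write!(f, "failed to extract archive"),
            Self::Other(msg) => write!(f, "{msg}"),
        }
    }
}

impl Error for DownloadError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::Streaming(inner) | Self::Extract(inner) => {
                let inner: &(dyn Error + 'static) = &**inner;
                Some(inner)
            }
            Self::Other(_) => None,
        }
    }
}

impl DownloadError {
    /// True if a `Streaming` error appears anywhere in the source chain.
    fn should_try_next_url(&self) -> bool {
        let start: &(dyn Error + 'static) = self;
        let mut current = Some(start);
        while let Some(err) = current {
            if matches!(
                err.downcast_ref::<DownloadError>(),
                Some(DownloadError::Streaming(_))
            ) {
                return true;
            }
            current = err.source();
        }
        false
    }
}
```

Because the check looks only for the wrapper variant, it does not need to know which concrete protocol error occurred underneath.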

Member

@konstin Mar 11, 2026

This exercises a realistic body-streaming protocol failure: the server advertises chunked transfer encoding but sends an invalid chunk size.

That sounds like something that either retryable_on_request_failure should cover, or should be fatal (here: next URL); it's the same problem for other types of requests. Is the problem that retryable_on_request_failure doesn't see the error type due to e.g. #[error(transparent)], or that we don't match that kind of error? It may just be that we consider this error fatal because it's not an error that happens transiently and goes away when retrying the URL.
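
For readers unfamiliar with the failure being discussed: in HTTP/1.1 chunked transfer encoding, each chunk is preceded by a line giving its size in hexadecimal. The sketch below is a simplified, hypothetical parser (`parse_chunk_size` is not a function from uv or any HTTP library) that shows why a malformed size line is a protocol error and not a transient condition.

```rust
use std::num::ParseIntError;

/// Parse a chunk-size line: hexadecimal digits, optionally followed by
/// `;extension` parameters, terminated by CRLF. Simplified for illustration.
fn parse_chunk_size(line: &str) -> Result<usize, ParseIntError> {
    let digits = line
        .trim_end_matches("\r\n")
        .split(';')
        .next()
        .unwrap_or("")
        .trim();
    usize::from_str_radix(digits, 16)
}
```

A line such as "zz\r\n" fails to parse no matter how often the same response is re-read, which matches the intuition that retrying the same URL is unlikely to help.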

Member Author

The issue is that retryable_on_request_failure might consider some errors fatal (for good reasons), but they aren't fatal for the purposes of falling back to the next URL. I'm thinking of errors that point to a misconfiguration of the host we're downloading from; this is the reason I wanted to capture all network errors for Python downloading in

fn should_try_next_url(&self) -> bool {
    match self {
        // There are two primary reasons to try an alternative URL:
        // - HTTP/DNS/TCP/etc errors due to a mirror being blocked at various layers
        // - HTTP 404s from the mirror, which may mean the next URL still works
        // So we catch all network-level errors here.
        Self::NetworkError(..)
        | Self::NetworkMiddlewareError(..)
        | Self::NetworkErrorWithRetries { .. } => true,
        // `Io` uses `#[error(transparent)]`, so `source()` delegates to the inner error's
        // own source rather than returning the `io::Error` itself. We must unwrap it
        // explicitly so that `retryable_on_request_failure` can inspect the io error kind.
        Self::Io(err) => retryable_on_request_failure(err).is_some(),
        _ => false,
    }
}
The invalid chunk size case is arguably a sign of the host being broken, and I wouldn't have high hopes of a retry working, so I think it makes sense to mark it as fatal for retry purposes.
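
The distinction drawn above, between "retry the same URL" and "move on to the next URL", can be sketched as two independent flags driving one loop. `FetchError`, `fetch_with_fallback`, and the exact ordering of the checks are assumptions for illustration; the shared loop in the PR may be structured differently.

```rust
/// Hypothetical error carrying the two independent classifications.
#[derive(Debug, Clone, PartialEq)]
struct FetchError {
    /// Would retrying the same URL plausibly help?
    transient: bool,
    /// Is a different host worth trying?
    try_next_url: bool,
}

/// Try each URL in order. Fall back to the next URL first; only retry a URL
/// when no fallback applies and the error is transient.
fn fetch_with_fallback<F>(
    urls: &[&str],
    max_retries: u32,
    mut fetch: F,
) -> Result<String, FetchError>
where
    F: FnMut(&str) -> Result<String, FetchError>,
{
    for (i, &url) in urls.iter().enumerate() {
        let is_last = i + 1 == urls.len();
        let mut attempts = 0;
        loop {
            match fetch(url) {
                Ok(body) => return Ok(body),
                Err(err) => {
                    if !is_last && err.try_next_url {
                        // A broken or blocked host: move on without retrying.
                        break;
                    }
                    if err.transient && attempts < max_retries {
                        attempts += 1;
                        continue;
                    }
                    return Err(err);
                }
            }
        }
    }
    // Only reachable when `urls` is empty.
    Err(FetchError { transient: false, try_next_url: false })
}
```

With this shape, an error like the invalid chunk size can be non-transient (no retry against the same host) while still triggering the fallback to the next URL.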

@zsol force-pushed the zsol/jj-wvpyrmkopmqr branch from adef364 to 8b6bd55 on March 11, 2026 13:26
Labels

enhancement New feature or improvement to existing functionality

2 participants