Problem
`--cache-exclude-status` lets you exclude HTTP status codes (e.g. `429`, `500..599`) from being cached, but network-level errors (`Connection reset by peer`, DNS failures, timeouts that never get an HTTP response) have no status code and cannot be excluded. With `--cache`, lychee caches them just like any other result.
On a subsequent run lychee finds the cached error, considers it fresh (within `--max-cache-age`), and reports the failure without retrying, bypassing `--max-retries` entirely. A transient infrastructure blip in one run thus causes every subsequent run that uses the cache within that window to fail on a URL that is actually healthy.
Reproduction
- Run lychee with `--cache` against a URL that has a flaky network path.
- On the first run where the connection is reset, lychee caches the error.
- On any subsequent run within `--max-cache-age`, lychee immediately reports the URL as failed without making a network request (and without applying `--max-retries`).
Expected behaviour
Network-level errors should not be cached, or at minimum should be excludable. They are inherently transient: unlike a `404` (the page is gone) or a `403` (access denied), a TCP reset or DNS timeout says nothing about the long-term availability of the URL.
Proposed fix
One or both of the following:
- Extend `--cache-exclude-status` to accept a sentinel value such as `error` or `network` that matches any non-HTTP failure (connection reset, timeout with no response, DNS error, etc.).
  Example: `--cache-exclude-status error,429,500..599`
- Default to not caching network errors. HTTP `404`/`410` are worth caching (the resource is gone). A TCP reset is not: it almost certainly reflects a transient condition. Caching it by default makes `--max-retries` useless for cached runs.
Workaround
Strip non-2xx entries from the restored `.lycheecache` file before invoking lychee. This forces lychee to re-check any URL that previously errored, restoring proper retry behaviour regardless of what is in the cache:
# Keep only successful cache entries so transient errors are always retried
if [ -f .lycheecache ]; then
  grep -E ',2[0-9]{2},' .lycheecache > .lycheecache.tmp || true
  mv .lycheecache.tmp .lycheecache
fi
Note: this relies on the assumption that lychee's cache format stores the HTTP status as a comma-delimited field — if the format changes this workaround will silently drop all cache entries. A first-class flag in lychee is the right fix.
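A slightly stricter variant of the filter above is sketched here. The `grep` pattern matches `,2xx,` anywhere in a line, so a URL that itself contains something like `,200,` would be kept even if its recorded status is an error. This sketch instead checks only the second-to-last comma-separated field. It assumes each cache row has the shape `<url>,<status>,<timestamp>`; that layout is an assumption about the cache file, not a documented guarantee, and `prune_lychee_cache` is a hypothetical helper name.

```shell
# Hypothetical helper: keep only cache rows whose status field is 2xx.
# ASSUMPTION: each row looks like "<url>,<status>,<timestamp>", so the status
# is the second-to-last comma-separated field (URLs may contain commas).
prune_lychee_cache() {
  cache_file="${1:-.lycheecache}"

  # Nothing to prune if there is no restored cache.
  [ -f "$cache_file" ] || return 0

  tmp_file="${cache_file}.tmp"

  # Print a row only when it has at least three fields and the
  # second-to-last field is exactly three digits starting with 2.
  awk -F',' 'NF >= 3 && $(NF-1) ~ /^2[0-9][0-9]$/' "$cache_file" > "$tmp_file"

  mv "$tmp_file" "$cache_file"
}
```

Calling `prune_lychee_cache .lycheecache` after restoring the cache and before running lychee with `--cache` would have the same effect as the snippet above. Like the original workaround, it drops every entry if the cache layout differs from the assumed one.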