Improve resilience against connection issues when accessing redis cache #36872

malmor · 2025-07-07T07:40:32Z


Hey everyone,
we sometimes have connection issues to our redis cache, which Renovate does not seem to handle very well.

Setup

We are running Renovate hourly against ~2,000 repositories, hosted in a self-managed GitLab instance.

To improve the performance of Renovate and reduce the load on external systems, we have enabled all the different caching settings, e.g. repositoryCache, cachePrivatePackages and presetCachePersistence. To share the cache between multiple Renovate jobs, we are running a small redis instance in our infrastructure (a single container running redis:8.0.2).

Issue

A few times every day there seems to be some kind of connection error between Renovate and our redis cache:

Error: read ETIMEDOUT
   at TCP.onStreamRead (node:internal/stream_base_commons:216:20)

I assume this is some kind of issue in our infrastructure or network setup and not something caused by Renovate (we haven't figured out the cause yet; if you have any ideas, we would like to hear them 😉).

I am opening this discussion because Renovate does not seem to handle this kind of issue very resiliently. It just exits with exit code 1, without any additional logs or hints about what went wrong. Here are the logs for a failure (LOG_LEVEL is set to info, LOG_FILE_LEVEL to trace):

Job log:

 INFO: Repository started (repository=group-a/project-a)
       "renovateVersion": "40.62.1"
 INFO: Repository is disabled - skipping (repository=group-a/project-a)
 INFO: Repository finished (repository=group-a/project-a)
       "cloned": true,
       "durationMs": 1488
 INFO: Repository started (repository=group-b/project-b)
       "renovateVersion": "40.62.1"
Error: read ETIMEDOUT
    at TCP.onStreamRead (node:internal/stream_base_commons:216:20)
Uploading artifacts for failed job
Uploading artifacts...
renovate-log.ndjson: found 1 matching artifact files and directories 
Uploading artifacts as "archive" to coordinator... 201 Created  id=53489094 responseStatus=201 Created token=xxx
Cleaning up project directory and file based variables
ERROR: Job failed: command terminated with exit code 1

Log file:

{"name":"renovate","hostname":"runner-t1kypbgj-project-5253-concurrent-1-c72zi5vc","pid":266,"level":20,"logContext":"OUhNLQSgaS0-Wgi2NLofZ","repository":"group-b/project-b","msg":"No config migration necessary","time":"2025-06-30T12:48:16.295Z","v":0}
{"name":"renovate","hostname":"runner-t1kypbgj-project-5253-concurrent-1-c72zi5vc","pid":266,"level":10,"logContext":"OUhNLQSgaS0-Wgi2NLofZ","repository":"group-b/project-b","config":{"$schema":"https://docs.renovatebot.com/renovate-schema.json","extends":["config:recommended"]},"msg":"decryptConfig()","time":"2025-06-30T12:48:16.296Z","v":0}
{"name":"renovate","hostname":"runner-t1kypbgj-project-5253-concurrent-1-c72zi5vc","pid":266,"level":10,"logContext":"OUhNLQSgaS0-Wgi2NLofZ","repository":"group-b/project-b","config":{"$schema":"https://docs.renovatebot.com/renovate-schema.json","extends":["config:recommended"]},"msg":"decryptedConfig","time":"2025-06-30T12:48:16.296Z","v":0}
{"name":"renovate","hostname":"runner-t1kypbgj-project-5253-concurrent-1-c72zi5vc","pid":266,"level":10,"logContext":"OUhNLQSgaS0-Wgi2NLofZ","repository":"group-b/project-b","config":{"$schema":"https://docs.renovatebot.com/renovate-schema.json","extends":["config:recommended"]},"existingPresets":[],"msg":"resolveConfigPresets","time":"2025-06-30T12:48:16.296Z","v":0}
{"name":"renovate","hostname":"runner-t1kypbgj-project-5253-concurrent-1-c72zi5vc","pid":266,"level":10,"logContext":"OUhNLQSgaS0-Wgi2NLofZ","repository":"group-b/project-b","template":"config:recommended","filteredInput":{"platform":"gitlab","env":{"CI":"true","HOME":"/home/ubuntu","PATH":"/home/ubuntu/.local/bin:/home/ubuntu/bin:/home/ubuntu/.local/bin:/home/ubuntu/bin:/home/ubuntu/.local/bin:/home/ubuntu/bin:/home/ubuntu/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin","LC_ALL":"C.UTF-8","LANG":"C.UTF-8"}},"msg":"Compiling template","time":"2025-06-30T12:48:16.296Z","v":0}
{"name":"renovate","hostname":"runner-t1kypbgj-project-5253-concurrent-1-c72zi5vc","pid":266,"level":30,"logContext":"OUhNLQSgaS0-Wgi2NLofZ","repository":"group-b/project-b","varNames":[null,"constructor","allowedCommands","allowedEnv","allowCustomCrateRegistries","allowedHeaders","allowPlugins","allowScripts","binarySource","cacheDir","cacheHardTtlMinutes","cacheTtlOverride","containerbaseDir","customEnvVariables","dockerChildPrefix","dockerCliOptions","dockerSidecarImage","dockerUser","dryRun","encryptedWarning","exposeAllEnv","executionTimeout","githubTokenWarn","localDir","migratePresets","presetCachePersistence","gitTimeout","endpoint","httpCacheTtlDays","autodiscoverRepoSort","autodiscoverRepoOrder","userAgent","dockerMaxPages","s3Endpoint","s3PathStyle","cachePrivatePackages"],"template":"config:recommended","msg":"Disallowed variable names in template","time":"2025-06-30T12:48:16.297Z","v":0}
{"name":"renovate","hostname":"runner-t1kypbgj-project-5253-concurrent-1-c72zi5vc","pid":266,"level":10,"logContext":"OUhNLQSgaS0-Wgi2NLofZ","repository":"group-b/project-b","msg":"Resolving preset \"config:recommended\"","time":"2025-06-30T12:48:16.297Z","v":0}
{"name":"renovate","hostname":"runner-t1kypbgj-project-5253-concurrent-1-c72zi5vc","pid":266,"level":10,"logContext":"OUhNLQSgaS0-Wgi2NLofZ","repository":"group-b/project-b","msg":"getPreset(config:recommended)","time":"2025-06-30T12:48:16.297Z","v":0}
{"name":"renovate","hostname":"runner-t1kypbgj-project-5253-concurrent-1-c72zi5vc","pid":266,"level":10,"logContext":"OUhNLQSgaS0-Wgi2NLofZ","repository":"group-b/project-b","msg":"cache.get(preset, preset:config:recommended)","time":"2025-06-30T12:48:16.297Z","v":0}
EOF

Proposed / Expected behaviour

All the data stored in the redis cache should be treated as just a cache: if an entry is missing or can't be read, the cache should be ignored and the data fetched from the actual origin (e.g. GitLab or GitHub in this scenario). The error could be reported at the end of the run (similar to host errors when looking up packages), but Renovate should not just exit in the middle of processing a repository.
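To make the idea concrete, here is a minimal sketch of a "fail-open" cache lookup. It is not Renovate's actual code: the `CacheBackend` interface, the `getOrFetch` helper and the `cacheErrors` list are hypothetical names chosen for illustration. The point is only that a failed cache read or write is recorded and then ignored, so the value still comes from the origin.

```typescript
// Hypothetical minimal cache interface; not Renovate's real types.
interface CacheBackend {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

// Errors are collected here so they can be reported once at the end of a run.
const cacheErrors: string[] = [];

// Try the cache first; on a miss or any cache error, fall back to the origin.
async function getOrFetch(
  cache: CacheBackend,
  key: string,
  fetchFromOrigin: () => Promise<string>,
): Promise<string> {
  try {
    const cached = await cache.get(key);
    if (cached !== undefined) {
      return cached;
    }
  } catch (err) {
    // A broken cache must not abort the run: record the error and continue.
    cacheErrors.push(`cache.get(${key}) failed: ${(err as Error).message}`);
  }

  const fresh = await fetchFromOrigin();

  try {
    await cache.set(key, fresh);
  } catch (err) {
    // Failing to store the value is also non-fatal.
    cacheErrors.push(`cache.set(${key}) failed: ${(err as Error).message}`);
  }

  return fresh;
}
```

With a backend whose `get` and `set` both reject with `read ETIMEDOUT`, `getOrFetch` still resolves with the origin value, and `cacheErrors` holds two entries that could be summarised when the run finishes.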

What do you think about this? Should there be a more graceful handling for connection issues to a redis cache?

Kind regards,
Malte

rarkins · 2025-07-07T08:46:32Z


I think it would be fine to log an ERROR and then keep going, with attempts to reconnect. Because it could happen hundreds or thousands of times, it should be logger.once.error() so that it's deduplicated.
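A rough sketch of the deduplication idea, for anyone unfamiliar with it. This is a stand-alone illustration and not the implementation behind Renovate's `logger.once.error()`; the `errorOnce` function and the `seenMessages` and `emitted` variables are made-up names.

```typescript
// Messages that have already been logged during this run.
const seenMessages = new Set<string>();

// Kept only so the behaviour can be inspected; a real logger would not need it.
const emitted: string[] = [];

// Log an error message the first time it is seen and drop later repeats.
function errorOnce(message: string): void {
  if (seenMessages.has(message)) {
    return;
  }
  seenMessages.add(message);
  emitted.push(message);
  console.error(`ERROR: ${message}`);
}

// Simulate the same cache failure being hit many times in one run.
for (let i = 0; i < 1000; i += 1) {
  errorOnce('Redis cache unavailable, continuing without cache');
}
```

After the loop, `emitted` contains a single entry, so the log shows the error once rather than a thousand times.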

PR welcome if you'd like to figure out where in the code this can be done. It looks like it may be an uncaught exception, because the "Error:" prefix is not how Renovate logs.
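One possible mechanism, offered as a guess rather than a diagnosis: in Node.js, an `EventEmitter` that emits an `'error'` event with no listener attached throws that error. If this happens inside an I/O callback, nothing catches it and the process exits. The sketch below uses a plain `EventEmitter` as a stand-in for a network client; whether this is what actually happens in Renovate's redis code path has not been confirmed.

```typescript
import { EventEmitter } from 'node:events';

// Stand-in for a network client; hypothetical, not the real redis client.
const client = new EventEmitter();

// Without an 'error' listener, emitting 'error' throws the error itself.
let threwWithoutListener = false;
try {
  client.emit('error', new Error('read ETIMEDOUT'));
} catch {
  threwWithoutListener = true;
}

// With a listener attached, the same event is delivered to the handler instead.
const handledErrors: string[] = [];
client.on('error', (err: Error) => {
  handledErrors.push(err.message);
});
client.emit('error', new Error('read ETIMEDOUT'));
```

Here the first `emit` throws and the second is routed to the handler. If this is the cause, attaching an error handler to the cache client would be the place to log once and keep going.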
