Improve Sonarr and Radarr sync performance by mjc · Pull Request #3306 · morpheus65535/bazarr

mjc · 2026-04-22T20:07:45Z

Summary

This PR improves Sonarr and Radarr sync performance by removing repeated work from full library syncs and adding indexes for common sync lookup paths.

The main changes are:

Add SQLite indexes used by Sonarr episode sync and wanted-subtitle queries
Avoid duplicate Sonarr episode syncs during full-series sync
Reuse Sonarr profile/tag/language context across a full sync
Skip unchanged Sonarr series row updates
Collapse duplicate per-series episode DB queries
Avoid unnecessary episode file size checks when Sonarr already reports a valid size
Cache episode parser settings during a series sync
Replace Radarr O(n²) movie comparison with keyed lookups

Performance notes

Tested against a large Bazarr database: thousands of series, tens of thousands of episodes, and thousands of movies. Measurements came from live profiling, SQLite EXPLAIN, copied-database timing, and focused micro-benchmarks.

SQLite EXPLAIN previously showed SCAN table_episodes for per-series Sonarr episode lookups and wanted-subtitle counts. With the new indexes, those plans use indexed SEARCH operations.
A copied-database sweep over all series returned on the order of 100k episode rows per round. Median CPU dropped from 160.239s to 0.327s; wall time dropped from 161.061s to 0.329s.
Full Sonarr sync could roughly double per-series episode sync work because update_series() called sync_episodes() directly and update_one_series() could also call it. The full sync path now keeps one explicit episode sync per processed series.
Full Sonarr sync now computes profile/tag/language context once and passes the already fetched Sonarr series payload through, while standalone callers still fetch as before.
Unchanged series rows can now return before issuing an UPDATE, avoiding thousands of unnecessary SQLite writes on large libraries.
Episode sync now derives episode ids from the full-row query, removing one duplicate per-series SELECT.
episodeParser() now trusts Sonarr's reported file size first and only falls back to a filesystem stat when needed.
Parser settings that were resolved repeatedly through Dynaconf are now resolved once per series sync and passed into the parser.
Radarr movie comparison now uses a radarrId keyed dictionary. With thousands of movies, the old subset scan used 0.955625s CPU in a micro-benchmark; the keyed lookup used 0.001418s CPU, about 674x faster for that comparison step.

Test plan

Added regression coverage for standalone Sonarr series refreshes so API/manual/SignalR callers still process fetched series data
Added regression coverage that full-series Sonarr sync only performs one explicit episode sync per processed series
Added regression coverage that unchanged Sonarr series rows skip UPDATE while manual callers still sync episodes
Added regression coverage that episodeParser() avoids filesystem stat calls when Sonarr's reported size is already valid
Added regression coverage that Radarr movie comparison checks the matching radarrId row rather than scanning every movie row

Measured on a large Bazarr database with thousands of series, tens of thousands of episodes, and thousands of movies. Before this migration, EXPLAIN showed SCAN table_episodes for the per-series Sonarr episode lookup and for the wanted-subtitle count. After adding idx_table_episodes_sonarrSeriesId and the partial missing_subtitles indexes, those plans changed to indexed SEARCH operations. A copied-database sweep over all series returned on the order of 100k episode rows per round. Median CPU fell from 160.239s without the indexes to 0.327s with them, and wall time fell from 161.061s to 0.329s.
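The SCAN-to-SEARCH change described above can be reproduced on a toy table. This is a minimal sketch, assuming a heavily simplified `table_episodes` schema; only the table name, the `sonarrSeriesId` column, and the index name come from the PR text, and the partial `missing_subtitles` indexes are not shown because their definitions are not given here.

```python
import sqlite3

# Simplified stand-in for table_episodes; the real table has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_episodes ("
    "sonarrEpisodeId INTEGER PRIMARY KEY, "
    "sonarrSeriesId INTEGER, "
    "title TEXT)"
)


def query_plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


lookup = "SELECT * FROM table_episodes WHERE sonarrSeriesId = ?"

# Without an index on sonarrSeriesId, SQLite has to scan the whole table.
plan_before = query_plan(lookup, (1,))

conn.execute(
    "CREATE INDEX idx_table_episodes_sonarrSeriesId "
    "ON table_episodes (sonarrSeriesId)"
)

# With the index in place, the same query becomes an indexed search.
plan_after = query_plan(lookup, (1,))

print(plan_before)
print(plan_after)
```

The exact plan wording varies between SQLite versions (older releases print `SCAN TABLE ...`), so any check on it should match on the leading `SCAN` or `SEARCH` keyword rather than the full string.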

Code inspection and live profiling showed update_series() called update_one_series() and then sync_episodes(), while update_one_series() also called sync_episodes() for non-SignalR updates. On a library with thousands of series, the full Sonarr pass could therefore roughly double per-series episode sync work before doing any useful episode comparison work. This adds an explicit flag so the full-series loop updates the series row once and keeps its existing single sync_episodes(series_id=...) call.
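The double-sync pattern and the flag that removes it can be sketched as follows. The function names mirror the ones mentioned above, but the bodies are stand-ins and the parameter names `is_signalr` and `skip_episode_sync` are hypothetical; the PR text does not name the flag.

```python
sync_calls = []  # records every episode sync so the call count is visible


def sync_episodes(series_id):
    # Stand-in for the real episode sync.
    sync_calls.append(series_id)


def update_one_series(series_id, is_signalr=False, skip_episode_sync=False):
    # ... the series row would be updated here ...
    # Standalone, non-SignalR callers still trigger an episode sync.
    if not is_signalr and not skip_episode_sync:
        sync_episodes(series_id=series_id)


def update_series(series_ids):
    for series_id in series_ids:
        # The full sync owns the episode sync, so the per-series call skips it.
        update_one_series(series_id, skip_episode_sync=True)
        sync_episodes(series_id=series_id)


update_series([1, 2, 3])
print(sync_calls)
```

With the flag set, each series in the full loop is synced exactly once, while a direct `update_one_series(4)` call keeps its previous behavior.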

Live py-spy samples after restart showed full-series sync spending time in get_language_profiles() and update_one_series() setup while processing the same full Sonarr series payload already fetched by update_series(). Before this change, update_one_series() rebuilt the audio profile list, tag map, and language profiles once per series on a library with thousands of series, and its standalone path could also fetch the series from Sonarr again. The full sync path now computes the shared profile/tag context once, passes the already fetched show payload into update_one_series(), and leaves the per-series fallback behavior for standalone and SignalR calls.
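One way to picture the shared-context change is an optional argument with a per-call fallback. This is an illustrative sketch only: `SyncContext`, `build_context()`, and the payload fields are hypothetical names, not the PR's actual code.

```python
from dataclasses import dataclass

build_calls = []  # counts how often the expensive setup runs


@dataclass(frozen=True)
class SyncContext:
    """Hypothetical bundle of data that is identical for every series in a sync."""
    language_profiles: tuple
    tags: dict


def build_context():
    # Stand-in for rebuilding audio profiles, tag map, and language profiles.
    build_calls.append(1)
    return SyncContext(language_profiles=("en",), tags={1: "anime"})


def update_one_series(show, context=None):
    # Standalone and SignalR-style callers pass nothing and pay the setup cost.
    if context is None:
        context = build_context()
    return {
        "title": show["title"],
        "tags": [context.tags.get(tag_id) for tag_id in show["tags"]],
    }


def update_series(shows):
    # The full sync builds the context once and reuses the fetched payloads.
    context = build_context()
    return [update_one_series(show, context=context) for show in shows]


shows = [{"title": f"Show {i}", "tags": [1]} for i in range(3)]
results = update_series(shows)
print(len(build_calls))
```

Keeping the fallback inside `update_one_series()` means existing callers do not need to change, which matches the intent stated above of leaving standalone behavior alone.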

Live py-spy samples still showed the full sync active inside database.execute(update(TableShows)...) for Sonarr series rows even when the parsed values were unchanged. On the measured library, that meant thousands of SQLite update transactions during a full Sonarr pass before episode sync work. Comparing the parsed series dict to the existing row lets unchanged series return before issuing UPDATE. Standalone non-SignalR calls still keep their episode sync behavior when the row is unchanged, so this only removes the avoidable series-row write.
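The unchanged-row check amounts to comparing the parsed values against the stored row before writing. A minimal sketch, assuming both sides are plain dictionaries; the helper names and the `execute_update` callback are hypothetical stand-ins for the real database call.

```python
def series_row_changed(parsed, existing_row):
    """True if any parsed value differs from what is already stored."""
    return any(existing_row.get(key) != value for key, value in parsed.items())


def update_series_row(parsed, existing_row, execute_update):
    # Return early so an unchanged series never issues an UPDATE.
    if not series_row_changed(parsed, existing_row):
        return False
    execute_update(parsed)
    return True


writes = []
row = {"sonarrSeriesId": 7, "title": "Show", "path": "/tv/show"}

update_series_row({"title": "Show", "path": "/tv/show"}, row, writes.append)
update_series_row({"title": "Show (2024)", "path": "/tv/show"}, row, writes.append)
print(len(writes))
```

Only keys present in the parsed dict are compared, so extra columns on the stored row do not force a write.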

Live py-spy samples during startup sync showed sync_episodes() spending time in _fetchall_impl/all at the TableEpisodes lookup for each series. Code inspection showed two queries for the same sonarrSeriesId: one query for episode ids and a second full-row query used for comparisons. The full-row query already contains the sonarrEpisodeId keys, so this derives the id list from that dictionary and removes one per-series SQLite SELECT from the measured thousands-series sync path.
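Deriving the id list from the full-row result is a small change, roughly like the sketch below. The helper name and row shape are assumptions for illustration; only the `sonarrEpisodeId` key comes from the text above.

```python
def index_episode_rows(rows):
    """Build the comparison dict and the id list from a single query result."""
    episodes_by_id = {row["sonarrEpisodeId"]: row for row in rows}
    # The ids are simply the dict keys, so no second SELECT is needed.
    episode_ids = list(episodes_by_id)
    return episodes_by_id, episode_ids


rows = [
    {"sonarrEpisodeId": 101, "title": "Pilot"},
    {"sonarrEpisodeId": 102, "title": "Second"},
]
episodes_by_id, episode_ids = index_episode_rows(rows)
print(episode_ids)
```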

After the earlier Sonarr sync reductions, live py-spy samples showed episodeParser() spending CPU in os.path.getsize()/genericpath for episode files. sync_episodes() already accepts an episode when Sonarr reports episodeFile.size above MINIMUM_VIDEO_SIZE, but episodeParser() still re-statted the path for every parsed episode. This trusts Sonarr's reported size first and only falls back to os.path.getsize() when the reported size is too small and the file is not an enabled .strm entry, removing the common per-episode filesystem stat from startup sync.
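The size check order described above might look like the following sketch. The function name, the injectable `getsize` parameter, and the threshold value are assumptions made for illustration; what the parser does with the resulting size is not shown here.

```python
import os

MINIMUM_VIDEO_SIZE = 20480  # placeholder threshold in bytes, assumed for this sketch


def resolve_episode_size(reported_size, path, strm_enabled, getsize=os.path.getsize):
    """Return the size to use for an episode file, statting only when needed."""
    if reported_size > MINIMUM_VIDEO_SIZE:
        return reported_size  # Sonarr's value is already valid: no filesystem call
    if strm_enabled and path.lower().endswith(".strm"):
        return reported_size  # enabled .strm entry: no stat fallback
    try:
        return getsize(path)  # only now touch the filesystem
    except OSError:
        return 0  # missing or unreadable file is treated as empty in this sketch
```

Passing `getsize` as a parameter is only there to make the stat call observable in a test; a regression test can hand in a recording fake and assert that it is never called when the reported size is valid.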

After avoiding the repeated file stat, live py-spy samples showed remaining parser overhead in Dynaconf setting resolution, including recursively_evaluate_lazy_format, from settings.general.enable_strm_support and settings.general.parse_embedded_audio_track. Those settings do not change inside one sync_episodes() pass, but the old code resolved them for each parsed episode. This resolves both settings once per series sync and passes the values to episodeParser(), while episodeParser() keeps fallback reads for standalone callers such as sync_one_episode().
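Resolving the settings once and passing them down can be sketched with a counting stand-in for the settings object. `CountingSettings` and the lowercase function names are hypothetical; the real code reads from Dynaconf, which is not reproduced here.

```python
class CountingSettings:
    """Hypothetical settings object that counts how often values are resolved."""

    def __init__(self, **values):
        self._values = values
        self.reads = 0

    def get(self, name):
        self.reads += 1
        return self._values[name]


def episode_parser(episode, settings, enable_strm_support=None,
                   parse_embedded_audio_track=None):
    # Fallback reads keep standalone callers working without extra arguments.
    if enable_strm_support is None:
        enable_strm_support = settings.get("enable_strm_support")
    if parse_embedded_audio_track is None:
        parse_embedded_audio_track = settings.get("parse_embedded_audio_track")
    return {
        "id": episode["id"],
        "strm": enable_strm_support,
        "embedded_audio": parse_embedded_audio_track,
    }


def sync_episodes(episodes, settings):
    # Resolve both settings once per series sync, then pass plain values down.
    strm = settings.get("enable_strm_support")
    embedded = settings.get("parse_embedded_audio_track")
    return [episode_parser(ep, settings, strm, embedded) for ep in episodes]


settings = CountingSettings(enable_strm_support=False, parse_embedded_audio_track=True)
parsed = sync_episodes([{"id": i} for i in range(5)], settings)
print(settings.reads)
```

In this sketch five episodes cost two setting reads instead of ten, and a direct `episode_parser()` call without the extra arguments still resolves the settings itself.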

The old full movie sync compared each parsed Radarr movie against every database movie row with any(parsed_movie.items() <= x for x in current_movies_db_kv), making the unchanged-row check O(movie_count squared). Measured with thousands of movies, the old subset scan used 0.955625s CPU in a micro-benchmark, while the keyed radarrId lookup used 0.001418s CPU, about 674x faster for the comparison step. This builds one radarrId-keyed dictionary from TableMovies, uses a set for the existing id membership check, and compares each parsed movie only against its matching row.
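The difference between the subset scan and the keyed lookup can be shown side by side. This is a sketch under simplified assumptions: rows are plain dictionaries, and the function names are hypothetical. Only the `any(... <= x ...)` pattern and the `radarrId` key come from the text above.

```python
def unchanged_by_scan(parsed_movies, db_rows):
    """Old shape: every parsed movie is compared against every database row."""
    db_items = [row.items() for row in db_rows]
    return [
        movie["radarrId"]
        for movie in parsed_movies
        if any(movie.items() <= items for items in db_items)
    ]


def unchanged_by_key(parsed_movies, db_rows):
    """New shape: one dictionary lookup per parsed movie."""
    rows_by_id = {row["radarrId"]: row for row in db_rows}
    existing_ids = set(rows_by_id)  # membership check for already-known movies
    unchanged = []
    for movie in parsed_movies:
        if movie["radarrId"] not in existing_ids:
            continue  # new movie: nothing to compare against
        if movie.items() <= rows_by_id[movie["radarrId"]].items():
            unchanged.append(movie["radarrId"])
    return unchanged


db_rows = [{"radarrId": i, "title": f"m{i}", "monitored": True} for i in range(5)]
parsed = [
    {"radarrId": 1, "title": "m1"},       # matches its stored row
    {"radarrId": 2, "title": "renamed"},  # differs from its stored row
    {"radarrId": 9, "title": "new"},      # not in the database yet
]
print(unchanged_by_scan(parsed, db_rows), unchanged_by_key(parsed, db_rows))
```

Both functions agree on this input; the keyed version does a constant amount of work per movie instead of walking the whole table each time.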

Covers the review risks around standalone Sonarr series refreshes, the full-series explicit episode sync path, unchanged series update skipping, episodeParser file-size handling, and keyed Radarr movie comparisons. These tests keep the measured optimizations from regressing while avoiding exact library-size assumptions.

mjc · 2026-04-22T20:10:38Z

Force-pushed the branch only to re-sign the commits and fix GitHub commit verification. No code changes were made in that push.

Copilot

Pull request overview

This PR improves Bazarr’s Sonarr/Radarr sync performance by reducing repeated per-item work during full syncs and adding database indexes that align with common subtitle-sync query patterns.

Changes:

Add SQLite indexes to speed up Sonarr episode lookups and “wanted subtitles” queries.
Reduce redundant Sonarr sync work (reuse context across a full sync, avoid duplicate episode syncs, skip unchanged series updates, collapse duplicate episode DB queries).
Replace Radarr’s O(n²) “is this movie changed?” comparison with keyed lookups, and add regression tests for the new fast paths.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
tests/bazarr/test_sync_performance_paths.py	Adds regression tests covering the optimized Sonarr/Radarr sync paths and parser behavior changes.
migrations/versions/6f0b2c8d9a1e_.py	Adds indexes for episodes/movies missing-subtitles queries and per-series episode lookups.
bazarr/sonarr/sync/series.py	Reuses Sonarr context across full syncs, avoids duplicate episode syncs, and skips unchanged series updates.
bazarr/sonarr/sync/parser.py	Avoids filesystem stat calls when Sonarr’s reported episode file size is already valid; caches settings via parameters.
bazarr/sonarr/sync/episodes.py	Removes a redundant per-series DB query and passes cached parser settings to reduce repeated config access.
bazarr/radarr/sync/movies.py	Replaces per-movie subset scans with `radarrId`-keyed comparisons to avoid O(n²) behavior.

morpheus65535 · 2026-05-31T14:18:57Z

Could you revisit this PR in light of the recent changes on the development branch? We've already added indexes, and there have been other DB migrations that must be taken into account.

If you need to create a new PR, could you please split the strm file support and the sync performance improvements into two separate PRs?

Thanks!

Copilot AI review requested due to automatic review settings April 22, 2026 20:07

Copilot started reviewing on behalf of mjc April 22, 2026 20:08 View session

mjc added 9 commits April 22, 2026 14:09

mjc force-pushed the optimize-sonarr-sync-indexes branch from 3967b54 to 4d2acd2 Compare April 22, 2026 20:10

Copilot AI reviewed Apr 22, 2026

mjc wants to merge 9 commits into morpheus65535:development from mjc:optimize-sonarr-sync-indexes


3 participants
