Skip to content

fix: optimize org stats count queries and fix branching bug#3531

Open
dkrizan wants to merge 3 commits intomainfrom
dkrizan/fix-org-stats-count-queries
Open

fix: optimize org stats count queries and fix branching bug#3531
dkrizan wants to merge 3 commits intomainfrom
dkrizan/fix-org-stats-count-queries

Conversation

@dkrizan
Copy link
Contributor

@dkrizan dkrizan commented Mar 13, 2026

Summary

  • Add partial index key_project_name_ns_not_deleted on key(project_id, name, namespace_id) WHERE deleted_at IS NULL to speed up org-wide COUNT(DISTINCT) queries in OrganizationStatsService
  • Fix bug where getKeyCount() and getTranslationCount() counted keys on orphan branches even when project.useBranching = false
  • Add tests for branching exclusion behavior

Closes #3527

Test plan

  • OrganizationStatsServiceTest — all 6 tests pass including new branching exclusion tests
  • CI passes
  • Verify on testing instance that PUT translations endpoint is faster for large orgs

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Corrected organization statistics to accurately reflect key and translation counts when branching is disabled for projects, preventing branch-related items from being included in totals.
  • Chores

    • Added database index to improve performance of statistics queries.

…keys (#3527)

Add partial index on key(project_id, name, namespace_id) WHERE deleted_at IS NULL
to speed up org-wide COUNT(DISTINCT) queries. Fix bug where keys on orphan branches
were counted even when project.useBranching=false.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dkrizan dkrizan marked this pull request as draft March 13, 2026 11:19
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 13, 2026

📝 Walkthrough

Walkthrough

This pull request fixes slow organization stats count queries and incorrect branch key counting when branching is disabled. Changes include adding a database index for the count queries, updating the OrganizationStatsService to filter branch keys based on the project's useBranching setting, and extending test data to validate the new branching-aware counting logic.

Changes

Cohort / File(s) Summary
Test Data and Assertions
backend/app/src/test/kotlin/io/tolgee/service/OrganizationStatsServiceTest.kt, backend/data/src/main/kotlin/io/tolgee/development/testDataBuilder/data/OrganizationStatsTestData.kt
Added test data for a project with branching disabled (noBranchingProject), updated test expectations to reflect correct counts when branching is disabled (10 keys, 9 translations), and added new test cases to validate branch keys are excluded when useBranching = false.
Count Query Logic
backend/data/src/main/kotlin/io/tolgee/service/organization/OrganizationStatsService.kt
Modified getTranslationCount() and getKeyCount() queries to apply branching filter (p.use_branching = true OR k.branch_id IS NULL), replacing nested subquery approach with direct table joins on project deletion and organization ownership.
Database Index
backend/data/src/main/resources/db/changelog/schema.xml
Added Liquibase migration to create a partial index key_project_name_ns_not_deleted on (project_id, name, namespace_id) where deleted_at IS NULL to optimize count query performance.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • bdshadow
  • JanCizmar

Poem

🐰 A hop through the branches, now counted just right,
Keys with no branching are filtered from sight,
An index was added to speed up the way,
Organization stats now calculate faster each day! 📊✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main changes: optimizing org stats count queries and fixing a branching bug for incorrect key counting.
Linked Issues check ✅ Passed All objectives from issue #3527 are met: partial index added on key table, branching condition applied to both getKeyCount and getTranslationCount queries, and comprehensive tests added.
Out of Scope Changes check ✅ Passed All changes directly address issue #3527 requirements: database index optimization, branching bug fix, test data setup, and test coverage for the new behavior.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch dkrizan/fix-org-stats-count-queries
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
backend/app/src/test/kotlin/io/tolgee/service/OrganizationStatsServiceTest.kt (1)

84-105: These two new tests are useful but currently duplicate existing org-total assertions.

Consider consolidating or parameterizing to keep intent explicit while reducing maintenance repetition.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/app/src/test/kotlin/io/tolgee/service/OrganizationStatsServiceTest.kt`
around lines 84 - 105, The two new tests (`getKeyCount excludes branch keys when
project has branching disabled` and `getTranslationCount excludes branch
translations when project has branching disabled`) duplicate existing
organization-wide assertions; instead, change them to assert only the specific
project-level behavior (call organizationStatsService.getKeyCount and
getTranslationCount for the specific no-branching project id or use a helper to
fetch per-project counts) or convert them into a parameterized test that takes
project id and expected counts, and remove the duplicated org-total assertions
using testData.organization.id so the intent (exclude branch items for a
no-branching project) remains explicit without repeating org-wide sums.
backend/data/src/main/resources/db/changelog/schema.xml (1)

5178-5188: Consider validating overlap with the existing broad index before keeping both.

The new partial index is great for active-key counts, but the older non-partial index on the same key order still exists (Line 5081-Line 5090). Keeping both can increase write overhead; worth confirming with EXPLAIN/index usage stats whether the older one is still needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/data/src/main/resources/db/changelog/schema.xml` around lines 5178 -
5188, You added a partial index key_project_name_ns_not_deleted in changeSet id
1741855200000-1 but an older non-partial index on the same column order
(project_id, name, namespace_id) still exists; validate whether both are needed
by running EXPLAIN for representative queries and checking index usage
(pg_stat_user_indexes / pg_stat_all_indexes) and index definitions, then either
remove the old broad index (via a new rollback/drop changeSet) or adjust this
changeSet to drop the older index conditionally to avoid write overhead;
reference the new index name key_project_name_ns_not_deleted and the existing
non-partial index on public.key (project_id, name, namespace_id) when making the
change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In
`@backend/app/src/test/kotlin/io/tolgee/service/OrganizationStatsServiceTest.kt`:
- Around line 84-105: The two new tests (`getKeyCount excludes branch keys when
project has branching disabled` and `getTranslationCount excludes branch
translations when project has branching disabled`) duplicate existing
organization-wide assertions; instead, change them to assert only the specific
project-level behavior (call organizationStatsService.getKeyCount and
getTranslationCount for the specific no-branching project id or use a helper to
fetch per-project counts) or convert them into a parameterized test that takes
project id and expected counts, and remove the duplicated org-total assertions
using testData.organization.id so the intent (exclude branch items for a
no-branching project) remains explicit without repeating org-wide sums.

In `@backend/data/src/main/resources/db/changelog/schema.xml`:
- Around line 5178-5188: You added a partial index
key_project_name_ns_not_deleted in changeSet id 1741855200000-1 but an older
non-partial index on the same column order (project_id, name, namespace_id)
still exists; validate whether both are needed by running EXPLAIN for
representative queries and checking index usage (pg_stat_user_indexes /
pg_stat_all_indexes) and index definitions, then either remove the old broad
index (via a new rollback/drop changeSet) or adjust this changeSet to drop the
older index conditionally to avoid write overhead; reference the new index name
key_project_name_ns_not_deleted and the existing non-partial index on public.key
(project_id, name, namespace_id) when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5496a483-6969-46da-990b-d24750a9b327

📥 Commits

Reviewing files that changed from the base of the PR and between 50ca4d4 and 6228b59.

📒 Files selected for processing (4)
  • backend/app/src/test/kotlin/io/tolgee/service/OrganizationStatsServiceTest.kt
  • backend/data/src/main/kotlin/io/tolgee/development/testDataBuilder/data/OrganizationStatsTestData.kt
  • backend/data/src/main/kotlin/io/tolgee/service/organization/OrganizationStatsService.kt
  • backend/data/src/main/resources/db/changelog/schema.xml

dkrizan and others added 2 commits March 13, 2026 13:00
…iltering

Replace EXISTS subqueries with LEFT JOIN on branch table. This excludes
keys on deleted branches and correctly counts default-branch keys when
project.useBranching=false.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ount

PostgreSQL kept driving the translation count query from the language
table, causing 11M+ nested loop iterations. Use materialized CTEs to
force the join order: org keys first, then translations, then language
filter. Also fix branching filter to count default-branch keys when
useBranching=false, and exclude keys on deleted branches.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dkrizan dkrizan marked this pull request as ready for review March 13, 2026 16:14
@dkrizan dkrizan requested a review from bdshadow March 13, 2026 18:25
<changeSet author="dkrizan" id="1741855200000-1" runInTransaction="false">
<sql>
create index concurrently if not exists key_project_name_ns_not_deleted
on public.key (project_id, name, namespace_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like the same index, but not partial, already exists. It was added in the changeset 1771846650000-1. So it's not clear why the new one is required here.

Maybe, for the usage query it must be for 2 fields: project_id and deleted_at? Not sure, just asking

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It improved the performance by 25%, I think due to where deleted_at is null .. but I will dive deeper and find a better way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, got it. It's interesting why the existing index isn't taken then... Maybe because of more fields in it 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reason Postgres does not choose it is the condition WHERE deleted_at IS NOT NULL, in this case, the opposite is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is the best solution + the refactor of SQLs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Organization stats count queries are slow and incorrectly count branch keys

2 participants