-
-
Notifications
You must be signed in to change notification settings - Fork 326
fix: Handle rate-limiting exceptions consistently in LLM provider service #3370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThe pull request restructures LlmProviderService with new public methods for provider selection, rate-limit handling, and fake response generation. Concurrently, LlmRateLimitedException's constructor is updated to make the retryAt parameter non-nullable, removing optional handling for retry timing information. Changes
Sequence Diagram(s)sequenceDiagram
actor Client
participant LlmProviderService
participant ProviderSelector as Provider<br/>Selection
participant SuspensionCheck as Suspension<br/>Check
participant RetryHandler as Retry<br/>Handler
participant ApiService as AbstractLlmApiService
participant RestTemplate
Client->>LlmProviderService: getProviderByName(org, name, priority)
LlmProviderService->>ProviderSelector: Compute candidates (custom + server providers)
ProviderSelector-->>LlmProviderService: Candidate list
LlmProviderService->>SuspensionCheck: Check suspension status
SuspensionCheck-->>LlmProviderService: Provider (non-suspended)
Client->>LlmProviderService: getProviderResponse(service, params, config, template)
LlmProviderService->>RetryHandler: repeatWhileProvidersRateLimited(...)
loop Until not rate-limited or max attempts
RetryHandler->>ApiService: Call provider API
ApiService->>RestTemplate: Execute request
RestTemplate-->>ApiService: Response or rate-limit error
ApiService-->>RetryHandler: PromptResult or LlmRateLimitedException
alt Rate-limited
RetryHandler->>SuspensionCheck: suspendProvider(name, providerId)
RetryHandler->>ProviderSelector: Select next candidate
else Success
RetryHandler-->>LlmProviderService: PromptResult
end
end
LlmProviderService-->>Client: PromptResult
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt:
- Around line 102-117: The code can proceed with an empty providers list due to
a TOCTOU race and cause division-by-zero/IndexOutOfBounds; move the cache read
and the empty-check into the synchronized(this) block (or re-check
providers.isEmpty() inside synchronized) so the provider list and providerInfo
are read atomically, and if providers is still empty throw a proper exception
(e.g., LlmRateLimitedException or a new descriptive exception) instead of
performing modulus/index access; ensure lastUsed/lastUsedMap logic (lastUsed,
lastUsedIndex, newIndex, providers.get(...), lastUsedMap.set(...)) only runs
when providers.isNotEmpty().
🧹 Nitpick comments (2)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (2)
57-57: ConcurrentHashMap usage with synchronized access.While ConcurrentHashMap provides thread-safe operations, all accesses to
lastUsedMap(lines 111-116) are wrapped insynchronized(this)blocks, making the concurrent collection redundant. A regularmutableMapOf()would suffice given the synchronization, though this doesn't cause issues.
229-233: Good synchronization for cache modification.The synchronized block ensures atomicity of the read-modify-write operation on the cache, preventing concurrent modifications from being lost. However, synchronizing on
thiscreates a broad lock scope that could lead to contention when multiple threads operate on different provider names simultaneously.Consider using a more fine-grained locking strategy (e.g., per-provider-name locks) if performance becomes an issue:
private val providerLocks = ConcurrentHashMap<String, Any>() fun suspendProvider(name: String, providerId: Long, period: Long) { val lock = providerLocks.computeIfAbsent(name) { Any() } synchronized(lock) { val providerInfo = cache.get(name, ProviderInfo::class.java) ?: ProviderInfo() providerInfo.suspendMap.set(providerId, currentDateProvider.date.time + period) cache.set(name, providerInfo) } }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/data/src/main/kotlin/io/tolgee/batch/MtProviderCatching.ktee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.kt
📄 CodeRabbit inference engine (AGENTS.md)
**/*.kt: After modifying JPA entities, run./gradlew diffChangeLogto generate Liquibase changelog entries (add--no-daemonflag if docker command is not found)
Run specific test suites instead of the bare test task; use:data:test,:server-app:runStandardTests,:server-app:runContextRecreatingTests,:server-app:runWebsocketTests,:server-app:runWithoutEeTests,:ee-test:test, or:security:testwith the--testsflag for individual tests
Always run./gradlew ktlintFormatbefore commits in backend code
Files:
backend/data/src/main/kotlin/io/tolgee/batch/MtProviderCatching.ktee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
**/*.{ts,tsx,js,jsx,kt,kts}
⚙️ CodeRabbit configuration file
As part of review, please check if the file follows 'The Stepdown Rule': The most important, high-level concepts should be at the top (like a newspaper headline and opening paragraph). Details should increase as you read downward. Functions should be ordered so that a caller appears before the functions it calls. When working with JavaScript components, we allow the main component to live at the bottom of the file as an exception to the rule.
Files:
backend/data/src/main/kotlin/io/tolgee/batch/MtProviderCatching.ktee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build frontend 🏗️
- GitHub Check: Frontend static check 🪲
- GitHub Check: Build backend 🏗️
🔇 Additional comments (3)
backend/data/src/main/kotlin/io/tolgee/batch/MtProviderCatching.kt (1)
47-47: LGTM! Reasonable default retry delay.The increase from 100ms to 5000ms (5 seconds) provides a more appropriate default delay for rate-limited scenarios, preventing rapid retry storms and aligning better with typical rate-limiting windows.
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (2)
91-100: Good addition of synchronization for provider filtering.Wrapping the cache read and provider filtering in a synchronized block ensures atomicity and prevents race conditions where provider suspension state could change mid-operation.
146-150: No changes needed. TheLlmRateLimitedExceptionconstructor correctly supports theretryAt,params, andcauseparameters. The code is using the exception as designed, andretryAtis an existing parameter in the class definition (defined with a default value ofnull), not a new addition.
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
Outdated
Show resolved
Hide resolved
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
Outdated
Show resolved
Hide resolved
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt:
- Around line 90-117: The synchronized block around provider filtering is
insufficient because lastUsedMap.compute(...) runs outside it, causing a race
where a provider can be selected after being suspended; move the round-robin
update and selection (the logic using lastUsedMap.compute and the final return
of providers.first { it.id == providerId }) into the same synchronized section
that computes `providers` so filtering and selection are atomic; also stop
synchronizing on `name.intern()` and instead use a dedicated lock-per-name
(e.g., a ConcurrentHashMap<String, Any> called `locks` and synchronize on
`locks.getOrPut(name)`), ensuring `suspendProvider()` and other code paths use
the same lock map so suspensions and selections cannot race.
🧹 Nitpick comments (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (1)
124-186: Consider reordering methods to follow the Stepdown Rule.According to the coding guidelines, "a caller appears before the functions it calls." Since
callProvider(line 169) callsrepeatWhileProvidersRateLimited(line 124), consider movingcallProvideraboverepeatWhileProvidersRateLimitedfor better readability.As per coding guidelines for Kotlin files, the Stepdown Rule should be followed.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.kt
📄 CodeRabbit inference engine (AGENTS.md)
**/*.kt: After modifying JPA entities, run./gradlew diffChangeLogto generate Liquibase changelog entries (add--no-daemonflag if docker command is not found)
Run specific test suites instead of the bare test task; use:data:test,:server-app:runStandardTests,:server-app:runContextRecreatingTests,:server-app:runWebsocketTests,:server-app:runWithoutEeTests,:ee-test:test, or:security:testwith the--testsflag for individual tests
Always run./gradlew ktlintFormatbefore commits in backend code
Files:
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
**/*.{ts,tsx,js,jsx,kt,kts}
⚙️ CodeRabbit configuration file
As part of review, please check if the file follows 'The Stepdown Rule': The most important, high-level concepts should be at the top (like a newspaper headline and opening paragraph). Details should increase as you read downward. Functions should be ordered so that a caller appears before the functions it calls. When working with JavaScript components, we allow the main component to live at the bottom of the file as an exception to the rule.
Files:
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build backend 🏗️
- GitHub Check: Frontend static check 🪲
- GitHub Check: Build frontend 🏗️
🔇 Additional comments (3)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (3)
57-57: Good use of ConcurrentHashMap for thread safety.The change to
ConcurrentHashMapensures thread-safe access tolastUsedMap. However, note that since the round-robin selection logic (lines 111-117) is not inside the synchronized block, you might want to consider moving it inside to ensure atomicity between provider filtering and selection (see comment on lines 90-117).
146-150: Excellent improvement to include retry timing.Adding the
retryAttimestamp toLlmRateLimitedExceptionprovides much better feedback to callers about when they can retry. This is consistent with the exception thrown at line 103 when all providers are suspended.
229-233: Good synchronization of cache updates.Moving the cache update into a synchronized block ensures thread safety when suspending providers. This correctly prevents race conditions between suspension and provider selection. However, as noted in the earlier comment, consider replacing
String.intern()with dedicated lock objects.
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (2)
105-105: Minor simplification available.The suspend time retrieval can be simplified:
♻️ Suggested simplification
-val closestUnsuspend = providerInfo.suspendMap.map { (_, time) -> time }.min() +val closestUnsuspend = providerInfo.suspendMap.values.min()
317-321: Consider thread-safe map forsuspendMap.The
suspendMapinsideProviderInfouses a non-thread-safemutableMapOf(). While current access is protected by synchronized blocks, using aConcurrentHashMapwould provide defense-in-depth and make the code more robust against future modifications that might access it without synchronization.♻️ Suggested improvement
companion object { data class ProviderInfo( - var suspendMap: MutableMap<Long, Long> = mutableMapOf(), + var suspendMap: MutableMap<Long, Long> = ConcurrentHashMap(), ) }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.kt
📄 CodeRabbit inference engine (AGENTS.md)
**/*.kt: After modifying JPA entities, run./gradlew diffChangeLogto generate Liquibase changelog entries (add--no-daemonflag if docker command is not found)
Run specific test suites instead of the bare test task; use:data:test,:server-app:runStandardTests,:server-app:runContextRecreatingTests,:server-app:runWebsocketTests,:server-app:runWithoutEeTests,:ee-test:test, or:security:testwith the--testsflag for individual tests
Always run./gradlew ktlintFormatbefore commits in backend code
Files:
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
**/*.{ts,tsx,js,jsx,kt,kts}
⚙️ CodeRabbit configuration file
As part of review, please check if the file follows 'The Stepdown Rule': The most important, high-level concepts should be at the top (like a newspaper headline and opening paragraph). Details should increase as you read downward. Functions should be ordered so that a caller appears before the functions it calls. When working with JavaScript components, we allow the main component to live at the bottom of the file as an exception to the rule.
Files:
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
- GitHub Check: E2E testing 🔎 (15, 9)
- GitHub Check: E2E testing 🔎 (15, 10)
- GitHub Check: E2E testing 🔎 (15, 12)
- GitHub Check: E2E testing 🔎 (15, 8)
- GitHub Check: E2E testing 🔎 (15, 11)
- GitHub Check: E2E testing 🔎 (15, 14)
- GitHub Check: E2E testing 🔎 (15, 13)
- GitHub Check: E2E testing 🔎 (15, 4)
- GitHub Check: E2E testing 🔎 (15, 1)
- GitHub Check: E2E testing 🔎 (15, 7)
- GitHub Check: E2E testing 🔎 (15, 2)
- GitHub Check: BT 🔎 (server-app:runStandardTests)
- GitHub Check: E2E testing 🔎 (15, 3)
- GitHub Check: E2E testing 🔎 (15, 6)
- GitHub Check: BT 🔎 (ktlint:test)
- GitHub Check: E2E testing 🔎 (15, 5)
- GitHub Check: E2E testing 🔎 (15, 0)
- GitHub Check: BT 🔎 (data:test)
- GitHub Check: BT 🔎 (server-app:runWithoutEeTests)
- GitHub Check: BT 🔎 (ee-test:test)
- GitHub Check: BT 🔎 (server-app:runContextRecreatingTests)
- GitHub Check: BT 🔎 (security:test)
- GitHub Check: BT 🔎 (server-app:runWebsocketTests)
- GitHub Check: Frontend static check 🪲
🔇 Additional comments (3)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (3)
58-58: LGTM!Good use of
ConcurrentHashMapfor thread-safe storage of last-used provider tracking.
114-121: LGTM!Good use of
ConcurrentHashMap.compute()for atomic round-robin advancement. The approach correctly captures an immutable snapshot of providers before updating the index.
150-154: LGTM!The exception correctly includes a
retryAttimestamp consistent with the suspension period, and properly preserves the original error for debugging.
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (2)
168-230: Approve with optional refactoring suggestion.The provider selection logic is correct: custom providers override server providers by name, priority filtering works as expected, and the round-robin selection via
lastUsedMap.computeis atomic and safe. The race condition between the synchronized block exit and thecomputeoperation is acceptable becauseprovidersis a local snapshot.Optional: Replace
String.intern()synchronization.Using
name.intern()as a lock object (lines 200, 258) is functional but discouraged as it pollutes the string pool and can cause unexpected contention. Consider using a dedicated lock map instead:♻️ Suggested refactoring
Add a lock map at the class level:
private val lockMap: ConcurrentHashMap<String, Any> = ConcurrentHashMap()Then replace synchronization blocks:
// Instead of synchronized(name.intern()) synchronized(lockMap.computeIfAbsent(name) { Any() }) { // ... existing logic }This provides the same per-name locking without string pool pollution.
60-322: Consider reordering functions per coding guidelines.The file doesn't fully follow the Stepdown Rule specified in the coding guidelines. Functions should be ordered so callers appear before callees. For example:
getProviderByName(line 168) is called byrepeatWhileProvidersRateLimited(line 79)suspendProvider(line 253) is called byrepeatWhileProvidersRateLimited(line 79)getProviderService(line 243) is called bycallProvider(line 60)Reordering would improve readability by allowing readers to follow the call flow top-to-bottom.
As per coding guidelines, Kotlin files should follow the Stepdown Rule.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.kt
📄 CodeRabbit inference engine (AGENTS.md)
**/*.kt: After modifying JPA entities, run./gradlew diffChangeLogto generate Liquibase changelog entries (add--no-daemonflag if docker command is not found)
Run specific test suites instead of the bare test task; use:data:test,:server-app:runStandardTests,:server-app:runContextRecreatingTests,:server-app:runWebsocketTests,:server-app:runWithoutEeTests,:ee-test:test, or:security:testwith the--testsflag for individual tests
Always run./gradlew ktlintFormatbefore commits in backend code
Files:
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
**/*.{ts,tsx,js,jsx,kt,kts}
⚙️ CodeRabbit configuration file
As part of review, please check if the file follows 'The Stepdown Rule': The most important, high-level concepts should be at the top (like a newspaper headline and opening paragraph). Details should increase as you read downward. Functions should be ordered so that a caller appears before the functions it calls. When working with JavaScript components, we allow the main component to live at the bottom of the file as an exception to the rule.
Files:
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
🧬 Code graph analysis (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (1)
backend/data/src/main/kotlin/io/tolgee/repository/LlmProviderRepository.kt (2)
getAll(9-25)getAll(12-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Frontend static check 🪲
- GitHub Check: Build frontend 🏗️
- GitHub Check: Build backend 🏗️
🔇 Additional comments (8)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (8)
39-39: LGTM - Good choice for thread-safe map operations.The
ConcurrentHashMapimport is appropriate for the thread-safelastUsedMapimplementation and supports the atomiccomputeoperation used for round-robin provider selection.
58-58: LGTM - Thread-safe implementation.The switch to
ConcurrentHashMapensures thread-safe access to the round-robin tracking map, with atomiccomputeoperations preventing race conditions during provider selection.
60-77: LGTM - Clean orchestration of provider calls.The new
callProviderfunction effectively coordinates rate-limit handling, timeout retries, and response processing. The fallback logic for attempts (config → service default) and price calculation ensures consistent behavior.
79-106: LGTM - Effective rate-limit handling with improved UX.The retry logic properly suspends rate-limited providers and surfaces the
retryAttimestamp to clients, enabling informed retry behavior. The 60-second suspension period aligns with the exception's retry window.
108-122: LGTM - Solid timeout retry implementation.The function correctly handles progressive timeout attempts by creating a fresh
RestTemplatewith the appropriate timeout for each iteration, properly catching timeout-related exceptions.
124-166: LGTM - Well-structured helper functions.The extracted helpers improve readability and maintainability. Notably, the
calculatePricefunction correctly excludes cached tokens from input cost (line 160), which optimizes pricing for prompt caching scenarios.
236-241: LGTM - Clever use of negative IDs.The negative indexing scheme (
-(index + 1)) effectively distinguishes server-configured providers from custom providers without requiring separate type flags or additional fields.
253-263: LGTM - Consistent synchronization pattern.The suspension logic correctly uses the same synchronization pattern as
getProviderByName, ensuring thread-safe cache updates. The mutablesuspendMapis safe because all access occurs within the synchronized block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (1)
111-125: Guard edge cases: emptyattempts=> NPE;cachedTokens > inputTokens=> negative priceBoth can produce surprising behavior (crash or negative credits).
Proposed fix
private fun <T> repeatWithTimeouts( attempts: List<Int>, callback: (restTemplate: RestTemplate) -> T, ): T { + if (attempts.isEmpty()) { + throw InvalidStateException() // (or BadRequestException) - attempts must contain at least one timeout + } var lastError: Exception? = null for (timeout in attempts) { val restTemplate = restTemplateBuilder.readTimeout(Duration.ofSeconds(timeout.toLong())).build() try { return callback(restTemplate) } catch (e: ResourceAccessException) { lastError = e } } throw FailedDependencyException(Message.LLM_PROVIDER_ERROR, listOf(lastError!!.message), lastError) }private fun calculatePrice( providerConfig: LlmProviderDto, usage: PromptResult.Usage?, ): Int { val tokenPriceInCreditsInput: Double = (providerConfig.tokenPriceInCreditsInput ?: 0.0) val tokenPriceInCreditsOutput: Double = (providerConfig.tokenPriceInCreditsOutput ?: 0.0) val inputTokens: Long = usage?.inputTokens ?: 0L val outputTokens: Long = usage?.outputTokens ?: 0L val cachedTokens: Long = usage?.cachedTokens ?: 0L - val inputPrice = (inputTokens - cachedTokens) * tokenPriceInCreditsInput + val billableInputTokens = (inputTokens - cachedTokens).coerceAtLeast(0L) + val inputPrice = billableInputTokens * tokenPriceInCreditsInput val outputPrice = (outputTokens) * tokenPriceInCreditsOutput val price = inputPrice + outputPrice return (price * 100).roundToInt() }Also applies to: 153-169
🤖 Fix all issues with AI agents
In @ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt:
- Around line 171-237: ProviderInfo.suspendMap may include suspend times for
provider IDs that are not in the current candidate set (providersWithPriority),
so computing closestUnsuspend from the whole map can return an incorrect retry
time; fix by restricting the min calculation to suspendMap entries whose keys
are in the current providers list (e.g., get the candidateIds =
providersWithPriority.map { it.id } and compute closestUnsuspend =
providerInfo.suspendMap.filterKeys { it in candidateIds
}.values.min()/minOrNull()), and handle the null case (fall back to throwing
BadRequestException) before throwing LlmRateLimitedException(closestUnsuspend).
🧹 Nitpick comments (2)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (2)
58-61:lockMapcan grow unbounded (high-cardinality provider names)If
nameis user-controlled / high-cardinality,lockMapbecomes a memory leak. Consider a bounded/striped lock strategy (or include org in the lock key if that matches the intended isolation).
139-151: Fake response JSON is OK for tests, but consider returning structured fields
If anything downstream parsesPromptResult.responseas JSON, string interpolation withconfig.namecould create invalid JSON for unusual names.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.kt
📄 CodeRabbit inference engine (AGENTS.md)
**/*.kt: After modifying JPA entities, run./gradlew diffChangeLogto generate Liquibase changelog entries (add--no-daemonflag if docker command is not found)
Run specific test suites instead of the bare test task; use:data:test,:server-app:runStandardTests,:server-app:runContextRecreatingTests,:server-app:runWebsocketTests,:server-app:runWithoutEeTests,:ee-test:test, or:security:testwith the--testsflag for individual tests
Always run./gradlew ktlintFormatbefore commits in backend code
Files:
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
**/*.{ts,tsx,js,jsx,kt,kts}
⚙️ CodeRabbit configuration file
As part of review, please check if the file follows 'The Stepdown Rule': The most important, high-level concepts should be at the top (like a newspaper headline and opening paragraph). Details should increase as you read downward. Functions should be ordered so that a caller appears before the functions it calls. When working with JavaScript components, we allow the main component to live at the bottom of the file as an exception to the rule.
Files:
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
🧬 Code graph analysis (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (1)
backend/data/src/main/kotlin/io/tolgee/repository/LlmProviderRepository.kt (2)
getAll(9-25)getAll(12-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build backend 🏗️
- GitHub Check: Build frontend 🏗️
- GitHub Check: Frontend static check 🪲
🔇 Additional comments (4)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (4)
63-80:callProvider(...)orchestration looks clean (selection → retries → enrichment)Nice consolidation of selection + timeout retries + price enrichment in one place.
127-137: Fake-provider switch is nicely centralized
260-270: Good: suspension write is done under the same per-name lock as reads
82-109: Confirm whether hard-coded "60s suspend + 3 provider attempts" matches product expectationsThe review comment accurately identifies hardcoded values:
- 3 retry attempts (line 92:
repeat(3))- 60-second suspension period (line 97:
60 * 1000)- No delay/backoff between retry attempts
retryAtin thrown exception usescurrentDateProvider.date.time + 60 * 1000(line 105)However, when
getProviderByName()detects all providers are suspended, it throws with the actualclosestUnsuspendtime from the suspension map (line 221), not the hardcoded 60s. The hardcoded 60s applies only when all 3 attempts exhaust without finding an available provider.The
attemptsconfiguration inLlmPropertiesallows per-provider timeout specifications but does not control the "3 retry" count or the 60-second suspension period—both remain hardcoded.No test coverage or product documentation was found to determine if these values align with product expectations. This requires confirmation from the product team or design documentation.
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
Outdated
Show resolved
Hide resolved
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
Outdated
Show resolved
Hide resolved
…vice - Made `retryAt` in `LlmRateLimitedException` non-nullable with default suspension logic. - Centralized rate limit suspension period using a constant. - Improved exception handling and logging for rate-limited LLM providers.
ba5a609 to
a519cd0
Compare
- Reordered methods logically for better structure and consistency. - Resolved redundant definitions by consolidating duplicated method declarations. - Ensured private constants are properly maintained in the companion object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (1)
57-57: Critical: Race condition on shared mutable state.The
lastUsedMapis accessed without synchronization ingetProviderByName(lines 232, 236), which creates a data race when multiple threads call this method concurrently. This violates thread-safety guarantees and can lead to incorrect round-robin provider selection.🔒 Suggested fix using ConcurrentHashMap
- private var lastUsedMap: MutableMap<String, Long> = mutableMapOf() + private val lastUsedMap: ConcurrentHashMap<String, Long> = ConcurrentHashMap()Also add the import:
import java.util.concurrent.ConcurrentHashMapNote: While
ConcurrentHashMapmakes individual operations thread-safe, the compound read-modify-write operation ingetProviderByName(lines 232-236) may still need synchronization if strict round-robin ordering is required under concurrent access. Consider usingcomputeorcomputeIfPresentfor atomic updates.
🤖 Fix all issues with AI agents
In @ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt:
- Around line 173-187: The function repeatWithTimeouts can receive an empty
attempts list, leaving lastError null and causing an NPE when throwing
FailedDependencyException; add a guard at the start of repeatWithTimeouts to
validate attempts is not empty (or set a clear default) and if it is empty throw
a FailedDependencyException or IllegalArgumentException with a descriptive
message, or when throwing FailedDependencyException ensure you supply a non-null
cause/message by checking lastError and providing a fallback error message;
update references in repeatWithTimeouts to use this validation so lastError is
never dereferenced when attempts is empty.
🧹 Nitpick comments (5)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (5)
150-150: Extract magic number to a named constant.The hardcoded
3should be extracted to a named constant to improve maintainability and make the retry limit configurable or at least self-documenting.♻️ Proposed refactor
Add to the companion object:
companion object { private const val RATE_LIMIT_SUSPEND_PERIOD_MS = 60000L + private const val MAX_RATE_LIMIT_RETRIES = 3Then update line 150:
- repeat(3) { + repeat(MAX_RATE_LIMIT_RETRIES) {
232-237: Use idiomatic Kotlin indexing operator.Line 235 uses
.get(newIndex)which is less idiomatic in Kotlin. The[]operator is preferred for list/array access.♻️ Proposed refactor
- val provider = providers.get(newIndex) + val provider = providers[newIndex]
227-230: Consider adding more context to the exception.When all providers are suspended, the exception could include the provider name and list of suspended provider IDs to help with debugging.
♻️ Proposed enhancement
if (providers.isEmpty() && providerInfo?.suspendMap?.isNotEmpty() == true) { val closestUnsuspend = providerInfo.suspendMap.map { (_, time) -> time }.min() - throw LlmRateLimitedException(closestUnsuspend) + throw LlmRateLimitedException( + retryAt = closestUnsuspend, + params = listOf("All providers for '$name' are currently suspended") + ) }
197-211: Consider simplifying provider selection logic.The current logic fetches all providers, checks existence, then filters. This could be more direct by filtering immediately, which would be clearer and slightly more efficient.
♻️ Proposed refactor for clarity
- val providersOfTheName = - if (customProviders.find { it.name == name } != null) { - customProviders - } else { - serverProviders - }.filter { - it.name == name - } + val customProvidersOfName = customProviders.filter { it.name == name } + val providersOfTheName = customProvidersOfName.ifEmpty { + serverProviders.filter { it.name == name } + } - val providersWithPriority = - if (providersOfTheName.find { it.priority == priority } != null) { - providersOfTheName.filter { it.priority == priority } - } else { - providersOfTheName - } + val providersWithMatchingPriority = providersOfTheName.filter { it.priority == priority } + val providersWithPriority = providersWithMatchingPriority.ifEmpty { + providersOfTheName + }
59-310: Consider reorganizing methods to follow the Stepdown Rule.Per the coding guidelines, functions should be ordered so that a caller appears before the functions it calls, with high-level concepts at the top. Currently, helper methods like
repeatWhileProvidersRateLimited,repeatWithTimeouts, andgetProviderByNameappear after CRUD methods but are called by the high-levelcallProvidermethod.Consider this organization:
- High-level public API:
callProvider(already at top ✓)- CRUD operations:
createProvider,updateProvider,deleteProvider,getAll- Core provider operations called by
callProvider:repeatWhileProvidersRateLimited,repeatWithTimeouts,getProviderByName- Lower-level helpers:
getProviderResponse,getProviderService,suspendProvider,getFakedResponse,calculatePriceAs per coding guidelines.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/data/src/main/kotlin/io/tolgee/exceptions/LlmRateLimitedException.ktee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.kt
📄 CodeRabbit inference engine (AGENTS.md)
**/*.kt: After modifying JPA entities, run./gradlew diffChangeLogto generate Liquibase changelog entries (add--no-daemonflag if docker command is not found)
Run specific test suites instead of the bare test task; use:data:test,:server-app:runStandardTests,:server-app:runContextRecreatingTests,:server-app:runWebsocketTests,:server-app:runWithoutEeTests,:ee-test:test, or:security:testwith the--testsflag for individual tests
Always run./gradlew ktlintFormatbefore commits in backend code
Files:
backend/data/src/main/kotlin/io/tolgee/exceptions/LlmRateLimitedException.ktee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
**/*.{ts,tsx,js,jsx,kt,kts}
⚙️ CodeRabbit configuration file
As part of review, please check if the file follows 'The Stepdown Rule': The most important, high-level concepts should be at the top (like a newspaper headline and opening paragraph). Details should increase as you read downward. Functions should be ordered so that a caller appears before the functions it calls. When working with JavaScript components, we allow the main component to live at the bottom of the file as an exception to the rule.
Files:
backend/data/src/main/kotlin/io/tolgee/exceptions/LlmRateLimitedException.ktee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
🧬 Code graph analysis (1)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (1)
backend/data/src/main/kotlin/io/tolgee/repository/LlmProviderRepository.kt (2)
getAll(9-25)getAll(12-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Frontend static check 🪲
- GitHub Check: Build backend 🏗️
🔇 Additional comments (6)
ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt (5)
130-132: LGTM!The delegation to the repository and DTO mapping is straightforward and correct.
240-250: LGTM!The routing logic for fake vs. real provider responses is clear and appropriate for testing scenarios.
252-260: LGTM!The provider type mapping is exhaustive and correct.
280-287: Cache update has potential race condition (likely acceptable).The get-modify-set pattern on the cache (lines 284-286) is not atomic. If two threads suspend different providers with the same name simultaneously, one update could be lost. However, this is likely acceptable given that:
- Suspension timing doesn't need to be perfectly precise
- The impact is minimal (slightly delayed unsuspension)
- The scenario is relatively rare
If stricter consistency is needed, consider using cache locking or atomic operations.
289-301: LGTM!The fake response generation is appropriate for testing purposes, with realistic token usage values.
backend/data/src/main/kotlin/io/tolgee/exceptions/LlmRateLimitedException.kt (1)
6-14: All call sites have been properly updated for the breaking API change.The
retryAtparameter is now required and non-nullable. Verification of the codebase confirms that both instantiations ofLlmRateLimitedExceptioninLlmProviderService.kt(lines 166 and 229) provide the requiredretryAtvalue. No orphaned call sites exist.
| fun <T> repeatWithTimeouts( | ||
| attempts: List<Int>, | ||
| callback: (restTemplate: RestTemplate) -> T, | ||
| ): T { | ||
| var lastError: Exception? = null | ||
| for (timeout in attempts) { | ||
| val restTemplate = restTemplateBuilder.readTimeout(Duration.ofSeconds(timeout.toLong())).build() | ||
| try { | ||
| return callback(restTemplate) | ||
| } catch (e: ResourceAccessException) { | ||
| lastError = e | ||
| } | ||
| } | ||
| throw FailedDependencyException(Message.LLM_PROVIDER_ERROR, listOf(lastError!!.message), lastError) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle edge case of empty attempts list.
If attempts is empty, the loop doesn't execute and lastError remains null, causing a NullPointerException on line 186. While this may be an unlikely scenario, defensive programming would handle it gracefully.
🛡️ Proposed fix
fun <T> repeatWithTimeouts(
attempts: List<Int>,
callback: (restTemplate: RestTemplate) -> T,
): T {
+ if (attempts.isEmpty()) {
+ throw BadRequestException(Message.LLM_PROVIDER_ERROR, listOf("No timeout attempts configured"))
+ }
var lastError: Exception? = null
for (timeout in attempts) {
val restTemplate = restTemplateBuilder.readTimeout(Duration.ofSeconds(timeout.toLong())).build()
try {
return callback(restTemplate)
} catch (e: ResourceAccessException) {
lastError = e
}
}
- throw FailedDependencyException(Message.LLM_PROVIDER_ERROR, listOf(lastError!!.message), lastError)
+ throw FailedDependencyException(Message.LLM_PROVIDER_ERROR, listOf(lastError?.message), lastError)
}🤖 Prompt for AI Agents
In @ee/backend/app/src/main/kotlin/io/tolgee/ee/service/LlmProviderService.kt
around lines 173 - 187, The function repeatWithTimeouts can receive an empty
attempts list, leaving lastError null and causing an NPE when throwing
FailedDependencyException; add a guard at the start of repeatWithTimeouts to
validate attempts is not empty (or set a clear default) and if it is empty throw
a FailedDependencyException or IllegalArgumentException with a descriptive
message, or when throwing FailedDependencyException ensure you supply a non-null
cause/message by checking lastError and providing a fallback error message;
update references in repeatWithTimeouts to use this validation so lastError is
never dereferenced when attempts is empty.
Summary by CodeRabbit
Release Notes
New Features
Improvements
✏️ Tip: You can customize this high-level summary in your review settings.