fix(core): include gemini-2.5-flash-lite in default fallback chain #26914
Eswar809 wants to merge 1 commit
When the default Pro and Flash quotas are exhausted, the CLI now falls back to gemini-2.5-flash-lite (1000 RPD on the free tier) instead of erroring out. The legacy chain in `policyCatalog.ts` and the dynamic chains (`default`, `auto-default`) in `defaultModelConfigs.ts` are kept in parity. Flash retains `maxAttempts: 10`; the last-resort marker moves to Flash-Lite. Fixes google-gemini#26841
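To make the described behaviour concrete, here is a minimal sketch of how a three-entry chain could be walked once Pro and Flash are exhausted. The `ModelPolicy` interface, the `pickNextModel` helper, and the `gemini-2.5-pro` identifier are assumptions made for illustration; the actual `definePolicy` options and selection logic in the repository may differ.

```typescript
// Hypothetical policy shape; the real definePolicy() options may differ.
interface ModelPolicy {
  model: string;
  maxAttempts?: number;
  isLastResort?: boolean;
}

// Illustrative chain mirroring the PR description: Flash keeps
// maxAttempts: 10 and Flash-Lite carries the last-resort marker.
// 'gemini-2.5-pro' is an assumed identifier for the default Pro model.
const sketchChain: ModelPolicy[] = [
  { model: 'gemini-2.5-pro' },
  { model: 'gemini-2.5-flash', maxAttempts: 10 },
  { model: 'gemini-2.5-flash-lite', isLastResort: true },
];

// Return the first policy whose model is not exhausted, or undefined
// when every model in the chain is exhausted.
function pickNextModel(
  chain: ModelPolicy[],
  exhausted: Set<string>,
): ModelPolicy | undefined {
  return chain.find((policy) => !exhausted.has(policy.model));
}

const next = pickNextModel(
  sketchChain,
  new Set(['gemini-2.5-pro', 'gemini-2.5-flash']),
);
console.log(next?.model); // gemini-2.5-flash-lite
```

With a two-entry chain the same call would return `undefined`, which corresponds to the "erroring out" case this PR removes.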
Summary of Changes
This pull request improves the reliability of the model selection process by extending the default fallback chain to include 'gemini-2.5-flash-lite'. This change prevents errors for free-tier users when primary quotas are reached, allowing the system to automatically attempt the lite model instead of requiring manual intervention.
Code Review
This pull request integrates DEFAULT_GEMINI_FLASH_LITE_MODEL into the fallback chain as the new last-resort model, updating policyCatalog.ts, defaultModelConfigs.ts, and associated tests. The reviewer recommends adding maxAttempts: 10 to the new model's policy definitions to prevent infinite loops and maintaining consistency by using model constants in the configuration files.
definePolicy({
  model: DEFAULT_GEMINI_FLASH_LITE_MODEL,
  isLastResort: true,
}),
The new last-resort model DEFAULT_GEMINI_FLASH_LITE_MODEL should include maxAttempts: 10 to maintain the same level of retry resilience previously provided by DEFAULT_GEMINI_FLASH_MODEL. This ensures that transient failures at the end of the fallback chain are handled with a specific limit, preventing potential infinite loops while maintaining safety.
definePolicy({
  model: DEFAULT_GEMINI_FLASH_LITE_MODEL,
  isLastResort: true,
  maxAttempts: 10,
}),
References
- A recursive error/reconnect handler is acceptable as long as it includes a mechanism to limit the maximum number of retry attempts to prevent infinite loops.
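The retry bound the reviewer asks for can be sketched as a plain loop. This is an illustrative helper only: `runWithAttemptLimit` is a hypothetical name, it is synchronous for brevity (real model calls would be asynchronous), and it is not the CLI's actual retry handler.

```typescript
// Hypothetical helper: call `attempt` until it succeeds or the
// attempt budget is used up.
function runWithAttemptLimit<T>(attempt: () => T, maxAttempts: number): T {
  let lastError: unknown;
  for (let n = 1; n <= maxAttempts; n++) {
    try {
      return attempt();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  // The loop is bounded, so a permanently failing model cannot spin forever.
  throw new Error(
    `gave up after ${maxAttempts} attempts: ${String(lastError)}`,
  );
}

// Usage: a call that always fails is tried exactly maxAttempts times.
let calls = 0;
try {
  runWithAttemptLimit(() => {
    calls++;
    throw new Error('quota exhausted');
  }, 10);
} catch (_err) {
  console.log(calls); // 10
}
```

Without a limit like this on the last-resort entry, how often the final model is retried depends entirely on whatever default the caller applies.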
{
  model: 'gemini-2.5-flash-lite',
  isLastResort: true,
Add maxAttempts: 10 to the DEFAULT_GEMINI_FLASH_LITE_MODEL policy in the default model chain. The model identifier has been updated to use the constant to ensure it matches the hardcoded values in PolicyCatalog.ts, and the retry limit is added to prevent infinite loops in the fallback logic.
{
  model: DEFAULT_GEMINI_FLASH_LITE_MODEL,
  isLastResort: true,
  maxAttempts: 10,
References
- Values for modelChains actions in defaultModelConfigs.ts must match the hardcoded values in PolicyCatalog.ts.
- A recursive error/reconnect handler is acceptable as long as it includes a mechanism to limit the maximum number of retry attempts to prevent infinite loops.
{
  model: 'gemini-2.5-flash-lite',
  isLastResort: true,
Add maxAttempts: 10 to the DEFAULT_GEMINI_FLASH_LITE_MODEL policy in the auto-default model chain. This ensures consistent retry behavior across fallback paths and maintains identifier parity with PolicyCatalog.ts.
{
  model: DEFAULT_GEMINI_FLASH_LITE_MODEL,
  isLastResort: true,
  maxAttempts: 10,
References
- Values for modelChains actions in defaultModelConfigs.ts must match the hardcoded values in PolicyCatalog.ts.
- A recursive error/reconnect handler is acceptable as long as it includes a mechanism to limit the maximum number of retry attempts to prevent infinite loops.
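The parity rule cited in these references could be checked mechanically. The sketch below compares two chains entry by entry; `ChainEntry`, `findParityMismatches`, and the sample chains are hypothetical and only illustrate how a string literal or a missing `maxAttempts` in one file would show up as drift from the other.

```typescript
// Hypothetical entry shape shared by both chain definitions.
interface ChainEntry {
  model: string;
  maxAttempts?: number;
  isLastResort?: boolean;
}

// Report every field that differs between the legacy chain and a
// dynamic chain, position by position.
function findParityMismatches(
  legacy: ChainEntry[],
  dynamic: ChainEntry[],
): string[] {
  const mismatches: string[] = [];
  if (legacy.length !== dynamic.length) {
    mismatches.push(`length: ${legacy.length} vs ${dynamic.length}`);
  }
  legacy.forEach((a, i) => {
    const b = dynamic[i];
    if (b === undefined) return; // already reported as a length mismatch
    if (a.model !== b.model) {
      mismatches.push(`[${i}].model: ${a.model} vs ${b.model}`);
    }
    if (a.maxAttempts !== b.maxAttempts) {
      mismatches.push(`[${i}].maxAttempts: ${a.maxAttempts} vs ${b.maxAttempts}`);
    }
    if (Boolean(a.isLastResort) !== Boolean(b.isLastResort)) {
      mismatches.push(`[${i}].isLastResort differs`);
    }
  });
  return mismatches;
}

// Sample data (assumed): the dynamic chain lacks maxAttempts on Flash-Lite.
const legacySketch: ChainEntry[] = [
  { model: 'gemini-2.5-flash', maxAttempts: 10 },
  { model: 'gemini-2.5-flash-lite', isLastResort: true, maxAttempts: 10 },
];
const dynamicSketch: ChainEntry[] = [
  { model: 'gemini-2.5-flash', maxAttempts: 10 },
  { model: 'gemini-2.5-flash-lite', isLastResort: true },
];
console.log(findParityMismatches(legacySketch, dynamicSketch));
```

Here the helper reports a single mismatch, for `maxAttempts` at index 1. Importing a shared constant in both files removes the model-name class of drift entirely, which is the point of the reviewer's suggestion.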
Summary
Fixes #26841. When the default Pro and Flash quotas are exhausted, the CLI now falls back to `gemini-2.5-flash-lite` (1000 RPD on the free tier) instead of erroring out. Previously, free-tier users had to manually re-run with `--model gemini-2.5-flash-lite` despite quota remaining.

The fix is applied symmetrically across both code paths:
- `packages/core/src/availability/policyCatalog.ts`
- `packages/core/src/config/defaultModelConfigs.ts` (`modelChains.default` + `modelChains.auto-default`)

`gemini-2.5-flash` keeps `maxAttempts: 10`; the `isLastResort` marker moves to `gemini-2.5-flash-lite` (which the existing `FLASH_LITE_CHAIN` already exercises in production for users who explicitly opt in).
Test plan
- `packages/core/src/availability/policyCatalog.test.ts`: chain length 2→3 + new assertions
- `packages/core/src/availability/policyHelpers.test.ts`: 6 chain-shape assertions updated + wrap-around order
- `packages/core/src/fallback/handler.test.ts`: 4 tests updated for new last-resort being Flash-Lite
- `src/availability` + `src/fallback` + `src/utils/flashFallback.test.ts` + `src/config/flashFallback.test.ts` + `src/services/modelConfig.golden.test.ts` green
- `npm run typecheck` clean across full workspace
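
The chain-shape assertions mentioned in the test plan can be pictured roughly as follows. This is a standalone sketch, not the repository's test code: `ChainPolicySketch`, `checkChainShape`, and the `gemini-2.5-pro` identifier are assumptions, and the real tests presumably use the project's own test runner and fixtures.

```typescript
// Hypothetical policy shape used only for this sketch.
interface ChainPolicySketch {
  model: string;
  maxAttempts?: number;
  isLastResort?: boolean;
}

// Check the two invariants the test plan describes: the expected chain
// length, and a single last-resort policy sitting at the end.
function checkChainShape(
  chain: ChainPolicySketch[],
  expectedLength: number,
): string[] {
  const problems: string[] = [];
  if (chain.length !== expectedLength) {
    problems.push(`expected ${expectedLength} policies, found ${chain.length}`);
  }
  const lastResortIndexes = chain
    .map((policy, index) => (policy.isLastResort ? index : -1))
    .filter((index) => index !== -1);
  if (lastResortIndexes.length !== 1) {
    problems.push(
      `expected exactly one last-resort policy, found ${lastResortIndexes.length}`,
    );
  } else if (lastResortIndexes[0] !== chain.length - 1) {
    problems.push('last-resort policy is not the final entry');
  }
  return problems;
}

// Updated chain as described in the Summary (identifiers assumed).
const updatedChain: ChainPolicySketch[] = [
  { model: 'gemini-2.5-pro' },
  { model: 'gemini-2.5-flash', maxAttempts: 10 },
  { model: 'gemini-2.5-flash-lite', isLastResort: true },
];
console.log(checkChainShape(updatedChain, 3)); // []
```

Running the same check against a two-entry chain with an expected length of 3 yields one problem, which mirrors the "chain length 2→3" change in the updated tests.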