fix(core): include gemini-2.5-flash-lite in default fallback chain #26914

Open
Eswar809 wants to merge 1 commit into google-gemini:main from Eswar809:fix/flash-lite-fallback-chain

Conversation

@Eswar809
Contributor

Summary

Fixes #26841 — when the default Pro and Flash quotas are exhausted, the
CLI now falls back to gemini-2.5-flash-lite (1000 RPD on the free tier)
instead of erroring out. Previously free-tier users had to manually
re-run with --model gemini-2.5-flash-lite despite quota remaining.

The fix is applied symmetrically across both code paths:

  • Legacy chain: packages/core/src/availability/policyCatalog.ts
  • Dynamic chain: packages/core/src/config/defaultModelConfigs.ts
    (modelChains.default + modelChains.auto-default)

gemini-2.5-flash keeps maxAttempts: 10; the isLastResort marker
moves to gemini-2.5-flash-lite (which the existing FLASH_LITE_CHAIN
already exercises in production for users who explicitly opt in).
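
The chain shape described above can be sketched roughly as follows. This is an illustration only, not the code in policyCatalog.ts: the `SketchPolicy` type, the `nextInChain` helper, and the `'gemini-2.5-pro'` identifier are assumptions made for the example, and the real fallback logic (including any wrap-around behavior) may differ.

```typescript
// Illustrative shape only; the real policy type in packages/core may differ.
interface SketchPolicy {
  model: string;
  maxAttempts?: number;
  isLastResort?: boolean;
}

// Assumption: 'gemini-2.5-pro' stands in for whatever the default Pro model is.
const sketchDefaultChain: readonly SketchPolicy[] = [
  { model: 'gemini-2.5-pro' },
  { model: 'gemini-2.5-flash', maxAttempts: 10 }, // Flash keeps its retry budget
  { model: 'gemini-2.5-flash-lite', isLastResort: true }, // marker moved here
];

// Hypothetical helper: returns the next policy to try after `failedModel`,
// or undefined when the failed model was already the last resort.
function nextInChain(
  chain: readonly SketchPolicy[],
  failedModel: string,
): SketchPolicy | undefined {
  const index = chain.findIndex((policy) => policy.model === failedModel);
  if (index === -1) return chain[0]; // unknown model: start from the top
  if (chain[index]?.isLastResort) return undefined; // nothing left to try
  return chain[index + 1];
}
```

Under these assumptions, a Flash quota failure resolves to Flash-Lite, and a Flash-Lite failure ends the chain.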

Test plan

  • packages/core/src/availability/policyCatalog.test.ts — chain length 2→3 + new assertions
  • packages/core/src/availability/policyHelpers.test.ts — 6 chain-shape assertions updated + wrap-around order
  • packages/core/src/fallback/handler.test.ts — 4 tests updated for new last-resort being Flash-Lite
  • Parity test between dynamic + legacy chains still passes
  • All 86 tests in src/availability + src/fallback + src/utils/flashFallback.test.ts + src/config/flashFallback.test.ts +
    src/services/modelConfig.golden.test.ts green
  • npm run typecheck clean across full workspace
  • Pre-commit hooks (prettier, eslint --fix) clean
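
For readers unfamiliar with the parity requirement mentioned in the test plan, a check of that kind could look something like the sketch below. The `ChainEntry` type and `findParityMismatches` function are hypothetical names invented for this example; the actual parity test in the repository is not shown in this thread and may compare different fields.

```typescript
// Hypothetical entry shape shared by both chains for the purpose of this sketch.
interface ChainEntry {
  model: string;
  maxAttempts?: number;
  isLastResort?: boolean;
}

// Compares two chains position by position and describes every difference.
// An empty result means the chains are in parity under this sketch's rules.
function findParityMismatches(
  legacy: readonly ChainEntry[],
  dynamic: readonly ChainEntry[],
): string[] {
  const mismatches: string[] = [];
  if (legacy.length !== dynamic.length) {
    mismatches.push(`length: ${legacy.length} vs ${dynamic.length}`);
  }
  const shared = Math.min(legacy.length, dynamic.length);
  for (let i = 0; i < shared; i++) {
    const a = legacy[i];
    const b = dynamic[i];
    if (!a || !b) continue;
    if (a.model !== b.model) {
      mismatches.push(`[${i}] model: ${a.model} vs ${b.model}`);
    }
    if (a.maxAttempts !== b.maxAttempts) {
      mismatches.push(`[${i}] maxAttempts: ${a.maxAttempts} vs ${b.maxAttempts}`);
    }
    // Treat a missing flag the same as an explicit false.
    if (Boolean(a.isLastResort) !== Boolean(b.isLastResort)) {
      mismatches.push(`[${i}] isLastResort differs`);
    }
  }
  return mismatches;
}
```

A test built on this idea would assert that the returned list is empty for the legacy and dynamic default chains.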

When the default Pro and Flash quotas are exhausted, the CLI now falls
back to gemini-2.5-flash-lite (1000 RPD on the free tier) instead of
erroring out. The legacy chain in `policyCatalog.ts` and the dynamic
chains (`default`, `auto-default`) in `defaultModelConfigs.ts` are kept
in parity. Flash retains `maxAttempts: 10`; the last-resort marker
moves to Flash-Lite.

Fixes google-gemini#26841
@Eswar809 Eswar809 requested a review from a team as a code owner May 12, 2026 13:36
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request improves the reliability of the model selection process by extending the default fallback chain to include 'gemini-2.5-flash-lite'. This change prevents errors for free-tier users when primary quotas are reached, allowing the system to automatically attempt the lite model instead of requiring manual intervention.

Highlights

  • Fallback Chain Extension: Updated the default model fallback chain to include 'gemini-2.5-flash-lite' as the final resort, ensuring free-tier users have a fallback option when other quotas are exhausted.
  • Symmetric Implementation: Applied the fallback chain changes consistently across both the legacy policy catalog and the dynamic model configuration paths.
  • Test Suite Updates: Updated multiple test files, including policy catalog, policy helpers, and fallback handlers, to reflect the new chain length and last-resort model.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request integrates DEFAULT_GEMINI_FLASH_LITE_MODEL into the fallback chain as the new last-resort model, updating policyCatalog.ts, defaultModelConfigs.ts, and associated tests. The reviewer recommends adding maxAttempts: 10 to the new model's policy definitions to prevent infinite loops and maintaining consistency by using model constants in the configuration files.

Comment on lines +127 to +130
    definePolicy({
      model: DEFAULT_GEMINI_FLASH_LITE_MODEL,
      isLastResort: true,
    }),

high

The new last-resort model DEFAULT_GEMINI_FLASH_LITE_MODEL should include maxAttempts: 10 to maintain the same level of retry resilience previously provided by DEFAULT_GEMINI_FLASH_MODEL. This ensures that transient failures at the end of the fallback chain are handled with a specific limit, preventing potential infinite loops while maintaining safety.

    definePolicy({
      model: DEFAULT_GEMINI_FLASH_LITE_MODEL,
      isLastResort: true,
      maxAttempts: 10,
    }),
References
  1. A recursive error/reconnect handler is acceptable as long as it includes a mechanism to limit the maximum number of retry attempts to prevent infinite loops.
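
The retry-limit rule referenced above can be illustrated with a small sketch. It is synchronous for brevity and uses invented names (`AttemptResult`, `tryWithLimit`); it does not reproduce the CLI's fallback handler, and whether that handler would actually loop without `maxAttempts` is not established in this thread.

```typescript
// Outcome of a bounded retry: whether it succeeded, the value if any,
// and how many attempts were consumed.
interface AttemptResult<T> {
  ok: boolean;
  value?: T;
  attempts: number;
}

// Hypothetical helper: calls `operation` until it succeeds or the attempt
// budget is spent, so a persistently failing call cannot loop forever.
function tryWithLimit<T>(operation: () => T, maxAttempts: number): AttemptResult<T> {
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts++;
    try {
      return { ok: true, value: operation(), attempts };
    } catch {
      // Treated as a transient failure; the loop condition decides whether to retry.
    }
  }
  return { ok: false, attempts };
}
```

With `maxAttempts: 10`, an operation that never succeeds is abandoned after the tenth attempt instead of being retried indefinitely.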

Comment on lines +663 to +665
      {
        model: 'gemini-2.5-flash-lite',
        isLastResort: true,

high

Add maxAttempts: 10 to the DEFAULT_GEMINI_FLASH_LITE_MODEL policy in the default model chain. The suggestion replaces the string literal 'gemini-2.5-flash-lite' with the constant so that the entry matches the hardcoded values in policyCatalog.ts, and it adds the retry limit to prevent infinite loops in the fallback logic.

      {
        model: DEFAULT_GEMINI_FLASH_LITE_MODEL,
        isLastResort: true,
        maxAttempts: 10,
References
  1. Values for modelChains actions in defaultModelConfigs.ts must match the hardcoded values in PolicyCatalog.ts.
  2. A recursive error/reconnect handler is acceptable as long as it includes a mechanism to limit the maximum number of retry attempts to prevent infinite loops.

Comment on lines +713 to +715
      {
        model: 'gemini-2.5-flash-lite',
        isLastResort: true,

high

Add maxAttempts: 10 to the DEFAULT_GEMINI_FLASH_LITE_MODEL policy in the auto-default model chain. This ensures consistent retry behavior across fallback paths and maintains identifier parity with policyCatalog.ts.

      {
        model: DEFAULT_GEMINI_FLASH_LITE_MODEL,
        isLastResort: true,
        maxAttempts: 10,
References
  1. Values for modelChains actions in defaultModelConfigs.ts must match the hardcoded values in PolicyCatalog.ts.
  2. A recursive error/reconnect handler is acceptable as long as it includes a mechanism to limit the maximum number of retry attempts to prevent infinite loops.

@gemini-cli gemini-cli Bot added the priority/p2, area/core, area/platform, area/agent, priority/p1, and area/security labels May 12, 2026
@gemini-cli gemini-cli Bot added the priority/p3 and area/enterprise labels May 13, 2026

Labels

  • area/agent: Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality
  • area/core: Issues related to User Interface, OS Support, Core Functionality
  • area/enterprise: Issues related to Telemetry, Policy, Quota / Licensing
  • area/platform: Issues related to Build infra, Release mgmt, Testing, Eval infra, Capacity, Quota mgmt
  • area/security: Issues related to security
  • priority/p1: Important and should be addressed in the near term.
  • priority/p2: Important but can be addressed in a future release.
  • priority/p3: Backlog - a good idea but not currently a priority.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Default fallback policy chain missing gemini-2.5-flash-lite — users lose 1,000 RPD of available capacity

1 participant