Add caching to Google Sheets API calls to prevent rate limiting by pragya811 · Pull Request #1002 · redhat-performance/cloud-governance

pragya811 · 2026-06-10T09:02:22Z

Type of change

Note: Fill x in []

bug
enhancement
documentation
dependencies

Description

When updating multiple rows in Google Sheets, the find_sheet_id_by_name() method was being called repeatedly for the same spreadsheet and sheet, causing excessive API calls and hitting the Google Sheets API rate limit of 60 read requests per minute.

This commit adds an in-memory cache to store sheet IDs, significantly reducing the number of API calls. The cache is session-scoped and automatically invalidated when new sheets are created.

Also fixes a typo in error logging (logger.into -> logger.info).
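The caching idea described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `SheetIdCache` class and the `fake_fetch` function are hypothetical stand-ins, and the real lookup would call the Sheets API instead of returning a constant.

```python
from typing import Callable, Dict


class SheetIdCache:
    """Session-scoped cache of sheet IDs keyed by 'spreadsheet_id:sheet_name'."""

    def __init__(self, fetch_sheet_id: Callable[[str, str], int]) -> None:
        # fetch_sheet_id is a hypothetical callable that performs the real API read.
        self._fetch_sheet_id = fetch_sheet_id
        self._cache: Dict[str, int] = {}

    @staticmethod
    def _key(spreadsheet_id: str, sheet_name: str) -> str:
        return f"{spreadsheet_id}:{sheet_name}"

    def get(self, spreadsheet_id: str, sheet_name: str, use_cache: bool = True) -> int:
        key = self._key(spreadsheet_id, sheet_name)
        if use_cache and key in self._cache:
            return self._cache[key]  # cache hit: no API call
        sheet_id = self._fetch_sheet_id(spreadsheet_id, sheet_name)
        self._cache[key] = sheet_id
        return sheet_id

    def evict(self, spreadsheet_id: str, sheet_name: str) -> None:
        # Removing a missing key is a no-op.
        self._cache.pop(self._key(spreadsheet_id, sheet_name), None)


api_calls = []


def fake_fetch(spreadsheet_id: str, sheet_name: str) -> int:
    # Stand-in for the metadata read; records each simulated API call.
    api_calls.append((spreadsheet_id, sheet_name))
    return 42


cache = SheetIdCache(fake_fetch)
for _ in range(100):
    cache.get("spreadsheet-1", "Sheet1")
print(len(api_calls))  # 1: only the first lookup reaches the API
```

With the cache in place, updating 100 rows of the same sheet costs one metadata read instead of 100.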

For security reasons, all pull requests need to be approved first before running any automated CI

Assisted-by: Cursor

When updating multiple rows in Google Sheets, the find_sheet_id_by_name() method was being called repeatedly for the same spreadsheet and sheet, causing excessive API calls and hitting the Google Sheets API rate limit of 60 read requests per minute.

This commit adds an in-memory cache to store sheet IDs, significantly reducing the number of API calls. The cache is session-scoped and automatically invalidated when new sheets are created.

Also fixes a typo in error logging (logger.into -> logger.info).

Co-authored-by: Cursor <cursoragent@cursor.com>

coderabbitai · 2026-06-10T09:02:33Z

Walkthrough

Adds an in-memory cache for Google Sheets sheet IDs, keyed by spreadsheet_id:sheet_name, together with a cache-eviction helper. The cache entry is invalidated after a worksheet is created, and retry-with-eviction logic (plus a logging fix) is applied to several sheet-modifying methods.

Changes

Sheet ID Caching and Retry Behavior

| Layer / File(s) | Summary |
| --- | --- |
| Cache initialization and lookup `cloud_governance/common/google_drive/google_drive_operations.py` | Initializes an in-memory sheet ID cache in `__init__`; `find_sheet_id_by_name(..., use_cache: bool = True)` returns cached IDs when present, otherwise fetches sheet metadata and stores the `sheetId` in the cache. |
| Cache eviction helper `cloud_governance/common/google_drive/google_drive_operations.py` | Adds `_evict_sheet_cache(sheet_name, spreadsheet_id)` to remove a cached sheet ID entry and log the eviction. |
| Invalidate cache after worksheet creation `cloud_governance/common/google_drive/google_drive_operations.py` | After `batchUpdate` creates a worksheet, computes the `spreadsheet_id:sheet_name` cache key and deletes any existing cached sheet ID for that key. |
| delete_rows: retry with cache eviction `cloud_governance/common/google_drive/google_drive_operations.py` | Refactors `delete_rows` to attempt the operation twice: use cached `sheetId` on first attempt, evict cache and retry on specific `HttpError` messages; non-matching failures are logged and re-raised. |
| paste_csv_to_gsheet: retry with cache eviction `cloud_governance/common/google_drive/google_drive_operations.py` | Adds two-attempt loop that uses cache on first attempt and bypasses cache on retry; evicts cached `sheetId` and retries when certain `HttpError` cases occur. |
| update_row_in_gsheet: retry with cache eviction `cloud_governance/common/google_drive/google_drive_operations.py` | Adds two-attempt loop with cache-on-first-attempt / bypass-on-retry behavior; evicts cached `sheetId` and retries on specific `HttpError` cases; request uses numeric `columnIndex` (`0`) for update start. |

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly and clearly summarizes the main change: adding caching to Google Sheets API calls to prevent rate limiting, which aligns with the changeset's primary objective. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description clearly explains the problem (excessive API calls hitting rate limits), the solution (in-memory caching), and includes additional fixes (typo in logging). |


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

cloud_governance/common/google_drive/google_drive_operations.py (1)
48-52: ⚡ Quick win

Populate cache from create response instead of deleting key.

After addSheet, you already have authoritative response data; deleting the cache key forces an extra spreadsheets().get(...sheets.properties) read on next operation. Cache the new sheetId directly from batchUpdate(...).execute() replies to reduce read pressure further.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cloud_governance/common/google_drive/google_drive_operations.py` around lines
48 - 52, The code deletes a cached sheet id after calling
self.__service.spreadsheets().batchUpdate(...).execute(), causing an extra read;
instead parse the batchUpdate response (the result of
self.__service.spreadsheets().batchUpdate(...).execute()) to extract the new
sheetId for the created sheet and store it in self.__sheet_id_cache under the
cache_key (f"{gsheet_id}:{sheet_name}") so subsequent operations use the
authoritative id; update the logic around batchUpdate call in
google_drive_operations.py to inspect the execute() response for the added
sheet's properties and assign that sheetId into __sheet_id_cache rather than
deleting the key.
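The suggestion above, populating the cache from the create response, could be sketched like this. It assumes the `batchUpdate` response carries a `replies` list whose `addSheet` entries include the new sheet's `properties`, as in the Sheets API response format; the helper name `sheet_id_from_add_sheet_response` and the sample values are hypothetical.

```python
from typing import Any, Dict, Optional


def sheet_id_from_add_sheet_response(response: Dict[str, Any], sheet_name: str) -> Optional[int]:
    """Return the sheetId of the sheet titled sheet_name from a batchUpdate response."""
    for reply in response.get("replies", []):
        # Replies for requests other than addSheet have no "addSheet" key.
        properties = reply.get("addSheet", {}).get("properties", {})
        if properties.get("title") == sheet_name and "sheetId" in properties:
            return properties["sheetId"]
    return None


# Hand-written sample shaped like a batchUpdate response to an addSheet request.
sample_response = {
    "spreadsheetId": "spreadsheet-1",
    "replies": [{"addSheet": {"properties": {"sheetId": 123456, "title": "Costs"}}}],
}

sheet_id_cache: Dict[str, int] = {}
new_sheet_id = sheet_id_from_add_sheet_response(sample_response, "Costs")
if new_sheet_id is not None:
    # Store the authoritative ID instead of deleting the key.
    sheet_id_cache["spreadsheet-1:Costs"] = new_sheet_id
```

Storing the ID directly saves the extra metadata read that deleting the key would force on the next operation.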

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cloud_governance/common/google_drive/google_drive_operations.py`:
- Around line 112-115: find_sheet_id_by_name currently returns cached sheet IDs
from self.__sheet_id_cache unconditionally which can serve stale sheetId values;
modify it to detect and recover from "sheet not found" failures by evicting the
cache entry and retrying the lookup once (or implement a bounded TTL on
entries), and apply the same change to the sibling cache access at the other
location referenced (lines ~123-125) so callers like delete_rows,
update_row_in_gsheet, and paste/update flows will retry with a fresh lookup when
a cached id proves invalid.
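The bounded-TTL alternative mentioned in the comment above might look like the sketch below. It is not part of the PR: `TtlSheetIdCache`, its default of 300 seconds, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlSheetIdCache:
    """Sheet ID cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[int, float]] = {}

    def put(self, key: str, sheet_id: int) -> None:
        self._entries[key] = (sheet_id, self._clock())

    def get(self, key: str) -> Optional[int]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        sheet_id, stored_at = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry is too old: drop it so the caller performs a fresh lookup.
            del self._entries[key]
            return None
        return sheet_id


# Usage with a fake clock so the expiry is deterministic.
now = [0.0]
ttl_cache = TtlSheetIdCache(ttl_seconds=60.0, clock=lambda: now[0])
ttl_cache.put("spreadsheet-1:Sheet1", 42)
fresh = ttl_cache.get("spreadsheet-1:Sheet1")    # 42, still within the TTL
now[0] = 61.0
expired = ttl_cache.get("spreadsheet-1:Sheet1")  # None, entry has expired
```

A TTL bounds how long a stale ID can be served, but it does not remove the need for eviction on failure, since a sheet can be recreated inside the TTL window.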


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 16fa2d47-e838-44f4-a015-1e436f19af35

📥 Commits

Reviewing files that changed from the base of the PR and between 9be8c00 and 6bbc2ce.

📒 Files selected for processing (1)

cloud_governance/common/google_drive/google_drive_operations.py

Addresses potential issue where cached sheet IDs can become stale if sheets are renamed/deleted/recreated by external processes during a job run. This would cause operations to fail with "sheet not found" errors with no recovery mechanism.

Changes:
- Added _evict_sheet_cache() method to remove stale cache entries
- Added use_cache parameter to find_sheet_id_by_name() to bypass cache
- Wrapped delete_rows(), update_row_in_gsheet(), and paste_csv_to_gsheet() with retry logic that:
  1. Attempts operation with cached sheet ID
  2. On "not found" errors, evicts cache and retries once with fresh lookup
  3. Fails gracefully if second attempt also fails

This defensive approach prevents complete job failures when sheet metadata changes during execution.

Co-authored-by: Cursor <cursoragent@cursor.com>
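The retry flow in this commit message can be sketched generically as below. This is an illustration under assumptions, not the PR's code: `run_with_cache_retry` and `is_stale_sheet_error` are hypothetical names, a plain `Exception` stands in for `HttpError`, and the message tokens are the ones quoted in the review discussion.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Substrings treated as signs of a stale sheet ID (taken from the review thread).
STALE_TOKENS = ("not found", "unable to parse range", "no grid with id")


def is_stale_sheet_error(error: Exception) -> bool:
    message = str(error).lower()
    return any(token in message for token in STALE_TOKENS)


def run_with_cache_retry(operation: Callable[[bool], T], evict: Callable[[], None]) -> T:
    """Run operation(use_cache) at most twice.

    The first attempt uses the cache. If it fails with a stale-sheet error,
    the cache entry is evicted and the operation is retried once with a fresh
    lookup. Any other failure, or a second failure, propagates to the caller.
    """
    for attempt in range(2):
        try:
            return operation(attempt == 0)
        except Exception as error:  # the real code would catch HttpError
            if attempt == 0 and is_stale_sheet_error(error):
                evict()
                continue
            raise
    raise AssertionError("unreachable: the second attempt returns or raises")


# Usage: the first call fails as if the cached ID were stale, the retry succeeds.
evictions = []


def delete_row(use_cache: bool) -> str:
    if use_cache:
        raise RuntimeError("Invalid requests[0].deleteDimension: No grid with id: 7")
    return "deleted"


result = run_with_cache_retry(delete_row, lambda: evictions.append("evicted"))
```

Re-raising on the second failure keeps callers from assuming an operation succeeded when it did not.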

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

cloud_governance/common/google_drive/google_drive_operations.py (1)

173-174: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stale-sheet retry matching is too narrow and can miss real stale sheetId failures.

These guards only retry on "Unable to parse range" / "not found". Stale cached IDs from Sheets often fail as "No grid with id", which would skip eviction+retry and defeat the cache-recovery path.

Suggested fix

+    @staticmethod
+    def _is_stale_sheet_error(error: HttpError) -> bool:
+        msg = str(error).lower()
+        return any(token in msg for token in ('not found', 'unable to parse range', 'no grid with id'))
+
     def delete_rows(self, spreadsheet_id: str, sheet_name: str, row_number: int):
@@
-                except HttpError as error:
-                    if attempt == 0 and ('Unable to parse range' in str(error) or 'not found' in str(error).lower()):
+                except HttpError as error:
+                    if attempt == 0 and self._is_stale_sheet_error(error):
@@
     def paste_csv_to_gsheet(self, csv_path, spreadsheet_id: str, sheet_name: str):
@@
-                    except HttpError as error:
-                        if attempt == 0 and ('Unable to parse range' in str(error) or 'not found' in str(error).lower()):
+                    except HttpError as error:
+                        if attempt == 0 and self._is_stale_sheet_error(error):
@@
     def update_row_in_gsheet(self, data: list, gsheet_id: str, row: int, sheet_name: str):
@@
-                except HttpError as error:
-                    if attempt == 0 and ('Unable to parse range' in str(error) or 'not found' in str(error).lower()):
+                except HttpError as error:
+                    if attempt == 0 and self._is_stale_sheet_error(error):

#!/bin/bash
# Verify stale-id error matching logic and whether "No grid with id" is currently handled.
rg -n -C2 "Unable to parse range|not found|No grid with id|_is_stale_sheet_error" cloud_governance/common/google_drive/google_drive_operations.py

Also applies to: 219-220, 253-254

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cloud_governance/common/google_drive/google_drive_operations.py` around lines
173 - 174, The stale-sheet retry guard is too narrow—update the checks around
self._evict_sheet_cache (used where you currently match 'Unable to parse range'
or 'not found') to also detect "No grid with id" and other sheet-id stale
messages by adding a small helper like _is_stale_sheet_error(self, error)
(case-insensitive, matches 'unable to parse range', 'not found', 'no grid with
id', etc.), then call that helper in the existing conditionals (the blocks that
call self._evict_sheet_cache at the current occurrences) so eviction+retry will
run for all stale-sheet error variants; apply the same change to the other two
occurrences noted in the comment.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cloud_governance/common/google_drive/google_drive_operations.py`:
- Around line 177-178: The delete_rows function currently catches a second
failure, logs it with logger.info and uses break which suppresses the error and
lets callers assume deletion succeeded; change this to log at an appropriate
level (error) and re-raise the exception (or raise a new descriptive exception)
instead of breaking so failures propagate to callers. Locate delete_rows and
replace the logger.info/following break behavior with code that logs the error
(including the caught exception) and then raise the original exception (or a
wrapped one) so downstream code cannot silently continue after a failed delete.
- Around line 205-206: The Sheets batchUpdate payload is using string values for
"rowIndex" and "columnIndex" (e.g., "0" / '0'); change these to integer values
(0) where the payload/gridRange is constructed so the JSON sends numbers not
strings; locate the places in google_drive_operations.py that build the
batchUpdate/gridRange objects (look for "rowIndex" and "columnIndex" keys around
the shown diff) and replace the quoted string values with unquoted numeric
literals (0) so the API receives numeric indexes.
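The numeric-index point above can be illustrated with a small payload builder. The PR's actual request body is not shown on this page, so the sketch assumes an `updateCells` request with a `start` grid coordinate; `build_update_cells_request` is a hypothetical helper, and writing every value as a string cell is a simplification.

```python
import json
from typing import Any, Dict, List


def build_update_cells_request(sheet_id: int, row: int, values: List[Any]) -> Dict[str, Any]:
    """Build an updateCells request whose grid coordinates are integers, not strings."""
    return {
        "updateCells": {
            "start": {
                "sheetId": int(sheet_id),
                "rowIndex": int(row),  # numeric, so JSON emits 3 rather than "3"
                "columnIndex": 0,      # numeric literal, not "0"
            },
            "rows": [
                {"values": [{"userEnteredValue": {"stringValue": str(v)}} for v in values]}
            ],
            "fields": "userEnteredValue",
        }
    }


request = build_update_cells_request(sheet_id=42, row="3", values=["a", 1])
payload = json.dumps({"requests": [request]})
```

Coercing with `int()` at the point where the payload is built means a row number that arrives as a string still serializes as a JSON number.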


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: cc8b99a0-610a-45c6-bc42-eff530dbed33

📥 Commits

Reviewing files that changed from the base of the PR and between 6bbc2ce and db9b141.

📒 Files selected for processing (1)

cloud_governance/common/google_drive/google_drive_operations.py

ebattat

/approve

pragya811 self-assigned this Jun 10, 2026

pragya811 added this to Cloud-Governance project Jun 10, 2026

github-project-automation Bot moved this to In progress in Cloud-Governance project Jun 10, 2026

pragya811 requested a review from ebattat June 10, 2026 09:02

coderabbitai Bot reviewed Jun 10, 2026

Comment thread cloud_governance/common/google_drive/google_drive_operations.py

coderabbitai Bot reviewed Jun 10, 2026

Comment thread cloud_governance/common/google_drive/google_drive_operations.py Outdated

Comment thread cloud_governance/common/google_drive/google_drive_operations.py Outdated

Address coderabbit review comments

4a864a9

ebattat approved these changes Jun 10, 2026

pragya811 merged commit d775cb7 into main Jun 10, 2026
19 checks passed

github-project-automation Bot moved this from In progress to Done in Cloud-Governance project Jun 10, 2026

pragya811 deleted the jenkins_fix3 branch June 10, 2026 13:42

Add caching to Google Sheets API calls to prevent rate limiting #1002: pragya811 merged 3 commits into main from jenkins_fix3
