@parthivsaikia
Contributor

Description

This PR addresses critical stability issues identified in the GetAllAhHelmPackages function during high-volume syncs with Artifact Hub.

Changes included:

  1. Implemented Exponential Backoff: Replaced the static sleep duration with an exponential backoff strategy (1s, 2s, 4s...) to handle 429 Too Many Requests errors gracefully.
  2. Fixed Resource Leak: Resolved a file descriptor exhaustion issue caused by defer resp.Body.Close() being placed inside a for loop. Deferred calls run only when the enclosing function returns, so every response body stayed open until the whole sync finished. The response body is now closed explicitly after each iteration.
  3. Updated Documentation: Updated the function comment to accurately reflect the new retry mechanism.
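
Changes 1 and 2 can be sketched together as below. This is a minimal illustration under stated assumptions, not the code from this PR: getWithBackoff and backoffDelay are hypothetical names, the retry count of 5 and the doubling delays follow the description above, and retrying on network errors as well as on 429 is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// maxRetries caps the number of attempts (value taken from the PR discussion).
const maxRetries = 5

// backoffDelay returns base * 2^attempt for a zero-based attempt number,
// giving 1s, 2s, 4s, ... when base is one second.
func backoffDelay(base time.Duration, attempt int) time.Duration {
	return base << uint(attempt)
}

// getWithBackoff is a hypothetical helper: it retries on 429 responses and
// on transport errors, and closes each response body inside the loop.
func getWithBackoff(client *http.Client, url string, base time.Duration) ([]byte, error) {
	var lastErr error
	for attempt := 0; attempt < maxRetries; attempt++ {
		resp, err := client.Get(url)
		if err != nil {
			lastErr = err
		} else {
			body, readErr := io.ReadAll(resp.Body)
			// Close now. A defer here would not run until the function
			// returns, so bodies would pile up across iterations.
			resp.Body.Close()
			switch {
			case resp.StatusCode == http.StatusTooManyRequests:
				lastErr = fmt.Errorf("rate limited: %s", resp.Status)
			case readErr != nil:
				return nil, readErr
			case resp.StatusCode != http.StatusOK:
				return nil, fmt.Errorf("unexpected status: %s", resp.Status)
			default:
				return body, nil
			}
		}
		if attempt < maxRetries-1 {
			time.Sleep(backoffDelay(base, attempt))
		}
	}
	return nil, fmt.Errorf("giving up after %d attempts: %w", maxRetries, lastErr)
}

func main() {
	// Local test server: answers 429 twice, then 200.
	var calls int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&calls, 1) <= 2 {
			w.WriteHeader(http.StatusTooManyRequests)
			return
		}
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	// A short base delay keeps the demo fast; the PR describes a 1s base.
	body, err := getWithBackoff(srv.Client(), srv.URL, 10*time.Millisecond)
	fmt.Println(string(body), err, atomic.LoadInt32(&calls)) // prints: ok <nil> 3
}
```

The single defer in main is fine because it sits outside any loop; the problem described in change 2 arises only when defer is executed once per iteration.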

This PR fixes #855

Signed commits

  • Yes, I signed my commits.

@gemini-code-assist
Contributor

Summary of Changes

Hello @parthivsaikia, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the stability and reliability of the GetAllAhHelmPackages function, particularly during high-volume data synchronization with Artifact Hub. By introducing an exponential backoff mechanism, the system can now gracefully manage rate limits, preventing service interruptions. Additionally, a critical resource leak related to HTTP response bodies has been addressed, ensuring efficient resource utilization and preventing potential system crashes under heavy load.

Highlights

  • Exponential Backoff: Implemented an exponential backoff strategy (1s, 2s, 4s...) to gracefully handle '429 Too Many Requests' errors when interacting with Artifact Hub, replacing a static sleep duration.
  • Resource Leak Fix: Resolved a file descriptor exhaustion issue by explicitly closing the HTTP response body after each iteration within a loop, rather than relying on defer, which would close resources only when the function returns.
  • Documentation Update: Updated the function comment for GetAllAhHelmPackages to accurately reflect the new retry mechanism and improved rate limit handling.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses the rate limiting and resource leak issues in GetAllAhHelmPackages. The introduction of exponential backoff is a solid improvement over a static sleep, and the resource leak from defer inside a loop is correctly fixed. My review includes a few suggestions to further improve robustness and maintainability: one critical fix for potential panics from unsafe type assertions, and recommendations to replace fmt.Println with proper logging and to use a constant for the retry count.

Comment on lines 165 to 166
name := p["name"].(string)
repo := p["repository"].(map[string]interface{})["name"].(string)
Contributor

high

These direct type assertions can cause a panic if the fields name or repository are missing from the API response, or if their types are not what's expected. This could crash the entire sync process.

It's much safer to use the two-value comma-ok idiom to check for success before using the value. For example:

name, ok := p["name"].(string)
if !ok {
    log.Printf("Skipping package due to missing or invalid name: %v", p)
    continue
}

repoMap, ok := p["repository"].(map[string]interface{})
if !ok {
    log.Printf("Skipping package due to missing or invalid repository: %v", p)
    continue
}

repo, ok := repoMap["name"].(string)
if !ok {
    log.Printf("Skipping package due to missing or invalid repository name: %v", p)
    continue
}
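
The same checks can also be pulled into a small helper so the loop body stays short. The sketch below is one possible arrangement, not the PR's code: packageNameAndRepo is a hypothetical name, and the sample package data in main is made up for illustration.

```go
package main

import "fmt"

// packageNameAndRepo is a hypothetical helper that extracts the package name
// and repository name using the comma-ok form of type assertion. It reports
// ok == false instead of panicking when a field is missing or has an
// unexpected type.
func packageNameAndRepo(p map[string]interface{}) (name, repo string, ok bool) {
	name, ok = p["name"].(string)
	if !ok {
		return "", "", false
	}
	repoMap, ok := p["repository"].(map[string]interface{})
	if !ok {
		return "", "", false
	}
	repo, ok = repoMap["name"].(string)
	if !ok {
		return "", "", false
	}
	return name, repo, true
}

func main() {
	// Illustrative data only; real entries come from the decoded API response.
	packages := []map[string]interface{}{
		{"name": "nginx", "repository": map[string]interface{}{"name": "bitnami"}},
		{"name": "broken"},              // repository is missing
		{"name": 42, "repository": nil}, // name has the wrong type
	}
	for _, p := range packages {
		name, repo, ok := packageNameAndRepo(p)
		if !ok {
			fmt.Printf("skipping malformed package: %v\n", p)
			continue
		}
		fmt.Printf("%s/%s\n", repo, name)
	}
}
```

The trade-off is that the helper reports a single ok flag, so the caller loses the per-field log messages shown above; returning an error that names the offending field would keep them.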

Member

@copilot implement this feedback.

var resp *http.Response
var err error

for i := 0; i < 5; i++ {
Contributor

medium

The number of retries (5) is a magic number. It's better to define this as a constant (e.g., const maxRetries = 5) at the package or function level for improved readability and maintainability.

}

if err != nil {
fmt.Println(err)
Contributor

medium

Using fmt.Println for errors is not recommended in library code as it writes to standard output and offers no configurability (like levels, formatting, or output destination). This makes debugging in a larger application difficult. This applies to all fmt.Println calls in this function.

Consider using a structured logger. The project seems to use logrus elsewhere, but even the standard log package would be an improvement. For example:

import "log"
// ...
log.Printf("error during http get for url %s: %v", url, err)
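
To make the configurability point concrete, here is a rough sketch using only the standard log package. It assumes nothing about the project's actual logging setup: logFetchError and capturedExample are hypothetical names, and the "artifacthub: " prefix and example URL are placeholders.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log"
	"os"
)

// logFetchError is a hypothetical helper. Taking the logger as a parameter
// lets the caller decide where output goes and how it is formatted, which
// fmt.Println cannot offer.
func logFetchError(l *log.Logger, url string, err error) {
	l.Printf("error during http get for url %s: %v", url, err)
}

// capturedExample routes the same message into an in-memory buffer, the way
// a test or an embedding application might redirect library output.
func capturedExample() string {
	var buf bytes.Buffer
	l := log.New(&buf, "artifacthub: ", 0) // flags 0: no timestamp, so the text is deterministic
	logFetchError(l, "https://example.invalid/packages", errors.New("connection refused"))
	return buf.String()
}

func main() {
	// Typical setup: write to stderr with a prefix and timestamp.
	stderrLogger := log.New(os.Stderr, "artifacthub: ", log.LstdFlags)
	logFetchError(stderrLogger, "https://example.invalid/packages", errors.New("connection refused"))

	fmt.Print(capturedExample())
}
```

The standard log package is not a structured logger; if the project already depends on logrus, passing that logger in the same way would add levels and fields.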

Development

Successfully merging this pull request may close these issues.

bug: Artifact Hub sync fails (429 errors) and leaks connections
