@thrivikram-karur-g
Contributor

Description

Adds traces for read flows such as file cache read, file cache write, and file cache with parallel downloads.

Link to the issue in case of a bug fix.

b/454481823

Testing details

  1. Manual - NA
  2. Unit tests - NA
  3. Integration tests - NA

Any backward incompatible change? If so, please explain.

NA

@thrivikram-karur-g thrivikram-karur-g requested a review from a team as a code owner January 6, 2026 05:32
@github-actions github-actions bot added the remind-reviewers Auto remind reviewers in attention set for review post 24hrs of inactivity on PR. label Jan 6, 2026
@thrivikram-karur-g thrivikram-karur-g removed the request for review from ashmeenkaur January 6, 2026 05:32
@gemini-code-assist
Contributor

Summary of Changes

Hello @thrivikram-karur-g, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the observability of read operations by introducing a new tracing framework based on OpenTelemetry. It integrates tracing into critical read paths, including file caching and direct GCS reads, to provide detailed insights into performance and potential bottlenecks. The changes allow for better understanding and debugging of how data is accessed and processed within the system.

Highlights

  • New Tracing Package: A new tracing package has been introduced, providing an abstraction layer for distributed tracing. It includes a TraceHandle interface with noopTracer and otelTracer implementations, allowing for flexible integration with OpenTelemetry.
  • Read Flow Tracing Integration: Comprehensive tracing has been integrated into various read-related components, including buffered reads, file cache operations (read, write, and parallel downloads), and direct GCS reader paths. This enables detailed visibility into the performance of read operations.
  • Context Propagation and Error Recording: Tracing contexts are now properly propagated across function calls, ensuring that spans are correctly linked in distributed traces. Errors encountered during read operations are also recorded within their respective spans for easier debugging.
  • Detailed Span Attributes: Specific attributes, such as gcsfuse.is_cache_hit and gcsfuse.bytes_read, are added to read-related spans. These attributes provide granular details about cache performance and data transfer, enhancing observability.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces distributed tracing for read flows by adding a new tracing package and integrating it throughout the read path. The changes are well-structured, introducing a TraceHandle interface with noop and otel implementations, and using it to trace operations like file cache reads and downloads.

I have a few suggestions to improve the trace context propagation and the consistency of span naming. Overall, this is a great addition for observability.

// the file, capped by the configured minimum.
blocksInFile := (int64(opts.Object.Size) + opts.Config.PrefetchBlockSizeBytes - 1) / opts.Config.PrefetchBlockSizeBytes
numBlocksToReserve := min(blocksInFile, opts.Config.MinBlocksPerHandle)
_, span := opts.TraceHandle.StartTrace(context.Background(), tracing.ReadPrefetchBlockPoolGen)
medium

Using context.Background() here starts a new trace, which will not be a child of the parent operation's trace (e.g., ReadFile). This breaks the trace context propagation.

To fix this, the context.Context from the ReadWithReadManager call in internal/fs/handle/file.go should be propagated down to NewBufferedReader. This can be done by:

  1. Adding a Ctx context.Context field to read_manager.ReadManagerConfig and bufferedread.BufferedReaderOptions.
  2. Updating read_manager.NewReadManager to accept a context and pass it through the options.
  3. Using the propagated context here instead of context.Background().

This will ensure that the ReadPrefetchBlockPoolGen span is correctly parented under the ongoing operation's span.


var err error
for _, r := range rr.readers {
	ctx, span := rr.traceHandle.StartTrace(ctx, reflect.TypeOf(r).String())
medium

Using reflect.TypeOf(r).String() for span names makes the tracing dependent on the internal type names, which can be brittle. For example, if a type is renamed or moved, the span name will change unexpectedly. A more robust approach would be to use string constants for span names for each reader type.
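One possible shape for this, sketched with hypothetical reader types and span names (none of the identifiers below are taken from the PR), is a small lookup that maps each reader type to a named constant:

```go
package main

import "fmt"

// Reader is a hypothetical stand-in for the reader interface iterated over
// in the loop above.
type Reader interface {
	ReadAt(p []byte, off int64) (int, error)
}

// fileCacheReader and gcsReader are placeholder reader types for this sketch.
type fileCacheReader struct{}
type gcsReader struct{}

func (*fileCacheReader) ReadAt(p []byte, off int64) (int, error) { return 0, nil }
func (*gcsReader) ReadAt(p []byte, off int64) (int, error)       { return 0, nil }

// Span names are declared once, so renaming or moving a type does not
// silently change what shows up in traces. The values are illustrative.
const (
	spanFileCacheReader = "FileCacheReader.ReadAt"
	spanGCSReader       = "GCSReader.ReadAt"
	spanUnknownReader   = "UnknownReader.ReadAt"
)

// spanNameFor returns a stable span name for a reader, with a fallback for
// reader types that have not been registered here.
func spanNameFor(r Reader) string {
	switch r.(type) {
	case *fileCacheReader:
		return spanFileCacheReader
	case *gcsReader:
		return spanGCSReader
	default:
		return spanUnknownReader
	}
}

func main() {
	readers := []Reader{&fileCacheReader{}, &gcsReader{}}
	for _, r := range readers {
		fmt.Println(spanNameFor(r))
	}
}
```

An alternative with the same effect would be to add a method such as `SpanName() string` to the reader interface, which keeps each name next to its type at the cost of widening the interface.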
