Skip to content

Pass cache options to spancache.Get#1531

Merged
Kern-- merged 1 commit into
awslabs:mainfrom
Kern--:test-fuse-failures
Apr 25, 2025
Merged

Pass cache options to spancache.Get#1531
Kern-- merged 1 commit into
awslabs:mainfrom
Kern--:test-fuse-failures

Conversation

@Kern--
Copy link
Copy Markdown

@Kern-- Kern-- commented Apr 24, 2025

Issue #, if available:

Description of changes:
Before this change, the span manager did not pass cache opts to the span cache when getting keys, but it did pass options when putting keys. This change makes them consistent.

This is problematic with the Direct option because writes are directly to files, but reads use a file descriptor cache. This leads to the following issue:

  1. The background fetcher fetches a span, but does not decompress it
  2. The background fetcher writes the compressed span to the span cache
  3. A later read loads the compressed data from the span cache. This caches the file descriptor
  4. The span is decompressed
  5. The span is rewritten to the same cache key by opening a temp file, writing the decompressed span data, then renaming it to the original name. NOTE: This is a different file descriptor from step 3!
  6. The span is marked decompressed
  7. Another read hits in the cache and get's back the fd to the (now deleted) compressed file from step 3.
  8. SOCI returns compressed data or get's an unxpected EOF because there isn't as much compressed data as uncompressed data.

The solution here is to pass cache options. Each cache Get opens a new file descriptor, so the caching issue is bypassed.

This was exposed by #1512 because we stopped using the go-toml default values which included DirectoryCacheConfig.Direct which was true. We should move the go-toml default values into config parsers because go-toml defaults are deprecated in v2.0. It's a default=true config value which will be easier to add after #1518

Testing performed:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@Kern-- Kern-- requested a review from a team as a code owner April 24, 2025 23:37
@github-actions github-actions Bot added the go Pull requests that update Go code label Apr 24, 2025
Comment thread fs/span-manager/span_manager.go Outdated
Before this change, the span manager did not pass cache opts to the span
cache when getting keys, but it did pass options when putting keys. This
change makes them consistent.

This is problematic with the Direct option because writes are directly
to files, but reads use a file descriptor cache. This leads to the
following issue:

1. The background fetcher fetches a span, but does not decompress it
2. The background fetcher writes the compressed span to the span cache
3. A later read loads the compressed data from the span cache. This
   caches the file descriptor
4. The span is decompressed
5. The span is rewritten to the same cache key by opening a temp file,
   writing the decompressed span data, then renaming it to the original
   name. NOTE: This is a different file descriptor from step 3!
6. The span is marked decompressed
7. Another read hits in the cache and get's back the fd to the (now
   deleted) compressed file from step 3.
8. SOCI returns compressed data or get's an unxpected EOF because there
   isn't as much compressed data as uncompressed data.

The solution here is to pass cache options. Each cache Get opens a new
file descriptor, so the caching issue is bypassed.

Signed-off-by: Kern Walster <walster@amazon.com>
@Kern-- Kern-- force-pushed the test-fuse-failures branch from 46c5a37 to e049f8e Compare April 25, 2025 15:42
@Kern-- Kern-- merged commit 739c04f into awslabs:main Apr 25, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go Pull requests that update Go code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants