feat: implement File Store #328

Open
wangxiaoxuan273 wants to merge 7 commits into oras-project:main from wangxiaoxuan273:file-ai-attempt

@wangxiaoxuan273
Contributor

What this PR does / why we need it

Which issue(s) this PR resolves / fixes

Resolves / Fixes #<issue_id>

Please check the following list

  • Does the affected code have corresponding tests, e.g. unit test, E2E test?
  • Does this change require a documentation update?
  • Does this introduce breaking changes that would require an announcement or bumping the major version?
  • Do all new files have an appropriate license header?

Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>
@codecov

codecov bot commented Feb 3, 2026

Codecov Report

❌ Patch coverage is 84.50920% with 101 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.22%. Comparing base (ca1e9aa) to head (2ea33d7).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/OrasProject.Oras/Content/File/TarUtilities.cs | 70.83% | 47 Missing and 9 partials ⚠️ |
| src/OrasProject.Oras/Content/File/Store.cs | 89.37% | 37 Missing and 7 partials ⚠️ |
| src/OrasProject.Oras/Registry/Remote/BlobStore.cs | 0.00% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #328      +/-   ##
==========================================
- Coverage   91.45%   90.22%   -1.23%     
==========================================
  Files          61       68       +7     
  Lines        2586     3305     +719     
  Branches      345      443      +98     
==========================================
+ Hits         2365     2982     +617     
- Misses        137      218      +81     
- Partials       84      105      +21     

Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>
@wangxiaoxuan273 wangxiaoxuan273 marked this pull request as ready for review February 3, 2026 08:01
Copilot AI review requested due to automatic review settings February 3, 2026 08:01
Contributor

Copilot AI left a comment

Pull request overview

This PR implements a File Store feature for the ORAS .NET library, providing file-system-based content storage that implements the ITarget interface for OCI artifact operations.

Changes:

  • Adds a comprehensive File Store implementation with support for named file storage, tar/gzip directory archives, and fallback memory storage
  • Implements five custom exceptions for file store operations
  • Adds OCI standard annotations class and file store-specific annotations
  • Includes tar/gzip utilities for directory handling
  • Provides comprehensive test coverage with 2,239 lines of tests across 45+ test cases

Reviewed changes

Copilot reviewed 11 out of 13 changed files in this pull request and generated 63 comments.

| File | Description |
|---|---|
| src/OrasProject.Oras/Content/File/Store.cs | Core File Store implementation with ITarget, IPredecessorFindable, and disposal interfaces |
| src/OrasProject.Oras/Content/File/TarUtilities.cs | Utilities for creating and extracting tar.gz archives with checksums |
| src/OrasProject.Oras/Content/File/Annotations.cs | File store-specific annotation constants |
| src/OrasProject.Oras/Oci/Annotations.cs | OCI standard annotation key constants |
| src/OrasProject.Oras/Content/File/Exceptions/*.cs | Five custom exception classes for file store operations |
| tests/OrasProject.Oras.Tests/Content/File/StoreTest.cs | Comprehensive test suite with 45+ test cases covering all file store functionality |
| tests/OrasProject.Oras.Tests/Content/File/Exceptions/ExceptionTest.cs | Tests for all exception constructors and messages |
| tests/OrasProject.Oras.Tests/Registry/Remote/RepositoryTest.cs | Whitespace cleanup |
| src/OrasProject.Oras/Registry/Remote/BlobStore.cs | Whitespace cleanup |

Copilot AI review requested due to automatic review settings February 3, 2026 08:13
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 15 comments.

Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>

Update src/OrasProject.Oras/Content/File/Store.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>

Update tests/OrasProject.Oras.Tests/Content/File/StoreTest.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>

Update tests/OrasProject.Oras.Tests/Content/File/StoreTest.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>

Update src/OrasProject.Oras/Content/File/Store.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>

Update tests/OrasProject.Oras.Tests/Content/File/StoreTest.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>

Update tests/OrasProject.Oras.Tests/Content/File/StoreTest.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>

Update src/OrasProject.Oras/Content/File/TarUtilities.cs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>
Signed-off-by: Xiaoxuan Wang <wangxiaoxuan119@gmail.com>
Copilot AI review requested due to automatic review settings February 3, 2026 08:40
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 16 comments.

@sajayantony
Contributor

Note: This comment was generated by GitHub Copilot.

Suggested PR Split & Design Doc Request

@wangxiaoxuan273 — This PR is quite large (+3,988 lines across 13 files). To make review more tractable, we'd like to request two things:

1. Design Doc

Before submitting code PRs, please create a brief design document covering:

  • File Store architecture: CAS mapping, name tracking, fallback storage
  • Component interaction: Store ↔ TarUtilities ↔ MemoryGraph ↔ MemoryTagStore
  • Public API surface: ITarget, IPredecessorFindable, AddAsync, configuration properties
  • Directory handling strategy: tar.gz creation/extraction with digest verification
  • Concurrency model: NameStatus + SemaphoreSlim
  • Disposal / lifecycle

2. Suggested Issue Breakdown (~10 issues, each ≤200-300 lines)

Issues 1-4 can be worked on in parallel. The rest form a sequential chain.

| # | Scope | ~Lines | Depends On |
|---|---|---|---|
| 1 | OCI Annotations constants: Oci/Annotations.cs (fix spec link v1.1.0→v1.1.1) | ~91 | |
| 2 | FileStore annotation constants: Content/File/Annotations.cs | ~30 | |
| 3 | Exception types + tests: 5 exception classes + ExceptionTest.cs | ~270+230 | |
| 4 | TarUtilities (tar creation): TarDirectoryAsync, CreateDirectoryEntry, GetUnixFileMode + tests | ~160 | |
| 5 | TarUtilities (tar extraction): ExtractTarGzipAsync, ExtractTarDirectoryAsync, path helpers + tests. Fix undisposed GZipStream/TarReader | ~190 | #4 |
| 6 | Store scaffolding: class decl, fields, properties, constructors, Dispose, helpers (ThrowIfClosed, AbsPath, ResolveWritePath), NameStatus (fix SemaphoreSlim disposal), HashingStream + tests | ~200 | #1-3 |
| 7 | Store (blob push/fetch/exists): FetchAsync, ExistsAsync, PushAsync, PushInternalAsync, PushFileAsync, SaveFileAsync, VerifyAndCopyAsync + tests. Fix undisposed FileStream and MemoryStream issues | ~200 | #6 |
| 8 | Store (tag/resolve/predecessors/dedup): ResolveAsync, TagAsync, GetPredecessorsAsync, RestoreDuplicatesAsync + tests | ~200 | #7 |
| 9 | Store (directory add/push): AddAsync, PushDirectoryAsync, DescriptorFromDirectoryAsync, DescriptorFromFileAsync + tests | ~250 | #5, #7, #8 |
| 10 | Copy integration tests: MemoryToFile, FileToMemory, PartialCopy + ITarget interface test | ~300 | #9 |

Dependency graph:

#1 (OCI Annotations) ──┐
#2 (FS Annotations) ───┤
#3 (Exceptions) ───────┤
                       ├──→ #6 (Scaffolding) → #7 (Blob push/fetch) → #8 (Tag/resolve/dedup) → #9 (Dir add/push) → #10 (Integration tests)
#4 (Tar creation) ─────┤                                                                          ↑
                       ├─→ #5 (Tar extraction) ───────────────────────────────────────────────────┘

Additional notes:

  • Drop the BlobStore.cs and RepositoryTest.cs changes — they are whitespace-only (trailing spaces) and add noise.
  • Address all ~30 undisposed MemoryStream/FileStream/TarReader review comments in the respective issues.
  • Each issue should include its own tests and pass CI independently.

shizhMSFT

This comment was marked as duplicate.

Contributor

@shizhMSFT shizhMSFT left a comment

Caution

This review is generated by GitHub Copilot using Claude Opus 4.6, mimicking shizhMSFT's review style. This is not the real shizhMSFT. Take the feedback with that context in mind.


Substantial effort here: a full FileStore implementation with ~1200 lines of production code and ~2500 lines of tests. The overall structure aligns well with oras-go semantics.

That said, I have a few security and correctness concerns that should be addressed before merging.

using var hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);

// Read the entire decompressed content for hashing
var bufferedStream = new MemoryStream();

The entire decompressed content is buffered into a MemoryStream when a checksum is present. There is no size limit.

This is vulnerable to a zip bomb attack: a malicious gzip archive of a few KB can decompress to gigabytes and exhaust the memory of the process.

Consider streaming through a HashingStream (you already have one in Store.cs) as a pass-through while extracting, rather than buffering the entire decompressed content. This way you can verify the checksum at the end without holding everything in memory.
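A minimal sketch of the streaming approach, assuming the checksum covers the decompressed bytes. `DecompressAndHashAsync` is a hypothetical helper, not code from this PR, and it writes to a plain destination stream instead of feeding a TarReader; the point is only that the hash is updated chunk by chunk, so memory use stays bounded by the buffer size.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: decompresses gzipSource into destination in fixed-size
// chunks and hashes the decompressed bytes as they pass through.
static async Task<string> DecompressAndHashAsync(
    Stream gzipSource, Stream destination, CancellationToken cancellationToken = default)
{
    using var hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
    await using var gzip = new GZipStream(gzipSource, CompressionMode.Decompress, leaveOpen: true);

    var buffer = new byte[81920];
    int read;
    while ((read = await gzip.ReadAsync(buffer, cancellationToken)) > 0)
    {
        // Update the running hash, then forward the same chunk downstream.
        hash.AppendData(buffer, 0, read);
        await destination.WriteAsync(buffer.AsMemory(0, read), cancellationToken);
    }

    // Same digest format as the snippet under review: "sha256:<lowercase hex>".
    return "sha256:" + Convert.ToHexString(hash.GetHashAndReset()).ToLowerInvariant();
}
```

The caller would compare the returned digest with the expected checksum after extraction completes and roll back the extracted files on mismatch.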

var computedDigest = $"sha256:{Convert.ToHexString(computedHash).ToLowerInvariant()}";
if (!string.Equals(computedDigest, checksum, StringComparison.OrdinalIgnoreCase))
{
    throw new InvalidOperationException($"Content digest mismatch: expected {checksum}, got {computedDigest}");

InvalidOperationException is thrown here for a digest mismatch, but Store.cs uses MismatchedDigestException (from OrasProject.Oras.Content.Exceptions) for the same scenario in VerifyAndCopyAsync. For consistency, use MismatchedDigestException here as well.

The repo copilot-instructions.md also says: "Prefer existing types in OrasProject.Oras.Exceptions."

string targetDirectory,
string directoryName,
Stream tarStream,
bool preservePermissions,

preservePermissions is accepted as a parameter but never used inside ExtractTarDirectoryAsync. Extracted files are always created with default permissions.

This is dead code. Either implement it (apply entry.Mode via File.SetUnixFileMode on .NET 8+) or remove the parameter until the feature is ready. Shipping a parameter that silently does nothing is misleading to callers.
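If the parameter is kept, honoring it could look roughly like the sketch below. `ApplyEntryMode` is a hypothetical helper; in the extractor the `mode` argument would come from `entry.Mode`. Stripping the setuid, setgid, and sticky bits is an extra assumption made here for safety, not something the PR or this review specifies.

```csharp
using System;
using System.IO;

// Hypothetical helper: applies a tar entry's mode to an extracted file when
// the caller asked for permissions to be preserved.
static void ApplyEntryMode(string path, UnixFileMode mode, bool preservePermissions)
{
    // File.SetUnixFileMode is not supported on Windows, so skip it there.
    if (!preservePermissions || OperatingSystem.IsWindows())
    {
        return;
    }

    // Assumption: drop special bits so an archive cannot create setuid files.
    const UnixFileMode specialBits =
        UnixFileMode.SetUser | UnixFileMode.SetGroup | UnixFileMode.StickyBit;
    File.SetUnixFileMode(path, mode & ~specialBits);
}
```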

{
    var fileStream = System.IO.File.Create(fullPath);
    await using var _fs = fileStream.ConfigureAwait(false);
    await entry.DataStream.CopyToAsync(fileStream, cancellationToken).ConfigureAwait(false);

There is no limit on entry.DataStream. A malicious tar archive could contain a single entry claiming to be a regular file with an extremely large size. The CopyToAsync call would write unbounded data to disk.

Consider checking entry.Length against a configurable max size before copying, or at minimum, counting bytes during copy and aborting if the total exceeds a reasonable threshold.
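The byte-counting variant can be sketched as follows. `CopyWithLimitAsync` and `maxBytes` are hypothetical names, and throwing `InvalidDataException` is a placeholder choice; the store may prefer one of its own exception types.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: copies source to destination and aborts as soon as
// more than maxBytes have been read.
static async Task<long> CopyWithLimitAsync(
    Stream source, Stream destination, long maxBytes, CancellationToken cancellationToken = default)
{
    var buffer = new byte[81920];
    long total = 0;
    int read;
    while ((read = await source.ReadAsync(buffer, cancellationToken)) > 0)
    {
        total += read;
        if (total > maxBytes)
        {
            // Stop before writing the chunk that crosses the limit.
            throw new InvalidDataException($"Entry exceeds the limit of {maxBytes} bytes.");
        }
        await destination.WriteAsync(buffer.AsMemory(0, read), cancellationToken);
    }
    return total;
}
```

Counting during the copy does not rely on the size declared in the tar header, so it can be combined with an up-front `entry.Length` check for a faster rejection.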

}
break;

case TarEntryType.HardLink:

Hard links are silently dropped. This can cause data loss when extracting archives that rely on hard links.

The comment says ".NET 8 does not have File.CreateHardLink", but you can P/Invoke CreateHardLink on Windows or use link() on Unix. At minimum, fall back to copying the linked file content instead of silently skipping. If intentionally unsupported, consider throwing or logging a warning so the caller knows data was lost.


// Check if the content exists in the store
if (_digestToPath.ContainsKey(target.Digest))
{

ExistsAsync returns true if the digest is in _digestToPath, without checking whether the file still exists on disk. But FetchAsync (line 175) does check System.IO.File.Exists(path) and throws NotFoundException if the file is gone.

This inconsistency means ExistsAsync can return true and then an immediate FetchAsync throws. Should we add the same file-existence check here?
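One way to align the two methods, sketched with a hypothetical `ContentExistsOnDisk` helper and a plain dictionary standing in for `_digestToPath` (whose real type is not shown here):

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// Hypothetical helper: content counts as present only if the digest is
// tracked and the backing file is still on disk.
static bool ContentExistsOnDisk(ConcurrentDictionary<string, string> digestToPath, string digest)
{
    return digestToPath.TryGetValue(digest, out var path) && File.Exists(path);
}
```

This narrows the window in which ExistsAsync and FetchAsync disagree, though a file can still be deleted between the two calls.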

/// <summary>
/// Creates a temporary file.
/// </summary>
private Task<(FileStream, string)> CreateTempFileAsync(CancellationToken cancellationToken)

nit: CreateTempFileAsync is not actually async; it returns Task.FromResult(...). Consider making it a synchronous method (CreateTempFile) to avoid misleading callers and the unnecessary Task allocation.
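A possible synchronous shape, as a sketch only. The `directory` parameter and the file-naming scheme are assumptions; the PR's actual body is not shown in this thread.

```csharp
using System;
using System.IO;

// Hypothetical synchronous replacement: creates a uniquely named file in the
// given directory and returns the open stream together with its path.
static (FileStream Stream, string FilePath) CreateTempFile(string directory)
{
    var filePath = Path.Combine(directory, Path.GetRandomFileName());
    // CreateNew fails instead of overwriting if the name already exists.
    var stream = new FileStream(filePath, FileMode.CreateNew, FileAccess.ReadWrite, FileShare.None);
    return (stream, filePath);
}
```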


foreach (var tempPath in _tmpFiles.Keys)
{
try

nit: DisposeAsync just calls the synchronous Dispose(). If temp file cleanup could involve many files, consider making this truly async. At minimum, add GC.SuppressFinalize(this) in Dispose() per the standard dispose pattern.

/// Tests pushing with duplicate name fails.
/// </summary>
[Fact]
public async Task FileStore_File_Push_DuplicateName()

The CopyAsync / CopyNodeAsync helpers reimplement content-graph traversal and copy logic manually. Should these tests be exercising the library's own copy API (e.g., Extensions.CopyGraphAsync) instead? Reimplementing the logic in tests risks testing the test helper rather than the actual library behavior.
