feat(provisioner): support disk and replication BucketClass parameters via FilerConf #6
Conversation
📝 Walkthrough
Added bucket parameter validation and FilerConf integration to the provisioner: create/delete flows now read, update, and persist FilerConf PathConf entries when bucket params (disk, replication) are present; tests were updated to use an in-memory fake filer storing files and bucket entries.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Provisioner as Provisioner Server
    participant Filer as Filer Client
    participant FilerConf as FilerConf Object
    Client->>Provisioner: DriverCreateBucket(name, params)
    Provisioner->>Provisioner: validateBucketParams(params)
    alt invalid
        Provisioner-->>Client: InvalidArgument
    else valid
        Provisioner->>Filer: CreateEntry(bucket path)
        Provisioner->>Provisioner: needsFilerConf(params)
        alt needs FilerConf
            Provisioner->>Filer: readFilerConf()
            Filer-->>Provisioner: FilerConf
            Provisioner->>FilerConf: add/update PathConf(path, disk, replication)
            Provisioner->>Filer: saveFilerConf(updated)
        end
        Provisioner-->>Client: Success
    end
```
```mermaid
sequenceDiagram
    participant Client
    participant Provisioner as Provisioner Server
    participant Filer as Filer Client
    participant FilerConf as FilerConf Object
    Client->>Provisioner: DriverDeleteBucket(name)
    Provisioner->>Provisioner: deleteBucket(name)
    Provisioner->>Filer: DeleteEntry(bucket path)
    Provisioner->>Filer: readFilerConf()
    Filer-->>Provisioner: FilerConf
    Provisioner->>FilerConf: remove PathConf for bucket path (if present)
    Provisioner->>Filer: saveFilerConf(updated)
    Provisioner-->>Client: Success
```
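The two flows above can be sketched as a condensed in-memory model. The Go sketch below is illustrative only, not the PR's actual code: the `pathConf` struct, the map-based conf, and the `"disk"`/`"replication"` parameter keys are assumptions standing in for `filer_pb.FilerConf_PathConf` and the real filer RPCs.

```go
package main

import "fmt"

// pathConf mirrors the fields the PR persists per bucket prefix
// (illustrative stand-in for filer_pb.FilerConf_PathConf).
type pathConf struct {
	DiskType    string
	Replication string
}

// filerConf stands in for the filer's filer.conf contents, keyed by prefix.
type filerConf map[string]pathConf

// needsFilerConf reports whether any FilerConf-related bucket params are set,
// matching the helper described in the walkthrough.
func needsFilerConf(params map[string]string) bool {
	return params["disk"] != "" || params["replication"] != ""
}

// createBucket models DriverCreateBucket: create the entry, then do the
// read-modify-write on the conf only when parameters require it.
func createBucket(conf filerConf, buckets map[string]bool, name string, params map[string]string) {
	buckets[name] = true // CreateEntry(bucket path)
	if needsFilerConf(params) {
		conf["/buckets/"+name+"/"] = pathConf{ // add/update PathConf
			DiskType:    params["disk"],
			Replication: params["replication"],
		}
	}
}

// deleteBucket models DriverDeleteBucket: remove the entry, then drop the
// PathConf for the bucket prefix if present.
func deleteBucket(conf filerConf, buckets map[string]bool, name string) {
	delete(buckets, name)              // DeleteEntry(bucket path)
	delete(conf, "/buckets/"+name+"/") // remove PathConf (no-op if absent)
}

func main() {
	conf := filerConf{}
	buckets := map[string]bool{}
	createBucket(conf, buckets, "b1", map[string]string{"disk": "ssd", "replication": "001"})
	fmt.Println(len(conf), buckets["b1"]) // 1 true
	deleteBucket(conf, buckets, "b1")
	fmt.Println(len(conf), buckets["b1"]) // 0 false
}
```

Note that in this simplified model delete cleans up the PathConf unconditionally; the real flow reads and rewrites the shared conf file, which is where the atomicity concerns discussed below come in.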
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: 1 passed, 2 failed (1 warning, 1 inconclusive)
Summary of Changes
Hello @IvanHunters, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the SeaweedFS provisioner by enabling the configuration of per-bucket storage characteristics, specifically disk type and replication factor, directly through COSI BucketClass parameters. This allows users to define more granular storage policies for their buckets, improving resource utilization and data resilience. The changes include robust parameter validation and ensure that configuration entries are correctly managed throughout the bucket lifecycle, including automatic cleanup upon deletion.
Activity
Code Review
This pull request introduces support for disk type and replication factor parameters for COSI BucketClasses, which are configured via SeaweedFS FilerConf. The changes include adding parameter validation, logic to write and clean up PathConf entries in FilerConf during bucket creation and deletion, and an improved fake Filer for testing. The new tests cover various scenarios, including valid and invalid parameter combinations, and sequential bucket operations. Overall, the changes are well-structured and include good test coverage for the new functionality.
```go
// readFilerConf loads FilerConf from the filer, returning an empty conf if
// the file does not exist yet.
//
// TODO: the read-modify-write cycle (readFilerConf -> modify -> saveFilerConf)
// is not atomic. Concurrent bucket operations may overwrite each other's
// PathConf entries. This is the same limitation as the IAM read-modify-write
// in configureS3Access. Consider filer-side locking or optimistic concurrency
// if concurrent bucket creation becomes a concern.
```
The TODO comment correctly identifies a significant issue: the read-modify-write cycle for FilerConf is not atomic. This can lead to race conditions where concurrent bucket operations might overwrite each other's PathConf entries, resulting in data loss or incorrect configurations. This is a critical concern for data integrity in a multi-threaded or concurrent environment. It's important to address this with either filer-side locking, optimistic concurrency, or a different approach to ensure atomicity.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/driver/provisioner.go (1)
177-197: ⚠️ Potential issue | 🟠 Major
Handle "bucket not found" as non-fatal in delete retry to prevent stale PathConf.

If `saveFilerConf` fails on the first delete attempt, a retry will call `DeleteEntry` again on an already-deleted bucket, which will fail and return immediately without cleaning up the PathConf. Check the error from `DeleteEntry` and treat "not found" errors as non-fatal so cleanup can proceed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner.go` around lines 177 - 197, When retrying deletion, treat a "bucket not found" from DeleteEntry as non-fatal so subsequent PathConf cleanup still runs: in the closure around DeleteEntry, inspect the error returned by c.DeleteEntry(ctx, &filer_pb.DeleteEntryRequest{...}) and if it indicates the bucket is already deleted (NotFound/does-not-exist), swallow that error and continue to call readFilerConf, check fc.GetLocationConf(prefix), call fc.DeleteLocationConf(prefix) if present, and then saveFilerConf(ctx, c, fc); only return for other error kinds. Ensure you reference the existing symbols DeleteEntry, readFilerConf, GetLocationConf, DeleteLocationConf, saveFilerConf, filerBucketsPath and id when making the change.
🧹 Nitpick comments (1)
pkg/driver/provisioner.go (1)
322-323: Avoid string-matching errors for not-found detection.
`strings.Contains(err.Error(), "no entry is found")` is brittle. Please centralize this behind an `isNotFoundErr` helper using structured status handling where available, with fallback text matching only if required.

Also applies to: 352-354

🤖 Prompt for AI Agents
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner.go` around lines 322 - 323, Replace brittle string-matching checks of err.Error() (e.g., the checks inside the function around the block that returns fc when strings.Contains(err.Error(), "no entry is found") and the similar check at lines ~352-354) with a centralized helper isNotFoundErr(error) that encapsulates the detection logic; implement isNotFoundErr to first test well-known structured conditions (errors.Is(err, sql.ErrNoRows) or other sentinel errors, gRPC/status.Code(err) == codes.NotFound, or Kubernetes metav1.IsNotFound where applicable) and only fall back to a conservative substring match if none apply, then call isNotFoundErr(...) in place of the two existing string.Contains checks.
```go
	fc, err := readFilerConf(ctx, c)
	if err != nil {
		return err
	}
	if err := fc.SetLocationConf(&filer_pb.FilerConf_PathConf{
		LocationPrefix: s.filerBucketsPath + "/" + name + "/",
		DiskType:       params[paramDisk],
		Replication:    params[paramReplication],
	}); err != nil {
		return fmt.Errorf("set location conf: %w", err)
	}
	return saveFilerConf(ctx, c, fc)
```
Handle partial-create failures to prevent bucket/FilerConf drift.
If CreateEntry succeeds and saveFilerConf fails, the RPC returns an error but the bucket remains created. A retry can then fail with "already exists" while still missing the intended PathConf.
🛠️ Suggested compensation pattern

```diff
@@
 	if err := fc.SetLocationConf(&filer_pb.FilerConf_PathConf{
 		LocationPrefix: s.filerBucketsPath + "/" + name + "/",
 		DiskType:       params[paramDisk],
 		Replication:    params[paramReplication],
 	}); err != nil {
 		return fmt.Errorf("set location conf: %w", err)
 	}
-	return saveFilerConf(ctx, c, fc)
+	if err := saveFilerConf(ctx, c, fc); err != nil {
+		// Best-effort rollback to avoid returning an error with a partially-created bucket.
+		_, _ = c.DeleteEntry(ctx, &filer_pb.DeleteEntryRequest{
+			Directory:            s.filerBucketsPath,
+			Name:                 name,
+			IsDeleteData:         true,
+			IsRecursive:          true,
+			IgnoreRecursiveError: true,
+		})
+		return fmt.Errorf("save filer conf: %w", err)
+	}
+	return nil
 	})
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/driver/provisioner.go` around lines 160 - 171, The code can leave a
created bucket/entry when saveFilerConf fails after a successful CreateEntry;
update the function to perform a compensation (rollback) when saveFilerConf(ctx,
c, fc) returns an error: after the error from saveFilerConf, attempt to delete
the created entry (call the same creation counterpart, e.g., DeleteEntry or
RemoveEntry for the bucket at s.filerBucketsPath+"/"+name+"/" or use the
existing CreateEntry caller context), ignore "not found" during cleanup, capture
any cleanup error and wrap or log it alongside the original save error, and
finally return the original save error (augmented with cleanup info) so the
system does not drift between the bucket and FilerConf; reference readFilerConf,
SetLocationConf, saveFilerConf and the CreateEntry that precedes this block to
locate where to add the rollback.
```go
// TODO: the read-modify-write cycle (readFilerConf -> modify -> saveFilerConf)
// is not atomic. Concurrent bucket operations may overwrite each other's
// PathConf entries. This is the same limitation as the IAM read-modify-write
// in configureS3Access. Consider filer-side locking or optimistic concurrency
// if concurrent bucket creation becomes a concern.
func readFilerConf(ctx context.Context, c filer_pb.SeaweedFilerClient) (*filer.FilerConf, error) {
```
Known non-atomic RMW can drop concurrent PathConf updates.
The TODO is accurate: concurrent bucket operations can overwrite each other’s FilerConf mutations. This is a data-loss class issue and should be addressed before relying on this in parallel provisioning workloads.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/driver/provisioner.go` around lines 310 - 315, The read-modify-write on
FilerConf in readFilerConf (and paired saveFilerConf/modify code paths used by
configureS3Access) is non-atomic and can drop concurrent PathConf updates; make
this robust by adding either filer-side locking or optimistic concurrency:
implement a revision/checksum field read from filer.FilerConf and verify it
before writing (retry on mismatch), or acquire a named lock via the filer (or a
distributed lock) around the modify+save sequence; update
readFilerConf/saveFilerConf call sites (including configureS3Access) to use the
chosen locking/retry strategy and surface retries/errors so concurrent bucket
creations do not overwrite each other’s PathConf entries.
pkg/driver/provisioner.go (Outdated)
```go
// See https://github.com/seaweedfs/seaweedfs/wiki/Replication
var validReplications = map[string]bool{
	"000": true, // no replication
	"001": true, // replicate once on same rack
	"010": true, // replicate once on different rack, same DC
	"100": true, // replicate once on different DC
	"200": true, // replicate twice on two other DCs
	"110": true, // replicate once on different rack + once on different DC
}
```
Is there any reason to hardcode these specific values? A user might want to specify something like 003 or 011, which is supported by SeaweedFS but will be refused by the driver.
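One way to relax the check without dropping validation entirely would be a shape-only test that accepts any three-digit code and leaves semantic validation to SeaweedFS itself, which knows which digit combinations its topology can satisfy. This is an illustrative sketch, not the PR's code:

```go
package main

import (
	"fmt"
	"regexp"
)

// replicationShape checks only the xyz three-digit form of a SeaweedFS
// replication code; whether a given combination is achievable is left to
// the filer to decide. Illustrative only: the PR's validator instead
// hardcodes six allowed codes.
var replicationShape = regexp.MustCompile(`^[0-9]{3}$`)

func validReplicationShape(v string) bool {
	return v == "" || replicationShape.MatchString(v) // empty means "not set"
}

func main() {
	for _, v := range []string{"003", "011", "110", "abc", "00"} {
		fmt.Printf("%s -> %v\n", v, validReplicationShape(v))
	}
}
```

This would accept 003 and 011 while still rejecting obviously malformed input early, giving the user a clearer error than a deep filer failure.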
pkg/driver/provisioner.go (Outdated)
```go
func validateBucketParams(params map[string]string) error {
	if v := params[paramDisk]; v != "" {
		switch v {
		case "hdd", "ssd":
```
Does SeaweedFS support only hdd and ssd disks, or can this value be customized to define other types?
cc @chrislusf
9aa2cac to 7bbcd22
♻️ Duplicate comments (4)
pkg/driver/provisioner.go (3)
335-340: ⚠️ Potential issue | 🟠 Major
Non-atomic read-modify-write on FilerConf can drop concurrent PathConf entries (existing concern).
The TODO is accurate. Concurrent bucket creates/deletes sharing the same FilerConf file will race on the RMW cycle, with the last writer silently overwriting earlier PathConf entries.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner.go` around lines 335 - 340, The read-modify-write cycle around FilerConf (functions readFilerConf and the configureS3Access path that mutates PathConf) is non-atomic and can lose concurrent PathConf entries; to fix, implement an optimistic concurrency or filer-side locking approach: when reading FilerConf via readFilerConf, capture a version/ETag or last-modified marker (or obtain a filer lock token), then before saving verify the version/token hasn’t changed and retry the read-modify-write if it has (or acquire/release a filer lock around modify+save); update the save path to return a conflict error on version mismatch and add a bounded retry loop in the caller (configureS3Access or any bucket create/delete code) to merge concurrent changes rather than overwrite PathConf entries.
175-192: ⚠️ Potential issue | 🟠 Major
Partial-create failure leaves bucket/FilerConf in a drifted state (existing concern).

If `CreateEntry` succeeds but `saveFilerConf` fails, the bucket exists without its PathConf, and a retry will receive "already exists" without ever applying the intended configuration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner.go` around lines 175 - 192, The code can leave a bucket without its PathConf if CreateEntry succeeds but saveFilerConf fails; to fix it, make the operation atomic by adding a rollback on saveFilerConf error: after calling saveFilerConf(ctx, c, fc) fails, attempt to delete the newly created bucket (e.g., call the provisioner/client bucket deletion method used elsewhere such as DeleteEntry or equivalent on the same client c with the bucket name) and return a wrapped error that includes both the save error and any deletion error; update the block around needsFilerConf, readFilerConf, fc.SetLocationConf and saveFilerConf to perform this cleanup so retries don’t hit “already exists” without PathConf.
65-95: ⚠️ Potential issue | 🟠 Major
Overly restrictive disk type and replication validation will reject valid SeaweedFS configurations.

`validReplications` allows only 6 codes, silently rejecting `003`, `011`, `101`, `020`, and others documented or used in SeaweedFS. Similarly, the disk type switch (`"hdd"`, `"ssd"`) conflicts with SeaweedFS's free-format disk tag: any custom storage tier label (e.g., `"nvme"`, `"warm"`) will be rejected with `InvalidArgument`.

The safest fix here is to drop both validations entirely (pass values through to SeaweedFS, which will error if invalid), or at minimum document the limitation prominently and treat unrecognised values as a pass-through warning rather than a hard reject.
💡 Proposed mitigation: pass through and let SeaweedFS validate

```diff
-var validReplications = map[string]bool{
-	"000": true,
-	"001": true,
-	"010": true,
-	"100": true,
-	"200": true,
-	"110": true,
-}
-
 // needsFilerConf reports whether any FilerConf-related parameters are set.
 func needsFilerConf(params map[string]string) bool {
 	return params[paramDisk] != "" || params[paramReplication] != ""
 }
 
-// validateBucketParams checks that disk and replication values are valid
-// SeaweedFS options.
-func validateBucketParams(params map[string]string) error {
-	if v := params[paramDisk]; v != "" {
-		switch v {
-		case "hdd", "ssd":
-		default:
-			return fmt.Errorf("invalid disk type %q: must be %q or %q", v, "hdd", "ssd")
-		}
-	}
-	if v := params[paramReplication]; v != "" {
-		if !validReplications[v] {
-			return fmt.Errorf("invalid replication %q: must be one of 000, 001, 010, 100, 200, 110", v)
-		}
-	}
-	return nil
-}
+// validateBucketParams is intentionally a no-op: SeaweedFS treats both disk
+// type and replication as free-format strings, so validation is delegated to
+// the filer.
+func validateBucketParams(_ map[string]string) error { return nil }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner.go` around lines 65 - 95, The current validateBucketParams function and validReplications map enforce overly restrictive checks for paramDisk and paramReplication; remove the hard-coded validation so these values are passed through to SeaweedFS. Specifically, delete the validReplications map and the switch/check blocks inside validateBucketParams that reference paramDisk and paramReplication (leave needsFilerConf as-is), and make validateBucketParams a no-op that just returns nil so custom disk tags (e.g., "nvme") and extended replication codes are allowed to reach SeaweedFS for authoritative validation. Ensure you remove any now-unused symbols to keep the package compiling.

pkg/driver/provisioner_test.go (1)
175-179: ⚠️ Potential issue | 🟡 Minor
The `wantErr` branch does not assert `codes.InvalidArgument` (existing concern). A wrong status code (e.g., `codes.Internal`) passes the test undetected, so the RPC contract for invalid params is not actually verified.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner_test.go` around lines 175 - 179, The test's wantErr branch currently only checks that an error exists but doesn't assert the gRPC status code; update the branch in the test (the block handling c.wantErr in provisioner_test.go) to extract the gRPC status from err using status.FromError (or status.Code) and assert the code equals codes.InvalidArgument, failing the test with a clear message if the error is nil, not a gRPC status, or has a different code; reference codes.InvalidArgument and status.FromError/status.Code to locate and implement the assertion.
🧹 Nitpick comments (3)
pkg/driver/provisioner_test.go (2)
80-83: `DeleteEntry` does not remove the entry from the `buckets` map.

Post-delete, `f.buckets[name]` still holds the stale entry. `bucketExtended` and future assertions could return false-positive data for a deleted bucket.

♻️ Keep `buckets` map in sync

```diff
 func (f *fakeFiler) DeleteEntry(_ context.Context, in *filer_pb.DeleteEntryRequest) (*filer_pb.DeleteEntryResponse, error) {
 	delete(f.files, f.fileKey(in.Directory, in.Name))
+	if in.Directory == "/buckets" {
+		delete(f.buckets, in.Name)
+	}
 	return &filer_pb.DeleteEntryResponse{}, nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner_test.go` around lines 80 - 83, The DeleteEntry method in fakeFiler currently removes the entry from f.files but leaves f.buckets[name] populated; update DeleteEntry (in pkg/driver/provisioner_test.go) to also remove the corresponding entry from the buckets map (e.g., delete(f.buckets, in.Name) or the appropriate bucket key) after deleting from f.files so fakeFiler's state stays consistent for bucketExtended and subsequent assertions; ensure you reference the fakeFiler struct's buckets field and use the same keying scheme as fileKey/in.Name when removing the bucket entry.
46-48: Nil-map panic if `fakeFiler` is instantiated directly (without `newProv`).

`f.files` is written unconditionally inside `CreateEntry`. `newProv` always initialises it, but a bare `&fakeFiler{}` in a future test would silently panic. Consider initialising lazily or adding a constructor.

♻️ Lazy initialisation guard

```diff
 func (f *fakeFiler) CreateEntry(_ context.Context, in *filer_pb.CreateEntryRequest) (*filer_pb.CreateEntryResponse, error) {
+	if f.files == nil {
+		f.files = make(map[string][]byte)
+	}
 	if in.Entry.Content != nil {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner_test.go` around lines 46 - 48, The CreateEntry method on fakeFiler writes to the f.files map unconditionally which can nil-panic if a test instantiates &fakeFiler{} directly; add a lazy-initialisation guard at the start of CreateEntry (and any other mutating methods) to ensure f.files is non-nil (e.g. if f.files == nil { f.files = make(map[string][]byte) }) or provide/require a constructor (newProv or NewFakeFiler) that always initializes files; update CreateEntry (and use the existing fileKey/in.Directory/in.Entry.Name logic) so writes are safe even when fakeFiler is created without newProv.

pkg/driver/provisioner.go (1)
347-349: Use gRPC status code inspection instead of `strings.Contains` for "not found" detection.

`strings.Contains(err.Error(), "no entry is found")` is brittle: any wording change in SeaweedFS breaks the sentinel silently. Prefer `status.Code(err) == codes.NotFound` (or a helper wrapping it), consistent with how gRPC errors should be inspected.

♻️ Proposed refactor for both helpers

```diff
+import "google.golang.org/grpc/codes"
+import "google.golang.org/grpc/status"
 
 func readFilerConf(ctx context.Context, c filer_pb.SeaweedFilerClient) (*filer.FilerConf, error) {
 	fc := filer.NewFilerConf()
 	resp, err := c.LookupDirectoryEntry(ctx, &filer_pb.LookupDirectoryEntryRequest{
 		Directory: filer.DirectoryEtcSeaweedFS,
 		Name:      filer.FilerConfName,
 	})
 	if err != nil {
-		if strings.Contains(err.Error(), "no entry is found") {
+		if status.Code(err) == codes.NotFound || strings.Contains(err.Error(), "no entry is found") {
 			return fc, nil
 		}
 		return nil, fmt.Errorf("read filer conf: %w", err)
 	}
 	...
 }
 
 func saveFilerConf(ctx context.Context, c filer_pb.SeaweedFilerClient, fc *filer.FilerConf) error {
 	...
 	_, err := c.UpdateEntry(...)
-	if err == nil || !strings.Contains(err.Error(), "no entry is found") {
+	if err == nil || (status.Code(err) != codes.NotFound && !strings.Contains(err.Error(), "no entry is found")) {
 		return err
 	}
 	...
 }
```

(The `strings.Contains` fallback can be removed once the gRPC status code path is confirmed correct against the real filer.)

Also applies to: 377-379
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner.go` around lines 347 - 349, Replace the brittle substring check strings.Contains(err.Error(), "no entry is found") in pkg/driver/provisioner.go with a proper gRPC status check using status.Code(err) == codes.NotFound; add imports for "google.golang.org/grpc/status" and "google.golang.org/grpc/codes" and apply the same change to the other occurrence around the helper that returns fc (the same "not found" handling used at both sites), removing the substring fallback once confirmed against the real filer so the code consistently detects NotFound via gRPC status codes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/driver/provisioner_test.go`:
- Around line 175-179: The test's wantErr branch currently only checks that an
error exists but doesn't assert the gRPC status code; update the branch in the
test (the block handling c.wantErr in provisioner_test.go) to extract the gRPC
status from err using status.FromError (or status.Code) and assert the code
equals codes.InvalidArgument, failing the test with a clear message if the error
is nil, not a gRPC status, or has a different code; reference
codes.InvalidArgument and status.FromError/status.Code to locate and implement
the assertion.
In `@pkg/driver/provisioner.go`:
- Around line 335-340: The read-modify-write cycle around FilerConf (functions
readFilerConf and the configureS3Access path that mutates PathConf) is
non-atomic and can lose concurrent PathConf entries; to fix, implement an
optimistic concurrency or filer-side locking approach: when reading FilerConf
via readFilerConf, capture a version/ETag or last-modified marker (or obtain a
filer lock token), then before saving verify the version/token hasn’t changed
and retry the read-modify-write if it has (or acquire/release a filer lock
around modify+save); update the save path to return a conflict error on version
mismatch and add a bounded retry loop in the caller (configureS3Access or any
bucket create/delete code) to merge concurrent changes rather than overwrite
PathConf entries.
- Around line 175-192: The code can leave a bucket without its PathConf if
CreateEntry succeeds but saveFilerConf fails; to fix it, make the operation
atomic by adding a rollback on saveFilerConf error: after calling
saveFilerConf(ctx, c, fc) fails, attempt to delete the newly created bucket
(e.g., call the provisioner/client bucket deletion method used elsewhere such as
DeleteEntry or equivalent on the same client c with the bucket name) and return
a wrapped error that includes both the save error and any deletion error; update
the block around needsFilerConf, readFilerConf, fc.SetLocationConf and
saveFilerConf to perform this cleanup so retries don’t hit “already exists”
without PathConf.
- Around line 65-95: The current validateBucketParams function and
validReplications map enforce overly restrictive checks for paramDisk and
paramReplication; remove the hard-coded validation so these values are passed
through to SeaweedFS. Specifically, delete the validReplications map and the
switch/check blocks inside validateBucketParams that reference paramDisk and
paramReplication (leave needsFilerConf as-is), and make validateBucketParams a
no-op that just returns nil so custom disk tags (e.g., "nvme") and extended
replication codes are allowed to reach SeaweedFS for authoritative validation.
Ensure you remove any now-unused symbols to keep the package compiling.
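The optimistic-concurrency loop suggested for the FilerConf read-modify-write can be sketched with a toy in-memory store. The `Conf`, `Store`, and `SaveIfVersion` names below are hypothetical stand-ins: the real filer exposes no version/CAS primitive, which is exactly the gap the comment flags.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Conf is a hypothetical stand-in for FilerConf: a version plus path entries.
type Conf struct {
	Version int
	Paths   map[string]string
}

// Store mimics a filer that supports compare-and-swap on the conf version.
type Store struct {
	mu   sync.Mutex
	conf Conf
}

var errConflict = errors.New("version conflict")

// Read returns a deep copy so callers can mutate freely before saving.
func (s *Store) Read() Conf {
	s.mu.Lock()
	defer s.mu.Unlock()
	cp := Conf{Version: s.conf.Version, Paths: map[string]string{}}
	for k, v := range s.conf.Paths {
		cp.Paths[k] = v
	}
	return cp
}

// SaveIfVersion writes only when the stored version still matches.
func (s *Store) SaveIfVersion(c Conf) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if c.Version != s.conf.Version {
		return errConflict
	}
	c.Version++
	s.conf = c
	return nil
}

// setPath is the bounded retry loop: read, modify, save, retry on conflict.
func setPath(s *Store, prefix, disk string) error {
	const maxRetries = 3
	for i := 0; i < maxRetries; i++ {
		c := s.Read()
		c.Paths[prefix] = disk
		err := s.SaveIfVersion(c)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
	}
	return fmt.Errorf("gave up after %d conflicts", maxRetries)
}

func main() {
	s := &Store{conf: Conf{Paths: map[string]string{}}}
	_ = setPath(s, "/buckets/a/", "ssd")
	_ = setPath(s, "/buckets/b/", "hdd")
	fmt.Println(len(s.Read().Paths)) // both PathConf entries survive
}
```

Because each save re-reads the latest conf on conflict, two concurrent bucket creates merge their entries instead of the last writer silently overwriting the first.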
---
Nitpick comments:
In `@pkg/driver/provisioner_test.go`:
- Around line 80-83: The DeleteEntry method in fakeFiler currently removes the
entry from f.files but leaves f.buckets[name] populated; update DeleteEntry (in
pkg/driver/provisioner_test.go) to also remove the corresponding entry from the
buckets map (e.g., delete(f.buckets, in.Name) or the appropriate bucket key)
after deleting from f.files so fakeFiler's state stays consistent for
bucketExtended and subsequent assertions; ensure you reference the fakeFiler
struct's buckets field and use the same keying scheme as fileKey/in.Name when
removing the bucket entry.
- Around line 46-48: The CreateEntry method on fakeFiler writes to the f.files
map unconditionally which can nil-panics if a test instantiates &fakeFiler{}
directly; add a lazy-initialisation guard at the start of CreateEntry (and any
other mutating methods) to ensure f.files is non-nil (e.g. if f.files == nil {
f.files = make(map[string][]byte) }) or provide/require a constructor (newProv
or NewFakeFiler) that always initializes files; update CreateEntry (and use the
existing fileKey/in.Directory/in.Entry.Name logic) so writes are safe even when
fakeFiler is created without newProv.
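The lazy-initialisation guard the comment asks for is a standard Go pattern for map-carrying test fakes. The struct below is a simplified stand-in for fakeFiler, not the actual test type:

```go
package main

import "fmt"

// fake is a simplified stand-in for the test's fakeFiler: zero-value safe.
type fake struct {
	files map[string][]byte
}

// put guards against a nil map so &fake{} works without a constructor.
func (f *fake) put(key string, data []byte) {
	if f.files == nil {
		f.files = make(map[string][]byte)
	}
	f.files[key] = data
}

func main() {
	f := &fake{} // constructed directly, files starts nil
	f.put("/buckets/demo/.conf", []byte("x"))
	fmt.Println(len(f.files)) // write succeeded, no nil-map panic
}
```

Reads from a nil map are already safe in Go; only writes panic, so the guard is needed only in mutating methods.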
In `@pkg/driver/provisioner.go`:
- Around line 347-349: Replace the brittle substring check
strings.Contains(err.Error(), "no entry is found") in pkg/driver/provisioner.go
with a proper gRPC status check using status.Code(err) == codes.NotFound; add
imports for "google.golang.org/grpc/status" and "google.golang.org/grpc/codes"
and apply the same change to the other occurrence around the helper that returns
fc (the same "not found" handling used at both sites), removing the substring
fallback once confirmed against the real filer so the code consistently detects
NotFound via gRPC status codes.
Actionable comments posted: 2
♻️ Duplicate comments (3)
pkg/driver/provisioner.go (2)
323-327: ⚠️ Potential issue | 🟠 Major
The non-atomic FilerConf read-modify-write remains a data-loss risk under concurrency.
Concurrent create/delete operations can overwrite each other’s PathConf updates.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner.go` around lines 323 - 327, The current read-modify-write cycle using readFilerConf -> modify -> saveFilerConf can cause lost updates to PathConf under concurrent bucket operations; update the provisioner to make updates atomic by implementing either a filer-side locking mechanism (acquire/release lock around readFilerConf/saveFilerConf) or an optimistic concurrency loop that reads the filer version, applies changes to PathConf, and retries saveFilerConf when a version conflict is detected; ensure the change is applied where PathConf is mutated (the code paths invoking readFilerConf and saveFilerConf and configureS3Access) and surface errors when retries/locks fail so callers can handle contention.
173-180: ⚠️ Potential issue | 🟠 Major
Handle saveFilerConf failure with rollback to avoid bucket/FilerConf drift.
If bucket creation succeeds and saveFilerConf fails, the RPC returns an error but leaves the bucket created without PathConf.

Compensation pattern

```diff
 if err := fc.SetLocationConf(&filer_pb.FilerConf_PathConf{
 	LocationPrefix: s.filerBucketsPath + "/" + name + "/",
 	DiskType:       params[paramDisk],
 	Replication:    params[paramReplication],
 }); err != nil {
 	return fmt.Errorf("set location conf: %w", err)
 }
-return saveFilerConf(ctx, c, fc)
+if err := saveFilerConf(ctx, c, fc); err != nil {
+	_, _ = c.DeleteEntry(ctx, &filer_pb.DeleteEntryRequest{
+		Directory:            s.filerBucketsPath,
+		Name:                 name,
+		IsDeleteData:         true,
+		IsRecursive:          true,
+		IgnoreRecursiveError: true,
+	})
+	return fmt.Errorf("save filer conf: %w", err)
+}
+return nil
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner.go` around lines 173 - 180, If saveFilerConf(ctx, c, fc) fails after creating the bucket and setting PathConf, perform a rollback to delete the newly created bucket to avoid bucket/FilerConf drift: call the appropriate bucket-delete method on the same client/context (use the same name used for LocationPrefix) inside the error branch, log any delete error but return the original saveFilerConf error (wrap with context). Update the block around SetLocationConf(...) and saveFilerConf(...) to attempt deletion of s.filerBucketsPath + "/" + name (or the home's bucket API) on failure, ensure you handle and surface both the original save error and any cleanup error without leaving resources inconsistent.

pkg/driver/provisioner_test.go (1)
189-193: ⚠️ Potential issue | 🟡 Minor
Assert the gRPC status code for invalid-parameter cases.
These branches still only assert err != nil; they should also verify codes.InvalidArgument to enforce the RPC contract.

Test hardening patch

```diff
 import (
 	"context"
 	"fmt"
 	"net"
 	"reflect"
 	"testing"

 	"github.com/seaweedfs/seaweedfs/weed/filer"
 	"github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
 	"github.com/seaweedfs/seaweedfs/weed/s3api/s3_constants"
 	"google.golang.org/grpc"
+	"google.golang.org/grpc/codes"
 	"google.golang.org/grpc/credentials/insecure"
+	"google.golang.org/grpc/status"
 	cosispec "sigs.k8s.io/container-object-storage-interface-spec"
 )
@@
 if c.wantErr {
 	if err == nil {
 		t.Fatal("expected error, got nil")
 	}
+	if got := status.Code(err); got != codes.InvalidArgument {
+		t.Fatalf("code=%v want=%v err=%v", got, codes.InvalidArgument, err)
+	}
 	return
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner_test.go` around lines 189 - 193, When c.wantErr is true the test currently only checks err != nil; update the failure branch that uses c.wantErr (the block referencing c.wantErr and err) to also assert the gRPC status code is codes.InvalidArgument by calling status.Code(err) (or status.FromError) and comparing to codes.InvalidArgument. Import the grpc status/codes packages ("google.golang.org/grpc/status" and "google.golang.org/grpc/codes") if not already present and replace the simple nil-check in the failing branch with a check that err is non-nil and status.Code(err) == codes.InvalidArgument, failing the test with a descriptive message if either condition is not met.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/driver/provisioner_test.go`:
- Around line 38-42: The test failure comes from references to ff.iam after you
removed that field in the fakeFiler refactor; either restore a compatible iam
field on the fakeFiler struct or update the remaining test code to use the new
storage (ff.files). To fix: locate the fakeFiler type and either add back an iam
field with the same shape the tests expect (e.g., iam map[string]<original IAM
type> to preserve existing ff.iam usages) or change all remaining references to
ff.iam (in the tests that fail) to read/write from ff.files[...] using the
appropriate key and value conversion; target symbols: fakeFiler and ff.iam (or
calls that reference ff.iam in the failing tests).
In `@pkg/driver/provisioner.go`:
- Around line 193-205: The delete path currently returns early on the initial
delete error, preventing PathConf cleanup; change the flow around the delete
call so you capture the delete error instead of returning immediately, continue
to call readFilerConf, check fc.GetLocationConf(prefix), call
fc.DeleteLocationConf(prefix) and saveFilerConf(ctx, c, fc), and finally return
the original delete error (or nil) after attempting the PathConf cleanup;
reference the delete call's err variable plus readFilerConf, GetLocationConf,
DeleteLocationConf, and saveFilerConf to locate and update the logic.
---
Duplicate comments:
In `@pkg/driver/provisioner_test.go`:
- Around line 189-193: When c.wantErr is true the test currently only checks err
!= nil; update the failure branch that uses c.wantErr (the block referencing
c.wantErr and err) to also assert the gRPC status code is codes.InvalidArgument
by calling status.Code(err) (or status.FromError) and comparing to
codes.InvalidArgument. Import the grpc status/codes packages
("google.golang.org/grpc/status" and "google.golang.org/grpc/codes") if not
already present and replace the simple nil-check in the failing branch with a
check that err is non-nil and status.Code(err) == codes.InvalidArgument, failing
the test with a descriptive message if either condition is not met.
In `@pkg/driver/provisioner.go`:
- Around line 323-327: The current read-modify-write cycle using readFilerConf
-> modify -> saveFilerConf can cause lost updates to PathConf under concurrent
bucket operations; update the provisioner to make updates atomic by implementing
either a filer-side locking mechanism (acquire/release lock around
readFilerConf/saveFilerConf) or an optimistic concurrency loop that reads the
filer version, applies changes to PathConf, and retries saveFilerConf when a
version conflict is detected; ensure the change is applied where PathConf is
mutated (the code paths invoking readFilerConf and saveFilerConf and
configureS3Access) and surface errors when retries/locks fail so callers can
handle contention.
- Around line 173-180: If saveFilerConf(ctx, c, fc) fails after creating the
bucket and setting PathConf, perform a rollback to delete the newly created
bucket to avoid bucket/FilerConf drift: call the appropriate bucket-delete
method on the same client/context (use the same name used for LocationPrefix)
inside the error branch, log any delete error but return the original
saveFilerConf error (wrap with context). Update the block around
SetLocationConf(...) and saveFilerConf(...) to attempt deletion of
s.filerBucketsPath + "/" + name (or the home's bucket API) on failure, ensure
you handle and surface both the original save error and any cleanup error
without leaving resources inconsistent.
```go
if err != nil {
	return err
}
fc, err := readFilerConf(ctx, c)
if err != nil {
	return err
}
prefix := s.filerBucketsPath + "/" + id + "/"
if _, found := fc.GetLocationConf(prefix); !found {
	return nil
}
fc.DeleteLocationConf(prefix)
return saveFilerConf(ctx, c, fc)
```
Make delete flow idempotent so PathConf cleanup still runs when bucket is already missing.
Current early return on delete error prevents stale PathConf cleanup during retries/partial failures.
Suggested fix

```diff
 _, err := c.DeleteEntry(ctx, &filer_pb.DeleteEntryRequest{
 	Directory:            s.filerBucketsPath,
 	Name:                 id,
 	IsDeleteData:         true,
 	IsRecursive:          true,
 	IgnoreRecursiveError: true,
 })
-if err != nil {
+if err != nil && !strings.Contains(err.Error(), "no entry is found") {
 	return err
 }
 fc, err := readFilerConf(ctx, c)
 if err != nil {
 	return err
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
if err != nil && !strings.Contains(err.Error(), "no entry is found") {
	return err
}
fc, err := readFilerConf(ctx, c)
if err != nil {
	return err
}
prefix := s.filerBucketsPath + "/" + id + "/"
if _, found := fc.GetLocationConf(prefix); !found {
	return nil
}
fc.DeleteLocationConf(prefix)
return saveFilerConf(ctx, c, fc)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/driver/provisioner.go` around lines 193 - 205, The delete path currently
returns early on the initial delete error, preventing PathConf cleanup; change
the flow around the delete call so you capture the delete error instead of
returning immediately, continue to call readFilerConf, check
fc.GetLocationConf(prefix), call fc.DeleteLocationConf(prefix) and
saveFilerConf(ctx, c, fc), and finally return the original delete error (or nil)
after attempting the PathConf cleanup; reference the delete call's err variable
plus readFilerConf, GetLocationConf, DeleteLocationConf, and saveFilerConf to
locate and update the logic.
…s via FilerConf

Configure per-bucket storage options (disk type and replication factor) through COSI BucketClass parameters by writing PathConf entries to SeaweedFS FilerConf. Validate parameter values at the RPC boundary. Clean up PathConf entries when buckets are deleted.

Signed-off-by: ohotnikov.ivan <ohotnikov.ivan@e-queo.net>
disk is a free-format tag in SeaweedFS (confirmed by chrislusf), so remove the hdd/ssd restriction. Replication is validated as exactly 3 digits instead of a hardcoded set of known values. Also fix fakeFiler compatibility with accessPolicy tests from main. Signed-off-by: ohotnikov.ivan <ohotnikov.ivan@e-queo.net>
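The "exactly 3 digits" replication check described in the commit can be done with a small regexp. The function name, the empty-string handling, and the exact pattern below are assumptions matching the commit's description, not code copied from the PR; SeaweedFS replication strings such as 001 or 010 encode copy counts per placement level.

```go
package main

import (
	"fmt"
	"regexp"
)

// replicationRe accepts exactly three digits, e.g. "000", "001", "010",
// without hard-coding a list of known values (pattern is an assumption).
var replicationRe = regexp.MustCompile(`^[0-9]{3}$`)

// validReplication treats an empty value as "parameter not set".
func validReplication(s string) bool {
	return s == "" || replicationRe.MatchString(s)
}

func main() {
	for _, s := range []string{"", "010", "200", "10", "abc"} {
		fmt.Printf("%q -> %v\n", s, validReplication(s))
	}
}
```

This keeps validation loose enough for extended codes like "200" while still rejecting clearly malformed input at the RPC boundary.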
6a47b8f to fc6b592
🧹 Nitpick comments (1)
pkg/driver/provisioner_test.go (1)
82-84: fakeFiler's DeleteEntry doesn't remove the bucket from the buckets map.
DeleteEntry deletes from files but not from buckets. While this doesn't affect current tests, it could cause confusion if future tests check bucket existence after deletion.

♻️ Suggested fix

```diff
 func (f *fakeFiler) DeleteEntry(_ context.Context, in *filer_pb.DeleteEntryRequest) (*filer_pb.DeleteEntryResponse, error) {
 	delete(f.files, f.fileKey(in.Directory, in.Name))
+	if in.Directory == "/buckets" {
+		delete(f.buckets, in.Name)
+	}
 	return &filer_pb.DeleteEntryResponse{}, nil
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/driver/provisioner_test.go` around lines 82 - 84, fakeFiler's DeleteEntry currently removes the entry from f.files but never updates f.buckets, leaving stale bucket entries; update the DeleteEntry method (fakeFiler.DeleteEntry) to also remove the corresponding bucket from f.buckets when the deleted file was the last entry in that bucket — use the same key logic as fileKey (or an existing bucketKey helper) to identify the bucket for in.Directory/in.Name, delete the file from f.files as now, then check if any remaining f.files entries share that bucket key and if none remain remove the bucket from f.buckets before returning the DeleteEntryResponse.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/driver/provisioner_test.go`:
- Around line 82-84: fakeFiler's DeleteEntry currently removes the entry from
f.files but never updates f.buckets, leaving stale bucket entries; update the
DeleteEntry method (fakeFiler.DeleteEntry) to also remove the corresponding
bucket from f.buckets when the deleted file was the last entry in that bucket —
use the same key logic as fileKey (or an existing bucketKey helper) to identify
the bucket for in.Directory/in.Name, delete the file from f.files as now, then
check if any remaining f.files entries share that bucket key and if none remain
remove the bucket from f.buckets before returning the DeleteEntryResponse.
- Add lazy init for fakeFiler.files map to prevent nil-map panic
- Delete from fakeFiler.buckets map in DeleteEntry for consistency
- Replace wantErr bool with wantCode codes.Code to verify the exact gRPC status code (e.g. InvalidArgument) instead of just err != nil

Signed-off-by: ohotnikov.ivan <ohotnikov.ivan@e-queo.net>
Summary
- Configure disk type (hdd/ssd) and replication factor through COSI BucketClass parameters by writing PathConf entries to SeaweedFS FilerConf
- Invalid parameter values are rejected with InvalidArgument status

Test plan
- TestDriverCreateBucket: no params, disk only, replication only, both, invalid values
- TestDriverCreateBucketSequentialPreservesAll: two sequential creates preserve both PathConf entries
- TestDriverDeleteBucket: PathConf removed on delete, no spurious FilerConf for plain buckets
- TestDriverGrantBucketAccess and TestDriverRevokeBucketAccess still pass
- go vet ./pkg/driver/ clean

Summary by CodeRabbit
New Features
Bug Fixes
Tests