-
Notifications
You must be signed in to change notification settings - Fork 3.8k
fix: improve snapshot API validation and restore job tracking #47096
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
issue: milvus-io#47066, milvus-io#47067, milvus-io#47068 related: milvus-io#44358 - Add standard naming validation for snapshot name in all snapshot tasks (create, drop, describe, restore) using ValidateSnapshotName - Fix list_snapshots() to support empty collection_name for listing all snapshots by setting collectionID to 0 - Add start_time field to RestoreSnapshotInfo proto and populate it from job.GetStartTs() in buildRestoreInfo - Add target collection name validation in restoreSnapshotTask - Add comprehensive unit tests for snapshot name validation Signed-off-by: Wei Liu <[email protected]>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: weiliu1031 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Summary of ChangesHello @weiliu1031, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the robustness and usability of Milvus's snapshot management APIs. It addresses several issues by improving validation logic for snapshot and collection names, ensuring consistency across the system. Additionally, it refines the tracking of restore jobs by accurately reporting their start times, providing users with more complete operational insights. These changes collectively lead to a more reliable and predictable snapshot experience. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
[ci-v2-notice] To rerun ci-v2 checks, comment with:
If you have any questions or requests, please contact @zhikunyao. |
|
@weiliu1031 Please associate the related issue to the body of your Pull Request. (eg. "issue: #") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces several fixes and improvements related to snapshot management. It adds validation for snapshot names in create, drop, describe, and restore operations, which is a great enhancement for robustness. It also fixes an issue where list_snapshots would not work with an empty collection name, and correctly populates the start_time for restore jobs. The changes are well-implemented and include new unit tests for the validation logic. However, the new validation logic for describeSnapshotTask and restoreSnapshotTask is not covered by tests. I've added a comment to suggest adding these tests.
| // =========================== Validation Tests for Issue #47068 =========================== | ||
|
|
||
| func TestCreateSnapshotTask_PreExecute_InvalidName(t *testing.T) { | ||
| testCases := []struct { | ||
| name string | ||
| snapshotName string | ||
| expectedErrMsg string | ||
| }{ | ||
| // Empty/whitespace cases | ||
| {"empty string", "", "snapshot name should be not empty"}, | ||
| {"single space", " ", "snapshot name should be not empty"}, | ||
| {"multiple spaces", " ", "snapshot name should be not empty"}, | ||
| {"tab character", "\t", "snapshot name should be not empty"}, | ||
| // Invalid first character | ||
| {"starts with number", "123snapshot", "the first character of snapshot name must be an underscore or letter"}, | ||
| {"starts with special char", "$snapshot", "the first character of snapshot name must be an underscore or letter"}, | ||
| {"starts with hyphen", "-snapshot", "the first character of snapshot name must be an underscore or letter"}, | ||
| // Invalid characters in name | ||
| {"contains space", "snap shot", "snapshot name can only contain"}, | ||
| {"contains special char", "snap@shot", "snapshot name can only contain"}, | ||
| {"contains chinese", "快照test", "the first character of snapshot name must be an underscore or letter"}, | ||
| // Too long name (exceeds 255 characters) | ||
| {"too long name", strings.Repeat("a", 256), "the length of snapshot name must be not greater than limit"}, | ||
| } | ||
|
|
||
| for _, tc := range testCases { | ||
| t.Run(tc.name, func(t *testing.T) { | ||
| task := &createSnapshotTask{ | ||
| req: &milvuspb.CreateSnapshotRequest{ | ||
| Name: tc.snapshotName, | ||
| DbName: "default", | ||
| CollectionName: "test_collection", | ||
| }, | ||
| } | ||
|
|
||
| err := task.PreExecute(context.Background()) | ||
|
|
||
| assert.Error(t, err) | ||
| assert.Contains(t, err.Error(), tc.expectedErrMsg) | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| func TestDropSnapshotTask_PreExecute_InvalidName(t *testing.T) { | ||
| testCases := []struct { | ||
| name string | ||
| snapshotName string | ||
| expectedErrMsg string | ||
| }{ | ||
| // Empty/whitespace cases | ||
| {"empty string", "", "snapshot name should be not empty"}, | ||
| {"single space", " ", "snapshot name should be not empty"}, | ||
| // Invalid first character | ||
| {"starts with number", "123snapshot", "the first character of snapshot name must be an underscore or letter"}, | ||
| // Invalid characters | ||
| {"contains space", "snap shot", "snapshot name can only contain"}, | ||
| } | ||
|
|
||
| for _, tc := range testCases { | ||
| t.Run(tc.name, func(t *testing.T) { | ||
| task := &dropSnapshotTask{ | ||
| req: &milvuspb.DropSnapshotRequest{ | ||
| Name: tc.snapshotName, | ||
| }, | ||
| } | ||
|
|
||
| err := task.PreExecute(context.Background()) | ||
|
|
||
| assert.Error(t, err) | ||
| assert.Contains(t, err.Error(), tc.expectedErrMsg) | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| func TestCreateSnapshotTask_PreExecute_ValidName(t *testing.T) { | ||
| testCases := []struct { | ||
| name string | ||
| snapshotName string | ||
| }{ | ||
| {"simple name", "snapshot"}, | ||
| {"with underscore prefix", "_snapshot"}, | ||
| {"with numbers", "snapshot123"}, | ||
| {"mixed", "_snap_shot_123"}, | ||
| {"uppercase", "Snapshot"}, | ||
| {"with dollar sign", "snapshot$test"}, // $ is allowed by default config | ||
| } | ||
|
|
||
| for _, tc := range testCases { | ||
| t.Run(tc.name, func(t *testing.T) { | ||
| task := &createSnapshotTask{ | ||
| req: &milvuspb.CreateSnapshotRequest{ | ||
| Name: tc.snapshotName, | ||
| DbName: "default", | ||
| CollectionName: "test_collection", | ||
| }, | ||
| } | ||
|
|
||
| // Mock globalMetaCache | ||
| globalMetaCache = &MetaCache{} | ||
| mockGetCollectionID := mockey.Mock((*MetaCache).GetCollectionID).Return(int64(100), nil).Build() | ||
| defer mockGetCollectionID.UnPatch() | ||
|
|
||
| err := task.PreExecute(context.Background()) | ||
|
|
||
| assert.NoError(t, err) | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| // =========================== Test for Issue #47066 =========================== | ||
|
|
||
| func TestListSnapshotsTask_PreExecute_EmptyCollectionName(t *testing.T) { | ||
| task := &listSnapshotsTask{ | ||
| req: &milvuspb.ListSnapshotsRequest{ | ||
| DbName: "default", | ||
| CollectionName: "", // Empty collection name should list all snapshots | ||
| }, | ||
| } | ||
|
|
||
| err := task.PreExecute(context.Background()) | ||
|
|
||
| assert.NoError(t, err) | ||
| assert.Equal(t, int64(0), task.collectionID) // collectionID should be 0 for listing all | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new validation logic for createSnapshotTask and dropSnapshotTask is well-tested. However, the PR also adds similar validation to describeSnapshotTask and restoreSnapshotTask, but there are no corresponding tests for these tasks. To ensure the validation works as expected across all snapshot operations, please add tests for invalid snapshot names for describeSnapshotTask and restoreSnapshotTask as well. You can follow the pattern of TestCreateSnapshotTask_PreExecute_InvalidName.
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (53.33%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## master #47096 +/- ##
==========================================
- Coverage 76.54% 76.41% -0.14%
==========================================
Files 2016 2016
Lines 326966 326365 -601
==========================================
- Hits 250278 249383 -895
- Misses 68716 68980 +264
- Partials 7972 8002 +30
🚀 New features to boost your workflow:
|
issue: #47066, #47067, #47068
related: #44358
Summary
list_snapshots()now correctly handles emptycollection_nameparameter by listing snapshots across all collectionsget_restore_snapshot_state()now correctly returnsstart_timeby populating it fromRestoreSnapshotJob.StartedAtcreate_snapshot/drop_snapshotnow applies standard Milvus name validation rules tosnapshot_name(same rules as collection/partition names)Changes
internal/proxy/task_snapshot.go:listSnapshotsTask.PreExecuteValidateSnapshotName()for snapshot name validationinternal/proxy/util.go:ValidateSnapshotName()function that reuses standard naming validationinternal/datacoord/snapshot_manager.go:StartTimefield inGetRestoreSnapshotJobProgressresponsepkg/proto/data_coord.proto:start_timefield toGetRestoreSnapshotJobProgressResponseinternal/proxy/task_snapshot_test.go:Test plan
Signed-off-by: Wei Liu [email protected]