-
Notifications
You must be signed in to change notification settings - Fork 402
lakectl enhance downloader reduce call to stat object for faster download #9555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly questions and concerns
cmd/lakectl/cmd/fs_download.go
Outdated
|
||
type objectInfo struct { | ||
relPath string | ||
object *apigen.ObjectStats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
object *apigen.ObjectStats | |
stats *apigen.ObjectStats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated to objectStat
cmd/lakectl/cmd/fs_download.go
Outdated
) | ||
|
||
type objectInfo struct { | ||
relPath string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
relative to what? document
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment.
relative to the source path in the repo - used the same terminology used on lakectl fs download code.
// download object | ||
var err error | ||
if d.PreSign && swag.Int64Value(objectStat.SizeBytes) >= d.PartSize { | ||
// download using presigned multipart download, it will fall back to presign single object download if needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When will it fallback to presign single object download?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cloud return an error in case the object size < d.PartSize
, but instead it will just call downloadObject
.
I've added a check here, before the call just not to go through the fallback.
Both are internal methods - I cloud just return an error in downloadPresignMultipart, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If downloadPresignMultipart
is only called from here, you can remove the internal check in downloadPresignMultipart
. If it's not, then leave it as is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// download object | ||
var err error | ||
if d.PreSign && swag.Int64Value(objectStat.SizeBytes) >= d.PartSize { | ||
// download using presigned multipart download, it will fall back to presign single object download if needed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If downloadPresignMultipart
is only called from here, you can remove the internal check in downloadPresignMultipart
. If it's not, then leave it as is
@itaiad200 refactor the code for more reuse and single place to check the preconditions for download without object stat. |
// avoiding the need for a separate stat call. | ||
func (d *Downloader) DownloadWithObjectInfo(ctx context.Context, src uri.URI, dst string, tracker *progress.Tracker, objectStat *apigen.ObjectStats) error { | ||
// downloadObjectCore handles the common download logic for both Download and DownloadWithObjectInfo | ||
func (d *Downloader) downloadObjectCore(ctx context.Context, src uri.URI, dst string, tracker *progress.Tracker, objectStat *apigen.ObjectStats) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this better?
func (d *Downloader) downloadObjectCore(ctx context.Context, src uri.URI, dst string, tracker *progress.Tracker, objectStat *apigen.ObjectStats) error { | |
func (d *Downloader) downloadWithObjectInfo(ctx context.Context, src uri.URI, dst string, tracker *progress.Tracker, objectStat *apigen.ObjectStats) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
didn't want to mix with DownloadWithObjectInfo
if size < d.PartSize { | ||
return d.downloadObject(ctx, src, dst, tracker) | ||
return fmt.Errorf("object is smaller than PartSize (%d): %w", d.PartSize, ErrValidation) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code path is unreachable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is just to be on the safe side if someone will call this function without checking the minimum size.
use object listing stat information to remove extra call for stat during lakectl fs download.
kept call to stat object when we require presign url for multipart download of large file.
Close #9557