Conversation

Contributor

@nopcoder nopcoder commented Sep 25, 2025

Use the stat information already returned by the object listing to avoid an extra stat call per object during lakectl fs download.
The stat call is kept only when a presigned URL is required for multipart download of a large file.
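
To illustrate the idea in the description, here is a minimal sketch, not the actual lakectl code. `ObjectStats`, `fakeStore`, and `downloadAll` are hypothetical stand-ins: the listing already carries each object's size, so a separate stat is issued only for objects that go through presigned multipart download.

```go
package main

import "fmt"

// ObjectStats is a hypothetical stand-in for apigen.ObjectStats.
type ObjectStats struct {
	Path      string
	SizeBytes int64
}

// fakeStore is a hypothetical in-memory store that counts stat calls.
type fakeStore struct {
	objects   []ObjectStats
	statCalls int
}

// List returns stats for every object, as a listing call would.
func (s *fakeStore) List() []ObjectStats { return s.objects }

// Stat looks up a single object and records that an extra call was made.
func (s *fakeStore) Stat(path string) (ObjectStats, bool) {
	s.statCalls++
	for _, o := range s.objects {
		if o.Path == path {
			return o, true
		}
	}
	return ObjectStats{}, false
}

// downloadAll reuses the listing stats and calls Stat again only for
// objects that would use presigned multipart download.
func downloadAll(s *fakeStore, preSign bool, partSize int64) int {
	downloaded := 0
	for _, o := range s.List() {
		if preSign && o.SizeBytes >= partSize {
			// Assumption: the large-object presign path still needs a fresh stat.
			if _, ok := s.Stat(o.Path); !ok {
				continue
			}
		}
		downloaded++ // the real code would fetch the object's data here
	}
	return downloaded
}

func main() {
	s := &fakeStore{objects: []ObjectStats{{"a.txt", 10}, {"big.bin", 1 << 30}}}
	n := downloadAll(s, true, 8<<20)
	fmt.Println(n, s.statCalls) // prints "2 1"
}
```

In this sketch only `big.bin` triggers a stat call; the small object is handled from the listing data alone.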

Close #9557

@nopcoder nopcoder self-assigned this Sep 25, 2025
@nopcoder nopcoder added the area/lakectl and include-changelog labels Sep 25, 2025
@nopcoder nopcoder requested review from a team and itaiad200 September 25, 2025 17:31
@nopcoder nopcoder marked this pull request as ready for review September 26, 2025 07:53
Contributor

@itaiad200 itaiad200 left a comment

Mostly questions and concerns


type objectInfo struct {
relPath string
object *apigen.ObjectStats
Contributor

Suggested change
- object *apigen.ObjectStats
+ stats *apigen.ObjectStats

Contributor Author

updated to objectStat

)

type objectInfo struct {
relPath string
Contributor

Relative to what? Please document.

Contributor Author

Added a comment.

It is relative to the source path in the repo; I used the same terminology as the lakectl fs download code.
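
As a rough illustration of what "relative to the source path" means, here is a small sketch. The helper name `relPath` and the example paths are hypothetical, and the real lakectl code may compute this differently.

```go
package main

import (
	"fmt"
	"strings"
)

// relPath returns objectPath with the source prefix stripped, i.e. the
// object's path relative to the source path being downloaded.
// Hypothetical helper for illustration only.
func relPath(sourcePrefix, objectPath string) string {
	return strings.TrimPrefix(objectPath, sourcePrefix)
}

func main() {
	fmt.Println(relPath("datasets/2025/", "datasets/2025/images/cat.png")) // prints "images/cat.png"
}
```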

// download object
var err error
if d.PreSign && swag.Int64Value(objectStat.SizeBytes) >= d.PartSize {
// download using presigned multipart download, it will fall back to presign single object download if needed
Contributor

When will it fall back to presign single object download?

Contributor Author

It could return an error when the object size < d.PartSize, but instead it just calls downloadObject.
I've added a check here, before the call, so that we don't go through the fallback.
Both are internal methods - I could just return an error in downloadPresignMultipart, wdyt?

Contributor

If downloadPresignMultipart is only called from here, you can remove the internal check in downloadPresignMultipart. If it's not, then leave it as is
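
The point under discussion, deciding in one place which download path to take, can be sketched as follows. This is a simplified illustration built around the condition quoted above; `strategy` and `chooseStrategy` are hypothetical names, not part of the lakectl code.

```go
package main

import "fmt"

// strategy names the download path to take (hypothetical type).
type strategy string

const (
	single    strategy = "single"
	multipart strategy = "presign-multipart"
)

// chooseStrategy mirrors the condition quoted in the review: presigned
// multipart is used only when presign is enabled and the object is at
// least one part in size; everything else is a single-object download.
func chooseStrategy(preSign bool, sizeBytes, partSize int64) strategy {
	if preSign && sizeBytes >= partSize {
		return multipart
	}
	return single
}

func main() {
	const partSize = 8 << 20
	fmt.Println(chooseStrategy(true, 16<<20, partSize)) // prints "presign-multipart"
	fmt.Println(chooseStrategy(true, 1024, partSize))   // prints "single"
}
```

With the check made by the caller, the multipart function no longer needs its own fallback to the single-object path.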

@nopcoder nopcoder requested a review from itaiad200 September 27, 2025 17:10
Contributor

@itaiad200 itaiad200 left a comment

LGTM

@nopcoder nopcoder requested a review from itaiad200 September 28, 2025 09:29
@nopcoder
Contributor Author

@itaiad200 refactored the code for more reuse and a single place to check the preconditions for downloading without an object stat.

// avoiding the need for a separate stat call.
func (d *Downloader) DownloadWithObjectInfo(ctx context.Context, src uri.URI, dst string, tracker *progress.Tracker, objectStat *apigen.ObjectStats) error {
// downloadObjectCore handles the common download logic for both Download and DownloadWithObjectInfo
func (d *Downloader) downloadObjectCore(ctx context.Context, src uri.URI, dst string, tracker *progress.Tracker, objectStat *apigen.ObjectStats) error {
Contributor

Is this better?

Suggested change
- func (d *Downloader) downloadObjectCore(ctx context.Context, src uri.URI, dst string, tracker *progress.Tracker, objectStat *apigen.ObjectStats) error {
+ func (d *Downloader) downloadWithObjectInfo(ctx context.Context, src uri.URI, dst string, tracker *progress.Tracker, objectStat *apigen.ObjectStats) error {

Contributor Author

I didn't want it to be confused with DownloadWithObjectInfo.

Comment on lines 147 to 149
  if size < d.PartSize {
- 	return d.downloadObject(ctx, src, dst, tracker)
+ 	return fmt.Errorf("object is smaller than PartSize (%d): %w", d.PartSize, ErrValidation)
  }
Contributor

This code path is unreachable

Contributor Author

This is just to be on the safe side in case someone calls this function without checking the minimum size.

@nopcoder nopcoder merged commit 0a89d13 into master Sep 28, 2025
66 of 67 checks passed
@nopcoder nopcoder deleted the task/lakectl-fs-download branch September 28, 2025 10:53
Development

Successfully merging this pull request may close these issues.

lakectl fs recursive download improve performance

2 participants