Collaborator

@Fokko Fokko commented Oct 6, 2025

What changes are proposed in this pull request?

Allow converting bytes::Bytes into a Binary Scalar.

How was this change tested?

Member

@zachschuermann zachschuermann left a comment

LGTM if we could add a quick test?


codecov bot commented Oct 6, 2025

Codecov Report

❌ Patch coverage is 77.77778% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.90%. Comparing base (01e7cb1) to head (d89ec2a).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| kernel/src/expressions/scalars.rs | 77.77% | 2 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1373   +/-   ##
=======================================
  Coverage   84.90%   84.90%           
=======================================
  Files         113      113           
  Lines       28948    28966   +18     
  Branches    28948    28966   +18     
=======================================
+ Hits        24578    24595   +17     
- Misses       3198     3199    +1     
  Partials     1172     1172           

@Fokko Fokko force-pushed the fd-binary-scalar branch from 0308daa to 62cbd88 Compare October 6, 2025 17:48
Collaborator Author

Fokko commented Oct 6, 2025

@zachschuermann Sure thing, added!

Collaborator

@scovich scovich left a comment

The change is harmless, but I'm curious what purpose it serves?
(see detailed comment below)

Comment on lines 549 to 551
impl From<bytes::Bytes> for Scalar {
    fn from(b: bytes::Bytes) -> Self {
        Self::Binary(b.to_vec())

Collaborator

Just double checking -- Bytes::to_vec is a copying operation provided by impl Deref<Target=[u8]>, and we already have From<&[u8]> for Scalar?

If the goal was to transfer ownership cheaply, I don't think that's possible with Bytes (which is like Arc -- "cheaply cloneable and thereby shareable between an unlimited amount of components").

Is the goal to allow e.g. b.into() instead of b.as_ref().into()?
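
A minimal std-only sketch of the question above, not the kernel's actual code. `Scalar` here is a hypothetical one-variant stand-in, and `SharedBytes` (a newtype over `Arc<[u8]>`) is a hypothetical stand-in for `bytes::Bytes`. It illustrates that a `to_vec`-based `From` impl reaches `[u8]::to_vec` through `Deref` and so performs the same copy as the slice impl.

```rust
use std::ops::Deref;
use std::sync::Arc;

/// Hypothetical stand-in for the kernel's `Scalar`; only the variant discussed here.
#[derive(Debug, PartialEq)]
pub enum Scalar {
    Binary(Vec<u8>),
}

/// Hypothetical stand-in for `bytes::Bytes`: a cheaply cloneable, shared byte buffer.
#[derive(Clone)]
pub struct SharedBytes(Arc<[u8]>);

impl From<Vec<u8>> for SharedBytes {
    fn from(v: Vec<u8>) -> Self {
        Self(Arc::from(v))
    }
}

impl Deref for SharedBytes {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        &self.0
    }
}

impl AsRef<[u8]> for SharedBytes {
    fn as_ref(&self) -> &[u8] {
        &self.0
    }
}

/// The pre-existing style of conversion: copy a borrowed slice into a new Vec.
impl From<&[u8]> for Scalar {
    fn from(b: &[u8]) -> Self {
        Self::Binary(b.to_vec())
    }
}

/// The conversion under review: `to_vec` is `[u8]::to_vec`, found via `Deref`,
/// so this also allocates and copies. The shared buffer itself is untouched.
impl From<SharedBytes> for Scalar {
    fn from(b: SharedBytes) -> Self {
        Self::Binary(b.to_vec())
    }
}
```

With these stand-ins, `Scalar::from(shared.as_ref())` and `Scalar::from(shared)` yield equal values; the owned impl only saves the caller from writing `.as_ref()`.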

Collaborator Author

I went for Bytes over Vec<u8> mainly because I expect to slice and dice the content of the bytes, which does a copy when working with a raw Vec, while Bytes avoids this.

Also, when using ToSchema to represent a Vec<u8>, it will create a DataType::Array<DataType::Byte>; using Binary avoids that. More on this in #1361 (comment)

That said, I think we can use b.into() here 👍

Suggested change
- impl From<bytes::Bytes> for Scalar {
-     fn from(b: bytes::Bytes) -> Self {
-         Self::Binary(b.to_vec())
+ impl From<bytes::Bytes> for Scalar {
+     fn from(b: bytes::Bytes) -> Self {
+         Self::Binary(b.into())
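
The "slice and dice" argument in this comment can be sketched with std types only. This is an illustration under stated assumptions, not how `bytes::Bytes` is implemented: `SharedView` is a hypothetical type that pairs an `Arc<[u8]>` with a range, so taking a sub-slice bumps a reference count instead of copying bytes, while the `Vec` route has to allocate.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical stand-in for a `Bytes`-like type: a shared buffer plus a window into it.
#[derive(Clone)]
pub struct SharedView {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedView {
    pub fn new(data: Vec<u8>) -> Self {
        let len = data.len();
        Self { buf: Arc::from(data), range: 0..len }
    }

    /// Narrow the window. Only the reference count changes; no bytes are copied.
    /// Panics if `sub` does not fit inside the current window.
    pub fn slice(&self, sub: Range<usize>) -> Self {
        assert!(sub.start <= sub.end && sub.end <= self.range.len());
        let start = self.range.start + sub.start;
        let end = start + (sub.end - sub.start);
        Self { buf: Arc::clone(&self.buf), range: start..end }
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }

    /// True when both views point at the same allocation.
    pub fn shares_buffer_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.buf, &other.buf)
    }
}

/// The `Vec` equivalent: an owned sub-range requires a fresh allocation and a copy.
pub fn vec_sub_range(data: &[u8], sub: Range<usize>) -> Vec<u8> {
    data[sub].to_vec()
}
```

Converting such a view into a `Scalar::Binary(Vec<u8>)` still needs an owned `Vec` at the end, which is why the conversion itself is a separate question from the slicing cost.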

Collaborator

Oh, I wasn't questioning the use of Bytes in general (vs. say Vec) -- just trying to understand why we need a From<Bytes> when we already have From<&[u8]> and there's no performance difference between the two?

Collaborator Author

There is no performance difference, that's correct.

Collaborator Author

I think I don't understand the question. We need them both, for each of the types, right?

Collaborator

Unless there's a generic method involved somewhere, any call site could just add an .as_ref() and leverage the existing from-slice:

- bytes.into()
+ bytes.as_ref().into()
- Scalar::from(bytes)
+ Scalar::from(bytes.as_ref())

Depending on the number of call sites we anticipate, that might be less cognitive overhead than a new trait impl. But either way is fine -- just wanted to understand the motivation for the change, which is not stated anywhere I could see.
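
The "generic method" caveat can be sketched as follows. Everything here is hypothetical and std-only: `Scalar` is a one-variant stand-in, `Buf` stands in for `bytes::Bytes`, and `literal` / `literals` are made-up helpers rather than kernel APIs. A direct call site can add `.as_ref()`, but code that is generic over `T: Into<Scalar>` can only accept the owned type if a dedicated `From` impl exists.

```rust
/// Hypothetical stand-in for the kernel's `Scalar`.
#[derive(Debug, PartialEq)]
pub enum Scalar {
    Binary(Vec<u8>),
}

/// Hypothetical stand-in for `bytes::Bytes`.
pub struct Buf(pub Vec<u8>);

impl AsRef<[u8]> for Buf {
    fn as_ref(&self) -> &[u8] {
        &self.0
    }
}

/// Existing-style slice conversion.
impl From<&[u8]> for Scalar {
    fn from(b: &[u8]) -> Self {
        Self::Binary(b.to_vec())
    }
}

/// Dedicated impl for the owned type; it simply delegates to the slice impl.
impl From<Buf> for Scalar {
    fn from(b: Buf) -> Self {
        let bytes: &[u8] = b.as_ref();
        Self::from(bytes)
    }
}

/// Hypothetical generic entry point: accepts any `T` with an `Into<Scalar>` impl.
pub fn literal<T: Into<Scalar>>(value: T) -> Scalar {
    value.into()
}

/// Hypothetical generic helper. It only knows `T: Into<Scalar>`, so it cannot
/// insert `.as_ref()` on the caller's behalf; `Vec<Buf>` works here only
/// because `From<Buf> for Scalar` exists.
pub fn literals<T: Into<Scalar>>(values: Vec<T>) -> Vec<Scalar> {
    values.into_iter().map(Into::into).collect()
}
```

If no such generic code is involved, `literal(buf.as_ref())` covers the same ground without the extra impl, which is the trade-off described above.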

Collaborator Author

Thanks for the context @scovich, I'll go ahead and merge this PR since I don't think we have to implement any other From for Binary-related fields.

@Fokko Fokko merged commit f431de0 into main Oct 8, 2025
37 of 38 checks passed
@Fokko Fokko deleted the fd-binary-scalar branch October 8, 2025 06:41