-
Notifications
You must be signed in to change notification settings - Fork 133
Add flatten
attribute to derive SerializeRow
#1144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
0de5bb0
to
dc2a19e
Compare
I apologize about the multiple force pushes yesterday, I was chewing on it yesterday a bit more since it wasn't reviewed yet and decided to move some stuff around to make it more clear that all the new structs are internal to the macro implementation only (by moving it to that sub-module). All that should be done by now. I have also started work on adding the flatten attribute for the |
I have finished the work to also support flattening when serializing with |
@wprzytula would you mind taking a look at this PR and tell me if you would like me to keep it as-is or add the flatten support to the |
c03bf5b
to
4184a7b
Compare
@Lorak-mmk , could you take a look and let me know if there are any concerns holding this PR? |
@nrxus We're sorry for poor responsivity on our side. We're busy with next year planning; we'll be able to look at your PR later. |
This is a significant but breaking change, so we most likely won't be able to attend to it before releasing 1.0. We are quite busy with other work :( |
Are you sure?
|
I made especially sure not to change any existing API, and to hide all of the new types/traits in the existing internal module to not increase the public API surface area other the new attribute. |
In such case, we will technically be able to release it in, say, 1.1, when we find time to review and accept this after we release 1.0. Does it sound OK to you, @nrxus ? |
Yep sounds good! I'll just keep pointing to my branch for now. I also have a branch to do this same support but when serializing with order enforced. Should I just merge it here so you all only have to review it as one complete feature? It'd make the overall size of the PR bigger which is why I had kept it separate |
IMO let's have it in a single PR, separate commits. |
I made a typo. I definitely meant that this is NOT a breaking change, and thus we will not prioritise the review before releasing 1.0. |
236ab1a
to
031f226
Compare
I am perfectly happy with your version so I just did that but I removed the Also, you were right, once I switched to fully qualifying the path then there is only one error for the first case as expected: error[E0277]: `serialize::row::tests::foobar::Inner` cannot be flattened here
--> scylla-cql/src/serialize/row_tests.rs:1220:14
|
1220 | #[derive(SerializeRow)]
| ^^^^^^^^^^^^ `serialize::row::tests::foobar::Inner` is not a struct that derives `SerializeRow` with `match_by_name` flavor
|
= help: the trait `SerializeRowByName` is not implemented for `serialize::row::tests::foobar::Inner`
= note: There are two common reasons for that:
- `serialize::row::tests::foobar::Inner` does not use `#[derive(SerializeRow)]`
- `serialize::row::tests::foobar::Inner` uses `#[scylla(flavor = "enforce_order")]`
= help: the following other types implement trait `SerializeRowByName`:
RowWithTTL
TestRowWithColumnRename
TestRowWithColumnSorting
TestRowWithGenerics<'a, T>
TestRowWithNoColumns
TestRowWithSkippedFields
serialize::row::tests::foobar::Outer
serialize::row::tests::test_row_serialization_nested_structs::InnerColumnsOne
and 2 others
= note: this error originates in the derive macro `SerializeRow` (in Nightly builds, run with -Z macro-backtrace for more info)
For more information about this error, try `rustc --explain E0277`.
error: could not compile `scylla-cql` (lib test) due to 4 previous errors
|
1af115e
to
ea09107
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last 2 minor changes.
scylla-macros/src/serialize/row.rs
Outdated
// if not, then check if any flattened field has a column for it | ||
_ => 'flatten_try: { | ||
#({ | ||
match self.#flattened_fields.serialize_field(spec, writer)? { | ||
// there is a column and the field is done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, I reall thought I would give approve this time, but there is one more unqualified call :(
scylla-macros/src/serialize/row.rs
Outdated
// if what is missing is a flattened field then report that error | ||
#(if !self.#flattened_visited_flag_names { | ||
return self.#flattened_fields.check_missing() | ||
})* | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And another one here
This test checks that `DeserializeValue` macro behaves correctly in case of https://github.com/apache/cassandra/blob/4a80daf32eb4226d9870b914779a1fc007479da6/doc/native_protocol_v4.spec#L1003. 1. Serialize the UDT with the missing field value 2. Deserialize the serialized UDT, but include the missing field in metadata 3. Expect that the missing field value is treated as null I don't know of any way to simulate this using the Scylla/C* cluster. CQL protocol mentions that such scenario is possible, but does not specify when it can appear.
the function itself is technically public but in the doc hidden internal macro module so that it can be used for macros to simplify code expansion but doesn't affect the surface area of the public API
otherwise it can't compile if the impl block already had a 'b named lifetime
`NoColumnWithName` and `ValueMissingForColumn` were flipped. `NoColumnWithName` is returned when there is a serializable field in the Rust struct (i.e., is it not ignored) but there is no matching column for it in the spec `ValueMissingForColumn` should be returned when the spec expects a column but there is no matching field in the Rust struct.
I see at least one other unqualified call. Give me a moment, I'll fix them and push to my branch. |
This is a no-op refactor to the code generation by SerializeRow. New types and traits are introduced in the doc hidden module `_macro_internal` such that the API surface area is unchanged. The goal of this refactor is to make the addition of the flattened serialization feature easier
any helper types/methods have been created inside the internal macro module so as not to increase the public API surface area
This commit focuses on the `match_by_name` flavor.
If a struct's field is a reference, allow that reference to use the `flatten` attribute so that the field can be serialized using SerializeRow and the columns produced flattened into the parent struct. Prior to this commit this was not possible so this commit does two changes: 1. Adds a blanket implementation on the needed traits (SerializeRowByName and SerializeRowInOrder) for references of structs that already implement them. 2. Adds a bound to the lifetimes of the fields when making the partial structs during serialization such that the reference to the fields have to outlive the partial struct being created during serialization.
It was released over 6 months ago so it is ok to use it. It allows us to use LazyCell.
This uses `diagnostic::on_unimplemented` to hint the users to derive `SerializeRow` with the correct flavor
I've pushed some changes to my branch ( https://github.com/Lorak-mmk/scylla-rust-driver/tree/flatten-serialize-name ), feel free to use it.
I have no more issues with this version, so if you decide to use it I can approve without another review. |
ea09107
to
c5470a0
Compare
I have rebased my branch against yours such that this PR now has your changes. Let me know if that's enough. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really great work! I hope you don't hate us too much for that many review rounds 😅
I don't think we need to wait for @wprzytula - he approved it many times already.
Thanks for the approval and merge! No worries about the many rounds, it was a big change so I appreciate thoroughness! |
This is similar to the
flatten
attribute in serde.This PR adds support for both the
match_by_name
and theenforce_ordering
flavors but it does not allow these structs to be mix-and-matched. This means that structs of different flavors of serialization cannot be flattened into one another. This is a feasibility limitation as these two methods of serialization are completely at odds with each other and hence cannot be combined. The error produced if these two flavors are matched will be at compile time but it may not be the clearest error since it would be about the struct not implementing some doc hidden trait they wouldn't be able to see in the docs.I have only added this attribute to
SerializeRow
because it was easier thanDeserializeRow
but I also want to add it to that macro in a future PR next chance I get to dig into this code.All the new traits/structs/enums needed for this change are inside the
_macro_internal
subdmodule such that no new public API is exposed. Maybe in the future those could be made public but it felt too early to know if all the signatures were exactly how we wanted to expose them or not.For context, I am currently dealing with an issue that if I have different insert queries where one sets N columns and another one sets the same N and one extra, then I have two make two structs with N repeated fields. With this PR I'd be able to to instead flatten the struct with N fields inside the other struct to make my code more maintainable.
By name serialization
ser::row::ByName
A new internal-only struct
ser::row::ByName
is added that wraps a struct that implements a new trait:SerializeRowByName
. This new type has a single functionser::row::ByName::serialize
and attempts to serialize an entireRowSerializationContext
, returning an error if any of the columns in the context were not serialized or do not belong to the struct. This is basically the implementation ofSerializeRow::serialize
for any struct that implementsSerializeRowByName
but split into its own internal-type so that the macro doesn't have to create this shared code. This couldn't be added as a default implementation in one of our traits because we need to call for some functions usingSelf
as a generic parameter which caused some compilation errors.SerializeRowByName
When deriving
SerializeRow
using thematch_by_name
flavor the struct will also implement a new internal-only trait:SerializeRowByName
. This trait has a single type associated typePartial
, and a functionpartial()
that creates it. The partial struct has 3 main parts:SerializeRowByName
such thatpartial()
can be called on it.The partial struct is required to implement a new trait
PartialSerializeRowByName
PartialSerializeRowByName
PartialSerializeRowByName
has two required functions:serialize_field
: takes the spec of a single column and attempts to serialize the corresponding field to it. If this column does not belong to this partial struct then the caller is told that the column is unused so that the caller can instead try to use a different field for this same column (i.e., when testing to see if any nested structs can serialize to that column). If the column is used, then a check is done to see if that column has completed the serialization of this field so that it can remove it out of itsmissing
set. The caller is informed if that column has finished the serialization of this partial struct or not.check_missing
: consumes the partial struct while checking if all the fields in this struct were serialized, returning an error if not. This is used insideser::row::ByName::serialize
to verify that the a struct has been fully serialized. If a field has not finished serializing and the field is a nested struct (i.e., not just a column) then we should get the error from the nested struct instead for better error messaging.To do this signaling, a new internal-only enum
ser::row::FieldStatus
was added that returns whether a column was used for the field, was used and completed the field, or was used by the field is still missing more columns.By order serialization
ser::row::InOrder
A new internal-only struct
ser::row::InOrder
is added that wraps a struct that implements a new trait:SerializeRowInOrder
. This new type has a single functionser::row::InOrder::serialize
that attempts to serialize an entireRowSerializationContext
, returning an error if any of the columns in the context were not serialized or do not belong to the struct. It does this by:ser::row::ByColumn
.SerializeRowInOrder
implementation for the struct we are derivingSerializeRow
for using theser::row::ByColumn
instance.ser::row::ByColumn
instance was fully consumed.This is basically the implementation of
SerializeRow::serialize
for any struct that implementsSerializeRowInOrder
but split into its own internal-type so that the macro doesn't have to create this shared code. This couldn't be added as a default implementation in one of our traits because we need to call for some functions usingSelf
as a generic parameter which caused some compilation errors.ser::row::ByColumn
ser::row::ByColumn
wraps an iterator over column specs and provides the following methods:next
: Given a value to serialize it type and name checks it against the next column spec in the iterator, serializing it if successful or returning an error therwisenext_skip_name
: Given a value to serialize it type checks (but skips name check) it against the next column spec in the iterator, serializing it if successful or returning an error therwisefinish
: verifies that the iterator is fully consumed.SerializeRowInOrder
When deriving
SerializeRow
using theenforced_ordering
flavor the struct will also implement a new internal-only trait:SerializeRowInOrder
. This trait has a single methodserialize_in_order()
whose generated implementation will:next
ornext_skip_name
on the givenser::row::ByColum
instance.serialize_in_order
on it (implying the nested struct must also implementSerializeRowInOrder
and pass along itsser::row::ByColum
instance.Note that this method does not call for
finish()
onser::row::ByColumn
because it does not need to verify that the iterator was fully consumed as it could have been called during flattening and we only want to verify that the iterator is consumed on the root struct being serialized.Pre-review checklist