columnar: Refactor PartEncoder, add decode_into(...) to PartDecoder (MaterializeInc#26669)
This PR was split out from
MaterializeInc#26605.
### Motivation
This refactors `PartEncoder` to take ownership of the columns it's
encoding into, rather than holding mutable references to them. More
detail is provided in MaterializeInc#26605, but the tl;dr is that we only
want to downcast `dyn Array`s once, and there isn't any lifetime that we
can associate with the borrows to achieve this.
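As a rough illustration of the ownership idea (not the actual
`PartEncoder` code), the sketch below downcasts a type-erased column
exactly once in the constructor and then owns the concrete column until
it is handed back. `Int64Encoder`, its methods, and the use of
`Box<dyn Any>` with `Vec<i64>` in place of `dyn Array` are all
hypothetical stand-ins chosen so the example is self-contained.

```rust
use std::any::Any;

/// Hypothetical encoder that owns its column instead of borrowing it.
/// `Vec<i64>` stands in for a concrete Arrow column (an assumption).
struct Int64Encoder {
    col: Box<Vec<i64>>,
}

impl Int64Encoder {
    /// Downcast the type-erased column once, up front. On a type mismatch
    /// the original box is returned to the caller unchanged.
    fn new(col: Box<dyn Any>) -> Result<Self, Box<dyn Any>> {
        col.downcast::<Vec<i64>>().map(|col| Int64Encoder { col })
    }

    /// Encoding needs no further downcasts or borrowed lifetimes.
    fn encode(&mut self, val: i64) {
        self.col.push(val);
    }

    /// Give the column back, type-erased again.
    fn finish(self) -> Box<dyn Any> {
        self.col
    }
}
```

Because the encoder owns the downcast column, no lifetime parameter has
to tie it to a borrow of the original `dyn` value.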
It also renames the existing `PartDecoder::decode(idx, &mut V)` method
to `PartDecoder::decode_into(...)` and adds a new
`PartDecoder::decode(idx) -> V` method. The goal of `decode_into(...)`
is to let you re-use allocations where possible, but that isn't
currently applicable when decoding from our structured columnar data.
When decoding with `Codec` we go from `&[u8]` -> `ProtoRow` -> `Row`, so
it's very helpful to retain the intermediate `ProtoRow` to reduce
allocations. But with columnar data we go directly from `dyn
arrow::Array` -> `Row`, so there isn't any intermediate step to retain.
Also, we return the `Row`s we decode as part of an iterator, so there
isn't any way to reclaim them for re-use.
In practice, without `PartDecoder::decode(idx)`, we end up with a pattern
like:
```
let mut val = V::default();
decoder.decode_into(idx, &mut val);
val
```
Instead of requiring our generic parameters `K` and `V` to implement
`Default`, it was easier to add the new method.
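To make the trade-off concrete, here is a minimal sketch assuming a
simplified, hypothetical shape for the trait. The real `PartDecoder`
signatures may differ, and `StringColumnDecoder`, `decode_all`, and
`decode_all_with_default` are invented for illustration. It shows that
a generic caller built only on `decode_into` needs `V: Default`, while
one built on `decode` does not.

```rust
/// Hypothetical, simplified stand-in for the decoder trait.
trait PartDecoder<V> {
    /// Decode the value at `idx` into `val`, re-using its allocations.
    fn decode_into(&self, idx: usize, val: &mut V);
    /// Decode the value at `idx` into a newly created value.
    fn decode(&self, idx: usize) -> V;
}

/// Toy decoder over an in-memory column of strings (an assumption).
struct StringColumnDecoder {
    col: Vec<String>,
}

impl PartDecoder<String> for StringColumnDecoder {
    fn decode_into(&self, idx: usize, val: &mut String) {
        // Keep the existing buffer and overwrite its contents.
        val.clear();
        val.push_str(&self.col[idx]);
    }

    fn decode(&self, idx: usize) -> String {
        self.col[idx].clone()
    }
}

/// With only `decode_into`, the caller must conjure a `V` first,
/// which forces a `Default` bound on the generic parameter.
fn decode_all_with_default<V: Default, D: PartDecoder<V>>(d: &D, len: usize) -> Vec<V> {
    (0..len)
        .map(|idx| {
            let mut val = V::default();
            d.decode_into(idx, &mut val);
            val
        })
        .collect()
}

/// With `decode`, no extra bound on `V` is needed.
fn decode_all<V, D: PartDecoder<V>>(d: &D, len: usize) -> Vec<V> {
    (0..len).map(|idx| d.decode(idx)).collect()
}
```

Callers that do hold a reusable buffer can still call `decode_into`
directly, so keeping both methods loses nothing.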
### Tips for reviewer
This PR is broken up into two commits:
1. Refactor to `PartEncoder`
2. New methods on `PartDecoder`
### Checklist
- [ ] This PR has adequate test coverage / QA involvement has been duly
considered. ([trigger-ci for additional test/nightly
runs](https://trigger-ci.dev.materialize.com/))
- [ ] This PR has an associated up-to-date [design
doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md),
is a design doc
([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)),
or is sufficiently small to not require a design.
<!-- Reference the design in the description. -->
- [ ] If this PR evolves [an existing `$T ⇔ Proto$T`
mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md)
(possibly in a backwards-incompatible way), then it is tagged with a
`T-proto` label.
- [ ] If this PR will require changes to cloud orchestration or tests,
there is a companion cloud PR to account for those changes that is
tagged with the release-blocker label
([example](MaterializeInc/cloud#5021)).
<!-- Ask in #team-cloud on Slack if you need help preparing the cloud
PR. -->
- [x] This PR includes the following [user-facing behavior
changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note):
- N/a