Skip to content

Conversation

@sffc
Copy link
Member

@sffc sffc commented Jan 9, 2026

See #7391
Depends on #7399

I was making a PR that got a bit bigger than I wanted, so I split this out into a standalone change.

We finally have VarZeroCow, a heap-or-reference container for a VarULE. Another reasonable container to want is what I'm proposing in this PR, a FixedLengthVarULE, which stores the VarULE on the stack with a fixed number of bytes.

This can be used to compose ULEs and VarULEs in const contexts, as I show with the PluralElementsPackedULE example.

@sffc sffc requested review from robertbastian and removed request for zbraniecki January 9, 2026 21:13
Copy link
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the problem this solves.

@sffc
Copy link
Member Author

sffc commented Jan 9, 2026

This can be used to compose ULEs and VarULEs in const contexts, as I show with the PluralElementsPackedULE example.

It can also be used to create a string or zerovec on the stack, without any additional dependencies.

Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand the motivation: why is VarZeroCow<'static> insufficient? What is the benefit of having these values on the stack instead of static memory?

As far as I can tell, anything you ought to be able to do with this type in const should be doable in non-const as well.

/// use icu::locale::locale;
/// use zerovec::ule::FixedLengthVarULE;
///
/// let value = "hello, world!"; // 13 bytes long
Copy link
Member

@Manishearth Manishearth Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: this isn't a motivating example: this doesn't work in const anyway (since there's try_from_encodeable)

(It's an example, but not a motivating one)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would work in const if I replaced try_from_encodeable with new_unchecked, I just wanted to not have unsafe code in the docs test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's fine, I was just saying that the example itself wasn't motivating the new zerovec API.

But I have a clearer idea of what's going on now.

@Manishearth
Copy link
Member

Ah, I understand. This lets you copy the data into a new, stack-owned buffer, which you can't do with static data. Hm

@sffc sffc requested a review from Manishearth January 9, 2026 21:49
/// ```
///
/// [generic_const_exprs]: https://doc.rust-lang.org/beta/unstable-book/language-features/generic-const-exprs.html#generic_const_exprs
pub const fn new_singleton_mn<const M: usize, const N: usize>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like these APIs where you have to guess the output.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand your comment, but this is in an unstable module.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have to know M and N for this not to panic

Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to hear more about the use case for new_singleton_mn in a const context, since as far as I can tell that would need unsafe code anyway.

I think I see use cases for the new type: being able to create a self-contained stack VarULE is useful.

I think the type name is somewhat misleading. Perhaps ConstStackVarULE? or ConstOwnedVarULE?

I think the type is relatively niche and we should name it in a way that reflects that, and document it as such. Stuffing it in the ule module sounds good, otherwise. In the long term I'd like to organize the mess that is that module.

remainder[i] = input.as_bytes()[i];
i += 1;
}
*start = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: mention "first byte = 0 for a singleton" for redundancy

/// use zerovec::ule::FixedLengthVarULE;
///
/// const plural_ule: FixedLengthVarULE<1, PluralElementsPackedULE<str>> =
/// PluralElementsPackedULE::new_singleton_mn::<0, 1>(FixedLengthVarULE::EMPTY_STR);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clever

@sffc
Copy link
Member Author

sffc commented Jan 9, 2026

I added a const example with no unsafe.

In general this is useful for making default/empty VarULEs in const contexts that are composed of inner ULE types.

@Manishearth
Copy link
Member

In general this is useful for making default/empty VarULEs in const contexts that are composed of inner ULE types.

Right, but where are we hoping to do that? I'm still missing parts of the story here I think.

@sffc
Copy link
Member Author

sffc commented Jan 9, 2026

In general this is useful for making default/empty VarULEs in const contexts that are composed of inner ULE types.

Right, but where are we hoping to do that? I'm still missing parts of the story here I think.

I want to use this in call sites such as

    pub const PLURAL_PATTERN_0: &'static PluralElementsPackedULE<SinglePlaceholderPattern> =
        unsafe { PluralElementsPackedULE::from_bytes_unchecked(&[0, 1]) };

in order to remove the unsafe. @robertbastian may or may not agree with that particular call site but it serves as a useful illustration.

@Manishearth
Copy link
Member

Hmm. I guess this gets more useful if we have consts on various VarULE types for common patterns.

I do worry that this is a fair amount of unsafe code to get rid of some easily-tested unsafe, though. Are we planning on using this const pattern a lot?

An alternative potential route might be to define a macro that can define VarULE consts from unchecked bytes, where it also generates a test. This macro becomes unsafe to use, BUT its safety invariant is mostly "if the test passes this macro usage is fine".

Not saying that's better, but making sure we have considered the design space.

@robertbastian
Copy link
Member

I honestly think for that call site this would be a regression. The current code is unsafe, but it's a single expression, similar to hundreds that we have in baked data, and there's a test that it's correct. This PR adds a lot of weird APIs and unsafe code just to make that one expression safe.

@Manishearth
Copy link
Member

Overall I've found the defaults story in VarULE to be unsatisfying, so "more consts" is definitely an interesting proposition. BUT I'm not fully convinced we should do it this way. Unsure.

@sffc
Copy link
Member Author

sffc commented Jan 9, 2026

Note: the use case for this type of helper will only grow, because the composition stuff (VarTuple) is fairly new and we are using it more and more.

@sffc
Copy link
Member Author

sffc commented Jan 9, 2026

I honestly think for that call site this would be a regression. The current code is unsafe, but it's a single expression, similar to hundreds that we have in baked data, and there's a test that it's correct. This PR adds a lot of weird APIs and unsafe code just to make that one expression safe.

It's not just about that one expression. It's about that class of expressions, which I claim is a class that will get bigger over time.

@sffc
Copy link
Member Author

sffc commented Jan 9, 2026

I also just want to point out:

    pub const PLURAL_PATTERN_0: &'static PluralElementsPackedULE<SinglePlaceholderPattern> =
        unsafe { PluralElementsPackedULE::from_bytes_unchecked(&[0, 1]) };

This code requires subject matter expertise in 3 different components: pattern (to know that 1 means "single placeholder pattern with a single placeholder"), plurals (to know that 0 means "singleton plural elements"), and compact (to know that this combination of pattern and plurals is suitable for PLURAL_PATTERN_0). Yes, the test helps a lot. But, you should be able to write a call site such as the following, and no extra test is required.

    pub const PLURAL_PATTERN_0: /* some type */ =
        PluralElementsPackedULE::new_singleton(SinglePlaceholderPattern::PASS_THROUGH_ULE);

(That line of code does two things that we don't have yet: new_singleton, which is gated on Rust feature generic_const_exprs, and PASS_THROUGH_ULE, because the ULE composition needs to use the correct types. This comment is not an opportunity to bikeshed either of those APIs.)

Copy link
Member

@robertbastian robertbastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't agree with this, but leaving a technical review

/// ```
///
/// [generic_const_exprs]: https://doc.rust-lang.org/beta/unstable-book/language-features/generic-const-exprs.html#generic_const_exprs
pub const fn new_singleton_mn<const M: usize, const N: usize>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"singleton" is the wrong term, we don't use that with plurals. this method is equivalent to PluralElements::new, so it should just be new

Copy link
Member Author

@sffc sffc Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "singleton" is already used in this file multiple times, including in test function names.

The difference between singleton and non-singleton matters a great deal to PluralElementsPackedULE, because it results in two different encodings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the caller shouldn't have to know encoding details?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also doesn't need to be qualified as _mn; it's an unstable type so we can just change the generics if generic_const_exprs ever gets stabilised (I wouldn't hold my breath). so just new

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather keep the _mn because it's a pain to debug cross-crate function naming issues when the function mostly stays the same but const parameters change a little. But I will remove singleton.

input: ConstStackVarULE<M, V>,
) -> ConstStackVarULE<N, PluralElementsPackedULE<V>> {
#[allow(clippy::panic)] // for safety, and documented
if N != M + 1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: this should be a const assertion, not a runtime panic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please suggest code that compiles with a const assertion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you can't do it then we shouldn't have this yet

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm? We have lots of const code with runtime assertions because there isn't a good way to write const assertions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even the Rust standard library does runtime panics on const parameters. For example:

https://doc.rust-lang.org/std/primitive.slice.html#method.as_chunks

slice::as_chunks panics if the const parameter N is zero. There was a lengthy discussion on the proposal thread, starting here: rust-lang/rust#74985 (comment)

remainder[i] = input.as_bytes()[i];
i += 1;
}
// First byte = 0 for a singleton
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not "for a singleton" but "for a other case"

and this 0 represents a PluralCategoryAndMetadata, so you should use that type to construct it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is for a singleton. That is the term from the byte encoding definition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually to be fully correct we should add in the FourBitMetadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes a PackedPluralElementsULE<T> corresponds to a PluralElements<(FourBitMetadata, T)>, so the constructor should take that tuple.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. (not as a tuple)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all other APIs use (FourBitMetadata, T), please at least change the order

/// ConstStackVarULE::<20, str>::try_from_encodeable("hello, world!").unwrap_err();
/// ```
#[derive(Copy, Clone, PartialEq, Eq)]
pub struct ConstStackVarULE<const N: usize, V: VarULE + ?Sized> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: this is not generally on the stack. the point is that it's Sized so it can be on the stack. SizedVarULE sounds like a trait, maybe SizedVarULEBytes? EncodedSizedVarULE? EncodedVarULEBytes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like SizedVarULEBytes

///
/// Returns an error if the byte length in the container is not the correct length
/// for the encodeable object.
pub fn try_from_encodeable(input: impl EncodeAsVarULE<V>) -> Result<Self, UleError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn try_from_encodeable(input: impl EncodeAsVarULE<V>) -> Result<Self, UleError> {
pub fn try_from_encodeable(input: &impl EncodeAsVarULE<V>) -> Result<Self, UleError> {

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure the recommend style here. It doesn't matter if the trait is implemented on a reference. @Manishearth ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is style choice. you only test this with a &str input, but say you have a String, you don't want to accidentally pass that by value.

@robertbastian
Copy link
Member

This code requires subject matter expertise in 3 different components: pattern (to know that 1 means "single placeholder pattern with a single placeholder"), plurals (to know that 0 means "singleton plural elements"), and compact (to know that this combination of pattern and plurals is suitable for PLURAL_PATTERN_0)

No it doesn't. I wrote that code and don't know how these things are encoded. I just wrote the test and then put the bytes there that make the test pass. I think that's a valid and very readable approach.

But, you should be able to write a call site such as the following, and no extra test is required

Yes! I would love that. But it's not feasible with current Rust/zerovec.

/// let plural_ule = PluralElementsPackedULE::new_singleton_mn::<13, 14>(inner_ule, metadata);
/// let rules = PluralRules::try_new(locale!("en").into(), Default::default()).unwrap();
///
/// assert_eq!(plural_ule.as_varule().get(0.into(), &rules).1, "hello, world!");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: merge the assertions

Suggested change
/// assert_eq!(plural_ule.as_varule().get(0.into(), &rules).1, "hello, world!");
/// assert_eq!(plural_ule.as_varule().get(0.into(), &rules), ("hello, world!", metadata));

Comment on lines 532 to 533
/// const plural_ule: SizedVarULEBytes<1, PluralElementsPackedULE<str>> =
/// PluralElementsPackedULE::new_singleton_mn::<0, 1>(SizedVarULEBytes::EMPTY_STR, FourBitMetadata::zero());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or better just integrate into the example above, this is very repetitive

Suggested change
/// const plural_ule: SizedVarULEBytes<1, PluralElementsPackedULE<str>> =
/// PluralElementsPackedULE::new_singleton_mn::<0, 1>(SizedVarULEBytes::EMPTY_STR, FourBitMetadata::zero());
/// let plural_ule = const {
/// PluralElementsPackedULE::new_singleton_mn::<0, 1>(SizedVarULEBytes::EMPTY_STR, FourBitMetadata::zero()) };

/// ```
///
/// [generic_const_exprs]: https://doc.rust-lang.org/beta/unstable-book/language-features/generic-const-exprs.html#generic_const_exprs
pub const fn new_singleton_mn<const M: usize, const N: usize>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also doesn't need to be qualified as _mn; it's an unstable type so we can just change the generics if generic_const_exprs ever gets stabilised (I wouldn't hold my breath). so just new

@Manishearth
Copy link
Member

Manishearth commented Jan 10, 2026

It's not just about that one expression. It's about that class of expressions, which I claim is a class that will get bigger over time.

I am not yet convinced that this is the case, but the use case seems sufficiently well outlined and I agree it is valid.. I would be in favor of us having more consts like this. As I said before, the default construction problem in varule is something I've wanted to see improvement on.

Regarding the MN stuff I'm overall okay with landing suboptimal API we know that we can clean up in future Rust. Though I don't know if we have a timeline for const expressions in generics?

@sffc
Copy link
Member Author

sffc commented Jan 10, 2026

Though I don't know if we have a timeline for const expressions in generics?

Well, it works in nightly, and rust-lang/rust#76560 has a lot of upvotes. It seems like there are concrete next steps but no one is currently actively funding it.

@sffc sffc requested review from a team and younies as code owners January 10, 2026 05:05
@sffc
Copy link
Member Author

sffc commented Jan 10, 2026

OK I added an ergonomic to_sized_varule_bytes! macro for construction, which even works with type hinting, and then migrated the compactdecimal call site to:

    pub const PLURAL_PATTERN_0: SizedVarULEBytes<
        2,
        PluralElementsPackedULE<SinglePlaceholderPattern>,
    > = PluralElementsPackedULE::new_mn(
        FourBitMetadata::zero(),
        to_sized_varule_bytes!(SinglePlaceholderPattern::PASS_THROUGH),
    );

Const variables always require a type, or else the type could be elided, except for the number 2.

@Manishearth
Copy link
Member

@sffc Thanks!

I think if we're open to using a macro we don't need the SizedVarULEBytes type at all. The macro can construct a const/static array, and then construct a second const/static that references it.

This is what I was referring to earlier:

An alternative potential route might be to define a macro that can define VarULE consts from unchecked bytes, where it also generates a test. This macro becomes unsafe to use, BUT its safety invariant is mostly "if the test passes this macro usage is fine".

I can write more about this later.

@sffc
Copy link
Member Author

sffc commented Jan 11, 2026

If we can't remove the unsafe, then a macro of that shape isn't worth doing in my opinion.

@Manishearth
Copy link
Member

Manishearth commented Jan 11, 2026

Sorry, I should have been more explicit: I think using the design of your macro so far combined with my model, this can be done without callsite unsafe.

@sffc sffc added the waiting-on-reviewer PRs waiting for action from the reviewer for >7 days label Jan 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

waiting-on-reviewer PRs waiting for action from the reviewer for >7 days

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants