Description
Some thoughts on the structure of DateTime FFI.
Types: we might not need very many
The Rust type `DateTimeFormatter<CompositeFieldSet>` does everything. We can have many constructors that load different subsets of data. The reason to have multiple types is to reduce stack size (memory usage in FFI) and make the Drop impl a bit smaller, which I think is valuable.
It probably makes sense to start with these types:
FFI type | Rust type | Approximate stack size (bytes) |
---|---|---|
TimeFormatter | `TimeFormatter<TimeFieldSet>` | 272 |
DateFormatter | `DateTimeFormatter<DateFieldSet>` | 440 |
DateTimeFormatter | `DateTimeFormatter<CompositeDateTimeFieldSet>` | 480 |
ZonedDateTimeFormatter | `DateTimeFormatter<CompositeFieldSet>` | 1472 |
GregorianDateTimeFormatter | `FixedCalendarDateTimeFormatter<Gregorian, CompositeDateTimeFieldSet>` | 424 |
GregorianZonedDateTimeFormatter | `FixedCalendarDateTimeFormatter<Gregorian, CompositeFieldSet>` | 1416 |

    size_test!(TimeFormatter<fieldsets::enums::TimeFieldSet>, time_formatter_size, 272);
    size_test!(DateTimeFormatter<fieldsets::enums::DateFieldSet>, date_formatter_size, 440);
    size_test!(DateTimeFormatter<fieldsets::enums::CompositeDateTimeFieldSet>, date_time_formatter_size, 480);
    size_test!(DateTimeFormatter<fieldsets::enums::CompositeFieldSet>, zoned_date_time_formatter_size, 1472);
    size_test!(FixedCalendarDateTimeFormatter<icu_calendar::Gregorian, fieldsets::enums::CompositeDateTimeFieldSet>, gregorian_date_time_formatter_size, 424);
    size_test!(FixedCalendarDateTimeFormatter<icu_calendar::Gregorian, fieldsets::enums::CompositeFieldSet>, gregorian_zoned_date_time_formatter_size, 1416);

Maybe we don't need the Gregorian types. The only difference is the presence of the `AnyCalendar` field and its associated destructor, which seems to be fairly small in the grand scheme. We can still of course have the Gregorian constructors, but they can resolve to these types.
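
A minimal sketch of what "Gregorian constructors that resolve to these types" could look like. Everything here is a hypothetical stand-in (`CalendarSketch`, `CalendarAwareFormatterSketch`, `new_with_calendar`, `new_gregorian`); none of it is existing ICU4X or FFI API.

```rust
// Hypothetical stand-in for the calendar field (`AnyCalendar` in the Rust
// type); only two variants are modeled for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CalendarSketch {
    Gregorian,
    Buddhist,
}

// Hypothetical FFI-facing formatter: one type regardless of calendar.
#[derive(Debug)]
pub struct CalendarAwareFormatterSketch {
    calendar: CalendarSketch,
    // Pattern and name data would live here in a real formatter.
}

impl CalendarAwareFormatterSketch {
    // General constructor: the calendar is chosen at runtime.
    pub fn new_with_calendar(calendar: CalendarSketch) -> Self {
        Self { calendar }
    }

    // Gregorian constructor: same return type, calendar fixed up front.
    // The assumption is that a real implementation could load only
    // Gregorian data here while still returning the shared type.
    pub fn new_gregorian() -> Self {
        Self { calendar: CalendarSketch::Gregorian }
    }

    pub fn calendar(&self) -> CalendarSketch {
        self.calendar
    }
}
```

The cost of this shape is that the calendar field and its destructor are always present, which is the trade-off described above.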
I think we can also have just `DateTimeFormatter` and not `DateFormatter`.
Basically: time zone data is way bigger than date data, which is way bigger than time data. So if you already have date data, removing time data is mostly in the noise, and if you already have zone data, removing date data is mostly in the noise.
Static Field Set Constructors
For dates, times, and datetimes, I think we can just have all the constructors. There are 20ish.
I think we should flatten the field set options into positional arguments, like we do in some other types. This avoids having a bunch of tiny field set structs like we do in Rust.
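
To illustrate the flattening, here is a rough sketch contrasting a Rust-style field set struct with an FFI-style constructor that takes the same options as positional arguments. All names (`LengthSketch`, `TimePrecisionSketch`, `YmdtSketch`, `FlatArgsFormatterSketch`, `create_ymdt`) are hypothetical and only meant to show the shape.

```rust
// All names here are hypothetical and exist only for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LengthSketch {
    Long,
    Medium,
    Short,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimePrecisionSketch {
    Hour,
    Minute,
    Second,
}

// Rust-style: a tiny struct per field set that carries its options.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct YmdtSketch {
    pub length: LengthSketch,
    pub time_precision: TimePrecisionSketch,
}

#[derive(Debug, PartialEq, Eq)]
pub struct FlatArgsFormatterSketch {
    length: LengthSketch,
    time_precision: TimePrecisionSketch,
}

impl FlatArgsFormatterSketch {
    // Rust-style constructor: takes the field set struct.
    pub fn new(field_set: YmdtSketch) -> Self {
        Self {
            length: field_set.length,
            time_precision: field_set.time_precision,
        }
    }

    // FFI-style constructor: one function per field set, with the options
    // flattened into positional arguments, so no `YmdtSketch`-like type
    // has to cross the FFI boundary.
    pub fn create_ymdt(length: LengthSketch, time_precision: TimePrecisionSketch) -> Self {
        Self::new(YmdtSketch { length, time_precision })
    }
}
```

With roughly 20 field sets, this means roughly 20 `create_*` functions but zero extra struct types on the FFI side.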
For time zones, I see a few options…
- Maybe we could have a fn to "add" time zone data to a `DateTimeFormatter`, transforming it into a `ZonedDateTimeFormatter`. This would require a bit of work on the Rust side, but I think it's doable.
- Or we do a hybrid approach where we have time zone constructors for each style and they take a dynamic datetime field set as an argument. You won't get slicing of date data, but date data is small relative to time zone data.
- Anything else?
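
The first option above could look something like the following sketch, where the "add" fn consumes the unzoned formatter and returns the zoned one. The types, the `with_zone` name, and the error type are all assumptions for illustration; the real Rust-side work is not modeled.

```rust
// Hypothetical zone styles; the real set of styles is not modeled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ZoneStyleSketch {
    SpecificShort,
    GenericLong,
}

// Placeholder error for a failed zone data load (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub struct DataErrorSketch;

// Stand-in for `DateTimeFormatter`; the string is a placeholder for the
// date and time data that has already been loaded.
#[derive(Debug)]
pub struct DateTimeFormatterSketch {
    date_time_data: String,
}

// Stand-in for `ZonedDateTimeFormatter`.
#[derive(Debug)]
pub struct ZonedDateTimeFormatterSketch {
    date_time_data: String,
    zone_style: ZoneStyleSketch,
}

impl DateTimeFormatterSketch {
    pub fn new(date_time_data: &str) -> Self {
        Self { date_time_data: date_time_data.to_string() }
    }

    // Consumes the unzoned formatter and returns a zoned one. The already
    // loaded date/time data is moved over rather than reloaded. This sketch
    // always succeeds; a real implementation would load zone data here and
    // could fail, hence the `Result`.
    pub fn with_zone(
        self,
        zone_style: ZoneStyleSketch,
    ) -> Result<ZonedDateTimeFormatterSketch, DataErrorSketch> {
        Ok(ZonedDateTimeFormatterSketch {
            date_time_data: self.date_time_data,
            zone_style,
        })
    }
}

impl ZonedDateTimeFormatterSketch {
    pub fn date_time_data(&self) -> &str {
        &self.date_time_data
    }

    pub fn zone_style(&self) -> ZoneStyleSketch {
        self.zone_style
    }
}
```

Taking `self` by value keeps ownership simple over FFI: the caller hands in one opaque object and gets a different one back.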
Dynamic Field Set Constructors
The Rust approach is infeasible over FFI, since it is built on the idea of a bunch of little structs that compose into bigger structs and eventually into an all-encompassing dataful enum.
I think we instead want to introduce a builder API. I think we should start in Rust and then port it to FFI. I've had in mind for a while that the builder API would work similarly to the Serde impl.
There are a few approaches for specifying the field sets:
1. Enums and structs: specify all date, calendar period, time, and zone fields in plain enums. Then you put the ones you want into a struct. The constructor is fallible if you put together an invalid combination of field sets, but the field sets individually are always valid. This is slightly more type-safe.
2. Set of fields: export a single enum with possible fields, and the field set is a vector of that enum. Then we check to see if the fields you put in the vector compose a valid field set. This is what the Serde impl does.
3. String: specify the semantic field set as a string, perhaps in JSON syntax, and use the Serde impl directly.
I lean toward option 2 because it is a smaller API surface, but I can be easily persuaded. If option 1 can be implemented more efficiently since it is structs and enums instead of a vector, that would convince me.
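
For concreteness, a minimal sketch of the "set of fields" approach: a single field enum, a slice of fields as input, and a fallible check. The enum variants, error type, and the one validity rule shown are illustrative assumptions and do not reproduce the actual Serde impl.

```rust
// Hypothetical single enum of possible fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldSketch {
    Year,
    Month,
    Day,
    Weekday,
    Time,
    Zone,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FieldSetErrorSketch {
    Empty,
    Duplicate(FieldSketch),
    InvalidCombination,
}

// A validated field set, stored compactly as a bit mask.
#[derive(Debug, PartialEq, Eq)]
pub struct ValidatedFieldSetSketch {
    bits: u8,
}

// Maps each field to a distinct bit based on its declaration order.
fn bit(field: FieldSketch) -> u8 {
    1 << (field as u8)
}

// Checks whether the given fields compose a valid field set.
pub fn validate(fields: &[FieldSketch]) -> Result<ValidatedFieldSetSketch, FieldSetErrorSketch> {
    if fields.is_empty() {
        return Err(FieldSetErrorSketch::Empty);
    }
    let mut bits = 0u8;
    for &field in fields {
        if (bits & bit(field)) != 0 {
            return Err(FieldSetErrorSketch::Duplicate(field));
        }
        bits |= bit(field);
    }
    let has = |field: FieldSketch| (bits & bit(field)) != 0;
    // Illustrative rule only: year + day without month is rejected.
    // A real check would encode the full list of valid combinations.
    if has(FieldSketch::Year) && has(FieldSketch::Day) && !has(FieldSketch::Month) {
        return Err(FieldSetErrorSketch::InvalidCombination);
    }
    Ok(ValidatedFieldSetSketch { bits })
}
```

The API surface here is one enum and one fallible function, which is the main appeal relative to the enums-and-structs approach.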
Alternatively, we could say that we don't export dynamic field sets over FFI. Clients in each language can build their own delegation mechanism that sits on top of the static constructors. This would be annoying for clients, but it is totally feasible. And if CLDR adds a string semantic skeleton syntax, we can add a single narrow constructor for that.
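
A client-side delegation layer of the kind described above might be sketched like this. The static constructors (`create_ymd`, `create_md`, `create_t`) and the client enum are hypothetical placeholders for whatever FFI ends up exporting.

```rust
// Hypothetical client-defined enum describing the field set at runtime.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DynamicFieldSetSketch {
    Ymd,
    Md,
    T,
}

// Stand-in for an FFI formatter; records which constructor built it.
#[derive(Debug, PartialEq, Eq)]
pub struct StaticFormatterSketch {
    constructed_by: &'static str,
}

// Stand-ins for the static constructors exported over FFI.
impl StaticFormatterSketch {
    pub fn create_ymd() -> Self {
        Self { constructed_by: "create_ymd" }
    }

    pub fn create_md() -> Self {
        Self { constructed_by: "create_md" }
    }

    pub fn create_t() -> Self {
        Self { constructed_by: "create_t" }
    }
}

// Client-side delegation: each dynamic value maps to exactly one static
// constructor. Every client language would have to write and maintain
// its own version of this match.
pub fn create_dynamic(field_set: DynamicFieldSetSketch) -> StaticFormatterSketch {
    match field_set {
        DynamicFieldSetSketch::Ymd => StaticFormatterSketch::create_ymd(),
        DynamicFieldSetSketch::Md => StaticFormatterSketch::create_md(),
        DynamicFieldSetSketch::T => StaticFormatterSketch::create_t(),
    }
}
```

The match grows with the number of static constructors, which is where the annoyance for clients comes from.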