-
Notifications
You must be signed in to change notification settings - Fork 57
text_primitives: improve OpenType settings CSS parsing #501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
text_primitives: improve OpenType settings CSS parsing #501
Conversation
| at: usize, | ||
| span: Option<(usize, usize)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually I discourage the use of architecture-dependent types like usize to keep behaviour consistent across platforms, but I think it's fine for this type of struct. Curious what your thoughts are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything else would be super annoying in terms of casts.
text_primitives/src/tag.rs
Outdated
| while pos < self.len { | ||
| let ch = self.source[pos]; | ||
| ch.is_ascii_whitespace() || ch == b',' | ||
| } { | ||
| pos += 1; | ||
| if ch.is_ascii_whitespace() || ch == b',' { | ||
| pos += 1; | ||
| } else { | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe these lines allow an arbitrary number and order of and ,. That's technically not allowed in the CSS spec IIUC, but I understand if you'd prefer to keep the leniency.
i.e., this test passes:
#[test]
fn parse_feature_settings_css_list_ok() {
let parsed: Result<Vec<_>, _> =
Setting::<u16>::parse_css_list(r#""liga" on,,, , ,,, 'kern', "dlig" off, "salt" 3,"#).collect();
let settings = parsed.unwrap();
assert_eq!(settings.len(), 4);
assert_eq!(settings[0].tag, Tag::parse("liga").unwrap());
assert_eq!(settings[0].value, 1);
assert_eq!(settings[1].tag, Tag::parse("kern").unwrap());
assert_eq!(settings[1].value, 1);
assert_eq!(settings[2].tag, Tag::parse("dlig").unwrap());
assert_eq!(settings[2].value, 0);
assert_eq!(settings[3].tag, Tag::parse("salt").unwrap());
assert_eq!(settings[3].value, 3);
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to start out lenient and make it less lenient if needed in the future. Working on a revision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I've pushed a commit that addresses this, adds more tests, and hardens some other stuff in parsing.
df22fc3 to
60373c0
Compare
|
Only partially related to this, but ... would be good to have something, somewhere for handling an equivalent of the |
dfrg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM with a couple questions.
| self.tmp_variations.clear(); | ||
| self.tmp_variations | ||
| .extend(FontVariation::parse_list(source)); | ||
| .extend(FontVariation::parse_css_list(source).map_while(Result::ok)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be filter_map instead? Seems useful to skip invalid entries when possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll just give the Codex response here which I think I agree with:
filter_map(Result::ok)only helps if the iterator can continue producing items after an error. Ourparse_css_listiterators don’t recover: on the firstErrthey setdone = trueand terminate. In that world:
map_while(Result::ok)= “take items until first error”filter_map(Result::ok)= “drop the error, but there are no later items anyway”So switching to
filter_mapwouldn’t actually “skip invalid entries when possible” today; it would just slightly change the signal of intent (and arguably mislead future readers into thinking we do recovery).If the intent is “stop at first parse error, keep earlier valid ones” (matching the old
take_while(ok)behavior),map_while(Result::ok)is the clearer expression.
If the intent is “try to salvage later entries”, we’d need to change the parser to recover and keep iterating; thenfilter_mapwould make sense.
text_primitives/src/tag.rs
Outdated
|
|
||
| /// A single OpenType setting (tag + value). | ||
| #[derive(Clone, Copy, PartialEq, Debug)] | ||
| pub struct Setting<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about hiding this and just exposing concrete FontVariation and FontFeature types instead? I’m trying to think of a case where we want to handle these generically and none comes to mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done this as a separate commit in the series ... it ended up being sizable once I took it to the logical endpoint. Or what I think is the logical endpoint.
Seems like a reasonable thing to include. |
|
I feel like all parsing in the |
- Add structured parsing errors (kind + byte offset/span) for CSS OpenType settings strings. - Rename list parsers to `parse_css_list`. - Update Parley resolver to use the fallible CSS parser. - Add unit tests covering success cases and key error offsets/spans.
Co-authored-by: Taj Pereira <[email protected]>
- Make `ParseCssList::next` strict about separators (no ,, / leading comma) - Improve `u16` feature value error classification (`InvalidSyntax` vs `OutOfRange`) - Add regression tests for separator “soup” and value errors
…e/variation types * Replace `Setting<T>` with `FontFeature`/`FontVariation` and export them from `text_primitives` * Add `FontVariations`/`FontFeatures` (`Source`/`List`) wrappers and make them `Into<StyleProperty>` * Add ergonomic constructors: `empty()` and `From<&[T]>`/`From<&[T; N]>` for list usage * Update resolve logic and tests to the new API surface
60373c0 to
7508192
Compare
I've written something for this ... but it is also a fair bit of stuff. I think I'd rather submit as a followup. |
|
I've added the |
0c4fdae to
49cd65d
Compare
My gut feeling is that these don't quite fit this crate, at least as designed. If they were a direct mapping of the various CSS Browsers are also capable of simulating at least If you don't mind, my preference would be to pull these out into a separate PR for further discussion. |
|
I agree. I do want to land this though with the other commits here, including removing the public Setting. |
|
Good to go! |
49cd65d to
7508192
Compare
parse_css_list.