
HTML Ruby PR Escalation to WHATWG SG #184

Closed
@fantasai


Hello WHATWG SG,

This message concerns “ruby”, a form of interlinear annotation commonly used in East Asia, primarily (but not exclusively) for phonetic assistance. It is an important typographic feature as well as a critical accessibility feature for languages such as Japanese, where virtually all language learners, including all school-age children, rely on it.

I'm writing specifically to escalate the matter of the rejected PR to synchronize the W3C and WHATWG HTML ruby specifications.

Aside from substantially rewriting the prose and examples, the pull request does three things:

  • adds a row-primary ruby markup pattern using RB
  • removes Hixie's double-rt double-sided ruby feature that nobody implements
  • defines RTC to provide double-sided ruby under the row-primary model
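
For readers unfamiliar with the markup, here is a minimal sketch of the patterns involved. It is modeled on the style of examples in the W3C ruby extension documents; the example word and its readings are illustrative assumptions, not text taken from the PR.

```html
<!-- Interleaved markup: each base is immediately followed by its annotation -->
<ruby>東<rt>とう</rt>京<rt>きょう</rt></ruby>

<!-- Row-primary markup using rb: all bases first, then all annotations -->
<ruby><rb>東</rb><rb>京</rb><rt>とう</rt><rt>きょう</rt></ruby>

<!-- Double-sided ruby under the row-primary model: rtc carries a second level of annotation -->
<ruby><rb>東</rb><rb>京</rb><rt>とう</rt><rt>きょう</rt><rtc>Tokyo</rtc></ruby>
```

How each pattern is rendered depends on the engine's layout support, as discussed below.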

This PR resolves what is, afaik, the only major intentional difference between the WHATWG and W3C HTML specs (prior to the MOU), incorporating the broadly consensus-backed W3C extended ruby markup model into the WHATWG spec.

We have two implementations of the first (RB) feature: Mozilla Firefox and Amazon Kindle. (But I am told that Kindle does not count as far as WHATWG is concerned, since it is not a "Web browser".) The second (RTC) feature is currently only implemented by Mozilla afaik (and is somewhat less critical).


The background on this PR starts more than 10 years ago; to understand it, the best place to start is probably this blog post:
https://fantasai.inkedblade.net/weblog/2011/ruby/

For more examples and explanations there's also the content of the HTML PR, which is posted here:
https://html.rivoal.net/multipage/text-level-semantics.html#the-ruby-element

These two should explain the use cases and requirements, and the resulting technical design, of the extended markup model.

From a historical perspective, fantasai's post kicked off two efforts in coordination with the i18nwg:

  • https://www.w3.org/TR/html-ruby-extensions/ in the HTMLWG
    This eventually got incorporated into W3C HTML5, but, due to Hixie's objections to adding anything more complicated than what he personally thought was adequate, never made it into WHATWG HTML.

  • https://www.w3.org/TR/css-ruby-1/ in the CSSWG
    The modern CSS Ruby Layout spec, which handles all the markup patterns described in the blog post.

Thanks to the efforts of Koji Ishii, Robin Berjon, Henri Sivonen, and the i18nwg, all browser engines implemented the relevant parsing rules for the full set of ruby elements a while back, and this has been incorporated into the HTML parsing algorithm across all specs and implementations.

Subsequently the Ruby Markup Extension spec was retired as a W3C NOTE and its contents incorporated into the W3C HTML REC.

Xidorn Quan (@upsuper) then implemented the new CSS Ruby layout model in Gecko, and used it to implement layout support for the full HTML ruby extensions spec in Firefox. He later filed an issue to have WHATWG sync with W3C's HTML spec, which then stalled.

At the 2018 TPAC the W3C i18nwg and WHATWG editors discussed the situation of HTML ruby markup, as it was a major technical issue that the W3C community found unsatisfactory in the WHATWG specification. The i18nwg promised to provide a PR, and the WHATWG would merge it, given Firefox's CSS implementation and a commitment to a second layout implementation.

This PR was provided in March 2020, at which point the WHATWG editors clarified that the second implementation must be "a second implementation approved for shipping" of support for layout of the extended markup, and cannot be anything but WebKit or Blink because WHATWG policy only considers web browsers to be implementations. See whatwg/html#1771 (comment)


Thus currently we find ourselves in an awkward position:

  • The HTML ruby extension spec and HTML5 REC are retired as well as somewhat outdated.
  • The editors of the current HTML spec are unwilling to merge the PR because there is no second shipping Web browser engine implementation.
  • Because no maintained specification exists, we have implementers (such as Amazon) working off of old, unmaintained specs, which creates a poor environment for interop and cooperation.

There are, afaict, two ways out of this situation:

  1. WHATWG merges the PR without a second browser implementation

    • Benefits of this approach are that the HTML spec stays in one place.
    • Detriments are that the WHATWG community seems difficult to work with on this matter, which could impede further improvements.

  2. W3C publishes a Ruby Extension Recommendation

    • Benefits of this approach are that the specification is worked on where the most relevant expertise already exists (i18n, a11y, css).
    • Detriments are that this portion of the HTML spec will be maintained outside the otherwise-complete WHATWG living standard.

    Note: In this second case, we would provide a PR to update the WHATWG specification to remove the features that were never implemented and to ensure that, in both technical and editorial respects, the specifications are synchronized and appropriately cross-referenced to each other such that the WHATWG specification clearly defines a semantically compatible subset of the W3C extended ruby model.

The question I am putting before the WHATWG Steering Group is, which of these two approaches would it prefer to take?

~fantasai
