
Principle: Write only one algorithm to accomplish a task. #562

Draft · wants to merge 3 commits into main

Conversation

@jyasskin (Contributor) commented Mar 7, 2025

This explains why and when "polyglot" formats are a bad idea.

Fixes #239.

There's some overlap between this and the preceding section, Resolving tension between interoperability and implementability. Do y'all think it's ok, or are there bits we could refactor together?

I'd also like to give an example of parsing divergence yielding security bugs, but I didn't have any readily available. Ideas?


@jyasskin jyasskin requested review from hober and csarven March 7, 2025 00:52
index.bs Outdated
@@ -3488,6 +3505,52 @@ While the best path forward may be to choose not to specify the feature,
there is the risk that some implementations
may ship the feature as a nonstandard API.

<h3 id="multiple-algorithms">Write only one algorithm to accomplish a task</h3>
Member
Maybe "goal" instead of "task"? This immediately made me think of the event loop.


When specifying how to accomplish a task, write a single algorithm to do it,
instead of letting implementers pick between multiple algorithms.
It is very difficult to ensure that
Contributor
Suggested change
It is very difficult to ensure that
It is very difficult to ensure that

index.bs Outdated
two different algorithms produce the same results in all cases,
and doing so is rarely worth the cost.

Multiple algorithms seem particularly tempting when defining
Contributor

I don't think that you need this paragraph as long as an example mentions a file format.

index.bs Outdated
using either the [[HTML#the-xhtml-syntax|XHTML parsing]]
or [[HTML#syntax|HTML parsing]] algorithm.
Authors who tried to use this syntax tended to produce documents
that actually only worked with one of the two parsers.
Contributor
Suggested change
that actually only worked with one of the two parsers.
that only worked with one of the two parsers.


Note: While [[rfc6838#section-6|structured suffixes]] define that
a document can be parsed in two different ways,
they do not violate this rule because the results have different data models.
Contributor
Isn’t the real difference here that the suffix parsing produces an intermediate result?

I suspect that this is still insufficient, because it doesn't really get at why suffix parsers exist. That is still somewhat contested, but my view is that intermediate results can rarely be processed meaningfully, so they are limited to use in diagnostic tools and the like.

@msporny left a comment
Hmm, the principle seems too blunt to be useful. Some high-level thoughts to start; I'm still trying to think about what text would be useful:

  • Polyglot, as a term, is wrong -- this is not about a single system interpreting a data serialization using different algorithms. It's more about an ecosystem interpreting that same data serialization using different algorithms (which is useful, more on that below).
  • Yes, there are cases where this resulted in bad outcomes -- XHTML/HTML is a good example.
  • The comparison between VCDM and SD-JWT-VC is totally wrong, they're two totally different data models, using two totally different serializations, using two totally different algorithms -- and there are a number of us that think that whole thing is a massive standardization failure, so using that as an example of the right way to do something is not what we want to do. The only thing they have in common is the phrase "Verifiable Credential", and even that is being objected to by some of us.
  • The multiple suffixes thing is also contested -- in the IETF MEDIAMAN WG, we couldn't find broad-scale usage of suffix-based processing, what @martinthomson is saying is important here. I'll add that the suffix-based processing is also not a clear example of why this principle is good or bad.

Fundamentally, the principle seems misguided. Yes, at some level one data format and one algorithm is a good thing. However, what a traditional web crawler gets out of a web page is different from what a browser parsing the web page works with is different from what a frontier AI model gets out of a web page. The algorithms that each uses are quite different and useful and this principle seems to be arguing against that.

I think the only solid ground here is the XHTML/HTML example. You're going to get push back on the other items being mentioned if they continue to be mentioned in the way the current PR is written up.

I'll try to think of some constructive text, but wanted to get some preliminary thoughts down in an effort to help shape the PR into something more easily defensible.

@filip26 commented Mar 7, 2025

Algorithms + Data Structures = Programs (Niklaus Wirth).

It’s rational to avoid having two algorithms performing the same function, especially when considering costs like time and space complexity, and to recommend the one that best fits the criteria. However, if this change is based on the assumption:

use either JSON or JSON-LD to parse bytes into their data models.

then there is a misunderstanding of the basics of algorithmics. JSON and JSON-LD have different data models, as noted. They involve different data structures in the equation at the top, which means different algorithms are needed because they operate on different data structures.

From this perspective, calling for one algorithm to operate on different data structures does not make sense.

My recommendation would be to use a different argument when advocating for a single algorithm - considering factors such as time complexity, space complexity, etc.
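The point that JSON and JSON-LD involve different data structures can be illustrated with a deliberately simplified sketch. The `expand` helper below is a hypothetical stand-in for what a real JSON-LD processor (e.g. pyld) does during expansion; it only shows that the same bytes yield two differently shaped results.

```python
import json

raw = b'{"@context": {"name": "http://schema.org/name"}, "name": "Alice"}'

# Interpretation 1: plain JSON -- a tree of dicts, lists, and strings.
as_json = json.loads(raw)

# Interpretation 2: JSON-LD -- the @context maps terms to IRIs, producing
# a graph-oriented data model. This toy expansion handles only the
# trivial case of flat term-to-IRI mappings.
def expand(doc: dict) -> dict:
    ctx = doc.get("@context", {})
    return {ctx.get(key, key): value
            for key, value in doc.items() if key != "@context"}

as_jsonld = expand(as_json)

print(as_json["name"])                      # keyed by the literal term
print(as_jsonld["http://schema.org/name"])  # keyed by the expanded IRI
```

Same serialization, two data structures: the plain-JSON tree keeps the short term `name`, while the (sketched) JSON-LD view keys the value by a full IRI.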

@martinthomson (Contributor) commented

I don't agree with Manu about this being misguided. The point here is that the same HTML document is not seeking to express multiple distinct sets of semantics depending on how it is processed: there is just one HTML with one interpretation, and one data model that both the producer of the content and the consumer of the content can agree on. If they disagree, that is likely because one or the other is wrong.

This is because there is just a single specification for HTML and a single way to interpret HTML content according to that specification.

Obviously, what someone does once they have received HTML might differ, but those differences relate not to how the HTML itself was interpreted, but to how the content at the next layer (that is, the words and images and stuff like that) is interpreted. Sure, a human and an AI model will seek to do different things with the information they are presented with, but the interpretation is singular.

Where CID struggles a little is that there are two paths to the same interpretation. It manages that by giving implementations a choice and promising that the outcome will be the same either way. It's bad, because now there is a third place where a bug can result in a different interpretation (producer, consumer, and now spec), but it's not fundamentally a polyglot in the sense that there are multiple divergent interpretations possible.

The core message is that having divergent paths is undesirable. And yes, that means saying that seeking to have a pure-JSON vs. a JSON-LD interpretation of the same content is a bad idea, because divergence in data models means that there is no single interpretation of the content on which all potential recipients might agree.
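One well-known class of the security bugs the original post asked about is parser-differential behaviour on duplicate JSON keys: some parsers keep the first occurrence, others the last. The sketch below uses Python's real `json` module for the last-wins case and simulates a first-wins parser with `object_pairs_hook`; the "role" check is a contrived illustration, not any particular system.

```python
import json

raw = '{"role": "user", "role": "admin"}'

# Parser A (Python's json default): the LAST duplicate key wins.
consumer_view = json.loads(raw)

# Parser B: simulate a FIRST-wins parser. Reversing the pairs before
# building the dict makes the first original occurrence survive.
producer_view = json.loads(
    raw, object_pairs_hook=lambda pairs: dict(reversed(pairs)))

# If a gatekeeper validates the document with parser B but the backend
# acts on parser A, the same bytes are "user" in one place and "admin"
# in the other -- a classic validation-bypass shape.
print(producer_view["role"], consumer_view["role"])
```

Two conforming-looking parsers, one byte sequence, two interpretations: exactly the divergence a single specified algorithm is meant to rule out.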

to assign properties to particular objects than JSON does,
these specifications had to add extra rules to both kinds of parsers
in order to ensure that each input document had exactly one possible interpretation.
[[vc-data-model-2.0 inline]] and [[draft-ietf-oauth-sd-jwt-vc inline]] fixed the problem
Contributor
Manu is right that these are completely different (and that they likely represent standardization failure, though the question of where the failure occurred might be contested). In a sense, it is OK that they are completely different (that they are in competition is potentially bad if they address the same use cases, but there is no risk that one might be mistaken for the other).

I think that it would serve this example better to focus only on the CID case.

@gkellogg commented

This issue really gets at the heart of a basic divide at W3C: one that is browser-centric, vs. one which is data-centric. In fact, JSON-LD does parse JSON (and YAML and CBOR) into a common INFRA-based data structure (called the Internal Representation) which various algorithms operate over to perform different transformations, including to interpret as RDF. This is the core reason behind JSON-LD, which has become extremely widely used on the Web (in large part, due to schema.org).

HTML is also often processed differently, typically by interpreting the resulting DOM. This might be done to extract Microdata/RDFa, interpret the contents of script elements, or to perform extensive re-formatting through ReSpec or Bikeshed. Search engines interpret the DOM for their own uses, so a general principle would seem to settle on a data representation which different applications can use to suit their different use cases.

In the case of Verifiable Credentials, the basic failure would seem to be a lack of agreement on how to work with the data that is represented in the JSON. This is an area the TAG can help with for future specs, rather than getting into a reductionist view that Polyglot formats are fundamentally a bad idea.

these specifications had to add extra rules to both kinds of parsers
in order to ensure that each input document had exactly one possible interpretation.
[[vc-data-model-2.0 inline]] and [[draft-ietf-oauth-sd-jwt-vc inline]] fixed the problem
by defining different media types for JSON-LD vs JSON,
@BigBlueHat commented Mar 11, 2025

This statement is incorrect. The VCDMv2 only has a single media type: application/vc.

Likewise, SD-JWT-VC only has one: application/dc+sd-jwt. However, SD-JWT's base format is not parseable as JSON.

Both specifications have parsing algorithms unique to their media types--and specific to their tasks.

It remains unclear how these examples are "polyglot".

@hober (Contributor) left a comment

This looks really good to me on a first-pass look. I'll try to find the time to give it a closer read some other time, but please don't block on waiting for me to do so. :)

@pchampin

It's bad, because now there is a third place where a bug can result in a different interpretation (producer, consumer, and now spec).

I disagree: bugs in specs are always possible, whether or not the spec acknowledges them.

Also note that an algorithm is not an implementation, and no algorithm is entirely neutral: Javascript developers do not write algorithms the same way Rust (or Java, or Scala...) developers do. As a consequence, every implementer of a spec has to adapt the algorithms. The differences are not limited to programming languages: whether you are using a relational, key-value, document, or graph database, you will encode and handle a given data model differently. Ultimately, every implementation defines a different interpretation path, whether we like it or not.

In some ecosystems (browser APIs come to mind), this heterogeneity may be limited, and therefore the "only one algorithm" principle is probably a good enough way to ensure interoperability. In other ecosystems, where the heterogeneity is higher, it is probably better to acknowledge it and provide guidance to the different kinds of implementers. This is the strategy taken by the editors of CID and VCDM, and should not IMO be flagged as bad practice™.

@jyasskin jyasskin marked this pull request as draft March 15, 2025 17:26
@csarven (Member) left a comment

tl;dr: The concept of "polyglot format" lacks a clear definition. It doesn't serve as a helpful lens for this discussion. The current examples don't adequately support the argument. Either different and better examples are needed to distil a principle, or a different approach should be considered. I've opted for the latter and made a change suggestion that aims to provide a more agreeable principle, and the examples should be updated in any case.


An authoritative definition of "polyglot format" with explicit criteria would help clarify whether a format, profile, or data model qualifies as such. If such a definition exists and aligns with open standards principles, a reference would be helpful.


SVG and MathML can be written to be valid and processed in both standalone XML and inline HTML contexts.

Since draft-ietf-oauth-sd-jwt-vc and vc-data-model-1.1 are separate specifications, the example does not fully illustrate the intended point.

Other examples:

  • HTTP status codes: Servers use either 403 or 404 to prevent resource discovery.
  • Click event handling: addEventListener('click') and onclick both achieve the same goal but through different mechanisms.
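The 403-vs-404 example above — a server deliberately collapsing "forbidden" and "missing" into one observable response — can be sketched as follows; the resource and permission tables are invented purely for illustration:

```python
# Hypothetical store: which paths exist and which users may read them.
RESOURCES = {
    "/secret-report": {"owner"},
    "/public-page": {"owner", "anyone"},
}

def status_for(path: str, user: str) -> int:
    if path not in RESOURCES:
        return 404
    if user not in RESOURCES[path]:
        # Returning 404 instead of 403 hides the resource's existence,
        # so unauthorized callers cannot distinguish "missing" from
        # "forbidden" -- two server-side algorithms, one observable goal.
        return 404
    return 200

print(status_for("/missing", "anyone"))        # 404
print(status_for("/secret-report", "anyone"))  # 404: existence hidden
print(status_for("/public-page", "anyone"))    # 200
```

Here the two internal code paths are intentionally indistinguishable to the client, which is the opposite failure mode from polyglot parsing: the spec permits the divergence precisely because it is unobservable.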

I don't see the significance of noting structured suffixes any more than the relationship between the 'text' top-level type and its subtypes. text/html is specific, and interpreting it as plain text is merely a useful step in the process of parsing it as intended (as pointed out by @martinthomson here), not an alternative interpretation path. The same applies to application/ld+json, where the goal is to interpret it as JSON-LD, not merely as JSON. Treating JSON-LD purely as JSON is analogous to treating CSV or HTML as plain text: although a useful step in the whole process, as pointed out by @gkellogg here, it ignores part of the intended structure, semantics, and functionality. Similarly, application/json is intended to represent JSON, not the raw series of strings that make it up.
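The structured-suffix behaviour described here — the suffix licenses a generic intermediate parse, while full interpretation requires the specific subtype — can be sketched like this (per RFC 6838; the function name is made up for illustration):

```python
def structured_suffix(media_type: str) -> "str | None":
    """Return the structured suffix of a media type, if any."""
    subtype = media_type.split("/", 1)[1]
    return subtype.rsplit("+", 1)[1] if "+" in subtype else None

# A generic processor seeing application/ld+json may parse the bytes as
# JSON -- an intermediate, diagnostic-level result -- but interpreting
# them as JSON-LD requires knowing the full subtype, not just the suffix.
print(structured_suffix("application/ld+json"))  # json
print(structured_suffix("text/html"))            # None
```

The suffix names a step on the way to the intended interpretation, not a second, competing interpretation of the same document.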

CID's context injection states "[a]ny differences in semantics between documents processed in either mode are either implementation or specification bugs".

As I understand it, this is analogous to different representations of an HTTP resource being deemed equivalent in meaning. For example, when /dog depicts a dog with content negotiated as image/jpeg, it should also depict a dog, not a cat, when content negotiated as image/png. A resource having multiple equivalent representations is not deemed a bug in either HTTP or Web Architecture, and that obviously implies different algorithms to make sense of those different representations.

Sometimes reality is a bit more nuanced than "good" or "bad" =)

So, if a principle is to be stated beyond the obvious, it should encourage simplicity, clarity, and security, while accounting for the complexity and interoperability of different implementations.

Comment on lines +3508 to +3515
<h3 id="multiple-algorithms">Write only one algorithm to accomplish a goal</h3>

When specifying how to accomplish a goal, write a single algorithm to do it,
instead of letting implementers pick between multiple algorithms.

It is very difficult to ensure that
two different algorithms produce the same results in all cases,
and doing so is rarely worth the cost.
@csarven commented Mar 17, 2025

Suggested change
<h3 id="multiple-algorithms">Write only one algorithm to accomplish a goal</h3>
When specifying how to accomplish a goal, write a single algorithm to do it,
instead of letting implementers pick between multiple algorithms.
It is very difficult to ensure that
two different algorithms produce the same results in all cases,
and doing so is rarely worth the cost.
<h3 id="single-algorithm">Write only one algorithm to accomplish a goal</h3>
When defining how to achieve a feature, it's better to specify a single
approach rather than offering multiple options.
If multiple methods are allowed, they must be equivalent in conformance to
avoid unnecessary complexity, inconsistency, and security risks.

@TallTed commented Mar 17, 2025

This explains why and when "polyglot" formats are a bad idea.

It doesn't appear to do anything of the sort. I don't see a clear definition of what such a "polyglot" format is, nor how one might be worked with, nor how working with one might go bad (nor go well, so there's that).

I do see a number of apparent misunderstandings by the author which others have pointed out directly, so I won't go into those myself.

Principle: Write only one algorithm to accomplish a task.

This doesn't appear to be the focus of this writing. Perhaps it shouldn't be the title of the PR, either?

Development

Successfully merging this pull request may close these issues.

New principle: Discourage polyglot formats