Principle: Write only one algorithm to accomplish a task. #562
base: main
Conversation
This explains why and when "polyglot" formats are a bad idea.
index.bs
Outdated
@@ -3488,6 +3505,52 @@ While the best path forward may be to choose not to specify the feature,
there is the risk that some implementations
may ship the feature as a nonstandard API.

<h3 id="multiple-algorithms">Write only one algorithm to accomplish a task</h3>
Maybe "goal" instead of "task"? This immediately made me think of the event loop.
When specifying how to accomplish a task, write a single algorithm to do it,
instead of letting implementers pick between multiple algorithms.
It is very difficult to ensure that
index.bs
Outdated
two different algorithms produce the same results in all cases,
and doing so is rarely worth the cost.

Multiple algorithms seem particularly tempting when defining
I don’t think that you need this paragraph as long as an example mentions a file format.
index.bs
Outdated
using either the [[HTML#the-xhtml-syntax|XHTML parsing]]
or [[HTML#syntax|HTML parsing]] algorithm.
Authors who tried to use this syntax tended to produce documents
that actually only worked with one of the two parsers.
Suggested change:
- that actually only worked with one of the two parsers.
+ that only worked with one of the two parsers.
Note: While [[rfc6838#section-6|structured suffixes]] define that
a document can be parsed in two different ways,
they do not violate this rule because the results have different data models.
Isn’t the real difference here that the suffix parsing produces an intermediate result?
I suspect that this is insufficient still, because it doesn’t really get at why suffix parsers exist. That is still somewhat contested, but my view is that intermediate results can rarely be processed meaningfully, so they are limited to use in diagnostic tools and the like.
Hmm, the principle seems too blunt to be useful. Some high-level thoughts to start; I'm still trying to think about what text would be useful:
- Polyglot, as a term, is wrong -- this isn't about a single system interpreting a data serialization using different algorithms; it's about an ecosystem interpreting that same data serialization using different algorithms (which is useful, more on that below).
- Yes, there are cases where this resulted in bad outcomes -- XHTML/HTML is a good example.
- The comparison between VCDM and SD-JWT-VC is totally wrong: they're two totally different data models, using two totally different serializations, using two totally different algorithms -- and there are a number of us who think that whole thing is a massive standardization failure, so using that as an example of the right way to do something is not what we want to do. The only thing they have in common is the phrase "Verifiable Credential", and even that is being objected to by some of us.
- The multiple suffixes thing is also contested -- in the IETF MEDIAMAN WG, we couldn't find broad-scale usage of suffix-based processing; what @martinthomson is saying is important here. I'll add that suffix-based processing is also not a clear example of why this principle is good or bad.
Fundamentally, the principle seems misguided. Yes, at some level one data format and one algorithm is a good thing. However, what a traditional web crawler gets out of a web page is different from what a browser parsing the web page works with is different from what a frontier AI model gets out of a web page. The algorithms that each uses are quite different and useful and this principle seems to be arguing against that.
I think the only solid ground here is the XHTML/HTML example. You're going to get push back on the other items being mentioned if they continue to be mentioned in the way the current PR is written up.
I'll try to think of some constructive text, but wanted to get some preliminary thoughts down in an effort to help shape the PR into something more easily defensible.
Algorithms + Data Structures = Programs (Niklaus Wirth). It’s rational to avoid having two algorithms performing the same function, especially when considering costs like time and space complexity, and to recommend the one that best fits the criteria. However, if this change is based on the assumption:
then there is a misunderstanding of algorithmization basics. JSON and JSON-LD have different data models, as noted. They involve different data structures in the equation at the top, which means different algorithms are needed because they operate on different data structures. From this perspective, calling for one algorithm to operate on different data structures does not make sense. My recommendation would be to use a different argument when advocating for a single algorithm - considering factors such as time complexity, space complexity, etc.
I don't agree with Manu about this being misguided. The point here is that the same HTML document is not seeking to express multiple distinct sets of semantics depending on how it is processed; there is just one HTML with one interpretation, and one data model that both the producer of the content and the consumer of the content can agree on. If they disagree, that is likely due to one or the other being wrong. This is because there is just a single specification for HTML and a single way to interpret HTML content according to that specification.

Obviously, what someone does once they have received HTML might differ, but those differences do not relate to how the HTML itself was interpreted, but to how the content at the next layer (that is, the words and images and stuff like that) is interpreted. Sure, a human and an AI model will seek to do different things with the information they are presented with, but the interpretation is singular.

Where CID struggles a little is that there are two paths to the same interpretation. It manages that by giving implementations a choice and promising that the outcome will be the same either way. It's bad, because now there is a third place where a bug can result in a different interpretation (producer, consumer, and now spec), but it's not fundamentally a polyglot in the sense that there are multiple divergent interpretations possible.

The core message is that having divergent paths is undesirable. And yes, that means saying that seeking to have a pure JSON vs a JSON-LD interpretation of the same content is a bad idea, because divergence in data models means that there is no single interpretation of the content on which all potential recipients might agree.
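To make the "divergent paths" point concrete, here is a minimal sketch (browser environment assumed; the markup string is an invented example, not from the spec) of how the HTML and XML parsers already build different trees from the same bytes:

```ts
// Minimal sketch, assuming a browser environment with DOMParser available.
const markup = '<table><tr><td>cell</td></tr></table>';

// Parse the same string with the HTML parser and with the XML parser.
const asHtml = new DOMParser().parseFromString(markup, 'text/html');
const asXml = new DOMParser().parseFromString(markup, 'application/xhtml+xml');

// The HTML parser inserts an implied <tbody>; the XML parser does not,
// so code that walks the table's children sees two different structures.
console.log(asHtml.querySelector('table')?.firstElementChild?.tagName); // "TBODY"
console.log(asXml.documentElement.firstElementChild?.tagName);          // "tr"
```

Both parses "succeed", but downstream code keyed to one tree shape quietly misbehaves on the other.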
to assign properties to particular objects than JSON does,
these specifications had to add extra rules to both kinds of parsers
in order to ensure that each input document had exactly one possible interpretation.
[[vc-data-model-2.0 inline]] and [[draft-ietf-oauth-sd-jwt-vc inline]] fixed the problem
Manu is right that these are completely different (and that they likely represent standardization failure, though the question of where the failure occurred might be contested). In a sense, it is OK that they are completely different (that they are in competition is potentially bad if they address the same use cases, but there is no risk that one might be mistaken for the other).
I think that it would serve this example better to focus only on the CID case.
This issue really gets at the heart of a basic divide at W3C: one that is browser-centric vs. one that is data-centric.

In fact, JSON-LD does parse JSON (and YAML and CBOR) into a common INFRA-based data structure (called the Internal Representation), which various algorithms operate over to perform different transformations, including interpreting it as RDF. This is the core reason behind JSON-LD, which has become extremely widely used on the Web (in large part due to schema.org).

HTML is also often processed differently, typically by interpreting the resulting DOM. This might be done to extract Microdata/RDFa, interpret the contents of embedded <script> elements, and so on.

In the case of Verifiable Credentials, the basic failure would seem to be a lack of agreement on how to work with the data that is represented in the JSON. This is an area the TAG can help with for future specs, rather than getting into a reductionist view that polyglot formats are fundamentally a bad idea.
these specifications had to add extra rules to both kinds of parsers
in order to ensure that each input document had exactly one possible interpretation.
[[vc-data-model-2.0 inline]] and [[draft-ietf-oauth-sd-jwt-vc inline]] fixed the problem
by defining different media types for JSON-LD vs JSON,
This statement is incorrect. The VCDMv2 only has a single media type: application/vc.
Likewise, SD-JWT-VC only has one: application/dc+sd-jwt. However, SD-JWT is not a JSON-parseable base format.
Both specifications have parsing algorithms unique to their media types -- and specific to their tasks.
It remains unclear how these examples are "polyglot".
This looks really good to me on a first-pass look. I'll try to find the time to give it a closer read some other time, but please don't block on waiting for me to do so. :)
I disagree: bugs in specs are always possible, regardless of whether the spec acknowledges it or not.

Also note that an algorithm is not an implementation, and no algorithm is entirely neutral: JavaScript developers do not write algorithms the same way Rust (or Java, or Scala...) developers do. As a consequence, every implementer of a spec has to adapt the algorithms. The differences are not limited to programming languages: whether you are using a relational, key-value, document, or graph database, you will encode and handle a given data model differently. Ultimately, every implementation defines a different interpretation path, whether we like it or not.

In some ecosystems (browser APIs come to mind), this heterogeneity may be limited, and therefore the "only one algorithm" principle is probably a good enough way to ensure interoperability. In other ecosystems, where the heterogeneity is higher, it is probably better to acknowledge it and provide guidance to the different kinds of implementers. This is the strategy taken by the editors of CID and VCDM, and should not IMO be flagged as bad practice™.
tl;dr: The concept of "polyglot format" lacks a clear definition. It doesn't serve as a helpful lens for this discussion. The current examples don't adequately support the argument. Either different and better examples are needed to distil a principle, or a different approach should be considered. I've opted for the latter and made a change suggestion that aims to provide a more agreeable principle, and the examples should be updated in any case.
An authoritative definition of "polyglot format" with explicit criteria would help clarify whether a format, profile, or data model qualifies as such. If such a definition exists and aligns with open standards principles, a reference would be helpful.
SVG and MathML can be written to be valid and processed in both standalone XML and inline HTML contexts.
Since draft-ietf-oauth-sd-jwt-vc and vc-data-model-1.1 are separate specifications, the example does not fully illustrate the intended point.
Other examples:
- HTTP status codes: Servers use either 403 or 404 to prevent resource discovery.
- Click event handling: addEventListener('click') and onclick both achieve the same goal but through different mechanisms (see the sketch below).
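A minimal sketch of that second bullet (the #save button id is hypothetical; browser environment assumed), showing both mechanisms wiring up the same click behaviour through different APIs:

```ts
// Hypothetical button id; browser environment assumed.
const button = document.querySelector<HTMLButtonElement>('#save');

if (button) {
  // Mechanism 1: the onclick event handler attribute (only one handler at a time).
  button.onclick = () => console.log('saved via onclick');

  // Mechanism 2: addEventListener (any number of listeners can be registered).
  button.addEventListener('click', () => console.log('saved via addEventListener'));
}
```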
I don't see the significance of noting structured suffixes any more than the relationship between the 'text' top-level type and its subtypes. text/html is specific, and interpreting it as plain text is merely a useful step in the process of parsing it as intended (as pointed out by @martinthomson here), but not an alternative interpretation path. The same applies to application/ld+json, where the goal is to interpret it as JSON-LD, not merely JSON. Treating JSON-LD purely as JSON is analogous to treating CSV or HTML as plain text: although a useful step in the whole process, as pointed out by @gkellogg here, it will ignore part of the intended structure, semantics, and functionality. Similarly, application/json is intended to represent JSON, not the raw series of strings that make it up.
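A hedged sketch of that distinction, assuming the jsonld.js library ("jsonld" on npm) and an invented document: JSON.parse recovers the tree, while JSON-LD expansion recovers what its keys are intended to mean:

```ts
import jsonld from 'jsonld'; // assumes the jsonld.js library is installed

// Invented application/ld+json body, not taken from any spec.
const body = `{
  "@context": { "name": "http://schema.org/name" },
  "name": "Alice"
}`;

async function compareViews(): Promise<void> {
  // Plain JSON view: "name" is just a local string key in a tree.
  const asJson = JSON.parse(body);
  console.log(Object.keys(asJson)); // ["@context", "name"]

  // JSON-LD view: expansion resolves "name" against the @context,
  // giving the key a globally scoped meaning.
  const expanded = await jsonld.expand(JSON.parse(body));
  console.log(JSON.stringify(expanded));
  // [{"http://schema.org/name":[{"@value":"Alice"}]}]
}

compareViews().catch(console.error);
```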
CID's context injection states "[a]ny differences in semantics between documents processed in either mode are either implementation or specification bugs".
As I understand it, this is analogous to different representations of an HTTP resource being deemed equivalent in meaning: e.g., when /dog depicts a dog with content negotiated as image/jpeg, it should also depict a dog, not a cat, when content negotiated as image/png. Stating that a resource can have multiple equivalent representations is not deemed a bug in either HTTP or Web Architecture, and that obviously implies different algorithms to make sense of these different representations.
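A rough sketch of that analogy (the /dog resource is the hypothetical example above, not a real endpoint; fetch in a browser or Node 18+ assumed):

```ts
// /dog is the hypothetical resource from the analogy above, not a real endpoint.
async function fetchDog(accept: string): Promise<Blob> {
  const response = await fetch('/dog', { headers: { Accept: accept } });
  return response.blob(); // hand the bytes to a JPEG or PNG decoder as appropriate
}

async function demo(): Promise<void> {
  const asJpeg = await fetchDog('image/jpeg');
  const asPng = await fetchDog('image/png');
  // Different bytes and different decoding algorithms,
  // but both representations are expected to depict the same dog.
  console.log(asJpeg.type, asPng.type);
}

demo().catch(console.error);
```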
Sometimes reality is a bit more nuanced than "good" or "bad" =)
So, if a principle is to be stated beyond the obvious, it should encourage simplicity, clarity, and security, while accounting for the complexity and interoperability of different implementations.
<h3 id="multiple-algorithms">Write only one algorithm to accomplish a goal</h3> | ||
|
||
When specifying how to accomplish a goal, write a single algorithm to do it, | ||
instead of letting implementers pick between multiple algorithms. | ||
|
||
It is very difficult to ensure that | ||
two different algorithms produce the same results in all cases, | ||
and doing so is rarely worth the cost. |
<h3 id="multiple-algorithms">Write only one algorithm to accomplish a goal</h3> | |
When specifying how to accomplish a goal, write a single algorithm to do it, | |
instead of letting implementers pick between multiple algorithms. | |
It is very difficult to ensure that | |
two different algorithms produce the same results in all cases, | |
and doing so is rarely worth the cost. | |
<h3 id="single-algorithm">Write only one algorithm to accomplish a goal</h3> | |
When defining how to achieve a feature, it's better to specify a single | |
approach rather than offering multiple options. | |
If multiple methods are allowed, they must be equivalent in conformance to | |
avoid unnecessary complexity, inconsistency, and security risks. |
It doesn't appear to do anything of the sort. I don't see a clear definition of what such a "polyglot" format is, nor how one might be worked with, nor how working with one might go bad (nor go well, so there's that). I do see a number of apparent misunderstandings by the author which others have pointed out directly, so I won't go into those myself.
This doesn't appear to be the focus of this writing. Perhaps it shouldn't be the title of the PR, either?
This explains why and when "polyglot" formats are a bad idea.
Fixes #239.
There's some overlap between this and the preceding section, Resolving tension between interoperability and implementability. Do y'all think it's ok, or are there bits we could refactor together?
I'd also like to give an example of parsing divergence yielding security bugs, but I didn't have any readily available. Ideas?