-
Notifications
You must be signed in to change notification settings - Fork 680
Unified ImageFormat for both built-in and plugin formats
#2691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Unified ImageFormat for both built-in and plugin formats
#2691
Conversation
|
My first thought reading this is: looks far more complex; so what do we gain. The core new mechanics don't come across as fleshed out, quadratic complexity
If this were a new type, I'd say avoid providing the trait impls you mention to ensure that users will always use the actually comparable formats that are meant for it, i.e. the mime. A method instead of the trait would also work, see However since this changes an existing type, I'd say it points out a design problem with the approach. Types group a set of values with common behavior but since several behavior can no longer be provided for the new interpretation of the type, is it really the same type anymore? This emphasizes the question, why not introduce this functionality as a new, different type? The point of
Not a problem per-se but we'd need to ensure you can emulate it. The extension worked as-if a corresponding format was registered without anyone having to do so explicitly. Currently involves a lot of ceremony where the fallback case requires you to come up with a decoding hook since that seems to be the only way of defining a new And Also isn't this backwards? The other way around of creating types via the detection hook or by extension itself and
Looks like a bug in the current implementation, thanks for bringing it to attention.
Leaving it out makes a better impression than a not-designed afterthought.
Design problem to be resolved.
I really dislike that train of thought and, this is not meant to be harsh but, it's important to see what rubs me the wrong way. If it doesn't make the specification of the methods hard to uphold, why limit users? If a specific problem would be triggered, fix the code. If you know the code does not have a problem with this then catching ominous behavior just makes the interface expectations more complex and thus hard to use. Here, the user may intent to complete the type at a later stage if they want to structure their code to have different registration parts happen in a different sequence. There are sometimes technical requirements that motivate artificial limits but technical and requirement only works as an argument if they are traceable reasoning. (#2647's constant so that the structure can be put on the stack for instance is alright reasoning; even though one may disagree with self-imposing that requirement). This applies to the 256 limit on format counts as well. It is vaguely a stopgap for all the memory leakage but it doesn't really affect it, and is it even leakage if we're still able to refer to and use the data? Also of course fix the actual leaks with |
|
Thanks for taking the time to review!
Yup. It's far from finished. This PR draft isn't a "just needs some polish, then let's do it as is" but a "I'm working on something that I think could work, please tell me what problems you see and how we could solve them". I didn't want to work on this for a week only for it to be rejected because you folks don't like the idea fundamentally. As for the issues you mentioned here:
I don't think that's the right way to look at this. The ability to hash and ser/de an image format (builtin or plugin) is highly desirable IMO and arguably expected of an identifier. I think hashing and ser/de should definitely remain, and I am committed to making it happen. I'm just not sure whether using extensions is the best way to do it. It should work since we are kinda using extensions as identifiers already, but I'm just not certain whether extension collision or something else might cause a problem.
What "several behavior" specifically? As I see it, everything aside from the Not that Speaking of
I say "assume", becuse the doc of
What does it mean for a format to have reading enabled but not supported? If it just means "not supported", then having a single method returning a 3-state enum instead of having Sorry if this seems a little off-topic, but I had some issues even understanding what these methods are supposed to do. So before we talk about how/if we can make plugins fit these methods, I would like to talk about whether the current API to inquire about reading/writing ability should be kept as is in the first place.
Okay. I wasn't sure if we even wanted to allow users to register a detection for an otherwise unknown format.
It needs to check whether the format already exists for overrides and rejection. What's confusing about that?
So you're suggesting that plugin authors start by creating an empty If so, then I like it. This seems like a much better foundation for hooks.
The empty signature check is intended to detect user errors early. If a signature is empty, it is naturally a prefix of every possible file, meaning that it will always match every file. Registering an empty signature prevents all signatures after it from working. Seems like a clear bug to me. Another validation I considered adding is for masked signatures that have one bits where the mask has zero bits. Such signatures can never match anything, which is clearly a bug. I haven't added it yet because I wasn't sure whether it should be a debug_assert-type check or just print a warning (since it's not directly harmful). The empty extension check is an unnecessary limitation (I think). When I wrote the code, I just didn't want to think about formats where the identifying extension is empty. Shouldn't be an issue to remove. (Update: I removed all validation for now.)
I added that limit so If you don't want a limit at all, we'd have to use (Update: I used u32 for now.) |
|
I haven't looked through the code changes (this PR already exceeds the size I'm personally willing to review...) but it isn't clear to me from the PR description/comments what problem exactly is being solved here? |
|
Problems it solves:
It also doesn't have some problems that some other approaches have. E.g. using something like the internal
|
Only the first one of these points is an actual problem. The other two are descriptions of solutions and, without a motivating problem, 'easier' is not a qualifier. Easier compared to what, and how, immediately come to mind as unaddressed. The comparison is also relevant to the second point as
I've explained before why that is not a shortcoming if the 'provided' functionality is forced to be wrong (in effect or implementation).
Alternative: rename the existing enum to |
True, my bad. Then let me try again: Problems it solves:
As I see it, all of these problems are directly caused by the divide between builtin formats and plugin formats. In effect, plugin formats are currently second-class citizens as they do not enjoy similar levels of support as builtin formats. If the justification for not adding more formats to
My bad, I meant "easy". But I'm also curious what you would like me to compare it to and from what perspective. I (mistakenly) said "easier" in the context of adding new capabilities to plugins, and I am not aware of any current proposals or PRs that attempt to go in that direction.
Frankly, I'm not sure how you would like me to compare them. This PR makes However, if by "simpler approaches" you meant using extension strings as the universal type for formats, then we could compare them. But I'm still not sure what aspects of them you'd like to compare and what aspects you deem important.
Could you please elaborate a bit more? I'm sorry, but I don't see how this fits into the picture. |
I implemented the unified image format idea I talked about in #2683.
What is this PR:
ImageFormatcan now represent both builtin and plugin formats. This is done using a private format registry that contains all formats.ImageFormatis simply an index (=registry ID) into the list of formats.ImageFormatis largely unchanged. The only read difference is that e.g.ImageFormat::Pngis now aconstant instead of an enum variant.The hooks API changed slightly to useImageFormatinstead of relying on extensions everywhere. The intended usage is nowif let Some(my_format) = hooks::register_decoding_hook("ext", ...) { /* register magic bytes, etc. */ }The hooks API now always uses
ImageFormatto identify formats. Plugin authors now create an empty format and then add abilities and metadata to it.ImageFormat, since it was deprecated.Since builtin formats and plugin formats are both represented using
ImageFormat, API limitations for plugins like #2616 no longer exist. Users can now use plugin formats like any builtin format.TODOs:
ImageFormat::reading_enabledandImageFormat::writing_enabledare currently specific to features enabled/disabled on theimagecrate.It's not clear what plugin formats should return here.My current understanding is that these functions return whether "all necessary features are enabled." So plugins (which can't define features) should return
trueby the all quantifier.Resolved
Serialization and hashing for
ImageFormat.The underlying registry IDs for builtin formats are always the same, but this is not the case for plugins. IDs are assigned sequentially, so plugin IDs depend on the number of builtin formats and the order in which plugin formats are registered. Because of that, both the hash and serialized form of pluginImageFormats can vary. It is still deterministic, but the fact that code changes (e.g. registering a new plugin) may cause the hash/serialized form of a plugin format to change could still be surprising to users.Solved by using main extensions as a stable identifier.
Some formats previously had extensions that they detected, but did not return inThese were confirmed to be bugs.ImageFormat::extensions_str. Was that a bug, or an intentional feature? I currently return all registered extensions inImageFormat::extensions_str.BecauseThe new format creation hook makes this possible again.hooks::register_format_detection_hooknow takes anImageFormatinstead of an extension, it is no longer possible to register magic bytes for formats that aren't registered. I'm not sure if we're okay with this limitation.Fixed by not resolving main extensions in format creation hook.hooks::register_decoding_hook(ext, hook)currently resolves aliases when looking for existing formats to override. This is different from before and may be a problem when extensions collide. E.g. it's currently not possible to define a format for APNG, because PNG has ".apng" as an extension alias.I started adding some validation to extensions when registering plugins and magic bytes (can't be empty). Should this be a separate PR?I removed all validation after feedback.I want to change the API ofImageFormat::to_mime_typecurrently returnsapplication/octet-streamfor plugins that don't have MIME types registered. Not sure if that is the best default. Maybe the method should be changed to return anOptioninstead?ImageFormatas little as possible right now, so let's not do that now.