Extending metadata blocksize past 16 MiB #877
madah81pnz1
started this conversation in
Ideas
Replies: 0 comments
The idea is to have a container type, FLAC__METADATA_TYPE_EXTENDED, that can hold other metadata blocks. These container blocks are then chained one after another, making for "unlimited" metadata size.
The main use cases are storing big pictures and making --keep-foreign-metadata possible when the source .wav file has chunks larger than 16 MiB.
This shows the idea:
It would be possible to mix extended and non-extended blocks, here's another example with 3 pictures:
But as with PADDING blocks, it might make sense to optimize the file layout so that all extended blocks are merged:
However, this reveals an issue: the last PICTURE starts in the middle of an EXTENDED block.
Lessons learned from UTF-8 tell us that it pays off to separate start and continuation:
EXT_START and EXT_CONT don't have to be separate block types; they can be part of a small header in each EXTENDED metadata block:
The "unique contained block id" is meant to be unique for each group of EXTENDED blocks that belong together. Together with the "extended block number" and "extended block total", it is meant to make it easier to recover from bad edits by unaware flac tools, e.g. reordered blocks.
The "contained block byte offset" and the "contained block total size in bytes" make partial recovery of the data possible in case some EXTENDED block is missing.
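To make the header fields concrete, here is a minimal sketch of splitting one oversized contained block into EXTENDED block bodies. The byte-aligned layout, the field widths, and the names `ExtHeader` and `split_block` are assumptions made for illustration only; they are not the bit widths proposed in this post and not part of libFLAC.

```python
import struct
from dataclasses import dataclass

# Hypothetical byte-aligned layout (NOT the proposed bit widths):
# real block type (u8), unique contained block id (u32),
# extended block number (u16), extended block total (u16),
# contained block byte offset (u64), contained block total size (u64).
HEADER = struct.Struct(">BIHHQQ")


@dataclass(frozen=True)
class ExtHeader:
    block_type: int
    block_id: int
    number: int
    total: int
    offset: int
    total_size: int

    def pack(self) -> bytes:
        return HEADER.pack(self.block_type, self.block_id, self.number,
                           self.total, self.offset, self.total_size)

    @classmethod
    def unpack(cls, raw: bytes) -> "ExtHeader":
        return cls(*HEADER.unpack(raw[:HEADER.size]))


def split_block(block_type: int, block_id: int, payload: bytes,
                max_body: int) -> list[bytes]:
    """Split one contained block into EXTENDED bodies of at most max_body bytes."""
    chunk = max_body - HEADER.size  # room left for payload after the header
    if chunk <= 0:
        raise ValueError("max_body is too small to hold the header")
    total = max(1, -(-len(payload) // chunk))  # ceiling division
    bodies = []
    for number in range(total):
        offset = number * chunk
        header = ExtHeader(block_type, block_id, number, total,
                           offset, len(payload))
        bodies.append(header.pack() + payload[offset:offset + chunk])
    return bodies
```

Every body carries the full set of fields, so a reader that sees only one EXTENDED block still knows which group it belongs to and where its bytes go in the contained block.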
For example, if some tool deletes EXTENDED blocks in the middle, the contained PICTURE could end up slightly corrupted, made of two halves from two unrelated pictures.
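The recovery idea can be sketched as follows, assuming the EXTENDED headers have already been parsed into records. The `Piece` type and the `reassemble` function are hypothetical names for this example. Pieces are selected by id, so halves of unrelated pictures are never stitched together, and each piece is placed at its own byte offset, so a deleted block leaves a known gap instead of shifting the data.

```python
from typing import NamedTuple


class Piece(NamedTuple):
    block_id: int    # "unique contained block id"
    offset: int      # "contained block byte offset"
    total_size: int  # "contained block total size in bytes"
    data: bytes


def reassemble(pieces: list[Piece],
               block_id: int) -> tuple[bytes, list[tuple[int, int]]]:
    """Rebuild one contained block; return (data, missing byte ranges)."""
    mine = [p for p in pieces if p.block_id == block_id]
    if not mine:
        raise ValueError("no EXTENDED blocks with this id")
    total_size = mine[0].total_size
    buf = bytearray(total_size)  # missing bytes stay zero-filled
    gaps = []
    pos = 0
    for p in sorted(mine, key=lambda p: p.offset):  # tolerate reordering
        end = p.offset + len(p.data)
        if p.total_size != total_size or end > total_size:
            continue  # inconsistent piece, skip it rather than trust it
        if p.offset > pos:
            gaps.append((pos, p.offset))
        buf[p.offset:end] = p.data
        pos = max(pos, end)
    if pos < total_size:
        gaps.append((pos, total_size))
    return bytes(buf), gaps
```

A caller can then decide whether a partially recovered block is still useful, based on the reported gaps.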
Using 10 bits for the extended block number allows 1024 blocks of up to 16 MiB each, which makes the limit 16 GiB, a nice increase from the old 16 MiB. This means we need 34 bits for the byte offset.
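The arithmetic behind these numbers can be checked with a few lines. This treats each block as holding a full 16 MiB, as the post does; the 24-bit length field actually tops out one byte lower, and any per-block EXTENDED header would reduce the usable payload a little further.

```python
MAX_BLOCK_BYTES = 1 << 24  # 16 MiB, from the 24-bit metadata block length
NUMBER_BITS = 10           # bits for the "extended block number"

# 2**10 blocks of 16 MiB each
limit = (1 << NUMBER_BITS) * MAX_BLOCK_BYTES

# Bits needed to address every byte offset in 0 .. limit - 1
offset_bits = (limit - 1).bit_length()

print(limit // 1024**3, "GiB,", offset_bits, "offset bits")
```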
Increasing the number of bits used for the real block type could also be done for future-proofing, as could increasing all field sizes to at least 32 bits and using 64 bits for the byte counts.
PADDING blocks:
As PADDING inside EXTENDED blocks doesn't make sense, it should be disallowed, and multiple PADDING blocks should be written instead.
metaflac:
metaflac needs to be able to handle the extended blocks, but this might require a bunch of new command line switches.
An old metaflac would list the separate EXTENDED blocks, but a new metaflac would list the contained PICTURE block instead, with no way of deleting the EXTENDED blocks.
A normal user wouldn't care whether a block is EXTENDED or not, but an expert user might, as might anyone trying to recover a corrupt file.