@lukevalenta lukevalenta commented Jun 10, 2025

Add bootstrap Merkle Tree Certificate Authority worker

  • Add mtc_api and mtc_worker crates, implementing a (bootstrap) Merkle
    Tree Certificate Authority log that accepts an X.509 chain and
    issues a TbsCertificateLogEntry covered by that chain. There are
    several outstanding TODOs to add necessary validation.
  • Add associated constants to PendingLogEntry to specify the data path
    elem (e.g., 'data' for static-ct-api, 'entries' for tlog-tiles, etc.),
    and an optional 'unhashed' path elem. This allows a log to publish
    unauthenticated ('unhashed') extra data to a separate path in the
    public bucket. The intended use case is for the 'bootstrap' X.509
    chain in Merkle Tree Certificates.
  • Add associated constant REQUIRE_CHECKPOINT_TIMESTAMP to LogEntry
    specifying whether checkpoints require at least one timestamped
    signature.
  • Add unhashed_entry() method to PendingLogEntry to retrieve the
    unhashed entry, if configured for the log.
  • Change get_cached_entry method to get_cached_metadata, since we don't
    always have a way to retrieve metadata from a LogEntry.
  • Remove inner() method for LogEntry, since it's never actually needed.
  • Remove logging_labels() method for LogEntry, since not every generic
    log has a 'type' field for log entries. Counts of 'add-chain' vs
    'add-pre-chain' requests can be recorded elsewhere if needed.
  • Replace Tile::set_data_with_path() with the slightly more ergonomic
    TlogTile::with_data_path().
  • Use lifetimes to remove 'Cursor' type from TileIterator and avoid some
    unnecessary cloning.
  • Refactor to avoid unnecessary clones in 'load' and 'sequence_entries'.
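
The associated-constant design described in the list above might look roughly like the following. This is only a sketch under stated assumptions: the trait name `PendingEntry`, the constant names `DATA_PATH_ELEM` and `UNHASHED_PATH_ELEM`, the method `hashed_bytes()`, the example entry types, and the helper `path_elems()` are all hypothetical and are not taken from the actual crates. The `"bootstrap"` path element and the pairing of `"entries"` with the bootstrap entry type are likewise assumed for illustration.

```rust
/// Hypothetical stand-in for the PendingLogEntry trait. Names and
/// signatures are assumptions, not the crate's real API.
trait PendingEntry {
    /// Path element under which hashed entry data is published.
    const DATA_PATH_ELEM: &'static str;
    /// Optional path element for unauthenticated ("unhashed") data.
    const UNHASHED_PATH_ELEM: Option<&'static str>;

    /// Bytes that are hashed into the Merkle tree.
    fn hashed_bytes(&self) -> &[u8];

    /// Extra bytes published alongside the entry but not hashed into the tree.
    fn unhashed_entry(&self) -> Option<&[u8]> {
        None
    }
}

/// Illustrative entry type with no unhashed data.
struct StaticCtEntry {
    leaf: Vec<u8>,
}

impl PendingEntry for StaticCtEntry {
    const DATA_PATH_ELEM: &'static str = "data";
    const UNHASHED_PATH_ELEM: Option<&'static str> = None;

    fn hashed_bytes(&self) -> &[u8] {
        &self.leaf
    }
}

/// Illustrative entry type that carries an unhashed X.509 chain.
struct BootstrapEntry {
    leaf: Vec<u8>,
    chain: Vec<u8>,
}

impl PendingEntry for BootstrapEntry {
    const DATA_PATH_ELEM: &'static str = "entries";
    const UNHASHED_PATH_ELEM: Option<&'static str> = Some("bootstrap");

    fn hashed_bytes(&self) -> &[u8] {
        &self.leaf
    }

    fn unhashed_entry(&self) -> Option<&[u8]> {
        Some(&self.chain)
    }
}

/// Path elements a sequencer would write for entry type `E`.
/// The answer depends only on the type, so nothing has to be sent per entry.
fn path_elems<E: PendingEntry>() -> Vec<&'static str> {
    let mut elems = vec![E::DATA_PATH_ELEM];
    if let Some(unhashed) = E::UNHASHED_PATH_ELEM {
        elems.push(unhashed);
    }
    elems
}
```

Because the constants belong to the type, generic code such as `path_elems::<BootstrapEntry>()` can decide where to publish data without inspecting any individual entry.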

@lukevalenta lukevalenta requested a review from cjpatton June 10, 2025 20:15
@lukevalenta lukevalenta self-assigned this Jun 10, 2025
@lukevalenta

@rozbb fyi. I can rebase this on top of your in-progress PR once that's ready.

@lukevalenta lukevalenta requested a review from bwesterb June 16, 2025 21:08
@lukevalenta lukevalenta changed the title Add support for Bootstrap Merkle Tree Certificate log entries Add support for Bootstrap Merkle Tree Certificate log Jun 16, 2025
@lukevalenta lukevalenta force-pushed the lvalenta/mtc branch 4 times, most recently from 4e98002 to ee03357 Compare June 17, 2025 13:52

@rozbb rozbb left a comment


Looks good! One question: what precisely is the unhashed data tile? Is it just auxiliary data that doesn't go in the normal data tile?

I ask bc I have my own use case. I wanna use tlog tiles for webapp manifests, but these manifests can be multiple megabytes in size. The size limit for tile entries is 65kB, though. So my plan was to store the full manifest in another URL, and include the hash plus timestamp in the real data tiles. But my question then is:

  1. Is this the intended use case of "unhashed" data?
  2. Should we thus extend the existence of aux data to be independent of the Log entry type? Tlog tiles can have or not have aux data, depending on use case.
  3. Is the parsing of aux data currently the same as non-aux data? That is, is it reading a u16 and returning the entry? Because if one use case is to store larger data, then we're gonna have to support other encodings

@lukevalenta

Looks good! One question: what precisely is the unhashed data tile? Is it just auxiliary data that doesn't go in the normal data tile?

Yes! I thought about calling it "unauthenticated data" or "extra data" in RFC6962 terminology (although in that case the extra data goes into the data tile, not a separate tile), but settled on "unhashed" to emphasize that the data is not hashed into the tree (and thus not authenticated). Happy to change to "auxiliary"!

I ask bc I have my own use case. I wanna use tlog tiles for webapp manifests, but these manifests can be multiple megabytes in size. The size limit for tile entries is 65kB, though. So my plan was to store the full manifest in another URL, and include the hash plus timestamp in the real data tiles. But my question then is:

  1. Is this the intended use case of "unhashed" data?

Yes, that would be a good use case, if you wanted to keep the manifest data in a convenient location alongside the corresponding log entries.

  2. Should we thus extend the existence of aux data to be independent of the Log entry type? Tlog tiles can have or not have aux data, depending on use case.

For MTCs I just wrapped the tlog-tiles entry with an outer MtcLogEntry that adds the aux data. The reason for making it an associated constant in the PendingLogEntry struct is so we can encode the path in the bucket (e.g., /bootstrap for MTCs) and whether or not the aux data is present, without having to transmit that information to the Sequencer for each log entry. It would be nice to use TlogTilesPendingLogEntry directly if we could use const generics to specify the path for the aux data, but const generics are restricted to primitive types. Can you think of a better way to do this?

  3. Is the parsing of aux data currently the same as non-aux data? That is, is it reading a u16 and returning the entry? Because if one use case is to store larger data, then we're gonna have to support other encodings

Nope! The Sequencer just writes the blobs without adding any additional length prefix, so it's up to the application to define the format (and make sure it can be parsed). For example, the MTC bootstrap tile entries each have a u24 length prefix.
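
To make the "application defines the format" point concrete, here is a minimal sketch of writing and reading entries with a u24 length prefix. The function names `encode_u24_prefixed` and `split_u24_prefixed` are hypothetical, and big-endian byte order is an assumption (it matches TLS-style encodings); this is not the actual MTC parsing code.

```rust
/// Prefix `entry` with its length as a 3-byte big-endian integer.
/// Returns None if the entry is too large for a u24 length.
fn encode_u24_prefixed(entry: &[u8]) -> Option<Vec<u8>> {
    if entry.len() > 0xFF_FFFF {
        return None;
    }
    let len = entry.len() as u32;
    let mut out = Vec::with_capacity(3 + entry.len());
    // Drop the most significant byte of the u32 to get a u24.
    out.extend_from_slice(&len.to_be_bytes()[1..]);
    out.extend_from_slice(entry);
    Some(out)
}

/// Split a blob of concatenated u24-length-prefixed entries into slices.
/// Returns None if a prefix or an entry is truncated.
fn split_u24_prefixed(mut blob: &[u8]) -> Option<Vec<&[u8]>> {
    let mut entries = Vec::new();
    while !blob.is_empty() {
        if blob.len() < 3 {
            return None;
        }
        let len = (usize::from(blob[0]) << 16)
            | (usize::from(blob[1]) << 8)
            | usize::from(blob[2]);
        let rest = &blob[3..];
        if rest.len() < len {
            return None;
        }
        let (entry, tail) = rest.split_at(len);
        entries.push(entry);
        blob = tail;
    }
    Some(entries)
}
```

A 3-byte prefix allows entries of up to 16 MiB minus one byte, compared with the 65,535-byte ceiling of a u16 prefix.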


rozbb commented Jun 17, 2025

Nice! That seems like the best version of what we could want. And I think an associated constant is perfectly good for this.

@lukevalenta lukevalenta merged commit 264564a into main Jun 17, 2025
1 check passed
@lukevalenta lukevalenta deleted the lvalenta/mtc branch June 17, 2025 20:29