Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unparsed: expose content of the extension #72

Closed
wants to merge 2 commits into from

Conversation

baloo
Copy link
Collaborator

@baloo baloo commented May 17, 2024

Unparsed is intended to hold the value of the extension. Specification calls for:

    byte             SSH_AGENTC_EXTENSION
    string           extension type
    byte[]           extension request-specific contents

But does not mandates for content to be serialized according to ssh regular serialization format (as in implements ssh_encoding::Decode).

The Unparsed::From implementation allows for content to be formatted in any serialization, but you can't read data back unless it was encoded according to SSH serialization.

This partially reverts #67

Unparsed is intended to hold the value of the extension.
Specification calls for:
```
    byte             SSH_AGENTC_EXTENSION
    string           extension type
    byte[]           extension request-specific contents
```

But does not mandates for content to be serialized according to ssh
regular serialization format (as in implements `ssh_encoding::Decode`).

The `Unparsed::From` implementation allows for content to be formatted
in any serialization, but you can't read data back unless it was encoded
according to SSH serialization.

This partially reverts wiktor-k#67

Signed-off-by: Arthur Gautier <[email protected]>
@baloo baloo requested a review from overhacked May 17, 2024 21:10
@baloo baloo mentioned this pull request May 17, 2024
@@ -16,6 +16,11 @@ impl Unparsed {
let mut v = &self.0[..];
T::decode(&mut v)
}

/// Expose the inner content in raw format.
pub fn as_slice(&self) -> &[u8] {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be happy to add an hazmat feature if needed, or rename it like as_raw or something to highlight this is dangerous, but then I would argue From<Vec<u8>> should be removed too and replaced by:

impl Unparsed {
   pub fn new<T>(v: T) where T: ssh_encoding::Encoding -> Result<Self> {
   }
}

to highlight that.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I like the "as raw" naming. Maybe to keep the symmetry we'd need "from raw" which takes a vec of unparsed data and doesn't interpret it in any way?

I like this new constructor. Do I read the signature well that it is intended for ssh encod-able data not for raw stuff?

I mean... we need raw support for crazy folks that want to use serde there (just when we got rid of it 😉).

One potential addition, not sure if good, is implementing AsRef<[u8]>. This would allow treating Unparsed as byte buffers (the implementation is the same as what you have here, just... more idiomatic (?)).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AsRef<[u8]> makes accessing the content not-so-explicit imho, which renders the use of a type system a bit moot.

@baloo
Copy link
Collaborator Author

baloo commented May 18, 2024

Just to be transparent about the intent:
I have my own protocol over extensions. The messages in the payload are serialized with serde because it makes it easier to work with than having to serialize using ssh's encoding.

Arguably I could just shove everything in a Vec and hand that to Unparsed, but that makes for a double (or triple) serialization, and I find it somehow irritating.

overhacked added a commit to overhacked/ssh-agent-lib that referenced this pull request May 18, 2024
Add conversion methods to gain access to the inner bytes of `Unparsed`,
per discussion in wiktor-k#72

Signed-off-by: Ross Williams <[email protected]>
@overhacked
Copy link
Collaborator

Summary

PR #73 opened to show proposed alternative.

  1. Existing impl Encode for Unparsed just writes the raw bytes to the buffer, per spec
  2. AsRef<[u8]> would allow for the serde::Deserialize use case
  3. Adding Unparsed::into_bytes(self) -> Vec<u8> would allow for serde::de::DeserializeOwned

Discussion

The Encode implementation for Unparsed does follow the spec, as it just writes the raw bytes to the buffer it is given:

fn encode(&self, writer: &mut impl Writer) -> ssh_encoding::Result<()> {
// NOTE: Unparsed fields do not embed a length u32,
// as the inner Vec<u8> encoding is implementation-defined
// (usually an Extension)
writer.write(&self.0[..])?;
Ok(())
}

Regarding the AsRef<[u8]> vs Unparsed.as_raw() conversation, I lean towards AsRef, because it can be used in existing generic code that has AsRef<[u8]> argument bounds. I don't think it's circumvents the type system the way Deref<[u8]> would, because AsRef coercions are not automatic; a function that takes AsRef<T> has to call T.as_ref() to get the reference. (With Deref, any caller could "accidentally" call a &[u8] function on Unparsed with no explicit conversion.)

Regarding From<Vec<u8>> vs Unparsed.from_raw(), I'd keep the idiomatic impl From because it correctly implies that there is no difference in the data stored in a Vec<u8> and an Unparsed, while from_raw() implies that the internal representation changes.

Arguably I could just shove everything in a Vec and hand that to Unparsed, but that makes for a double (or triple) serialization, and I find it somehow irritating.

I don't think there is any double- or triple-serialization or even allocation happening in this case, because the From<Vec<u8>> takes ownership of the Vec you provide without copying or cloning. When Encode::encode is called, the bytes are in fact copied into the outgoing message buffer, but that occurs with all of the types that implement Encode, not just Unparsed. @baloo, is there something redundant here that I don't see?

If we wanted to provide an owned-not-borrowed deserialization path, I'd propose the inverse of From<Vec<u8>> as Unparsed::into_bytes(self) -> Vec<u8> that consumes the Unparsed, unwrapping the inner bytes.

@baloo
Copy link
Collaborator Author

baloo commented May 18, 2024

@baloo, is there something redundant here that I don't see?

No there is not redundancy, I completely missed that part:

The Encode implementation for Unparsed does follow the spec, as it just writes the raw bytes to the buffer it is given:

I can go with #73 beside the small comment I made there.

@baloo
Copy link
Collaborator Author

baloo commented May 19, 2024

superseded by #73

@baloo baloo closed this May 19, 2024
@baloo baloo deleted the baloo/unparsed/expose-value branch May 19, 2024 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants