Skip to content

Conversation

@avinayak
Copy link

@avinayak avinayak commented Nov 5, 2023

This PR Adds support for SelectObjectContentRequest in S3. This let's users query contents of S3 Objects if the yare of type CSV, JSON or Parquet.
Explanation for some design choices:

  • The output type of this request is a chunked EventStream(a binary format) encoded octet-stream.
  • There is no Range header for this request. So, we cannot use multiple http requests to stream this.
  • The only way to stream this is at the socket level (http streaming), which I added in feat: Add ability to perform octet streaming ex_aws#1012
    @bernardd Can you please check this?

@avinayak avinayak mentioned this pull request Nov 5, 2023
SelectObjectContents struct and add XML
serialization helpers
options for input_serialization,
output_serialization, and scan_range
@avinayak avinayak marked this pull request as ready for review November 9, 2023 00:27
@avinayak
Copy link
Author

avinayak commented Nov 9, 2023

ping @bernardd

describe "integration test" do
setup [:start_bypass]

test "stream SelectObjectContent results", %{bypass: bypass} do
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ex-aws/ex_aws#1012 has to be merged for this test to pass

@seanhuggins1
Copy link

seanhuggins1 commented Dec 7, 2023

This is great work, would love to see this merged.

def to_xml(value) when is_atom(value), do: Atom.to_string(value) |> String.upcase()
def to_xml(value) when is_binary(value), do: value
def to_xml(value) when is_integer(value), do: Integer.to_string(value)
def to_xml(value) when is_float(value), do: Float.to_string(value)
Copy link
Contributor

@bernardd bernardd Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably best not to roll our own XML encoder here - it misses some basic stuff like escaping of control characters. Maybe add something like https://github.com/joshnuss/xml_builder as an optional dependency to be used here?

@bernardd
Copy link
Contributor

Thanks very much for your work on this and the corresponding PR over in ex_aws_s3, @avinayak. Please also accept my apologies for not getting to it sooner. It's been quite a busy end of the year over here.
At the moment it looks like there's a bunch of tests that are failing as a result of the changes to the request family of function signatures. I've also made a note about the XML stuff.
If you're able to take a look at that stuff and rebase it against the latest main, I'll do my best to get it merged as soon as I can.

@avinayak avinayak closed this Aug 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants