Implement Typed Documents and TypeRegistry #282

Open
wants to merge 44 commits into
base: decaf
Changes from 22 commits
Commits
d33fcdf
Add type registry prototype class
jterapin Mar 6, 2025
5ff40cc
Add type registry to codegenerated schema
jterapin Mar 6, 2025
66a6285
Update projections
jterapin Mar 6, 2025
c5e45ed
Merge branch 'decaf' into typed_documents
jterapin Mar 7, 2025
2bd15a2
Merge branch 'decaf' into typed_documents
jterapin Mar 10, 2025
03bfa82
Merge branch 'decaf' into typed_documents
jterapin Mar 12, 2025
e6435d5
Update requires
jterapin Apr 3, 2025
0830827
Add initial document implementation
jterapin Apr 3, 2025
877654f
Merge decaf into branch
jterapin Apr 3, 2025
a61318f
Update to include cbor
jterapin Apr 7, 2025
4edfae3
Expand on typed docs
jterapin Apr 7, 2025
ff959f1
Update file names
jterapin Apr 7, 2025
8b9b560
Merge branch 'decaf' into typed_documents
jterapin Apr 9, 2025
598db66
More refactoring
jterapin Apr 11, 2025
3a4c0d1
Merge branch 'decaf' into typed_documents
jterapin Apr 11, 2025
269b2b5
Remove scratches
jterapin Apr 11, 2025
90c58ce
Fix rubocop
jterapin Apr 11, 2025
a1e46cc
Clean up document
jterapin Apr 14, 2025
8b666cd
Clean document specs
jterapin Apr 14, 2025
2ddf4bd
Update TypeRegistry
jterapin Apr 14, 2025
112ddf4
Add documentation
jterapin Apr 14, 2025
6283813
Add TypeRegistry specs
jterapin Apr 14, 2025
efbfa5e
Merge branch 'decaf' into typed_documents
jterapin Apr 14, 2025
88ff845
Add TypeRegistry tests
jterapin Apr 15, 2025
66b2cde
Update projections
jterapin Apr 15, 2025
22998a0
Update syntax
jterapin Apr 15, 2025
9afeacd
Update projections
jterapin Apr 15, 2025
232844c
Merge branch 'decaf' into typed_documents
jterapin Apr 17, 2025
e8920fd
Change schema name to shapes to stay aligned
jterapin Apr 17, 2025
ef8c027
Remove as_typed method from TypeRegistry
jterapin Apr 17, 2025
3854661
Create TimeHelper module
jterapin Apr 18, 2025
48e1b0f
Fix timestamp failures
jterapin Apr 18, 2025
04d0d5b
Refactor type registry per feedbacks
jterapin Apr 21, 2025
b35e09b
Update 1 projection
jterapin Apr 21, 2025
66825be
Update weather projection
jterapin Apr 21, 2025
4efe669
Use SchemaHelper for testing
jterapin Apr 21, 2025
73e0e83
Update TypeRegistry to use SchemaHelper for testing
jterapin Apr 21, 2025
60ef29c
Add TypeRegistry documentation
jterapin Apr 21, 2025
b67cf54
Update example
jterapin Apr 28, 2025
ea9b380
Merge decaf into branch
jterapin Apr 28, 2025
381602b
Remove reference to type registry from client
jterapin Apr 28, 2025
6d41f64
Update projections
jterapin Apr 28, 2025
83fa2bb
Update projections
jterapin Apr 28, 2025
b23a2e9
Document now inherits SimpleDelegator
jterapin Apr 28, 2025
2 changes: 1 addition & 1 deletion README.md
@@ -27,7 +27,7 @@ bundle exec smithy-ruby smith client --gem-name weather --gem-version 1.0.0 --de
### IRB
IRB on `weather` gem:
```
-irb -I projections/weather/lib -I gems/smithy-client/lib -I gems/smithy-schema/lib -r weather
+irb -I projections/weather/lib -I gems/smithy-client/lib -I gems/smithy-schema/lib -I gems/smithy-cbor/lib -r weather
```

Create a Weather client:
2 changes: 2 additions & 0 deletions gems/smithy-schema/lib/smithy-schema.rb
@@ -2,6 +2,8 @@

require_relative 'smithy-schema/shapes'
require_relative 'smithy-schema/structure'
+require_relative 'smithy-schema/document'
+require_relative 'smithy-schema/type_registry'
require_relative 'smithy-schema/union'

module Smithy
104 changes: 104 additions & 0 deletions gems/smithy-schema/lib/smithy-schema/document.rb
@@ -0,0 +1,104 @@
# frozen_string_literal: true

require_relative 'document_utils'

module Smithy
module Schema
# A Smithy document type, representing typed or untyped data from the Smithy data model.
# ## Document types
Contributor Author

I could add more details (e.g. code examples) but I might punt that to the Smithy-Ruby Wiki instead.

Contributor

I think in-code examples/documentation are preferred but it doesn't have to be done this moment.

Contributor

Yea - I'd agree examples and more detail would be better as in-code documentation.

Contributor Author

I will add them now - since information is very fresh in my head.
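
As a possible starting point for the in-code examples discussed in this thread, here is a minimal, self-contained sketch of how the `Document` class in this diff appears to treat plain Hash input. `SketchDocument` is a hypothetical stand-in, not the class from the PR: it only mirrors the `__type` discriminator handling and the `[]` accessor, and it skips schema handling and timestamp formatting. The shape ID `example.weather#City` is made up for illustration.

```ruby
# Hypothetical stand-in for Smithy::Schema::Document (illustration only).
# It mirrors two behaviours visible in the diff: the '__type' discriminator
# is split from the data, and #[] returns nil for non-Hash data.
class SketchDocument
  attr_reader :data, :discriminator

  def initialize(data)
    typed = data.is_a?(Hash) && data.key?('__type')
    @discriminator = typed ? data['__type'] : nil
    # Hash#except (Ruby 3.0+) returns a copy without the given keys.
    @data = typed ? data.except('__type') : data
  end

  # Returns nil unless the data is a Hash that contains the key.
  def [](key)
    return unless @data.is_a?(Hash) && @data.key?(key)

    @data[key]
  end
end

# 'example.weather#City' is an invented shape ID.
typed = SketchDocument.new('__type' => 'example.weather#City', 'name' => 'Seattle')
untyped = SketchDocument.new([1, 2, 3])
```

In this sketch `typed.discriminator` holds the shape ID and `typed['name']` reads a member, while `untyped[0]` is `nil` because `[]` only looks into Hash data.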

# Document types are a protocol-agnostic view of untyped data. They can be combined
# with a schema to serialize their contents.
#
# Smithy-Ruby currently only supports JSON documents.
class Document
# @param [Object] data document data
# @param [Hash] options
# @option options [Shapes::StructureShape] :schema schema to reference when setting
# document data. Only applicable when the data param is a {Smithy::Schema::Structure}.
# @option options [Boolean] :use_timestamp_format Whether to use the `timestampFormat`
# trait or ignore it when creating a {Document} with given schema. The `timestampFormat`
# trait is ignored by default.
# @option options [Boolean] :use_json_name Whether to use the `jsonName` trait or ignore
# it when creating a {Document} with given schema. The `jsonName` trait is ignored
# by default.
Contributor Author

I was pretty torn on whether I want to have these options around. These options only apply when we create a document with a given schema, to decide whether we want to honor specific traits in the @data.

My idea is that once @data is set - we can simply do JSON.dump(document.data) to load the data for a request - in lieu of having to iterate over the document data to adhere to traits during SERDE process.

Contributor

If you are referring to the use_x options, I think they're pretty useless, and can always be added later. I'd start with them removed.

Contributor Author

I noticed that there are test cases now; I might look at what they ended up doing to decide whether I want to keep these options around.
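
The idea raised in this thread (normalize the data once when the document is created, then call `JSON.dump(document.data)` at request time) can be sketched as follows. `format_untyped` is a hypothetical helper that copies the logic of `DocumentUtils.format` from this diff, and the sample payload is invented.

```ruby
require 'json'

# Hypothetical copy of DocumentUtils.format from this diff: recursively
# replaces Time values with epoch seconds so the result is JSON-ready.
def format_untyped(data)
  case data
  when Time then data.to_i
  when Hash then data.transform_values { |v| format_untyped(v) }
  when Array then data.map { |d| format_untyped(d) }
  else data
  end
end

# Invented sample payload with a nested Time value.
payload = { 'city' => 'Seattle', 'readings' => [{ 'at' => Time.at(0), 'temp' => 12.5 }] }
normalized = format_untyped(payload)

# Because the data was normalized up front, serialization needs no
# trait-aware traversal; a plain JSON.dump is enough.
json = JSON.dump(normalized)
```

The trade-off is that any trait-dependent choice (for example a non-default timestamp format) has to be made when the document is built, since nothing revisits the data afterwards.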

def initialize(data, options = {})
@data = set_data(data, options)
@discriminator = extract_discriminator(data, options)
end

# @return [Object] data
attr_reader :data

# @return [String] discriminator
attr_reader :discriminator

# @param [Object] key
# @return [Object]
def [](key)
return unless @data.is_a?(Hash) && @data.key?(key)

@data[key]
end

# @param [Shapes::Shape] schema
# @return [Smithy::Schema::Structure] typed shape
def as_typed(schema)
error_message = 'Invalid schema or document data'
raise ArgumentError, error_message unless valid_schema?(schema) && @data.is_a?(Hash)

type = schema.type.new
DocumentUtils.apply(@data, schema, type)
end

private

def discriminator?(data)
data.is_a?(Hash) && data.key?('__type')
end

def extract_discriminator(data, opts)
return if data.nil?

return unless discriminator?(data) || (schema = opts[:schema])

if discriminator?(data)
data['__type']
else
error_message = "Expected a structure schema, given #{schema.class} instead"
raise error_message unless valid_schema?(schema)

schema.id
end
end

def set_data(data, opts)
return if data.nil?

case data
when Smithy::Schema::Structure
Contributor

Isn't this already "done"? If it's a structure then it's already typed.

Contributor Author

Can you expand on that? What do you mean by being "done"? I still want to re-format the typed shape into a document format.
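
To make "re-format the typed shape into a document format" concrete, here is a rough, self-contained sketch of the round trip that `DocumentUtils.extract` and `DocumentUtils.apply` in this PR seem to perform. Everything in it is hypothetical: `City`, `CITY_MEMBERS`, `to_document_data`, and `to_typed` are invented stand-ins, and the hand-written member table replaces the generated schema. Unlike the PR code, nil members are simply omitted.

```ruby
require 'base64'

# Hypothetical typed shape; real Smithy-Ruby types are generated from a model.
City = Struct.new(:name, :photo, keyword_init: true)

# Hypothetical stand-in for a structure schema: member name => options.
CITY_MEMBERS = {
  name: { json_name: 'cityName', blob: false },
  photo: { json_name: nil, blob: true }
}.freeze

# Typed -> document data (compare DocumentUtils.extract in the diff).
# Blob members become Base64 strings; nil members are omitted here.
def to_document_data(struct, use_json_name: false)
  struct.to_h.each_with_object({}) do |(member, value), out|
    next if value.nil?

    opts = CITY_MEMBERS.fetch(member)
    key = use_json_name && opts[:json_name] ? opts[:json_name] : member.to_s
    out[key] = opts[:blob] ? Base64.strict_encode64(value) : value
  end
end

# Document data -> typed (compare DocumentUtils.apply in the diff).
# A key matches either the jsonName or the member name; others are skipped.
def to_typed(data)
  city = City.new
  data.each do |key, value|
    member, opts = CITY_MEMBERS.find { |name, o| o[:json_name] == key || name.to_s == key }
    next if member.nil?

    city[member] = opts[:blob] ? Base64.decode64(value) : value
  end
  city
end

city = City.new(name: 'Seattle', photo: 'raw-bytes')
doc_data = to_document_data(city, use_json_name: true)
round_trip = to_typed(doc_data)
```

The point of the sketch is that a typed value is not yet in document form: member names may need renaming and blobs need encoding, which is why some schema-like description is required for the conversion.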

schema = opts[:schema]
if schema.nil? || !valid_schema?(schema)
Contributor

It does still seem weird to me that the run time type doesn't know its own schema

Contributor Author

I think it would be useful to have that information at hand, but we had a previous discussion on whether this should be a thing, given that typed shapes are POROs. I don't have a strong opinion, but I think starting out without it is fine until we have a use case that absolutely needs it.

raise ArgumentError, "Unable to create a document with given schema: #{schema}"
end

opts = opts.except(:schema)
# case 1 - extract data from runtime shape; the schema is required to know how to extract it properly
DocumentUtils.extract(data, schema, opts)

else
if discriminator?(data)
# case 2 - extract typed data from parsed JSON
data.except('__type')
else
# case 3 - untyped data, we will need to consolidate timestamps and such
DocumentUtils.format(data)
end
end
end

def valid_schema?(schema)
schema.is_a?(Shapes::StructureShape) && !schema.type.nil?
end
end
end
end
231 changes: 231 additions & 0 deletions gems/smithy-schema/lib/smithy-schema/document_utils.rb
@@ -0,0 +1,231 @@
# frozen_string_literal: true

require 'base64'
require 'time'

module Smithy
module Schema
# @api private
# Document Utilities to help (de)construct data to/from Smithy document
module DocumentUtils
class << self
# Used to transform untyped data
def format(data)
return if data.nil?

case data
when Time
data.to_i # timestamp format is "epoch-seconds" by default
when Hash
data.transform_values { |v| format(v) }
when Array
data.map { |d| format(d) }
else
data
end
end

# Used to apply data to runtime shape
def apply(data, schema, type = nil)
case shape(schema)
when Shapes::StructureShape then apply_structure(data, schema, type)
when Shapes::UnionShape then apply_union(data, schema, type)
when Shapes::ListShape then apply_list(data, schema)
when Shapes::MapShape then apply_map(data, schema)
when Shapes::TimestampShape then apply_timestamp(data, schema)
when Shapes::BlobShape then Base64.decode64(data)
else data
end
end

# rubocop:disable Metrics/CyclomaticComplexity
def extract(data, schema, opts = {})
return if data.nil?

case shape(schema)
when Shapes::StructureShape then extract_structure(data, schema, opts)
when Shapes::UnionShape then extract_union(data, schema, opts)
when Shapes::ListShape then extract_list(data, schema)
when Shapes::MapShape then extract_map(data, schema)
when Shapes::BlobShape then extract_blob(data)
when Shapes::TimestampShape then extract_timestamp(data, schema, opts)
else data
end
end
# rubocop:enable Metrics/CyclomaticComplexity

private

def apply_list(data, schema)
shape = shape(schema)
data.map do |v|
next if v.nil?

apply(v, shape.member)
end
end

def apply_map(data, schema)
shape = shape(schema)
data.transform_values do |v|
if v.nil?
nil
else
apply(v, shape.value)
end
end
end

def apply_structure(data, schema, type)
shape = shape(schema)

type = shape.type.new if type.nil?
data.each do |k, v|
name =
if (member = member_with_json_name(k, shape))
shape.name_by_member_name(member.name)
else
member_name(shape, k)
end
next if name.nil?

type[name] = apply(v, shape.member(name))
end
type
end

def apply_timestamp(data, schema)
data = data.is_a?(Numeric) ? Time.at(data) : Time.parse(data)
time(data, timestamp_format(schema))
end

def apply_union(data, schema, type)
shape = shape(schema)
key, value = data.flatten
return if key.nil?

if (member = member_with_json_name(key, shape))
apply_union_member(member.name, value, shape, type)
elsif shape.name_by_member_name?(key)
apply_union_member(key, value, shape, type)
else
shape.member_type(:unknown).new(key, value)
end
end

def apply_union_member(key, value, shape, type)
member_name = shape.name_by_member_name(key)
type = shape.member_type(member_name) if type.nil?
type.new(apply(value, shape.member(member_name)))
end

def extract_blob(data)
Base64.strict_encode64(data.is_a?(String) ? data : data.read)
end

def extract_list(data, schema)
shape = shape(schema)
data.collect { |v| extract(v, shape.member) }
end

def extract_map(data, schema)
shape = shape(schema)
data.each.with_object({}) { |(k, v), h| h[k] = extract(v, shape.value) }
end

def extract_structure(data, schema, opts)
shape = shape(schema)
data.to_h.each_with_object({}) do |(k, v), o|
next unless shape.member?(k)

member_shape = shape.member(k)
member_name = resolve_member_name(member_shape, opts)
o[member_name] = extract(v, member_shape, opts)
end
end

def extract_timestamp(data, schema, opts)
return unless data.is_a?(Time)

trait = timestamp_format(schema) if opts[:use_timestamp_format]
time(data, trait)
end

# rubocop:disable Metrics/AbcSize
def extract_union(data, schema, opts)
h = {}
shape = shape(schema)
if data.is_a?(Schema::Union)
member_shape = shape.member_by_type(data.class)
member_name = resolve_member_name(member_shape, opts)
h[member_name] = extract(data, member_shape).value
else
key, value = data.first
if shape.member?(key)
member_shape = shape.member(key)
member_name = resolve_member_name(member_shape, opts)
h[member_name] = extract(value, member_shape)
end
end
h
end
# rubocop:enable Metrics/AbcSize

def member_name(schema, key)
return unless schema.name_by_member_name?(key) || schema.member?(key.to_sym)

schema.name_by_member_name(key) || key.to_sym
end

def member_with_json_name(name, shape)
shape.members.values.find do |v|
v.traits['smithy.api#jsonName'] == name if v.traits.include?('smithy.api#jsonName')
end
end

def resolve_member_name(member_shape, opts)
if opts[:use_json_name] && member_shape.traits['smithy.api#jsonName']
member_shape.traits['smithy.api#jsonName']
else
member_shape.name
end
end

def shape(schema)
schema.is_a?(Shapes::MemberShape) ? schema.shape : schema
end

# The timestamp format is determined as follows:
#   1. Use the timestampFormat trait of the member, if present.
#   2. Use the timestampFormat trait of the shape, if present.
#   3. Otherwise, use epoch-seconds as the default.
def timestamp_format(schema)
if schema.traits['smithy.api#timestampFormat']
schema.traits['smithy.api#timestampFormat']
elsif schema.shape.traits['smithy.api#timestampFormat']
schema.shape.traits['smithy.api#timestampFormat']
else
'epoch-seconds'
end
end

def time(data, trait = nil)
if trait
case trait
when 'http-date'
data.utc.httpdate
when 'date-time'
data.utc.iso8601
when 'epoch-seconds'
data.utc.to_i
else
raise "unhandled timestamp format `#{trait}`"
end
else
data.utc.to_i # default format
end
end
end
end
end
end
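
For reference, the three `timestampFormat` values handled in this file map onto Ruby's `Time` API roughly as sketched below. In Smithy, `date-time` is an RFC 3339 string and `http-date` is an IMF-fixdate string. `format_timestamp` is a hypothetical helper written for this illustration, not a method from the PR, and it uses `Time#getutc` rather than `Time#utc` so the caller's object is not changed.

```ruby
require 'time'

# Hypothetical helper illustrating the Smithy timestampFormat values.
# Time#getutc returns a new UTC Time and leaves the receiver untouched.
def format_timestamp(time, format = 'epoch-seconds')
  utc = time.getutc
  case format
  when 'date-time' then utc.iso8601       # RFC 3339 style string
  when 'http-date' then utc.httpdate      # IMF-fixdate string
  when 'epoch-seconds' then utc.to_i      # whole seconds since the epoch
  else raise ArgumentError, "unhandled timestamp format `#{format}`"
  end
end

epoch = Time.at(0)
as_date_time = format_timestamp(epoch, 'date-time')
as_http_date = format_timestamp(epoch, 'http-date')
as_seconds = format_timestamp(epoch)
```

Note that `to_i` drops any fractional seconds; whether that is acceptable for `epoch-seconds` is a design decision this sketch does not settle.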
2 changes: 1 addition & 1 deletion gems/smithy-schema/lib/smithy-schema/structure.rb
@@ -43,7 +43,7 @@ def _to_h_array(obj)
end
end

-# An empty Struct that includes the {Client::Structure} module.
+# An empty Struct that includes the {Schema::Structure} module.
EmptyStructure = Struct.new do
include Smithy::Schema::Structure
end