Add support for Smithy bigInteger and bigDecimal types as string wrappers in aws-smithy-types, allowing users to parse with their preferred big number library. #4418

Merged
landonxjames merged 15 commits into smithy-lang:main from AmitKulkarni23:add-biginteger-bigdecimal-support
Jan 29, 2026

@AmitKulkarni23
Contributor

Motivation and Context

Fixes #312

Smithy defines bigInteger and bigDecimal types for arbitrary-precision numbers, but smithy-rs had TODO placeholders instead of implementations. This prevented users from working with services that use these types.

Description

  • Added BigInteger and BigDecimal runtime types in aws-smithy-types as string wrappers
  • Implemented JSON serialization/deserialization in codegen
  • String-based approach allows users to choose their preferred big number library (e.g., num-bigint, rust_decimal)
  • Added unit tests and integration tests with protocol test coverage

Testing

  • Added unit tests in SymbolVisitorTest.kt
  • Created integration test model big-numbers.smithy with protocol tests
  • All codegen-core tests pass
  • All codegen-client-test integration tests pass

Checklist

  • [x] For changes to the smithy-rs codegen or runtime crates, I have created a changelog entry Markdown file in the .changelog directory, specifying "client," "server," or both in the applies_to key.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@AmitKulkarni23 AmitKulkarni23 requested review from a team as code owners November 21, 2025 16:59
Collaborator

@rcoh rcoh left a comment

nice! we need to clean up those representations in aws-smithy-types so that they are forward compatible with eventual improvements

Comment on lines +24 to +27
/// Returns the string representation.
pub fn as_str(&self) -> &str {
&self.0
}
Collaborator

we shouldn't expose methods like this since they will probably be impossible to implement if we eventually switch to using a real internal representation that's not a string

Contributor Author

  1. Remove as_str() method - it's redundant and limiting
  2. Keep AsRef trait - works with any internal representation
  3. Update codegen to use .as_ref() instead of .as_str()

^^ Does this work?

Contributor Author

Addressed this in the latest revision of PR


impl BigInteger {
/// Creates a new `BigInteger` from a string.
pub fn new(value: impl Into<String>) -> Self {
Collaborator

should implement FromStr instead

Contributor Author

Ok. Will use FromStr instead of new in next revision.

Contributor Author

Addressed this in the latest revision of PR
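
For readers following the API discussion, here is a minimal sketch of the surface the two threads above converge on: construction through `FromStr` and read access through `AsRef<str>`, with no `new`, `as_str`, or `into_inner`. The error type and the digit check are assumptions made for this illustration and are not taken from the merged code.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the string-backed wrapper in aws-smithy-types.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct BigInteger(String);

/// Hypothetical error type used only by this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
#[non_exhaustive]
pub enum BigNumberError {
    InvalidFormat(String),
}

impl FromStr for BigInteger {
    type Err = BigNumberError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumed rule: optional leading '-' followed by at least one ASCII digit.
        let digits = s.strip_prefix('-').unwrap_or(s);
        if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
            return Err(BigNumberError::InvalidFormat(s.to_string()));
        }
        Ok(Self(s.to_string()))
    }
}

impl AsRef<str> for BigInteger {
    fn as_ref(&self) -> &str {
        &self.0
    }
}
```

A caller would write `"12345678901234567890".parse::<BigInteger>()` and read the text back with `.as_ref()`, then hand that text to whichever big-number crate they prefer.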

when (val target = model.expectShape(memberShape.target)) {
is StringShape -> deserializeString(target)
is BooleanShape -> rustTemplate("#{expect_bool_or_null}(tokens.next())?", *codegenScope)
is BigIntegerShape -> deserializeBigInteger()
Collaborator

we probably need to support this for more than just json protocols. also need protocol tests. does smithy have any protocol tests for these yet?

Contributor Author

Ok. I will add serialization and deserialization code for XML, CBOR protocols in the next revision.

Contributor Author

@AmitKulkarni23 AmitKulkarni23 Nov 24, 2025

does smithy have any protocol tests for these yet?

Protocol tests exist in misc.smithy but BigInteger/BigDecimal are commented out - https://github.com/smithy-lang/smithy-rs/blob/main/codegen-core/common-test-models/misc.smithy#L100. I will uncomment them now that the implementation is complete. However, it seems like misc.smithy only tests JSON. I will look at references and add protocol tests.

Contributor Author

Addressed this in the latest revision of PR


private fun RustWriter.deserializeBigInteger() {
rustTemplate(
"#{expect_string_or_null}(tokens.next())?.map(|s| s.to_unescaped().map(|u| #{BigInteger}::new(u.into_owned()))).transpose()?",
Contributor

Also many (all?) of the supported JSON protocols represent these as regular JSON numbers: https://smithy.io/2.0/aws/protocols/aws-json-1_0-protocol.html#shape-serialization

Contributor Author

Thanks for the reference. I will use expect_number_or_null instead of expect_string_or_null

Contributor Author

Addressed this in the latest revision of PR

Comment on lines +29 to +32
/// Consumes the `BigInteger` and returns the inner string.
pub fn into_inner(self) -> String {
self.0
}
Collaborator

For the same reason as above ("if we eventually switch to using a real internal representation that's not a string"), we could consider delaying adding this to leave an option for the future, unless this conversion is required right now.

Contributor Author

Ok. Will remove this.

Contributor Author

Addressed this in the latest revision of PR

@AmitKulkarni23 AmitKulkarni23 force-pushed the add-biginteger-bigdecimal-support branch from 343e932 to 019c154 Compare November 25, 2025 16:48

is TimestampShape -> rust("decoder.timestamp()")

is BigIntegerShape ->
Contributor Author

Note to reviewers:

I have added serialization and parsing logic for the following protocols:

  • JSON
  • CBOR
  • XML
  • AWS Query
  • AWS EC2

Let me know if there are any other protocols.


impl Default for BigInteger {
fn default() -> Self {
Self("0".to_string())
Contributor Author

@AmitKulkarni23 AmitKulkarni23 Nov 25, 2025

Question for reviewers: Default Values for BigInteger/BigDecimal

I've implemented Default trait for both types to support error correction in client codegen:

Context:
ErrorCorrection.kt line 67 generates Some(Default::default()) for all NumberShape types, including BigInteger/BigDecimal, when required fields are missing during deserialization.

Are "0" and "0.0" appropriate defaults for error correction scenarios?

Collaborator

we should match whatever we do for normal integers

Contributor Author

My thought process:

BigInteger/BigDecimal Default implementations match primitive number behavior:

i32::default()  = 0
i64::default()  = 0
f32::default()  = 0.0
f64::default()  = 0.0
u32::default()  = 0
u64::default()  = 0
i8::default()   = 0
i16::default()  = 0
  • BigInteger::default() returns BigInteger("0") (string "0" representing zero)
  • BigDecimal::default() returns BigDecimal("0.0") (string "0.0" representing zero)

All number types default to their zero representation. BigInteger/BigDecimal use string storage for arbitrary precision, but semantically represent the same zero value as primitive numbers.
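
A small sketch of the behaviour described above, under the assumption that error correction simply substitutes `Default::default()` for a missing required member. The `correct_missing` helper is hypothetical and only models that substitution; it is not the code generated by ErrorCorrection.kt.

```rust
/// Hypothetical string-backed wrappers mirroring the types in this thread.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BigInteger(String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BigDecimal(String);

impl Default for BigInteger {
    fn default() -> Self {
        // Zero, stored as text.
        Self("0".to_string())
    }
}

impl Default for BigDecimal {
    fn default() -> Self {
        // Zero, stored as text with decimal notation.
        Self("0.0".to_string())
    }
}

impl AsRef<str> for BigInteger {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

impl AsRef<str> for BigDecimal {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

/// Illustrative only: fill a missing required member with its default value.
pub fn correct_missing<T: Default>(member: Option<T>) -> Option<T> {
    member.or_else(|| Some(T::default()))
}
```

With this shape, a missing `i64` member and a missing `BigInteger` member both come back as their zero value.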

Collaborator

@rcoh rcoh left a comment

Overall looks pretty good. We can decide how we want to handle large numbers in JSON -- currently as you have implemented there will be a loss of precision (but there is no inherent need for that since we control aws-smithy-json and can have it parse a number as a string directly).

bodyMediaType: "application/xml",
headers: {"Content-Type": "application/xml"},
params: {
bigInt: 987654321,
Collaborator

shouldn't we actually use numbers that don't fit into int/decimals?

Contributor Author

The test values are limited by Smithy's Java-based model parser, which converts numeric literals to Java Number types. When these are serialized back to strings, Java uses scientific notation for large values.

Example of the parser limitation:

params: {
   bigDec: 123456789012345.123456789
}

After Smithy parses this, the codegen sees: 1.2345678901234512E14 (scientific notation with precision loss)

However, this is only a test limitation - the actual runtime code handles arbitrary precision correctly:

  • XML/JSON input is parsed as strings from the wire
  • BigDecimal/BigInteger use FromStr to parse directly from those strings
  • Serialization writes the string back via .as_ref()
  • No precision loss occurs in production code
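
To make the contrast concrete, the sketch below compares handing the wire text straight to an integer parser with routing it through `f64` first. `Wire` is a hypothetical stand-in for the string wrapper, and `u128` stands in for whatever big-number library a user would actually choose.

```rust
/// Hypothetical stand-in for a string-backed big number read from the wire.
pub struct Wire(pub String);

impl AsRef<str> for Wire {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

/// Exact path: parse the decimal text directly (u128 stands in for a bignum type).
pub fn parse_exact(v: &impl AsRef<str>) -> Option<u128> {
    v.as_ref().parse::<u128>().ok()
}

/// Lossy path: the same text converted to f64 first, then back to an integer.
pub fn parse_via_f64(v: &impl AsRef<str>) -> Option<u128> {
    v.as_ref().parse::<f64>().ok().map(|f| f as u128)
}
```

For a 30-digit input such as `123456789012345678901234567890`, the exact path returns the original value while the `f64` path returns a nearby but different integer.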

Collaborator

gotcha. then we should write an integration test for this against the generated code manually. You can do this by writing a test in kotlin that generates the service, then utilizes the serializers

Contributor Author

I have done this as part of https://github.com/smithy-lang/smithy-rs/pull/4418/files#diff-ae00c010398af1c58c9129aef4393407a833d72c73b2063ad28e7754d3ae4eedR385

let input = crate::test_input::BigNumberOpInput::builder().payload(
    crate::test_model::BigNumberData::builder()
        .big_int("12345678901234567890".parse().unwrap())
        .big_dec("3.141592653589793238".parse().unwrap())
        .build()
).build().unwrap();
let serialized = ${format(operationSerializer)}(&input.payload.unwrap()).unwrap();
let output = std::str::from_utf8(&serialized).unwrap();
assert!(output.contains("<bigInt>12345678901234567890</bigInt>"));
assert!(output.contains("<bigDec>3.141592653589793238</bigDec>"));

Contributor Author

I am going to remove this unit test which has numbers that don't fit into int/decimals in next revision of the PR


is BigIntegerShape ->
rustTemplate(
"<#{BigInteger} as ::std::str::FromStr>::from_str(decoder.str()?.as_ref()).map_err(|_| #{Error}::custom(\"infallible\", decoder.position()))",
Collaborator

I assume this is what the spec said for CBOR?

Contributor Author

The current implementation uses decoder.str() / encoder.str() (CBOR text strings, Major Type 3) as a temporary approach.

According to the Smithy RPC v2 CBOR spec, BigInteger/BigDecimal should use:

  • BigInteger: Major Type 6, tags 2 (unsigned bignum) or 3 (negative bignum)
  • BigDecimal: Major Type 6, tag 4 (decimal fraction)

However, aws-smithy-cbor doesn't currently expose methods for these CBOR tags. The underlying minicbor library supports tags (used internally for timestamps), but we'd need to add public methods like encoder.bignum() and encoder.decimal() to properly implement the spec.

Any suggestions on how to address this? How do you recommend I proceed here?
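
To show what the tagged encoding involves for a string-backed value, here is a hand-rolled sketch of CBOR tag 2 (unsigned bignum) as defined in RFC 8949. It does not use aws-smithy-cbor or minicbor; both function names are hypothetical, and the sketch only handles non-negative values whose byte string is shorter than 24 bytes.

```rust
/// Convert a non-negative decimal string to big-endian bytes without leading zeros.
/// Returns None for empty input or any non-digit character.
pub fn decimal_to_be_bytes(s: &str) -> Option<Vec<u8>> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Little-endian accumulator; for each digit: value = value * 10 + digit.
    let mut le: Vec<u8> = Vec::new();
    for digit in s.bytes().map(|b| (b - b'0') as u32) {
        let mut carry = digit;
        for byte in le.iter_mut() {
            let v = (*byte as u32) * 10 + carry;
            *byte = (v & 0xff) as u8;
            carry = v >> 8;
        }
        while carry > 0 {
            le.push((carry & 0xff) as u8);
            carry >>= 8;
        }
    }
    le.reverse();
    Some(le)
}

/// Encode as tag 2 followed by a definite-length byte string.
/// Sketch limitation: returns None if the magnitude needs 24 bytes or more.
pub fn encode_unsigned_bignum(s: &str) -> Option<Vec<u8>> {
    let bytes = decimal_to_be_bytes(s)?;
    if bytes.len() >= 24 {
        return None;
    }
    let mut out = vec![0xC2]; // major type 6, tag number 2
    out.push(0x40 | bytes.len() as u8); // major type 2, length in the low 5 bits
    out.extend_from_slice(&bytes);
    Some(out)
}
```

The decimal-to-binary step is the part a pure string wrapper cannot skip, which is why the thread weighs adding a bignum dependency against rejecting these shapes for CBOR.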

Collaborator

modify aws-smithy-cbor — you can find the source of it in this repo

Contributor Author

Need some direction:

The CBOR spec requires binary encoding (tags 2/3/4), but BigInteger/BigDecimal are string wrappers to avoid choosing a bignum library.

Should we:

  1. Keep current text string encoding (non-compliant but simple)
  2. Document that BigInteger/BigDecimal don't work with CBOR
  3. Add num-bigint dependency to aws-smithy-cbor for spec compliance

Collaborator

Adding a dependency to aws-smithy-cbor seems OK.

Option 1 is a non-starter — we can't have non-compliant code in smithy-rs.

I would prefer 3, but for simplicity, we could have 2 (but it must FAIL to codegen at runtime, it can't be only a documented feature).

Contributor Author

Going ahead with Option 2 - failing codegen at runtime

)
is BigDecimalShape ->
rustTemplate(
"<#{BigDecimal} as ::std::str::FromStr>::from_str(decoder.str()?.as_ref()).map_err(|_| #{Error}::custom(\"infallible\", decoder.position()))",
Collaborator

I don't think this error is infallible is it? wouldn't this happen if the string wasn't a valid big decimal? this error seems worth preserving?

Contributor Author

Should we validate the string is a valid number format?

Options:

  1. Keep infallible - Accept any string, let users validate when they parse it
  2. Add validation - Check the string is a valid number format, return error if not

What do you recommend?

Collaborator

ah I see...a bit of a can of worms. I forgot we were basically doing nothing with the numeric values. We can punt this for now.


rustTemplate(
"""
#{expect_number_or_null}(tokens.next())?
Collaborator

I think to actually do this properly you need to add some additional code to aws-smithy-json to parse a number as a string? not sure how hard that would be.

As it is, this isn't terrible, but it's not ideal since it defeats the point


@AmitKulkarni23 AmitKulkarni23 requested a review from rcoh December 1, 2025 22:47
@AmitKulkarni23 AmitKulkarni23 force-pushed the add-biginteger-bigdecimal-support branch from aacd195 to 0f0fecf Compare December 3, 2025 20:29
Comment on lines +419 to +426
let s = format!("{f}");
// f64 formatting drops ".0" for whole numbers (0.0 -> "0")
// Restore it to preserve that the original JSON had decimal notation
if !s.contains('.') && !s.contains('e') && !s.contains('E') {
format!("{s}.0")
} else {
s
}
Collaborator

is this correct? I definitely want to see some tests for this.
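
As a starting point for the requested tests, the sketch below wraps the snippet under review in a function and exercises a few inputs. Two observations, stated with some caution: Rust's `Display` for `f64` does not emit exponent notation, so the `e`/`E` checks look like a defensive no-op, and non-finite values format as `inf` or `NaN`, which would otherwise gain a stray `.0`. The function name and the finiteness guard are additions for this illustration, not part of the PR.

```rust
/// Format an f64 so that whole numbers keep decimal notation ("0" becomes "0.0").
/// Returns None for NaN and infinities (guard added for this sketch).
pub fn f64_to_decimal_string(f: f64) -> Option<String> {
    if !f.is_finite() {
        return None;
    }
    let s = format!("{f}");
    // Same condition as the reviewed snippet.
    if !s.contains('.') && !s.contains('e') && !s.contains('E') {
        Some(format!("{s}.0"))
    } else {
        Some(s)
    }
}
```

Note that large whole values expand to their full digit string (for example `1e21`), since `Display` never switches to scientific notation.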

Comment on lines +256 to +264
let xml = br##"<BigNumberData>
<bigInt>12345678901234567890</bigInt>
<bigDec>3.141592653589793238</bigDec>
</BigNumberData>
"##;
let output = ${format(operationParser)}(xml, test_output::BigNumberOpOutput::builder()).unwrap().build();
assert_eq!(output.big_int.as_ref().map(|v| v.as_ref()), Some("12345678901234567890"));
assert_eq!(output.big_dec.as_ref().map(|v| v.as_ref()), Some("3.141592653589793238"));
""",
Collaborator

nice test! please add something similar for JSON

Collaborator

bump on this test — we need a test that actually tests that we are preserving E2E precision with JSON.

Contributor Author

@AmitKulkarni23 AmitKulkarni23 Jan 7, 2026

Agreed. Adding a full end to end Kotlin test for XML protocol that actually serializes and deserializes big numbers in the next commit.

operations: [ProcessBigNumbers]
}

@http(uri: "/process", method: "POST")
Collaborator

there appears to be some code that handles E / scientific notation but I don't see any tests of that here

Contributor Author

Adding tests for these in the next commit.

landonxjames added a commit that referenced this pull request Dec 12, 2025
…bitrary precision for BigInteger/BigDecimal (#4444)

## Motivation and Context

Currently, `expect_number_or_null()` parses JSON numbers through this
flow:
1. JSON string `"9007199254740993"` → parsed to `u64` → stored in
`Number::PosInt(9007199254740993)`
2. Later converted to `f64` for certain operations → **precision lost**
(f64 has only 53 bits of precision)
3. Converted back to string → `"9007199254740992"` (wrong value!)

`expect_number_or_null()` converts JSON numbers to `u64`/`i64`/`f64`,
which causes precision loss for numbers larger than these types can
represent. This defeats the purpose of BigInteger/BigDecimal support
which are meant to handle arbitrarily large numbers without precision
loss.
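
The arithmetic in the flow above can be reproduced in a few lines. This is only an illustration of the `u64` to `f64` to `u64` round trip; the function name is hypothetical and it is not code from `aws-smithy-json`.

```rust
/// Round-trip an integer through f64, as the lossy flow described above does.
pub fn via_f64(n: u64) -> u64 {
    // f64 has a 53-bit significand, so odd integers above 2^53 cannot be represented.
    (n as f64) as u64
}
```

`9007199254740993` is 2^53 + 1, which rounds to 2^53 when converted to `f64`.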

This commit addresses comments #4418 

## Description
Adds `expect_number_as_string_or_null()` function to `aws-smithy-json`
that:
- Extracts JSON numbers as strings without intermediate numeric
conversion
- Uses the `offset` from `Token::ValueNumber` to extract the raw number
string from the original JSON input
- Preserves arbitrary precision for BigInteger and BigDecimal
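
A rough sketch of the offset-based idea, assuming the tokenizer reports the byte offset at which a number token starts. `raw_number_at` is a hypothetical helper written for this note; the signature and internals of the real `expect_number_as_string_or_null()` may differ.

```rust
/// Given the full JSON input and the byte offset where a number token starts,
/// return the raw text of that number without converting it to a numeric type.
pub fn raw_number_at(input: &[u8], offset: usize) -> Option<&str> {
    let rest = input.get(offset..)?;
    // Count the bytes that can appear in a JSON number.
    let len = rest
        .iter()
        .take_while(|b| matches!(**b, b'0'..=b'9' | b'-' | b'+' | b'.' | b'e' | b'E'))
        .count();
    if len == 0 {
        return None;
    }
    std::str::from_utf8(&rest[..len]).ok()
}
```

Because the slice is borrowed from the original input, no digits are dropped or reformatted along the way.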

## Testing
- Added comprehensive tests for various number formats (large integers,
decimals, scientific notation)
- Added error case tests (string, boolean, object, array tokens)
- All tests pass

## Checklist
- [x] For changes to the smithy-rs codegen or runtime crates, I have
created a changelog entry Markdown file in the `.changelog` directory,
specifying "client," "server," or both in the `applies_to` key.

---------

Co-authored-by: Amit Kulkarni <kulami@amazon.com>
Co-authored-by: Landon James <lnj@amazon.com>
Amit Kulkarni added 2 commits January 5, 2026 09:48
…pers in aws-smithy-types, allowing users to parse with their preferred big number library.
@AmitKulkarni23 AmitKulkarni23 force-pushed the add-biginteger-bigdecimal-support branch from 0f0fecf to 20e566b Compare January 5, 2026 18:29
Collaborator

@rcoh rcoh left a comment

Must fix:

  • Must validate inputs to BigInteger / BigDecimal because we are raw-writing them into JSON. We may also want to improve that API to safeguard that we are only writing "safe" characters?
  • Must add a Kotlin test that validates we successfully round trip large values through the serializers (since the protocol tests do not) Can the protocol tests use a number that is actually out of range?
  • Few other more minor inline comments

Thanks for your continued hard work on this!

bodyMediaType: "application/json",
headers: {"Content-Type": "application/json"},
params: {
bigInt: 9007199254740991,
Collaborator

this isn't larger than u64::max — I guess that's a protocol test limitation? This is 2^53-1 (max safe integer in Javascript), so we aren't really testing that big integers work (e.g. this code would pass even without your changes right?)

rustBlockTemplate(
"pub(crate) fn $fnName(value: &[u8], ${unusedMut}mut builder: #{Builder}) -> #{Result}<#{Builder}, #{Error}>",
"""
##[allow(unused)]
Collaborator

why allow(unused)? parsers should only be generated when they are actually used.

Contributor Author

When I was running one of the build commands, compilation was failing. Rust was complaining that some of the variables were unused. Therefore, I added this exception. Will try to reproduce this and add more details here.

rustTemplate(
"""
// Alias for nested parsers that expect `input` parameter name
let input = value;
Collaborator

can you just change your parser to match all the other ones and use value?

Contributor Author

@AmitKulkarni23 AmitKulkarni23 Jan 7, 2026

The nested parsers are called with input as the second parameter. Since the top-level function has value as its parameter, I created the alias let input = value; so we could pass input to the nested parsers. I will try to incorporate this review comment.

rustTemplate(
"""
#{expect_number_as_string_or_null}(tokens.next(), input)?
.map(|s| <#{BigInteger} as ::std::str::FromStr>::from_str(s).expect("infallible"))
Collaborator

We need to make this fallible — see other comments.

Contributor Author

@AmitKulkarni23 AmitKulkarni23 Jan 7, 2026

Agreed. Here is my plan for the next commit:

  1. Add validation function is_valid_number_string() that only allows valid JSON number characters:

    • Digits: 0-9
    • Signs: -, +
    • Decimal: .
    • Scientific notation: e, E
    • Rejects JSON special characters: quotes, commas, braces, brackets, etc.
  2. Create proper error type:

    #[derive(Debug, Clone, PartialEq, Eq)]
    #[non_exhaustive]
    pub enum BigNumberError {
        InvalidFormat(String),
    }

Comment on lines +27 to +28
// Infallible because any string is valid - we just store it without validation
type Err = std::convert::Infallible;
Collaborator

this is a semver hazard — it should return an enum marked with #[non_exhaustive] so an error could be added in the future

Contributor Author

Understood. Adding the below in the next revision/commit

#[derive(Debug, Clone, PartialEq, Eq)]
#[non_exhaustive]
pub enum BigNumberError {
    InvalidFormat(String),
}
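
To illustrate why `#[non_exhaustive]` addresses the semver concern, the sketch below fleshes out the proposed enum with `Display` and `Error` impls and shows the wildcard arm that downstream crates are required to write. The trait impls, the message text, and the `describe` helper are assumptions for this illustration.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq)]
#[non_exhaustive]
pub enum BigNumberError {
    InvalidFormat(String),
}

impl fmt::Display for BigNumberError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BigNumberError::InvalidFormat(s) => write!(f, "invalid number format: {s:?}"),
        }
    }
}

impl std::error::Error for BigNumberError {}

/// Hypothetical caller-side handling of the error.
pub fn describe(err: &BigNumberError) -> String {
    match err {
        BigNumberError::InvalidFormat(s) => format!("bad input {s:?}"),
        // Downstream crates must include this arm because of #[non_exhaustive],
        // so a variant added later does not break their build. Inside the
        // defining crate the arm is unreachable today, hence the allow.
        #[allow(unreachable_patterns)]
        _ => "unknown big-number error".to_string(),
    }
}
```

With `std::convert::Infallible` as the error type, adding validation later would change the public signature of `from_str`, which is the hazard raised in the review.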

Comment on lines +35 to +39
impl From<String> for BigInteger {
fn from(value: String) -> Self {
Self(value)
}
}
Collaborator

this impl is probably too hazardous to keep

Contributor Author

Here is my understanding of what this comment means:

  • From trait must always succeed (infallible)
  • With validation, invalid strings should return errors, not panic
  • Panicking in From is unexpected and dangerous
  • Users should use FromStr instead, which is properly fallible

^^ Assuming that this is right, I am going to remove From<String> implementations for BigInteger and BigDecimal. Users must now use FromStr::from_str() which properly returns Result<T, BigNumberError>.

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct BigInteger(String);

impl BigInteger {}
Collaborator

empty impl block does nothing

Suggested change
impl BigInteger {}

Contributor Author

Agreed. Removing this in next revision/commit

Comment on lines +429 to +435
"$writer.write_raw_value(${value.name}.as_ref());",
*codegenScope,
)
is BigDecimalShape ->
rustTemplate(
"$writer.write_raw_value(${value.name}.as_ref());",
*codegenScope,
Collaborator

hmm...there is actually a vulnerability here or at least the possibility to introduce invalid JSON — we are not validating the input to BigInteger and BigDecimal and then we're writing them untrusted directly into the JSON.

We need to validate that they are valid before storing them.
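
The risk can be shown with a toy serializer. `write_member_raw` and `serialize` below are hypothetical stand-ins that append a value verbatim to a JSON object; they are not the aws-smithy-json writer API.

```rust
/// Append `"key":<raw>` to the buffer, writing `raw` with no escaping or checks.
pub fn write_member_raw(out: &mut String, key: &str, raw: &str) {
    out.push('"');
    out.push_str(key);
    out.push_str("\":");
    out.push_str(raw);
}

/// Serialize a single-member object whose value is written verbatim.
pub fn serialize(big_int_raw: &str) -> String {
    let mut out = String::from("{");
    write_member_raw(&mut out, "bigInt", big_int_raw);
    out.push('}');
    out
}
```

If the wrapped string is `1,"admin":true`, the output gains an extra member that the caller never modelled, which is why validation has to happen before the value is stored.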

Contributor Author

Understood.
Introducing a simple validation function here:

fn is_valid_number_string(s: &str) -> bool {
    if s.is_empty() {
        return false;
    }
    
    s.chars().all(|c| matches!(c, '0'..='9' | '-' | '+' | '.' | 'e' | 'E'))
}

^^ These are all the valid characters that I could think of in any BigNumber

And using it as:

impl std::str::FromStr for BigInteger {
    type Err = BigNumberError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        if !is_valid_number_string(s) {
            return Err(BigNumberError::InvalidFormat(s.to_string()));
        }
        Ok(Self(s.to_string()))
    }
}

Collaborator

big integer is only numbers — the larger set is only for BigDecimal
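
One way to act on this is to split validation by type and check structure instead of only the character set, since a character-set check alone would accept strings like `1-2` or `..`. The sketch below follows the shape of the JSON number grammar; the function names and the exact rules (no leading `+`, no leading zeros) are assumptions for this illustration, not what was merged.

```rust
/// BigInteger: optional '-' followed by digits, with no leading zeros.
pub fn is_valid_big_integer(s: &str) -> bool {
    let digits = s.strip_prefix('-').unwrap_or(s);
    !digits.is_empty()
        && digits.bytes().all(|b| b.is_ascii_digit())
        && (digits.len() == 1 || !digits.starts_with('0'))
}

/// BigDecimal: -? int ('.' digits)? ([eE] [+-]? digits)?
pub fn is_valid_big_decimal(s: &str) -> bool {
    let b = s.as_bytes();
    let mut i = 0;
    if i < b.len() && b[i] == b'-' {
        i += 1;
    }
    // Integer part: at least one digit, no leading zeros.
    let int_start = i;
    while i < b.len() && b[i].is_ascii_digit() {
        i += 1;
    }
    if i == int_start || (i - int_start > 1 && b[int_start] == b'0') {
        return false;
    }
    // Optional fraction: '.' followed by at least one digit.
    if i < b.len() && b[i] == b'.' {
        i += 1;
        let frac_start = i;
        while i < b.len() && b[i].is_ascii_digit() {
            i += 1;
        }
        if i == frac_start {
            return false;
        }
    }
    // Optional exponent: e or E, optional sign, at least one digit.
    if i < b.len() && (b[i] == b'e' || b[i] == b'E') {
        i += 1;
        if i < b.len() && (b[i] == b'+' || b[i] == b'-') {
            i += 1;
        }
        let exp_start = i;
        while i < b.len() && b[i].is_ascii_digit() {
            i += 1;
        }
        if i == exp_start {
            return false;
        }
    }
    // Valid only if the whole input was consumed.
    i == b.len()
}
```

Each `FromStr` impl would then call its own validator and return `BigNumberError::InvalidFormat` on failure.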

rustBlockTemplate(
"""
pub(crate) fn $fnName<'a, I>(tokens: &mut #{Peekable}<I>) -> #{Result}<Option<#{ReturnType}>, #{Error}>
##[allow(unused)]
Collaborator

why allow(unused)?

…nerability; Make FromStr fallible with non-exhaustive error enum; Remove hazardous From<String> implementations; Use _value parameter consistently and remove unnecessary #[allow(unused)] attributes; Add integration tests for E2E precision preservation; Implement NaN saturation for values > f64::MAX
@AmitKulkarni23 AmitKulkarni23 requested a review from rcoh January 8, 2026 16:40
Collaborator

@rcoh rcoh left a comment

I think we're really close here!

is BigDecimalShape -> {
val value = data.toString()
rustTemplate(
"<#{BigDecimal} as ::std::str::FromStr>::from_str(${value.dq()}).unwrap()",
Collaborator

Suggested change
"<#{BigDecimal} as ::std::str::FromStr>::from_str(${value.dq()}).unwrap()",
"<#{BigDecimal} as ::std::str::FromStr>::from_str(${value.dq()}).expect(\"invalid string for BigDecimal\")",

is BigIntegerShape -> {
val value = data.toString()
rustTemplate(
"<#{BigInteger} as ::std::str::FromStr>::from_str(${value.dq()}).unwrap()",
Collaborator

Suggested change
"<#{BigInteger} as ::std::str::FromStr>::from_str(${value.dq()}).unwrap()",
"<#{BigInteger} as ::std::str::FromStr>::from_str(${value.dq()}).expect(\"Invalid string for big integer\")",
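For context on why `expect` fits here, this is a rough sketch of a fallible `FromStr` on a string wrapper with a non-exhaustive error enum. `BigIntegerSketch` and `BigNumberParseError` are hypothetical names, and the validation rule is an assumption rather than the one merged in this PR:

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical string-backed big integer, for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BigIntegerSketch(String);

/// Hypothetical error type; `non_exhaustive` leaves room for new variants.
#[derive(Debug, Clone, PartialEq, Eq)]
#[non_exhaustive]
pub enum BigNumberParseError {
    Empty,
    InvalidCharacter(char),
}

impl fmt::Display for BigNumberParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Empty => write!(f, "no digits found"),
            Self::InvalidCharacter(c) => write!(f, "invalid character {c:?}"),
        }
    }
}

impl std::error::Error for BigNumberParseError {}

impl FromStr for BigIntegerSketch {
    type Err = BigNumberParseError;

    /// Accepts an optional leading '-' followed by one or more ASCII digits.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let digits = s.strip_prefix('-').unwrap_or(s);
        if digits.is_empty() {
            return Err(BigNumberParseError::Empty);
        }
        match digits.chars().find(|c| !c.is_ascii_digit()) {
            Some(bad) => Err(BigNumberParseError::InvalidCharacter(bad)),
            None => Ok(Self(s.to_owned())),
        }
    }
}

impl AsRef<str> for BigIntegerSketch {
    fn as_ref(&self) -> &str {
        &self.0
    }
}
```

With a fallible `from_str`, generated test code that calls `.expect("...")` panics with a message naming the shape type, which is easier to trace than a bare `unwrap()` failure.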

// (binary bignum representation), but aws-smithy-cbor doesn't implement these tags yet.
is BigIntegerShape ->
throw CodegenException(
"BigInteger is not supported with Concise Binary Object Representation (CBOR) protocol",
Collaborator

is there an open ticket for this? If not, please open one and then link to the ticket in the error

Contributor Author

#4473 - Linking this in the error as well
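As background on the missing piece, RFC 8949 represents an unsigned bignum as tag 2 wrapping a byte string that holds the big-endian magnitude. The function below is only a sketch of that encoding for short magnitudes; `encode_unsigned_bignum` is a hypothetical helper and not part of aws-smithy-cbor:

```rust
/// Encodes a big-endian magnitude as a CBOR unsigned bignum (tag 2).
/// Sketch only: handles magnitudes shorter than 24 bytes, where the byte
/// string length fits directly in the initial byte.
pub fn encode_unsigned_bignum(magnitude_be: &[u8]) -> Option<Vec<u8>> {
    // Drop leading zero bytes so the magnitude uses its shortest form.
    let first_nonzero = magnitude_be
        .iter()
        .position(|&b| b != 0)
        .unwrap_or(magnitude_be.len());
    let magnitude = &magnitude_be[first_nonzero..];

    if magnitude.len() >= 24 {
        return None; // longer lengths need a multi-byte length header
    }

    let mut out = Vec::with_capacity(2 + magnitude.len());
    out.push(0xC2); // major type 6 (tag), tag number 2
    out.push(0x40 | magnitude.len() as u8); // major type 2 (byte string) + length
    out.extend_from_slice(magnitude);
    Some(out)
}
```

Negative values use tag 3 with the magnitude of `-1 - n`, which this sketch does not cover.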

rustBlockTemplate(
"pub(crate) fn $fnName(value: &[u8], ${unusedMut}mut builder: #{Builder}) -> #{Result}<#{Builder}, #{Error}>",
"""
pub(crate) fn $fnName(_value: &[u8], ${unusedMut}mut builder: #{Builder}) -> #{Result}<#{Builder}, #{Error}>
Collaborator

I'm really confused — this was value before...why did you need to make it _value?

Collaborator

ah—I see the issue. Nested functions need access to it now that previously didn't have access to it at all.

Comment on lines +429 to +435
"$writer.write_raw_value(${value.name}.as_ref());",
*codegenScope,
)
is BigDecimalShape ->
rustTemplate(
"$writer.write_raw_value(${value.name}.as_ref());",
*codegenScope,
Collaborator

big integer is only numbers — the larger set is only for BigDecimal

.and_then(|f| {
must_be_finite(f).map_err(|_| self.error_at(start, InvalidNumber))
})?,
.map(|f| if f.is_finite() { f } else { f64::NAN })?,
Collaborator

you need to change expect_number to move the check for finite-ness there I think?


/// Validates that a string contains only valid JSON number characters.
/// Prevents JSON injection by rejecting strings with quotes, commas, braces, etc.
fn is_valid_number_string(s: &str) -> bool {
Collaborator

integers have just 0..9 — probably want a tighter restriction there

Contributor Author

Agreed. Miss on my part. Changing this in next revision/commit.
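One way the tighter restriction could look, as a sketch with hypothetical function names (the merged code may differ). The integer check allows only an optional minus sign and digits, while the decimal check allows the wider character set needed for fractions and exponents:

```rust
/// Integer text: optional leading '-', then one or more ASCII digits.
pub fn is_valid_big_integer_text(s: &str) -> bool {
    let digits = s.strip_prefix('-').unwrap_or(s);
    !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit())
}

/// Decimal text: a character-set filter only, not a full grammar check.
/// It rejects quotes, commas, braces and similar characters that could be
/// used for JSON injection when the value is written raw.
pub fn is_valid_big_decimal_text(s: &str) -> bool {
    !s.is_empty()
        && s.bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'-' | b'+' | b'.' | b'e' | b'E'))
}
```

Because the decimal variant only filters characters, a string such as `1e` would pass it; a grammar-level check would be needed to reject that.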

…idation; Move finite check to token consumer; Use expect() in Instantiator
@AmitKulkarni23 AmitKulkarni23 requested a review from rcoh January 8, 2026 19:42

// Check first character (can be sign or digit)
match chars.next() {
Some('-') | Some('+') | Some('0'..='9') => {}
Collaborator

can you really precede a number with +?
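For reference, the JSON number grammar in RFC 8259 allows a leading `-` but not a leading `+`; a plus sign may only appear inside the exponent. Rust's own `f64` parser is more lenient and does accept `+1`. A sketch of a grammar-level check follows, where `is_json_number` is a hypothetical helper and not code from this PR:

```rust
/// Checks a string against the RFC 8259 grammar:
/// number = [ "-" ] int [ frac ] [ exp ]
pub fn is_json_number(s: &str) -> bool {
    let b = s.as_bytes();
    let mut i = 0;

    // Optional leading minus; a leading plus is not allowed.
    if b.get(i).copied() == Some(b'-') {
        i += 1;
    }

    // int: either a single "0" or a nonzero digit followed by more digits.
    match b.get(i).copied() {
        Some(b'0') => i += 1,
        Some(b'1'..=b'9') => {
            while matches!(b.get(i).copied(), Some(b'0'..=b'9')) {
                i += 1;
            }
        }
        _ => return false,
    }

    // frac: "." followed by at least one digit.
    if b.get(i).copied() == Some(b'.') {
        i += 1;
        let start = i;
        while matches!(b.get(i).copied(), Some(b'0'..=b'9')) {
            i += 1;
        }
        if i == start {
            return false;
        }
    }

    // exp: "e" or "E", optional sign, then at least one digit.
    if matches!(b.get(i).copied(), Some(b'e' | b'E')) {
        i += 1;
        if matches!(b.get(i).copied(), Some(b'+' | b'-')) {
            i += 1;
        }
        let start = i;
        while matches!(b.get(i).copied(), Some(b'0'..=b'9')) {
            i += 1;
        }
        if i == start {
            return false;
        }
    }

    // Valid only if the whole input was consumed.
    i == b.len()
}
```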

// Values exceeding f64::MAX should tokenize successfully with NaN
// to support BigInteger/BigDecimal arbitrary precision types
let expect_nan = |input| {
fn out_of_range_floats_produce_infinity() {
Collaborator

did this change to infinity? I thought we chose nan?

Collaborator

ah, I see. the behavior of the parser is Infinity, that's fine.
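The Infinity behavior can be reproduced with the Rust standard library alone, which returns an infinite value rather than an error when a decimal string overflows `f64`. The sketch below uses `str::parse` directly, not the aws-smithy-json tokenizer, and `saturate_non_finite_to_nan` is a hypothetical helper mirroring the `is_finite` check quoted in the diff above:

```rust
/// Parses with the standard library; overflow yields +/- infinity, not an error.
pub fn parse_f64(text: &str) -> Option<f64> {
    text.parse::<f64>().ok()
}

/// Maps any non-finite value to NaN and leaves finite values untouched.
pub fn saturate_non_finite_to_nan(f: f64) -> f64 {
    if f.is_finite() {
        f
    } else {
        f64::NAN
    }
}
```

Either way, the `f64` in the token is lossy for out-of-range input, so arbitrary-precision shapes have to rely on the original number text rather than on this value.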

Contributor

Is +/- infinity (e.g. infinity or -infinity) a valid value for Smithy bigInteger or bigDecimal shapes?

@rcoh rcoh force-pushed the add-biginteger-bigdecimal-support branch from 3b26323 to df1dd7e Compare January 29, 2026 01:32
@landonxjames landonxjames merged commit 959b1a4 into smithy-lang:main Jan 29, 2026
45 of 46 checks passed
Development

Successfully merging this pull request may close these issues.

Add Support for BigInteger / BigDecimal

5 participants