diff --git a/docs/standard/serialization/binaryformatter-migration-guide/read-nrbf-payloads.md b/docs/standard/serialization/binaryformatter-migration-guide/read-nrbf-payloads.md index 581a219d4682a..f21049a6d1589 100644 --- a/docs/standard/serialization/binaryformatter-migration-guide/read-nrbf-payloads.md +++ b/docs/standard/serialization/binaryformatter-migration-guide/read-nrbf-payloads.md @@ -18,6 +18,9 @@ helpviewer_keywords: As part of .NET 9, a new [NrbfDecoder] class was introduced to decode NRBF payloads without performing _deserialization_ of the payload. This API can safely be used to decode trusted or untrusted payloads without any of the risks that [BinaryFormatter] deserialization carries. However, [NrbfDecoder] merely decodes the data into structures an application can further process. Care must be taken when using [NrbfDecoder] to safely load the data into the appropriate instances. +> [!CAUTION] +> [NrbfDecoder] is _an_ implementation of an NRBF reader, but its behaviors don't strictly follow [BinaryFormatter]'s implementation. Thus you shouldn't use the output of [NrbfDecoder] to determine whether a call to [BinaryFormatter] would be safe. + You can think of as being the equivalent of using a JSON/XML reader without the deserializer. ## NrbfDecoder @@ -33,7 +36,8 @@ You can think of as being the equivalent - Use collision-resistant randomized hashing to store records referenced by other records (to avoid running out of memory for dictionary backed by an array whose size depends on the number of hash-code collisions). - Only primitive types can be instantiated in an implicit way. Arrays can be instantiated on demand. Other types are never instantiated. -When using [NrbfDecoder], it is important not to reintroduce those capabilities in general-purpose code as doing so would negate these safeguards. +> [!CAUTION] +> When using [NrbfDecoder], it's important not to reintroduce those capabilities in general-purpose code, as doing so would negate these safeguards. ### Deserialize a closed set of types @@ -82,17 +86,38 @@ internal static T LoadFromFile(string path) The NRBF payload consists of serialization records that represent the serialized objects and their metadata. To read the whole payload and get the root object, you need to call the method. -The method returns a instance. is an abstract class that represents the serialization record and provides three self-describing properties: , , and . It exposes one method, , which compares the type name read from the payload (and exposed via property) against the specified type. This method ignores assembly names, so users don't need to worry about type forwarding and assembly versioning. It also does not consider member names or their types (because getting this information would require type loading). +The method returns a instance. is an abstract class that represents the serialization record and provides three self-describing properties: , , and . + +> [!NOTE] +> An attacker could create a payload with cycles (example: class or an array of objects with a reference to itself). The returns an instance of which implements and amongst other things, it can be used to detect cycles in decoded records. + + exposes one method, , which compares the type name read from the payload (and exposed via property) against the specified type. This method ignores assembly names, so users don't need to worry about type forwarding and assembly versioning. It also does not consider member names or their types (because getting this information would require type loading). ```csharp using System.Formats.Nrbf; -static T Pseudocode(Stream payload) +static Animal Pseudocode(Stream payload) { SerializationRecord record = NrbfDecoder.Read(payload); - if (!record.TypeNameMatches(typeof(T)) + if (record.TypeNameMatches(typeof(Cat)) && record is ClassRecord catRecord) + { + return new Cat() + { + Name = catRecord.GetString("Name"), + WorshippersCount = catRecord.GetInt32("WorshippersCount") + }; + } + else if (record.TypeNameMatches(typeof(Dog)) && record is ClassRecord dogRecord) { - throw new Exception($"Expected the record to match type name `{typeof(T).AssemblyQualifiedName}`, but got `{record.TypeName.AssemblyQualifiedName}`." + return new Dog() + { + Name = dogRecord.GetString("Name"), + FriendsCount = dogRecord.GetInt32("FriendsCount") + }; + } + else + { + throw new Exception($"Unexpected record: `{record.TypeName.AssemblyQualifiedName}`."); } } ``` @@ -104,7 +129,7 @@ There are more than a dozen different serialization [record types](/openspecs/wi - derives from the non-generic , which also exposes a property. But on the base class, the value is returned as `object` (which introduces boxing for value types). - : describes all `class` and `struct` besides the aforementioned primitive types. - : describes all array records, including jagged and multi-dimensional arrays. -- : describes single-dimensional, zero-indexed array records, where `T` can be either a primitive type or a . +- : describes single-dimensional, zero-indexed array records, where `T` can be either a primitive type or a . ```csharp SerializationRecord rootObject = NrbfDecoder.Decode(payload); // payload is a Stream @@ -134,7 +159,8 @@ The API it provides: - property that gets the names of serialized members. - method that checks if member of given name was present in the payload. It was designed for handling versioning scenarios where given member could have been renamed. - A set of dedicated methods for retrieving primitive values of the provided member name: , , , , , , , , , , , , , , , and . -- and methods to retrieve instance of given record types. +- retrieves an instance of [ClassRecord]. In case of a cycle, it's the same instance of the current [ClassRecord] with the same . +- retrieves an instance of [ArrayRecord]. - to retrieve any serialization record and to retrieve any serialization record or a raw primitive value. The following code snippet shows in action: @@ -157,22 +183,30 @@ Sample output = new() Text = rootRecord.GetString(nameof(Sample.Text)), // using dedicated method to read an array of bytes ArrayOfBytes = ((SZArrayRecord)rootRecord.GetArrayRecord(nameof(Sample.ArrayOfBytes))).GetArray(), - // using GetClassRecord to read a class record - ClassInstance = new() +}; + +// using GetClassRecord to read a class record +ClassRecord? referenced = rootRecord.GetClassRecord(nameof(Sample.ClassInstance)); +if (referenced is not null) +{ + if (referenced.Id.Equals(rootRecord.Id)) { - Text = rootRecord - .GetClassRecord(nameof(Sample.ClassInstance))! - .GetString(nameof(Sample.Text)) + throw new Exception("Unexpected cycle detected!"); } -}; + + output.ClassInstance = new() + { + Text = referenced.GetString(nameof(Sample.Text)) + }; +} ``` #### ArrayRecord defines the core behavior for NRBF array records and provides a base for derived classes. It provides two properties: -- which gets the rank of the array. -- which get a buffer of integers that represent the number of elements in every dimension. +- , which gets the rank of the array. +- , which gets a buffer of integers that represent the number of elements in every dimension. It's recommended to **check the total length of the provided array record** before calling . It also provides one method: . When used for the first time, it allocates an array and fills it with the data provided in the serialized records (in case of the natively supported primitive types like `string` or `int`) or the serialized records themselves (in case of arrays of complex types). @@ -180,11 +214,18 @@ It also provides one method: . W ```csharp ArrayRecord arrayRecord = (ArrayRecord)NrbfDecoder.Decode(stream); +if (arrayRecord.Rank != 2 || arrayRecord.Lengths[0] * arrayRecord.Lengths[1] > 10_000) +{ + throw new Exception("The array had unexpected rank or length!"); +} int[,] array2d = (int[,])arrayRecord.GetArray(typeof(int[,])); ``` If there is a type mismatch (example: the attacker has provided a payload with an array of two billion strings), the method throws . +> [!CAUTION] +> Unfortunately, the NRBF format makes it easy for an attacker to compress a large number of null array items. That's why it's recommended to always check the total length of the array before calling . Moreover, accepts an optional `allowNulls` Boolean argument, which, when set to `false`, will throw for nulls. + [NrbfDecoder] does not load or instantiate any custom types, so in case of arrays of complex types, it returns an array of . ```csharp @@ -195,14 +236,18 @@ public class ComplexType3D } ArrayRecord arrayRecord = (ArrayRecord)NrbfDecoder.Decode(payload); -SerializationRecord[] records = (SerializationRecord[])arrayRecord.GetArray(expectedArrayType: typeof(ComplexType3D[])); +if (arrayRecord.Rank != 1 || arrayRecord.Lengths[0] > 10_000) +{ + throw new Exception("The array had unexpected rank or length!"); +} + +SerializationRecord[] records = (SerializationRecord[])arrayRecord.GetArray(expectedArrayType: typeof(ComplexType3D[]), allowNulls: false); ComplexType3D[] output = records.OfType().Select(classRecord => new ComplexType3D() { I = classRecord.GetInt32(nameof(ComplexType3D.I)), J = classRecord.GetInt32(nameof(ComplexType3D.J)), K = classRecord.GetInt32(nameof(ComplexType3D.K)), }).ToArray(); - ``` .NET Framework supported non-zero indexed arrays within NRBF payloads, but this support was never ported to .NET (Core). [NrbfDecoder] therefore does not support decoding non-zero indexed arrays.