Skip to content

[BUG] Async Deserialization of SearchDocument Fails #40768

Open
@jeburge-msft

Description

@jeburge-msft

Library name and version

Azure.Search.Documents 11.5.1

Describe the bug

I am experiencing an issue causing SearchClient.SearchAsync<T>() to fail when T is SearchDocument.

I am consistently receiving the following exception call stack when making certain requests to the SearchClient:

System.Text.Json.JsonException: The JSON value could not be converted to Azure.Search.Documents.Models.SearchDocument. Path: $ | LineNumber: 0 | BytePositionInLine: 20.
 ---> System.InvalidOperationException: Cannot skip tokens on partial JSON. Either get the whole payload and create a Utf8JsonReader instance where isFinalBlock is true or call TrySkip.
   at System.Text.Json.Utf8JsonReader.Skip()
   at Azure.Search.Documents.JsonSerialization.ReadSearchDocument(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, Nullable`1 recursionDepth)
   at Azure.Search.Documents.Models.SearchDocumentConverter.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options)
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
   at System.Text.Json.Serialization.JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   --- End of inner exception stack trace ---
   at System.Text.Json.ThrowHelper.ReThrowWithPath(ReadStack& state, Utf8JsonReader& reader, Exception ex)
   at System.Text.Json.Serialization.JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadCore[TValue](JsonConverter jsonConverter, Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadCore[TValue](JsonReaderState& readerState, Boolean isFinalBlock, ReadOnlySpan`1 buffer, JsonSerializerOptions options, ReadStack& state, JsonConverter converterBase)
   at System.Text.Json.JsonSerializer.ContinueDeserialize[TValue](ReadBufferState& bufferState, JsonReaderState& jsonReaderState, ReadStack& readStack, JsonConverter converter, JsonSerializerOptions options)
   at System.Text.Json.JsonSerializer.ReadAllAsync[TValue](Stream utf8Json, JsonTypeInfo jsonTypeInfo, CancellationToken cancellationToken)
   at Azure.Search.Documents.Models.SearchResult`1.DeserializeAsync(JsonElement element, ObjectSerializer serializer, JsonSerializerOptions options, Boolean async, CancellationToken cancellationToken)
   at Azure.Search.Documents.Models.SearchResults`1.DeserializeAsync(Stream json, ObjectSerializer serializer, Boolean async, CancellationToken cancellationToken)
   at Azure.Search.Documents.SearchClient.SearchInternal[T](SearchOptions options, String operationName, Boolean async, CancellationToken cancellationToken)
   at Azure.Search.Documents.SearchClient.SearchInternal[T](String searchText, SearchOptions options, Boolean async, CancellationToken cancellationToken)
   at Azure.Search.Documents.SearchClient.SearchAsync[T](String searchText, SearchOptions options, CancellationToken cancellationToken)

I have determined the issue is in the custom JsonConverter SearchDocumentConverter that is added to the default JsonSerializerOptions in Azure.Search.Documents.JsonSerialization.

Specifically in ReadSearchDocument, there is the following code attempting to "skip" OData properties present in the Json being read (

):

// Ignore OData properties - we don't expose those on custom
// user schemas
if (!propertyName.StartsWith(Constants.ODataKeyPrefix, StringComparison.OrdinalIgnoreCase))
{
 object value = ReadSearchDocObject(ref reader, recursionDepth.Value - 1);
 doc[propertyName] = value;
}
else
{
 reader.Skip();
}

I believe that reader.Skip() is not the correct behavior to use in an async context, and instead we should use the following logic:

if (!reader.TrySkip())
    throw new SomeJsonException("TrySkip failed on OData property.");

Although these implementations seem somewhat equivalent, in reality the behavior is quite different as discussed in: https://stackoverflow.com/questions/63038334/how-do-i-handle-partial-json-in-a-jsonconverter-while-using-deserializeasync-on.

Looking into the System.Text.Json.Utf8JsonReader code we see the following (https://github.com/dotnet/runtime/blob/486142a4b87ed6e3134bd1a8726156fb3f55157a/src/libraries/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.cs):

        //
        // Summary:
        //     Skips the children of the current JSON token.
        //
        // Exceptions:
        //   T:System.InvalidOperationException:
        //     The reader was given partial data with more data to follow (that is, System.Text.Json.Utf8JsonReader.IsFinalBlock
        //     is false).
        //
        //   T:System.Text.Json.JsonException:
        //     An invalid JSON token was encountered while skipping, according to the JSON RFC.
        //     -or- The current depth exceeds the recursive limit set by the maximum depth.
        public void Skip()
        {
            if (!_isFinalBlock)
            {
                throw ThrowHelper.GetInvalidOperationException_CannotSkipOnPartial();
            }

            SkipHelper();
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void SkipHelper()
        {
            if (TokenType == JsonTokenType.PropertyName)
            {
                bool flag = Read();
            }

            if (TokenType == JsonTokenType.StartObject || TokenType == JsonTokenType.StartArray)
            {
                int currentDepth = CurrentDepth;
                do
                {
                    bool flag2 = Read();
                }
                while (currentDepth < CurrentDepth);
            }
        }
        //
        // Summary:
        //     Tries to skip the children of the current JSON token.
        //
        // Returns:
        //     true if there was enough data for the children to be skipped successfully; otherwise,
        //     false.
        //
        // Exceptions:
        //   T:System.Text.Json.JsonException:
        //     An invalid JSON token was encountered while skipping, according to the JSON RFC.
        //     -or - The current depth exceeds the recursive limit set by the maximum depth.
        public bool TrySkip()
        {
            if (_isFinalBlock)
            {
                SkipHelper();
                return true;
            }

            return TrySkipHelper();
        }

        private bool TrySkipHelper()
        {
            Utf8JsonReader utf8JsonReader = this;
            if (TokenType != JsonTokenType.PropertyName || Read())
            {
                if (TokenType != JsonTokenType.StartObject && TokenType != JsonTokenType.StartArray)
                {
                    goto IL_0042;
                }

                int currentDepth = CurrentDepth;
                while (Read())
                {
                    if (currentDepth < CurrentDepth)
                    {
                        continue;
                    }

                    goto IL_0042;
                }
            }

            this = utf8JsonReader;
            return false;
IL_0042:
            return true;
        }

Interestingly, the Skip() method does not even attempt the "skip" action if we are not at the final block. In an async context, it is likely that we have not finalized the block at the time we are reading and skipping some OData property in the returned SearchDocument JSON. There should be plenty of times that the OData property is "skippable" even when the block is not finalized. There could be some discussion on whether the bug truly lies in the Azure.Search.Documents code or the System.Text.Json code, but the interaction between the two at the moment certainly seems incorrect.

Expected behavior

Correctly deserialize and return the SearchDocuments when making a call to searchClient.SearchAsync<SearchDocument>(options, cancellationToken) (or in fact any async code that invokes ReadSearchDocument on the returned HTTP response from the search service that contains OData properties like @search.score).

Actual behavior

Receive the following exception:

System.Text.Json.JsonException: The JSON value could not be converted to Azure.Search.Documents.Models.SearchDocument. Path: $ | LineNumber: 0 | BytePositionInLine: 20.
 ---> System.InvalidOperationException: Cannot skip tokens on partial JSON. Either get the whole payload and create a Utf8JsonReader instance where isFinalBlock is true or call TrySkip.
   at System.Text.Json.Utf8JsonReader.Skip()
   at Azure.Search.Documents.JsonSerialization.ReadSearchDocument(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, Nullable`1 recursionDepth)
   at Azure.Search.Documents.Models.SearchDocumentConverter.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options)
   at System.Text.Json.Serialization.JsonConverter`1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
   at System.Text.Json.Serialization.JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   --- End of inner exception stack trace ---
   at System.Text.Json.ThrowHelper.ReThrowWithPath(ReadStack& state, Utf8JsonReader& reader, Exception ex)
   at System.Text.Json.Serialization.JsonConverter`1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadCore[TValue](JsonConverter jsonConverter, Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
   at System.Text.Json.JsonSerializer.ReadCore[TValue](JsonReaderState& readerState, Boolean isFinalBlock, ReadOnlySpan`1 buffer, JsonSerializerOptions options, ReadStack& state, JsonConverter converterBase)
   at System.Text.Json.JsonSerializer.ContinueDeserialize[TValue](ReadBufferState& bufferState, JsonReaderState& jsonReaderState, ReadStack& readStack, JsonConverter converter, JsonSerializerOptions options)
   at System.Text.Json.JsonSerializer.ReadAllAsync[TValue](Stream utf8Json, JsonTypeInfo jsonTypeInfo, CancellationToken cancellationToken)
   at Azure.Search.Documents.Models.SearchResult`1.DeserializeAsync(JsonElement element, ObjectSerializer serializer, JsonSerializerOptions options, Boolean async, CancellationToken cancellationToken)
   at Azure.Search.Documents.Models.SearchResults`1.DeserializeAsync(Stream json, ObjectSerializer serializer, Boolean async, CancellationToken cancellationToken)
   at Azure.Search.Documents.SearchClient.SearchInternal[T](SearchOptions options, String operationName, Boolean async, CancellationToken cancellationToken)
   at Azure.Search.Documents.SearchClient.SearchInternal[T](String searchText, SearchOptions options, Boolean async, CancellationToken cancellationToken)
   at Azure.Search.Documents.SearchClient.SearchAsync[T](String searchText, SearchOptions options, CancellationToken cancellationToken)

Reproduction Steps

Invoke searchClient.SearchAsync<SearchDocument>(options, cancellationToken) on a search index that contains data with non-trivial document length. This exception will be encountered when the racing of block finalization vs the skipping of OData properties results in a non-finalized block when reader.Skip() is invoked. Note that because this is a racing issue that it does not repro reliably.

Environment

No response

Metadata

Metadata

Assignees

Labels

ClientThis issue points to a problem in the data-plane of the library.Searchcustomer-reportedIssues that are reported by GitHub users external to the Azure organization.needs-team-attentionWorkflow: This issue needs attention from Azure service team or SDK teamquestionThe issue doesn't require a change to the product in order to be resolved. Most issues start as that

Type

No type

Projects

Status

Todo

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions