[BUG]: Error unpickling decimal array when array contains at least one '0' value #1026

Open
@akamor

Describe the bug
The Parquet file has a column of type array<decimal(38,18)>. Reading a row whose array contains at least one element equal to 0, such as array(0,1,2,3), throws an exception. Arrays with no 0 element are read without error.

To Reproduce

Create a Parquet file as described above, then read it into a DataFrame.

From the C# side, do something like:

var df = spark.Read().Parquet("path to file");
var rows = df.ToLocalIterator();

Once you try to enumerate rows you'll get the following exception:

Unhandled exception. System.FormatException: Input string was not in a correct format.
   at System.Number.ThrowOverflowOrFormatException(ParsingStatus status, TypeCode type)
   at System.Convert.ToDecimal(String value, IFormatProvider provider)
   at Razorvine.Pickle.Objects.DecimalConstructor.construct(Object[] args)
   at Razorvine.Pickle.UnpicklerImplementation`1.load_reduce()
   at Razorvine.Pickle.UnpicklerImplementation`1.Dispatch(Byte key)
   at Razorvine.Pickle.UnpicklerImplementation`1.Load()
   at Razorvine.Pickle.Unpickler.loads(Byte[] pickledata, Int32 stackCapacity)
   at Razorvine.Pickle.Unpickler.loads(ReadOnlyMemory`1 pickledata, Int32 stackCapacity)
   at Microsoft.Spark.Utils.PythonSerDe.GetUnpickledObjects(Stream stream, Int32 messageLength) in /mnt/c/Users/adam/Repos/spark/src/csharp/Microsoft.Spark/Utils/PythonSerDe.cs:line 65
   at Microsoft.Spark.Sql.RowCollector.Collect(ISocketWrapper socket)+MoveNext() in /mnt/c/Users/adam/Repos/spark/src/csharp/Microsoft.Spark/Sql/RowCollector.cs:line 33
   at Microsoft.Spark.Sql.RowCollector.LocalIteratorFromSocket.GetEnumerator()+MoveNext() in /mnt/c/Users/adam/Repos/spark/src/csharp/Microsoft.Spark/Sql/RowCollector.cs:line 117
   at Microsoft.Spark.Sql.DataFrame.ToLocalIterator(Boolean prefetchPartitions)+MoveNext() in /mnt/c/Users/adam/Repos/spark/src/csharp/Microsoft.Spark/Sql/DataFrame.cs:line 843

This is currently happening with v1.0.0.
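
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the JVM may send a zero with scale 18 as the string "0E-18" (that is what java.math.BigDecimal.toString() produces for such a value), while non-zero values such as 1 arrive in plain notation. The stack trace ends in Convert.ToDecimal(String, IFormatProvider), which does not accept exponent notation. The sketch below only illustrates that parsing difference. DecimalParseSketch, DefaultParseFails and ParseLenient are hypothetical names and are not part of Razorvine.Pickle or Microsoft.Spark.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class for illustration only.
public static class DecimalParseSketch
{
    // Returns true when Convert.ToDecimal rejects the text with a
    // FormatException, mirroring the call at the top of the stack trace.
    public static bool DefaultParseFails(string text)
    {
        try
        {
            Convert.ToDecimal(text, CultureInfo.InvariantCulture);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }

    // Parses with NumberStyles.Float, which includes AllowExponent,
    // so strings such as "0E-18" are accepted.
    public static decimal ParseLenient(string text)
    {
        return decimal.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

Under that assumption, DefaultParseFails("0E-18") reports a failure while DefaultParseFails("1.000000000000000000") does not, which would match the observation that only arrays containing a 0 trigger the exception. ParseLenient accepts both forms.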
