This repository was archived by the owner on Mar 29, 2022. It is now read-only.

spark cbind frame bug #128

@corepointer


I looked into the bug 🐛 that @Shafaq-Siddiqi mentioned. I couldn't find where to fix it, but here is what I found, in case it helps whoever picks this up:

I reduced the DML that triggers the bug to:

F = read($X, data_type="frame", format="csv")
A = cbind(F, as.frame(matrix(1, nrow(F), 1)))
print(toString(A))

To reproduce the error, uncomment the Spark test invocation at BuiltinMiceTest.java:49 and replace the content of src/test/scripts/functions/builtin/mice.dml with the three lines of DML above.

When the test runs, the frame block dimension check at FrameBlock.java:1002 fails with org.tugraz.sysds.runtime.DMLRuntimeException: Incompatible number of rows for cbind: 98 (expected: 49)

So the frame block is split, but the column to append is not, and that produces the dimension mismatch. This is as far as I got: I did not find where the split happens. I tried specifying the dimensions explicitly, both in the read() call (which gave other errors that I'll investigate another time) and in an MTD file, but that did not help :-/ Furthermore, the problem seems to occur only with "real" frame data, not with matrices converted to frames with as.frame().
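
To make the suspected mismatch concrete, here is a toy model in plain Python. It is not SystemDS code: `split_rows` and `cbind_block` are hypothetical helpers, the 98-row input and the 49-row block size are assumptions read off the error message, and which side was actually split is only my guess from the observations above.

```python
# Toy model only: plain Python lists stand in for frame blocks.
# None of these names are SystemDS APIs.

def split_rows(rows, block_size):
    """Partition a list of rows into consecutive blocks of at most block_size rows."""
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]

def cbind_block(left, right):
    """Append the columns of right to left, requiring equal row counts."""
    if len(left) != len(right):
        raise ValueError(
            f"Incompatible number of rows for cbind: {len(right)} (expected: {len(left)})"
        )
    return [l + r for l, r in zip(left, right)]

frame = [[f"r{i}", f"v{i}"] for i in range(98)]  # assumed: 98-row frame read from CSV
ones = [[1] for _ in range(98)]                  # column to append, also 98 rows

frame_blocks = split_rows(frame, 49)             # assumed: left side split into 49-row blocks

# Mismatch: a 49-row block meets the unsplit 98-row column.
try:
    cbind_block(frame_blocks[0], ones)
    error = None
except ValueError as e:
    error = str(e)

# Consistent case: both sides are split the same way before appending.
ones_blocks = split_rows(ones, 49)
result = [cbind_block(f, o) for f, o in zip(frame_blocks, ones_blocks)]
```

In this model the append only succeeds when both inputs are partitioned identically, so the real fix is presumably to find where the frame input is split and make the appended column follow the same split.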

Originally posted by @corepointer in #116 (comment)
