Description
The scitype of a tuple is intended to be the Tuple
of the element scitypes. For example:
julia> scitype((1.0, 4))
Tuple{Continuous, Count}
By this logic, if I create a 1-tuple with a table t as its single element, then this tuple should have scitype Tuple{scitype(t)}. But this isn't always the case:
julia> t = (x=[1, 2], y=["a", "b"])
julia> scitype(t)
Table{Union{AbstractVector{Count}, AbstractVector{Textual}}}
julia> scitype((t,))
Table{Union{AbstractVector{AbstractVector{Count}}, AbstractVector{AbstractVector{Textual}}}}
The problem is that (t,) is also a table (with one row):
julia> schema((t,))
┌───────┬─────────────────────────┬────────────────┐
│ names │ scitypes                │ types          │
├───────┼─────────────────────────┼────────────────┤
│ x     │ AbstractVector{Count}   │ Vector{Int64}  │
│ y     │ AbstractVector{Textual} │ Vector{String} │
└───────┴─────────────────────────┴────────────────┘
This is pretty awful 😢. For example, it makes it tricky, in MLJBase, to use the fit_data_scitype of models to check compatibility of a model with data, as in JuliaAI/MLJBase.jl#731. That is, the test scitype(data) <: fit_data_scitype(model), where data is the tuple of data arguments, is not reliable.
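One possible way to sidestep the ambiguity, sketched here only as an illustration, is to assemble the tuple scitype from the scitypes of the individual elements, so that the tuple itself is never inspected as a table. The helper name elementwise_scitype is hypothetical and is not part of ScientificTypes or MLJBase:

```julia
using ScientificTypes  # provides scitype, Continuous, Count

# Hypothetical helper (an assumption for illustration, not an existing API):
# build the tuple scitype from the scitype of each element, rather than
# calling scitype on the tuple as a whole.
elementwise_scitype(data::Tuple) = Tuple{map(scitype, data)...}

t = (x=[1, 2], y=["a", "b"])

# Agrees with scitype((1.0, 4)) in the first example above.
elementwise_scitype((1.0, 4))

# Holds by construction, even though (t,) is itself a table.
elementwise_scitype((t,)) == Tuple{scitype(t)}
```

A compatibility check could then, under the same assumption, be written as elementwise_scitype(data) <: fit_data_scitype(model).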