Skip to content

Breaking: split set into unsafe_set and set #926

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: main
Choose a base branch
from
Open

Breaking: split set into unsafe_set and set #926

wants to merge 21 commits into from

Conversation

rafaqz
Copy link
Owner

@rafaqz rafaqz commented Feb 10, 2025

This PR aims to make set much more reliable and useful to users, and keep the faster and always type stable unsafe_set for internal use

closes #624, closes #920 and closes #924

@felixcremer
Copy link
Collaborator

I suppose this is going to close #624 #920 and #924?

@rafaqz
Copy link
Owner Author

rafaqz commented Feb 11, 2025

Yeah, basically just implementing your idea in #624

rafaqz and others added 10 commits February 25, 2025 01:18
)

* DimVector of NamedTuple is a NamedTuple table

* bugfix

* remove show

* fix ambiguity
… for `AbstractDimStack` (#903)

* add combine method

* test groupby and similar

* docs entry
* add preservedims keyword to DimTable

* add tests

* Apply suggestions from code review

Co-authored-by: Anshul Singhvi <[email protected]>

* tests, and fix DimSlices

* better table docs

* cleanup

* test

* indexing overhaul

* fix similar and broadcast for basicdimarray

* bugfix rebuildsliced

* more indexing cleanup

* cleanup similar and gubfix indexing

* bugfixes

* uncomment

* fix doctests

* just dont doctest unreproducable failures, for now

* combine new Tables integrations

* bugfix and cleanup show

* bugfix and more tests for preservedims and mergedims

---------

Co-authored-by: Anshul Singhvi <[email protected]>
Copy link
Collaborator

@felixcremer felixcremer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a first rudimentary reading of the code. I am going to add some tests

checkaxis(lookup::Union{Dimension,Lookup}, axis::AbstractUnitRange) =
first(axes(lookup)) == axis || _checkaxiserror(lookup, axis)

checkaxes(lookups::Tuple, axes::Tuple) = all(map(checkaxis, lookups, axes))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be the checkaxis function? From what I understand they are doing conceptually the same on different input. I find it quite confusing to have two different functions with such a similar name.

Keywords can be passed to `set` to update the fields of an object,
working like keyword `rebuild` but updating related fields where needed.

Fields are always the same as keywords for the objects constructor.

## Examples
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we hide the examples behind an extended Help block, so that one can get them by doing ?? but one can still quickly look at the docstring without having to scroll through all the examples?

@felixcremer
Copy link
Collaborator

set should also work on a DimTree.

@felixcremer
Copy link
Collaborator

set of a single array in a DimStack should work but throws the following error:

julia> set(s, (test2=zero(a2)))
ERROR: MethodError: no method matching rebuild(::DimStack{…}; test2::Matrix{…})
This error has been manually thrown, explicitly, so the method may exist but be intentionally marked as unimplemented.

Closest candidates are:
  rebuild(::AbstractDimStack; data, dims, refdims, layerdims, metadata, layermetadata) got unsupported keyword argument "test2"
   @ DimensionalData ~/.julia/dev/DimensionalData/src/stack/stack.jl:73
  rebuild(::AbstractDimStack, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any) got unsupported keyword argument "test2"
   @ DimensionalData ~/.julia/dev/DimensionalData/src/stack/stack.jl:66
  rebuild(::AbstractDimStack, ::Any, ::Any, ::Any, ::Any, ::Any) got unsupported keyword argument "test2"
   @ DimensionalData ~/.julia/dev/DimensionalData/src/stack/stack.jl:66
  ...

Stacktrace:
 [1] kwerr(::@NamedTuple{}, ::Function, ::DimStack{…})
   @ Base ./error.jl:165
 [2] _set(s::DimensionalData.Dimensions.Lookups.Safe, st::DimStack{…}; data::Nothing, dims::Nothing, kw::@Kwargs{})
   @ DimensionalData ~/.julia/dev/DimensionalData/src/set.jl:202
 [3] set(::DimStack{…}; kw::@Kwargs{})
   @ DimensionalData ~/.julia/dev/DimensionalData/src/set.jl:139
 [4] top-level scope
   @ REPL[137]:1
Some type information was truncated. Use `show(err)` to see complete types.

julia> set(s, (test2=zero(a2),))
ERROR: ArgumentError: keys (:test2, :test3) and (:test2,) do not match
Stacktrace:
 [1] _keyerr(ka::Tuple{Symbol, Symbol}, kb::Tuple{Symbol})
   @ DimensionalData ~/.julia/dev/DimensionalData/src/set.jl:268
 [2] _set_dimstack_data(::DimensionalData.Dimensions.Lookups.Safe, st::DimStack{…}, newdata::@NamedTuple{})
   @ DimensionalData ~/.julia/dev/DimensionalData/src/set.jl:254
 [3] _set(s::DimensionalData.Dimensions.Lookups.Safe, A::DimStack{…}, newdata::@NamedTuple{})
   @ DimensionalData ~/.julia/dev/DimensionalData/src/set.jl:240
 [4] set(x::DimStack{…}, args::@NamedTuple{}; kw::@Kwargs{})
   @ DimensionalData ~/.julia/dev/DimensionalData/src/set.jl:139
 [5] top-level scope
   @ REPL[138]:1
Some type information was truncated. Use `show(err)` to see complete types.

@rafaqz
Copy link
Owner Author

rafaqz commented Apr 4, 2025

Yeah there is nothing written for DimTree yet

@felixcremer
Copy link
Collaborator

setting the axes from a DimStack with mixed dimension names to the same names should fail if the axes can't be made the same.

julia> da = DimArray(a, dimz; name=:test)
┌ 2×2 DimArray{Int64, 2} test ┐
├─────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   X Sampled{Float64} 143.0:2.0:145.0 ForwardOrdered Regular Points,
   Y Sampled{Float64} -38.0:2.0:-36.0 ForwardOrdered Regular Points
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
      -38.0  -36.0
 143.0    1      2
 145.0    3      4

julia> da2 = DimArray(a2, dimz2; name=:test2)
┌ 3×4 DimArray{Int64, 2} test2 ┐
├──────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   row    Sampled{Float64} 10.0:10.0:30.0 ForwardOrdered Regular Points,
   column Sampled{Float64} -2.0:1.0:1.0 ForwardOrdered Regular Points
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
     -2.0  -1.0  0.0  1.0
 10.0   1     2    3    4
 20.0   3     4    5    6
 30.0   4     5    6    7

smix = DimStack(da, da2)
julia> set(smix, :row => X, :column => Z) # This should fail, because the DimStack is not usable.2×2×3×4 DimStack ┐
├──────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   X Sampled{Float64} 143.0:2.0:145.0 ForwardOrdered Regular Points,
   Y Sampled{Float64} -38.0:2.0:-36.0 ForwardOrdered Regular Points,
  ↗ X Sampled{Float64} 10.0:10.0:30.0 ForwardOrdered Regular Points,
  ⬔ Z Sampled{Float64} -2.0:1.0:1.0 ForwardOrdered Regular Points
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── layers ┤
  :test  eltype: Int64 dims: X, Y size: 2×2
  :test2 eltype: Int64 dims: X, Z size: 2×4

@felixcremer
Copy link
Collaborator

I am not too sure, what is the correct behaviour here but I think this should keep working even if the set of mixed axes fail:
This could also get rid of the second X axis.

julia> ssame = DimStack(da, set(da, X=>Z))
┌ 2×2×2 DimStack ┐
├────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   X Sampled{Float64} 143.0:2.0:145.0 ForwardOrdered Regular Points,
   Y Sampled{Float64} -38.0:2.0:-36.0 ForwardOrdered Regular Points,
  ↗ Z Sampled{Float64} 143.0:2.0:145.0 ForwardOrdered Regular Points
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── layers ┤
  :layer1 eltype: Int64 dims: X, Y size: 2×2
  :layer2 eltype: Int64 dims: Z, Y size: 2×2
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

julia> set(ssame, :Z=>X)
┌ 2×2×2 DimStack ┐
├────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   X Sampled{Float64} 143.0:2.0:145.0 ForwardOrdered Regular Points,
   Y Sampled{Float64} -38.0:2.0:-36.0 ForwardOrdered Regular Points,
  ↗ X Sampled{Float64} 143.0:2.0:145.0 ForwardOrdered Regular Points
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── layers ┤
  :layer1 eltype: Int64 dims: X, Y size: 2×2
  :layer2 eltype: Int64 dims: X, Y size: 2×2

@rafaqz
Copy link
Owner Author

rafaqz commented Apr 4, 2025

Yeah that should get rid of the second X as they match. Not sure what to do if they are different, probably having the same dim twice shouldn't exist

@felixcremer
Copy link
Collaborator

Having the same dim twice doesn't work if you try to construct it.

@felixcremer
Copy link
Collaborator

I feel like I don't have a good grasp of how the code is entangled. I tried to fix some ambiguities and go through the tests to get them fixed. You should definitively have a good look at my changes, because it feels a bit like guesswork.

@felixcremer
Copy link
Collaborator

I fixed the ambiguities and now we run only in other test failures.

@rafaqz
Copy link
Owner Author

rafaqz commented Apr 7, 2025

Thanks. I'm not expecting you to work through all that!! It's fine to just add tests that are broken

@felixcremer
Copy link
Collaborator

I am not sure, what is the best behaviour here, but I am leaning towards also changing the Sampling to Sampled.

julia> x = format(X([:a,:b]))
X Categorical{Symbol} ForwardOrdered
wrapping: 2-element Vector{Symbol}:
 :a
 :b

julia> xint = set(x, 1:2)
X Categorical{Int64} ForwardOrdered
wrapping: 1:2

@rafaqz
Copy link
Owner Author

rafaqz commented Apr 9, 2025

Raster Bands are Categorical Ints, so I think we can't change that.

@felixcremer
Copy link
Collaborator

Set of the locus on the boarder of the typemaximum and minimum fails

julia> x = format(X(typemin(Int8):typemax(Int8)))
X Sampled{Int8} ForwardOrdered Regular Points
wrapping: -128:127

julia> xi = set(x, Intervals(Start()))
X Sampled{Int8} ForwardOrdered Regular Intervals{Start}
wrapping: -128:127

julia> set(xi, Center())
ERROR: DimensionMismatch: argument dimensions must match: length of r1 is 0, length of r2 is 256
Stacktrace:
  [1] -(r1::UnitRange{Int8}, r2::UnitRange{Int8})
    @ Base ./range.jl:1443
  [2] broadcasted
    @ ./broadcast.jl:1143 [inlined]
  [3] broadcasted
    @ ./broadcast.jl:1331 [inlined]
  [4] _shiftlocus(destlocus::Center, span::Regular{…}, sampling::Intervals{…}, l::Sampled{…})
    @ DimensionalData.Dimensions.Lookups ~/.julia/dev/DimensionalData/src/Lookups/utils.jl:30
  [5] _shiftlocus
    @ ~/.julia/dev/DimensionalData/src/Lookups/utils.jl:20 [inlined]
  [6] shiftlocus
    @ ~/.julia/dev/DimensionalData/src/Lookups/utils.jl:11 [inlined]
  [7] _maybeshiftlocus
    @ ~/.julia/dev/DimensionalData/src/Lookups/utils.jl:59 [inlined]
  [8] maybeshiftlocus
    @ ~/.julia/dev/DimensionalData/src/Lookups/utils.jl:57 [inlined]
  [9] _set
    @ ~/.julia/dev/DimensionalData/src/Lookups/set.jl:149 [inlined]
 [10] _set
    @ ~/.julia/dev/DimensionalData/src/Dimensions/set.jl:52 [inlined]
 [11] set(dim::X{Sampled{Int8, UnitRange{Int8}, ForwardOrdered, Regular{Int8}, Intervals{Start}, NoMetadata}}, x::Center)
    @ DimensionalData.Dimensions ~/.julia/dev/DimensionalData/src/Dimensions/set.jl:3
 [12] top-level scope
    @ REPL[197]:1
Some type information was truncated. Use `show(err)` to see complete types.

@rafaqz
Copy link
Owner Author

rafaqz commented Apr 14, 2025

Ahh looks like Int overflow. Not sure we can deal with that without changing the type

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants