Skip to content

API of SortedContainer #918

Open
Open
@Wu-Chenyang

Description

@Wu-Chenyang

In my application, I find it's useful to define the following APIs for SortedContainer.

Base.checkbounds(::Type{Bool}, container::SortedContainer, token::DataStructures.IntSemiToken) =
    token.address >= 3 && (token.address in container.bt.useddatacells)
Base.checkbounds(container::SortedContainer, token::DataStructures.IntSemiToken) = (container, token) |> DataStructures.has_data
Base.nextind(container::SortedContainer, token::DataStructures.IntSemiToken) = (container, token) |> advance
Base.prevind(container::SortedContainer, token::DataStructures.IntSemiToken) = (container, token) |> regress

Base.@propagate_inbounds function Base.getindex(m::SortedDict,
                              i::DataStructures.IntSemiToken)
           @boundscheck DataStructures.has_data((m,i))
           return (m, i) |> deref
end
Base.@propagate_inbounds function Base.getindex(m::SortedMultiDict,
                              i::DataStructures.IntSemiToken)
           @boundscheck DataStructures.has_data((m,i))
           return (m, i) |> deref
end
Base.@propagate_inbounds function Base.getindex(m::SortedSet,
                              i::DataStructures.IntSemiToken)
           @boundscheck DataStructures.has_data((m,i))
           return (m, i) |> deref
end
# Base.@propagate_inbounds function Base.getindex(m::SortedContainer, token::DataStructures.IntSemiToken)
#     @boundscheck DataStructures.has_data((m,token))
#     return (container, token) |> deref
# end

The first three functions checkbounds, nextind, prevind are straightforward and align well with existing APIs. The getindex functions for IntSemiToken have been implemented for SortedDict and SortedMultiDict, which return only the values. I modified them to return key-value pairs, because it is natural to do so in my application. Besides, given the existence of another getindex function that retrieves the value using the key, I think retrieving key-value pairs using IntSemiToken seems a natural choice. Of course, it might not be a good choice in general. How do you think about these changes?

Here is a piece of my application code for motivating the changes. It uses the general APIs for sorted containers to simultaneously handle sorted AbstractVector and SortedDict.

struct Observation{I, O}
    index::I
    obs::O
    Observation{I,O}(index::I, obs::O) where {I, O} = new{I,O}(index, obs)
end
Observation(index::I, obs::O) where {I, O} = Observation{I,O}(index, obs)
Base.lt(::Base.Order.ForwardOrdering, a::Observation, b::Observation) = a.index < b.index
Base.lt(::Base.Order.ForwardOrdering, a::Real, b::Observation) = a < b.index
Base.lt(::Base.Order.ForwardOrdering, a::Observation, b::Real) = a.index < b
flatten(obs::Observation) = (obs.index, obs.obs)
flatten(pair::Pair) = (pair.first, pair.second)

const ObservationContainer{I,O} = Union{AbstractVector{<:Observation{I,O}}, SortedDict{I,O}}

function (w::AbstractWienerProcess)(t::Real, condition::ObservationContainer)
    w₀, Σ = params(w)

    # `IntSemiToken` for `SortedDict`, `Int` for `AbstractVector`
    idx = searchsortedlast(condition, t)
    if checkbounds(Bool, condition, idx)
        t₁, a = flatten(condition[idx])
    else
        t₁ = zero(t)
        a = w₀
    end

    next_idx = nextind(condition, idx)
    if checkbounds(Bool, condition, next_idx)
        t₂, b = flatten(condition[next_idx])
        μ = a + (t - t₁) / (t₂ - t₁) * (b - a)
        cov_t = ((t₂ - t) * (t - t₁) / (t₂ - t₁))
        construct_normal(μ, Σ, cov_t)
    else
        construct_normal(a, Σ, t - t₁)
    end
end

I also tried to use the SortedSet as the container in this code block but encountered a problem that searchsortedlast for SortedSet tries to convert the t::Real to Observation before comparing it to other observations in the container. I think this compulsory conversion might not be necessary. In some cases, it is natural to directly compare objects of different types. The above code is an example.

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions