Skip to content

Splay tree amortized time issues #923

Open
@matthewsot

Description

@matthewsot

Hello,

Thanks for your work on this package !

The documentation at https://juliacollections.github.io/DataStructures.jl/stable/splay_tree/#Splay-Tree-1 indicates the splay tree implementation should guarantee

Operations such as search, insert and delete can be done in O(log n) amortized time, where n is the number of nodes in the SplayTree

But I think this can be violated by the current implementation because it doesn't splay on unsuccessful searches, or on redundant insertions.

Here's a short test script to show the behavior; I have a particularly slow computer so you might need to increase the "10000" to see useful timing info:

using DataStructures

function test(N)
    if true
        t = SplayTree{Int}()
    else
        t = AVLTree{Int}()
    end

    for i in 1:N
        push!(t, i)
    end

    for i in 1:N
        if true
            push!(t, 1)
        else
            push!(t, -i)
        end
    end
end

for i in 1:10
    @time test(i * 10000)
end

In its default configuration, this:

  1. Pushes 1, 2, 3, ..., N, which (correctly) creates a splay tree that's just one long list (root is N, left child N-1, left child N-2, etc.).
  2. Tries to repeatedly re-push 1, which is already in the data structure. This causes it to traverse the entire length-N chain, but since the implementation doesn't splay on redundant pushes no reorganization happens and so this expensive operation happens every single time we search.

The result (at least on my machine) is very poor scaling behavior.

Notably, it is significantly faster if an AVL tree is used or if we push nonredundant elements, so I think the explanation above of the cause of the slowdown is correct. And I think it also affects, e.g., sequences of unsuccessful haskeys (replace push!(t, 1) with haskey(t, 0)).

I think the basic fix is to have search_node always splay the very last node it encounters, as suggested in the splay tree paper. Alternatively, if this is expected behavior, it would be nice IMO to document that. If this package is still maintained + there's interest in either of those two changes I'm happy to send a PR.

Also sorry if I've misunderstood something about the implementation or use of the package; I'm new to Julia.

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions