Comparison to SortedContainers.SortedList #1

Have you seen the [Python SortedContainers module](http://www.grantjenks.com/docs/sortedcontainers/)? It's a fast, pure-Python implementation of SortedList, SortedSet, and SortedDict data types. It also has an extensive [performance comparison](http://www.grantjenks.com/docs/sortedcontainers/performance.html) with modules related to yours.

I was interested in your performance claims and wanted to benchmark SortedContainers against your skip list. But a [SortedList](http://www.grantjenks.com/docs/sortedcontainers/sortedlistwithkey.html) is not quite the same as a SkipList, so I wrote a little wrapper and ran a benchmark comparison using your `perf_skiplist.py`. Here are my results:

| Test | PySkipList | SortedContainers | SortedContainers speedup |
| --- | --- | --- | --- |
| skiplist_index_throughput_1000 | 152326.28 | 602629.89 | 3.96 times faster |
| skiplist_index_throughput_10000 | 134544.94 | 485423.76 | 3.61 times faster |
| skiplist_index_throughput_100000 | 100671.55 | 382475.61 | 3.80 times faster |
| skiplist_insert_throughput_1000 | 77543.06 | 368406.15 | 4.75 times faster |
| skiplist_insert_throughput_10000 | 57264.03 | 321809.49 | 5.62 times faster |
| skiplist_insert_throughput_100000 | 42304.85 | 257105.53 | 6.08 times faster |
| skiplist_remove_throughput_1000 | 88881.20 | 171370.95 | 1.93 times faster |
| skiplist_remove_throughput_10000 | 66682.10 | 140854.81 | 2.11 times faster |
| skiplist_remove_throughput_100000 | 50299.01 | 109368.50 | 2.17 times faster |
| skiplist_search_throughput_1000 | 226229.99 | 346636.69 | 1.53 times faster |
| skiplist_search_throughput_10000 | 159439.83 | 241837.23 | 1.52 times faster |
| skiplist_search_throughput_100000 | 103455.39 | 181585.36 | 1.76 times faster |
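
The last column is just the SortedContainers figure divided by the PySkipList figure for the same test. As a quick sanity check, here is a small sketch that recomputes it for three rows copied from the table (the `results` dict and the `speedup` helper are names made up for illustration, not part of either library):

```python
# Throughput figures copied from three rows of the table above:
# test name -> (PySkipList, SortedContainers).
results = {
    "skiplist_index_throughput_1000": (152326.28, 602629.89),
    "skiplist_insert_throughput_100000": (42304.85, 257105.53),
    "skiplist_search_throughput_10000": (159439.83, 241837.23),
}


def speedup(baseline, candidate):
    """Return how many times higher `candidate` throughput is than `baseline`."""
    return candidate / baseline


for name, (pyskiplist, sortedcontainers) in results.items():
    ratio = speedup(pyskiplist, sortedcontainers)
    print("%s: %.2f times faster" % (name, ratio))
```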

I'm not sure if you published this because the SkipList data structure is awesome (which it is) or because you need this in a production environment where performance matters. In the latter case, I wanted to provide this wrapper for you and get your thoughts. Here's the wrapper source (Apache2 License):

``` python
from operator import itemgetter
from sortedcontainers import SortedListWithKey

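class PySkipList(object):
    """Wrapper exposing a PySkipList-style API on top of SortedListWithKey."""

    def __init__(self):
        # Store (key, value) pairs, ordered by key only.
        self._list = SortedListWithKey(key=itemgetter(0))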

    def insert(self, key, value):
        self._list.add((key, value))

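    def replace(self, key, value):
        pos = self._list.bisect_key_left(key)

        # bisect_key_left returns len(self._list) when key is greater than
        # every stored key, so check the bound before indexing.
        if pos < len(self._list) and self._list[pos][0] == key:
            # Delete and re-add rather than assigning by index; sorted
            # lists in SortedContainers 2.x do not support item assignment.
            del self._list[pos]

        self._list.add((key, value))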

    def clear(self):
        self._list.clear()

    def __len__(self):
        return len(self._list)

    def __iter__(self, start=None, stop=None):
        return self._list.irange_key(
            min_key=start, max_key=stop, inclusive=(False, False)
        )

    items = __iter__

    def keys(self, start=None, stop=None):
        return (pair[0] for pair in self.items(start, stop))

    def values(self, start=None, stop=None):
        return (pair[1] for pair in self.items(start, stop))

    def popitem(self):
        if len(self):
            return self._list.pop(0)
        else:
            raise KeyError

    def search(self, key, default=None):
        pos = self._list.bisect_key_left(key)

        # pos == len(self._list) means key is greater than every stored key.
        if pos < len(self._list) and self._list[pos][0] == key:
            return self._list[pos]
        return default

    def remove(self, key):
        pos = self._list.bisect_key_left(key)

        if pos < len(self._list) and self._list[pos][0] == key:
            return self._list.pop(pos)
        raise KeyError(key)

    # Sentinel that distinguishes "no default given" from default=None.
    # A single leading underscore keeps the name free of class-private
    # name mangling, so the methods below can look it up via self.
    _not_set = object()

    def pop(self, key, default=_not_set):
        pos = self._list.bisect_key_left(key)

        if pos < len(self._list) and self._list[pos][0] == key:
            return self._list.pop(pos)[1]
        if default is self._not_set:
            raise KeyError(key)
        return default

    def __contains__(self, key):
        pos = self._list.bisect_key_left(key)
        return pos < len(self._list) and self._list[pos][0] == key

    def index(self, key, default=_not_set):
        pos = self._list.bisect_key_left(key)

        if pos < len(self._list) and self._list[pos][0] == key:
            return pos
        if default is self._not_set:
            raise KeyError(key)
        return default

    def count(self, key):
        start = self._list.bisect_key_left(key)
        end = self._list.bisect_key_right(key)
        return end - start

    def __getitem__(self, pos):
        if isinstance(pos, slice):
            return self._list.islice(pos.start, pos.stop)
        else:
            return self._list[pos]

    def __delitem__(self, pos):
        del self._list[pos]

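    def __setitem__(self, pos, value):
        # Keep the key at this position and swap in the new value. Delete and
        # re-add rather than assigning by index, which SortedContainers 2.x
        # does not support on sorted lists. With duplicate keys the new pair
        # may land at a different position within the run of equal keys.
        key = self._list.pop(pos)[0]
        self._list.add((key, value))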
```
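
Most of the wrapper's methods follow the same lookup pattern: bisect to the leftmost candidate position, then compare the key stored there. One detail worth spelling out is that a left bisect returns the length of the list when the key is greater than everything stored, so the position needs a bounds check before indexing. Here's a minimal sketch of that pattern using the standard library's `bisect` module on a plain list in place of SortedContainers (the `find` helper is hypothetical, not part of either library):

```python
from bisect import bisect_left


def find(keys, key):
    """Return the index of the first occurrence of `key` in sorted `keys`, or -1."""
    pos = bisect_left(keys, key)
    # pos == len(keys) when key is greater than every element, so check
    # the bound before indexing.
    if pos < len(keys) and keys[pos] == key:
        return pos
    return -1


keys = [10, 20, 20, 30]
print(find(keys, 20))  # 1: leftmost of the two equal keys
print(find(keys, 25))  # -1: bisects to index 3, but 30 is stored there
print(find(keys, 40))  # -1: bisect_left returns 4 == len(keys)
print(find([], 5))     # -1: an empty list needs no special case
```

With the bounds check in place, the separate "is the list empty?" branch becomes unnecessary, since an empty list bisects to position 0, which already fails `pos < len(keys)`.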

If you're interested in making this even faster, let me know. There are a few things that could be done.

