import/export raw buffer? #2

@dominictarr

Description

I'm working on a modular database project, flumedb, in which I intend to use jsbloom to create a view that tests whether a given item is in the database (etc.).

I tested the import/export functions you provide, which use LZString, and unfortunately found that in some cases the exported filter is actually larger than the raw buffer! When the filter is either sparse or overfull it compresses well (because it's mostly 0s or mostly 1s), but it compresses worst when the fill is somewhere in the middle.

I saw in the code https://github.com/cry/jsbloom/blob/master/bloom.js#L130 that you are just stringifying the buffer and then compressing it. This turns the buffer into comma-separated integers, so one byte becomes 2-4 bytes. While it should theoretically be possible to compress this as well as a denser encoding, most compression algorithms are not that clever, so it helps to give them an easier input.
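To illustrate the expansion (a rough sketch of the effect, not the actual jsbloom code):

```js
// Stringifying a byte array turns each byte into 1-3 decimal digits plus a
// comma separator, so the input handed to the compressor is already 2-4x
// larger than the raw bytes.
const bytes = new Uint8Array([200, 7, 64, 255]); // 4 raw bytes
const stringified = bytes.toString();            // "200,7,64,255"
console.log(stringified.length);                 // 12 characters for 4 bytes
```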

Anyway, I want to be able to import/export the raw buffer, and want to ask about the best way to add this feature.

One option would be to pass in {encode, decode} functions, which could then be used within the export/import functions (this is a standard pattern in the level community: https://github.com/level/levelup#custom-encodings). The default encoding could then be your LZString, if you wanted to keep this backwards compatible.
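A rough sketch of how that might look (the constructor and option names here are just illustrative, not the existing jsbloom API):

```js
// Hypothetical: the caller supplies how the bit vector is (de)serialized,
// and the existing export/import functions delegate to these. The default
// could remain the current LZString behaviour for backwards compatibility.
const filter = new BloomFilter(10000, 0.001, {
  encode: (bVector) => Buffer.from(bVector), // export the raw bytes
  decode: (data) => new Uint8Array(data)     // import them back
});
```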

A simpler change, although less elegant I think, would be to accept bVector as the third argument.

It seems to me that it might also be important to serialize the other settings (items, probability), but maybe those could also be passed to {encode, decode}?
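Something like this is roughly what I have in mind for a raw export that also carries the settings (just a sketch; the field names are assumptions):

```js
// Hypothetical raw export: hand back the settings next to the untouched
// bit vector, so an importer can rebuild an equivalent filter and the
// caller can compress/store the buffer however they like.
function exportRaw(filter) {
  return {
    items: filter.items,                // expected item count (assumed field name)
    probability: filter.probability,    // target false-positive rate (assumed field name)
    buffer: Buffer.from(filter.bVector) // the raw bit vector, uncompressed
  };
}
```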
