Yourzo/hasherBenchmarker

Hasher benchmarker

Hasher benchmarker is a simple tool, built in C++, that measures and compares the speed of hash functions in an unordered map. It currently supports three key types: int, std::string, and pointer. It was built as part of my bachelor's thesis.

Bachelor's thesis:

Comparison of hash function implementations for hash tables

Setup

  • Start the program from the directory that contains config.json.
  • Results are written to the results directory, which is created on the first run.

Config.json

The application can run multiple benchmarks in one run. Every benchmark produces one result: a pair of data and metadata files.

  1. Define how many replications are executed for each test. This value applies to every benchmark.
"replications": 40,
  2. Define the benchmarks field. It is an array of arrays: each inner array is one benchmark.
  3. In each benchmark, define the tests that belong to it.
  4. Optionally add the shuffle option (default is false).
{
    "replications": 40,
    "shuffle": true,
    "benchmarks": [
        [
            {
                "here": "define test"
            }
        ]
    ]
}
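
The nesting above can be easier to see as types. The following is a minimal sketch of an in-memory model of config.json, not the tool's actual code: the struct names `TestConfig`, `BenchmarkConfig`, `Config` and the helper `totalTests` are hypothetical, and only the field names are taken from the JSON keys in this README.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one test object inside a benchmark.
struct TestConfig {
    std::string name;       // column name in the results file
    std::string hasher;     // e.g. "jenkins 32 bit"
    std::string generator;  // e.g. "basic int"
    std::size_t mapSize = 0;
};

// One benchmark is one inner array of tests.
using BenchmarkConfig = std::vector<TestConfig>;

// Hypothetical model of the whole file.
struct Config {
    int replications = 0;
    bool shuffle = false;                     // documented default
    std::vector<BenchmarkConfig> benchmarks;  // the outer array
};

// Counts the tests across all benchmarks.
inline std::size_t totalTests(const Config& cfg) {
    std::size_t count = 0;
    for (const auto& benchmark : cfg.benchmarks) count += benchmark.size();
    return count;
}
```

With this model, a file with two benchmarks of one test each has `benchmarks.size() == 2` and `totalTests(cfg) == 2`.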

TEST:

NAME:

Choose a name. It is used as the column name in `result/.data.csv`:
"name": "basic int 10 000 jenkins 32 bit"

HASHER:

Next, pick a hasher. The hasher also determines the key type:
"hasher": "jenkins 32 bit"

int hashers:

name
hash 1
jenkins 32 bit
multiplication hash
indentity_int
std::hash int
murmur2_int
murmur3_int
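
To show how an int hasher plugs into an unordered map, here is a minimal sketch of a multiplicative hash and an identity hash as functors. This is an illustration, not the tool's implementation: the struct names are hypothetical, and the constant 2654435769 (the 32-bit golden-ratio multiplier from Knuth's multiplicative hashing) is an assumption about what "multiplication hash" might use.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Multiplicative hashing: multiply by a fixed odd constant and let the
// product wrap around modulo 2^32. The constant is an assumption.
struct MultiplicativeIntHash {
    std::size_t operator()(int key) const noexcept {
        const std::uint32_t k = static_cast<std::uint32_t>(key);
        const std::uint32_t product = k * UINT32_C(2654435769);
        return static_cast<std::size_t>(product);
    }
};

// Identity hashing: the key's bit pattern is used directly as the hash.
struct IdentityIntHash {
    std::size_t operator()(int key) const noexcept {
        return static_cast<std::size_t>(static_cast<std::uint32_t>(key));
    }
};

// The hasher is passed as the third template argument of the map.
using MultiplicativeMap = std::unordered_map<int, int, MultiplicativeIntHash>;
using IdentityMap = std::unordered_map<int, int, IdentityIntHash>;
```

Swapping the third template argument is all that is needed to compare two hashers over the same keys.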

std::string hashers:

name
rolling hash
jenkins hash
std::hash string
djb2
sdbm
murmur2_str
murmur3_str
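
For reference, djb2 and sdbm are short, well-known string hashes. The sketch below follows their commonly published forms and wraps them as functors for an unordered map. The struct names are hypothetical, and the tool's own versions may differ in details such as the integer width.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// djb2: start at 5381, then hash = hash * 33 + c for every character.
struct Djb2Hash {
    std::size_t operator()(const std::string& key) const noexcept {
        std::size_t hash = 5381;
        for (unsigned char c : key) hash = hash * 33 + c;
        return hash;
    }
};

// sdbm: start at 0, then hash = c + (hash << 6) + (hash << 16) - hash.
struct SdbmHash {
    std::size_t operator()(const std::string& key) const noexcept {
        std::size_t hash = 0;
        for (unsigned char c : key) hash = c + (hash << 6) + (hash << 16) - hash;
        return hash;
    }
};

using Djb2Map = std::unordered_map<std::string, int, Djb2Hash>;
using SdbmMap = std::unordered_map<std::string, int, SdbmHash>;
```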

pointer hashers:

name
shift 3 pointer align 8
shift 4 pointer align 16
shift 5 pointer align 32
murmur2_ptr
murmur3_ptr
std::hash ptr
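
The names of the shift hashers suggest that they drop the low bits that are always zero for aligned pointers (3 bits for 8-byte alignment, 4 for 16, 5 for 32). The following sketch shows that idea under this assumption. `ShiftPointerHash` is a hypothetical name, not the tool's class.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Shifts the address right by Shift bits. For pointers aligned to
// 2^Shift bytes those bits are always zero, so no information is lost.
template <unsigned Shift>
struct ShiftPointerHash {
    std::size_t operator()(const void* ptr) const noexcept {
        return static_cast<std::size_t>(
            reinterpret_cast<std::uintptr_t>(ptr) >> Shift);
    }
};

// Example map keyed by 8-byte-aligned pointers.
using Align8PointerMap =
    std::unordered_map<const void*, int, ShiftPointerHash<3>>;
```

With a shift of 3, two adjacent 8-byte objects get hash values that differ by exactly 1, which spreads densely packed pointers over consecutive buckets.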

Generators

The next field defines the key generator. The generator type has to match the key type of the chosen hasher. Generators rely heavily on [**random**](https://en.cppreference.com/w/cpp/header/random) from the C++ standard library.
"generator": "basic int"

int generators:

name
basic int
normal dist int

std::string generators:

name
long string
small string
random length string

pointer generators:

name
packed pointer
random pointer
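
As an illustration of how such generators can be built on `<random>`, here is a minimal sketch. It assumes that "basic int" draws uniformly, that "normal dist int" rounds samples of a normal distribution, and that "packed pointer" means pointers to consecutive elements of one array. The function names, parameters, and seeds are hypothetical and not taken from the tool.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Uniformly distributed int keys in [0, INT_MAX] (the default range).
std::vector<int> uniformIntKeys(std::size_t count, unsigned seed) {
    std::mt19937 engine(seed);
    std::uniform_int_distribution<int> dist;
    std::vector<int> keys(count);
    for (auto& key : keys) key = dist(engine);
    return keys;
}

// Normally distributed keys, rounded to the nearest int.
std::vector<int> normalIntKeys(std::size_t count, unsigned seed,
                               double mean, double stddev) {
    std::mt19937 engine(seed);
    std::normal_distribution<double> dist(mean, stddev);
    std::vector<int> keys(count);
    for (auto& key : keys) key = static_cast<int>(std::lround(dist(engine)));
    return keys;
}

// Pointers to consecutive elements of the caller's storage. The storage
// must outlive the returned keys.
std::vector<const long long*> packedPointerKeys(
    const std::vector<long long>& storage) {
    std::vector<const long long*> keys;
    keys.reserve(storage.size());
    for (const auto& slot : storage) keys.push_back(&slot);
    return keys;
}
```

Using a fixed seed makes a run repeatable, so two hashers can be compared on the same key sequence.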

Size of the map

The last field defines how many elements the tested map contains.
"mapSize": 10000
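
To make the roles of "replications" and "mapSize" concrete, the sketch below times lookups in a map of a given size, once per replication. It is an assumption about the general shape of such a measurement, not the tool's code: `timeLookups` is a hypothetical name, and this README does not state which map operation the tool actually times.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Fills a map with the given keys (keys.size() plays the role of mapSize),
// then times one full pass of lookups per replication.
// Returns the elapsed nanoseconds of every replication.
template <typename Key, typename Hasher = std::hash<Key>>
std::vector<double> timeLookups(const std::vector<Key>& keys, int replications) {
    std::unordered_map<Key, int, Hasher> map;
    map.reserve(keys.size());
    for (const auto& key : keys) map.emplace(key, 0);

    std::vector<double> nanos;
    std::size_t found = 0;
    for (int r = 0; r < replications; ++r) {
        const auto start = std::chrono::steady_clock::now();
        for (const auto& key : keys) found += map.count(key);
        const auto stop = std::chrono::steady_clock::now();
        nanos.push_back(
            std::chrono::duration<double, std::nano>(stop - start).count());
    }
    // Store the result in a volatile so the lookups are not optimized away.
    volatile std::size_t sink = found;
    (void)sink;
    return nanos;
}
```

Calling `timeLookups<int>(keys, 40)` yields 40 timings for the default `std::hash<int>`. Passing a custom hasher as the second template argument times the same keys with that hasher.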

Config examples:

  • Config with one test:
{
    "replications": 40,
    "benchmarks": [
        [
            {
                "name": "basic int 10 000 jenkins 32 bit",
                "hasher": "jenkins 32 bit",
                "generator": "basic int",
                "mapSize": 10000
            }
        ]
    ]
}
  • Config with two tests in one benchmark:
{
    "replications": 40,
    "benchmarks": [
        [
            {
                "name": "basic int 10 000 jenkins 32 bit",
                "hasher": "jenkins 32 bit",
                "generator": "basic int",
                "mapSize": 10000
            },
            {
                "name": "basic int 10 000 murmur3_int",
                "hasher": "murmur3_int",
                "generator": "basic int",
                "mapSize": 10000
            }
        ]
    ]
}
  • Config with multiple benchmarks:
{
    "replications": 40,
    "benchmarks": [
        [
            {
                "name": "basic int 10 000 jenkins 32 bit",
                "hasher": "jenkins 32 bit",
                "generator": "basic int",
                "mapSize": 10000
            }
        ],
        [
            {
                "name": "packed pointer 1 000 000 shift 4",
                "hasher": "shift 4 pointer align 16",
                "generator": "packed pointer",
                "mapSize": 1000000
            }
        ]
    ]
}

Coming soon:

  • Finishing this file (documentation).
  • Python script to process the results.
