Tokyocabinet has several behaviours which are not ideal for our usage and which result in inefficient CPU usage:
- Keys are treated as strings, which results in every key being hashed twice.
- A malloc/free is performed on every get (search for 'free' in ocean.db.tokyocabinet.TokyoCabinetM). TC copies its internal record into a malloced buffer, which is passed to our get method, which then copies it into the D buffer that receives it. This is a double copy, in addition to a theoretically unnecessary malloc/free cycle.
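To make the second point concrete, here is a minimal sketch in plain C of the two get patterns. It does not use the Tokyo Cabinet API: `record_t`, `get_malloced`, `get_double_copy` and `get_single_copy` are hypothetical names made up for illustration, and the "store" is a single in-memory record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a record held inside the storage engine. */
typedef struct {
    const char *data;
    size_t len;
} record_t;

/* Engine-side get as described above: hands back a freshly malloc'ed copy
 * of the internal record. The caller owns the buffer and must free it. */
static char *get_malloced(const record_t *rec, size_t *len_out)
{
    assert(rec != NULL && len_out != NULL);
    char *copy = malloc(rec->len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, rec->data, rec->len);   /* copy 1: internal -> malloc'ed */
    *len_out = rec->len;
    return copy;
}

/* Caller-side wrapper: copies the malloc'ed buffer into the destination
 * buffer, then frees it. Two copies plus one malloc/free per get. */
static size_t get_double_copy(const record_t *rec, char *dst, size_t cap)
{
    size_t len = 0;
    char *tmp = get_malloced(rec, &len);
    if (tmp == NULL)
        return 0;
    if (len > cap)
        len = cap;                       /* truncate to the destination size */
    memcpy(dst, tmp, len);               /* copy 2: malloc'ed -> destination */
    free(tmp);
    return len;
}

/* Alternative: copy straight from the internal record into the
 * destination buffer. One copy, no allocation. */
static size_t get_single_copy(const record_t *rec, char *dst, size_t cap)
{
    assert(rec != NULL && dst != NULL);
    size_t len = rec->len < cap ? rec->len : cap;
    memcpy(dst, rec->data, len);
    return len;
}
```

Both functions leave the same bytes in `dst`; the difference is only the extra allocation and copy on each call. Whether the single-copy form is reachable depends on being able to get at the engine's internal record, which is what modifying tokyocabinet or writing a new engine would have to provide.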
It may be possible to fix these issues, either by modifying tokyocabinet itself or by implementing a new storage engine in D, perhaps based on ocean.util.container.map.HashMap. (The latter would require implementing a proper step iterator in the HashMap.)
Additional not-so-ideal points about tokyocabinet:
- Internally, 32-bit hashes are used.
- From looking at the tcmapput() function, there seems to be no kind of rehashing behaviour. This means that the initial num_records and load_factor settings are very important! Perhaps it's good though, in our case, to not have any rehashing behaviour.
- There seems to be no built-in way to minimise the allocated memory of the database after a large number of records have been removed. This can happen, for example, after a data redistribution.
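As a rough illustration of why the initial sizing matters when there is no rehashing, the sketch below fills a table whose bucket count never changes and reports the longest chain. It is not Tokyo Cabinet code: the hash is 32-bit FNV-1a, chosen only as a stand-in, and `fill_fixed_table` is a hypothetical helper that counts bucket occupancy rather than storing records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a. A stand-in hash for illustration only; this is an
 * assumption and not the hash function Tokyo Cabinet uses. */
static uint32_t fnv1a32(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;            /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;                  /* FNV prime */
    }
    return h;
}

/* "Inserts" the keys 0 .. records-1 into a table with a fixed number of
 * buckets by counting how many keys land in each bucket. The caller passes
 * a zeroed array of `buckets` counters. Returns the longest chain. */
static size_t fill_fixed_table(size_t *counts, size_t buckets, uint64_t records)
{
    assert(counts != NULL && buckets > 0);
    size_t longest = 0;
    for (uint64_t k = 0; k < records; k++) {
        size_t b = fnv1a32(&k, sizeof k) % buckets;
        if (++counts[b] > longest)
            longest = counts[b];
    }
    return longest;
}
```

With the bucket count fixed, the average chain length is simply records / buckets, so a table that ends up holding 100 times more records than it was sized for has chains about 100 times longer on average, and every lookup walks them.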