Problems with tokyocabinet as the memory storage engine #192

Tokyocabinet behaves in several ways that are not ideal for our usage and that waste CPU time:
1. Keys are treated as strings, which results in every key being hashed twice.
2. A malloc/free is performed on every get (search for 'free' in `ocean.db.tokyocabinet.TokyoCabinetM`). TC copies its internal record into a malloced buffer, which is passed to our get method, which then copies it into the D buffer that receives the value. This is a double copy, on top of a theoretically unnecessary malloc/free cycle.
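
To make the cost in point 2 concrete, here is a minimal C sketch of the two access patterns. It is not tokyocabinet's code: `record_t`, `get_via_malloc` and `get_direct` are hypothetical names invented for illustration, and the "engine" is just a pointer to some bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a record held inside the storage engine. */
typedef struct {
    const char *data;
    size_t len;
} record_t;

/* Pattern described in point 2: the engine hands out a freshly malloc'ed
 * copy, the caller copies it again into its own buffer, then frees it. */
size_t get_via_malloc(const record_t *rec, char *dst, size_t cap)
{
    assert(rec != NULL && dst != NULL);
    char *tmp = malloc(rec->len);          /* allocation on every get */
    if (tmp == NULL)
        return 0;
    memcpy(tmp, rec->data, rec->len);      /* copy 1: engine -> temp buffer */
    size_t n = rec->len < cap ? rec->len : cap;
    memcpy(dst, tmp, n);                   /* copy 2: temp -> caller buffer */
    free(tmp);                             /* matching free on every get */
    return n;
}

/* Alternative: copy straight from the engine's record into the caller's
 * buffer. One copy, no allocation. */
size_t get_direct(const record_t *rec, char *dst, size_t cap)
{
    assert(rec != NULL && dst != NULL);
    size_t n = rec->len < cap ? rec->len : cap;
    memcpy(dst, rec->data, n);
    return n;
}
```

Both functions fill the caller's buffer with the same bytes; the second simply skips the intermediate buffer. Whether tokyocabinet can expose its internal record this way without modification is exactly the open question of this issue.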

It may be possible to fix these issues, either by modifying tokyocabinet itself or by implementing a new storage engine in D, perhaps based on `ocean.util.container.map.HashMap`. (The latter would require implementing a proper step iterator in the HashMap.)

Additional not-so-ideal points about tokyocabinet:
- Internally, 32-bit hashes are used.
- Judging from the `tcmapput()` function, there seems to be no rehashing of any kind. This makes the initial `num_records` and `load_factor` settings very important. In our case, though, the absence of rehashing may actually be desirable.
- There seems to be no built-in way to shrink the memory allocated by the database after a large number of records has been removed. This can happen, for example, after a data redistribution.
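
The sensitivity to the initial settings can be illustrated with a little arithmetic. The sketch below is not tokyocabinet's sizing formula. It assumes only that the bucket array is allocated once and never grows, that colliding records share a bucket, and that hashes are spread uniformly. The function names are hypothetical.

```c
#include <stddef.h>

/* Average number of records per bucket when the bucket array never grows. */
double avg_records_per_bucket(size_t num_records, size_t num_buckets)
{
    if (num_buckets == 0)
        return 0.0;
    return (double)num_records / (double)num_buckets;
}

/* Number of buckets that must be allocated up front so that the average
 * stays at or below target_load once expected_records have been stored. */
size_t buckets_for(size_t expected_records, double target_load)
{
    if (target_load <= 0.0)
        return 0;
    double need = (double)expected_records / target_load;
    size_t n = (size_t)need;
    return (double)n < need ? n + 1 : n;   /* round up */
}
```

Under these assumptions, a table sized for 250,000 buckets that ends up holding 1,000,000 records averages four records per bucket, and without rehashing it stays that way until records are removed.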
