Redesign katip-elasticsearch internals #72

This was discussed a bit in other channels. Tagging in @bitemyapp.

katip-elasticsearch is extremely naive and has some bad UX to boot. Right now it has a configurable pool of worker threads that pull from a `TBMQueue` and make individual `indexDocument` calls. This is not ideal because:
1. The user already configures a queue size for the scribe overall. It is confusing to do that and then be limited by a second, seemingly superfluous queue with its own independent size.
2. Individual `indexDocument` calls are slow and hard on the Elasticsearch server. We should be bulk indexing instead.

I think the right way to go is:
1. Try to drop this concept of a worker pool. You won't need to beat up your ES server from multiple threads if you're actually bulk indexing.
2. See if we can get away with no worker threads in general. The scribe itself is an independent thread.
3. I feel like we should accumulate logs and flush every X seconds or Y messages (both configurable, with defaults), or when the scribe is finalizing. This probably means we could use a faster or more appropriate data structure than `TBMQueue`. The main challenge, I think, is the X seconds part. We'd probably need to spin off an `Async` for flushing, but I'm not quite seeing how best to design it so that it flushes when full or after a certain amount of inactivity.
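
To make point 3 concrete, here is one possible shape for the flush loop. It is a rough sketch, not a proposal for the final API. Everything in it is hypothetical: `FlushConfig`, `collectBatch`, `flushLoop`, and `demo` are made-up names, a plain `TBQueue` plus a `TVar Bool` "closed" flag stands in for whatever structure we end up with, and the `flush` callback stands in for a real bulk-index call. The idea is a single consumer thread that blocks for the first message, starts a timer, and then races "next message" against "timer fired" against "scribe closed" in one STM transaction.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM
import Control.Monad (unless)
import Data.IORef

-- Hypothetical settings: flush after Y messages or X microseconds.
data FlushConfig = FlushConfig
  { maxBatchSize   :: Int -- Y: flush once this many messages are buffered
  , maxDelayMicros :: Int -- X: flush this long after the first buffered message
  }

data Step a = Item a | TimedOut | Closed

-- A TVar that flips to True after the delay. A forked thread is used
-- here instead of registerDelay so the sketch has no RTS requirements.
-- The thread simply exits after the delay if the batch filled up first.
newTimer :: Int -> IO (TVar Bool)
newTimer micros = do
  tv <- newTVarIO False
  _ <- forkIO (threadDelay micros >> atomically (writeTVar tv True))
  pure tv

-- Collect one batch. Returns the batch and whether the queue is closed
-- and fully drained, in which case the caller should stop.
collectBatch :: FlushConfig -> TBQueue a -> TVar Bool -> IO ([a], Bool)
collectBatch cfg q closed = do
  -- Block until the first message arrives, or until closed and empty.
  first <- atomically $
    (Just <$> readTBQueue q)
      `orElse` (Nothing <$ (readTVar closed >>= check))
  case first of
    Nothing -> pure ([], True)
    Just x  -> do
      timer <- newTimer (maxDelayMicros cfg)
      go timer 1 [x]
  where
    go timer n acc
      | n >= maxBatchSize cfg = pure (reverse acc, False)
      | otherwise = do
          -- Pending messages win over the timer, and the timer wins
          -- over shutdown, so closing never drops queued messages.
          step <- atomically $
            (Item <$> readTBQueue q)
              `orElse` (TimedOut <$ (readTVar timer >>= check))
              `orElse` (Closed <$ (readTVar closed >>= check))
          case step of
            Item x   -> go timer (n + 1) (x : acc)
            TimedOut -> pure (reverse acc, False)
            Closed   -> pure (reverse acc, True)

-- Run until the queue is closed and drained, handing each non-empty
-- batch to the flush action (a bulk-index call in the real scribe).
flushLoop :: FlushConfig -> TBQueue a -> TVar Bool -> ([a] -> IO ()) -> IO ()
flushLoop cfg q closed flush = do
  (batch, done) <- collectBatch cfg q closed
  unless (null batch) (flush batch)
  unless done (flushLoop cfg q closed flush)

-- Small demo: five queued messages, batch size 2, then finalize.
demo :: IO [[Int]]
demo = do
  q      <- newTBQueueIO 10
  closed <- newTVarIO False
  seen   <- newIORef []
  atomically $ mapM_ (writeTBQueue q) [1 .. 5]
  atomically $ writeTVar closed True
  flushLoop (FlushConfig 2 1000000) q closed (\b -> modifyIORef seen (b :))
  reverse <$> readIORef seen
```

Running `demo` yields `[[1,2],[3,4],[5]]`: two full batches, then the remainder is flushed on close. Note that the timer here starts at the first message of each batch, so it bounds how long a message can sit unflushed. That is slightly different from "after a certain amount of inactivity"; an inactivity timeout would restart the timer on every `Item` instead.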

