Redesign katip-elasticsearch internals #72

@MichaelXavier

Discussed a bit in other channels. Tagging in @bitemyapp

katip-elasticsearch is extremely naive and has some bad UX to boot. Right now it has a configurable pool of worker threads that pull from a TBMQueue and make individual indexDocument calls. This is not ideal because:

  1. The user already configures a queue size for the scribe overall. It is confusing when, having done that, they are still limited by this seemingly superfluous second queue with its own independent size.
  2. Calling indexDocument once per message is slow and hard on the Elasticsearch server. The scribe should be bulk indexing instead.

I think the right way to go is:

  1. Try to drop this concept of a worker pool. You won't need to beat up your ES server from multiple threads if you're actually bulk indexing.
  2. See if we can get away with no worker threads in general. The scribe itself is an independent thread.
  3. I feel like we should accumulate logs and flush every X seconds or every Y messages (both configurable, with defaults), or when the scribe is finalizing. This probably means we could use a faster, more appropriate data structure than TBMQueue. The main challenge, I think, is the X seconds part. I think we'd need to spin off an Async for flushing, but I'm not quite seeing how best to design it so that it flushes when the buffer is full or after a certain amount of inactivity.
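
To make point 3 concrete, here is a minimal sketch of one possible flusher design, using only `base` and `stm`. Every name in it (`Buffer`, `push`, `drain`, `flushLoop`, `closing`) is hypothetical and none of it is existing katip-elasticsearch code. The `flush` callback stands in for whatever bulk-indexing call the scribe ends up making. The idea is to block in STM until the buffer holds Y items or the scribe is closing, and to wrap that wait in `timeout` so it gives up after X microseconds. Note that this flushes at most X after the previous flush, which is slightly different from "X of inactivity", and that the buffer here is unbounded.

```haskell
import Control.Concurrent.STM
import Control.Monad (unless)
import System.Timeout (timeout)

-- | Hypothetical buffer: a running count plus the pending items, newest first.
type Buffer a = TVar (Int, [a])

-- | Producer side: append one item to the buffer.
push :: Buffer a -> a -> STM ()
push buf x = modifyTVar' buf (\(n, xs) -> (n + 1, x : xs))

-- | Take everything currently buffered, oldest first, and reset the buffer.
drain :: Buffer a -> STM [a]
drain buf = do
  (_, xs) <- readTVar buf
  writeTVar buf (0, [])
  pure (reverse xs)

-- | Flush when the buffer reaches maxItems, when maxDelayMicros has passed
-- since the last flush, or when the closing flag is set. Returns after the
-- final flush once closing is True. Assumes producers stop pushing before
-- closing is set.
flushLoop
  :: Int             -- ^ Y: flush once this many items are buffered
  -> Int             -- ^ X: maximum wait between flushes, in microseconds
  -> ([a] -> IO ())  -- ^ stand-in for the bulk-indexing action
  -> Buffer a
  -> TVar Bool       -- ^ set to True when the scribe is finalizing
  -> IO ()
flushLoop maxItems maxDelayMicros flush buf closing = loop
  where
    loop = do
      -- Wait until full or closing; timeout interrupts the STM wait.
      _ <- timeout maxDelayMicros . atomically $ do
        (n, _) <- readTVar buf
        done <- readTVar closing
        check (n >= maxItems || done)
      -- Read the batch and the closing flag in one transaction.
      (batch, done) <- atomically ((,) <$> drain buf <*> readTVar closing)
      unless (null batch) (flush batch)
      unless done loop
```

With this shape, the flusher is the only extra thread: it could be started with `forkIO` or an `Async`, and the scribe's finalizer would set `closing` to `True` and wait for the loop to return. A real version would probably also want a cap on the buffer so producers get back-pressure.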
