Add Etcd as a data store backend #742

Draft
wants to merge 8 commits into
base: main

williamdes
Contributor

Ref: #634

See:

Goals: provide a backend to store everything I need to store

  • Data store
    • read
    • write
    • delete
  • Blob store (etcd is not made for large data)
  • Full-text store
  • Lookup store
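The read, write, and delete goals listed above can be sketched as a small key-value interface. This is a minimal illustration only: `KeyValueStore` and `MemoryStore` are hypothetical names, not Stalwart's actual store types, and an in-memory map stands in for an etcd client.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal key-value interface (an assumption for illustration,
/// not Stalwart's actual store API).
trait KeyValueStore {
    /// Returns the value stored under `key`, if any.
    fn read(&self, key: &[u8]) -> Option<Vec<u8>>;
    /// Stores `value` under `key`, replacing any previous value.
    fn write(&mut self, key: &[u8], value: &[u8]);
    /// Removes `key`; returns true if the key existed.
    fn delete(&mut self, key: &[u8]) -> bool;
}

/// In-memory stand-in for a real etcd client.
#[derive(Default)]
struct MemoryStore {
    entries: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KeyValueStore for MemoryStore {
    fn read(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.entries.get(key).cloned()
    }

    fn write(&mut self, key: &[u8], value: &[u8]) {
        self.entries.insert(key.to_vec(), value.to_vec());
    }

    fn delete(&mut self, key: &[u8]) -> bool {
        self.entries.remove(key).is_some()
    }
}
```

A real backend would replace the map with network calls and return errors instead of plain values.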

https://stalw.art/docs/get-started

Thank you Alvin for the TiKV implementation

Co-Authored-By: Alvin Peters <[email protected]>
@CLAassistant

CLAassistant commented Sep 8, 2024

CLA assistant check
All committers have signed the CLA.

@williamdes
Contributor Author

@mdecimus, can you backport be18ddf into main?

STORE=etcd cargo test store_tests --no-default-features --features=etcd -- --nocapture
# or
STORE=sqlite cargo test store_tests --no-default-features --features=sqlite -- --nocapture

Before this patch, the tests could not be built with every store type disabled except one.
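The commands above pick a backend through the `STORE` environment variable. How the test suite actually reads that variable is not shown in this thread, so the following is only a hypothetical sketch of one way to map it to a backend; `StoreKind`, `parse_store_kind`, and `store_kind_from_env` are invented names.

```rust
use std::env;

/// Hypothetical set of backends a test run can target.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StoreKind {
    Etcd,
    Sqlite,
}

/// Maps a `STORE` value such as "etcd" or "sqlite" to a backend.
/// Surrounding whitespace and letter case are ignored.
fn parse_store_kind(value: &str) -> Result<StoreKind, String> {
    match value.trim().to_ascii_lowercase().as_str() {
        "etcd" => Ok(StoreKind::Etcd),
        "sqlite" => Ok(StoreKind::Sqlite),
        other => Err(format!("unsupported STORE value: {other}")),
    }
}

/// Reads `STORE` from the environment (assumed behavior, not the real harness).
fn store_kind_from_env() -> Result<StoreKind, String> {
    let value = env::var("STORE").map_err(|_| "STORE is not set".to_string())?;
    parse_store_kind(&value)
}
```

Keeping the parsing separate from the environment lookup makes the mapping testable without touching process-wide state.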

@@ -71,6 +71,8 @@ impl DistributedBlob {
Store::MySQL(store) => store.get_blob(key, read_range).await,
#[cfg(feature = "rocks")]
Store::RocksDb(store) => store.get_blob(key, read_range).await,
#[cfg(feature = "etcd")]
Store::Etcd(_) => unimplemented!(),
Contributor Author

How can I declare that my backend cannot handle blobs, so that I can avoid lines like this one?

Contributor Author

@mdecimus could you help me please?

Member

You should implement etcd as a lookup store, not a data store. Check the Redis implementation for guidelines.

Contributor Author

Well, I want to store data but not blobs:

  • Data store
  • Full-text store
  • Lookup store

https://stalw.art/docs/get-started

Should I still implement blobs even though this is not recommended with etcd?
I really want this implementation to check every box on the https://stalw.art/docs/get-started page except blobs, since I want Garage/S3 to store them.

Member

Well, I want to store data but not blobs:

That is not possible with the current design. A data store also has to offer blob and lookup functionality in order to work.

I don't have experience with etcd, but according to their website it was designed as a store for settings and metadata. For this reason it should be added as a lookup store rather than a data store.

Contributor Author

Okay, but my real concern is this: if I implement it as a lookup store, I cannot store data in it, and there is no FTS.
I will basically have done the same thing as Redis, and that does not solve my infrastructure problem.

Member

What is your use case exactly? And why are you choosing Etcd over FoundationDB or PostgreSQL? It makes sense as a lookup store but I wouldn't use it for FTS or blob storage as it is not the right tool for the job.

Contributor Author

My use case is finding the right software that scales across multiple data centers: multi-master, multi-read. The worst setup you could imagine.

Garage works perfectly out of the box for S3, great. They have a KV API, but it is not what we need for this project.

Etcd seems to be built to work in the same way, but I can only be sure by implementing it.

PostgreSQL is terrible to configure for multiple data centers. A nightmare.
FoundationDB should have worked but did not, and it was nonsense to set up across multiple data centers.

I use LDAP for accounts, which is perfect: no need for write access. There are no writes to be made on my setup anyway, and with so few writes, managing conflicts would be easy.

Multiple data centers with multi-read and multi-write is a complicated setup, I know ^^
The average round trip is 60ms in the worst case.

Can you elaborate more about why it would not be suitable for FTS?
Does it store large data?

Member

Does it store large data?

Not if you use an external blob store. If you are sure that etcd can store large amounts of data, go ahead, but it has to implement the blob store methods even if they are not used.

Also, if this feature is merged, it won't be distributed by default in order to keep the binary size small, since etcd is not a popular choice for storing indexes.
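The requirement that a backend implement the blob methods even when they are unused, together with the earlier question about avoiding `unimplemented!()`, suggests one possible shape: stub methods that return an error instead of panicking. The sketch below is an assumption, not the project's API. `BlobStore`, `StoreError`, and `EtcdStore` are hypothetical, the methods are synchronous for brevity (the `get_blob` call in the diff is async), and the thread does not say whether error-returning stubs would be accepted.

```rust
use std::ops::Range;

/// Hypothetical error type; the real project has its own error handling.
#[derive(Debug, PartialEq, Eq)]
enum StoreError {
    /// The backend deliberately does not support this operation.
    Unsupported(&'static str),
}

/// Hypothetical blob interface, loosely modeled on `get_blob(key, read_range)`.
trait BlobStore {
    fn get_blob(&self, key: &[u8], read_range: Range<usize>)
        -> Result<Option<Vec<u8>>, StoreError>;
    fn put_blob(&mut self, key: &[u8], data: &[u8]) -> Result<(), StoreError>;
    fn delete_blob(&mut self, key: &[u8]) -> Result<bool, StoreError>;
}

/// Placeholder for an etcd-backed store; holds no client in this sketch.
struct EtcdStore;

const NO_BLOBS: &str = "etcd backend: configure an external blob store";

impl BlobStore for EtcdStore {
    // Each method reports a recoverable error rather than panicking.
    fn get_blob(&self, _key: &[u8], _read_range: Range<usize>)
        -> Result<Option<Vec<u8>>, StoreError> {
        Err(StoreError::Unsupported(NO_BLOBS))
    }

    fn put_blob(&mut self, _key: &[u8], _data: &[u8]) -> Result<(), StoreError> {
        Err(StoreError::Unsupported(NO_BLOBS))
    }

    fn delete_blob(&mut self, _key: &[u8]) -> Result<bool, StoreError> {
        Err(StoreError::Unsupported(NO_BLOBS))
    }
}
```

With this shape, a misconfigured deployment that routes blobs to etcd surfaces an error the caller can handle, rather than aborting the task.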

3 participants