Smarter metadata fetching #1280

Open
@Lorak-mmk

Description

This is an issue about improving the way metadata is fetched. Below I discuss our previous ideas, and the currently proposed approach.

History

Metadata fetching is expensive (#786). By default, the driver fetches metadata every minute, and each fetch requires reading the system_schema tables, which may contain a lot of data.

We previously considered extracting the metadata from Session (#595), so that fetching it would be optional.
This proved difficult, however: it's really hard to design a good API given all the considerations.
Even if we did design a good API, it would still be more difficult to use than the current one, and it would not address the real issue, only hide it. The issue is that fetching metadata costs a lot. Hiding it does not make it cheaper, it only makes the cost opt-in.

We had ideas to not fetch metadata periodically:

The question arises: why did we even implement periodic fetching, instead of handling the events?
I did not work on the Rust Driver back then, but as I understand it, the goal was to avoid a request storm when a lot of schema changes are made in a short time. I think this is a very valid reason.

Current idea

Assumption

I would still like to get rid of periodically fetching full metadata. I think we can do this, but there is one assumption that must hold: after we receive schema change event X on connection C, we can query system / system_schema tables on connection C, and their contents will already include changes made by X.

I don't think it's unreasonable to expect it from the server, and @avikivity did confirm that Scylla does behave this way.

First optimization - optional fetch

With the above assumption, we can notice that if we receive an event, fetch the metadata some time after that, and then receive no more events, we know that this metadata is consistent. That means we can skip periodic fetching until we receive another event.
So the proposed change is:

  • When receiving a metadata-related event, set a flag saying that the metadata is dirty.
  • When the fetch period (60s by default) elapses, fetch metadata only if the dirty flag is set (and then unset it).

This step is enough to fully skip metadata fetches other than the initial one, when the cluster is stable (no topology changes, no schema changes).
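
A minimal sketch of the dirty flag, to make the proposal concrete. `MetadataDirtyFlag` and its methods are hypothetical names, not existing driver API. The sketch assumes the flag is shared between the event handler and the periodic refresh task, and it clears the flag before the fetch starts, so an event that arrives while the fetch is in progress is not lost.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical flag shared by the event handler and the refresh task.
pub struct MetadataDirtyFlag {
    dirty: AtomicBool,
}

impl MetadataDirtyFlag {
    /// Starts dirty, so the initial fetch always happens.
    pub fn new() -> Self {
        Self {
            dirty: AtomicBool::new(true),
        }
    }

    /// Called when a metadata-related event is received.
    pub fn mark_dirty(&self) {
        self.dirty.store(true, Ordering::Release);
    }

    /// Called when the fetch period elapses. Returns `true` if a fetch is
    /// needed and clears the flag in the same atomic step, so an event
    /// arriving during the fetch marks the metadata dirty again.
    pub fn take(&self) -> bool {
        self.dirty.swap(false, Ordering::AcqRel)
    }
}
```

With this shape, the refresh task would call `take()` once per period and skip the fetch entirely whenever it returns `false`.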

Second optimization - partial fetch

It is based on the first one. Notice that metadata consists of:

  • topology
  • schema of each keyspace (they are fully independent of each other)

We can have a flag for each of them, instead of one global flag. Then we only need to fetch the parts that have the flag set, instead of fetching the whole metadata. It should not be difficult to implement.
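
One possible shape for the per-part flags, as a sketch only. `DirtyParts` is a hypothetical type. It assumes keyspaces are identified by name and that the refresh task takes the whole set at once before fetching.

```rust
use std::collections::HashSet;

/// Hypothetical record of which metadata parts need refetching.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct DirtyParts {
    /// Set by topology-related events.
    pub topology: bool,
    /// Names of keyspaces touched by schema change events.
    pub keyspaces: HashSet<String>,
}

impl DirtyParts {
    pub fn mark_topology(&mut self) {
        self.topology = true;
    }

    pub fn mark_keyspace(&mut self, name: &str) {
        self.keyspaces.insert(name.to_owned());
    }

    /// True when nothing needs to be fetched.
    pub fn is_clean(&self) -> bool {
        !self.topology && self.keyspaces.is_empty()
    }

    /// Returns the parts to fetch and resets `self` to the clean state.
    pub fn take(&mut self) -> DirtyParts {
        std::mem::take(self)
    }
}
```

Repeated events for the same keyspace collapse into a single set entry, so a burst of changes in one keyspace still results in one fetch of that keyspace.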

Third optimization - exponential backoff

Also based on the first optimization. The reason we do periodic fetches instead of reacting to events is to avoid a request storm when many schema changes are made.
We could, however, try to heuristically behave better when there are only a few changes.
Why do this? If we ignore events, then the user needs to wait up to 60s (default) for the changes to be noticed by the driver. It could be beneficial for the driver to notice changes faster.

How to do this? We can react instantly to the first event or two - if that's all, then great, we applied the changes quickly.
If more events are received, we would delay handling them (as we do now). The delay could grow exponentially, going up to the current 60s.
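
The delay schedule could be sketched as follows. `RefreshBackoff`, the number of immediately handled events, and the initial delay are all hypothetical and only illustrate the idea. When to call `reset` (for example, after a quiet period with no events) is an assumption this issue does not settle.

```rust
use std::time::Duration;

/// Hypothetical backoff state for reacting to metadata events.
pub struct RefreshBackoff {
    initial: Duration,
    max: Duration,
    /// How many events at the start of a burst are handled with no delay.
    immediate_events: u32,
    events_in_burst: u32,
}

impl RefreshBackoff {
    pub fn new(initial: Duration, max: Duration, immediate_events: u32) -> Self {
        Self {
            initial,
            max,
            immediate_events,
            events_in_burst: 0,
        }
    }

    /// Returns how long to wait before refreshing after the event that was
    /// just received. The delay doubles with each further event in the
    /// burst and is capped at `max`.
    pub fn on_event(&mut self) -> Duration {
        let n = self.events_in_burst;
        self.events_in_burst = self.events_in_burst.saturating_add(1);
        if n < self.immediate_events {
            return Duration::ZERO;
        }
        // Cap the exponent so the shift cannot overflow a u32.
        let exp = (n - self.immediate_events).min(31);
        self.initial.saturating_mul(1u32 << exp).min(self.max)
    }

    /// Ends the current burst, so the next event is handled immediately.
    pub fn reset(&mut self) {
        self.events_in_burst = 0;
    }
}
```

For instance, with an initial delay of 1s, a cap of 60s, and two immediate events, the successive delays would be 0s, 0s, 1s, 2s, 4s, 8s, 16s, 32s, and then 60s for every further event.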

Further optimizations - more granularity, client side drop handling

I don't think it's important (the first three optimizations are most likely more than enough to not worry about schema fetches again), but we could possibly make the fetches even more granular than the keyspace (for example, fetch only the changed table).
We could also handle some DROP events without fetching - just dropping the ks/table/udt specified by the event from metadata.

It is likely much more complicated - keyspaces are independent, but tables / mv / udts are not, so we would have to think about how given events interact with one another.
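
For illustration only, client-side DROP handling over a simplified in-memory schema could look like the sketch below. `KeyspaceMeta`, `DropEvent` and `apply_drop` are hypothetical and do not mirror the driver's real metadata types. The sketch deliberately ignores the dependencies between tables, materialized views and UDTs mentioned above, which is exactly the part that would need careful thought.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, heavily simplified keyspace metadata.
#[derive(Debug, Default)]
pub struct KeyspaceMeta {
    pub tables: HashSet<String>,
    pub udts: HashSet<String>,
}

/// Hypothetical DROP events, reduced to the names they carry.
pub enum DropEvent {
    Keyspace { name: String },
    Table { keyspace: String, name: String },
    Udt { keyspace: String, name: String },
}

/// Applies a DROP event without any fetch. Returns `false` if the target
/// was not present, in which case the caller may prefer a regular refetch.
pub fn apply_drop(schema: &mut HashMap<String, KeyspaceMeta>, event: &DropEvent) -> bool {
    match event {
        DropEvent::Keyspace { name } => schema.remove(name).is_some(),
        DropEvent::Table { keyspace, name } => schema
            .get_mut(keyspace)
            .map_or(false, |ks| ks.tables.remove(name)),
        DropEvent::Udt { keyspace, name } => schema
            .get_mut(keyspace)
            .map_or(false, |ks| ks.udts.remove(name)),
    }
}
```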
