[Feature request] Speed up GetSchema

Loading the schema from a DB with a large number of measurements takes a long time. I've observed anywhere from 8-20 minutes before `GetSchema` completes.

I suspect the long load times are caused by this code: https://github.com/toni-moreno/syncflux/blob/9d69de4d5d32651ba58e7a8e82dea3dd120e2144/pkg/agent/hacluster.go#L147

It makes an individual API call for each measurement to fetch its field keys.

I was thinking that it may be possible to use `show field keys on <sdb>`, so that the API responds with field keys for ALL measurements in the selected DB in a single request. I think this would work, but I haven't investigated whether there are any size limits on InfluxDB JSON responses or in the REST client used.
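To illustrate the bulk idea, here is a rough Go sketch that decodes one `show field keys on <sdb>` response body into a map of measurement to field keys and types. It assumes the usual InfluxDB 1.x `/query` JSON shape (`results[].series[]` with `name`, `columns`, `values`) and that each value row is `[fieldKey, fieldType]`. `FieldKeysByMeasurement` and `showFieldKeysResponse` are hypothetical names, not existing syncflux code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// showFieldKeysResponse mirrors the assumed JSON shape of an InfluxDB 1.x
// /query response for SHOW FIELD KEYS (hypothetical type, for illustration).
type showFieldKeysResponse struct {
	Results []struct {
		Error  string `json:"error"`
		Series []struct {
			Name   string     `json:"name"`   // measurement name
			Values [][]string `json:"values"` // assumed rows: [fieldKey, fieldType]
		} `json:"series"`
	} `json:"results"`
}

// FieldKeysByMeasurement (hypothetical helper) converts one bulk response
// body into measurement -> fieldKey -> fieldType.
func FieldKeysByMeasurement(body []byte) (map[string]map[string]string, error) {
	var resp showFieldKeysResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode field keys response: %w", err)
	}
	out := make(map[string]map[string]string)
	for _, r := range resp.Results {
		if r.Error != "" {
			return nil, fmt.Errorf("influxdb returned an error: %s", r.Error)
		}
		for _, s := range r.Series {
			fields := out[s.Name]
			if fields == nil {
				fields = make(map[string]string)
				out[s.Name] = fields
			}
			for _, row := range s.Values {
				if len(row) < 2 {
					continue // skip rows that do not match the assumed shape
				}
				fields[row[0]] = row[1]
			}
		}
	}
	return out, nil
}
```

With a map like this, the per-measurement lookup could read from memory instead of issuing one request per measurement. Whether a multi-megabyte body is acceptable for the client is still the open question above.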

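With 1000 measurements, the API took 12s to respond with a 1.72MB JSON payload. By comparison, a request for the fields of a single measurement took 500-800ms across a small sample of requests. At that rate, 1000 per-measurement requests issued one after another would take roughly 8-13 minutes, which is in line with the load times above.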

An alternative could be to split the list of measurements and fetch field keys in batches, but this could also be very slow. For example, `show field keys from disk,diskio,interrupts,kernel` would take upward of 12s, sometimes even returning an empty response. Maybe InfluxDB does not index on this sort of query?
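For reference, the batching variant could build its queries roughly like this. This is only a sketch: `BatchFieldKeyQueries` and `quoteIdent` are hypothetical helpers, the batch size is an arbitrary parameter, and the quoting assumes InfluxQL's double-quoted identifiers with backslash escapes.

```go
package main

import "strings"

// quoteIdent (hypothetical helper) double-quotes a measurement name so that
// names with special characters remain valid in the query. Assumes InfluxQL
// identifier quoting with backslash-escaped quotes and backslashes.
func quoteIdent(name string) string {
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return `"` + r.Replace(name) + `"`
}

// BatchFieldKeyQueries (hypothetical helper) splits the measurement list into
// groups of at most batchSize and returns one SHOW FIELD KEYS query per group.
func BatchFieldKeyQueries(measurements []string, batchSize int) []string {
	if batchSize < 1 {
		batchSize = 1
	}
	var queries []string
	for start := 0; start < len(measurements); start += batchSize {
		end := start + batchSize
		if end > len(measurements) {
			end = len(measurements)
		}
		quoted := make([]string, 0, end-start)
		for _, m := range measurements[start:end] {
			quoted = append(quoted, quoteIdent(m))
		}
		queries = append(queries, "SHOW FIELD KEYS FROM "+strings.Join(quoted, ","))
	}
	return queries
}
```

Given the timings above, this only pays off if a batch returns faster than the same measurements queried one by one, which was not the case in my tests.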

In my limited testing, I am running InfluxDB 1.7.7, with queries routed through influxdb-srelay. Queries made directly to the master were slightly faster: all fields were returned in 4s, and batches of 4 measurements varied between 4-12s per request.

It would be awesome if we could set a flag at the command line to force bulk loading of all field keys in a single request, or have some logic that automatically switches to bulk loading once a DB has more than a certain number of measurements. If batching requests is workable with additional configuration in InfluxDB, that would also be great.
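The switching logic could be as small as the following sketch. All names here (`FieldKeyStrategy`, `ChooseFieldKeyStrategy`, `forceBulk`, `bulkThreshold`) are hypothetical; the two inputs would come from a command-line flag or the config file, and a threshold of 0 is assumed to mean "never switch automatically".

```go
package main

// FieldKeyStrategy (hypothetical type) names the way field keys are loaded.
type FieldKeyStrategy int

const (
	// PerMeasurement issues one field-key request per measurement,
	// as described at the top of this issue.
	PerMeasurement FieldKeyStrategy = iota
	// Bulk issues a single request for all measurements in the DB.
	Bulk
)

// ChooseFieldKeyStrategy (hypothetical helper) picks bulk loading when it is
// forced, or when the DB has at least bulkThreshold measurements.
// A bulkThreshold of 0 or less disables the automatic switch.
func ChooseFieldKeyStrategy(numMeasurements, bulkThreshold int, forceBulk bool) FieldKeyStrategy {
	if forceBulk {
		return Bulk
	}
	if bulkThreshold > 0 && numMeasurements >= bulkThreshold {
		return Bulk
	}
	return PerMeasurement
}
```

Keeping the per-measurement path as the default would leave existing setups unchanged unless they opt in.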

I'd be happy to submit a PR with my proposed solution, but would appreciate some feedback on the correct approach to take. 



