Description
I'm still seeing a large performance gap between running a query through the CLI (which calls the HTTP API under the hood) and running the same query over FlightRPC with the pyarrow client.
I'll try to put together a minimal example for this.
I'm using the v3.0.0 Enterprise build, with compaction enabled.
In the meantime, here are the server logs for the same SQL query on both paths:
Using the CLI: 32 ms
influxdb3 | SELECT DISTINCT ON (tid) lat, lon, tid, battery, altitude, time
influxdb3 | FROM telemetry
influxdb3 | WHERE tid IN ('1', '2', '3', '4', '5', '6', '7', '8', '9', '10')
influxdb3 | ORDER BY tid, time DESC;
influxdb3 | query_params=Params { } issue_time=2025-04-20T17:52:56.673480632+00:00 partitions=8 parquet_files=16 deduplicated_partitions=3 deduplicated_parquet_files=11 plan_duration_secs=0.031543458 permit_duration_secs=0.000114208 execute_duration_secs=0.000655875 end2end_duration_secs=0.032463916 compute_duration_secs=0.00020381 max_memory=0 ingester_metrics=IngesterMetrics { latency_to_plan = 0ns, latency_to_full_data = 0ns, response_rows = 0, partition_count = 0, response_size = 0 } success=true running=false cancelled=false
Using FlightRPC: 617 ms
influxdb3 | SELECT DISTINCT ON (tid) lat, lon, tid, battery, altitude, time
influxdb3 | FROM telemetry
influxdb3 | WHERE tid IN ('1', '2', '3', '4', '5', '6', '7', '8', '9', '10')
influxdb3 | ORDER BY tid, time DESC;
influxdb3 | query_params=Params { } issue_time=2025-04-20T17:53:48.931954711+00:00 partitions=8 parquet_files=16 deduplicated_partitions=3 deduplicated_parquet_files=11 plan_duration_secs=0.060829459 permit_duration_secs=0.001200208 execute_duration_secs=0.555666209 end2end_duration_secs=0.617759376 compute_duration_secs=0.095443782 max_memory=25699024 ingester_metrics=IngesterMetrics { latency_to_plan = 0ns, latency_to_full_data = 0ns, response_rows = 0, partition_count = 0, response_size = 0 } success=true running=false cancelled=false
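To see where the extra time goes, the duration fields of the two log lines above can be compared phase by phase. This is only a rough sketch: the log lines are abbreviated to their duration fields, and the helper names (`parse_durations_ms`, `phase_gap_ms`) are made up for illustration.

```python
import re

# Duration fields copied from the two query log lines above (other fields omitted).
CLI_LOG = (
    "plan_duration_secs=0.031543458 permit_duration_secs=0.000114208 "
    "execute_duration_secs=0.000655875 end2end_duration_secs=0.032463916 "
    "compute_duration_secs=0.00020381"
)
FLIGHT_LOG = (
    "plan_duration_secs=0.060829459 permit_duration_secs=0.001200208 "
    "execute_duration_secs=0.555666209 end2end_duration_secs=0.617759376 "
    "compute_duration_secs=0.095443782"
)


def parse_durations_ms(line: str) -> dict[str, float]:
    """Extract every `<phase>_duration_secs=<value>` pair, converted to milliseconds."""
    pairs = re.findall(r"(\w+)_duration_secs=([0-9.]+)", line)
    return {phase: float(value) * 1000.0 for phase, value in pairs}


def phase_gap_ms(fast: dict[str, float], slow: dict[str, float]) -> dict[str, float]:
    """Per-phase difference (slow minus fast) in milliseconds."""
    return {phase: slow[phase] - fast[phase] for phase in fast}


cli = parse_durations_ms(CLI_LOG)
flight = parse_durations_ms(FLIGHT_LOG)
gap = phase_gap_ms(cli, flight)

for phase in cli:
    print(f"{phase:>8}: cli={cli[phase]:9.3f} ms  flight={flight[phase]:9.3f} ms  gap={gap[phase]:9.3f} ms")
```

With the numbers from these two log lines, the `execute` phase shows by far the largest difference, so that is probably the first place to look.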
Expected behaviour:
Same performance between the two paths
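Until there is a proper minimal example, a client-side timing harness along these lines could be used to compare the two paths. It is a sketch under several assumptions that are not confirmed in this report: the database name and token are placeholders, the Flight ticket is assumed to be a JSON body with `database`, `sql_query` and `query_type` keys, the HTTP path is assumed to be `/api/v3/query_sql` returning a JSON array of rows, and both protocols are assumed to be served without TLS on port 8181.

```python
import json
import statistics
import time
from typing import Callable

# Placeholders, not taken from the report: adjust to your setup.
HOST = "localhost"
PORT = 8181            # assumed to serve both HTTP and Flight gRPC
DATABASE = "mydb"      # hypothetical database name
TOKEN = "my-token"     # hypothetical auth token

SQL = (
    "SELECT DISTINCT ON (tid) lat, lon, tid, battery, altitude, time "
    "FROM telemetry "
    "WHERE tid IN ('1', '2', '3', '4', '5', '6', '7', '8', '9', '10') "
    "ORDER BY tid, time DESC"
)


def build_ticket(database: str, sql: str) -> bytes:
    """JSON ticket body (assumed shape for InfluxDB 3 Flight queries)."""
    body = {"database": database, "sql_query": sql, "query_type": "sql"}
    return json.dumps(body).encode("utf-8")


def make_flight_query() -> Callable[[], int]:
    """Return a callable that runs SQL over Flight, reusing one connection."""
    import pyarrow.flight as flight  # lazy import: only needed for the Flight path

    client = flight.FlightClient(f"grpc://{HOST}:{PORT}")
    options = flight.FlightCallOptions(
        headers=[(b"authorization", f"Bearer {TOKEN}".encode("utf-8"))]
    )
    ticket = flight.Ticket(build_ticket(DATABASE, SQL))

    def run() -> int:
        return client.do_get(ticket, options).read_all().num_rows

    return run


def query_http() -> int:
    """Run SQL over the HTTP API (assumed endpoint) and return the row count."""
    import urllib.parse
    import urllib.request

    params = urllib.parse.urlencode({"db": DATABASE, "q": SQL, "format": "json"})
    request = urllib.request.Request(
        f"http://{HOST}:{PORT}/api/v3/query_sql?{params}",
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    with urllib.request.urlopen(request) as response:
        return len(json.loads(response.read()))


def time_call(fn: Callable[[], object], repeats: int = 10) -> dict[str, float]:
    """Call fn `repeats` times and summarise wall-clock latency in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


def main() -> None:
    """Time both paths against a running server."""
    print("http  ", time_call(query_http))
    print("flight", time_call(make_flight_query()))
```

Calling `main()` against a running instance prints min/median/max latency for each path. The Flight client is created once and reused so that connection setup is not counted in every sample; the server-side query log remains the better source for the per-phase breakdown.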
Actual behaviour:
The same query takes about 617 ms end to end over FlightRPC, versus about 32 ms through the CLI.
Environment info:
- System info: Darwin 24.3.0 arm64
- I'm running influxdb3 in Docker with a volume; the object store is set to file
Config:
- INFLUXDB3_OBJECT_STORE=file
- INFLUXDB3_DB_DIR=/var/lib/influxdb3
- INFLUXDB3_ENTERPRISE_LICENSE_EMAIL=
- INFLUXDB3_ENTERPRISE_MODE=all
- INFLUXDB3_HTTP_BIND_ADDR=0.0.0.0:8181
- INFLUXDB3_MAX_HTTP_REQUEST_SIZE=20971520
- LOG_FILTER=info
- INFLUXDB3_WAL_FLUSH_INTERVAL=1000ms
- INFLUXDB3_NODE_IDENTIFIER_PREFIX=node-1
- INFLUXDB3_ENTERPRISE_CLUSTER_ID=cluster-1