| id | historical |
|---|---|
| title | Historical service |
| sidebar_label | Historical |
The Historical service is responsible for storing and querying historical data. Historical services cache data segments on local disk and serve queries from that cache as well as from an in-memory cache.
For Apache Druid Historical service configuration, see Historical configuration.
For basic tuning guidance for the Historical service, see Basic cluster tuning.
For a list of API endpoints supported by the Historical service, see the Service status API reference.
To run the Historical service, start it with the following main class and arguments: `org.apache.druid.cli.Main server historical`
Each Historical service copies or pulls segment files from deep storage to local disk in an area called the segment cache. To configure the size and location of the segment cache on each Historical service, set the druid.segmentCache.locations property.
For more information, see Segment cache size.
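As a rough illustration of the property's shape, the sketch below renders a druid.segmentCache.locations line as a JSON array of objects with `path` and `maxSize` fields. The helper name, the example path, and the size are assumptions made for this example; they are not defaults or recommendations.

```python
import json


def segment_cache_locations_property(locations):
    """Render a druid.segmentCache.locations line for runtime.properties.

    `locations` is a list of (path, max_size_bytes) pairs. This helper is
    hypothetical and only illustrates the JSON shape of the property value.
    """
    value = [
        {"path": path, "maxSize": max_size_bytes}
        for path, max_size_bytes in locations
    ]
    # Compact separators keep the value on a single properties line.
    return "druid.segmentCache.locations=" + json.dumps(value, separators=(",", ":"))


# Assumed example: one cache location capped at 300 GB (decimal).
line = segment_cache_locations_property([("/mnt/druid/segment-cache", 300_000_000_000)])
print(line)
```

Passing several pairs yields several entries in the array, one for each cache location on the host.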
The Coordinator controls the assignment of segments to Historicals and the balance of segments between Historicals. Historical services do not communicate directly with each other. The Coordinator sends segment load and drop requests to each Historical over HTTP, and each Historical exposes an HTTP endpoint for the Coordinator to poll for the current state of its segment assignments.
When a Historical service receives a load request, it checks its own segment cache. If no information about the segment exists there, the Historical uses the segment metadata included in the request — including where the segment is located in deep storage and how to decompress and process it — to pull the segment from deep storage.
For more information about segment metadata and Druid segments in general, see Segments.
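The load path described above can be sketched as a cache check followed by a conditional pull. This is a simplified model, not Druid's implementation: the class names, the `load_spec` field, and the returned paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LoadRequest:
    segment_id: str
    # Hypothetical stand-in for the segment metadata in the request:
    # where the segment lives in deep storage and how to process it.
    load_spec: dict


@dataclass
class HistoricalSketch:
    # Maps segment_id -> local path in the segment cache.
    segment_cache: dict = field(default_factory=dict)
    # Records which segments were pulled, so the behavior is observable.
    pulled: list = field(default_factory=list)

    def pull_from_deep_storage(self, request):
        # Placeholder for downloading and decompressing the segment.
        self.pulled.append(request.segment_id)
        return f"/segment-cache/{request.segment_id}"

    def handle_load(self, request):
        # Pull only when the local segment cache has no entry for the segment.
        if request.segment_id not in self.segment_cache:
            self.segment_cache[request.segment_id] = self.pull_from_deep_storage(request)
        return self.segment_cache[request.segment_id]


historical = HistoricalSketch()
request = LoadRequest("wikipedia_2024-01-01", {"type": "s3", "key": "example/index.zip"})
first = historical.handle_load(request)
second = historical.handle_load(request)  # served from the local cache, no second pull
```

In this model, a repeated load request for the same segment returns the cached path without touching deep storage again.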
After a Historical service pulls down and processes a segment from deep storage, Druid advertises the segment as being available for queries from the Broker. The Historical exposes its current set of served segments via an HTTP endpoint (/druid-internal/v1/segments), which the Broker polls to learn what data each Historical is serving.
For more information about how the Broker determines what data is available for queries, see Broker.
To make data from the segment cache available for querying as soon as possible, Historical services search the local segment cache upon startup and advertise the segments found there.
The segment cache uses memory mapping. The cache consumes memory from the underlying operating system so Historicals can hold parts of segment files in memory to increase query performance at the data level. The memory available to this in-memory segment cache is affected by the size of the Historical JVM, its heap and direct memory buffers, and other services running on the operating system itself.
At query time, if the required part of a segment file is available in the memory-mapped cache, or "page cache", the Historical reuses it and reads it directly from memory. If it is not in the memory-mapped cache, the Historical reads that part of the segment from disk. In this case, the newly read data can evict other segment data from memory. This means that the closer free operating system memory is to druid.server.maxSize, the more likely it is that segment data is available in memory, which reduces query times. Conversely, the lower the free operating system memory, the more likely a Historical is to read segments from disk.
Note that this memory-mapped segment cache is in addition to other query-level caches.
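To make the relationship between free operating system memory and druid.server.maxSize concrete, the sketch below computes the ratio between the two. The function name and the example sizes are assumptions; the result is only a coarse indicator of how much segment data could be resident in the page cache at once, not a prediction of query latency.

```python
def page_cache_ratio(free_os_memory_bytes, server_max_size_bytes):
    """Return free OS memory as a fraction of druid.server.maxSize, capped at 1.0.

    A value near 1.0 suggests most segment data could stay memory-mapped;
    a value near 0.0 suggests frequent reads from disk. Hypothetical helper.
    """
    if server_max_size_bytes <= 0:
        raise ValueError("server_max_size_bytes must be positive")
    return min(1.0, free_os_memory_bytes / server_max_size_bytes)


GIB = 1024 ** 3

# Assumed example: 64 GiB of free memory against a 256 GiB maxSize.
ratio = page_cache_ratio(64 * GIB, 256 * GIB)
print(ratio)  # 0.25
```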
You can configure a Historical service to log and report metrics for every query it services. For information on querying Historical services, see Querying.