Hoist uses Hazelcast to coordinate multiple application instances running as a cluster. The
clustering system provides distributed data structures (caches, maps, topics), primary instance
coordination, and pub/sub messaging, all managed through BaseService factory methods and
ClusterService.

Clustering enables:
- Shared state: caches and cached values replicated across instances
- Primary-only tasks: timers that run on only one instance (e.g., scheduled jobs)
- Pub/sub messaging: cluster-wide event distribution via Hazelcast topics
- Distributed execution: running code on specific or all instances
- Admin visibility: cluster-wide monitoring via the Admin Console

Most of the clustering functionality is accessed indirectly through BaseService resource factories
(createCache, createCachedValue, createTimer, createIMap, createISet). ClusterUtils
provides methods for distributed execution (runOnInstance(), runOnPrimary(), and
runOnAllInstances()) to run service methods on specific or all cluster members. Direct
interaction with ClusterService is rarely needed in application code.

| File | Location | Role |
|---|---|---|
| ClusterService | grails-app/services/io/xh/hoist/cluster/ | Primary service: cluster state, primary detection |
| ClusterConfig | grails-app/init/io/xh/hoist/ | Hazelcast configuration |
| Cache | src/main/groovy/io/xh/hoist/cache/ | Key-value cache (Hazelcast ReplicatedMap backed) |
| CachedValue | src/main/groovy/io/xh/hoist/cachedvalue/ | Single-value cache (Hazelcast ReliableTopic backed) |
| Timer | src/main/groovy/io/xh/hoist/util/ | Managed polling timer with primaryOnly support |
| HoistCoreGrailsPlugin | src/main/groovy/io/xh/hoist/ | Hazelcast lifecycle (init and shutdown) |

```
Application Start
    │
    ├── HoistCoreGrailsPlugin.doWithSpring()
    │       └── ClusterService.initializeHazelcast()  → Hazelcast instance created
    │
    ├── ApplicationReadyEvent
    │       └── ClusterService.onApplicationEvent()  → Sets instanceState = RUNNING
    │
    ├── Runtime
    │       └── HoistFilter → ClusterService.ensureRunning()  → Rejects requests if not RUNNING
    │
    └── Shutdown
            ├── ClusterService.instanceState = STOPPING
            ├── Timer.shutdownAll()
            └── ClusterService.shutdownHazelcast()
```

| State | Description |
|---|---|
| STARTING | Hazelcast initialized, services loading |
| RUNNING | Fully ready, accepting requests |
| STOPPING | Shutting down, rejecting new requests |

The primary instance is the oldest member of the Hazelcast cluster. It handles tasks that should run on only one instance, typically scheduled data refreshes, batch jobs, and monitoring checks. When the primary instance leaves the cluster, the next-oldest member automatically becomes the new primary.

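The election rule above can be illustrated with a minimal, self-contained Groovy sketch. The `Member` class and `findPrimary` helper are hypothetical names used only for illustration; they are not part of Hoist or Hazelcast, which maintain the real member list and its ordering internally.

```groovy
import groovy.transform.Immutable

// Hypothetical stand-in for a cluster member; not a Hoist or Hazelcast class.
@Immutable
class Member {
    String name
    long joinedAt   // epoch millis at which the member joined the cluster
}

// The primary is the member that joined earliest (the oldest member).
Member findPrimary(List<Member> members) {
    members.min { it.joinedAt }
}

def members = [
    new Member(name: 'inst-a', joinedAt: 1_000L),
    new Member(name: 'inst-b', joinedAt: 2_000L),
    new Member(name: 'inst-c', joinedAt: 3_000L)
]
assert findPrimary(members).name == 'inst-a'

// When the primary leaves, the next-oldest member takes over.
def remaining = members.findAll { it.name != 'inst-a' }
assert findPrimary(remaining).name == 'inst-b'
```
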
The isPrimary property is on ClusterService and is also available directly on BaseService
for convenience:

```groovy
// In any BaseService
if (isPrimary) {
    // Only executes on the primary instance
}
```

ClusterService manages the Hazelcast cluster lifecycle and provides cluster awareness to the rest of the framework.

| Property/Method | Description |
|---|---|
| isPrimary | true if this is the oldest cluster member |
| instanceState | Current instance state (STARTING, RUNNING, STOPPING) |
| localName | Human-readable name for this instance |
| hzInstance | Direct access to the Hazelcast instance (rarely needed) |
| ensureRunning() | Throws if instance is not in RUNNING state |

All distributed data structures are created through BaseService factory methods (see
base-classes.md for the full API). Here we focus on the clustering aspects.

Cache<K, V> uses a Hazelcast ReplicatedMap when replicate: true, meaning every instance
holds a complete copy of all entries. This is ideal for small-to-medium datasets that are read
frequently. The default is replicate: false (local-only, backed by a ConcurrentHashMap).

```groovy
private Cache<String, Map> priceCache

void init() {
    priceCache = createCache(
        name: 'prices',
        replicate: true,          // backed by Hazelcast ReplicatedMap
        expireTime: 5 * MINUTES   // entries expire after this duration
    )
}
```

Hazelcast resource name: xhcache.{FullClassName}[prices]

A local-only cache (replicate: false) is useful for instance-specific data that doesn't need
to be shared.

Cache entries have a configurable expireTime. Expired entries are removed lazily on access and
periodically by an internal cull timer.
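
The two removal paths described above (lazy removal on access, plus a periodic cull) can be sketched in plain Groovy. This is an illustration of the general technique only: `ExpiringCache` and its injectable clock are hypothetical and do not reflect how Hoist's Cache is actually implemented.

```groovy
import java.util.concurrent.ConcurrentHashMap

// Hypothetical illustration class; Hoist's Cache is more capable than this.
class ExpiringCache<K, V> {
    private final Map<K, List> entries = new ConcurrentHashMap<>()   // key -> [value, timestamp]
    private final long expireMs
    private final Closure<Long> clock   // injectable time source, for deterministic examples

    ExpiringCache(long expireMs, Closure<Long> clock = { System.currentTimeMillis() }) {
        this.expireMs = expireMs
        this.clock = clock
    }

    void put(K key, V value) {
        entries[key] = [value, clock.call()]
    }

    // Lazy removal: an expired entry is dropped when a caller tries to read it.
    V get(K key) {
        def entry = entries[key]
        if (entry == null) return null
        if (isExpired(entry)) {
            entries.remove(key)
            return null
        }
        return entry[0] as V
    }

    // Periodic removal: a timer would call this to sweep out entries nobody has read.
    void cull() {
        entries.entrySet().removeIf { isExpired(it.value) }
    }

    int size() { entries.size() }

    private boolean isExpired(List entry) {
        clock.call() - (entry[1] as long) >= expireMs
    }
}

// Usage with a fake clock so the behaviour is deterministic.
long now = 0
def cache = new ExpiringCache<String, Integer>(100, { now })
cache.put('a', 1)
assert cache.get('a') == 1

now = 100
assert cache.get('a') == null   // removed lazily on access

cache.put('b', 2)
now = 250
cache.cull()                    // removed by the periodic sweep
assert cache.size() == 0
```

Without the cull step, entries that are written once and never read again would stay in memory indefinitely, which is why both paths are useful together.
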
CachedValue<T> stores a single value that can be replicated across the cluster. When a value is
set on any instance, all other instances receive the update. This makes CachedValue ideal for
expensive computations that should be shared: compute once on the primary, replicate to all.

```groovy
private CachedValue<Map> summary

void init() {
    summary = createCachedValue(
        name: 'summary',
        replicate: true,
        expireTime: 30 * MINUTES
    )
}
```

Hazelcast resource name: xhcachedvalue.{FullClassName}[summary]

CachedValue replication is backed by a Hazelcast ReliableTopic, which replays the most recent value to new
instances joining the cluster.

Both Cache and CachedValue provide an ensureAvailable() method that blocks until a value is
present, with a configurable timeout (default 30 seconds). This is important during startup when a
non-primary instance may need to wait for the primary to populate a replicated value before it can
serve requests:

```groovy
void init() {
    marketData = createCachedValue(name: 'marketData', replicate: true)
    marketData.ensureAvailable(timeout: 60 * SECONDS)
}
```

ReplicatedMap<K, V> is a Hazelcast map where every instance holds a complete copy (eventually
consistent). This is what Cache uses internally. Use createReplicatedMap() directly only
when you need raw Hazelcast map access without Cache's expiration and admin features.

IMap<K, V> is a Hazelcast distributed map where data is partitioned across cluster members.
Unlike Cache (fully replicated), IMap distributes entries, so each entry lives on a subset of
instances. This is better for large datasets.

```groovy
private IMap<String, byte[]> documentStore

void init() {
    documentStore = createIMap('documentStore')
}
```

Hazelcast resource name: {FullClassName}[documentStore]
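
To make the partitioning idea above more concrete, here is a deliberately simplified Groovy sketch of how keys can be spread over members. It is not Hazelcast's algorithm: Hazelcast hashes the serialized key, manages partition ownership itself, and also keeps backup copies, none of which is modeled here. The helper names are hypothetical; 271 is Hazelcast's default partition count.

```groovy
// Simplified illustration only; partitionFor and ownerFor are hypothetical helpers.
int partitionFor(Object key, int partitionCount = 271) {
    Math.floorMod(key.hashCode(), partitionCount)
}

// Assign partitions to members round-robin (a stand-in for Hazelcast's own assignment).
String ownerFor(Object key, List<String> members, int partitionCount = 271) {
    members[partitionFor(key, partitionCount) % members.size()]
}

def members = ['inst-a', 'inst-b', 'inst-c']
def keys = (1..1000).collect { "doc-$it".toString() }
def counts = keys.countBy { ownerFor(it, members) }

// Every key has exactly one owner, and the keys are spread over all three members,
// so no single instance has to hold the whole dataset.
assert counts.values().sum() == 1000
assert counts.size() == 3
```
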

Timer supports a primaryOnly mode where the task runs only on the primary instance. This
prevents duplicate work across cluster members:

```groovy
createTimer(
    name: 'dailyCleanup',
    runFn: this.&cleanup,
    interval: 24 * HOURS,
    primaryOnly: true   // runs only on the primary instance
)
```

When primaryOnly: true, the timer uses a Hazelcast ReplicatedMap
(xhTimersLastCompleted) to track the last completion time across the cluster. This ensures
that if the primary changes (e.g., old primary shuts down), the new primary knows when the task
last ran and can schedule the next run correctly.
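
The value of sharing the last completion time can be seen in a small sketch. The `delayUntilNextRun` helper and the plain Map below are hypothetical stand-ins, and Hoist's actual Timer scheduling logic may differ in its details.

```groovy
// Hypothetical helper, not Hoist's Timer implementation: given the shared
// last-completed timestamp, how long should a newly promoted primary wait?
long delayUntilNextRun(Long lastCompletedMs, long intervalMs, long nowMs) {
    if (lastCompletedMs == null) return 0L       // never ran: run immediately
    long elapsed = nowMs - lastCompletedMs
    return Math.max(0L, intervalMs - elapsed)    // overdue runs are due now
}

// Stand-in for the cluster-wide map of timer name -> last completion time.
Map<String, Long> lastCompleted = [dailyCleanup: 1_000L]
long HOUR = 60 * 60 * 1000L

// The old primary finished at t = 1000 ms; the new primary takes over 6 hours later.
long now = 1_000L + 6 * HOUR
assert delayUntilNextRun(lastCompleted['dailyCleanup'], 24 * HOUR, now) == 18 * HOUR

// A timer with no recorded completion is due right away.
assert delayUntilNextRun(lastCompleted['neverRan'], 24 * HOUR, now) == 0L
```

If each instance tracked completion times only locally, a newly promoted primary would have no record of the previous run and could either repeat the task too early or wait a full extra interval.
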

All Hazelcast distributed objects follow a naming pattern that groups resources by service:

{FullClassName}[{resourceName}]

For example, io.xh.hoist.config.ConfigService[configs].

The Cache and CachedValue wrappers add their own prefixes:
- Cache: xhcache.{FullClassName}[{resourceName}]
- CachedValue: xhcachedvalue.{FullClassName}[{resourceName}]

This convention enables the Admin Console's "Cluster Objects" view to group and display all distributed resources by service.
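
The naming pattern above, and the grouping it makes possible, can be mirrored in a few lines of Groovy. The helpers below are hypothetical and written only to illustrate the convention; they are not Hoist's own implementation, and `com.example.*` class names are placeholders.

```groovy
// Hypothetical helpers that mirror the documented naming pattern.
String hzResourceName(String resourceName, String fullClassName) {
    "${fullClassName}[${resourceName}]".toString()
}

String cacheName(String resourceName, String fullClassName) {
    'xhcache.' + hzResourceName(resourceName, fullClassName)
}

String cachedValueName(String resourceName, String fullClassName) {
    'xhcachedvalue.' + hzResourceName(resourceName, fullClassName)
}

// Recover the owning service: strip the optional prefix and the bracketed suffix.
String ownerOf(String hzName) {
    hzName.replaceFirst('^(xhcache|xhcachedvalue)\\.', '').replaceFirst('\\[.*\\]$', '')
}

def names = [
    cacheName('prices', 'com.example.PriceService'),
    cachedValueName('summary', 'com.example.PriceService'),
    hzResourceName('documentStore', 'com.example.DocService')
]
assert names[0] == 'xhcache.com.example.PriceService[prices]'
assert names[1] == 'xhcachedvalue.com.example.PriceService[summary]'
assert names[2] == 'com.example.DocService[documentStore]'

// Grouping by owner is what lets a UI list resources per service.
def grouped = names.groupBy { ownerOf(it) }
assert grouped.size() == 2
assert grouped['com.example.PriceService'].size() == 2
```
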

Hazelcast topics provide cluster-wide pub/sub messaging. Services subscribe via
BaseService.subscribeToTopic():

```groovy
void init() {
    // Subscribe to a cluster-wide topic
    subscribeToTopic(
        topic: 'xhConfigChanged',
        onMessage: { Map msg -> handleConfigChange(msg) },
        primaryOnly: false   // receive on all instances
    )
}

// Publish to a topic
getTopic('myCustomTopic').publish([action: 'refresh', source: username])
```

Topics are used extensively within hoist-core for config changes, preference changes, and other framework events.

ClusterConfig configures the Hazelcast instance before it starts. It handles:
- Network discovery: createNetworkConfig() is a no-op by default (Hazelcast uses multicast discovery); apps can override it to customize discovery
- Hibernate cache regions: GORM second-level cache backed by Hazelcast JCache
- Default eviction policies: LRU eviction for Hibernate cache regions
- Application customization: services can provide a static configureCluster closure

Hazelcast uses its default multicast discovery out of the box. The ClusterConfig.createNetworkConfig()
method has an empty body by default β it serves as a hook that applications can override in a
subclass to customize network discovery (e.g., TCP/IP member lists, cloud discovery) for their
deployment environment.
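
As an illustration of what such an override might do, the sketch below switches a Hazelcast Config from multicast to a fixed TCP/IP member list using standard Hazelcast configuration calls. The `useTcpIpDiscovery` helper name and the addresses are hypothetical, and the sketch assumes the Hazelcast library is on the classpath. The exact signature of createNetworkConfig() is not shown in this document, so how the helper would be called from an override is left as an assumption.

```groovy
import com.hazelcast.config.Config

// Hypothetical helper: replace multicast discovery with an explicit member list.
void useTcpIpDiscovery(Config config, List<String> memberAddresses) {
    def join = config.networkConfig.join
    join.multicastConfig.enabled = false     // turn off the default multicast discovery
    join.tcpIpConfig.enabled = true          // turn on static TCP/IP discovery
    memberAddresses.each { join.tcpIpConfig.addMember(it) }
}

// Example usage with placeholder addresses.
def config = new Config()
useTcpIpDiscovery(config, ['10.0.0.11', '10.0.0.12'])
assert !config.networkConfig.join.multicastConfig.enabled
assert config.networkConfig.join.tcpIpConfig.members == ['10.0.0.11', '10.0.0.12']
```
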

ClusterConfig configures Hazelcast as the GORM second-level cache provider with default
eviction policies:
- Default cache: 5000 entries, LRU eviction
- Update timestamps region: 1000 entries
- Query results region: 10000 entries

Domain classes can customize their cache settings via a static cache closure:

```groovy
class MyDomain {
    static mapping = {
        cache true
    }

    // Optional: customize the Hazelcast cache config for this domain
    static cache = { cfg ->
        cfg.evictionConfig.size = 10000
    }
}
```

Services can customize Hazelcast configuration for their distributed resources by declaring a
static configureCluster closure:

```groovy
class MyService extends BaseService {
    static configureCluster = { Config c ->
        c.getMapConfig(hzName('largeDataset', this)).with {
            evictionConfig.size = 100
        }
    }

    private IMap<String, Map> largeDataset = createIMap('largeDataset')
}
```

The most common clustering pattern is a timer on the primary instance that refreshes a cached value, which then replicates to all instances:

```groovy
class MarketDataService extends BaseService {
    private CachedValue<Map> marketData

    void init() {
        marketData = createCachedValue(name: 'marketData', replicate: true)
        createTimer(
            name: 'refreshMarketData',
            runFn: this.&refreshMarketData,
            interval: 'xhMarketDataRefreshSecs',
            primaryOnly: true
        )
    }

    Map getMarketData() { marketData.get() }

    private void refreshMarketData() {
        marketData.set(fetchFromExternalApi())
    }
}
```

Publish a message that all instances will receive:

```groovy
// Publisher
getTopic('dataRefreshed').publish([source: 'MarketDataService', timestamp: new Date()])

// Subscriber (in a different service)
void init() {
    subscribeToTopic(
        topic: 'dataRefreshed',
        onMessage: { Map msg -> clearLocalState() }
    )
}
```

The Admin Console provides several cluster-related views:
- Cluster > Instances: lists all cluster members with their state, uptime, and memory usage
- Cluster > Objects: shows all distributed Hazelcast objects with sizes and stats
- Cluster > Services: admin stats from all services across all instances

These views rely on ClusterService.getAdminStats() and distributed execution to gather data
from all cluster members.

Cache (with replicate: true) and CachedValue (with replicate: true) replicate data to
every instance. Storing large datasets (e.g., millions of rows) in these structures will consume
memory on every instance. Use IMap for large datasets that can be partitioned, or Cache with
replicate: false for instance-local data.

Without primaryOnly: true, a timer runs on every instance in the cluster. This means scheduled
database cleanups, email sends, or API calls will execute N times (once per instance). Always use
primaryOnly: true for tasks that should run once cluster-wide.

All values stored in Hazelcast distributed objects must be serializable. Hoist configures Kryo as Hazelcast's global serializer, which handles most common types (Maps, Lists, simple POGOs) automatically. However, Grails domain objects, closures, and other complex types may not be Kryo-serializable and will cause errors. Favor plain Maps, Lists, or simple POGOs instead.

Hazelcast replication is eventually consistent. After setting a value on one instance, there may be a brief window where other instances see the old value. For most Hoist use cases this is acceptable, but don't rely on instant cross-instance consistency for critical operations.