Description
Summary
Add an event-safe mode to the full node, which brings important features to facilitate reliable integrations.
Motivation
During the development of the wallet service, we spent a long time developing a sync algorithm between the state of the wallet and the state of the full node. The motivation of this feature is to simplify this and other integrations with the full node.
Without this feature, developers must understand Hathor's architecture to build a reliable integration. With this feature, they just have to react accordingly to a small number of well-defined events.
Guide-level explanation
In this document, integrations and applications have the same meaning and will be used interchangeably. Blocks and transactions are collectively called vertices. Transactions refer to all non-block vertices (e.g., regular transactions and token creation transactions).
The full node can be executed with --enable-event-queue, which means all events will be persisted and can be accessed and modified through an HTTP API. Hence, when an application is stopped or restarted, it can resume processing events from the last processed event.
Applications should use the full node as the single source of truth, which means there is no need to store the vertices or their metadata. In fact, they should always fetch information from the full node, which is fast and reliable.
Applications can also remove old events at will to save space. For instance, an application can adopt the policy of persisting only the last month of events. In this case, the application must periodically call the API to clean up old events.
The full node ensures that no events will be missed even in case of system crash or power failure. The full node also stores the creation date and last modified date of vertices. The full node can optionally keep track of which events were correctly processed by the application.
Applications will commonly have the following initial workflow (a minimal sketch follows the list):
- Stop the full node's sync algorithm, so no updates will be accepted. The application can optionally clear the events in this step.
- Process all vertices in topological order. The previous step ensures that nothing will be modified while this initial processing is executed. The topological order ensures that all dependencies will be fulfilled.
- Start the full node's sync algorithm and process all events in real time.
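For illustration, this bootstrap could be implemented as below (Python with the requests library). The base URL and the two helper functions are assumptions; the sync and event endpoints are the ones described in the APIs section below.

import requests

FULL_NODE = "http://localhost:8080"  # assumed base URL of the full node

def fetch_vertices_in_topological_order():
    # Placeholder: fetch all vertices through the full node's existing APIs.
    return []

def process_vertex(vertex) -> None:
    # Placeholder: application-specific processing (e.g., database inserts).
    pass

def bootstrap() -> None:
    """Initial workflow: pause the sync, load the current state, resume."""
    # 1. Stop the full node's sync algorithm, so no update is accepted.
    requests.post(f"{FULL_NODE}/sync/pause").raise_for_status()
    # Optionally clear old events before the initial load.
    requests.post(f"{FULL_NODE}/events/flush/").raise_for_status()
    # 2. Process all vertices in topological order.
    for vertex in fetch_vertices_in_topological_order():
        process_vertex(vertex)
    # 3. Resume the sync algorithm; new events can now be processed.
    requests.post(f"{FULL_NODE}/sync/resume").raise_for_status()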
After the application is fully initialized, it just has to process the new events. For this, the application can receive the ordered events in real time or just poll the events as needed.
Applications must process events in the same order they arrive. So, after an event is successfully handled, the application must mark it as done to get the next event. If the queue of events gets full, the full node stops syncing and signals an error.
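For example, a polling consumer built on GET /events/next/ and POST /events/[:event_id]/mark-as-done/ could look like the sketch below. The base URL and the response shape are assumptions for illustration.

import time

import requests

FULL_NODE = "http://localhost:8080"  # assumed base URL of the full node

def handle_event(event: dict) -> None:
    # Placeholder: application-specific processing (e.g., database updates).
    print(event["type"], event["id"])

def consume_events() -> None:
    """Poll the queue, processing one event at a time, in order."""
    while True:
        response = requests.get(f"{FULL_NODE}/events/next/")
        response.raise_for_status()
        event = response.json()
        if not event:
            time.sleep(1)  # queue is empty; wait before polling again
            continue
        handle_event(event)
        # Mark the event as done only after it was safely handled, so a
        # crash between the two steps just reprocesses the same event.
        requests.post(f"{FULL_NODE}/events/{event['id']}/mark-as-done/")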
The sync between the full node and an application will consist of handling the following events:
- Mempool: New executed tx.
- Mempool: New voided tx.
- Mempool: Tx changed to voided.
- Mempool: Tx changed to executed.
- Blockchain: New best block found.
- Blockchain: New orphan block found.
- Reorg: Tx is back to the mempool and is executed.
- Reorg: Tx is back to the mempool and is voided.
- Reorg: Tx is confirmed by another block.
- Reorg: Block became orphan.
- Reorg: Orphan block became part of the best blockchain.
If the full node dies while the application keeps running, the full node will remember which events have already been processed and which have not. Applications must also keep track of which events have been processed on their end, so they can properly mark them as done after the full node is back.
In case of a non-graceful stop, the full node should start in pause mode and wait for the application to resume the sync.
APIs
GET /events/[:event_id]
Get the details of a specific event.
GET /events/list/
Get the list of events according to the filters.
POST /events/flush/
Flush the event list, deleting all that matches the filters.
GET /events/next/
Return the event at the front of the queue. If this API is called multiple times without marking the event as done, it will return the same event every time.
POST /events/[:event_id]/mark-as-done/
Mark the event as successfully processed and return the next event in the queue. The caller can optionally delete the event.
GET /sync/status
Get the current status of the full node.
POST /sync/pause
Pause the full node's syncing, so it stops receiving new transactions. It can be used during the application's maintenance.
POST /sync/resume
Resume syncing and receiving new transactions.
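For instance, an application could wrap its maintenance window with these endpoints, as in the sketch below. The base URL and the shape of the status response are assumptions.

import requests

FULL_NODE = "http://localhost:8080"  # assumed base URL of the full node

def maintenance_window() -> None:
    """Pause syncing during maintenance and resume it afterwards."""
    requests.post(f"{FULL_NODE}/sync/pause").raise_for_status()
    status = requests.get(f"{FULL_NODE}/sync/status").json()
    assert status.get("state") == "paused"  # response shape is an assumption
    # ... perform the application's maintenance here ...
    requests.post(f"{FULL_NODE}/sync/resume").raise_for_status()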
Events
Events are stored in JSON format with the following required fields:
{
    "id": int (incremental),
    "parent_id": int (parent event),
    "timestamp": int,
    "type": str,
}
Each event type can extend the JSON with extra fields.
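For illustration, a NEW_TX_ACCEPTED event could look like the example below. All field values are hypothetical, and it is assumed that parent_id is null for events without a parent.

{
    "id": 42,
    "parent_id": null,
    "timestamp": 1650000000,
    "type": "NEW_TX_ACCEPTED"
}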
Daemon events
NODE_STARTED
This event is generated when the full node is started.
NODE_STOPPED
This event is generated when the full node is gracefully stopped.
Sync events
SYNC_RESUMED
This event is generated when the full node resumes the sync algorithm.
SYNC_PAUSED
This event is generated when the full node pauses the sync algorithm.
Consensus events
NEW_TX_ACCEPTED
This event is generated when a new transaction is accepted and placed into the mempool.
NEW_BEST_BLOCK
This event is generated when a block is appended to the best blockchain. This event generates a sequence of METADATA_UPDATED:first_block events in topological order.
NEW_ORPHAN_BLOCK
This event is generated when a new orphan block is received.
TX_DECLINED
This event is generated when the full node declines a transaction. The transaction will not be stored in the database and its data will only be available in this event.
REORG
This event is generated when a reorg occurs, i.e., a new blockchain with higher proof-of-work is found. This event generates a sequence of other events.
METADATA_UPDATED
This event is generated when a vertex's metadata is updated. It might be caused by different sources.
The extra fields are:
{
    "field": str,
    "operation": str (e.g., set, append, remove, clear),
    "value": any,
    "old_value": any | null (only for set operations),
}
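For illustration, a METADATA_UPDATED event generated during a reorg could look like this (all field values are hypothetical):

{
    "id": 1235,
    "parent_id": 1234,
    "timestamp": 1650000000,
    "type": "METADATA_UPDATED",
    "field": "first_block",
    "operation": "set",
    "value": "000006cb93385b8b87a545a1cbb6197e6caff600c12cc12fc54250d39c8088fc",
    "old_value": null
}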
Example: Normal flow
The expected normal flow of events would be like this:
NEW_TX_ACCEPTED
NEW_TX_ACCEPTED
NEW_BEST_BLOCK
NEW_ORPHAN_BLOCK
NEW_TX_ACCEPTED
NEW_TX_ACCEPTED
NEW_TX_ACCEPTED
NEW_BEST_BLOCK
NEW_TX_ACCEPTED
NEW_TX_ACCEPTED
Example: Small Reorg
NEW_TX_ACCEPTED
NEW_TX_ACCEPTED
NEW_BEST_BLOCK
NEW_ORPHAN_BLOCK
REORG_BEGIN
METADATA_UPDATED (child of REORG)
METADATA_UPDATED (child of REORG)
METADATA_UPDATED (child of REORG)
METADATA_UPDATED (child of REORG)
METADATA_UPDATED (child of REORG)
METADATA_UPDATED (child of REORG)
METADATA_UPDATED (child of REORG)
METADATA_UPDATED (child of REORG)
METADATA_UPDATED (child of REORG)
METADATA_UPDATED (child of REORG)
REORG_END
NEW_TX_ACCEPTED
NEW_TX_ACCEPTED
NEW_BEST_BLOCK
Example: Wallet Service
Using these events, the wallet service would not need to run a complex sync algorithm anymore. It would only have to update the database according to each type of event. As the events are generated in topological order, it can handle each event independently and safely assume all dependencies have already been processed.
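A minimal sketch of such a handler, dispatching on the event types defined in this document (the handler bodies are placeholders):

def handle_event(event: dict) -> None:
    """Dispatch each event to the database-update routine for its type."""
    handlers = {
        "NEW_TX_ACCEPTED": on_new_tx_accepted,
        "NEW_BEST_BLOCK": on_new_best_block,
        "NEW_ORPHAN_BLOCK": on_new_orphan_block,
        "METADATA_UPDATED": on_metadata_updated,
    }
    handler = handlers.get(event["type"])
    if handler is not None:
        handler(event)

def on_new_tx_accepted(event: dict) -> None:
    pass  # Placeholder: insert the new transaction into the database.

def on_new_best_block(event: dict) -> None:
    pass  # Placeholder: insert the block and update confirmations.

def on_new_orphan_block(event: dict) -> None:
    pass  # Placeholder: store the orphan block, if relevant.

def on_metadata_updated(event: dict) -> None:
    pass  # Placeholder: apply the field/operation/value update.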
Reference-level explanation
Persistence of events in a storage
As the number of events can grow fast, the full node must persist these events in a storage. The details of how this storage will be implemented are still to be defined.
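As an illustration only (the storage backend is not decided by this document), events could be appended to an embedded database keyed by their incremental id; the sketch below assumes SQLite.

import json
import sqlite3

conn = sqlite3.connect("events.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS events (
           id INTEGER PRIMARY KEY,  -- incremental event id, also the queue order
           payload TEXT NOT NULL    -- full event serialized as JSON
       )"""
)

def persist_event(event: dict) -> None:
    """Append an event atomically; a crash never leaves a partial write."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO events (id, payload) VALUES (?, ?)",
            (event["id"], json.dumps(event)),
        )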
Drawbacks
Rationale and alternatives
Instead of implementing the event queue in the full node, we could use one of the full-featured queue systems available (e.g., RabbitMQ or ZeroMQ). The queue system would run as a separate service, and the full node would have to pause syncing when new events cannot be successfully put into the queue. Even though this alternative seems good, it would require the applications to deploy, maintain, and monitor another service. So, it seems better to have a built-in queue system inside hathor-core.
Prior art
Bitcoin has an integration with ZeroMQ to handle similar situations. But, as ZeroMQ is a separate service, some notifications might be lost. This is a complex situation to handle for most use cases, and that's the main reason we are implementing a simple event system inside the full node. For further information, see Block and Transaction Broadcasting with ZeroMQ.
Unresolved questions
Future possibilities
Add events for networking.