canton v2.7.3
Release of Canton 2.7.3
Canton 2.7.3 has been released on September 27, 2023. You can download the Daml Open Source edition from the Daml Connect GitHub Release Section. The Enterprise edition is available on Artifactory.
Please also consult the full documentation of this release.
Summary
This is a maintenance release. Please check the list of bug fixes and improvements below. Due to the various improvements, we recommend that users upgrade during their next maintenance window.
Please note, we've skipped 2.7.2.
Minor Changes
Usage of the applicationId in command submissions and completion subscriptions in the Canton console
Previously, the Canton console used a hard-coded "CantonConsole" as the applicationId in the command submissions and completion subscriptions performed against the Ledger API.
Now, if an access token is provided to the console, it will extract the userId from that token and use it instead. A local console will use the adminToken provided in canton.participants.<participant>.ledger-api.admin-token, whereas a remote console will use the token from canton.remote-participants.<remoteParticipant>.token.
This affects the following console commands:
- ledger_api.commands.submit
- ledger_api.commands.submit_flat
- ledger_api.commands.submit_async
- ledger_api.completions.list
- ledger_api.completions.list_with_checkpoint
- ledger_api.completions.subscribe
Additionally, it is possible to override the applicationId by supplying it explicitly to these commands, as in the sketch below.
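For example, a minimal sketch of such an override; the party alice and the command createIouCommand are placeholders, and the exact parameter names may differ between Canton versions:
```scala
// Submit with an explicit applicationId instead of the token-derived userId
// (or the legacy "CantonConsole" default). alice and createIouCommand are
// illustrative placeholders, not values defined in this release note.
participant1.ledger_api.commands.submit(
  actAs = Seq(alice),
  commands = Seq(createIouCommand),
  applicationId = "my-application",
)
```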
keys.secret.rotate_node_key() console command
The console command keys.secret.rotate_node_key can now accept a name for the newly generated key.
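For example, a minimal sketch (the key name is arbitrary):
```scala
// Rotate the node key and give the newly generated key a readable name.
participant1.keys.secret.rotate_node_key("signing-key-2023-09")
```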
owner_to_key_mappings.rotate_key command expects a node reference
The previous owner_to_key_mappings.rotate_key command is deprecated; the command now expects a node reference (InstanceReferenceCommon)
to avoid dangerous and/or unwanted key rotations.
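A sketch of the new call shape, assuming the command is reached via topology.owner_to_key_mappings and that currentKey and newKey are public keys previously obtained from the key store:
```scala
// Sketch only: parameter order and names are assumptions, and currentKey /
// newKey are placeholder values. Passing the node reference lets Canton verify
// that both keys are associated with this node.
participant1.topology.owner_to_key_mappings.rotate_key(
  participant1,     // node reference (InstanceReferenceCommon)
  participant1.id,  // key owner
  currentKey,       // currently registered public key
  newKey,           // newly generated public key
)
```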
Finalised response cache size is now configurable
The mediator keeps recent finalised responses cached in memory to avoid having to re-fetch late responses from the database.
The cache size is now configurable via
canton.mediators.mediator.caching.finalized-mediator-requests.maximum-size = 1000 // default
Improved Pruning Queries
The background pruning queries of the contract key journal for Postgres have been improved to reduce the load on the
database by making better use of the existing indexes. In addition, a pruning-related query that checks the request
journal for how far it is safe to prune has also been improved for Postgres by choosing a more suitable index.
Improved KMS Audit Logs
The Canton trace id was added back to some KMS audit logs where it was missing.
Console Changes
Commands around ACS migration
Console commands for ACS migration can now be used with remote nodes.
Bug Fixes
(23-023, Critical): Crash recovery issue in command deduplication store
Description
On restart of a sync domain, the participant replays pending transactions, updating the stores in case some writes were not persisted. Within the command deduplication store, existing records are compared with the records to be written, as an internal consistency check. This comparison includes the trace context, which differs on a restart and hence can cause the check to fail, aborting the startup with an IllegalArgumentException.
Affected Deployments
Participant
Impact
An affected participant cannot reconnect to the given domain, which means that transaction processing is blocked.
Symptom
Upon restart, the participant refuses to reconnect to the domain, writing the following log message: ERROR c.d.c.p.s.d.DbCommandDeduplicationStore ... - An internal error has occurred.
java.lang.IllegalArgumentException: Cannot update command deduplication data for ChangeIdHash
Workaround
No workaround exists. You need to upgrade to a version not affected by this issue.
Likeliness
The error happens if a participant crashes with a particular state not yet written to the database. The bug has been present since the end of November 2021 and has never been observed before, not even during testing.
Recommendation
Upgrade during your next maintenance window to a patch version not affected by this issue.
(23-022, Major): rotate_keys command is dangerous and does not work when there are multiple sequencers
We allow the replacement of key A with key B, but we cannot guarantee that the node using key A will actually have access to key B.
Furthermore, when attempting to rotate the keys of a sequencer using the rotate_node_keys(domainManagerRef) method,
it will fail if we have more than one sequencer in our environment.
This occurs because they share a unique identifier (UID), and as a result, this console command not only rotates the
keys of the sequencer it is called on but also affects the keys of the other sequencers.
We have modified the process of finding keys for rotation in the rotate_node_keys(domainManagerRef)
function to prevent conflicts among multiple sequencers that share the same UID.
Additionally, we have updated the console command owner_to_key_mappings.rotate_key to
expect a node reference (InstanceReferenceCommon), thereby ensuring that both the current
and new keys are associated with the correct node.
Affected deployments
All nodes (but mostly sequencers)
Impact
A node can mistakenly rotate keys that do not pertain to it. Using rotate_node_keys(domainManagerRef) to rotate a sequencer's keys when other sequencers are present will also fail and break Canton.
Symptom
When trying to rotate a sequencer's keys, the rotation catastrophically fails with a java.lang.RuntimeException: KeyNotAvailable.
Workaround
For rotate_node_keys(domainManagerRef), we have ensured that we filter the correct keys for rotation by checking both the authorized store and the local private key store.
Additionally, we have deprecated the existing owner_to_key_mappings.rotate_key and introduced a new method that requires the user to provide the node instance for which they intend to apply the key rotation. We have also implemented a validation check within this function to ensure that the current and new keys are associated with this node.
Likelihood
Every time rotate_node_keys is used to rotate the keys of sequencer(s) in a multi-sequencer environment.
Recommendation
Upgrade the Canton console that you use to administrate the domain, in particular the sequencer and mediator,
to a Canton version with the bug fix.
(23-019, Minor): Fixed rotate_node_keys command when it is used to rotate keys of sequencer(s) and mediator(s)
Canton has a series of console commands to rotate keys, in particular, rotate_node_keys that is used
to rotate the keys of a node.
We allow the replacement of key A with key B, but we cannot guarantee that the node using key A
will actually have access to key B.
Furthermore, when attempting to rotate the keys of a sequencer using the
rotate_node_keys(domainManagerRef) method, it will fail if we have more than one sequencer in our environment.
This occurs because they share a unique identifier (UID), and as a result, this console command not only
rotates the keys of the sequencer it is called on but also affects the keys of the other sequencers.
Impact
A node can mistakenly rotate keys that do not pertain to it.
Using rotate_node_keys(domainManagerRef) to rotate a sequencer's keys when other
sequencers are present will also fail and break Canton.
Symptom
No visible symptom as the command just skips over the keys to rotate.
Workaround
For rotate_node_keys(domainManagerRef), we have ensured that we filter the correct keys for
rotation by checking both the authorized store and the local private key store.
Additionally, we have deprecated the existing owner_to_key_mappings.rotate_key and introduced a new
method that requires the user to provide the node instance for which they intend to apply the key rotation.
We have also implemented a validation check within this function to ensure that the current
and new keys are associated with this node.
Likelihood
Every time rotate_node_keys is used to rotate the keys of sequencer(s) in a multi-sequencer environment.
Recommendation
Upgrade the Canton console that you use to administrate the domain, in particular the sequencer and mediator,
to a Canton version with the bug fix.
(23-021, Minor): Core contract input stakeholders bug
The Canton protocol includes in ViewParticipantData the contracts that are required to re-interpret the command evaluation. These inputs are known as coreInputs. Included as part of this input contract data is contract metadata that includes details of signatories and stakeholders.
Where the evaluation of a command results in a contract key being resolved to a contract identifier, but that identifier is not in turn resolved to a contract instance, the distributed metadata associated with that contract will incorrectly have the key maintainers as both the signatories and the stakeholders. One way to trigger this is to execute a choice on a contract other than the keyed contract that only issues a lookupByKey on the keyed contract.
Affected protocol versions
Protocol versions 3 and 4
Impact
As the signatory data associated with the affected contracts is not valid, any validation based on this will also be invalid, so the command will be rejected.
Symptom
This bug was discovered when the model conformance logic was extended to validate all input contract signatories. The presented signatory list being inconsistent with the expected one caused a failure.
Workaround
A workaround is to ensure that whenever a contract key is resolved to a contract identifier, that identifier is always resolved to a contract (even if not needed). For example, following the lookupByKey, in the case where a contract identifier is returned, issue a fetch command on this identifier and discard the result.
Likelihood
Unlikely. Most of the time a contract key is resolved to a contract so that some action can be performed on that contract, and in this situation the metadata would be correct. The only situation in which this has been observed is in test scenarios.
Recommendation
If affected, the issue should be observed during development and testing which can then be remediated by upgrading the environment to the appropriate version.
(23-020, Critical): Use of Daml Exceptions may break transaction view creation
Transaction view decomposition is the process of taking a transaction and generating a view hierarchy whereby each view has a common set of informees. Each view additionally has a rollback scope. A child view having a rollback scope that is different from that of the parent indicates that any changes to the liveness of contracts that occurred within the child view should be disregarded upon exiting the child view. For example it would be valid for a contract to be consumed in a rolled back child view and then consumed again in a subsequent child view.
As the activeness of contracts is preserved across views with the same rollback scope, every source transaction rollback node should be allocated a unique scope. In certain circumstances this is not happening, resulting in contract activeness inconsistency.
Affected protocol versions
This problem is observed in Canton protocol version 4 and fixed in version 5.
Impact
The impact of this bug is that an inconsistent transaction view hierarchy can be generated from a consistent transaction. This in turn can result in a valid transaction being rejected, or in a ledger fork when an observer participant lacks the input to properly process a transaction and therefore rejects it while the transaction is approved elsewhere. The symptoms of this would be the mediator rejecting a valid transaction request on the basis of inconsistency, or the participant declaring LOCAL_VERDICT_FAILED_MODEL_CONFORMANCE_CHECK followed by an ACS_COMMITMENT_MISMATCH.
Workaround
This bug only affects transactions that contain rollbacks (caught exceptions); if the use of rollbacks can be avoided, this bug will not occur.
Likelihood
To encounter this bug requires a transaction that has multiple rolled back nodes in which overlapping contracts and/or keys are used. For this reason the likelihood of encountering the bug is low and issues should be discovered during development / testing.
Recommendation
Only use Daml exceptions with protocol version >= 5.
(23-024, Moderate): Participant state topology transaction may be silently ignored during cascading update
In some cases, participant state and mediator domain state topology transactions were silently ignored when they were sent as part of a cascading topology update (which means they were sent together with a namespace certificate). As a result, the nodes had a different view on the topology state and not all Daml transactions could be run.
Affected Deployments
All nodes
Impact
A participant node might consider another participant node as inactive and therefore refuse to send transactions or invalidate transactions.
Symptom
A Daml transaction might be rejected with UNKNOWN_INFORMEES.
Workaround
Flush the topology state by running domain.participants.set_state(pid, Submission, Vip) and then setting the trust level back to Ordinary, as sketched below. This runs the update through the "incremental" update code path, which behaves correctly, thereby fixing the topology state of the broken node.
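A sketch of that flush sequence in the console; mydomain and pid are placeholders, and the enum names and import path are assumptions:
```scala
import com.digitalasset.canton.topology.transaction.{ParticipantPermission, TrustLevel}

// Temporarily raise the trust level so the change runs through the
// incremental topology update code path ...
mydomain.participants.set_state(pid, ParticipantPermission.Submission, TrustLevel.Vip)
// ... then restore the ordinary trust level.
mydomain.participants.set_state(pid, ParticipantPermission.Submission, TrustLevel.Ordinary)
```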
Likelihood
The bug is deterministic and can be triggered when using permissioned domains, when the participant state is received together with the namespace delegation of the domain but without the namespace delegation of the participant.
Recommendation
Upgrade to this version if you intend to use permissioned domains.
If you need to fix a broken system, then upgrade to a version fixing the issue and apply the workaround to "flush" the topology state.
(23-025, Minor): PingService stops working after a Ledger API crash
After an Indexer restart in the Ledger API or any error causing the client transaction streams to fail, the PingService stops working.
Affected Deployments
The participant node
Impact
When the Ledger API encounters an error that leads to cancelling the client connections
while the participant node does not become passive, the PingService cannot continue processing commands.
Symptom
Ping commands issued in the PingService are timing out.
Additionally, the participant might appear unhealthy if configured to report health
by using the PingService (i.e. configured with monitoring.health.check.type = ping).
Workaround
Restart the participant node.
Likelihood
This bug occurs consistently when there is an error in the Ledger API, such as an overloaded database
that causes the Ledger API Indexer to restart. For this bug to occur, the participant node
must not transition to the passive state. If it transitions to passive and then back to active, the bug should not reproduce.
Recommendation
If the system is subject to frequent transient errors in the Ledger API (e.g. a flaky index database)
or consistently high load, update to this version in order to avoid this issue.
Compatibility
The following Canton protocol and Ethereum sequencer contract versions are supported:
| Dependency | Version |
|---|---|
| Canton protocol versions | 3, 4, 5 |

Canton has been tested against the following versions of its dependencies:

| Dependency | Version |
|---|---|
| Java Runtime | OpenJDK 64-Bit Server VM Zulu11.62+17-CA (build 11.0.18+10-LTS, mixed mode) |
| Postgres | postgres (PostgreSQL) 14.9 (Debian 14.9-1.pgdg120+1) |
| Oracle | 19.18.0 |