Release of Canton 2.7.3

Canton 2.7.3 has been released on September 27, 2023. You can download the Daml Open Source edition from the Daml Connect GitHub Release Section. The Enterprise edition is available on Artifactory.
Please also consult the full documentation of this release.

Summary

This is a maintenance release. Please check the list of bug fixes and improvements below. Due to various improvements, we recommend that users upgrade to it during their next maintenance window.
Please note, we've skipped 2.7.2.

Minor Changes

Usage of the applicationId in command submissions and completion subscriptions in the Canton console

Previously, the Canton console used the hard-coded value "CantonConsole" as the applicationId in command submissions and completion subscriptions performed against the Ledger API.
Now, if an access token is provided to the console, it extracts the userId from that token and uses it instead. A local console uses the adminToken provided in canton.participants.<participant>.ledger-api.admin-token, whereas a remote console uses the token from canton.remote-participants.<remoteParticipant>.token.
This affects the following console commands:

  • ledger_api.commands.submit
  • ledger_api.commands.submit_flat
  • ledger_api.commands.submit_async
  • ledger_api.completions.list
  • ledger_api.completions.list_with_checkpoint
  • ledger_api.completions.subscribe

Additionally, it is possible to override the applicationId by supplying it explicitly to these commands, as sketched below.
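
A minimal sketch of such an override is shown below; participant1, party, and createCmd are placeholders from your own console session, and the parameter name applicationId is an assumption that may differ between Canton versions:

// Hypothetical sketch only: submit a command with an explicit applicationId
// instead of the one derived from the access token.
participant1.ledger_api.commands.submit(
  actAs = Seq(party),
  commands = Seq(createCmd),
  applicationId = "my-application-id"
)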

keys.secret.rotate_node_key() console command

The console command keys.secret.rotate_node_key can now accept a name for the newly generated key.
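
For example (a sketch only; participant1 is a placeholder and the exact parameter form is an assumption):

// Hypothetical sketch: rotate this node's key and give the newly generated key a readable name.
participant1.keys.secret.rotate_node_key("participant1-signing-key-2023-09")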

owner_to_key_mappings.rotate_key command expects a node reference

The previous variant of owner_to_key_mappings.rotate_key is deprecated; the command now expects a node reference (InstanceReferenceCommon)
to avoid dangerous or unwanted key rotations.
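
A sketch of the new form is shown below; the console path topology.owner_to_key_mappings, the parameter order, and the placeholder values (participant1, member, currentKey, newKey) are assumptions and may differ from the actual signature:

// Hypothetical sketch: the command now takes the node reference (InstanceReferenceCommon)
// so it can verify that both keys belong to that node before rotating.
participant1.topology.owner_to_key_mappings.rotate_key(
  participant1, // node reference whose local key store must hold both keys
  member,       // the key owner (for example, the participant's member id)
  currentKey,   // public key currently registered for the owner
  newKey        // newly generated public key that replaces it
)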

Finalised response cache size is now configurable

The mediator keeps recent finalised responses cached in memory to avoid having to re-fetch late responses from the database.
The cache size is now configurable via:

canton.mediators.mediator.caching.finalized-mediator-requests.maximum-size = 1000 // default

Improved Pruning Queries

The background pruning queries of the contract key journal for Postgres have been improved to reduce the load on the
database by making better use of the existing indexes. In addition, a pruning-related query that checks the request
journal for how far it is safe to prune has also been improved for Postgres by choosing a more suitable index.

Improved KMS Audit Logs

The Canton trace id was added back to some KMS audit logs where it was missing.

Console Changes

Commands around ACS migration

Console commands for ACS migration can now be used with remote nodes.

Bug Fixes

(23-023, Critical): Crash recovery issue in command deduplication store

Description

On restart of a sync domain, the participant replays pending transactions, updating the stores in case some writes were not persisted. Within the command deduplication store, existing records are compared with the records to be written, for internal consistency checking purposes. This comparison includes the trace context, which differs on a restart and hence can cause the check to fail, aborting the startup with an IllegalArgumentException.

Affected Deployments

Participant

Impact

An affected participant cannot reconnect to the given domain, which means that transaction processing is blocked.

Symptom

Upon restart, the participant refuses to reconnect to the domain, writing the following log message: ERROR c.d.c.p.s.d.DbCommandDeduplicationStore ... - An internal error has occurred.
java.lang.IllegalArgumentException: Cannot update command deduplication data for ChangeIdHash

Workaround

No workaround exists. You need to upgrade to a version not affected by this issue.

Likeliness

The error happens if a participant crashes with a particular state not yet written to the database. The bug has been present since the end of November 2021 and has never been observed before, not even during testing.

Recommendation

Upgrade during your next maintenance window to a patch version not affected by this issue.

(23-022, Major): rotate_keys command is dangerous and does not work when there are multiple sequencers

We allow the replacement of key A with key B, but we cannot guarantee that the node using key A will actually have access to key B.
Furthermore, when attempting to rotate the keys of a sequencer using the rotate_node_keys(domainManagerRef) method,
it will fail if there is more than one sequencer in the environment.
This occurs because the sequencers share a unique identifier (UID), and as a result, this console command not only rotates the
keys of the sequencer it is called on but also affects the keys of the other sequencers.
We have modified the process of finding keys for rotation in the rotate_node_keys(domainManagerRef)
function to prevent conflicts among multiple sequencers that share the same UID.
Additionally, we have updated the console command owner_to_key_mappings.rotate_key to
expect a node reference (InstanceReferenceCommon), thereby ensuring that both the current
and new keys are associated with the correct node.

Affected deployments

All nodes (but mostly sequencers)

Impact

A node can mistakenly rotate keys that do not pertain to it. Using rotate_node_keys(domainManagerRef) to rotate a sequencer's keys when other sequencers are present will also fail and break Canton.

Symptom

When trying to rotate a sequencer's keys, the command fails catastrophically with a java.lang.RuntimeException: KeyNotAvailable.

Workaround

For rotate_node_keys(domainManagerRef), we have ensured that we filter the correct keys for rotation by checking both the authorized store and the local private key store.
Additionally, we have deprecated the existing owner_to_key_mappings.rotate_key and introduced a new method that requires the user to provide the node instance for which they intend to apply the key rotation. We have also implemented a validation check within this function to ensure that the current and new keys are associated with this node.

Likelihood

Every time rotate_node_keys is used to rotate the keys of sequencers in an environment with multiple sequencers.

Recommendation

Upgrade the Canton console that you use to administer the domain, in particular the sequencer and mediator,
to a Canton version with the bug fix.

(23-019, Minor): Fixed rotate_node_keys command when it is used to rotate keys of sequencer(s) and mediator(s)

Canton has a series of console commands to rotate keys, in particular rotate_node_keys, which is used
to rotate the keys of a node.
We allow the replacement of key A with key B, but we cannot guarantee that the node using key A
will actually have access to key B.
Furthermore, when attempting to rotate the keys of a sequencer using the
rotate_node_keys(domainManagerRef) method, it will fail if there is more than one sequencer in the environment.
This occurs because the sequencers share a unique identifier (UID), and as a result, this console command not only
rotates the keys of the sequencer it is called on but also affects the keys of the other sequencers.

Impact

A node can mistakenly rotate keys that do not pertain to it.
Using rotate_node_keys(domainManagerRef) to rotate a sequencer's keys when other
sequencers are present will also fail and break Canton.

Symptom

No visible symptom as the command just skips over the keys to rotate.

Workaround

For rotate_node_keys(domainManagerRef), we have ensured that we filter the correct keys for
rotation by checking both the authorized store and the local private key store.
Additionally, we have deprecated the existing owner_to_key_mappings.rotate_key and introduced a new
method that requires the user to provide the node instance for which they intend to apply the key rotation.
We have also implemented a validation check within this function to ensure that the current
and new keys are associated with this node.

Likelihood

Every time rotate_node_keys is used to rotate the keys of sequencers in an environment with multiple sequencers.

Recommendation

Upgrade the Canton console that you use to administer the domain, in particular the sequencer and mediator,
to a Canton version with the bug fix.

(23-021, Minor): Core contract input stakeholders bug

The Canton protocol includes in ViewParticipantData the contracts that are required to re-interpret command evaluation. These inputs are known as coreInputs. Included as part of this input contract data is contract metadata that includes details of the signatories and stakeholders.
When the evaluation of a command resolves a contract key to a contract identifier, but that identifier is not in turn resolved to a contract instance, the distributed metadata associated with the contract will incorrectly have the key maintainers as both the signatories and the stakeholders. One way to trigger this is to execute a choice on a contract other than the keyed contract that only issues a lookupByKey on the keyed contract.

Affected protocol versions

Protocol versions 3 and 4

Impact

As the signatory data associated with the affected contracts is not valid, any validation based on this will also be invalid, so the command will be rejected.

Symptom

This bug was discovered when the model conformance logic was extended to validate all input contract signatories. A presented signatory list that is inconsistent with the expected one causes the check to fail.

Workaround

A workaround is to ensure that whenever a contract key is resolved to a contract identifier, that identifier is always resolved to a contract (even if not needed). For example, following the lookupByKey, in the case where a contract identifier is returned, issue a fetch on this identifier and discard the result.

Likelihood

Unlikely. Most of the time a contract key is resolved to a contract so that some action can be performed on that contract, in which case the metadata is correct. The only situation in which this has been observed is in test scenarios.

Recommendation

If affected, the issue should be observed during development and testing which can then be remediated by upgrading the environment to the appropriate version.

(23-020, Critical): Use of Daml Exceptions may break transaction view creation

Transaction view decomposition is the process of taking a transaction and generating a view hierarchy whereby each view has a common set of informees. Each view additionally has a rollback scope. A child view having a rollback scope that differs from that of the parent indicates that any changes to the liveness of contracts that occurred within the child view should be disregarded upon exiting the child view. For example, it would be valid for a contract to be consumed in a rolled-back child view and then consumed again in a subsequent child view.
As the activeness of contracts is preserved across views with the same rollback scope, every source transaction rollback node should be allocated a unique scope. In certain circumstances this is not happening, resulting in contract activeness inconsistency.

Affected protocol versions

This problem is observed in Canton protocol version 4 and fixed in version 5.

Impact

The impact of this bug is that an inconsistent transaction view hierarchy can be generated from a consistent transaction. This in turn can result in a valid transaction being rejected, or in a ledger fork when an observer participant lacks the input to properly process a transaction and therefore rejects it even though the transaction is approved. The symptoms of this would be the mediator rejecting a valid transaction request on the basis of inconsistency, or the participant declaring LOCAL_VERDICT_FAILED_MODEL_CONFORMANCE_CHECK followed by an ACS_COMMITMENT_MISMATCH.

Workaround

This bug only affects transactions that contain rollbacks (caught exceptions); if the use of rollbacks can be avoided, this bug will not occur.

Likelihood

To encounter this bug requires a transaction that has multiple rolled back nodes in which overlapping contracts and/or keys are used. For this reason the likelihood of encountering the bug is low and issues should be discovered during development / testing.

Recommendation

Only use Daml exceptions with protocol version 5 or higher.

(23-024, Moderate): Participant state topology transaction may be silently ignored during cascading update

In some cases, participant state and mediator domain state topology transactions were silently ignored when they were sent as part of a cascading topology update (which means they were sent together with a namespace certificate). As a result, the nodes had a different view of the topology state and not all Daml transactions could be run.

Affected Deployments

All nodes

Impact

A participant node might consider another participant node as inactive and therefore refuse to send transactions or invalidate transactions.

Symptom

A Daml transaction might be rejected with UNKNOWN_INFORMEES.

Workaround

Flush the topology state by running domain.participants.set_state(pid, Submission, Vip) and then setting the trust level back to Ordinary. This runs the update through the "incremental" update code path, which behaves correctly, thereby fixing the topology state of the broken node.
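
A sketch of this workaround in the console is shown below; mydomain and pid are placeholders for the domain reference and the affected participant's id, and the enum names ParticipantPermission and TrustLevel are assumptions based on the usual console imports:

// Hypothetical sketch: push the topology state through the "incremental" update path
// by temporarily changing the participant's trust level and then reverting it.
mydomain.participants.set_state(pid, ParticipantPermission.Submission, TrustLevel.Vip)
// once the change has propagated, revert to the original trust level
mydomain.participants.set_state(pid, ParticipantPermission.Submission, TrustLevel.Ordinary)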

Likelihood

The bug is deterministic and occurs when using permissioned domains, when the participant state is received together with the namespace delegation of the domain but without the namespace delegation of the participant.

Recommendation

Upgrade to this version if you intend to use permissioned domains.
If you need to fix a broken system, then upgrade to a version fixing the issue and apply the work-around to "flush" the topology state.

(23-025, Minor): PingService stops working after a Ledger API crash

After an Indexer restart in the Ledger API or any error causing the client transaction streams to fail, the PingService stops working.

Affected Deployments

The participant node

Impact

When the Ledger API encounters an error that leads to cancelling the client connections
while the participant node does not become passive, the PingService cannot continue processing commands.

Symptom

Ping commands issued in the PingService are timing out.
Additionally, the participant might appear unhealthy if configured to report health
by using the PingService (i.e. configured with monitoring.health.check.type = ping).

Workaround

Restart the participant node.

Likelihood

This bug occurs consistently when there is an error in the Ledger API, such as a DB overloaded
issue that causes the Ledger API Indexer to restart. For this bug to occur, the participant node
must not transition to passive state. If it transitions to passive and then back to active, the bug should not reproduce.

Recommendation

If the system is subject to frequent transient errors in the Ledger API (e.g. a flaky index database)
or consistently high load, upgrade to this version in order to avoid this issue.

Compatibility

The following Canton protocol and Ethereum sequencer contract versions are supported:

Dependency                   Version
Canton protocol versions     3, 4, 5

Canton has been tested against the following versions of its dependencies:

Dependency                   Version
Java Runtime                 OpenJDK 64-Bit Server VM Zulu11.62+17-CA (build 11.0.18+10-LTS, mixed mode)
Postgres                     postgres (PostgreSQL) 14.9 (Debian 14.9-1.pgdg120+1)
Oracle                       19.18.0