Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: adr-070 unordered transactions refactor #23974

Merged
merged 17 commits into from
Mar 19, 2025

Conversation

technicallyty
Copy link
Contributor

Description

new way of doing unordered txs


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title, you can find examples of the prefixes below:
  • confirmed ! in the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

Please see Pull Request Reviewer section in the contributing guide for more information on how to review a pull request.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

@technicallyty technicallyty marked this pull request as ready for review March 13, 2025 16:38
@technicallyty technicallyty requested a review from a team as a code owner March 13, 2025 16:38
// Manager contains the tx hash dictionary for duplicates checking, and expire
// them when block production progresses.
type Manager struct {
kvStore store.KVStoreService
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What module do we want to have this in? All KV store keys need a module. x/auth seems the most logical, but we could also have a small dedicated x/auth/unordered module.

Also, I think we can just use collections.Set to deal with state here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What module do we want to have this in?

are there tradeoffs between doing one or the other?

Also, I think we can just use collections.Set to deal with state here.

i was toying around with this, but was having trouble figuring out how to iterate from nil to a partial value of <block_time/.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x/auth means one less module to wire up in app.go, but maybe more tie in to x/auth long term. This is maybe a longer discussion. But for easy upgrades in 0.53, I would say x/auth and we expose a simple, well defined interface for it.

Collections isn't essential here I think, just thought it might be easier.

Copy link
Contributor Author

@technicallyty technicallyty Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have a PoC where i've made unordered sequences managed directly by x/auth store/keeper. you can check it out here: #24010

i also switched to using collections after playing with the API more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks generally good, but as a practice for ADR I don't find it particularly useful to have this level of detail around code implementation (I realize the previous iteration had a lot of detail too). The more important thing IMHO is the conceptual flow, i.e. pseudo-code - what happens in each phase, what properties of security, performance, etc. does the code have

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense - i did find it cumbersome going back and forth between this and the draft. throwing some simple python in here would be easier to deal with

* Jan 30, 2024: Include section on deterministic transaction encoding
- Dec 4, 2023: Initial Draft (@yihuang, @tac0turtle, @alexanderbez)
- Jan 30, 2024: Include section on deterministic transaction encoding
- Mar 12, 2025: Revise implementation to use Cosmos SDK KV Store and require unique timeouts per-address (@technicallyty)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update date whenever we merge


## Status

ACCEPTED
Draft
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update to accpeted whenever we merge

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we're following PROCESS.md, this should probably be ACCEPTED Not Implemented

and limit the size of the dictionary.
When an unordered transaction is included in a block, a concatenation of the `timeout_timestamp` and sender’s bech32 address
will be recorded to state (i.e. `542939323/cosmos1v1234567890AbcDeF`). In cases of multi-party signing, we will use a
comma-separated list of the addresses that signed the transaction (i.e. `5532231/cosmosv11,cosmosv12,cosmosv13`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should specify that this is sorted (and implement it as sorted)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

theoretically you could submit two Tx with the same timestamp with with A, B, C signers and B, A, C signers right?

These are effectively the same Tx but in the store (if not sorted) would be considered two separate

return nil
```go
func (am AppModule) PreBlock(ctx context.Context) (appmodule.ResponsePreBlock, error) {
err := am.accountKeeper.GetUnorderedTxManager().RemoveExpired(sdk.UnwrapSDKContext(ctx))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should have NPE protection on this for when users do not use this

Copy link
Contributor Author

@technicallyty technicallyty Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the account keeper currently always has this instantiated, so it wouldn't NPE

you can specify to enable the unordered ante in the ante options, but we'd need another option if we want to enable/disable constructing this field in the account keeper, replacing it with a no-op impl maybe.. might be awkward UX to specify this in two different places though. probably best to walk through this on a call

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see your point - lets just be aware of it when we do code reviews for this. I think you're right about how it is awkward if users have to set in multiple places

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's just simpler to add these methods directly to the account keeper expected interface. There's not much advantage to needed to get an unordered tx manager from the account keeper IMHO. Simpler just to have something like:

type AccountKeeper interface {
  ...
  TryAddUnorderedNonce(context,Context, timestamp Timestamp, address []byte) error
  CleanupUnorderedNonces(context.Context)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts here? Looks like this was marked as resolved

Copy link
Member

@aaronc aaronc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments - maybe some small decisions to think through, but generally looks good.

safety, we should limit the range of `timeout_timestamp` to prevent very long expiration,
and limit the size of the dictionary.
When an unordered transaction is included in a block, a concatenation of the `timeout_timestamp` and sender’s bech32 address
will be recorded to state (i.e. `542939323/cosmos1v1234567890AbcDeF`). In cases of multi-party signing, we will use a sorted,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why bech32 encoded? Shouldn't we just use the binary address bytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll change it to bytes. the PoC used the pubkey.address.string() but i can just use bytes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, this complicates the impl a bit, since i was using the concat of pubkey address strings. lmk what you think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the update to one entry per signer, i have changed it to just use the pubkey bytes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we want to use the address, not the pubkey bytes which could depend on how the pubkey is serialized. As an alternative to the address that may be even better is just using the account number which is just a uint64

return nil
```go
func (am AppModule) PreBlock(ctx context.Context) (appmodule.ResponsePreBlock, error) {
err := am.accountKeeper.GetUnorderedTxManager().RemoveExpired(sdk.UnwrapSDKContext(ctx))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's just simpler to add these methods directly to the account keeper expected interface. There's not much advantage to needed to get an unordered tx manager from the account keeper IMHO. Simpler just to have something like:

type AccountKeeper interface {
  ...
  TryAddUnorderedNonce(context,Context, timestamp Timestamp, address []byte) error
  CleanupUnorderedNonces(context.Context)
}

// Manager contains the tx hash dictionary for duplicates checking, and expire
// them when block production progresses.
type Manager struct {
kvStore store.KVStoreService
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks generally good, but as a practice for ADR I don't find it particularly useful to have this level of detail around code implementation (I realize the previous iteration had a lot of detail too). The more important thing IMHO is the conceptual flow, i.e. pseudo-code - what happens in each phase, what properties of security, performance, etc. does the code have

Copy link
Contributor

github-actions bot commented Mar 18, 2025

⚠️ govulncheck found vulnerabilities:

mkdir -p /home/runner/work/cosmos-sdk/cosmos-sdk/build/
GOBIN=/home/runner/work/cosmos-sdk/cosmos-sdk/build go install golang.org/x/vuln/cmd/govulncheck@latest
go: downloading golang.org/x/vuln v1.1.4
go: downloading golang.org/x/telemetry v0.0.0-20240522233618-39ace7a40ae7
go: downloading golang.org/x/mod v0.22.0
go: downloading golang.org/x/tools v0.29.0
/home/runner/work/cosmos-sdk/cosmos-sdk/build/govulncheck ./...
=== Symbol Results ===

Vulnerability #1: GO-2025-3443
    CometBFT allows a malicious peer to stall the network by disseminating
    seemingly valid block parts in github.com/cometbft/cometbft
  More info: https://pkg.go.dev/vuln/GO-2025-3443
  Module: github.com/cometbft/cometbft
    Found in: github.com/cometbft/[email protected]
    Fixed in: github.com/cometbft/[email protected]
    Example traces found:
      #1: testutil/network/util.go:74:24: network.startInProcess calls service.BaseService.Start, which eventually calls types.Part.ValidateBasic
      #2: testutil/network/util.go:74:24: network.startInProcess calls service.BaseService.Start, which eventually calls types.Part.ValidateBasic
      #3: client/rpc/block.go:56:36: rpc.QueryBlocks calls local.Local.BlockSearch, which eventually calls types.PartFromProto
      #4: client/rpc/block.go:56:36: rpc.QueryBlocks calls local.Local.BlockSearch, which eventually calls types.PartFromProto

Your code is affected by 1 vulnerability from 1 module.
This scan also found 0 vulnerabilities in packages you import and 2
vulnerabilities in modules you require, but your code doesn't appear to call
these vulnerabilities.
Use '-show verbose' for more details.
make: *** [Makefile:155: vulncheck] Error 3

Copy link
Contributor

    ⚠️ **govulncheck found vulnerabilities:**


    >mkdir -p /home/runner/work/cosmos-sdk/cosmos-sdk/build/

GOBIN=/home/runner/work/cosmos-sdk/cosmos-sdk/build go install golang.org/x/vuln/cmd/govulncheck@latest
go: downloading golang.org/x/vuln v1.1.4
go: downloading golang.org/x/telemetry v0.0.0-20240522233618-39ace7a40ae7
go: downloading golang.org/x/mod v0.22.0
go: downloading golang.org/x/tools v0.29.0
/home/runner/work/cosmos-sdk/cosmos-sdk/build/govulncheck ./...
=== Symbol Results ===

Vulnerability #1: GO-2025-3443
CometBFT allows a malicious peer to stall the network by disseminating
seemingly valid block parts in github.com/cometbft/cometbft
More info: https://pkg.go.dev/vuln/GO-2025-3443
Module: github.com/cometbft/cometbft
Found in: github.com/cometbft/[email protected]
Fixed in: github.com/cometbft/[email protected]
Example traces found:
#1: testutil/network/util.go:74:24: network.startInProcess calls service.BaseService.Start, which eventually calls types.Part.ValidateBasic
#2: testutil/network/util.go:74:24: network.startInProcess calls service.BaseService.Start, which eventually calls types.Part.ValidateBasic
#3: client/rpc/block.go:56:36: rpc.QueryBlocks calls local.Local.BlockSearch, which eventually calls types.PartFromProto
#4: client/rpc/block.go:56:36: rpc.QueryBlocks calls local.Local.BlockSearch, which eventually calls types.PartFromProto

Your code is affected by 1 vulnerability from 1 module.
This scan also found 0 vulnerabilities in packages you import and 2
vulnerabilities in modules you require, but your code doesn't appear to call
these vulnerabilities.
Use '-show verbose' for more details.
make: *** [Makefile:155: vulncheck] Error 3

Copy link
Member

@aaronc aaronc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've addressed the key issues 👍. We can deal with details in the implementation PR

Comment on lines 94 to 102
func (m *AccountKeeper) Contains(ctx sdk.Context, sender []byte, timestamp uint64) (bool, error) {
return m.unorderedSequences.Has(ctx, collections.Join(timestamp, sender))
}

func (m *UnorderedTxManager) Add(ctx sdk.Context, sender []byte, timestamp uint64) error {
func (m *AccountKeeper) Add(ctx sdk.Context, sender []byte, timestamp uint64) error {
return m.unorderedSequences.Set(ctx, collections.Join(timestamp, sender))
}

func (m *UnorderedTxManager) RemoveExpired(ctx sdk.Context) error {
func (m *AccountKeeper) RemoveExpired(ctx sdk.Context) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just suggest something a little more detailed than simply Contains, Add, etc. that mentions unordered

@technicallyty technicallyty enabled auto-merge (squash) March 19, 2025 19:42
Copy link
Contributor

    ⚠️ **govulncheck found vulnerabilities:**


    >mkdir -p /home/runner/work/cosmos-sdk/cosmos-sdk/build/

GOBIN=/home/runner/work/cosmos-sdk/cosmos-sdk/build go install golang.org/x/vuln/cmd/govulncheck@latest
go: downloading golang.org/x/vuln v1.1.4
go: downloading golang.org/x/telemetry v0.0.0-20240522233618-39ace7a40ae7
go: downloading golang.org/x/mod v0.22.0
go: downloading golang.org/x/tools v0.29.0
/home/runner/work/cosmos-sdk/cosmos-sdk/build/govulncheck ./...
=== Symbol Results ===

Vulnerability #1: GO-2025-3443
CometBFT allows a malicious peer to stall the network by disseminating
seemingly valid block parts in github.com/cometbft/cometbft
More info: https://pkg.go.dev/vuln/GO-2025-3443
Module: github.com/cometbft/cometbft
Found in: github.com/cometbft/[email protected]
Fixed in: github.com/cometbft/[email protected]
Example traces found:
#1: testutil/network/util.go:74:24: network.startInProcess calls service.BaseService.Start, which eventually calls types.Part.ValidateBasic
#2: testutil/network/util.go:74:24: network.startInProcess calls service.BaseService.Start, which eventually calls types.Part.ValidateBasic
#3: client/rpc/block.go:56:36: rpc.QueryBlocks calls local.Local.BlockSearch, which eventually calls types.PartFromProto
#4: client/rpc/block.go:56:36: rpc.QueryBlocks calls local.Local.BlockSearch, which eventually calls types.PartFromProto

Your code is affected by 1 vulnerability from 1 module.
This scan also found 0 vulnerabilities in packages you import and 2
vulnerabilities in modules you require, but your code doesn't appear to call
these vulnerabilities.
Use '-show verbose' for more details.
make: *** [Makefile:155: vulncheck] Error 3

@technicallyty technicallyty merged commit 6577e1e into release/v0.53.x Mar 19, 2025
43 of 44 checks passed
@technicallyty technicallyty deleted the technicallyty/new-unordered-adr branch March 19, 2025 19:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants