Member

@KolbyML KolbyML commented Sep 10, 2025

Fixes NIT-2567

I have implemented the JSON-RPC interface for the consensus and execution layers.

I set up a basic CLI argument, --node.interconnect, which has 2 options:
- direct (the default): uses the pre-existing interfaces
- jsonrpc: runs the Consensus and Execution interfaces over JSON-RPC

I run direct and it works fine.

jsonrpc fails with: failed to create node err="not connected". To fix this, I assume we need to ensure that
- the client is connecting to the right jsonrpc server
- the clients are started after the server is started

^ so basically, ensuring the servers are running and properly connected before trying to use the interfaces

In terms of scope, the issue asks to "Have configuration to use existing interface or new RPC interface", and I set things up with this in mind. In the future I assume that we will allow users to run exclusively the consensus side so that a different execution layer client can be chosen, but to my knowledge that task was out of scope for this issue/PR and will definitely require some rearranging and refactoring of how things are initialized.

So I believe this issue is more of a V1 before we add more complexity to how the node is initialized.

If anybody has any questions, feel free to ask me.

Updated description

#3617 (comment)

Contributor

ganeshvanahalli commented Sep 18, 2025

This PR has changed from its initial design. Execution and Consensus can each accept JSON-RPC connections, enabled through the flags --execution.rpc-server.* and --node.rpc-server.* respectively, and each can connect to an external counterpart over JSON-RPC through the flags --execution.consensus-rpc-client.* and --node.execution-rpc-client.* respectively.

Nitro has two new flags:

  • --execution-node=<bool> determines whether an execution node is created, or whether consensus instead connects over RPC to an implementer of the execution.ExecutionClient interface
  • --consensus-execution-use-rpc=<bool> determines, when both the execution and consensus nodes are started, whether they communicate over RPC (used for testing)

Testing done

A new CI step has been added to run current system tests with json rpc interconnect enabled.

Resolves NIT-2567

Contributor

@diegoximenes diegoximenes left a comment

Nice 🙂, this is quite important for the alternative clients effort!

The way I see the UX on how to run a Nitro node, implemented in this repo, is something like this:

  1. Node operator runs only Execution: ./nitro --consensus-node=false --execution-node=true. In that case Execution will connect with Consensus through RPC. We don't even create an arbnode.Node object in this scenario.
  2. Node operator runs only Consensus: ./nitro --consensus-node=true --execution-node=false. In that case Consensus will connect with Execution through RPC.
  3. Node operator runs Execution and Consensus: ./nitro --execution-node=true --consensus-node=true --consensus-execution-use-rpc=<bool>. A single process will run both Execution and Consensus.

I don't like the --consensus-execution-use-rpc flag on option 3 BTW, but it is worth having it for testing, at least for a while.
I don't see an advantage of running Option 3 over RPC in production if we have an option to have Consensus and Execution living in the same process.
But without this flag, to test this PR we would need to run two ./nitros and have an external component, similar to consensusexecution_consensus.InitAndStartExecutionAndConsensusNodes, coordinate Execution/Consensus to handle the chicken-and-egg problem of which component to start first.

It is a little bit different from what this PR is proposing.
WDYT?

@diegoximenes
Contributor

To keep in mind:

  • The main goal of this PR is to enable our current Nitro golang Consensus node to run with Erigon and Nethermind execution clients. The first milestone for Erigon and Nethermind is to only implement simple execution clients, i.e., execution clients that implement the ExecutionClient interface, so we are not worried about other behaviors that are described by the ExecutionRecorder/ExecutionSequencer/ExecutionBatchPoster interfaces.
  • Also, we should not be worried, for now at least, about the scenario in which our current Nitro golang Execution node connects with a Consensus node that is not the one that is implemented in this repo.

By "not worried" I mean that we don't need to cover them in this PR, but we should definitely consider that those scenarios could, and ideally will, happen 🙂, so changes here shouldn't make our lives too difficult when covering them in the future.

There is this test, in which it is possible to create a Nitro node with an execution client that only implements the ExecutionClient interface, and not ExecutionRecorder, etc.
It uses CreateNodeExecutionClient, instead of CreateNodeFullExecutionClient, under the hood.

That said, for this PR, the minimal requirements could be:

  1. If --consensus-node==false then fail during config validation step. Or don't even add this --consensus-node config for now, it can be easier 🙂
  2. If --execution-node==false:
     • We can assume that there will be an external execution client interacting with this Consensus node.
     • Use CreateNodeExecutionClient instead of CreateNodeFullExecutionClient
     • Consensus will need to have an RPC server so an external execution client can "connect" to it.
     • We should avoid creating an execution.ExecutionNode object, or at least not set the consensus client to the execution.ExecutionNode object. It can become a weird situation if this ExecutionNode starts calling Consensus, which will already be called by the external execution client. Two execution clients will be connected with this Consensus node, it can become a mess 😬.
  3. If --execution-node==true and --consensus-execution-use-rpc==true then:
     • We will need to validate Execution config, to fail if it is trying to use Sequencer/Recorder/BatchPoster behavior.
     • Use CreateNodeExecutionClient instead of CreateNodeFullExecutionClient
     • Consensus will need to have an RPC server
     • Execution will need to have an RPC server
  4. Else we do what is already being done on master.
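
The four cases above can be read as a small decision function. This is only a sketch of that reading in Go: Flags, Wiring, and chooseWiring are hypothetical names for illustration, and the single UsesSequencerRecorderOrBatchPoster field stands in for whatever the real Execution config validation would check.

```go
package main

import (
	"errors"
	"fmt"
)

// Wiring names how Consensus reaches Execution. Hypothetical type.
type Wiring string

const (
	WiringDirect      Wiring = "direct"                      // what master does today
	WiringExternalRPC Wiring = "external-execution-over-rpc" // no local execution node
	WiringLocalRPC    Wiring = "local-execution-over-rpc"    // same process, via RPC
)

// Flags mirrors the flags discussed above; the last field is an assumption
// standing in for the Sequencer/Recorder/BatchPoster config checks.
type Flags struct {
	ConsensusNode                      bool
	ExecutionNode                      bool
	ConsensusExecutionUseRPC           bool
	UsesSequencerRecorderOrBatchPoster bool
}

// chooseWiring walks the cases in the order they are listed above.
func chooseWiring(f Flags) (Wiring, error) {
	switch {
	case !f.ConsensusNode:
		return "", errors.New("--consensus-node=false is not supported yet")
	case !f.ExecutionNode:
		return WiringExternalRPC, nil
	case f.ConsensusExecutionUseRPC:
		if f.UsesSequencerRecorderOrBatchPoster {
			return "", errors.New("only the ExecutionClient interface is available over RPC")
		}
		return WiringLocalRPC, nil
	default:
		return WiringDirect, nil
	}
}

func main() {
	cases := []Flags{
		{ConsensusNode: true, ExecutionNode: true},
		{ConsensusNode: true, ExecutionNode: false},
		{ConsensusNode: true, ExecutionNode: true, ConsensusExecutionUseRPC: true},
		{ConsensusNode: false, ExecutionNode: true},
	}
	for _, f := range cases {
		w, err := chooseWiring(f)
		fmt.Printf("%+v -> %q, err=%v\n", f, w, err)
	}
}
```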

I am OK with covering all execution behaviors with the RPC approach right now TBH, it seems the way to go long term, but it is definitely not a requirement.
In this way CreateNodeFullExecutionClient can still be used for those scenarios, and there will be fewer config validation steps.
The code diff of this PR will be a little bit bigger though.
If you want to follow this approach I would first get buy-in from Tsahi; he has mentioned a few times that he doesn't want to implement the RPC interfaces for other behaviors right now.

We could have the same behavior without the --execution-node and --consensus-execution-use-rpc flags BTW, only relying on the rpc flags such as --execution-rpc-server and --execution-rpc-client.
But it becomes less explicit, and more confusing IMO, e.g.:

  • Setting both --execution-rpc-server and --execution-rpc-client is analogous to: --execution-node==true and --consensus-execution-use-rpc==true
  • Leaving --execution-rpc-server unset and setting --execution-rpc-client is analogous to: --execution-node==false
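
To make the "less explicit" point concrete, here is a rough Go sketch of what inferring the behavior from the rpc flags alone would look like. inferFromRPCFlags is a hypothetical function; only the two combinations listed above are taken from this comment, and the remaining ones are reported as unknown rather than guessed.

```go
package main

import "fmt"

// inferFromRPCFlags derives the two explicit booleans from whether the
// rpc server and rpc client flags are set. Hypothetical helper.
func inferFromRPCFlags(serverSet, clientSet bool) (executionNode, useRPC, known bool) {
	switch {
	case serverSet && clientSet:
		// Analogous to --execution-node==true and --consensus-execution-use-rpc==true.
		return true, true, true
	case !serverSet && clientSet:
		// Analogous to --execution-node==false; useRPC is irrelevant here.
		return false, false, true
	default:
		// The comment does not spell out the remaining combinations.
		return false, false, false
	}
}

func main() {
	for _, c := range [][2]bool{{true, true}, {false, true}, {true, false}, {false, false}} {
		en, rpc, known := inferFromRPCFlags(c[0], c[1])
		fmt.Printf("server=%v client=%v -> execution-node=%v use-rpc=%v known=%v\n",
			c[0], c[1], en, rpc, known)
	}
}
```

A reader of the config has to reconstruct this table in their head, which is the argument for keeping the explicit flags.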

WDYT?

@ganeshvanahalli
Contributor

To keep in mind:

...

I've addressed your comments and added two new flags to nitro:

  • --execution-node=<bool> determines whether an execution node is created, or whether consensus instead connects over RPC to an implementer of the execution.ExecutionClient interface
  • --consensus-execution-use-rpc=<bool> determines, when both the execution and consensus nodes are started, whether they communicate over RPC (used for testing)

@tsahee tsahee assigned tsahee and unassigned diegoximenes Oct 8, 2025
@diegoximenes diegoximenes assigned tsahee and unassigned diegoximenes Nov 14, 2025
Contributor

@tsahee tsahee left a comment

not done reviewing.. overall seems great

arbnode/node.go Outdated
latestWasmModuleRoot common.Hash,
useRPC bool,
) (*Node, error) {
if useRPC {
Contributor

why give a boolean that means "ignore executionClient" and not just give it the correct execution client?

Contributor

@ganeshvanahalli ganeshvanahalli Dec 5, 2025

since the RPC config is currently part of the arbnode's config, creating the client inside this method feels right?
The reason for using a separate boolean comes from this conversation with @diegoximenes #3617

arbnode/node.go Outdated
return nil, errors.New("full execution client must be non-nil")
}
var executionClient execution.ExecutionClient
if useRPC {
Contributor

same as before.. instead of useRPC, accept both "executionClient" and FullExecutionClient, and the "executionClient" could either be the same as fullExecutionClient, or it could be the rpc client (?)
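
A rough Go sketch of that shape, for illustration only: the interfaces below are simplified stand-ins for execution.ExecutionClient and the full execution client, and NewNode, localExecution, rpcExecution, and the URL are hypothetical. The point is that the caller picks the client, so the constructor needs no useRPC boolean.

```go
package main

import (
	"errors"
	"fmt"
)

// ExecutionClient is a simplified stand-in for execution.ExecutionClient.
type ExecutionClient interface {
	Describe() string
}

// FullExecutionClient is a stand-in for the richer in-process interface.
type FullExecutionClient interface {
	ExecutionClient
	Record() string
}

// localExecution plays the role of the in-process execution node.
type localExecution struct{}

func (localExecution) Describe() string { return "in-process execution" }
func (localExecution) Record() string   { return "recorded in-process" }

// rpcExecution plays the role of the RPC client; it only covers ExecutionClient.
type rpcExecution struct{ url string }

func (r rpcExecution) Describe() string { return "execution over rpc at " + r.url }

type Node struct {
	exec ExecutionClient
	full FullExecutionClient
}

// NewNode takes both clients; exec may be the same value as full or an RPC client.
func NewNode(exec ExecutionClient, full FullExecutionClient) (*Node, error) {
	if full == nil {
		return nil, errors.New("full execution client must be non-nil")
	}
	if exec == nil {
		return nil, errors.New("execution client must be non-nil")
	}
	return &Node{exec: exec, full: full}, nil
}

func main() {
	local := localExecution{}
	direct, _ := NewNode(local, local)
	overRPC, _ := NewNode(rpcExecution{url: "ws://127.0.0.1:8552"}, local) // made-up URL
	fmt.Println(direct.exec.Describe())
	fmt.Println(overRPC.exec.Describe())
}
```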

BlocksReExecutor blocksreexecutor.Config `koanf:"blocks-reexecutor"`
EnsureRollupDeployment bool `koanf:"ensure-rollup-deployment" reload:"hot"`
ExecutionNode bool `koanf:"execution-node"`
ConsensusExecutionInSameProcessUseRPC bool `koanf:"consensus-execution-in-same-process-use-rpc"`
Contributor

let's instead just use the Node.ExecutionRPCClient.URL field:

  • If it's empty (default) - create an execution node and use it directly.
  • if it's "self" or "self-auth" - create an execution node and connect to it via rpc (still connect to it directly for fullExecutionNode)
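
As a sketch of how the URL field could drive the setup, in Go: ExecutionMode and resolveExecutionMode are hypothetical names. The empty, "self", and "self-auth" cases come from the suggestion above; treating any other value as the address of an external execution client is an assumption the comment does not state.

```go
package main

import "fmt"

// ExecutionMode is a hypothetical enum for how Consensus reaches Execution.
type ExecutionMode string

const (
	ModeDirect      ExecutionMode = "direct"       // create an execution node, use it directly
	ModeSelfRPC     ExecutionMode = "self-rpc"     // create an execution node, talk to it over RPC
	ModeExternalRPC ExecutionMode = "external-rpc" // assumption: no local execution node
)

// resolveExecutionMode maps the ExecutionRPCClient.URL value to a mode.
func resolveExecutionMode(url string) ExecutionMode {
	switch url {
	case "":
		return ModeDirect
	case "self", "self-auth":
		return ModeSelfRPC
	default:
		// Assumption: any other value is the address of an external client.
		return ModeExternalRPC
	}
}

func main() {
	for _, u := range []string{"", "self", "self-auth", "ws://example.invalid:8552"} {
		fmt.Printf("%q -> %s\n", u, resolveExecutionMode(u))
	}
}
```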

tsahee previously requested changes Dec 9, 2025
Contributor

@tsahee tsahee left a comment

small comments

@ganeshvanahalli ganeshvanahalli removed their assignment Dec 9, 2025
Contributor

@tsahee tsahee left a comment

getting really close.. tiny comment + f2f

util/common.go Outdated
return ret
}

func BlockNumberToMessageIndex(blockNum, genesis uint64) (arbutil.MessageIndex, error) {
Contributor

we currently have, in arbutil/block_message_relation.go:
BlockNumberToMessageCount
MessageCountToBlockNumber

Either is reasonable, but we should not have both versions

Contributor

fixed it

@tsahee tsahee assigned diegoximenes and unassigned tsahee Dec 11, 2025
@tsahee tsahee dismissed their stale review December 11, 2025 02:15

addressed

exit 1
- name: run tests with consensus and execution nodes connected over json rpc
if: matrix.test-mode == 'defaults'
Contributor

@diegoximenes diegoximenes Dec 11, 2025

Is this step being run by CI?

Some possible strategies:

  • Two new steps, one for inputs.run-defaults-a and another for inputs.run-defaults-b. This will likely "double" the CI times related to run-defaults-*.
  • Create two other inputs, something like run-consensus-execution-over-rpc-a and run-consensus-execution-over-rpc-b, that will use the hash scheme. This one seems safer right now.
  • Create specific system tests related to consensus and execution communicating through RPC.

Contributor

It used to run previously, but this PR is pretty old and the CI workflow has changed. I updated it to run along with defaults-A and defaults-B.

if err := c.Execution.Validate(); err != nil {
return err
}
if c.Node.ExecutionRPCClient.URL == "self" {
Contributor

c.Node.Sequencer || c.Node.BatchPoster.Enable || c.Node.BlockValidator.Enable check should also be done here.
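
A minimal Go sketch of such a check, assuming the intent is to reject the combination during validation. The config types below are cut-down stand-ins that only mirror the field names used in this thread, and validateSelfRPC and its error message are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Cut-down stand-ins for the real config structs; only the fields
// mentioned in this thread are modelled.
type EnableFlag struct{ Enable bool }

type RPCClientConfig struct{ URL string }

type NodeConfig struct {
	Sequencer          bool
	BatchPoster        EnableFlag
	BlockValidator     EnableFlag
	ExecutionRPCClient RPCClientConfig
}

type Config struct{ Node NodeConfig }

// validateSelfRPC rejects "self" RPC wiring when the node is also configured
// as a sequencer, batch poster, or block validator. Hypothetical helper.
func (c *Config) validateSelfRPC() error {
	if c.Node.ExecutionRPCClient.URL != "self" {
		return nil
	}
	if c.Node.Sequencer || c.Node.BatchPoster.Enable || c.Node.BlockValidator.Enable {
		return errors.New("sequencer, batch poster and block validator are not supported with execution-rpc-client url \"self\"")
	}
	return nil
}

func main() {
	cfg := Config{}
	cfg.Node.ExecutionRPCClient.URL = "self"
	fmt.Println("plain node:", cfg.validateSelfRPC())

	cfg.Node.BatchPoster.Enable = true
	fmt.Println("batch poster:", cfg.validateSelfRPC())
}
```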

if meta.ParentChainBlock <= l1BlockNum {
signedBlockNum := arbutil.MessageCountToBlockNumber(meta.MessageCount, genesisNum)
// #nosec G115
signedBlockNum := int64(arbutil.MessageIndexToBlockNumber(meta.MessageCount, genesisNum)) - 1
Contributor

nitpick:

Suggested change
signedBlockNum := int64(arbutil.MessageIndexToBlockNumber(meta.MessageCount, genesisNum)) - 1
signedBlockNum := int64(arbutil.MessageIndexToBlockNumber(meta.MessageCount-1, genesisNum))

Contributor

This is intentional to avoid underflow

return b
}

func (b *NodeBuilder) WithConsensusExecutionOverRPC() *NodeBuilder {
Contributor

@diegoximenes diegoximenes Dec 11, 2025

nitpick: This func is not being called directly by any system test. Why make it public, and why not rely on calling configureConsensusExecutionOverRPC directly?
