Description
Abstract
Add automated E2E/Integration tests that cover on-chain interactions.
Motivation
Currently, we don't test Livepeer node software with the chain in any automated way. That approach results in:
- A lot of time spent on manually testing each chain-related change
- Bugs in the released software (e.g. "Not possible to redeem tickets with Redeemer on Arbitrum" #2283 or "Session crashes (without errors in log) right after the O sends a tx" #2152)
Our current test suite covers unit tests and video-related integration tests, but we use mocks for all on-chain interactions.
Proposed Solution
I suggest creating a separate suite of tests in Golang which uses the testcontainer library, and then implementing white-box tests covering user-facing scenarios.
Here's a PoC code skeleton showing how it would look in practice.
1. Separate suite of tests
The E2E/Integration tests need a connection to the chain, so they'll have specific environment requirements. That's why I suggest creating:
- a separate `test_e2e.sh` script to run them
- a separate directory in `go-livepeer` called `tests_e2e` (or `e2e`)
- a separate GH Action
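One way to keep the suite separate is a Go build tag, so that a plain `go test ./...` keeps skipping these tests. A minimal sketch (the tag name and directory follow the structure above, but are assumptions):

```go
//go:build e2e

// tests_e2e/workflow_test.go
//
// test_e2e.sh would then boil down to roughly:
//
//	go test -tags e2e -v ./tests_e2e/...
package e2e
```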
2. Testcontainer library
As for the chain, I think we should use the Livepeer Geth Docker image, the same one we currently use for local dev testing. To integrate it well with Golang tests, we can make use of the testcontainer library, which enables running integration tests in exactly the same way we run unit tests (using `go test` or directly from the IDE). The only requirement for the tests is to have Docker running.
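To illustrate, here's a minimal sketch of what a `setupGeth` helper (used in the PoC skeleton below) might look like with testcontainers-go; the wrapper type and the wait strategy are assumptions:

```go
package e2e

import (
	"context"
	"testing"

	"github.com/testcontainers/testcontainers-go"
	"github.com/testcontainers/testcontainers-go/wait"
)

// gethContainer bundles the running container with its JSON-RPC URI.
type gethContainer struct {
	testcontainers.Container
	URI string
}

// setupGeth starts the livepeer/geth-with-livepeer-protocol image and waits
// until its JSON-RPC port accepts connections.
func setupGeth(ctx context.Context, t *testing.T) *gethContainer {
	req := testcontainers.ContainerRequest{
		Image:        "livepeer/geth-with-livepeer-protocol:streamflow",
		ExposedPorts: []string{"8545/tcp"},
		WaitingFor:   wait.ForListeningPort("8545/tcp"),
	}
	c, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
		ContainerRequest: req,
		Started:          true,
	})
	if err != nil {
		t.Fatalf("failed to start geth container: %v", err)
	}
	uri, err := c.Endpoint(ctx, "http") // e.g. http://localhost:49154
	if err != nil {
		t.Fatalf("failed to resolve geth endpoint: %v", err)
	}
	return &gethContainer{Container: c, URI: uri}
}
```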
3. White-box testing
There are two ways we can approach creating tests: black-box testing and white-box testing.
In our case, we can imagine a sample black-box test as the following script:
```bash
# 1. Start local Geth
docker run livepeer/geth-with-livepeer-protocol:streamflow

# 2. Start Orchestrator
livepeer -network=devenv -ethUrl http://localhost:8545/ -orchestrator

# 3. Register Orchestrator
curl http://localhost:7936/activateOrchestrator

# 4. Start Broadcaster
livepeer -network=devenv -ethUrl http://localhost:8545/ -broadcaster

# 5. Fund Deposit/Reserve
curl http://localhost:7937/fundDepositAndReserve

# ...
```
While this script looks simple and tests exactly what a user does, I believe this is not the approach we should take, for the following reasons:
- We may want to mock some parts of the system (e.g. video-transcoding)
- We may need to make some tweaks to make tests repeatable (e.g. set `winProb` to `1`)
- We may want to test smaller parts of the flow (not fully E2E) at some point
Therefore, I propose to use white-box testing while staying as close to black-box testing as possible. A sample white-box testing approach is presented in the PoC code skeleton and looks as follows.
```go
func TestCompleteStreamingWorkflow(t *testing.T) {
	// Start the chain in a container and tear it down when the test ends
	gethC := setupGeth(context.TODO(), t)
	defer gethC.Terminate(context.TODO())
	ethClient := newEthClient(gethC.URI, t)

	// Drive the same steps a user would perform
	orch := startOrchestrator(ethClient)
	registerOrchestrator(orch)
	bcast := startBroadcaster(ethClient)
	fundDepositAndReserve(bcast)
}

func startOrchestrator(ethClient eth.LivepeerEthClient) *server.LivepeerServer {
	n, _ := core.NewLivepeerNode(ethClient, "./tmp", nil)
	s, _ := server.NewLivepeerServer("127.0.0.1:1938", n, true, "")
	// ...
	return s
}
```
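The helpers above are only signatures. For instance, `newEthClient` would need to wait for the containerized chain to accept RPC calls before wiring up the Livepeer client. Here's a sketch of that readiness check using go-ethereum's `ethclient` (the retry count is an assumption; constructing the actual `eth.LivepeerEthClient` is omitted because it takes many node-specific arguments):

```go
package e2e

import (
	"context"
	"fmt"
	"time"

	"github.com/ethereum/go-ethereum/ethclient"
)

// waitForChain polls the JSON-RPC endpoint until it answers eth_chainId,
// so tests don't race the container's startup.
func waitForChain(ctx context.Context, uri string) (*ethclient.Client, error) {
	client, err := ethclient.Dial(uri)
	if err != nil {
		return nil, err
	}
	for i := 0; i < 30; i++ {
		if _, err = client.ChainID(ctx); err == nil {
			return client, nil
		}
		time.Sleep(time.Second)
	}
	return nil, fmt.Errorf("chain at %s not ready: %w", uri, err)
}
```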
4. User-facing scenarios
I think we should focus on having user-facing E2E scenarios as opposed to just checking that all the eth client functions work correctly. The reason for this approach is that most of the bugs we discovered were not related to single on-chain interactions.
Here is the scenario I think we should cover first.
Scenario 1: Full B<>O workflow
- Start O
- Register O
- Start B
- Fund Deposit/Reserve
- Start a stream
- Check if the stream was transcoded
- Check that a ticket was received by O
- Redeem the ticket
- Check ETH balance (a sketch of this assertion follows)
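For the final step, a sketch of how the balance check could look with go-ethereum's `BalanceAt` (the helper and its arguments are hypothetical; in practice the assertion may need to account for protocol fee accounting and gas):

```go
package e2e

import (
	"context"
	"math/big"
	"testing"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethclient"
)

// assertBalanceIncreased sketches the "Check ETH balance" step: the
// orchestrator's balance should be higher after redeeming a winning ticket.
func assertBalanceIncreased(ctx context.Context, t *testing.T, client *ethclient.Client, addr common.Address, before *big.Int) {
	after, err := client.BalanceAt(ctx, addr, nil)
	if err != nil {
		t.Fatalf("BalanceAt failed: %v", err)
	}
	if after.Cmp(before) <= 0 {
		t.Fatalf("expected balance to increase: before=%s after=%s", before, after)
	}
}
```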
Other scenarios to consider
- B<>O failover
- O<>T failover
- Separate Redeemer process
- Livepeer Router
- Filtering O selection (`maxPricePerUnit`, latency, etc.)
- Network issues (outage injection is sketched after this list)
  - Redeemer unavailable
  - ETH URL unavailable / rate limited
- Fast verification and suspension mechanism
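For the network-issue scenarios, testcontainers makes outage injection straightforward. Here's a sketch of simulating an unavailable ETH URL by stopping the chain container mid-test (the stop timeout and the placeholder assertions are assumptions):

```go
package e2e

import (
	"context"
	"testing"
	"time"

	"github.com/testcontainers/testcontainers-go"
)

// simulateEthOutage stops the chain container, leaves room for assertions on
// how B/O behave without the ETH URL, and then brings the chain back up.
func simulateEthOutage(ctx context.Context, t *testing.T, gethC testcontainers.Container) {
	stopTimeout := 10 * time.Second
	if err := gethC.Stop(ctx, &stopTimeout); err != nil {
		t.Fatalf("failed to stop geth container: %v", err)
	}
	// ... assert that the node keeps running and surfaces errors in logs ...
	if err := gethC.Start(ctx); err != nil {
		t.Fatalf("failed to restart geth container: %v", err)
	}
}
```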
Implementation Tasks and Considerations
- Prepare `geth-with-livepeer-protocol` Docker image
  - Update to the Confluence version
  - Prepare builds for `arm64/v8`
  - (optional) Integrate GH Actions in `livepeer/protocol` to automatically build the Docker image
- Implement Scenario 1
  - Create reusable code with `testcontainer`
  - Decide on the test structure (one scenario per file or all scenarios in one file)
  - Refactor `livepeer.go` to simplify running it from tests (see the sketch after this list)
  - Write code to cover Scenario 1
- Create automation
  - Create `test_e2e.sh` script
  - Create GH Action
- Implement other scenarios
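For the `livepeer.go` refactor, one possible shape (all names here are assumptions, not the current code) is to split flag parsing from startup so tests can invoke the node in-process with a hand-crafted config:

```go
package main

import (
	"context"
	"flag"
	"log"
)

// Config captures what the CLI flags describe today (subset shown).
type Config struct {
	Network      string
	EthURL       string
	Orchestrator bool
}

func parseFlags() Config {
	var cfg Config
	flag.StringVar(&cfg.Network, "network", "offchain", "network to connect to")
	flag.StringVar(&cfg.EthURL, "ethUrl", "", "Ethereum JSON-RPC URL")
	flag.BoolVar(&cfg.Orchestrator, "orchestrator", false, "run as orchestrator")
	flag.Parse()
	return cfg
}

func main() {
	if err := run(context.Background(), parseFlags()); err != nil {
		log.Fatal(err)
	}
}

// run holds the actual startup logic; E2E tests call it directly instead of
// shelling out to the livepeer binary.
func run(ctx context.Context, cfg Config) error {
	// ... start the node, RPC servers, etc. ...
	return nil
}
```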
Testing Tasks and Considerations
We need to make sure that:
- Tests run correctly in GH Actions and in local `amd64` and `arm64/v8` (M1) environments
- No test flakiness is noticed
Known Unknowns
There are a few things to consider during the implementation.
1. Mocking video transcoding
We can consider mocking the video transcoding part since it is resource-consuming. Nevertheless, I suggest first trying real transcoding and optimising only when needed.
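If mocking does turn out to be needed, the usual shape is a stub implementation of the node's transcoder interface. A sketch of the pattern (the interface shown is illustrative, not go-livepeer's exact one):

```go
package e2e

// Transcoder is a stand-in for the node's transcoder interface; the real one
// in go-livepeer's core package has a richer signature.
type Transcoder interface {
	Transcode(segment []byte) ([][]byte, error)
}

// stubTranscoder returns the source segment as the only rendition, skipping
// the CPU cost of real transcoding in E2E runs.
type stubTranscoder struct{}

func (s *stubTranscoder) Transcode(segment []byte) ([][]byte, error) {
	return [][]byte{segment}, nil
}
```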
2. Scope of test scenarios
Initially, I plan to have complete user-facing testing scenarios, but if their scope turns out to be too big, we can try splitting them into smaller ones, e.g. testing only O registration instead of the full B<>O workflow.
3. No Arb Node Docker image
Currently, we only have a Geth Docker image, but no Arb Node Docker image. The networks are compatible for most scenarios, but we may consider building an Arb Node Docker image at some point.
Alternatives
1. Separate suite of tests:
Instead of creating a separate suite of tests, we could add the tests to the existing folders and/or run them with `test.sh`.
2. Testcontainer library
Instead of using the local Geth chain, we could run tests against Arbitrum Rinkeby, but I'm afraid we'd see a lot of flakiness due to:
- constant data changes in Rinkeby
- unavailability of ETH RPC endpoints
3. Golang tests
Instead of integrating tests into Golang, we could do one of the following:
- Use a BDD framework (e.g. Cucumber) - it would make the test scenarios more readable, but at the same time make the test code more complex and harder to maintain
- Use bash - it would make the tests more E2E-like, because that's how people run Livepeer; however, I'm not in favor of this solution, because bash-based tests are hard to maintain and it would make any white-box tweaks (which may be required in our case) impossible
4. User-facing scenarios
Instead of creating complete user-facing E2E scenarios, we could target just parts of the interactions, e.g. only O registration or only ticket redemption. I think we may need to move to such a format at some point; however, I'd start with the full user-facing scenarios, because that's how we currently test manually. Later, if we suffer from having too many scenarios, we can consider splitting them into smaller parts.
Additional Context
N/A