Skip to content

Conversation

@a-nogikh
Copy link
Collaborator

@a-nogikh a-nogikh commented Jan 2, 2025

It allows for better caching/parallelization.
Also, the resulting image is now ~100MB smaller.
@a-nogikh a-nogikh requested a review from tarasmadan January 2, 2025 13:35
This allows for better caching and parallelization.
DOCKER_BUILDKIT=1 ensures that the Docker builder parallelizes the build
steps (whenever it's possible).
@a-nogikh a-nogikh force-pushed the features/update-env-docker branch from c26231e to e7009b2 Compare January 2, 2025 17:34
@tarasmadan
Copy link
Collaborator

I'm hesitating to add spanner emulator as a dependency. I still believe mocking is a better way to go.
Let's discuss offline.

@a-nogikh
Copy link
Collaborator Author

a-nogikh commented Jan 9, 2025

Even if we generate mocks to test the higher level logic, we still want to test the code that actually interacts with the database (the syz-cluster/pkg/db/* files in the syz-cluster PR). Mocks won't let us test that all the SQL commands we issue are correct and doing what they were supposed to do.

@tarasmadan
Copy link
Collaborator

tarasmadan commented Jan 9, 2025

You can specify the SQL command your code is expected to generate and send. Unit Test will check the command equality.
The moment we want to check the SQL command itself sounds like "let's test the tests" to me. That's why I'm hesitating.
If we want to check external system (DB) reaction to this command, it sounds like an integration test to me.

The question then is "why is emulator better than the real system?".

@a-nogikh
Copy link
Collaborator Author

a-nogikh commented Jan 9, 2025

You can specify the SQL command your code is expected to generate and send. Unit Test will check the command equality.

It also means that there's less trust to such tests unless we have proper full integration testing done at presubmit (and we don't).

The reviewer, if they want to be sure, would have to actually deploy the PR on the server or recreate the DB in their Spanner instance and re-run the modified SQL queries. Otherwise, even though the code is "tested", in reality the code that passed all our checks can easily fail in production.

The question then is "why is emulator better than the real system?".

At the very least, it's faster (one command to run Go tests that takes less than a second vs minutes to push/rebuild/redeploy on Cloud and monitor the results) and more convenient during active development. K8S gives the flexibility to deploy in different environments, and so far I have really enjoyed having a local dev cluster on my workstation.

I would also really like to decouple syz-cluster development from having to think up/rework/reimplement the syzkaller testing strategy. Yes, it would be great to have integration tests in the target environment, but do we want to halt all other work for several weeks right now to figure it out?

@dvyukov
Copy link
Collaborator

dvyukov commented Jan 10, 2025

I am in favor of adding the emulator.
It's similar to the datastore emulator we use for the dashboard/app tests. That worked very well for us. The advantages are as compared to mocks are:

  • higher-level, more realistic end-to-end tests with better coverage
  • easier to write tests
  • tests are more stable, don't need updates on unimportant code changes

Mocks won't replace end-to-end testings with emulator b/c they will have smaller code coverage. Mocks are complementary to end-to-end tests, they are more unit-test style and allow to test implementation details.

I see we can have either:

  1. End-to-end tests with an emulator + unit tests with mocks.
  2. Only end-to-end tests.

But we shouldn't have only mock based tests. If code does the right thing in the end, I don't care much what exact queries it issued, how many, and what results it received. But if it does not work in the end, but issued query sequence that matches some hard-coded golden sequence, that's not very useful.

So I would go with merging the emulator. Later we may add mock-based tests (if we start missing bugs that can't be detected with end-to-end tests, and mocks show good additional-code-to-value ratio).

@tarasmadan
Copy link
Collaborator

@a-nogikh what is the proposed eventual testing strategy? It may be better to focus on this question.

I'm using mocks for Spanner unit testing (in existing code and in the pending). You like to use emulator for similar unit testing. Mocks are faster, more flexible and will give you better coverage. You're able to simulate any external system failure.

Emulator has limitations (one, two).
One interesting limitation is "The emulator only allows one read-write transaction or schema change at a time.". Do you think syzbot-wide integration testing will be possible using emulator?

It's similar to the datastore emulator we use for the dashboard/app tests. That worked very well for us.

#4785 is a specific Datastore Emulator concern I have.

@a-nogikh
Copy link
Collaborator Author

a-nogikh commented Jan 13, 2025

Emulator has limitations (one, two).

These don't seem to be critical for our use cases.

I'm using mocks for Spanner unit testing (in existing code and in the pending). You like to use emulator for similar unit testing. Mocks are faster, more flexible and will give you better coverage. You're able to simulate any external system failure.

That would work perfectly well for higher level tests that use syz-cluster/pkg/db (so we'd mock the DB repositories), but for the code that actually interacts with the DB (the syz-cluster/pkg/db/* implementation), one would want to also make sure that:

  • The DB schema / migration definitions are correct.
  • The code executes syntactically correct SQL queries that actually do what's needed.

Mock tests would only check that the code behaves exactly as stated in tests, but that does not guarantee that the behavior is indeed correct.

#4785 is a specific Datastore Emulator concern I have.

Spanner emulator is a separate binary, so there should be no problems like that.

Copy link
Collaborator

@tarasmadan tarasmadan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's try

@a-nogikh a-nogikh added this pull request to the merge queue Jan 14, 2025
Merged via the queue into google:master with commit f310a27 Jan 14, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants