Skip to content

feat: SGW tested with overloaded docs + attachments#393

Merged
vipbhardwaj merged 2 commits into
mainfrom
sgw-large-wokload-tests
May 6, 2026
Merged

feat: SGW tested with overloaded docs + attachments#393
vipbhardwaj merged 2 commits into
mainfrom
sgw-large-wokload-tests

Conversation

@vipbhardwaj
Copy link
Copy Markdown
Contributor

@vipbhardwaj vipbhardwaj commented Apr 22, 2026

CBG-5312

These are the first two tests for "large workload tests"

  1. Larger than 20MB doc.
  2. An attachment of 50MB. It uses the new file added in dataset named xl2.jpg.

Test 1 does not use a "CBL with batch_updating then replicating to SGW" scenario like the Test 2.
This is because it was impossible for the batch_updater to even get a 19.9 MB document for case 1, to the Test Server via the API.

Copy link
Copy Markdown
Member

@borrrden borrrden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing related to the TDK itself here, so the SGW team will need to comment on correctness here

Copy link
Copy Markdown
Collaborator

@torcolvin torcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't necessarily think we should push this code at all, because we don't need to test SG only behavior in this way. The component that is useful to us is specifically to test what happens if CBL pushes a document that is larger than 20MB, and that needs a test server improvement (or this is not possible in CBL, in which case there's nothing to test).

I've given some general comments on improvements for test readability and debugability, these are things I care about a lot because we want it to be easier to understand a failure if it happens. Using shared test functions for common behaviors allows us to fix this in one place and all tests would benefit. I do not think you necessarily have to do any of these things, but I think thinking about them and filing CBG tickets for the one(s) you think would be useful would be good. These can end up being 1-2pt tickets and then we can figure out how to prioritize them in sprint planning.

Lastly, I would prefer not to commit a markdown document of the test plan, due to the likelihood of it coming out of sync with the test. The test code is the canonical representation for the test and I'm not sure that the duplicate data in a markdown file is useful. A great plan for designing a test is to put up that style of document in google docs which allows easy comments and give to the Sync Gateway team before writing a test, especially because in this case this is doing duplicate coverage of data we have inside the Sync Gateway tests.

Comment thread tests/QE/test_large_doc_workloads.py Outdated
async def test_doc_body_size_boundary(self, cblpytest: CBLPyTest) -> None:
"""This test does not use a TS/CBL to replicate documents to SGW
This is because the batch updater cannot handle such 20MB docs
And returns 500 error then'n'there..."""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't true of Lite in general, can you ask in chat and probably file a CBL ticket to describe the 500 error?

Specifically we know how the REST api responds to large documents and we can test this within Sync Gateway's repos. We don't know how the blip api responds to messages, which is what we want to test and know if we regress on.

As is, this test isn't useful to Sync Gateway team.

Copy link
Copy Markdown
Contributor Author

@vipbhardwaj vipbhardwaj Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to go via the TS to SGW too, for the 20MB doc, but it just wasn't supported by the batch_updater used in the TDK. In general, ofcourse CBL would be able to have a 20+MB doc pushed into itself.
Its just currently its not supported in the TDK.

Comment thread tests/QE/test_large_doc_workloads.py Outdated
self.mark_test_step(
"Create a 19.9 MB document via SGW admin API — expect acceptance."
)
under_limit_payload = _generate_payload(int(19.9 * SIZE_MB), channels=["test"])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state of a document size on Couchbase Server is the body size + metadata size, so it might be important to think about priming the size of the metadata to be large enough that the body is OK but the metadata isn't.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you mean I should aim for a bigger metadata and smaller body, how's that achieved? I didn't quite understand.
Like, just adding random stuff in metadata? That'd make the metadata invalid, won't it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what does this have to do with SGW? That's being tested right.

Comment thread tests/QE/test_large_doc_workloads.py Outdated
Comment on lines +69 to +77
assert under_limit_doc is not None, (
"SGW should return a RemoteDocument for accepted 19.9 MB doc"
)
assert under_limit_doc.id == "doc_19_9mb", (
f"Returned doc ID mismatch: expected 'doc_19_9mb', got '{under_limit_doc.id}'"
)
assert under_limit_doc.revid is not None or under_limit_doc.cv is not None, (
"Accepted document must have a revision ID or CV assigned by SGW"
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be inclined to wrap up these assertions in create_document so all callers do not need to check for success. I think you could do this as a separate cleanup PR to make the QE tests more reasonable.

Comment thread tests/QE/test_large_doc_workloads.py Outdated
Comment on lines +85 to +88
assert retrieved_doc is not None, (
"19.9 MB doc should be retrievable via GET after successful creation"
)
assert retrieved_doc.id == "doc_19_9mb", "Retrieved doc ID mismatch"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, and in a separate PR from this one, you can do the assertion about this inside get_document.

Comment thread tests/QE/test_large_doc_workloads.py Outdated
Comment on lines +118 to +120
rejected_doc = await sg.get_document(
sg_db, "doc_20_1mb", "_default", "_default"
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be contrary to what I said before, but I wonder if a better API is to have this return a specific exception for a missing document that you can catch with pytest.raises rather than rely on the caller to always do a nil check.

Comment thread tests/QE/test_large_doc_workloads.py Outdated
rejected_doc = await sg.get_document(
sg_db, "doc_20_1mb", "_default", "_default"
)
assert rejected_doc is None, (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this returns nil, then it can't return 404 in the exception.

Prefer pytest.raises for exception catching if you expect to catch an exception.

Comment thread tests/QE/test_large_doc_workloads.py Outdated
Comment on lines +309 to +313
assert sgw_rejected is None, "Oversized blob doc must NOT exist on SGW"
except CblSyncGatewayBadResponseError as e:
assert e.code == 404, (
f"Expected 404 for rejected blob doc on SGW, got HTTP {e.code}"
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think that assert is None and except are both true, always use pytest.raises in your tests.

cbs_blob = cbs.get_document(bucket_name, "oversized_blob_doc")
assert cbs_blob is None, "Oversized blob doc must NOT exist in CBS bucket"

await ts.cleanup()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea behind cleanup that you always want to clean up, or you only want to clean up if the test is successful? You always have to clean up at the start of test test but if you wanted to clean up at the end:

@pytest.fixture
def cleanup(request, cblpytest):
    yield
    
    report = getattr(request.node, "rep_call", None)
    
    if report and report.passed:
        for ts in cblpytest.test_servers:
             ts.cleanup()

Or you could run this unilaterally.

Copy link
Copy Markdown
Contributor Author

@vipbhardwaj vipbhardwaj Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, this was pre-written and pre-used in the dev_e2e tests and hence when I came and learnt TDK testing I just picked this up. Seems like a good point but not related to this PR's perspective

@vipbhardwaj
Copy link
Copy Markdown
Contributor Author

I feel like a lot of these comments are related to the TDK's basic state right now and how can it be improved.
I want to work on these good points, genuinely, but I would like to also keep the focus of this PR on the sanity of these tests and how can they be improved/adjusted before they're merged.

TDK's enhancement will be picked up on a separate PR as you told me.
But for now, can you help me with the SGW's and CBL's POV of how the tests are and what more would they need?

I wont merge this PR till the TDK's enhancements that you've mentioned are done and up! I'll link that future PR to this one.

@vipbhardwaj vipbhardwaj assigned torcolvin and unassigned vipbhardwaj Apr 28, 2026
@vipbhardwaj vipbhardwaj requested a review from torcolvin April 28, 2026 10:12
@vipbhardwaj
Copy link
Copy Markdown
Contributor Author

@torcolvin its been lonely here, this PR is getting rusty lets get it merged.

@vipbhardwaj vipbhardwaj merged commit df44841 into main May 6, 2026
5 checks passed
@vipbhardwaj vipbhardwaj deleted the sgw-large-wokload-tests branch May 6, 2026 07:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants