@@ -128,5 +128,6 @@ def _write_single_block(block: Block, project_id: str, dataset: str) -> None:
            [
                _write_single_block.remote(block, self.project_id, self.dataset)
                for block in blocks
                if BlockAccessor.for_block(block).num_rows > 0
            ]
        )
22 changes: 22 additions & 0 deletions python/ray/data/tests/datasource/test_bigquery.py
@@ -255,6 +255,28 @@ def test_write_dataset_exists(self, ray_get_mock):
            ),
        )

    def test_write_empty_block(self, ray_get_mock):
        """Test that writing a zero-sized block doesn't crash.

        See https://github.com/ray-project/ray/issues/51892
        """
        bq_datasink = BigQueryDatasink(
            project_id=_TEST_GCP_PROJECT_ID,
            dataset=_TEST_BQ_DATASET,
        )
        # Create an empty block with schema but no rows
        block = pa.Table.from_arrays(
            [pa.array([], type=pa.int64())], names=["data"]
        )
        ctx = TaskContext(1, "")
        # This should not raise an error - empty blocks should be skipped
        bq_datasink.write(
            blocks=[block],
            ctx=ctx,
        )

        ray_get_mock.assert_not_called()
medium

The assertion assert_not_called() is incorrect because ray.get() is still invoked even when the list of remote tasks is empty. To correctly verify that no tasks are submitted for an empty block, you should assert that ray.get() was called once with an empty list.

Suggested change
- ray_get_mock.assert_not_called()
+ ray_get_mock.assert_called_once_with([])

Test assertion incorrect for empty block case

Medium Severity

The test assertion ray_get_mock.assert_not_called() is incorrect. When the write method is called with only empty blocks, the list comprehension filters them out, producing an empty list []. However, ray.get([]) is still called (the ray.get call is unconditional). The mock would be called with an empty list argument, causing assert_not_called() to fail. The assertion should verify ray.get was called with an empty list instead.
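
For reference, the behavior described above can be reproduced without Ray or BigQuery. This is only a minimal sketch, assuming the write path filters empty blocks and then calls the getter unconditionally, as the diff suggests. The names `write`, `submit`, and `get` are hypothetical stand-ins for illustration; they are not Ray APIs.

```python
from unittest import mock


def write(blocks, get, submit):
    """Hypothetical stand-in for the sink's write path.

    Empty blocks are filtered out, but get() is called unconditionally,
    so it receives an empty list when every block is empty.
    """
    return get([submit(block) for block in blocks if len(block) > 0])


get_mock = mock.Mock(return_value=[])
submit_mock = mock.Mock()

# One empty block: nothing is submitted, yet get() is still invoked.
write([[]], get_mock, submit_mock)

# No task was submitted for the empty block.
submit_mock.assert_not_called()
# The getter was called exactly once, with an empty list.
get_mock.assert_called_once_with([])

# assert_not_called() on the getter raises, because a call was recorded.
try:
    get_mock.assert_not_called()
    not_called_passed = True
except AssertionError:
    not_called_passed = False
```

Under these assumptions, `assert_called_once_with([])` is the check that matches the unconditional call. `assert_not_called()` would only pass if the write path returned early when the filtered list is empty.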




if __name__ == "__main__":
    import sys