Commit 9f1c891

Document cold buckets and GCS 429 problem (#1009)
1 parent 542e032 commit 9f1c891

1 file changed: +34 -3 lines changed

docs/docs/icechunk-python/performance.md

Lines changed: 34 additions & 3 deletions
@@ -9,6 +9,36 @@

Icechunk is designed to be cloud native, making it able to take advantage of the horizontal scaling of cloud providers. To learn more, check out [this blog post](https://earthmover.io/blog/exploring-icechunk-scalability) which explores just how well Icechunk can perform when matched with AWS S3.

## Cold buckets and repos

Modern object stores usually reshard their buckets on the fly, based on perceived load. The
strategies they use are not published and are very hard to discover. The details matter less than
the takeaway: on new buckets, and even on new repositories, the object store may not scale well
from the start. You are expected to ramp up load slowly as you write data to the repository.

Once you have applied consistently high read/write load to a repository for a few minutes, the object
store will usually reshard your bucket, allowing for more load. While this resharding happens, different
object stores respond in different ways. For example, S3 returns 5xx errors with a "SlowDown"
indication, and GCS returns 429 responses.

Icechunk helps with this process by retrying failed requests with exponential backoff. In our
experience, the default configuration is enough to ingest into a fresh bucket using around 100 machines.
If that is not the case for you, you can tune the retry configuration using [StorageRetriesSettings](https://icechunk.io/en/latest/icechunk-python/reference/#icechunk.StorageRetriesSettings).
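
As a rough illustration of what retrying with exponential backoff means, the sketch below computes a doubling delay schedule with a cap. It is not Icechunk's implementation: the function name `backoff_delays_ms` and its parameters are hypothetical, the numbers are made up, and real clients usually add random jitter on top.

```python
def backoff_delays_ms(initial_ms: int, max_ms: int, max_tries: int) -> list[int]:
    """Return the wait before each retry: doubles each time, capped at max_ms.

    Hypothetical helper for illustration only; not part of Icechunk.
    """
    delays = []
    delay = initial_ms
    # The first try happens immediately, so there are max_tries - 1 waits.
    for _ in range(max_tries - 1):
        delays.append(min(delay, max_ms))
        delay *= 2
    return delays


schedule = backoff_delays_ms(initial_ms=100, max_ms=1_000, max_tries=6)
print(schedule)  # [100, 200, 400, 800, 1000]
```

With these illustrative numbers, a request that keeps failing waits 2.5 seconds in total before giving up. Allowing more tries or a larger maximum backoff gives the object store more time to reshard.
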

To learn more about how Icechunk manages object store prefixes, read our
[blog post](https://earthmover.io/blog/exploring-icechunk-scalability)
on Icechunk scalability.

!!! warning

    Currently, Icechunk's implementation of retry logic during resharding is not
    [working properly](https://github.com/earth-mover/icechunk/issues/954) on GCS.
    We have a [pull request open](https://github.com/apache/arrow-rs-object-store/pull/410) to
    one of Icechunk's dependencies that will solve this.
    In the meantime, if you get 429 errors from your Google bucket, please lower concurrency and try
    again. Then increase concurrency slowly, backing off again if the errors return.

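
The manual procedure in the warning above (lower concurrency on 429 responses, then raise it slowly) can be sketched as an additive-increase, multiplicative-decrease loop. This is only an assumption about how one might automate it in a driver script, not something Icechunk provides; `ConcurrencyLimiter` and its methods are hypothetical names, and the batch outcomes are simulated.

```python
class ConcurrencyLimiter:
    """Hypothetical helper: tracks how many concurrent writers to allow."""

    def __init__(self, start: int, minimum: int = 1, maximum: int = 256) -> None:
        self.minimum = minimum
        self.maximum = maximum
        self.limit = max(minimum, min(start, maximum))

    def on_success(self) -> int:
        # Additive increase: allow one more worker after a clean batch.
        self.limit = min(self.limit + 1, self.maximum)
        return self.limit

    def on_throttled(self) -> int:
        # Multiplicative decrease: halve the limit after a 429 response.
        self.limit = max(self.limit // 2, self.minimum)
        return self.limit


limiter = ConcurrencyLimiter(start=64)
# Simulated batch outcomes: True means the batch saw a 429 response.
outcomes = [True, True, False, False, True]
history = [
    limiter.on_throttled() if throttled else limiter.on_success()
    for throttled in outcomes
]
print(history)  # [32, 16, 17, 18, 9]
```

Halving on errors and adding one worker at a time on success keeps the ramp-up gentle, which matches the advice to increase load slowly while the bucket reshards.
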
## Preloading manifests

Coming Soon.
@@ -42,6 +72,7 @@ repo_config = ic.RepositoryConfig(manifest=ic.ManifestConfig(splitting=split_con
```

Then pass the config to `Repository.open` or `Repository.create`:

```python
repo = ic.Repository.open(..., config=repo_config)
```
@@ -55,8 +86,8 @@ Options for specifying the arrays whose manifest you want to split are:
3. [`ManifestSplitCondition.and_conditions`](./reference.md#icechunk.ManifestSplitCondition.and_conditions) to combine (1), (2), and (4) together; and
4. [`ManifestSplitCondition.or_conditions`](./reference.md#icechunk.ManifestSplitCondition.or_conditions) to combine (1), (2), and (3) together.

`And` and `Or` may be used to combine multiple path and/or name matches. For example,

```python exec="on" session="perf" source="material-block"
array_condition = ManifestSplitCondition.or_conditions(
    [
@@ -75,8 +106,8 @@ Options for specifying how to split along a specific axis or dimension are:
2. [`ManifestSplitDimCondition.DimensionName`](./reference.md#icechunk.ManifestSplitDimCondition.DimensionName) takes a regular expression used to match the dimension names of the array;
3. [`ManifestSplitDimCondition.Any`](./reference.md#icechunk.ManifestSplitDimCondition.Any) matches any _remaining_ dimension name or axis.

For example, for an array with dimensions `time, latitude, longitude`, the following config

```python exec="on" session="perf" source="material-block"
from icechunk import ManifestSplitDimCondition

@@ -86,8 +117,8 @@ from icechunk import ManifestSplitDimCondition
    ManifestSplitDimCondition.Any(): 1,
}
```

will result in splitting manifests so that each manifest contains (3 longitude chunks x 2 latitude chunks x 1 time chunk) = 6 chunks per manifest file.
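
The arithmetic above generalizes to any array: the number of chunks per manifest is the product of the split sizes. The sketch below works through it in plain Python, without calling Icechunk. The chunk grid is a hypothetical example, and the manifest count assumes splits are aligned to the start of each dimension.

```python
import math

# Split sizes from the example above: chunks per manifest along each dimension.
split_sizes = {"time": 1, "latitude": 2, "longitude": 3}
chunks_per_manifest = math.prod(split_sizes.values())

# Hypothetical chunk grid: number of chunks along each dimension of the array.
chunk_grid = {"time": 10, "latitude": 4, "longitude": 7}

# Assumption: splits start at the origin of each dimension, so a dimension
# needs ceil(n_chunks / split_size) manifests along it.
manifests_per_dim = {
    dim: math.ceil(chunk_grid[dim] / split_sizes[dim]) for dim in chunk_grid
}
total_manifests = math.prod(manifests_per_dim.values())

print(chunks_per_manifest)  # 6
print(total_manifests)      # 60
```

Under these assumptions the last manifest along `longitude` holds fewer chunks, because 7 chunks do not divide evenly into groups of 3.
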

!!! note
