docs/configuration/client.md (+1 -1)
@@ -122,7 +122,7 @@ license: |
| celeborn.client.shuffle.rangeReadFilter.enabled | false | false | If a spark application have skewed partition, this value can set to true to improve performance. | 0.2.0 | celeborn.shuffle.rangeReadFilter.enabled |
| celeborn.client.shuffle.register.filterExcludedWorker.enabled | false | false | Whether to filter excluded worker when register shuffle. | 0.4.0 ||
| celeborn.client.shuffle.reviseLostShuffles.enabled | false | false | Whether to revise lost shuffles. | 0.6.0 ||
-| celeborn.client.shuffleDataLostOnUnknownWorker.enabled |false| false |Whether to mark shuffle data lost when unknown worker is detected.| 0.6.3 ||
+| celeborn.client.shuffleDataLostOnUnknownWorker.enabled | true | false | When enabled, any shuffle that had partitions on the (crashed) unknown worker is immediately marked as data lost. On the write path, revive and commit requests for that shuffle fail fast, and GetReducerFileGroup requests are answered with SHUFFLE_DATA_LOST. This has no effect when `celeborn.client.push.replicate.enabled` is `true`. | 0.6.3 ||
| celeborn.client.slot.assign.maxWorkers | 10000 | false | Max workers that slots of one shuffle can be allocated on. Will choose the smaller positive one from Master side and Client side, see `celeborn.master.slot.assign.maxWorkers`. | 0.3.1 ||
| celeborn.client.spark.batch.openStream.parallelClientCreation.enabled | true | false | Whether to create data clients in parallel before sending Spark batch open-stream requests. When false, data clients are created serially. | 0.6.3 ||
| celeborn.client.spark.fetch.cleanFailedShuffle | false | false | whether to clean those disk space occupied by shuffles which cannot be fetched | 0.6.0 ||
docs/migration.md (+2 -0)
@@ -37,6 +37,8 @@ license: |
- Since 0.7.0, Celeborn changed the default value of `celeborn.port.maxRetries` from `1` to `16`.
+- Since 0.7.0, Celeborn changed the default value of `celeborn.client.shuffleDataLostOnUnknownWorker.enabled` from `false` to `true`, which means that, by default, Celeborn marks shuffle data as lost when an unknown worker is detected.
+
# Upgrading from 0.5 to 0.6
- Since 0.6.0, Celeborn deprecate `celeborn.client.spark.fetch.throwsFetchFailure`. Please use `celeborn.client.spark.stageRerun.enabled` instead.
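
The new default for `celeborn.client.shuffleDataLostOnUnknownWorker.enabled` can be overridden per application. The fragment below is a minimal sketch of restoring the earlier `false` value, assuming a Spark deployment where Celeborn client settings are passed with the `spark.` prefix in `spark-defaults.conf`. Whether the old value is appropriate depends on the workload and on whether replication is enabled.

```properties
# Assumption: Celeborn client options are supplied to Spark with the "spark." prefix.
# Restores the pre-change behavior: shuffles are not marked as data lost
# as soon as an unknown worker is detected.
spark.celeborn.client.shuffleDataLostOnUnknownWorker.enabled  false
```

The same key and value can also be passed for a single job with `--conf` on `spark-submit`.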