docs/configuration/index.md
+2 lines changed: 2 additions & 0 deletions
@@ -1362,6 +1362,7 @@ Middle Managers pass their configurations down to their child peons. The Middle
|`druid.worker.baseTaskDirs`|List of base temporary working directories, one of which is assigned per task in a round-robin fashion. This property can be used to allow usage of multiple disks for indexing. This property is recommended in place of and takes precedence over `${druid.indexer.task.baseTaskDir}`. If this configuration is not set, `${druid.indexer.task.baseTaskDir}` is used. For example, `druid.worker.baseTaskDirs=[\"PATH1\",\"PATH2\",...]`.|null|
|`druid.worker.baseTaskDirSize`|The total number of bytes that tasks can use on any single task directory. This value applies to each directory independently: if it is 500 GB and there are 3 `baseTaskDirs`, each of those directories is allowed to hold up to 500 GB, for a potential total of 1.5 TB across all tasks. The actual amount of storage assigned to each task is discussed in [Configuring task storage sizes](../ingestion/tasks.md#configuring-task-storage-sizes).|`Long.MAX_VALUE`|
|`druid.worker.category`|A string to name the category that the Middle Manager node belongs to.|`_default_worker_category`|
+ |`druid.worker.startAlwaysEnabled`|If true, the Middle Manager always starts in the enabled state. If false, a disabled state set via the worker disable API is persisted and restored across restarts.|`false`|
|`druid.indexer.fork.property.druid.centralizedDatasourceSchema.enabled`|Set this config when the [Centralized Datasource Schema](#centralized-datasource-schema-experimental) feature is enabled.|false|
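As a sketch of how these Middle Manager settings fit together, a hypothetical `runtime.properties` fragment might look like the following. The paths and sizes are illustrative assumptions, not recommendations:

```properties
# Spread task working directories across two disks (paths are examples)
druid.worker.baseTaskDirs=["/mnt/disk1/druid/task","/mnt/disk2/druid/task"]
# Allow up to 500 GB per base task directory (1 TB potential total here)
druid.worker.baseTaskDirSize=500000000000
# Always come back up enabled, ignoring any persisted disabled state
druid.worker.startAlwaysEnabled=true
```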
#### Peon processing
@@ -1478,6 +1479,7 @@ For most types of tasks, `SegmentWriteOutMediumFactory` can be configured per-ta
|`druid.worker.baseTaskDirSize`|The total number of bytes that tasks can use on any single task directory. This value applies to each directory independently: if it is 500 GB and there are 3 `baseTaskDirs`, each of those directories is allowed to hold up to 500 GB, for a potential total of 1.5 TB across all tasks. The actual amount of storage assigned to each task is discussed in [Configuring task storage sizes](../ingestion/tasks.md#configuring-task-storage-sizes).|`Long.MAX_VALUE`|
|`druid.worker.globalIngestionHeapLimitBytes`|Total amount of heap available for ingestion processing. This is applied by automatically setting the `maxBytesInMemory` property on tasks.|Configured max JVM heap size / 6|
|`druid.worker.numConcurrentMerges`|Maximum number of segment persist or merge operations that can run concurrently across all tasks.|`druid.worker.capacity` / 2, rounded down|
+ |`druid.worker.startAlwaysEnabled`|If true, the Indexer always starts in the enabled state. If false, a disabled state set via the worker disable API is persisted and restored across restarts.|`false`|
|`druid.indexer.task.baseDir`|Base temporary working directory.|`System.getProperty("java.io.tmpdir")`|
|`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|`${druid.indexer.task.baseDir}/persistent/tasks`|
|`druid.indexer.task.gracefulShutdownTimeout`|Wait this long on Indexer restart for restorable tasks to gracefully exit.|`PT5M`|
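For the Indexer, the corresponding knobs might be combined in a `runtime.properties` fragment like the following. All values here are illustrative assumptions:

```properties
# Cap total ingestion heap at 4 GB instead of the default (max JVM heap / 6)
druid.worker.globalIngestionHeapLimitBytes=4294967296
# Allow at most 4 concurrent persist or merge operations across all tasks
druid.worker.numConcurrentMerges=4
# Always start in the enabled state, ignoring any persisted disabled state
druid.worker.startAlwaysEnabled=true
```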
|`druid.auth.pac4j.oidc.discoveryURI`|Discovery URI for fetching the OP metadata. See the [OpenID Connect Discovery specification](http://openid.net/specs/openid-connect-discovery-1_0.html).|none|Yes|
|`druid.auth.pac4j.oidc.oidcClaim`|The [claim](https://openid.net/specs/openid-connect-core-1_0.html#Claims) to extract from the ID Token after validation.|name|No|
|`druid.auth.pac4j.oidc.scope`|The scope used by the application during authentication to authorize access to a user's details.|`openid profile email`|No|
+ |`druid.auth.pac4j.oidc.clientAuthenticationMethod`|The client authentication method to use when communicating with the OIDC provider. Supported values: `client_secret_basic`, `client_secret_post`, `client_secret_jwt`, `private_key_jwt`, `none`. If not specified, pac4j auto-detects the method from the provider's discovery document. Set this explicitly if you need a specific method, for example when your provider advertises multiple methods but you want to use a particular one.|Auto-detected from provider|No|
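Putting the OIDC properties together, a hypothetical configuration fragment might look like this. The issuer URL and claim are placeholders; `clientAuthenticationMethod` is pinned explicitly rather than relying on auto-detection:

```properties
# Placeholder issuer; point at your provider's discovery document
druid.auth.pac4j.oidc.discoveryURI=https://idp.example.com/.well-known/openid-configuration
# Use the email claim from the ID Token as the identity
druid.auth.pac4j.oidc.oidcClaim=email
druid.auth.pac4j.oidc.scope=openid profile email
# Pin the token-endpoint authentication method explicitly
druid.auth.pac4j.oidc.clientAuthenticationMethod=client_secret_post
```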
:::info
Users must set a strong passphrase to ensure that an attacker is not able to guess it simply by brute force.
An autoscaler that computes the required supervisor task count via a cost function based on ingestion lag and the poll-to-idle ratio.
- Task counts are selected from a bounded range derived from the current partitions-per-task ratio,
- not strictly from factors/divisors of the partition count. This bounded partitions-per-task window enables gradual scaling while
- avoiding large jumps and still allowing non-divisor task counts when needed.
+ The cost-based autoscaler picks the number of ingestion tasks that minimizes a combined cost score. The score has two components:
- **It is experimental and the implementation details as well as cost function parameters are subject to change.**
+ - **Lag cost**: how long it would take to drain the current backlog at the observed processing rate. More tasks reduce this cost.
+ - **Idle cost**: how far the predicted idle ratio is from the target of ~25%. Tasks that are too busy (under-provisioned) or too idle (over-provisioned) both drive the score up.
+ The sweet spot is roughly 25% idle, giving headroom to absorb traffic spikes without wasting resources.
+
+ At every evaluation interval, Druid computes the score for each candidate task count and picks the one with the lowest total cost.
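The selection loop described above can be sketched as follows. This is an illustrative model only, not Druid's implementation: the helper names, the simple linear throughput model, and the hard-coded weights and 25% idle target are all assumptions for the sketch.

```java
// Illustrative cost-based task-count selector (not Druid's actual code).
public class CostSketch {
    static final double TARGET_IDLE = 0.25; // assumed idle-ratio sweet spot
    static final double LAG_WEIGHT = 0.4;   // weight of the lag cost
    static final double IDLE_WEIGHT = 0.6;  // weight of the idle cost

    // Seconds to drain the backlog if `tasks` tasks each process
    // `ratePerTask` records per second.
    static double lagCost(double backlogRecords, double ratePerTask, int tasks) {
        return backlogRecords / (ratePerTask * tasks);
    }

    // Distance of the predicted idle ratio from the target, assuming busy
    // work redistributes evenly across the candidate task count.
    static double idleCost(double busyFraction, int currentTasks, int tasks) {
        double predictedIdle = 1.0 - busyFraction * currentTasks / tasks;
        return Math.abs(predictedIdle - TARGET_IDLE);
    }

    // Evaluate every candidate count and pick the lowest combined score.
    static int pickTaskCount(double backlog, double ratePerTask, double busyFraction,
                             int currentTasks, int minTasks, int maxTasks) {
        int best = minTasks;
        double bestScore = Double.MAX_VALUE;
        for (int n = minTasks; n <= maxTasks; n++) {
            double score = LAG_WEIGHT * lagCost(backlog, ratePerTask, n)
                    + IDLE_WEIGHT * idleCost(busyFraction, currentTasks, n);
            if (score < bestScore) {
                bestScore = score;
                best = n;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // With a large backlog, more tasks win despite the added idle cost.
        System.out.println(pickTaskCount(1_000_000, 5_000, 0.9, 2, 1, 8)); // prints 8
    }
}
```

With a large backlog the lag term dominates and the selector scales up; with no backlog the idle term pulls the count back toward the busy/idle target.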
Note: Kinesis is not supported yet; support is in progress.
The following table outlines the configuration properties related to the `costBased` autoscaler strategy:
- |Property|Description|Required|Default|
- |---------|-----------|--------|-------|
- |`scaleActionPeriodMillis`|The frequency in milliseconds to check if a scale action is triggered.|No|600000|
- |`lagWeight`|The weight of the extracted lag value in the cost function.|No|0.25|
- |`idleWeight`|The weight of the extracted poll idle value in the cost function.|No|0.75|
- |`useTaskCountBoundaries`|Enables the bounded partitions-per-task window when selecting task counts.|No|`false`|
- |`highLagThreshold`|Average partition lag threshold that triggers burst scale-up when set to a value greater than `0`. Set to a negative value to disable burst scale-up.|No|-1|
- |`minScaleUpDelay`|Minimum cooldown duration after a scale-up action before the next scale-up is allowed, specified as an ISO-8601 duration string.|No||
- |`minScaleDownDelay`|Minimum cooldown duration after a scale-down action before the next scale-down is allowed, specified as an ISO-8601 duration string.|No|`PT30M`|
- |`scaleDownDuringTaskRolloverOnly`|Indicates whether task scaling down is limited to periods during task rollovers only.|No|`false`|
+ |`scaleActionPeriodMillis`|How often, in milliseconds, Druid evaluates whether to scale.|No|`600000` (10 minutes)|
+ |`lagWeight`|How much weight to give the lag cost relative to the idle cost. Higher values make the autoscaler more aggressive about adding tasks to drain backlog.|No|`0.4`|
+ |`idleWeight`|How much weight to give the idle cost relative to the lag cost. Higher values make the autoscaler more aggressive about removing over-provisioned tasks.|No|`0.6`|
+ |`useTaskCountBoundariesOnScaleUp`|Limits scale-up to a small step relative to the current task count, preventing large jumps. Disable to allow the autoscaler to jump directly to any task count.|No|`false`|
+ |`useTaskCountBoundariesOnScaleDown`|Limits scale-down to a small step relative to the current task count, preventing large drops. Disable to allow the autoscaler to drop directly to any task count.|No|`true`|
+ |`minScaleUpDelay`|Minimum cooldown after a scale-up before the next scale-up is allowed. Specified as an ISO-8601 duration.|No|`scaleActionPeriodMillis`|
+ |`minScaleDownDelay`|Minimum cooldown after a scale-down before the next scale-down is allowed. Specified as an ISO-8601 duration.|No|`PT30M`|
+ |`scaleDownDuringTaskRolloverOnly`|If `true`, scale-down actions are deferred until the next task rollover. This avoids disrupting in-progress ingestion.|No|`false`|
The following example shows a supervisor spec with the `costBased` autoscaler:
@@ -231,10 +232,10 @@ The following example shows a supervisor spec with `costBased` autoscaler:
embedded-tests/src/test/java/org/apache/druid/testing/embedded/indexing/autoscaler/CostBasedAutoScalerIntegrationTest.java
+6 −6 lines changed: 6 additions & 6 deletions
@@ -143,7 +143,7 @@ public void test_autoScaler_computesOptimalTaskCountAndProducesScaleUp()
}
});
- // These values were carefully handpicked to allow that test to pass in a stable manner.
+ // These values were carefully handpicked to allow the test to pass stably.
Copy file name to clipboardExpand all lines: extensions-core/druid-pac4j/pom.xml
+2 −2 lines changed: 2 additions & 2 deletions
@@ -34,11 +34,11 @@
</parent>
<properties>
- <pac4j.version>5.7.3</pac4j.version>
+ <pac4j.version>5.7.10</pac4j.version>
<!-- The following must be updated along with any updates to the pac4j version. The compatible versions of the Nimbus libraries can be found in the org.pac4j:pac4j-oidc dependencies. -->