Fix the broken anchor and some minor spacing issues in the text.
Leaving 1.6.3 updates as part of this change.
Closes: #525
Signed-off-by: hhcs9527 <hhcs9527@gmail.com>
Signed-off-by: Wilfred Spiegelenburg <wilfreds@apache.org>
docs/api/system.md
+3 -3 (3 additions & 3 deletions)
@@ -66,7 +66,7 @@ Note that this list is not guaranteed to remain stable and can change from relea

 **Content examples**

-The output of this REST query can be rather large, and it is a combination of those which have already been documented as part of the [scheduler API](scheduler.md#Overview).
+The output of this REST query can be rather large, and it is a combination of those which have already been documented as part of the [scheduler API](scheduler.md).

 The `RMDiagnostics` shows the content of the K8Shim cache. The exact content is version dependent and is not stable.
 The current content shows the cached objects:
@@ -81,7 +81,7 @@ The current content shows the cached objects:

 ## Go routine info

-Dumps the stack traces of the currently running goroutines. This is a similar view as provided in the [pprof goroutine](#pprof-goroutine) in a human-readable form.
+Dumps the stack traces of the currently running goroutines. This is a similar view as provided in the [pprof goroutine](#pprof-goroutine) in a human-readable form.

 **URL** : `/debug/stack`

@@ -317,7 +317,7 @@ trace: A trace of execution of the current program. You can specify the duration
docs/archived_design/k8shim.md
+2 -2 (2 additions & 2 deletions)
@@ -55,7 +55,7 @@ and a [validation webhook](https://kubernetes.io/docs/reference/access-authn-aut
 to immediately transition from the `Starting` to `Running` state so that it will not block other applications.
 2. The `validation webhook` validates the configuration set in the configmap
 - This is used to prevent writing malformed configuration into the configmap.
-- The validation webhook calls scheduler [validation REST API](api/scheduler.md#configuration-validation) to validate configmap updates.
+- The validation webhook calls scheduler [validation REST API](api/cluster.md#configuration-validation) to validate configmap updates.

 ### Admission controller deployment

@@ -66,7 +66,7 @@ On startup, the admission controller performs a series of tasks to ensure that i
 2. If the secret cannot be found or either CA certificate is within 90 days of expiration, generates new certificate(s). If a certificate is expiring, a new one is generated with an expiration of 12 months in the future. If both certificates are missing or expiring, the second certificate is generated with an expiration of 6 months in the future. This ensures that both certificates do not expire at the same time, and that there is an overlap of trusted certificates.
 3. If the CA certificates were created or updated, writes the secrets back to Kubernetes.
 4. Generates an ephemeral TLS server certificate signed by the CA certificate with the latest expiration date.
-5. Validates, and if necessary, creates or updates the Kubernetes webhook configurations named `yunikorn-admission-controller-validations` and `yunikorn-admission-controller-mutations`. If the CA certificates have changed, the webhooks will also be updated. These webhooks allow the Kubernetes API server to connect to the admission controller service to perform configmap validations and pod mutations.
+5. Validates, and if necessary, creates or updates the Kubernetes webhook configurations named `yunikorn-admission-controller-validations` and `yunikorn-admission-controller-mutations`. If the CA certificates have changed, the webhooks will also be updated. These webhooks allow the Kubernetes API server to connect to the admission controller service to perform configmap validations and pod mutations.
 6. Starts up the admission controller HTTPS server.

 Additionally, the admission controller also starts a background task to wait for CA certificates to expire. Once either certificate is expiring within the next 30 days, new CA and server certificates are generated, the webhook configurations are updated, and the HTTPS server is quickly restarted. This ensures that certificates rotate properly without downtime.
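
Reviewer note: the CA renewal rule in the startup steps above (renew within 90 days of expiry, 12 months for a renewed certificate, 6 months for the second one when both are renewed) can be sketched roughly as follows. This is an illustrative Python sketch, not YuniKorn code. The names `needs_renewal`, `plan_ca_expirations` and the constants are hypothetical, and "12 months" and "6 months" are approximated as 365 and 182 days.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Assumed approximations of the intervals named in the text.
RENEW_WINDOW = timedelta(days=90)     # "within 90 days of expiration"
LONG_LIFETIME = timedelta(days=365)   # "12 months in the future"
SHORT_LIFETIME = timedelta(days=182)  # "6 months in the future"


def needs_renewal(expiry: Optional[datetime], now: datetime) -> bool:
    """A certificate is renewed when it is missing or close to expiring."""
    return expiry is None or expiry - now <= RENEW_WINDOW


def plan_ca_expirations(
    first: Optional[datetime],
    second: Optional[datetime],
    now: datetime,
) -> Tuple[datetime, datetime, bool]:
    """Return the new expiry of both CA certificates and whether anything changed."""
    renew_first = needs_renewal(first, now)
    renew_second = needs_renewal(second, now)
    if renew_first:
        first = now + LONG_LIFETIME
    if renew_second:
        # When both are renewed together the second gets the shorter lifetime,
        # so the two certificates never expire at the same time.
        second = now + (SHORT_LIFETIME if renew_first else LONG_LIFETIME)
    return first, second, renew_first or renew_second
```

The returned flag corresponds to step 3: the secrets only need to be written back when at least one certificate was created or updated.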
docs/design/gang_scheduling.md
+13 -13 (13 additions & 13 deletions)
@@ -167,15 +167,15 @@ For gang scheduling we have a simple one new to one release relation in the case

 The scheduler processes the AllocationAsk as follows:
 1. Check if the application has an unreleased allocation for a placeholder allocation with the same _taskGroupName._ If no placeholder allocations are found a normal allocation cycle will be used to allocate the request.
-2. A placeholder allocation is selected and marked for release. A request to release the placeholder allocation is communicated to the shim. This must be an async process as the shim release process is dependent on the underlying K8s response which might not be instantaneous.
+2. A placeholder allocation is selected and marked for release. A request to release the placeholder allocation is communicated to the shim. This must be an async process as the shim release process is dependent on the underlying K8s response which might not be instantaneous.
 NOTE: no allocations are released in the core at this point in time.
-3. The core “parks” the processing of the real AllocationAsk until the shim has responded with a confirmation that the placeholder allocation has been released.
+3. The core “parks” the processing of the real AllocationAsk until the shim has responded with a confirmation that the placeholder allocation has been released.
 NOTE: locks are released to allow scheduling to continue
 4. After the confirmation of the release is received from the shim the “parked” AllocationAsk processing is finalised.
 5. The AllocationAsk is allocated on the same node as the placeholder used.
 The removal of the placeholder allocation is finalised in either case. This all needs to happen as one update to the application, queue and node.
 * On success: a new Allocation is created.
-* On Failure: try to allocate on a different node, if that fails the AllocationAsk becomes unschedulable triggering scale up.
+* On Failure: try to allocate on a different node, if that fails the AllocationAsk becomes unschedulable triggering scale up.
 6. Communicate the allocation back to the shim (if applicable, based on step 5)

 ## Application completion
@@ -196,7 +196,7 @@ The time out of the _waiting_ state is new functionality.

 Placeholders are not considered active allocations.
 Placeholder asks are considered pending resource asks.
-These cases will be handled in the [Cleanup](#Cleanup) below.
+These cases will be handled in the [Cleanup](#cleanup) below.

 ### Cleanup
 When we look at gang scheduling there is a further issue around unused placeholders, placeholder asks and their cleanup.
@@ -219,7 +219,7 @@ Processing in the core thus needs to consider two cases that will impact the tra
 1. Placeholder asks pending (exit from _accepted_)
 2. Placeholders allocated (exit from _waiting_)

-Placeholder asks pending:
+Placeholder asks pending:
 Pending placeholder asks are handled via a timeout.
 An application must only spend a limited time waiting for all placeholders to be allocated.
 This timeout is needed because an application’s partial placeholders allocation may occupy cluster resources without really using them.
@@ -259,7 +259,7 @@ Combined flow for the shim and core during timeout of placeholder:
 * After the placeholder Allocations and Asks are released the core moves the application to the killed state removing it from the queue (4).
 * The state change is finalised in the core and shim. (5)

-Allocated placeholders:
+Allocated placeholders:
 Leftover placeholders need to be released by the core.
 The shim needs to be informed to remove them. This must be triggered on entry of the _completed_ state.
 After the placeholder release is requested by the core the state transition of the application can proceed.
@@ -429,14 +429,14 @@ In patched message form that would look like:
 message UpdateResponse {
 ...
 // Released allocation(s), allocations can be released by either the RM or scheduler.
-// The TerminationType defines which side needs to act and process the message.
+// The TerminationType defines which side needs to act and process the message.
docs/design/scheduler_configuration.md
+7 -6 (7 additions & 6 deletions)
@@ -54,7 +54,7 @@ Configuration to consider:
 ## Queue Configuration
 ### Queue Definition
 On startup the scheduler will load the configuration for the queues from the provided configuration file after initialising the service. If there is no queue configuration provided the scheduler should start up with a simple default configuration which performs a well documented default behaviour.
-Based on the kubernetes definition this configuration could be a configMap <sup id="s1">[1](#f1)</sup> but not a CRD.
+Based on the kubernetes definition this configuration could be a [configMap](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#should-i-use-a-configmap-or-a-custom-resource) but not a CRD.

 The queue configuration is dynamic. Changing the queue configuration must not require a scheduler restart.
 Changes should be allowed by either calling the GO based API, the REST based API or by updating the configuration file. Changes made through the API must be persisted in the configuration file. Making changes through an API is not a high priority requirement and could be postponed to a later release.
@@ -166,7 +166,7 @@ Defining placement rules in the configuration requires the following information
 * Create (boolean)
 * Filter:
 * A regular expression or list of users/groups to apply the rule to.
-
+
 The filter can be used to allow the rule to be used (default behaviour) or deny the rule to be used. User or groups matching the filter will be either allowed or denied.
 The filter is defined as follow:
 * Type:
@@ -213,7 +213,7 @@ Base point to make: a changed configuration should not impact the currently runn
 ### Access Control Lists
 The scheduler ACL is independent of the queue ACLs. A scheduler administrator is not by default allowed to submit an application or administer the queues in the system.

-All ACL types should use the same definition pattern. We should allow at least POSIX user and group names which uses the portable filename character set <sup id="s2">[2](#f2)</sup>. However we should take into account that we could have domain specifiers based on the environment that the system runs in (@ sign as per HADOOP-12751).
+All ACL types should use the same definition pattern. We should allow at least POSIX user and group names which uses the portable filename character set <a href="#footnote1"><sup>[1]</sup></a>. However we should take into account that we could have domain specifiers based on the environment that the system runs in (@ sign as per HADOOP-12751).

 By default access control is enabled and access is denied. The only special case is for the core scheduler which automatically adds the system user, the scheduler process owner, to the scheduler ACL. The scheduler process owner is allowed to make sure that the process owner can use the API to call any administrative actions.

@@ -241,6 +241,7 @@ The full configuration of the K8s shim is still under development.
 The full configuration of the YARN shim is still under development.