You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository was archived by the owner on Nov 16, 2023. It is now read-only.
You can specify how to classify and summarize Pod failures by the [PodFailureSpec](../pkg/apis/frameworkcontroller/v1/config.go).
@@ -842,7 +844,7 @@ Besides these general [Framework ConsistencyGuarantees](#ConsistencyGuarantees),
842
844
To safely run large scale Framework, i.e. the total task number in a single Framework is greater than 300, you just need to enable the [LargeFrameworkCompression](../pkg/apis/frameworkcontroller/v1/config.go). However, you may also need to decompress the Framework by yourself.
843
845
844
846
## <a name="FrameworkPodHistory">Framework and Pod History</a>
845
-
By leveraging the [LogObjectSnapshot](../pkg/apis/frameworkcontroller/v1/config.go), external systems, such as [Fluentd](https://www.fluentd.org) and [ElasticSearch](https://www.elastic.co/products/elasticsearch), can collect and process Framework and Pod history snapshots even if it was retried or deleted, such as for persistence, metrics conversion, visualization, alerting, acting, analysis, etc.
847
+
By leveraging the [LogObjectSnapshot](../pkg/apis/frameworkcontroller/v1/config.go), external systems, such as [Fluentd](https://www.fluentd.org) and [ElasticSearch](https://www.elastic.co/products/elasticsearch), can collect and process Framework, Task and Pod history snapshots even if it was retried or deleted, such as for persistence, metrics conversion, visualization, alerting, acting, analysis, etc.
846
848
847
849
## <a name="FrameworkTaskStateMachine">Framework and Task State Machine</a>
848
850
### <a name="FrameworkStateMachine">Framework State Machine</a>
@@ -894,7 +896,7 @@ The default behavior is to achieve all the [ConsistencyGuarantees](#ConsistencyG
894
896
895
897
For example, [drain the Node](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node) before delete it is acceptable.
896
898
897
-
*The Task instance can be universally located by its [TaskAttemptInstanceUID](../pkg/apis/frameworkcontroller/v1/types.go) or [PodUID](../pkg/apis/frameworkcontroller/v1/types.go).*
899
+
*The Task running instance can be universally located by its [TaskAttemptInstanceUID](../pkg/apis/frameworkcontroller/v1/types.go) or [PodUID](../pkg/apis/frameworkcontroller/v1/types.go).*
898
900
899
901
*To avoid the Pod is stuck in deleting forever, such as if its Node is down forever, leverage the same approach as [Delete StatefulSet Pod only after the Pod termination has been confirmed](https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/#delete-pods) manually or by your [Cloud Controller Manager](https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/#running-cloud-controller-manager).*
900
902
@@ -911,7 +913,7 @@ The default behavior is to achieve all the [ConsistencyGuarantees](#ConsistencyG
911
913
912
914
4. Do not change the [OwnerReferences](https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/#owners-and-dependents) of the managed ConfigMap and Pods.
913
915
914
-
*The Framework instance can be universally located by its [FrameworkAttemptInstanceUID](../pkg/apis/frameworkcontroller/v1/types.go) or [ConfigMapUID](../pkg/apis/frameworkcontroller/v1/types.go).*
916
+
*The Framework running instance can be universally located by its [FrameworkAttemptInstanceUID](../pkg/apis/frameworkcontroller/v1/types.go) or [ConfigMapUID](../pkg/apis/frameworkcontroller/v1/types.go).*
According to the [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem), in the presence of a network partition, you cannot achieve both consistency and availability at the same time in any distributed system. So you have to make a trade-off between the [Framework Consistency](#FrameworkConsistency) and the [Framework Availability](#FrameworkAvailability).
0 commit comments