forked from argoproj/argo-workflows
k8s-resource-log-selector.yaml
# This example demonstrates how to include and obtain logs from pods created by a
# custom resource submitted via a resource template. Note that this feature is only
# available in v3.3 and above.
#
# This is particularly useful because Argo Workflows does not know how
# other CRDs (Kubeflow training CRDs, Spark application CRD, etc.) work
# and thus cannot pull the logs from the pods created by those CRDs on its own.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: k8s-jobs-log-selector-
spec:
  entrypoint: tf-jobtmpl
  templates:
    - name: tf-jobtmpl
      resource:
        action: create
        successCondition: status.succeeded = 2
        failureCondition: status.failed > 0
        # You can also create any other custom K8s resource here. This example
        # uses a Kubeflow TFJob as an illustration.
        manifest: |
          apiVersion: kubeflow.org/v1
          kind: TFJob
          metadata:
            name: tfjob-examples
          spec:
            tfReplicaSpecs:
              Worker:
                replicas: 2
                restartPolicy: Never
                template:
                  metadata:
                    # We add this label to the pods created by the TFJob custom resource to inform Argo Workflows
                    # that we want to include the logs from the created pods. Once the pods are created with this
                    # label, you can use `argo logs -c tensorflow` to view the logs from this particular container.
                    # Note that `workflow.name` is a supported global variable provided by Argo Workflows.
                    #
                    # The Kubeflow training controller will take this custom resource and automatically create worker
                    # pods with labels such as `job-role` and `replica-index`. If you'd like to query logs for pods with
                    # specific labels, you can specify the label selector explicitly via `argo logs -l <logs-label-selector>`.
                    # For example, you can use `argo logs -c tensorflow -l replica-index=0` to see the first worker pod's logs.
                    labels:
                      workflows.argoproj.io/workflow: {{workflow.name}}
                  spec:
                    containers:
                      - name: tensorflow
                        image: "Placeholder for TensorFlow distributed training image"
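
# Usage sketch for the log commands mentioned in the comments above. This is an
# illustration under assumptions, not part of the upstream example: it presumes the
# `argo` CLI is installed, the TFJob CRD is present in the cluster, the placeholder
# image has been replaced with a real one, and this file is saved as
# k8s-resource-log-selector.yaml. `@latest` refers to the most recently submitted
# workflow; a concrete workflow name can be used in its place.
#
#   argo submit --watch k8s-resource-log-selector.yaml
#   argo logs @latest -c tensorflow
#   argo logs @latest -c tensorflow -l replica-index=0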