This repository was archived by the owner on Sep 24, 2025. It is now read-only.

Commit af2402f

Add node selectors and tolerations to data processing
Relates: https://issues.redhat.com/browse/RHOAIENG-26219
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
1 parent f2dcbb2 commit af2402f

2 files changed

Lines changed: 21 additions & 0 deletions

pipeline.py

Lines changed: 5 additions & 0 deletions
@@ -277,6 +277,11 @@ def ilab_pipeline(
     data_processing_task.after(model_to_pvc_task, sdg_task)
     data_processing_task.set_caching_options(False)
     data_processing_task.set_env_variable("XDG_CACHE_HOME", "/tmp")
+    data_processing_task.set_accelerator_type(eval_gpu_identifier)
+    data_processing_task.set_accelerator_limit(1)
+    add_toleration_json(data_processing_task, train_tolerations)
+    add_node_selector_json(data_processing_task, train_node_selectors)
+
 
     # Upload "skills_processed_data" and "knowledge_processed_data" artifacts to S3 without blocking the rest of the workflow
     skills_processed_data_to_artifact_task = skills_processed_data_to_artifact_op()
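
The two helper calls added in this hunk pass JSON-valued pipeline parameters (`train_tolerations`, `train_node_selectors`) through to the data processing task. As a rough illustration of what such values might look like, here is a minimal sketch that parses and sanity-checks them. The sample label, taint key, and both function names are hypothetical and do not come from this commit. The sketch assumes the selector is a JSON object of label/value strings and the tolerations are a JSON list (or single object) using the Kubernetes core/v1 Toleration field names.

```python
import json
from typing import Any

# Hypothetical example values; real values are supplied as pipeline parameters.
TRAIN_NODE_SELECTORS = '{"nvidia.com/gpu.present": "true"}'
TRAIN_TOLERATIONS = (
    '[{"key": "nvidia.com/gpu", "operator": "Exists", "effect": "NoSchedule"}]'
)

# Field names of the Kubernetes core/v1 Toleration type.
TOLERATION_FIELDS = {"key", "operator", "value", "effect", "tolerationSeconds"}


def parse_node_selectors(raw: str) -> dict[str, str]:
    """Parse a JSON object mapping node label -> required value.

    Treating an empty string as "no selector" is a choice of this sketch.
    """
    if not raw.strip():
        return {}
    parsed = json.loads(raw)
    if not isinstance(parsed, dict) or not all(
        isinstance(value, str) for value in parsed.values()
    ):
        raise ValueError("node selectors must be a JSON object with string values")
    return parsed


def parse_tolerations(raw: str) -> list[dict[str, Any]]:
    """Parse a JSON list of tolerations (a single object is wrapped in a list)."""
    if not raw.strip():
        return []
    parsed = json.loads(raw)
    items = parsed if isinstance(parsed, list) else [parsed]
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("each toleration must be a JSON object")
        unknown = set(item) - TOLERATION_FIELDS
        if unknown:
            raise ValueError(f"unknown toleration fields: {sorted(unknown)}")
    return items
```

Checking the payloads before submitting a run is optional; it only makes malformed JSON fail early instead of at scheduling time.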

pipeline.yaml

Lines changed: 16 additions & 0 deletions
@@ -934,6 +934,11 @@ deploymentSpec:
         - name: XDG_CACHE_HOME
           value: /tmp
         image: registry.redhat.io/rhelai1/instructlab-nvidia-rhel9@sha256:3e6eb035c69b204746a44b3a58b2751c20050cfb6af2ba7989ba327809f87c0b
+        resources:
+          accelerator:
+            count: '1'
+            resourceCount: '1'
+            resourceType: '{{$.inputs.parameters[''pipelinechannel--eval_gpu_identifier'']}}'
     exec-deletepvc:
       container:
         image: argostub/deletepvc
@@ -2620,8 +2625,13 @@ root:
         - sdg-op
         inputs:
           parameters:
+            accelerator_type:
+              runtimeValue:
+                constant: '{{$.inputs.parameters[''pipelinechannel--eval_gpu_identifier'']}}'
             max_batch_len:
               componentInputParameter: sdg_max_batch_len
+            pipelinechannel--eval_gpu_identifier:
+              componentInputParameter: eval_gpu_identifier
         taskInfo:
           name: data-processing-op
       deletepvc:
@@ -3395,6 +3405,9 @@ platforms:
     deploymentSpec:
       executors:
         exec-data-processing-op:
+          nodeSelector:
+            nodeSelectorJson:
+              componentInputParameter: train_node_selectors
           pvcMount:
           - mountPath: /model
             pvcNameParameter:
@@ -3412,6 +3425,9 @@ platforms:
               taskOutputParameter:
                 outputParameterKey: name
                 producerTask: createpvc
+          tolerations:
+          - tolerationJson:
+              componentInputParameter: train_tolerations
         exec-generate-metrics-report-op:
           pvcMount:
           - mountPath: /output
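
Taken together, the compiled YAML gives the data processing executor three scheduling-related settings: an accelerator resource, a node selector, and tolerations. The sketch below shows how those settings would be expected to appear on the resulting pod spec, assuming the accelerator type (for example `nvidia.com/gpu`) is used as the resource name under `resources.limits`. The function name and sample values are hypothetical, and this is not the pipeline backend's actual code.

```python
import json
from typing import Any


def pod_scheduling_fragment(
    accelerator_type: str,
    accelerator_count: int,
    node_selector_json: str,
    tolerations_json: str,
) -> dict[str, Any]:
    """Build the scheduling-related part of a pod spec as a plain dict.

    Illustrative only: it mirrors the Kubernetes field names
    (resources.limits, nodeSelector, tolerations), not any real backend logic.
    """
    return {
        "containers": [
            {
                # Extended resources such as GPUs are requested via limits.
                "resources": {"limits": {accelerator_type: str(accelerator_count)}}
            }
        ],
        # Pod is only scheduled on nodes carrying all of these labels.
        "nodeSelector": json.loads(node_selector_json),
        # Pod may be scheduled on nodes with matching taints.
        "tolerations": json.loads(tolerations_json),
    }


fragment = pod_scheduling_fragment(
    accelerator_type="nvidia.com/gpu",
    accelerator_count=1,
    node_selector_json='{"node-role.example/gpu": "true"}',
    tolerations_json='[{"key": "nvidia.com/gpu", "operator": "Exists"}]',
)
print(json.dumps(fragment, indent=2))
```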
