The RISICO workflow is a hybrid Cloud/HPC workflow that will be used below as an example to describe the structure of a workflow.
This workflow has a preprocessing step run on a cloud instance that will produce results for 3 dates: a given day D, the previous day D - 1, and the day before D - 2.
One HEAppE job will have to be run for each of these days. Then a post-processing step will be run taking in input the results of the 3 HEAppE jobs.
The TOSCA application template for RISICO is available at https://github.com/lexis-project/application-templates/blob/master/weather-climate/applications/risico/risico_template.yaml
It is made of the following sections described in details below:
Snippets for each section:
# Section medata provides the name and version of the template
metadata:
template_name: org.lexis.wp7.RisicoTemplate
template_version: 0.1.0-SNAPSHOT
template_author: lexis
description: RISICO template
# Section imports declare which types to import from Alien4Cloud catalog
# so that they can be instantiated in our template
imports:
- yorc-types:1.1.0
- org.lexis.common.heappe-types:1.0.8
# Section topology_template describe the application template:
# input parameters, components and relationships, outputs, workflows
topology_template:
# Input parameters provided by the user and referenced in node templates below
inputs:
token:
type: string
required: true
description: "OpenID Connect token"
...
# Description of components and relationships between these components
node_templates:
# Here a HEAppE job referencing in its properties the token input parameter
WRF_DAY_1:
type: org.lexis.common.heappe.nodes.Job
metadata:
task: computation
properties:
token: { get_input: token }
JobSpecification:
Name: WRFJob
...
# Outputs: values of attributes exposed by components described above
outputs:
ddi_post_process_results:
description: DDI path to RISICO post-processing results
value: { get_attribute: [ CloudToDDIJob, destination_path ] }
ddi_wrf_results:
description: DDI path to RISICO WRF results
value: { get_attribute: [ CloudToDDIWRFJob, destination_path ] }
# Workflows: sequences of operations on components described above
workflows:
Run:
steps:
WRF_DAY_1_create:
target: WRF_DAY_1
operation_host: ORCHESTRATOR
activities:
- call_operation: Standard.create
on_success:
- WRF_DAY_1_enable_file_transfer
...This section allows to import archives from the Alien4Cloud catalog, to add data types and node types definitions that will be used to build the application template.
For example, this import:
imports:
- org.lexis.common.heappe-types:1.0.8will import from Alien4Cloud the definition of HEAppE data types and node types that can be found in this TOSCA file: https://github.com/lexis-project/yorc-heappe-plugin/blob/master/src/a4c/heappe-types-a4c.yaml
This file provides definitions of data types and node types needed for HEAppE.
For example a data type org.lexis.common.heappe.types.JobSpecification :
data_types:
org.lexis.common.heappe.types.JobSpecification:
derived_from: tosca.datatypes.Root
properties:
Name:
description: Job name
type: string
required: true
Project:
description: Accounting project
type: string
required: true
ClusterId :
description: Cluster ID
type: integer
required: true
Tasks :
description: Tasks (at leat one task needs to be defined)
type: list
entry_schema:
type: org.lexis.common.heappe.types.TaskSpecification
required: true
...and a node type using this data type as a property:
node_types:
org.lexis.common.heappe.nodes.pub.Job:
derived_from: tosca.nodes.Root
abstract: true
description: >
HEAppE Job
# Properties having static values
properties:
token:
description: OpenID Connect token
type: string
required: false
JobSpecification:
description: Specification of the job to create
type: org.lexis.common.heappe.types.JobSpecification
required: true
# Attributes, which will take value at runtime.
attributes:
job_id:
type: string
description: >
ID of the HEAppE job created
file_transfer:
type: org.lexis.common.heappe.types.FileTransfer
description: >
File transfer settings
changed_files:
type: list
description: List of files created or changed by the job execution
tasks_name_id:
type: map
description: Map of task name - task ID
# A capability (which feature this job provides), that other components
# requiring to be associated to a job, will use as a requirement
capabilities:
heappejob:
type: org.lexis.common.heappe.capabilities.HeappeJob
# Operations implemented by this component:
interfaces:
Standard:
create:
...
delete:
...
custom:
enable_file_transfer:
...
disable_file_transfer:
...
list_changed_files:
...
tosca.interfaces.node.lifecycle.Runnable:
submit:
...
run:
...
cancel:
...So here the node type declares :
- properties (token and JobSpecification) whose value are static and must be provided before the deployment
- attributes, whose values are set by the orchestrator at runtime, for example job_id will be set once the operation create has been called
- a capability, ie. a feature provided by this component, here we describe this component is a HEAppE Job, and each component requiring to be associated to a HEAppE job will have to declare in its section
requirementsthat it needs to be associated with a component having this capabilityorg.lexis.common.heappe.capabilities.HeappeJobdescribed later in the file - interfaces, ie. operations supported by this component:
- the section
Standarddescribe standard operations defined in the TOSCA specification, here the component supportscreateanddelete, other standard operations that can be implemented arestartandstop. - the section
customdescribe any specific operations the developper needs to defined, here we define operationsenable_file_transferanddisable_file_transferto enable/disable the ability to transfer files to the job, andlist_changed_filesto list the files that changed during a job execution. When thislist_changed_filesis called, the attributechanged_filesdescribed above will be set by the orchestrator. - the section
tosca.interfaces.node.lifecycle.Runnabledescribes an extension to TOSCA that Alien4Cloud/Yorc have defined to support jobs:submitto submit a jobrunto monitor a job execution until the job endscancelto cancel a job execution.
- the section
The description of these data types and node types will be imported in our template at https://github.com/lexis-project/application-templates/blob/master/weather-climate/applications/risico/risico_template.yaml thanks to these lines in the import section:
# Section imports declare which types to import from Alien4Cloud catalog
# so that they can be instantiated in our template
imports:
- org.lexis.common.heappe-types:1.0.8This section describes input parameters, components and relationships, outputs, and workflows of the application
In this subsection you declare input parameters that a user will provide, specifying which input parameters are required and which are not required with a default value that the user can override if needed.
Here is the description on an input parameter token marked as required,
so the user will have to provide a value to this input parameter before being able to deploy the application:
topology_template:
# Input parameters provided by the user and referenced in node templates below
inputs:
token:
type: string
required: true
description: "OpenID Connect token"These input parameters can be referenced in the next section node_templates in
propertied of node templates, using the TOSCA function get_input like below:
WRF_DAY_1:
type: org.lexis.common.heappe.nodes.Job
properties:
token: { get_input: token }This subsection describes components and relationships between this component.
The Alien4Cloud UI represents this subsection this way:
Here the description of a HEappE job:
node_templates:
# Here a HEAppE job referencing in its properties the token input parameter
WRF_DAY_1:
type: org.lexis.common.heappe.nodes.Job
metadata:
task: computation
properties:
token: { get_input: token }
JobSpecification:
Name: WRFJob
Tasks:
- Name: "WRF Generic"
TemplateParameterValues:
- CommandParameterIdentifier: MPICores
ParameterValue: "48"
...Then another component having a relationship with this job is described. This component will copy data to the previous Job task directory. This relationhip is expressed in the section requirements of the component description. Here we see the component requires to be:
- hosted on a Host (the component RisicoVM described before in the file),
- associated to job WRF_DAY_1
CopyDay1DataToJobTask:
type: org.lexis.common.datatransfer.nodes.CopySubDirToJobTask
properties:
task_name: "WRF Generic"
parent_directory: /wps_data/output
subdirectory_index: 0
requirements:
- hostedOnVirtualMachineHost:
type_requirement: host
node: RisicoVM
capability: tosca.capabilities.Container
relationship: tosca.relationships.HostedOn
- job:
type_requirement: job
node: WRF_DAY_1
capability: org.lexis.common.heappe.capabilities.HeappeJob
relationship: org.lexis.common.heappe.relationships.SendInputsToJobThese requirements are expressed in the type org.lexis.common.datatransfer.nodes.CopySubDirToJobTask
described at https://github.com/lexis-project/application-templates/blob/master/common/datatransfer/types.yaml :
org.lexis.common.datatransfer.nodes.CopyFromJobTask:
derived_from: tosca.nodes.SoftwareComponent
...
requirements:
- job:
capability: org.lexis.common.heappe.capabilities.HeappeJob
node: org.lexis.common.heappe.nodes.Job
relationship: org.lexis.common.heappe.relationships.GetResultsFromJob
occurrences: [1, 1]We see above that the type describes explicitly the requirement to be associated to a component
with the capability org.lexis.common.heappe.capabilities.HeappeJob,
and our component WRF_DAY_1 of type org.lexis.common.heappe.nodes.Job declares this capability,
as described at https://github.com/lexis-project/yorc-heappe-plugin/blob/master/tosca/heappe-types.yaml :
node_types:
org.lexis.common.heappe.nodes.pub.Job:
...
capabilities:
heappejob:
type: org.lexis.common.heappe.capabilities.HeappeJob
...An additional requirement needed by the component CopyDay1DataToJobTask is that it needs
to be associated to a host.
This requirement comes from the fact that our component is of type org.lexis.common.datatransfer.nodes.CopySubDirToJobTask whose declaration shows this type inherits from a type tosca.nodes.SoftwareComponent as can be seen at https://github.com/lexis-project/application-templates/blob/master/common/datatransfer/types.yaml :
org.lexis.common.datatransfer.nodes.CopyFromJobTask:
derived_from: tosca.nodes.SoftwareComponent
...This TOSCA type tosca.nodes.SoftwareComponent is defined by the TOSCA specification, and the orchestrator provides an implementation of this type at https://github.com/ystia/yorc/blob/develop/data/tosca/normative-types.yml :
tosca.nodes.SoftwareComponent:
derived_from: tosca.nodes.Root
...
requirements:
- host:
capability: tosca.capabilities.Container
node: tosca.nodes.Compute
relationship: tosca.relationships.HostedOnThe subection outputs references attributes of components described in the section node_templates.
The UI will display these outputs.
Here for this application, these are result datasets paths in DDI.
outputs:
computation_dataset_wrf_result_path:
description: DDI path to RISICO WRF results
value: { get_attribute: [ CloudToDDIWRFJob, destination_path ] }
postprocessing_ddi_project_path:
description: DDI path to RISICO post-processing results
value: { get_attribute: [ CloudToDDIJob, destination_path ] }The subsection workflows describes workflows: sequences of operations on the components described
in subsection node_templates.
The Alien4Cloud UI represents such a workflow this way:
The TOSCA specification describes standard lifecycle operations that a component can implement (it doesn' have to implement all of them if not needed):
- create,
- configure,
- start,
- stop,
- delete.
In addition, Ystia is intorducing lifecyle operations for jobs:
- submit,
- run (ie. monitor a job submitted or running, untils the job ends),
- cancel.
Then any custom operation can be defined by the component developper. For example, in the case of a HEAppE job, these additional operations are declared in https://github.com/lexis-project/yorc-heappe-plugin/blob/master/tosca/heappe-types.yaml :
interfaces:
...
custom:
enable_file_transfer:
...
disable_file_transfer:
...
list_changed_files:
...- enable_file_transfer: allows to perform file transfers to/from the job,
- disable_file_transfer: diables the ability to perform file transfers to/from the job,
- list_changed_files: list all the files that were changed/created during the job execution.
All these sequences of operations will be defined in a workflow.
The following lifecycle workflow can be defined:
- install,
- uninstall.
These lifecycle workflows will be called automatically by the Orchestrator when the application is deployed/undeployed.
Then, the developper can then define any other workflow for his needs.
By convention in LEXIS, the LEXIS Portal will call a workflow named Run to start the LEXIS workflow execution.
We see below an excerpt of the workflow dealing with the components we have described above WRF_DAY_1 and CopyDay1DataToJobTask.
The job must first be created, then as the job needs input files to be provided,
the workflow first needs to call the operation enable_file_transfer for this job,
and then, we will be able to copy files to the job task, and then we will submit the job.
Then we will cann the run operation.
For jobs, the orchestrator takes care of running the run operation periodically
until the job is done. In which case the operation run is considered as completed,
and the workflow can go on.
Here will we will just change the state of the job to describe it is now in the stated executed.
Which gives this workflow:
workflows:
...
Run:
steps:
...
# The step named WRF_DAY_1_create create the job:
# - target is the component descirbed in subestion node_templates above.
# - operation_host is the host where to execute thos operation. For HEappE jobs,
# the host where to execute the operation is the orchestrator host
# as the orchestrator takes care of executing a HEAppE REST API request
# - activities describe which component operation to call, here the standard operation create
# - on_sucess describes the steps that have to be performed when the operation is successull
# here the next step is WRF_DAY_1_enable_file_transfer
# (you can define as well "on_failure" to describe which steps to execute on failure,
# here as no on_failure is specified, in case of operation Standard.create failure,
# the workflow will just stop here and report the failure)
WRF_DAY_1_create:
target: WRF_DAY_1
operation_host: ORCHESTRATOR
activities:
- call_operation: Standard.create
on_success:
- WRF_DAY_1_enable_file_transfer
# Enable file transfers for this job
WRF_DAY_1_enable_file_transfer:
target: WRF_DAY_1
operation_host: ORCHESTRATOR
activities:
- call_operation: custom.enable_file_transfer
on_success:
- CopyDay1ToJobTask_start
# Copy input files to this job, ie. run the operation start of
# compnent CopyDay1DataToJobTask
# Here,
# - target is not the job, but the component CopyDay1DataToJobTask
# - operation_host is not specified, then it takes the default value TARGET applies,
# which means the operation will be executed on the host hosting CopyDay1DataToJobTask,
# which is according to the requirement "hostedOnVirtualMachineHost" of CopyDay1DataToJobTask
# described in the subsection node_templates, the compute instance component named RisicoVM.
CopyDay1ToJobTask_start:
target: CopyDay1DataToJobTask
activities:
- call_operation: Standard.start
on_success:
- CopyDay1ToJobTask_started
# Then, once the operation is done, we just update the state of the component
# to started, using the TOSCA operation set_state this time, and not call_operation
# like we used above to call a given component operation
CopyDay1ToJobTask_started:
target: CopyDay1DataToJobTask
activities:
- set_state: started
on_success:
- WRF_DAY_1_submit
# Now that input files were copied, we can submit the job
WRF_DAY_1_submit:
target: WRF_DAY_1
operation_host: ORCHESTRATOR
activities:
- call_operation: tosca.interfaces.node.lifecycle.Runnable.submit
on_success:
- WRF_DAY_1_submitted
# The job being submitted, we can change its state
WRF_DAY_1_submitted:
target: WRF_DAY_1
activities:
- set_state: submitted
on_success:
- WRF_DAY_1_run
# A job operation run is called periodically be the orchestrator to check
# the job status until the job ends on success or on failure.
WRF_DAY_1_run:
target: WRF_DAY_1
operation_host: ORCHESTRATOR
activities:
- call_operation: tosca.interfaces.node.lifecycle.Runnable.run
on_success:
- WRF_DAY_1_executed
# Finally we chnage the state of component WRF_DAY_1 to executed.
WRF_DAY_1_executed:
target: WRF_DAY_1
activities:
- set_state: executed
