There are two GitHub Actions pipelines, one for Continuous Integration (CI) and one for Continuous Deployment (CD) of the ML pipeline on Vertex AI.
CI: This verifies proposed changes to the pipeline by creating and compiling it through the TFX CLI. Specifically, `tfx pipeline create` and `tfx pipeline compile` will error out if the pipeline code has import or syntax problems.
You can also run the pipeline locally with `tfx run create`. In that case, structure the code so that the number of epochs and the dataset can be set differently for CI. You don't want to train for many epochs on the full dataset, since the purpose of CI is only to check that nothing is broken.
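One way to make the epoch count and dataset switchable is to read them from a small config object that depends on an environment variable. The sketch below is only an illustration: the variable name `CI_SMOKE_TEST`, the `TrainingConfig` class, the epoch counts, and the data paths are all hypothetical and not part of this project.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the values that differ between CI and full runs."""
    epochs: int
    data_pattern: str


def load_training_config(env: Optional[Mapping[str, str]] = None) -> TrainingConfig:
    """Return a small config for CI smoke tests, a full config otherwise."""
    env = os.environ if env is None else env
    if env.get("CI_SMOKE_TEST") == "1":
        # Assumed values: one epoch on a tiny sample is enough to catch breakage.
        return TrainingConfig(epochs=1, data_pattern="data/sample/*.tfrecord")
    # Assumed values for a full training run.
    return TrainingConfig(epochs=50, data_pattern="data/full/*.tfrecord")


if __name__ == "__main__":
    print(load_training_config())
```

With this kind of structure, the CI workflow would only need to set the environment variable before calling `tfx run create`, and the pipeline code itself stays unchanged.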
CD: This runs the pipeline on Vertex AI. For flexibility, it uses the `workflow_dispatch` feature of GitHub Actions, which lets you set some parameters manually in the GitHub Actions UI. This project has the four parameters below (`gcpProject`, `gcpRegion`, and `pipelineName` are passed to the `tfx run create` command):
- `gcpProject`: The GCP project ID under which the pipeline runs on Vertex AI. This project ID is also used when authenticating your credentials to log in and send requests to Vertex AI. The credentials (JSON) should be stored as a GitHub Actions secret whose key is the same as the value of `gcpProject`, except that each dash (`-`) is replaced by an underscore (`_`), because `-` is not an allowed character in a GitHub Actions secret key.
- `gcpRegion`: The GCP region in which the pipeline runs on Vertex AI. The default value is `us-central1`.
- `pipelineName`: To run the pipeline, the TFX CLI needs to know which pipeline to run. The name of each pipeline is defined in its `configs.py` file, and these names are registered automatically when `tfx pipeline create` runs. This project has only one pipeline, and its name is `segmentation-training-pipeline` by default.
- `enableDataflow`: This option delegates the `ExampleGen` and `Transform` jobs to the Dataflow service. This is useful for handling large amounts of data, since the VM spec of each step in Vertex AI Pipelines is limited and you can otherwise run into Out-Of-Memory (OOM) issues.
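The secret key naming rule for `gcpProject` can be sketched as a one-line transformation. The function name and the example project ID below are hypothetical and only illustrate the dash-to-underscore rule described above.

```python
def secret_key_for_project(gcp_project: str) -> str:
    """Derive the GitHub Actions secret key from a GCP project ID.

    Dashes are not allowed in secret keys, so each one becomes an underscore.
    """
    return gcp_project.replace("-", "_")


if __name__ == "__main__":
    # "my-gcp-project" is a made-up project ID used only for illustration.
    print(secret_key_for_project("my-gcp-project"))
```

For a project ID such as `my-gcp-project`, the credentials JSON would therefore be stored under the secret key `my_gcp_project`.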
The CD pipeline runs the `tfx pipeline create`, `tfx pipeline compile`, and `tfx run create` commands in sequence:
- `tfx pipeline create`: Creates and registers the pipeline on the local system. This step is required before the downstream commands can run, since they need to know which pipeline to operate on.
- `tfx pipeline compile`: Compiles the pipeline. It produces the pipeline spec as a JSON file and, when the `--build-image` option is specified, also builds the Docker image and pushes it to Google Container Registry. It also ensures that each step (component) of the pipeline runs on the newly built Docker image.
- `tfx run create`: Runs the pipeline on Vertex AI when the `--engine=vertex` option is specified. Three of the GitHub Actions parameters are also passed here: `gcpProject` for `--project`, `gcpRegion` for `--region`, and `pipelineName` for `--pipeline-name`.
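As a rough sketch, the three commands and the parameter mapping can be assembled as follows. This is not the project's actual workflow file: the helper name `build_cd_commands` and the runner file name `kubeflow_runner.py` are assumptions, and the `--build-image` option is left out, so add it where your TFX version accepts it. The script only prints the commands by default; running them requires TFX to be installed and GCP credentials to be configured.

```python
import shlex
import subprocess
from typing import List


def build_cd_commands(
    gcp_project: str,
    gcp_region: str,
    pipeline_name: str,
    pipeline_path: str = "kubeflow_runner.py",  # assumed runner file name
) -> List[List[str]]:
    """Build the create, compile, and run commands in the order CD executes them."""
    return [
        ["tfx", "pipeline", "create",
         "--pipeline-path", pipeline_path, "--engine", "vertex"],
        ["tfx", "pipeline", "compile",
         "--pipeline-path", pipeline_path, "--engine", "vertex"],
        ["tfx", "run", "create", "--engine", "vertex",
         "--pipeline-name", pipeline_name,
         "--project", gcp_project,
         "--region", gcp_region],
    ]


def run_cd(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence as soon as one step fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "my-gcp-project" is a made-up project ID used only for illustration.
    run_cd(build_cd_commands("my-gcp-project", "us-central1",
                             "segmentation-training-pipeline"))
```

Stopping at the first failing command matters here because each step depends on the one before it: `tfx run create` cannot find a pipeline that was never created.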