Merged
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/release.md
Original file line number Diff line number Diff line change
@@ -29,7 +29,7 @@ Always
* [ ] [pyproject.toml](https://github.com/exasol/ai-lab/blob/main/pyproject.toml)

Ship the Actual Release
* [ ] Ensure PR CI completed the release validation job successfully
* [ ] Ensure PR CI completed the `Check Version Number` step in [.github/workflows/ci.yaml](https://github.com/exasol/ai-lab/blob/main/.github/workflows/ci.yaml) successfully
* [ ] Merge the release PR
* [ ] Push the release version tag
* [ ] Verify the `Release` workflow published the AWS and Docker artifacts for that tag
59 changes: 53 additions & 6 deletions .github/workflows/ci.yaml
@@ -3,6 +3,10 @@ name: CI
on:
pull_request:

permissions:
id-token: write
contents: read

jobs:
run-unit-tests:
name: Unit Tests
@@ -13,16 +17,16 @@ jobs:
with:
fetch-depth: 0

- name: Run shellcheck
run: ./scripts/build/shellcheck.sh

- name: Setup Python & Poetry Environment
uses: exasol/python-toolbox/.github/actions/python-environment@v1
uses: exasol/python-toolbox/.github/actions/python-environment@v8
with:
python-version: "3.10"

- name: Check Version Number
run: poetry run -- python3 -u "./scripts/build/check_release.py"
env:
RELEASE_MODE: workflow_dispatch
RELEASE_TITLE: CI validation
run: poetry run -- ai-lab release check
Review comment (Collaborator) on lines +26 to +29:

Once we introduce the PTB here, we don't need this anymore.


- name: Run Unit Tests
run: >
@@ -32,6 +36,49 @@ jobs:
--override-ini=log_cli_level=INFO
test/unit

approval-for-aws-ci-tests:
name: Run AWS CI Tests?
runs-on: ubuntu-24.04
needs: run-unit-tests
steps:
- name: Detect Running AWS CI Tests
run: true
environment:
approve-aws-ci-execution

run-aws-ci-tests:
name: AWS CI Tests
runs-on: ubuntu-24.04
needs: approval-for-aws-ci-tests
env:
DSS_RUN_CI_TEST: "true"
AWS_DEFAULT_REGION: ${{ vars.AWS_CI_REGION }}
AWS_USER_NAME: ailab-ci-user

steps:
- uses: actions/checkout@v5
with:
fetch-depth: 0

- name: Setup Python & Poetry Environment
uses: exasol/python-toolbox/.github/actions/python-environment@v8
with:
python-version: "3.10"

- uses: aws-actions/configure-aws-credentials@v6
with:
role-to-assume: ${{ vars.AWS_CI_ROLE }}
role-session-name: github-actions-ai-lab-ci
aws-region: ${{ vars.AWS_CI_REGION }}

- name: Run AWS CI Tests
run: >
poetry run -- pytest
--capture=no
--override-ini=log_cli=true
--override-ini=log_cli_level=INFO
test/aws_ci/test_ci*.py

run-integration-tests:
name: Integration Tests
runs-on: ubuntu-24.04
@@ -110,7 +157,7 @@ jobs:
if: ${{ !cancelled() }}
name: Gate 2 - Allow Merge
runs-on: ubuntu-24.04
needs: [ run-unit-tests, run-integration-tests, run-stable-notebook-tests, run-stable-gpu-notebook-tests ]
needs: [ run-unit-tests, run-aws-ci-tests, run-integration-tests, run-stable-notebook-tests, run-stable-gpu-notebook-tests ]
steps:
- name: Branch Protection - failure if any ancestor failed
if: ${{ contains(needs.*.result, 'failure') }}
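
The gate condition shown in this hunk can be read as the following minimal Python sketch. The function name `gate_fails` is hypothetical and only mirrors the `contains(needs.*.result, 'failure')` expression; any further steps of the gate job lie outside this hunk and are not modeled.

```python
def gate_fails(ancestor_results: list[str]) -> bool:
    """Mirror `contains(needs.*.result, 'failure')` for a list of job results."""
    # GitHub Actions reports each needed job as one of:
    # "success", "failure", "cancelled", or "skipped".
    return "failure" in ancestor_results


# Hypothetical results for the five jobs listed in `needs`.
results = ["success", "success", "skipped", "failure", "success"]
print(gate_fails(results))
```

Read on its own, this expression reacts only to `failure`; a `cancelled` or `skipped` ancestor does not trip it.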
25 changes: 14 additions & 11 deletions .github/workflows/release.yml
@@ -17,46 +17,49 @@ permissions:

jobs:
release:
name: Release
environment: AWS_RELEASE
runs-on: ubuntu-24.04
env:
RELEASE_MODE: ${{ github.event_name }}
RELEASE_TAG: ${{ github.ref_name }}
RELEASE_TITLE: ${{ github.event.inputs.release_title }}
AWS_DEFAULT_REGION: ${{ vars.AWS_CI_REGION }}
RELEASE_NOTES_DIR: ${{ runner.temp }}/release-notes
RELEASE_NOTES_DIR: /tmp/release-notes

steps:
- uses: actions/checkout@v5
with:
fetch-depth: 0

- name: Setup Python & Poetry Environment
uses: exasol/python-toolbox/.github/actions/python-environment@v1
uses: exasol/python-toolbox/.github/actions/python-environment@v8
with:
python-version: "3.10"

- uses: aws-actions/configure-aws-credentials@v5
- uses: aws-actions/configure-aws-credentials@v6
with:
role-to-assume: ${{ vars.AWS_CI_ROLE }}
role-to-assume: ${{ vars.AWS_CD_ROLE }}
role-session-name: github-actions-ai-lab-release
# Release runs long enough that we request a longer OIDC session so cleanup can finish safely.
role-duration-seconds: 18000
aws-region: ${{ vars.AWS_CI_REGION }}

- name: Check Release
run: poetry run -- python3 -m scripts.build.release_workflow check
run: poetry run -- ai-lab release check

- name: Publish Release Build
run: poetry run -- python3 -m scripts.build.release_workflow build
run: poetry run -- ai-lab release build
env:
AWS_USER_NAME: release_user
AWS_USER_NAME: ailab-cd-user
RELEASE_DEFAULT_PASSWORD: ${{ secrets.RELEASE_DEFAULT_PASSWORD }}
DOCKER_REGISTRY_USER: ${{ secrets.DOCKER_REGISTRY_USER }}
DOCKER_REGISTRY_PASSWORD: ${{ secrets.DOCKER_REGISTRY_PASSWORD }}
DOCKER_REGISTRY_USER: ${{ secrets.CI4_DOCKERHUB_USERNAME }}
DOCKER_REGISTRY_PASSWORD: ${{ secrets.CI4_DOCKERHUB_TOKEN }}

- name: Generate Release Notes
run: poetry run -- python3 -m scripts.build.release_workflow notes
run: poetry run -- ai-lab release notes

- name: Publish GitHub Release
env:
GH_TOKEN: ${{ github.token }}
run: poetry run -- python3 -m scripts.build.release_workflow publish
run: poetry run -- ai-lab release publish
29 changes: 0 additions & 29 deletions aws-code-build/ci/buildspec.yaml

This file was deleted.

11 changes: 6 additions & 5 deletions doc/changes/changes_5.1.0.md
@@ -1,11 +1,10 @@
# AI-Lab 5.1.0 released 2026-05-28
# AI-Lab 5.1.0 released 2026-06-03

Code name: Ansible Runner Wrapper dependency and Sagemaker tests removal
Code name: SageMaker Notebook Removal, Ansible-Runner-Wrapper, CI/CD now on GitHub Actions

## Summary

This release replaces the former ansible wrapper in the AI Lab by external dependency to `ansible-runner wrapper`.
It also removes Sagemaker tests from CI testing pipeline and fixes AWS CodeBuild and Docker Image Build issues.
This release removes the SageMaker Notebooks. Furthermore, it replaces the former Ansible wrapper in the AI Lab with an external dependency on `ansible-runner wrapper` and moves the CI/CD workflows from AWS CodeBuild to GitHub Actions.

## Features

@@ -18,8 +17,10 @@ It also removes Sagemaker tests from CI testing pipeline and fixes AWS CodeBuild
## Refactorings

* #504: Replaced `exasol.ds.sandbox.lib.ansible` by `ansible-runner-wrapper`
* #515: Moved AI-Lab release CodeBuild from AWS to GitHub
* #515: Migrated AI-Lab release from AWS CodeBuild to GitHub Actions
* #521: Removed some Ansible unit tests
* #525: Moved the release workflow into the package CLI layer and removed the legacy release script tree
* #526: Migrated the AWS-backed CI from CodeBuild to GitHub Actions

## Bug Fixes

96 changes: 91 additions & 5 deletions doc/developer_guide/aws.md
@@ -6,10 +6,10 @@
* Code name
* Summary
* Remove sections without tickets or add `n/a`
2. Open a pull request and let the PR CI validate `start-release-build`
2. Open a pull request and let the PR CI validate the release version through the `Check Version Number` step in [.github/workflows/ci.yaml](https://github.com/exasol/ai-lab/blob/main/.github/workflows/ci.yaml)
3. Merge the pull request
4. Push the release version tag
5. The `Release` GitHub Actions workflow authenticates to AWS via GitHub OIDC, builds the AMI and VM artifacts, and publishes the Docker release image for that tag
5. The tagged `Release` GitHub Actions workflow authenticates to AWS via GitHub OIDC, runs `ai-lab release check`, `build`, `notes`, and `publish`, builds the AMI and VM artifacts, and publishes the Docker release image for that tag

## AWS Infrastructure Workflow

@@ -38,9 +38,95 @@ The export creates an AMI based on the running EC2 instance and exports the AMI

## Release

The release now runs in GitHub Actions instead of AWS CodeBuild. PR CI performs a dry-run of
the release logic through unit tests, while the tagged `Release` workflow authenticates to AWS via OIDC, builds the
AMI and VM artifacts, and publishes the Docker image.
The release now runs in GitHub Actions. PR CI validates the release version, while the AWS-backed CI tests and the
tagged `Release` workflow both authenticate to AWS via OIDC, run the release workflow commands, build the AMI and VM
artifacts, and publish the Docker image.

Manual `workflow_dispatch` runs are treated as draft test releases: they still generate release notes and a draft GitHub
release, but they do not make the AMI public or publish the Docker image.
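
The draft behavior described above can be illustrated with a minimal sketch. It assumes the decision is driven by `RELEASE_MODE`, which the workflow sets to `github.event_name`, and that any value other than `workflow_dispatch` denotes a tagged release. `ReleasePlan` and `plan_release` are hypothetical names, not the actual `ai-lab release` implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleasePlan:
    """Which release side effects should happen (hypothetical structure)."""
    generate_release_notes: bool
    draft_github_release: bool
    make_ami_public: bool
    publish_docker_image: bool


def plan_release(release_mode: str) -> ReleasePlan:
    """Derive the plan from RELEASE_MODE (the GitHub event name)."""
    # Assumption: only manual runs are draft test releases.
    is_draft = release_mode == "workflow_dispatch"
    return ReleasePlan(
        generate_release_notes=True,        # always generated
        draft_github_release=is_draft,      # manual runs create a draft release
        make_ami_public=not is_draft,       # skipped for draft test releases
        publish_docker_image=not is_draft,  # skipped for draft test releases
    )


print(plan_release("workflow_dispatch"))
```

Keeping the decision in one place would let the `build` and `publish` commands agree on what a draft run may do.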

### IAM permissions for GitHub Actions

The AWS-backed CI and the tagged release workflow both authenticate to AWS via GitHub OIDC. The CI role should get the
shared permission block below, and the release role should get the same block plus one additional release-only
permission.

The release workflow also requests a 5-hour OIDC session from `aws-actions/configure-aws-credentials` so the long
running test release can complete stack cleanup before the temporary credentials expire. The CI workflow uses a
shorter session because it finishes much faster.
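
As a quick sanity check of the session length: the workflow requests `role-duration-seconds: 18000`, which is 5 hours. IAM only grants a session up to the role's configured maximum session duration, which defaults to 1 hour and can be raised to at most 12 hours, so the release role presumably needs that maximum set to at least 5 hours. The sketch below is illustrative only; `session_fits_role` is a hypothetical helper, and the role settings used in the example are assumptions, not values from this repository.

```python
# Value taken from `role-duration-seconds` in .github/workflows/release.yml.
ROLE_DURATION_SECONDS = 18000

# IAM allows a role's maximum session duration to be set between 1 and 12 hours.
IAM_MIN_ROLE_MAX_SECONDS = 3600
IAM_MAX_ROLE_MAX_SECONDS = 43200


def session_fits_role(requested_seconds: int, role_max_seconds: int) -> bool:
    """Return True if the requested session fits within the role's maximum."""
    if not IAM_MIN_ROLE_MAX_SECONDS <= role_max_seconds <= IAM_MAX_ROLE_MAX_SECONDS:
        raise ValueError("role maximum must be between 3600 and 43200 seconds")
    return requested_seconds <= role_max_seconds


# Hours requested by the release workflow.
print(ROLE_DURATION_SECONDS / 3600)
# A role left at the 1-hour default could not grant the request.
print(session_fits_role(ROLE_DURATION_SECONDS, role_max_seconds=3600))
# A role with an assumed 6-hour maximum could.
print(session_fits_role(ROLE_DURATION_SECONDS, role_max_seconds=6 * 3600))
```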

Shared permissions used by AWS CI and release:

```json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "SharedCiAndReleasePermissions",
"Effect": "Allow",
"Action": [
"logs:*",
"cloudformation:CreateChangeSet",
"cloudformation:DescribeChangeSet",
"cloudformation:ExecuteChangeSet",
"cloudformation:ValidateTemplate",
"cloudformation:ListStackResources",
"cloudformation:ListStacks",
"cloudformation:DescribeStacks",
"cloudformation:DeleteStack",
"ec2:RunInstances",
"ec2:CreateKeyPair",
"ec2:DeleteKeyPair",
"ec2:CreateSecurityGroup",
"ec2:AuthorizeSecurityGroupIngress",
"ec2:DeleteSecurityGroup",
"ec2:TerminateInstances",
"ec2:CreateTags",
"ec2:DescribeInstances",
"ec2:DescribeSecurityGroups",
"ec2:DescribeImages",
"ec2:DescribeInstanceStatus",
"ec2:DescribeSnapshots",
"ec2:DescribeExportImageTasks",
"ec2:DescribeKeyPairs",
"ec2:CreateImage",
"ec2:ExportImage",
"ec2:DeregisterImage",
"ec2:DeleteSnapshot",
"s3:ListBucket",
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject"
],
"Resource": "*"
}
]
}
```

These EC2 permissions are required because CloudFormation executes the stack directly and creates the EC2 instance and
security group on behalf of the GitHub Actions role.

Release-only permission:

```json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ReleaseOnlyPermissions",
"Effect": "Allow",
"Action": [
"ec2:ModifyImageAttribute"
],
"Resource": "*"
}
]
}
```
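
The relationship between the two roles (the CI role gets the shared block, the release role gets the shared block plus the release-only permission) can be sketched as follows. `build_policy` and `combine` are hypothetical helpers, and the shared action list is abbreviated on purpose; the full list is the JSON block above.

```python
import json


def build_policy(sid: str, actions: list[str]) -> dict:
    """Build a single-statement IAM policy document in the shape used above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": sid, "Effect": "Allow", "Action": list(actions), "Resource": "*"}
        ],
    }


def combine(*policies: dict) -> dict:
    """Merge several policy documents by concatenating their statements."""
    statements = [stmt for policy in policies for stmt in policy["Statement"]]
    return {"Version": "2012-10-17", "Statement": statements}


# Abbreviated: only two of the shared actions are listed here.
shared = build_policy(
    "SharedCiAndReleasePermissions", ["ec2:CreateImage", "ec2:ExportImage"]
)
release_only = build_policy("ReleaseOnlyPermissions", ["ec2:ModifyImageAttribute"])

ci_role_policy = shared
release_role_policy = combine(shared, release_only)
print(json.dumps(release_role_policy, indent=2))
```

Attaching the two documents to the release role as separate policies would have the same effect; merging them is just one way to express "shared plus release-only".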

`AWS_USER_NAME` is only a workflow input used to label AWS resources created by the build. It is not an IAM
authorization mechanism.

## AWS S3 Bucket

37 changes: 17 additions & 20 deletions doc/developer_guide/ci.md
@@ -1,39 +1,36 @@
## Running tests in the CI

The project has two types of CI tests:
* Unit tests and integration tests which run in a Github workflow
* Special integration tests verifying the content of the Jupyter notebook files
* A system test which runs in a AWS Codebuild
The GitHub workflow runs on each pull request and contains these test groups:
* Unit tests
* AWS-backed CI tests, which run after manual approval and provision AWS resources directly from GitHub Actions
* Integration tests, which include Docker-image build and validation checks in `test/integration`
* Notebook tests, which verify the notebook content and run in a separate workflow chain
* A system test suite that can be run locally against AWS resources

All these tests need to pass before the approval of a Github PR.
The Github workflow will run on each push to a branch in the Github repository.

However, the notebook tests and the AWS Codebuild will only run under specific conditions, e.g. manual approval or push a commit containing a special string in the commit message, see the following sections.
All required checks need to pass before a GitHub PR can be approved. The AWS-backed CI job stays blocked until the approval environment is granted.

### Executing Jupyter Notebook Tests

The regular CI build will ask for confirmation (aka. "review") before executing these tests, see [ETAJ developer guide](https://github.com/exasol/exasol-test-setup-abstraction-java/blob/main/doc/developer_guide/developer_guide.md#ci-build) for details.

### Executing AWS CodeBuild
### Executing AWS-backed CI

The AWS-backed CI tests are executed by the GitHub Actions workflow using AWS OIDC credentials and the
`test/aws_ci/test_ci*.py` suite.

Use the following git commands to execute the AWS CodeBuild script:
To run these tests locally please use

```shell
git commit -m "[CodeBuild]" --allow-empty && git push
export DSS_RUN_CI_TEST=true; poetry run -- pytest test/aws_ci/test_ci*.py
```
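
Both the workflow and the local command set `DSS_RUN_CI_TEST=true`, which suggests that the AWS-backed suite is opt-in. The following is a minimal sketch of such a switch, assuming the tests are skipped unless the variable is set to `true`; `aws_ci_tests_enabled` is a hypothetical helper, and the actual check in `test/aws_ci` may differ.

```python
import os


def aws_ci_tests_enabled(environ=None) -> bool:
    """Return True when DSS_RUN_CI_TEST is set to "true" (case-insensitive)."""
    env = os.environ if environ is None else environ
    return env.get("DSS_RUN_CI_TEST", "").strip().lower() == "true"


# Explicit mappings are passed here so the example does not depend on the shell.
print(aws_ci_tests_enabled({"DSS_RUN_CI_TEST": "true"}))
print(aws_ci_tests_enabled({}))
```

In a pytest suite, a helper like this could back a `pytest.mark.skipif` marker so that the AWS tests do not provision resources by accident.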

This will trigger a webhook that was installed by an AWS template into the git-Repository.
* The webhook is defined in file `exasol/ds/sandbox/templates/ci_code_build.jinja.yaml`
* and calls `aws-code-build/ci/buildspec.yaml`
* which then executes `test/codebuild/test_ci*.py`
### Executing Integration Tests

The CodeBuild will take about 20 minutes to complete.

## Running AWS CodeBuild locally
The integration job in the GitHub workflow runs `test/integration`, which includes tests that build and validate the
AI Lab Docker image, for example `test/integration/test_create_dss_docker_image.py`.

To run these tests locally please use

```shell
export DSS_RUN_CI_TEST=true; poetry run -- test/codebuild/test_ci.py
poetry run -- pytest test/integration
```
