Commit 410a993

SNP CI on bigsku (#6989)
1 parent 595b2a5 commit 410a993

5 files changed: +77 -34 lines changed

.github/workflows/README.md

Lines changed: 10 additions & 7 deletions
@@ -5,15 +5,16 @@ Documents the various GitHub Actions workflows, the role they fulfil and 3rd par
 Builds and runs CCF performance tests, both end to end and micro-benchmarks. Results are posted to bencher.dev, and [plotted to make regressions obvious](https://bencher.dev/console/projects/ccf/plots).
 Triggered on every commit on `main`, but not on PR builds because the setup required to build from forks is complex and fragile in terms of security, and the increase in pool usage would be substantial.

+Tests are run and published on two different testbeds for comparison: gha-vmss-d16av5-ci (d16av5 VMs) and gha-c-aci-ci (C-ACI with 16 cores and 32Gb RAM), and are labeled accordingly in the bencher UI.
+
 File: `bencher.yml`
 3rd party dependencies:

 - `bencherdev/bencher@main`

 # Continuous Integration Containers GHCR

-Produces the build images used by nearly all other actions, particularly CI and release from 5.0.0-rc0 onwards. Complete images are attested and published to GHCR.
-Triggered on label creation (`build/*`).
+Produces the build images used by CI and release workflows between 5.0.0-rc0 and 6.0.0 (excluded). Complete images are attested and published to GHCR. Triggered on label creation (`build/*`).

 File: `ci-containers-ghcr.yml`
 3rd party dependencies:
@@ -22,18 +23,18 @@ File: `ci-containers-ghcr.yml`
 - `docker/metadata-action@v5`
 - `docker/build-push-action@v6`

-Note: This job will be removed with Ubuntu support, because installing dependencies on Azure Linux images is very fast, and producing CI-specific images is no longer necessary there.
+Note: This job is being kept until 5.0.x goes out of support.

 # Continuous Integration

-Main continuous integration job. Builds CCF for all target platforms, runs unit, end to end and partition tests Virtual. Run on every commit, including PRs from forks, gates merging. Also runs once a week, regardless of commits.
+Main continuous integration job. Builds CCF for all target platforms, runs unit, end to end and partition tests. Run on every commit, including PRs from forks, gates merging. Also runs once a week, regardless of commits.

 File: `ci.yml`
 3rd party dependencies: None

 # Long Tests

-Secondary continuous integration job. Runs more expensive, longer tests, such as tests against ASAN and TSAN builds, fuzzing etc.
+Secondary continuous integration job. Runs more expensive, longer tests, such as tests against ASAN and TSAN builds, extended fuzzing etc.

 - Runs daily on week days.
 - Can be manually run on a PR by setting `run-long-test` label, or via workflow dispatch.
@@ -70,14 +71,14 @@ File: `long-verification.yml`

 # Release

-Produces CCF release artefacts from 5.0.0-rc0 onwards, for all languages and platforms. Triggered on tags matching `ccf-[56].\*`. The output of the job is a draft release, which needs to be published manually. Publishing triggers the downstream jobs listed below.
+Produces CCF reference release artefacts from 5.0.0-rc0 onwards, for all languages and platforms. Triggered on tags matching `ccf-[56].\*`. The output of the job is a draft release, which needs to be published manually. Publishing triggers the downstream jobs listed below.

 File: `release.yml`
 3rd party dependencies: None

 # Containers GHCR

-Produces reference release images for 5.x release version. Not used from 6.0.0 onwards. Complete images are attested and published to GHCR. Triggered on release publishing.
+Produces reference release images for 5.x release versions. Not used from 6.0.0 onwards. Complete images are attested and published to GHCR. Triggered on release publishing.

 File: `containers-ghcr.yml`
 3rd party dependencies:
@@ -86,6 +87,8 @@ File: `containers-ghcr.yml`
 - `docker/metadata-action@v5`
 - `docker/build-push-action@v6`

+Note: This job is being kept until 5.0.x goes out of support.
+
 # NPM

 Publishes ccf-app TS package from a GitHub release to NPM. Triggered on release publishing.

.github/workflows/ci.yml

Lines changed: 60 additions & 24 deletions
@@ -19,7 +19,7 @@ permissions:
 jobs:
   checks:
     name: "Format and License Checks"
-    runs-on: [self-hosted, 1ES.Pool=gha-virtual-ccf-sub]
+    runs-on: [self-hosted, 1ES.Pool=gha-vmss-d16av5-ci]
     container:
       image: mcr.microsoft.com/azurelinux/base/core:3.0
       options: --user root --publish-all --cap-add NET_ADMIN --cap-add NET_RAW --cap-add SYS_PTRACE
@@ -47,7 +47,7 @@ jobs:

   build_with_tidy:
     name: "Build with clang-tidy"
-    runs-on: [self-hosted, 1ES.Pool=gha-virtual-ccf-sub]
+    runs-on: [self-hosted, 1ES.Pool=gha-vmss-d16av5-ci]
     container:
       image: mcr.microsoft.com/azurelinux/base/core:3.0
       options: --user root --publish-all --cap-add NET_ADMIN --cap-add NET_RAW --cap-add SYS_PTRACE
@@ -81,16 +81,10 @@ jobs:
           ninja
         shell: bash

-  build_and_test:
-    name: "CI"
+  build_and_test_virtual:
+    name: "Virtual CI"
     needs: checks
-    strategy:
-      matrix:
-        platform:
-          - name: virtual
-          - name: snp
-
-    runs-on: [self-hosted, 1ES.Pool=gha-virtual-ccf-sub]
+    runs-on: [self-hosted, 1ES.Pool=gha-vmss-d16av5-ci]
     container:
       image: mcr.microsoft.com/azurelinux/base/core:3.0
       options: --user root --publish-all --cap-add NET_ADMIN --cap-add NET_RAW --cap-add SYS_PTRACE
@@ -107,6 +101,11 @@ jobs:
         with:
           fetch-depth: 0

+      - name: "cpuinfo"
+        run: |
+          cat /proc/cpuinfo
+        shell: bash
+
       - name: "Install dependencies"
         shell: bash
         run: |
@@ -119,12 +118,11 @@ jobs:
           git config --global --add safe.directory /__w/CCF/CCF
           mkdir build
           cd build
-          cmake -GNinja -DCOMPILE_TARGET=${{ matrix.platform.name }} -DCMAKE_BUILD_TYPE=Debug ..
+          cmake -GNinja -DCOMPILE_TARGET=virtual -DCMAKE_BUILD_TYPE=Debug ..
           ninja
         shell: bash

-      - name: "Test ${{ matrix.platform.name }}"
-        if: "${{ matrix.platform.name != 'snp' }}" # Needs 1ES Pool support
+      - name: "Test virtual"
         run: |
           set -ex
           cd build
@@ -133,16 +131,16 @@ jobs:
           export ASAN_SYMBOLIZER_PATH=$(realpath /usr/bin/llvm-symbolizer-15)
           # Unit tests
           ./tests.sh --output-on-failure -L unit -j$(nproc --all)
-          # All other acceptably fast tests, which are now supported on Azure Linux.
+          # End to end tests
           ./tests.sh --timeout 360 --output-on-failure -LE "benchmark|suite|unit"
           # Partitions tests
           ./tests.sh --timeout 360 --output-on-failure -L partitions -C partitions
         shell: bash

-      - name: "Upload logs for ${{ matrix.platform.name }}"
+      - name: "Upload logs for virtual"
         uses: actions/upload-artifact@v4
         with:
-          name: logs-azurelinux-${{ matrix.platform.name }}
+          name: logs-azurelinux-virtual
           path: |
             build/workspace/*/*.config.json
             build/workspace/*/out
@@ -152,15 +150,29 @@ jobs:
         if: success() || failure()

   build_and_test_caci:
-    name: "Confidential Container (ACI) CI"
-    runs-on: [self-hosted, 1ES.Pool=gha-caci-ne]
+    name: "Confidential Container CI"
+    runs-on: [self-hosted, 1ES.Pool=gha-c-aci-ci]
     needs: checks

     steps:
       - uses: actions/checkout@v4
         with:
           fetch-depth: 0

+      - name: "Dump environment"
+        run: |
+          set -ex
+          # Dump environment variables, extract Fabric_NodeIPOrFQDN
+          # and save it to a file for reconfiguration test using THIM.
+          cat /proc/*/environ | tr '\000' '\n' | sort -u | grep Fabric_NodeIPOrFQDN > /Fabric_NodeIPOrFQDN
+          echo "::group::Disk usage"
+          df -kh
+          echo "::endgroup::"
+          echo "::group::CPU Info"
+          cat /proc/cpuinfo
+          echo "::endgroup::"
+        shell: bash
+
       - name: "Build Debug"
         run: |
           set -ex
@@ -178,18 +190,42 @@ jobs:
           rm -rf /github/home/.cache
           mkdir -p /github/home/.cache
           export ASAN_SYMBOLIZER_PATH=$(realpath /usr/bin/llvm-symbolizer-15)
-          # Unit tests, minus indexing that is sometimes timing out with this few cores
-          ./tests.sh --output-on-failure -L unit -j$(nproc --ignore=1) -E indexing
-          # Minimal end to end test that exercises SNP attestation verification
-          # but works within the current 4 core budget.
-          ./tests.sh --timeout 360 --output-on-failure -R code_update
+          # Unit tests
+          ./tests.sh --output-on-failure -L unit -j$(nproc --all)
+
+          # End to end tests
+          ./tests.sh --timeout 360 --output-on-failure -LE "benchmark|suite|unit|reconfiguration"
+
+          # DISABLED until Pool issue causing the CI to stop after 30 minutes is resolved
+          # Reconfiguration tests
+          # ./tests.sh --timeout 360 --output-on-failure -L reconfiguration -C reconfiguration
         shell: bash

+      - name: "Partition Tests"
+        run: |
+          set -ex
+          cd build
+          # DISABLED until Pool issue causing the CI to stop after 30 minutes is resolved
+          # Partitions tests
+          # ./tests.sh --timeout 360 --output-on-failure -L partitions -C partitions
+        shell: bash
+
+      - name: "Capture dmesg"
+        run: |
+          set -ex
+          echo "::group::Disk usage"
+          df -kh
+          echo "::endgroup::"
+          dmesg > dmesg.log
+        shell: bash
+        if: success() || failure()
+
       - name: "Upload logs"
         uses: actions/upload-artifact@v4
         with:
           name: logs-caci-snp
           path: |
+            dmesg.log
             build/workspace/*/*.config.json
             build/workspace/*/out
             build/workspace/*/err
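
The "Dump environment" step added to `build_and_test_caci` reads the NUL-separated `/proc/*/environ` files, de-duplicates the entries and keeps the `Fabric_NodeIPOrFQDN` one. A minimal Python sketch of that parsing logic follows, for illustration only: `extract_env_entry` is a hypothetical helper that is not part of this commit, the address in the sample data is made up, and the sketch matches on a `KEY=` prefix, which is stricter than the substring match performed by `grep` in the workflow.

```python
from typing import Iterable, Optional


def extract_env_entry(environ_blobs: Iterable[bytes], key: str) -> Optional[str]:
    """Return the 'KEY=value' entry for `key` found in NUL-separated environ blobs.

    Each blob has the layout of a /proc/<pid>/environ file: entries of the
    form NAME=value separated by NUL bytes.
    """
    prefix = f"{key}="
    entries = set()
    for blob in environ_blobs:
        # Equivalent of `tr '\000' '\n'`: split on NUL bytes, skip empty pieces.
        for raw in blob.split(b"\x00"):
            if raw:
                entries.add(raw.decode("utf-8", errors="replace"))
    # Equivalent of `sort -u | grep`: unique, sorted, first matching entry.
    for entry in sorted(entries):
        if entry.startswith(prefix):
            return entry
    return None


# Made-up sample data standing in for the contents of two environ files.
sample = [
    b"PATH=/usr/bin\x00Fabric_NodeIPOrFQDN=10.0.0.4\x00",
    b"HOME=/root\x00Fabric_NodeIPOrFQDN=10.0.0.4\x00",
]
print(extract_env_entry(sample, "Fabric_NodeIPOrFQDN"))
```

In the workflow itself the matching line is written to `/Fabric_NodeIPOrFQDN`; the sketch only returns it.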

CMakeLists.txt

Lines changed: 4 additions & 0 deletions
@@ -1067,6 +1067,9 @@ if(BUILD_TESTS)
       --test-duration 300 --test-suite reconfiguration --jinja-templates-path
       ${CMAKE_SOURCE_DIR}/samples/templates
   )
+  set_property(
+    TEST reconfiguration_test_suite PROPERTY LABELS reconfiguration
+  )

   if(LONG_TESTS)
     add_e2e_test(
@@ -1368,6 +1371,7 @@ if(BUILD_TESTS)
     PYTHON_SCRIPT ${CMAKE_SOURCE_DIR}/tests/reconfiguration.py
     ADDITIONAL_ARGS ${RECONFIG_TEST_ARGS}
   )
+  set_property(TEST reconfiguration_test_cft PROPERTY LABELS reconfiguration)

   add_e2e_test(
     NAME election_test PYTHON_SCRIPT ${CMAKE_SOURCE_DIR}/tests/election.py
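
The `reconfiguration` label added above is what lets the C-ACI job exclude these tests with `-LE "benchmark|suite|unit|reconfiguration"` and, once re-enabled, select them with `-L reconfiguration`. The sketch below is a simplified Python model of that include/exclude behaviour, not ctest's implementation: `select_tests` is a hypothetical helper, and the label sets in the sample inventory (in particular `e2e` for `election_test`) are assumptions made for illustration.

```python
import re
from typing import Dict, List, Set


def select_tests(
    tests: Dict[str, Set[str]], include: str = "", exclude: str = ""
) -> List[str]:
    """Model label filtering: keep tests with a label matching `include`
    (if given) and drop tests with a label matching `exclude` (if given)."""
    selected = []
    for name, labels in tests.items():
        if include and not any(re.search(include, label) for label in labels):
            continue
        if exclude and any(re.search(exclude, label) for label in labels):
            continue
        selected.append(name)
    return sorted(selected)


# Hypothetical inventory; only the "reconfiguration" labels come from the diff.
inventory = {
    "reconfiguration_test_suite": {"reconfiguration"},
    "reconfiguration_test_cft": {"reconfiguration"},
    "election_test": {"e2e"},
}

# Analogue of: ./tests.sh -LE "benchmark|suite|unit|reconfiguration"
print(select_tests(inventory, exclude="benchmark|suite|unit|reconfiguration"))
# Analogue of: ./tests.sh -L reconfiguration
print(select_tests(inventory, include="reconfiguration"))
```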

tests/infra/remote.py

Lines changed: 0 additions & 3 deletions
@@ -260,9 +260,6 @@ def setup(self, use_links=True):
         Empty the temporary directory if it exists,
         and populate it with the initial set of files.
         """
-        # SNP Testing currently runs on a fileshare which does not support symlinks
-        if snp.IS_SNP:
-            use_links = False
         self._setup_files(use_links)

     def get_cmd(self, include_dir=True):

tests/infra/snp.py

Lines changed: 3 additions & 0 deletions
@@ -56,6 +56,9 @@ def get_aci_env():
     else:
         (security_context_dir,) = glob.glob("/security-context-*")
         env[ACI_SEV_SNP_ENVVAR_UVM_SECURITY_CONTEXT_DIR] = security_context_dir
+    # If Fabric_NodeIPOrFQDN is set, pick it up
+    if "Fabric_NodeIPOrFQDN" in os.environ:
+        env["Fabric_NodeIPOrFQDN"] = os.environ["Fabric_NodeIPOrFQDN"]
     return env

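
The three lines added to `get_aci_env()` copy `Fabric_NodeIPOrFQDN` into the returned environment only when it is present in the caller's environment. A minimal standalone sketch of that pattern is shown here; `forward_env_var` and its `source` parameter are hypothetical names introduced so the behaviour can be exercised without touching the real process environment, and the address used is made up.

```python
import os
from typing import Dict, Mapping, Optional


def forward_env_var(
    env: Dict[str, str], name: str, source: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Copy `name` from `source` (default: os.environ) into `env` if it is set."""
    source = os.environ if source is None else source
    if name in source:
        env[name] = source[name]
    return env


# With the variable set in the source mapping, it is forwarded.
print(forward_env_var({}, "Fabric_NodeIPOrFQDN", {"Fabric_NodeIPOrFQDN": "10.0.0.4"}))
# With the variable absent, the target environment is left unchanged.
print(forward_env_var({"A": "1"}, "Fabric_NodeIPOrFQDN", {}))
```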
