Add Pipeline to deploy custom agent image for FIPS testing #8035

Open
wants to merge 25 commits into base: main

michel-laterman
Contributor

What does this PR do?

Add a new buildkite pipeline to build a custom agent image and use it in an ECH deployment for testing.

Why is it important?

FIPS integration tests will require a custom agent running in the CFT region.

Checklist

  • I have read and understood the pull request guidelines of this project.
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Disruptive User Impact

N/A

@michel-laterman added the enhancement, Team:Elastic-Agent-Control-Plane, and backport-8.19 labels Apr 29, 2025
@michel-laterman
Contributor Author

buildkite test this

@michel-laterman
Contributor Author

@v1v @pazone, can you take a look at this? We need permissions in order to push a custom image to use in the CFT region:

denied: requested access to the resource is denied
Error: Failed pushing docker image: running "docker image push docker.elastic.co/observability-ci/elastic-agent-fips:git-b84b80343415" failed with exit code 1

@v1v
Member

v1v commented May 8, 2025

| Error: Failed pushing docker image: running "docker image push docker.elastic.co/observability-ci/elastic-agent-fips:git-b84b80343415" failed with exit code 1

Can you share the URL link to the error?

I'm not familiar with the current user and namespace, but as far as I see, those details are stored at https://github.com/elastic/elastic-agent/blob/aa224536eadf49f8b9b962df240c0caa4861970e/.buildkite/hooks/pre-command#l17.

However, I think you need to configure the pre-command hook to run for the new BK pipelines:

    if [[ "$BUILDKITE_PIPELINE_SLUG" == "elastic-agent-package" ]]; then
      if [[ "$BUILDKITE_STEP_KEY" == "package_elastic-agent" ]]; then
        docker_login
      fi
      if [[ "$BUILDKITE_STEP_KEY" == "dra-publish" || "$BUILDKITE_STEP_KEY" == "bk-api-publish-independent-agent" ]]; then
        release_manager_login
      fi
    fi

These are the settings for the elastic-agent-package BK pipeline.
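
A minimal sketch of how the same pattern might be extended to a new pipeline. The slug `elastic-agent-fips-image` and the step key `build-fips-image` are hypothetical placeholders, not names taken from this PR, and `docker_login` is stubbed here only so the snippet runs on its own; the real hook defines it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stub for illustration only; the real pre-command hook defines docker_login.
docker_login() {
  echo "docker_login called"
}

# Log in to the registry only for the pipeline and step that push the image.
# Both values matched below are hypothetical placeholders.
maybe_docker_login() {
  local pipeline_slug="$1"
  local step_key="$2"
  if [[ "$pipeline_slug" == "elastic-agent-fips-image" && "$step_key" == "build-fips-image" ]]; then
    docker_login
  fi
}

# Buildkite exports these variables for every job; default to empty locally.
maybe_docker_login "${BUILDKITE_PIPELINE_SLUG:-}" "${BUILDKITE_STEP_KEY:-}"
```

Keeping the match in a small function makes it easy to check the condition locally without a Buildkite job.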

I see you have enabled the pre-command for the new step

@michel-laterman michel-laterman marked this pull request as ready for review May 9, 2025 18:48
Contributor

@ycombinator ycombinator left a comment

Left a few suggestions but overall this looks good.

@ycombinator
Contributor

CI steps, specifically fips:Stateful:Ubuntu (only the x86_64 ones) and fips:Kubernetes, are failing like so:

.buildkite/scripts/steps/integration_tests_tf.sh: line 10: asdf: command not found

Looks like the platform-ingest-fleet-server-ubuntu-2204-fips VM image is missing asdf. Here is the PR to add it: https://github.com/elastic/ci-agent-images/pull/1431.
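
Until the VM image ships the tool, a script could fail early with a clearer message than `command not found`. The following is a sketch only; `require_tool` is a hypothetical helper, not something that exists in the repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Return non-zero with a readable message when a required tool is not on PATH.
# require_tool is a hypothetical helper name used for illustration.
require_tool() {
  local tool="$1"
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "required tool '$tool' not found on PATH; check the VM image" >&2
    return 1
  fi
}
```

Calling `require_tool asdf` near the top of a step script would then stop the step before any later line depends on the missing tool.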

@pazone
Contributor

pazone commented May 12, 2025

/test


IMAGE_UBUNTU_2404_ARM_64: "platform-ingest-elastic-agent-ubuntu-2404-aarch64-1744855248"
IMAGE_UBUNTU_2404_X86_64: "platform-ingest-elastic-agent-ubuntu-2404-1744855248"
IMAGE_UBUNTU_X86_64_FIPS: "platform-ingest-fleet-server-ubuntu-2204-fips"
Contributor

Is there a known reason why we use a fleet-server image here?

Contributor Author

I think that the images were created as a result of some experimentation that was being done with the fleet-server repo

Contributor

@pazone pazone May 21, 2025

Should I create FIPS-compliant images for elastic-agent to avoid possible problems?

Contributor

mergify bot commented May 20, 2025

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b fips-ech upstream/fips-ech
git merge upstream/main
git push upstream fips-ech

image: "docker.elastic.co/ci-agent-images/platform-ingest/buildkite-agent-beats-ci-with-hooks:0.5"
useCustomGlobalHooks: true

- group: "fips:Stateful:Ubuntu"
Contributor

@pazone pazone May 20, 2025

Do I understand correctly that we run a set of integration test groups in the same way, and the only differences are the VM image and FIPS=true?

Contributor Author

The VM image, and I think that FIPS=true results in -integration.fips=true being sent.
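
The mapping described in this reply could look roughly like the sketch below. This is an assumption based only on the comment; the actual wiring lives in the repository's build tooling and may differ, and `fips_test_flags` is a hypothetical function name.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the extra test flag when the FIPS environment variable is "true".
# fips_test_flags is a hypothetical helper used for illustration.
fips_test_flags() {
  if [[ "${FIPS:-false}" == "true" ]]; then
    echo "-integration.fips=true"
  fi
}
```

With `FIPS` unset or set to anything other than `true`, the function prints nothing, so the same test groups would run without the flag.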

command: |
  #!/usr/bin/env bash
  set -euo pipefail
  mage cloud:image
Contributor

@pazone pazone May 22, 2025

cloud:image invokes the Package() function that packages the agent again. This step takes a considerable amount of time (~15 minutes). Can we download the artifacts produced by the packaging step and reuse them?

Contributor Author

We can try, but I'm not sure if the Image step will use artifacts that are already present. @pkoutsovasilis, do you know if it will?
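
For reference, the download half of that idea could be sketched as follows. It uses the `buildkite-agent artifact download` command; the glob `build/distributions/**`, the step key `packaging`, and the `BK_AGENT` override variable are hypothetical placeholders. Whether `mage cloud:image` would then reuse the downloaded files is exactly the open question above, and this sketch does not answer it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Download packages built by an earlier step instead of rebuilding them.
# The glob, the step key, and the BK_AGENT override are hypothetical.
fetch_packaged_artifacts() {
  local dest="$1"
  mkdir -p "$dest"
  "${BK_AGENT:-buildkite-agent}" artifact download "build/distributions/**" "$dest" --step "packaging"
}
```

The `BK_AGENT` override exists only so the function can be exercised outside a Buildkite job, for example by pointing it at `echo`.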

@elasticmachine
Contributor

elasticmachine commented May 22, 2025

💔 Build Failed

Failed CI Steps

History

cc @michel-laterman

Quality Gate passed

Issues
0 New issues
0 Fixed issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarQube

Labels
backport-8.19, enhancement, skip-changelog, Team:Elastic-Agent-Control-Plane

6 participants