VMs management #5222

Open: geyslan wants to merge 1 commit into aquasecurity:main from geyslan:vms-mngmnt

Conversation

geyslan (Member) commented Jan 29, 2026

1. Explain what the PR does

796a8c1 feat(vm-setup): add VM image manager with TUI, cloud-init generation, and multi-arch support

Introduce a VM management toolkit for building, customizing, and running
Tracee development VMs locally or for AWS.

  • TUI (gum-based) with environment-aware menus: Development VMs (local)
    and AWS VMs, each with Build/Run sub-options; AWS adds Convert to AMI
    and Push AMI to AWS
  • Cloud-init generation from distro templates (Ubuntu, CentOS, Alpine)
    with modular step files for init services, kernel install, MOTD,
    unattended upgrades, shared directories, and tool installation
  • Distro plugin system (distros/*.sh) with Ubuntu implementation:
    base image download, kernel list from APT repo, build flow with
    flavour/version selection, and base image inference for Run
  • Multi-arch QEMU support: x86_64 (KVM + SeaBIOS) and aarch64
    (UEFI/pflash with cortex-a57 TCG fallback on x86_64 hosts)
  • Virtiofs for native-arch KVM, 9p fallback for cross-arch TCG
  • QCOW2 overlay disks backed by base images with selectable size
  • Configurable SSH key injection (picks from ~/.ssh/*.pub)
  • Environment suffix in image names (-local/-aws) so both coexist
  • Dynamic MOTD with kernel version mismatch detection
  • Embedded script support (install-kernel.sh, setup-motd.sh) for
    reliability before scripts are merged upstream
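
The accelerator and share-transport rules in the list above could be sketched roughly as follows. This is an illustration, not the code from this PR: `pick_accel` and `pick_share` are hypothetical names, and treating a native-arch host without a writable `/dev/kvm` as a TCG/9p case is an assumption.

```shell
# Sketch only: function names and the /dev/kvm check are assumptions.

# Print "kvm" when the guest arch matches the host and the KVM device is
# writable, otherwise "tcg" (software emulation).
pick_accel() {
    local guest_arch="$1"
    local host_arch="${2:-$(uname -m)}"
    local kvm_dev="${3:-/dev/kvm}"
    if [ "${guest_arch}" = "${host_arch}" ] && [ -w "${kvm_dev}" ]; then
        echo "kvm"
    else
        echo "tcg"
    fi
}

# Print the shared-directory transport: virtiofs under KVM, 9p under TCG.
pick_share() {
    case "$1" in
        kvm) echo "virtiofs" ;;
        *)   echo "9p" ;;
    esac
}

# Cross-arch example: an aarch64 guest on an x86_64 host.
accel=$(pick_accel "aarch64" "x86_64")
echo "accel=${accel} share=$(pick_share "${accel}")"
```

Keeping the two decisions in separate functions lets the share transport follow whatever accelerator was actually selected, rather than re-deriving it from the architecture.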

--

2. Explain how to test it

Run ./scripts/vm-setup/vm-manager.sh and explore the menus:

image

3. Other comments

codecov (Bot) commented Jan 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 35.42%. Comparing base (235daa0) to head (796a8c1).
⚠️ Report is 251 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5222      +/-   ##
==========================================
+ Coverage   33.51%   35.42%   +1.91%     
==========================================
  Files         250      239      -11     
  Lines       28908    31487    +2579     
==========================================
+ Hits         9688    11155    +1467     
- Misses      18609    19617    +1008     
- Partials      611      715     +104     
Flag Coverage Δ
unit 35.42% <ø> (+1.91%) ⬆️
see 104 files with indirect coverage changes

@geyslan geyslan force-pushed the vms-mngmnt branch 3 times, most recently from 01c1039 to b2538bd Compare February 17, 2026 17:33
@geyslan geyslan changed the title [WIP] VMs management VMs management Feb 17, 2026
@geyslan geyslan marked this pull request as ready for review February 17, 2026 17:34
Copilot AI review requested due to automatic review settings February 17, 2026 17:34

Copilot AI left a comment

Pull request overview

This PR introduces a VM provisioning/management workflow under scripts/vm-setup/, including an interactive VM manager, cloud-init template generation, and helpers for running/debugging Tracee dev VMs across local and AWS-oriented flows.

Changes:

  • Add a vm-manager.sh TUI/CLI orchestrator with a pluggable “distro module” system (initial Ubuntu module).
  • Add cloud-init template/generator system (templates + reusable step snippets) and QEMU run/debug helpers.
  • Extend install-deps scripts to configure user Go env and install full Docker engine (not CLI-only).

Reviewed changes

Copilot reviewed 25 out of 27 changed files in this pull request and generated 25 comments.

Show a summary per file

| File | Description |
| --- | --- |
| scripts/vm-setup/vm-manager.sh | New VM manager TUI/CLI (download/build/run, AWS helpers). |
| scripts/vm-setup/vm-manager-lib.sh | Shared config + prompting + download progress helper for the manager. |
| scripts/vm-setup/distros/ubuntu.sh | Ubuntu module: download/verify images, kernel list cache, build flow integration. |
| scripts/vm-setup/distros/README.md | Documents the distro-module “contract” for vm-manager. |
| scripts/vm-setup/generate-cloud-init.sh | Generates user-data/meta-data from templates and injected steps/scripts. |
| scripts/vm-setup/templates/user-data-ubuntu-template.yaml | Ubuntu cloud-init template with step placeholders and MOTD/kernel hooks. |
| scripts/vm-setup/templates/user-data-centos-template.yaml | CentOS/RHEL-family cloud-init template with placeholders. |
| scripts/vm-setup/templates/user-data-alpine-template.yaml | Alpine cloud-init template with placeholders. |
| scripts/vm-setup/templates/meta-data-template.yaml | meta-data template for cloud-init instances. |
| scripts/vm-setup/templates/steps/download-scripts.yaml | Step to clone Tracee scripts/ via sparse-checkout. |
| scripts/vm-setup/templates/steps/install-tools.yaml | Step to run install-deps-${DISTRO}.sh and optional AMI tooling. |
| scripts/vm-setup/templates/steps/setup-virtfs.yaml | Step to mount shared dirs for local dev VMs. |
| scripts/vm-setup/templates/steps/init-services.yaml | Step for initial services/logging + home ownership fix (Ubuntu template). |
| scripts/vm-setup/templates/steps/ubuntu-disable-unattended-upgrades.yaml | Step to disable unattended upgrades using repo script. |
| scripts/vm-setup/templates/steps/write-vm-configs.yaml | Step to write common VM configs/scripts (e.g., mount helper). |
| scripts/vm-setup/templates/steps/finalize.yaml | Step to mark completion and reboot. |
| scripts/vm-setup/templates/steps/README.md | Documentation for the step/placeholder injection system. |
| scripts/vm-setup/start-vm-virtiofs.sh | QEMU launcher with virtiofs/9p logic and arch handling. |
| scripts/vm-setup/debug-vm.sh | Helper to inspect VM images/logs via guestfish/guestmount/nbd. |
| scripts/vm-setup/scripts/setup-motd.sh | Installs dynamic MOTD script comparing running vs expected kernel. |
| scripts/vm-setup/scripts/install-kernel.sh | Distro-agnostic kernel installer used by cloud-init. |
| scripts/vm-setup/config/config.env.example | Example config for manager paths. |
| scripts/vm-setup/generated/ubuntu-22.04-generic-5.19.0-50-x86_64-user-data.yaml | Generated sample cloud-init output (currently committed). |
| scripts/vm-setup/generated/ubuntu-22.04-generic-5.19.0-50-x86_64-meta-data.yaml | Generated sample meta-data output (currently committed). |
| scripts/installation/install-deps-ubuntu.sh | Adds user Go env configuration + installs full Docker engine + docker service enable. |
| scripts/installation/install-deps-centos.sh | Same as above for CentOS/RHEL flows. |
| scripts/installation/install-deps-alpine.sh | Same as above for Alpine; replaces docker “check” with docker installation. |

fi
local dst="${OUTPUT_IMAGES_DIR}/${img}.qcow2"
local abs_src
abs_src=$(realpath "${src}")

Copilot AI Feb 17, 2026

This code assumes realpath exists. On macOS and some minimal distros it may be missing, which will break overlay creation. Consider a fallback (e.g., readlink -f when available, or use qemu-img backing file with a relative path).

Suggested change
- abs_src=$(realpath "${src}")
+ if command -v realpath &>/dev/null; then
+     abs_src=$(realpath "${src}")
+ elif command -v readlink &>/dev/null; then
+     abs_src=$(readlink -f "${src}" 2>/dev/null || printf '%s\n' "${src}")
+ else
+     abs_src="${src}"
+ fi

Comment on lines +17 to +26
# Set password to 'ubuntu' (plaintext, cloud-init will hash it)
# For console/emergency access
plain_text_passwd: ubuntu
lock_passwd: false
ssh_authorized_keys:
- ${SSH_PUBKEY}

hostname: ${DISTRO}-${VERSION}-${KERNEL_FLAVOR}
disable_root: true
ssh_pwauth: true

Copilot AI Feb 17, 2026

This template enables password SSH login and sets a known plaintext password. Even for dev images, this is a high-risk default and makes accidental exposure (e.g., bridged networking, cloud reuse) much more dangerous. Consider disabling ssh_pwauth by default and requiring key-based SSH (or gating password auth behind an explicit opt-in variable).

Suggested change
- # Set password to 'ubuntu' (plaintext, cloud-init will hash it)
- # For console/emergency access
- plain_text_passwd: ubuntu
- lock_passwd: false
- ssh_authorized_keys:
- - ${SSH_PUBKEY}
- hostname: ${DISTRO}-${VERSION}-${KERNEL_FLAVOR}
- disable_root: true
- ssh_pwauth: true
+ # Password login is disabled by default; use SSH keys instead.
+ # To enable password login, override this template explicitly.
+ lock_passwd: true
+ ssh_authorized_keys:
+ - ${SSH_PUBKEY}
+ hostname: ${DISTRO}-${VERSION}-${KERNEL_FLAVOR}
+ disable_root: true
+ ssh_pwauth: false
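
The "explicit opt-in variable" idea from the comment above could look roughly like this on the generator side. It is a sketch under assumptions: `ALLOW_PASSWORD_LOGIN`, `LOCK_PASSWD`, `SSH_PWAUTH`, and `set_auth_vars` are hypothetical names that do not appear in the PR, and the templates would need matching `${LOCK_PASSWD}` / `${SSH_PWAUTH}` placeholders.

```shell
# Sketch only: all variable and function names here are hypothetical.
# Derive the values for the lock_passwd and ssh_pwauth template fields
# from a single opt-in switch that defaults to "off".
set_auth_vars() {
    if [ "${ALLOW_PASSWORD_LOGIN:-0}" = "1" ]; then
        LOCK_PASSWD="false"
        SSH_PWAUTH="true"
    else
        LOCK_PASSWD="true"
        SSH_PWAUTH="false"
    fi
}

set_auth_vars
echo "lock_passwd=${LOCK_PASSWD} ssh_pwauth=${SSH_PWAUTH}"
```

With this shape, key-only login is what a user gets when nothing is set, and password login requires a deliberate `ALLOW_PASSWORD_LOGIN=1` at generation time.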

Comment on lines +17 to +20
# Set password to 'alpine' (plaintext, cloud-init will hash it)
# For console/emergency access
plain_text_passwd: alpine
lock_passwd: false

Copilot AI Feb 17, 2026

This template enables password SSH login and sets a known plaintext password. Even for dev images, this is a high-risk default and makes accidental exposure much more dangerous. Consider disabling ssh_pwauth by default and requiring key-based SSH (or gating password auth behind an explicit opt-in variable).


hostname: ${DISTRO}-${VERSION}-${KERNEL_FLAVOR}
disable_root: true
ssh_pwauth: true

Copilot AI Feb 17, 2026

ssh_pwauth: true enables SSH password authentication. Combined with a predictable password in the user stanza, this creates an easy remote entry point if the VM becomes reachable. Consider setting this to false by default and relying on injected SSH keys.

Suggested change
- ssh_pwauth: true
+ ssh_pwauth: false

Comment on lines +110 to +126
"linux-image-${KERNEL_VERSION}" \
"linux-headers-${KERNEL_VERSION}" \
"linux-modules-extra-${KERNEL_VERSION}"
;;

gcp)
apt-get update -y
apt-get install -y \
"linux-image-${KERNEL_VERSION}" \
"linux-headers-${KERNEL_VERSION}"
;;

azure)
apt-get update -y
apt-get install -y \
"linux-image-${KERNEL_VERSION}" \
"linux-headers-${KERNEL_VERSION}"

Copilot AI Feb 17, 2026

The Ubuntu aws case installs linux-image-${KERNEL_VERSION} / linux-headers-${KERNEL_VERSION} without a flavor suffix, but elsewhere kernel versions are discovered from package names like linux-image-<ver>-aws. This is likely to fail to install the requested flavor/version. Consider using flavor-specific package names (and apply similarly to gcp/azure).

Suggested change
-     "linux-image-${KERNEL_VERSION}" \
-     "linux-headers-${KERNEL_VERSION}" \
-     "linux-modules-extra-${KERNEL_VERSION}"
- ;;
- gcp)
- apt-get update -y
- apt-get install -y \
-     "linux-image-${KERNEL_VERSION}" \
-     "linux-headers-${KERNEL_VERSION}"
- ;;
- azure)
- apt-get update -y
- apt-get install -y \
-     "linux-image-${KERNEL_VERSION}" \
-     "linux-headers-${KERNEL_VERSION}"
+     "linux-image-${KERNEL_VERSION}-${KERNEL_FLAVOR}" \
+     "linux-headers-${KERNEL_VERSION}-${KERNEL_FLAVOR}" \
+     "linux-modules-extra-${KERNEL_VERSION}-${KERNEL_FLAVOR}"
+ ;;
+ gcp)
+ apt-get update -y
+ apt-get install -y \
+     "linux-image-${KERNEL_VERSION}-${KERNEL_FLAVOR}" \
+     "linux-headers-${KERNEL_VERSION}-${KERNEL_FLAVOR}"
+ ;;
+ azure)
+ apt-get update -y
+ apt-get install -y \
+     "linux-image-${KERNEL_VERSION}-${KERNEL_FLAVOR}" \
+     "linux-headers-${KERNEL_VERSION}-${KERNEL_FLAVOR}"

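
Since the flavor suffix has to be repeated in every case branch, one option is to build the package names in a single place. The helper below is a hypothetical sketch (`kernel_packages` is not part of the PR); it only assembles names in the `<version>-<flavor>` form discussed above and does not check that such packages exist in any repository.

```shell
# Sketch only: kernel_packages is a hypothetical helper.
# Usage: kernel_packages VERSION FLAVOR [yes|no]
# Prints one package name per line; the third argument controls whether
# linux-modules-extra is included (only the aws case above installs it).
kernel_packages() {
    local version="$1"
    local flavor="$2"
    local with_extra="${3:-no}"
    local suffix="${version}-${flavor}"
    echo "linux-image-${suffix}"
    echo "linux-headers-${suffix}"
    if [ "${with_extra}" = "yes" ]; then
        echo "linux-modules-extra-${suffix}"
    fi
}

# Example values only; they are not taken from a real package index.
kernel_packages "5.19.0-50" "aws" yes
```

Each case branch could then pass the output to `apt-get install -y`, so a naming mistake only needs fixing once.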
Comment on lines +18 to +24

if mount -t virtiofs "${MOUNT_TAG}" "${MOUNT_POINT}" 2>/dev/null; then
echo "Mounted ${MOUNT_POINT} via virtiofs"
echo "${MOUNT_TAG} ${MOUNT_POINT} virtiofs rw,nofail 0 0" >> /etc/fstab
elif mount -t 9p -o trans=virtio,version=9p2000.L "${MOUNT_TAG}" "${MOUNT_POINT}" 2>/dev/null; then
echo "Mounted ${MOUNT_POINT} via 9p"
echo "${MOUNT_TAG} ${MOUNT_POINT} 9p trans=virtio,version=9p2000.L,rw,nofail 0 0" >> /etc/fstab

Copilot AI Feb 17, 2026

This step appends entries to /etc/fstab on every boot when a mount succeeds, which can lead to duplicate lines accumulating over time. Consider checking whether the exact line is already present before appending (or write a dedicated /etc/fstab.d file where supported).

Suggested change
- if mount -t virtiofs "${MOUNT_TAG}" "${MOUNT_POINT}" 2>/dev/null; then
-     echo "Mounted ${MOUNT_POINT} via virtiofs"
-     echo "${MOUNT_TAG} ${MOUNT_POINT} virtiofs rw,nofail 0 0" >> /etc/fstab
- elif mount -t 9p -o trans=virtio,version=9p2000.L "${MOUNT_TAG}" "${MOUNT_POINT}" 2>/dev/null; then
-     echo "Mounted ${MOUNT_POINT} via 9p"
-     echo "${MOUNT_TAG} ${MOUNT_POINT} 9p trans=virtio,version=9p2000.L,rw,nofail 0 0" >> /etc/fstab
+ FSTAB_LINE_VIRTIOFS="${MOUNT_TAG} ${MOUNT_POINT} virtiofs rw,nofail 0 0"
+ FSTAB_LINE_9P="${MOUNT_TAG} ${MOUNT_POINT} 9p trans=virtio,version=9p2000.L,rw,nofail 0 0"
+ if mount -t virtiofs "${MOUNT_TAG}" "${MOUNT_POINT}" 2>/dev/null; then
+     echo "Mounted ${MOUNT_POINT} via virtiofs"
+     if ! grep -qxF "${FSTAB_LINE_VIRTIOFS}" /etc/fstab 2>/dev/null; then
+         echo "${FSTAB_LINE_VIRTIOFS}" >> /etc/fstab
+     fi
+ elif mount -t 9p -o trans=virtio,version=9p2000.L "${MOUNT_TAG}" "${MOUNT_POINT}" 2>/dev/null; then
+     echo "Mounted ${MOUNT_POINT} via 9p"
+     if ! grep -qxF "${FSTAB_LINE_9P}" /etc/fstab 2>/dev/null; then
+         echo "${FSTAB_LINE_9P}" >> /etc/fstab
+     fi

Comment on lines +18 to +19
# Reload systemd to pick up fstab changes
systemctl daemon-reload

Copilot AI Feb 17, 2026

setup-virtfs.yaml uses systemctl daemon-reload, but this step is injected into the Alpine template as well (which uses OpenRC by default). That will fail on Alpine. Consider guarding systemctl usage (e.g., command -v systemctl) or splitting this step into systemd vs OpenRC variants.

Suggested change
- # Reload systemd to pick up fstab changes
- systemctl daemon-reload
+ # Reload systemd to pick up fstab changes (if systemd is available)
+ if command -v systemctl >/dev/null 2>&1; then
+     systemctl daemon-reload
+ fi


# Configure repository and branch (can be overridden via environment)
TRACEE_REPO="${TRACEE_REPO:-https://github.com/aquasecurity/tracee.git}"
TRACEE_BRANCH="${TRACEE_BRANCH:-main}"

Copilot AI Feb 17, 2026

TRACEE_REPO/TRACEE_BRANCH default to cloning and running scripts from a moving remote branch on first boot. This is a supply-chain/reproducibility risk. Consider pinning to a commit/tag by default (or requiring an explicit opt-in to use a moving branch) and/or verifying the fetched content.

Suggested change
- TRACEE_BRANCH="${TRACEE_BRANCH:-main}"
+ # Default to a pinned tag for reproducibility; override TRACEE_BRANCH to use another ref (including moving branches)
+ TRACEE_BRANCH="${TRACEE_BRANCH:-v0.19.0}"
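
The "explicit opt-in to use a moving branch" alternative mentioned in the comment could be sketched as a small guard. Everything here is hypothetical (`is_pinned_ref`, `check_ref`, and `ALLOW_MOVING_REF` are illustrative names), and treating a `vX.Y.Z` tag as immutable is a convention rather than a guarantee, since tags can be re-pointed.

```shell
# Sketch only: names are hypothetical and the "pinned" test is heuristic.

# Succeed when REF looks immutable: a full 40-hex commit SHA or a
# vX.Y.Z-style tag.
is_pinned_ref() {
    printf '%s\n' "$1" | grep -Eq '^([0-9a-f]{40}|v[0-9]+\.[0-9]+\.[0-9]+)$'
}

# Accept pinned refs; accept moving refs only with ALLOW_MOVING_REF=1.
check_ref() {
    local ref="$1"
    if is_pinned_ref "${ref}" || [ "${ALLOW_MOVING_REF:-0}" = "1" ]; then
        echo "ok: ${ref}"
    else
        echo "refusing moving ref '${ref}' (set ALLOW_MOVING_REF=1 to override)" >&2
        return 1
    fi
}

check_ref "v0.19.0"
```

A guard like this keeps `main` usable for development while making the reproducible choice the default path.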

Comment on lines +485 to +488
local task_output
task_output=$(aws ec2 import-image --disk-containers "Format=raw,UserBucket={S3Bucket=${S3_BUCKET},S3Key=${s3_key}}")
local import_task_id
import_task_id=$(echo "${task_output}" | grep -o '"ImportTaskId": "[^"]*"' | cut -d'"' -f4)

Copilot AI Feb 17, 2026

Parsing ImportTaskId out of JSON with grep/cut is fragile and can break if AWS CLI output format changes or includes multiple matches. Consider using --query with --output text (or jq) to reliably extract ImportTaskId.

Suggested change
- local task_output
- task_output=$(aws ec2 import-image --disk-containers "Format=raw,UserBucket={S3Bucket=${S3_BUCKET},S3Key=${s3_key}}")
- local import_task_id
- import_task_id=$(echo "${task_output}" | grep -o '"ImportTaskId": "[^"]*"' | cut -d'"' -f4)
+ local import_task_id
+ import_task_id=$(aws ec2 import-image \
+     --disk-containers "Format=raw,UserBucket={S3Bucket=${S3_BUCKET},S3Key=${s3_key}}" \
+     --query 'ImportTaskId' \
+     --output text 2>/dev/null || echo "")
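
Because the suggested form turns a failed call into an empty string, and the AWS CLI prints the literal `None` under `--output text` when a `--query` expression matches nothing, the captured value probably wants a check before it is used. A minimal sketch, with `valid_task_id` as a hypothetical helper name and `import-ami-example` as a made-up placeholder ID:

```shell
# Sketch only: valid_task_id is a hypothetical helper.
# Succeed only when the captured value is non-empty and not the literal
# "None" that the AWS CLI emits for a null query result in text mode.
valid_task_id() {
    local id="$1"
    [ -n "${id}" ] && [ "${id}" != "None" ]
}

# Demonstration with placeholder values instead of a real AWS call.
for candidate in "" "None" "import-ami-example"; do
    if valid_task_id "${candidate}"; then
        echo "usable: ${candidate}"
    else
        echo "rejected: '${candidate}'"
    fi
done
```

In the script, a rejected value would be the place to print an error and return non-zero instead of polling a task that was never created.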

Comment on lines +55 to +57
total_bytes=$(curl -sI -L "${url}" 2>/dev/null | grep -i '^content-length:' | awk '{print $2}' | tr -d '\r')
[[ -z "${total_bytes}" || ! "${total_bytes}" =~ ^[0-9]+$ ]] && total_bytes=0
if command -v curl &>/dev/null; then

Copilot AI Feb 17, 2026

download_with_progress() always runs a curl HEAD request to compute Content-Length. With set -e, this will abort on systems that only have wget installed. Consider using curl only when available (or use wget --spider) and gracefully fall back when Content-Length can’t be determined.

Suggested change
- total_bytes=$(curl -sI -L "${url}" 2>/dev/null | grep -i '^content-length:' | awk '{print $2}' | tr -d '\r')
- [[ -z "${total_bytes}" || ! "${total_bytes}" =~ ^[0-9]+$ ]] && total_bytes=0
- if command -v curl &>/dev/null; then
+ local downloader=""
+ # Choose downloader and, if possible, determine total size without aborting on failure.
+ if command -v curl &>/dev/null; then
+     downloader="curl"
+     total_bytes=$(curl -sI -L "${url}" 2>/dev/null | grep -i '^content-length:' | awk '{print $2}' | tr -d '\r' || true)
+ elif command -v wget &>/dev/null; then
+     downloader="wget"
+     # Use wget --spider to retrieve headers and extract Content-Length.
+     total_bytes=$(wget --spider -S "${url}" 2>&1 | grep -i '^ *content-length:' | awk '{print $2}' | tr -d '\r' || true)
+ else
+     print_error "Neither 'curl' nor 'wget' is installed; cannot download '${url}'."
+     return 1
+ fi
+ [[ -z "${total_bytes}" || ! "${total_bytes}" =~ ^[0-9]+$ ]] && total_bytes=0
+ if [[ "${downloader}" == "curl" ]]; then
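
One detail worth considering in either version: with `-L`, curl prints the headers of every response in a redirect chain, so the pipeline can yield several Content-Length values, which then fail the numeric check and fall back to 0. A possible way to isolate the parsing, sketched with a hypothetical `parse_content_length` helper that keeps only the last value:

```shell
# Sketch only: parse_content_length is a hypothetical helper.
# Read HTTP response headers on stdin and print the last Content-Length
# value seen (the final response after redirects), or 0 if none is valid.
parse_content_length() {
    local value
    value=$(tr -d '\r' | awk 'tolower($1) == "content-length:" { v = $2 } END { print v }')
    case "${value}" in
        ''|*[!0-9]*) echo 0 ;;
        *)           echo "${value}" ;;
    esac
}

# Simulated headers for a redirect followed by the real response.
printf 'HTTP/1.1 302 Found\r\nContent-Length: 0\r\n\r\nHTTP/1.1 200 OK\r\nContent-Length: 1048576\r\n\r\n' \
    | parse_content_length
```

Either downloader branch could pipe its header output through the same function, which also keeps the "unknown size means 0" fallback in one place.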

geyslan (Member, Author) commented Mar 24, 2026

This is not a priority right now; however, if one of you wants to take over, be my guest.
