forked from openshift/assisted-installer-agent
Support arm #2
Draft
tsorya wants to merge 44 commits into master from igal/arm
Older versions of Go are out of support, so for security compliance we are trying to get all components onto the latest version. Go 1.14 is already out of support; see https://endoflife.date/go
…run instead of timeout to pull single image. (openshift#218)
Support both APIs. The distinction is made by the flags --cluster-id and --infra-env-id: if --cluster-id is present we use the v1 APIs, and if --infra-env-id is present we use the v2 APIs.
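The flag-based selection described above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: `apiVersion`, `selectAPIVersion`, and the error for the case where neither flag is given are hypothetical, and giving `--cluster-id` precedence when both flags are set is an assumption the commit message does not confirm.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// apiVersion is a hypothetical label for the API family the agent should call.
type apiVersion string

const (
	apiV1 apiVersion = "v1"
	apiV2 apiVersion = "v2"
)

// selectAPIVersion parses the arguments and picks the API family from
// whichever ID flag is set. Precedence of --cluster-id is an assumption.
func selectAPIVersion(args []string) (apiVersion, error) {
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	clusterID := fs.String("cluster-id", "", "cluster ID (selects the v1 APIs)")
	infraEnvID := fs.String("infra-env-id", "", "infra-env ID (selects the v2 APIs)")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	switch {
	case *clusterID != "":
		return apiV1, nil
	case *infraEnvID != "":
		return apiV2, nil
	default:
		return "", errors.New("one of --cluster-id or --infra-env-id is required")
	}
}

func main() {
	version, err := selectAPIVersion([]string{"--infra-env-id", "example-id"})
	fmt.Println(version, err)
}
```

Keeping the decision in one function means the rest of the code can branch on a single value instead of re-checking both flags.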
This PR adds the output of `lsblk` to the logs related to the mounts and disks on the target node. Currently the gathered information is detailed only for mounted devices and omits any additional device that is present but not mounted. With this output we can see not only that a device exists but also its size and type.
To add oVirt provider support, some modifications are needed on the agent side. The oVirt platform is detected by the family parameter rather than by the product name, which can vary. To detect the right platform and add it to the isVirtual list, a new function was added that changes the host product type to match the oVirt one. This ensures that the rest of the flow remains the same as for the other host types.
The TPM version is validated in the service to make sure the specific TPM HW is supported by openshift. Signed-off-by: Yoni Bettan <[email protected]>
…hift#236) Signed-off-by: Flavio Percoco <[email protected]> Co-authored-by: Flavio Percoco <[email protected]>
Signed-off-by: Flavio Percoco <[email protected]>
NO-ISSUE: Add flaper87 to approvers list
The original plan was to move all images to ubi8. This is not possible because some packages needed by other projects are missing. We are now going to switch all images to stream8, in the hope that consistency across repos will prevent issues or help with debugging current and future issues in CI. The goal is to keep the components' builds as consistent as possible across the channels we release them on. Signed-off-by: Flavio Percoco <[email protected]> Co-authored-by: Flavio Percoco <[email protected]>
…dLogs command (openshift#239) To move the Agent to the V2SendLogs command, we must first prepare the Agent to accept an additional argument (InfraEnvID); otherwise it may crash on an unknown flag. Only after that can we change the command passed between Assisted-Service and the Agent, and then change the Agent code to use the V2 API.
…enshift#240) This is the final commit for this issue. In previous commits we changed the Agent to extract InfraEnvID from the sendLogs command parameters (if present) and changed AssistedService to send the host's InfraEnvID in the sendLogs command.
This makes the agent send an invalid TPM version format to the service and, as a result, fail. Signed-off-by: Yoni Bettan <[email protected]>
…shift#246) This PR changes Agent code to use V2 API for HostLogProgress instead of V1 API
…0 if it failed on one of the images (openshift#245)
…d, (openshift#251) return only empty list
k4e-device-worker uses the inventory for collecting hardware information. However, k4e-device-worker runs as a process on the host and not inside a container, so the path to its root filesystem doesn't need to be chrooted. The PR enables the caller to provide the chroot root folder. Signed-off-by: Moti Asayag <[email protected]>
This commit introduces a dry run mode to the agent. The mode is activated when the `DRY_ENABLE=true` environment variable is set (or when the `--dry-run` flag is passed to the installer). The dry run mode disables several destructive actions performed by the agent. Its main purpose is to let us run many agents on the same machine without harming that machine, while still communicating with the service. This is useful when load testing the service. Besides disabling destructive actions, it also disables actions that take too long for networking or processing reasons.

See the diff for the exact changes, but here's a summary:
- Next step runner retry delay during dry run reduced from 1h to 1m. I believe 1h was chosen to reduce load on production servers, and it's annoyingly long when dealing with crashes during load testing.
- Added a journal `DRY_AGENT_ID` field to help separate the logs of multiple dry run agents running at the same time on the same machine.
- When registering a host, the host ID defined by the `DRY_HOST_ID` environment variable is used rather than the host ID retrieved from the hardware.
- `diagnoseSystem` is skipped; it's not necessary and consumes CPU.
- Image availability is skipped; dummy results are returned instead.
- Disk speed check: 1ms is returned immediately rather than doing destructive/slow disk checks.
- `getBmcAddress`/`getBmcV6Address` are skipped; the default "not found" `0.0.0.0` or `::/0` is returned immediately.
- `smartctl` calls are skipped because they are too slow and not useful; dummy hard-coded smartctl JSON is returned.
- The first interface's MAC address is overridden with the MAC address specified in the `DRY_MAC_ADDRESS` env variable. This is useful when you want multiple agents to run on the same machine but present different MAC addresses to the service, so they can be individually identified and separated by BMAC.
- NTP sync returns an empty list immediately; it's too slow and not really necessary when doing a fake installation.
- Added the `--cgroup` namespace to `nsenter`. The reasoning for that is explained in detail in a code comment in this commit's diff.
- Agent and next_step_runner will now halt when the file whose path is configured by DRY_FAKE_REBOOT_MARKER_PATH gets created. This file is used by the installer to signal that a "fake reboot" happened.

Other unrelated changes:
- Added some test artifacts to `.dockerignore` to prevent them from causing cache misses after doing a `COPY . .` in the Dockerfile.
When the installer launches the logs-sender binary from the agent, we encounter this error: `Logs were sent\n1 error occurred:\n\t* /usr/bin/lsblk failed: 1 lsblk: unknown column: NAME,MAJ:MIN,SIZE,TYPE,FSTYPE,KNAME,MODEL,UUID,WWN,HCTL,VENDOR,STATE,TRAN,PKNAME\n\n\n\n\n` It is caused by the columns list and the `-o` flag being inside the same parameter. This commit separates them to avoid this error
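The fix described above comes down to passing `-o` and the column list as two separate argv elements. A minimal sketch follows, assuming a hypothetical helper `lsblkArgs`; the column names are the ones from the error message, and the command is only constructed here, not executed.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// lsblkColumns lists the columns quoted in the error message above.
var lsblkColumns = []string{
	"NAME", "MAJ:MIN", "SIZE", "TYPE", "FSTYPE", "KNAME", "MODEL",
	"UUID", "WWN", "HCTL", "VENDOR", "STATE", "TRAN", "PKNAME",
}

// lsblkArgs returns the flag and the column list as two separate arguments.
// Passing a single string such as "-o NAME,SIZE" would hand lsblk one
// argument containing both the flag and the list.
func lsblkArgs(columns []string) []string {
	return []string{"-o", strings.Join(columns, ",")}
}

func main() {
	cmd := exec.Command("lsblk", lsblkArgs(lsblkColumns)...)
	// Print the argv that would be used without running the command.
	fmt.Printf("%q\n", cmd.Args)
}
```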
Currently, when logs-sender fails, we don't know why or at which step, because there is no output and the agent has nothing to send to the service. We need to write logs-sender logs to stdout so that the agent can gather them on failure.
…ft#254) The agent binary only checked whether a reboot happened when next_step_runner had an error. That is not what happens on a reboot: next_step_runner exits cleanly with a 0 exit code, so we should only check for a dry reboot in the agent if next_step_runner had no error.
…dry mode (openshift#256) Dry run mode was only tried so far with single-node. This commit makes the IP and hostname configurable, allowing the swarm to launch multi-node clusters without having hosts clash. It also disables the connectivity check in dry run mode because with different fake IP addresses that would obviously not work
The Assisted Installer Service already adds the suffix "config/worker". Remove it on the Agent side to avoid ending up with this URL: http://<API IP>:22624/config/worker/config/worker
The CIDR validation, the majorityGroup check, and the L2 and L3 validations, are already performed by assisted service when validating the cluster, host, and network data. There should not be a need to replicate this validation in the agent itself. Signed-off-by: Flavio Percoco <[email protected]>
Added an 'Accept' header to the ignition download request to explicitly request spec version 3.2.0. This is required to avoid a redundant conversion to v2.2 [1] and a failure in machine-config-operator when LUKS is enabled [2].
[1] https://github.com/openshift/machine-config-operator/blob/9c6c2bfd7ed498bfbc296d530d1839bd6a177b0b/pkg/server/api.go#L154
[2] error: failed to convert config from spec v3.2 to v2.2: unable to convert Ignition spec v3 config to v2: LUKS is not supported on 2.2
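Setting such a header in Go might look like the sketch below. The exact header value is an assumption based on the usual Ignition media type form (`application/vnd.coreos.ignition+json` with a `version` parameter), and `newIgnitionRequest` and the example address are hypothetical; check the commit's diff for the value the agent actually sends.

```go
package main

import (
	"fmt"
	"net/http"
)

// ignitionAcceptHeader is an assumed header value requesting Ignition spec 3.2.0.
const ignitionAcceptHeader = "application/vnd.coreos.ignition+json; version=3.2.0"

// newIgnitionRequest builds a GET request that explicitly asks for spec 3.2.0.
func newIgnitionRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", ignitionAcceptHeader)
	return req, nil
}

func main() {
	// 192.0.2.10 is a documentation-only address used as a placeholder.
	req, err := newIgnitionRequest("http://192.0.2.10:22624/config/worker")
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Accept"))
}
```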
Add support of Authorization token and CA cert to the API VIP connectivity validation.
…API VIP connectivity failure (openshift#264) The apivip request passes a base64-encoded CA cert; the agent should decode it before trying to parse it.
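The decode-then-parse order can be illustrated as follows. This sketch assumes the certificate is PEM-encoded underneath the base64 layer, which the commit message does not state, and `certPoolFromBase64` is a hypothetical helper name.

```go
package main

import (
	"crypto/x509"
	"encoding/base64"
	"errors"
	"fmt"
)

// certPoolFromBase64 decodes the base64 layer first and only then parses
// the result as PEM (assumed format) into a certificate pool.
func certPoolFromBase64(encoded string) (*x509.CertPool, error) {
	pemBytes, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, fmt.Errorf("decoding CA certificate: %w", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pemBytes) {
		return nil, errors.New("no valid PEM certificate found in decoded data")
	}
	return pool, nil
}

func main() {
	// Input that is not valid base64 is rejected at the decoding step.
	_, err := certPoolFromBase64("not base64!")
	fmt.Println(err)
}
```

Skipping the decoding step would hand base64 text to the PEM parser, which finds no certificate block in it.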
Return ignition as part of APIVipConnectivityResponse for exposing LUKS (disk encryption) information to the service. This is required as part of disk encryption validation for day2 hosts.
No description provided.