Support arm #2

tsorya · 2021-06-28T17:34:09Z

No description provided.

Older versions of go are out of support, so for security compliance, we were trying to get all components on the latest version. 1.14 is already out of support, i.e. https://endoflife.date/go

…run instead of timeout to pull single image. (openshift#218)

Support both APIs. The distinction is made by the flags --cluster-id and --infra-env-id: if --cluster-id is present, the v1 APIs are used; if --infra-env-id is present, the v2 APIs are used.

…hift#227)

This PR adds the output of `lsblk` to the logs related to the mounts and disks on the target node. The information currently gathered is detailed only for mounted devices and says nothing about additional devices that are present but not mounted. With the `lsblk` output we can see not only that a device exists but also its size and type.

…hift#231)

Adding oVirt provider support requires some changes on the agent side. The oVirt platform is detected by the family parameter rather than by the product name, which can vary. To detect the right platform and add it to the isVirtual list, a new function was added that changes the host product type to match the oVirt one, so the rest of the flow stays the same as for other host types.

The TPM version is validated in the service to make sure the specific TPM HW is supported by openshift. Signed-off-by: Yoni Bettan <[email protected]>

…hift#236) Signed-off-by: Flavio Percoco <[email protected]> Co-authored-by: Flavio Percoco <[email protected]>

Signed-off-by: Flavio Percoco <[email protected]>

NO-ISSUE: Add flaper87 to approvers list

The original plan was to move all images to ubi8. This is not possible because some packages needed by other projects are missing there. We are now going to switch all images to stream8, in the hope that consistency across repos will prevent, or help with debugging, current and future issues in CI. The goal is to keep components' builds as consistent as possible in the channels we are releasing them on. Signed-off-by: Flavio Percoco <[email protected]> Co-authored-by: Flavio Percoco <[email protected]>

…dLogs command (openshift#239) In order to move the Agent to the V2SendLogs command, we must first prepare the Agent to receive an additional argument (InfraEnvID); otherwise it may crash on an unknown flag. Only after that can we change the command passed between Assisted-Service and the Agent, and only then can we change the Agent code to use the V2 API.

…enshift#240) This is the final commit for this issue. In the previous ones we changed the Agent to extract InfraEnvID from the sendLogs command parameters (if it exists) and changed AssistedService to send the InfraEnvID of the host in the sendLogs command.

A trailing `\n` in the TPM version makes the agent send an invalid TPM version format to the service, which causes it to fail. Signed-off-by: Yoni Bettan <[email protected]>

…shift#246) This PR changes Agent code to use V2 API for HostLogProgress instead of V1 API

…0 if it failed on one of the images (openshift#245)

…d, (openshift#251) return only empty list

k4e-device-worker uses the inventory for collecting hardware information. However, the k4e-device-worker runs as a process on the host and not inside a container, so the path to its root filesystem doesn't need to be chrooted. This PR enables the caller to provide the chroot root folder. Signed-off-by: Moti Asayag <[email protected]>

This commit introduces a dry run mode to the agent. The mode is activated when the `DRY_ENABLE=true` environment variable is set (or when the `--dry-run` flag is passed to the installer). Dry run mode disables several destructive actions performed by the agent. Its main purpose is to let us run a lot of agents on the same machine without harming that machine while still communicating with the service, which is useful when load testing the service. Other than disabling destructive actions, it also disables actions that take too long due to networking / processing reasons.

See the diff for the exact changes, but here's a summary:

- Next step runner retry delay during dry run reduced from 1h to 1m. I believe 1h was chosen to reduce load on production servers, and it's annoyingly long when dealing with crashes during load testing.
- Added journal `DRY_AGENT_ID` field to help separate between logs of multiple dry run agents running at the same time on the same machine.
- When registering a host, the host ID defined by the `DRY_HOST_ID` environment variable will be used rather than the host ID retrieved from the hardware.
- `diagnoseSystem` is skipped, it's not necessary and consumes CPU.
- Image availability is skipped, dummy results are returned instead.
- Disk speed check: 1ms is returned immediately rather than doing destructive/slow disk checks.
- `getBmcAddress`/`getBmcV6Address` are skipped, the default "not found" `0.0.0.0` or `::/0` are immediately returned.
- `smartctl` calls are skipped, it's too slow and not useful, dummy hard-coded smartctl JSON is returned.
- The first interface's MAC address is overridden with the MAC address specified in the `DRY_MAC_ADDRESS` env variable. This is useful for when you want multiple agents to run on the same machine, but present different MAC addresses to the service, so they can be individually identified and separated by BMAC.
- NTP sync returns an empty list immediately, it's too slow and not really necessary when doing a fake installation.
- Added the `--cgroup` namespace to `nsenter`. The reasoning for that is explained in detail in a code comment in this commit's diff.
- Agent and next_step_runner will now halt when the file whose path is configured by `DRY_FAKE_REBOOT_MARKER_PATH` gets created. This file is used by the installer to signal that a "fake reboot" happened.

Other unrelated changes:

- Added some test artifacts to `.dockerignore` to prevent them from causing cache misses after doing a `COPY . .` in the Dockerfile.
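As a rough illustration of how the dry-run settings above could be read, here is a minimal Go sketch. The environment variable names and the 1h/1m retry delays come from the commit message; the `dryRunConfig` struct, its fields, and both function names are hypothetical and are not taken from the agent's code.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// dryRunConfig groups the dry-run settings named in the commit message.
// The struct and its field names are hypothetical.
type dryRunConfig struct {
	Enabled              bool
	HostID               string
	MACAddress           string
	FakeRebootMarkerPath string
}

// loadDryRunConfig reads the settings through getenv so callers can pass
// os.Getenv in production and a fake lookup in tests.
func loadDryRunConfig(getenv func(string) string) dryRunConfig {
	return dryRunConfig{
		Enabled:              getenv("DRY_ENABLE") == "true",
		HostID:               getenv("DRY_HOST_ID"),
		MACAddress:           getenv("DRY_MAC_ADDRESS"),
		FakeRebootMarkerPath: getenv("DRY_FAKE_REBOOT_MARKER_PATH"),
	}
}

// nextStepRetryDelay returns 1m in dry run mode and 1h otherwise,
// mirroring the retry delays described in the summary.
func nextStepRetryDelay(cfg dryRunConfig) time.Duration {
	if cfg.Enabled {
		return time.Minute
	}
	return time.Hour
}

func main() {
	cfg := loadDryRunConfig(os.Getenv)
	fmt.Println("dry run:", cfg.Enabled, "retry delay:", nextStepRetryDelay(cfg))
}
```

Passing the lookup function in, instead of calling `os.Getenv` directly, is just a design choice that keeps the sketch easy to test.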

When the installer launches the logs-sender binary from the agent, we encounter this error: `Logs were sent\n1 error occurred:\n\t* /usr/bin/lsblk failed: 1 lsblk: unknown column: NAME,MAJ:MIN,SIZE,TYPE,FSTYPE,KNAME,MODEL,UUID,WWN,HCTL,VENDOR,STATE,TRAN,PKNAME\n\n\n\n\n` It is caused by the columns list and the `-o` flag being inside the same parameter. This commit separates them to avoid this error
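The fix described above can be sketched in Go as follows. The column list and the `/usr/bin/lsblk` path are taken from the error message; the `lsblkArgs` helper name is hypothetical. The point is that `-o` and the column list are two separate argv entries, so `lsblk` does not treat the combined string as one unknown column.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// lsblkColumns is the column list quoted in the error message.
var lsblkColumns = []string{
	"NAME", "MAJ:MIN", "SIZE", "TYPE", "FSTYPE", "KNAME", "MODEL",
	"UUID", "WWN", "HCTL", "VENDOR", "STATE", "TRAN", "PKNAME",
}

// lsblkArgs returns the flag and its value as two separate arguments.
// Passing them as a single string such as "-o NAME,..." would hand lsblk
// one argument that it cannot parse as a valid column list.
func lsblkArgs() []string {
	return []string{"-o", strings.Join(lsblkColumns, ",")}
}

func main() {
	// The command is only constructed here, not executed.
	cmd := exec.Command("/usr/bin/lsblk", lsblkArgs()...)
	fmt.Println(cmd.Args)
}
```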

Currently, when logs-sender fails, we don't know why or at which step, because there is no output and the agent has nothing to send to the service. Writing the logs-sender logs to stdout allows the agent to gather them on failure.

…ft#254) The agent binary only checked whether a reboot happened when next_step_runner had an error. But when a reboot happens, next_step_runner exits cleanly with a 0 exit code, so the agent should check for a dry reboot only if next_step_runner had no error.
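The corrected condition might look roughly like the Go sketch below. It assumes the "fake reboot" is signalled by the existence of a marker file, as the dry run commit describes; the function names and the `errRunnerFailed` value are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errRunnerFailed stands in for any error returned by next_step_runner.
var errRunnerFailed = errors.New("next_step_runner failed")

// dryRebootHappened reports whether the fake-reboot marker file exists.
func dryRebootHappened(markerPath string) bool {
	if markerPath == "" {
		return false
	}
	_, err := os.Stat(markerPath)
	return err == nil
}

// shouldHalt checks the marker only when the runner exited cleanly,
// because a fake reboot makes next_step_runner exit with code 0.
func shouldHalt(runnerErr error, markerPath string) bool {
	return runnerErr == nil && dryRebootHappened(markerPath)
}

func main() {
	// "." always exists, so it plays the role of a created marker here.
	fmt.Println(shouldHalt(nil, "."))
	fmt.Println(shouldHalt(errRunnerFailed, "."))
}
```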

…dry mode (openshift#256) Dry run mode was only tried so far with single-node. This commit makes the IP and hostname configurable, allowing the swarm to launch multi-node clusters without having hosts clash. It also disables the connectivity check in dry run mode because with different fake IP addresses that would obviously not work

…penshift#257)

Assisted Installer Service already adds the suffix "config/worker". Remove it on the Agent side to avoid ending up with this URL: http://<API IP>:22624/config/worker/config/worker

The CIDR validation, the majorityGroup check, and the L2 and L3 validations, are already performed by assisted service when validating the cluster, host, and network data. There should not be a need to replicate this validation in the agent itself. Signed-off-by: Flavio Percoco <[email protected]>

Added 'Accept' to ignition download request header to explicitly specify version 3.2.0. This is required to avoid a redundant conversion to v2.2[1] and a failure in machine-config-operator when LUKS is enabled[2] [1] https://github.com/openshift/machine-config-operator/blob/9c6c2bfd7ed498bfbc296d530d1839bd6a177b0b/pkg/server/api.go#L154 [2] error: failed to convert config from spec v3.2 to v2.2: unable to convert Ignition spec v3 config to v2: LUKS is not supported on 2.2
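A minimal Go sketch of setting such a header is shown below. The exact header value is an assumption based on the Ignition media type convention (`application/vnd.coreos.ignition+json` with a `version` parameter); it is not copied from the agent's code. The `newIgnitionRequest` helper and the example address are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// ignitionAccept is an assumed Accept value asking for Ignition spec 3.2.0.
const ignitionAccept = "application/vnd.coreos.ignition+json; version=3.2.0"

// newIgnitionRequest builds a GET request that states the wanted
// Ignition version explicitly instead of relying on the server default.
func newIgnitionRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", ignitionAccept)
	return req, nil
}

func main() {
	// 192.0.2.1 is a documentation-only address used as a placeholder.
	req, err := newIgnitionRequest("http://192.0.2.1:22624/config/worker")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Header.Get("Accept"))
}
```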

Add support of Authorization token and CA cert to the API VIP connectivity validation.

…API VIP connectivity failure (openshift#264) The apivip request passes a base64-encoded CA cert; the agent should decode it before trying to parse it.

Return ignition as part of APIVipConnectivityResponse for exposing LUKS (disk encryption) information to the service. This is required as part of disk encryption validation for day2 hosts.

omertuc and others added 30 commits July 14, 2021 03:53

MGMT-7210: Upgrade Go version to 1.16 (openshift#215)

38841de

Older versions of go are out of support, so for security compliance, we were trying to get all components on the latest version. 1.14 is already out of support, i.e. https://endoflife.date/go

MGMT-6977: Make sure that the image availability timeout is per tool …

94d7130

…run instead of timeout to pull single image. (openshift#218)

Bug 1975672: Parse GW nil as net zero (openshift#217)

1d517b6

MGMT-7383: Add v2 APIs to agent (openshift#219)

bda33e2

Support both APIs. The distinction is made by the flags --cluster-id and --infra-env-id: if --cluster-id is present, the v1 APIs are used; if --infra-env-id is present, the v2 APIs are used.
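The flag-based selection could be sketched in Go like this. Only the two flag names come from the commit message; the `selectAPI` function, the `apiVersion` type, and the choice to reject "both" and "neither" are assumptions made for illustration.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// apiVersion names which service API family the agent should call.
type apiVersion string

const (
	apiV1 apiVersion = "v1"
	apiV2 apiVersion = "v2"
)

// selectAPI parses the two flags and returns the API version together
// with the ID that selected it. Treating "both set" and "neither set"
// as errors is an assumption of this sketch.
func selectAPI(args []string) (apiVersion, string, error) {
	fs := flag.NewFlagSet("agent", flag.ContinueOnError)
	clusterID := fs.String("cluster-id", "", "cluster ID (selects v1 APIs)")
	infraEnvID := fs.String("infra-env-id", "", "infra-env ID (selects v2 APIs)")
	if err := fs.Parse(args); err != nil {
		return "", "", err
	}
	switch {
	case *clusterID != "" && *infraEnvID != "":
		return "", "", errors.New("only one of --cluster-id and --infra-env-id may be set")
	case *infraEnvID != "":
		return apiV2, *infraEnvID, nil
	case *clusterID != "":
		return apiV1, *clusterID, nil
	default:
		return "", "", errors.New("one of --cluster-id or --infra-env-id is required")
	}
}

func main() {
	version, id, err := selectAPI([]string{"--infra-env-id", "example-id"})
	fmt.Println(version, id, err)
}
```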

MGMT-7407: Remove V1 APIs from agent (openshift#221)

c157386

Bug 1990060: Host returns no routes with multipart (openshift#220)

b14d0a3

NO-ISSUE: Align IP protocol version number (openshift#222)

6c955e8

MGMT-7443: add removable attr to Disk (openshift#224)

3d697f0

MGMT-7597: Remove ronniel1, and razregev from OWNERS (openshift#226)

38136d7

MGMT-7659: Generate and update the agent (openshift#228)

efe8ea0

NO-ISSUE - Add eliorerz to OWNERS (openshift#229)

938df89

MGMT-5400: Get physical_bytes from ghw if dmidecode was failed (opens…

e96269a

…hift#227)

MGMT-7661: Add property called physical_bytes_method to memory (opens…

60ac74e

…hift#231)

MGMT-7457: Adding TPM version to host inventory. (openshift#232)

4d39baf

The TPM version is validated in the service to make sure the specific TPM HW is supported by openshift. Signed-off-by: Yoni Bettan <[email protected]>

MGMT-7847: Add a deprecation warning for the VerifyCIDR option (opens…

14870a7

…hift#236) Signed-off-by: Flavio Percoco <[email protected]> Co-authored-by: Flavio Percoco <[email protected]>

NO-ISSUE: Add flaper87 to approvers list

763eaf7

Signed-off-by: Flavio Percoco <[email protected]>

Merge pull request openshift#238 from flaper87/flaperowner

38b5a5f

NO-ISSUE: Add flaper87 to approvers list

NO-ISSUE: Add lranjbar to OWNERS_ALIASES (openshift#234)

422b94c

NO-ISSUE: Removing \n suffix from TPM version. (openshift#242)

7918e3b

The trailing `\n` makes the agent send an invalid TPM version format to the service, which causes it to fail. Signed-off-by: Yoni Bettan <[email protected]>
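For illustration, stripping the suffix can be as small as the Go sketch below. Where the raw value comes from is not stated in the commit message, so the input here is a hard-coded example, and `normalizeTPMVersion` is a hypothetical name. Using `strings.TrimSpace` rather than trimming only `\n` is also an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeTPMVersion removes surrounding whitespace, including a
// trailing newline, so the service receives a bare version string.
func normalizeTPMVersion(raw string) string {
	return strings.TrimSpace(raw)
}

func main() {
	fmt.Printf("%q\n", normalizeTPMVersion("2.0\n"))
}
```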

NO-ISSUE: Remove user yuvigold (openshift#244)

37a5214

Bug 2012839: Agent should use V2 API for HostLogProgress report (open…

dda746a

…shift#246) This PR changes Agent code to use V2 API for HostLogProgress instead of V1 API

OCPBUGSM-34971: container image availability command should not exit …

761e669

…0 if it failed on one of the images (openshift#245)

Bug 2017502: don't return error from Resolver in case domain not foun…

ea4b281

…d, (openshift#251) return only empty list
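A hedged Go sketch of this behaviour: a "domain not found" result is mapped to an empty list, while other lookup errors are still returned. It relies on the standard library's `net.DNSError.IsNotFound` field; the `resolve` function, the `lookupFunc` type, and the fake lookup are hypothetical and do not reflect the agent's actual Resolver.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// lookupFunc has the same shape as net.LookupHost so a fake can be injected.
type lookupFunc func(host string) ([]string, error)

// resolve returns an empty list, not an error, when the domain is not found.
func resolve(lookup lookupFunc, domain string) ([]string, error) {
	addrs, err := lookup(domain)
	if err != nil {
		var dnsErr *net.DNSError
		if errors.As(err, &dnsErr) && dnsErr.IsNotFound {
			return []string{}, nil
		}
		return nil, err
	}
	return addrs, nil
}

// notFoundLookup is a fake lookup that always reports "no such host".
func notFoundLookup(host string) ([]string, error) {
	return nil, &net.DNSError{Err: "no such host", Name: host, IsNotFound: true}
}

func main() {
	addrs, err := resolve(notFoundLookup, "missing.example.invalid")
	fmt.Println(len(addrs), err)
}
```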

omertuc and others added 14 commits November 7, 2021 14:46

MGMT-8294: Adapt assisted installer agent to latest swagger changes (o…

1b51ecd

…penshift#257)

OCPBUGSM-37106: List only physical disks on inventory. (openshift#259)

51fca8b

MGMT-8422: apivip check remove suffix (openshift#260)

e727246

Assisted Installer Service already adds the suffix "config/worker". Remove it on the Agent side to avoid ending up with this URL: http://<API IP>:22624/config/worker/config/worker

MGMT-8424: Add token and CA cert to API VIP check (openshift#261)

487be21

Add support of Authorization token and CA cert to the API VIP connectivity validation.

MGMT-8623: capi-provider-agent fail to add host to hypershift due to …

e37c419

…API VIP connectivity failure (openshift#264) The apivip request passes a base64-encoded CA cert; the agent should decode it before trying to parse it.
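The decode-then-parse order can be sketched in Go as below. It assumes standard base64 encoding and a PEM-encoded certificate underneath, neither of which is stated in the commit message; `certPoolFromBase64` is a hypothetical helper name.

```go
package main

import (
	"crypto/x509"
	"encoding/base64"
	"errors"
	"fmt"
)

// certPoolFromBase64 decodes the base64 payload first and only then
// parses the result as PEM. Parsing the still-encoded string directly
// would find no certificate at all.
func certPoolFromBase64(encoded string) (*x509.CertPool, error) {
	pemBytes, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, fmt.Errorf("decoding CA cert: %w", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pemBytes) {
		return nil, errors.New("no valid PEM certificate found in decoded CA cert")
	}
	return pool, nil
}

func main() {
	// "bm90IGEgY2VydA==" is base64 for "not a cert", so parsing fails.
	_, err := certPoolFromBase64("bm90IGEgY2VydA==")
	fmt.Println(err)
}
```

The returned pool could then be set as `RootCAs` in a `tls.Config` for the connectivity check.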

MGMT-7428: return ignition in apivip_check response (openshift#263)

c7a3a43

Return ignition as part of APIVipConnectivityResponse for exposing LUKS (disk encryption) information to the service. This is required as part of disk encryption validation for day2 hosts.

Support arm

4e1f477

moving to mac in case of bad motherboard serial

cb92797

tsorya force-pushed the igal/arm branch from d0829f8 to cb92797 Compare January 5, 2022 17:30
