tests/toolbox: adjust to be runnable on RHCOS#4017
tests/toolbox: adjust to be runnable on RHCOS#4017Rolv-Apneseth wants to merge 1 commit intocoreos:testing-develfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request successfully adapts the toolbox test to be runnable on RHCOS by replacing machinectl with su - core, expanding search patterns for toolboxes, and cleverly handling an older toolbox script via function overrides. The changes are well-structured and make the test more robust. I've identified a bug in the toolbox_count function that could cause the script to fail, and I've also suggested a refactoring to improve the clarity of the code that handles the older toolbox version. Overall, this is a great improvement.
445a653 to
6bf3929
Compare
|
Anybody know why that CI is failing? Also, should these be pointing at a fork? I assume this was for testing / debugging: |
|
/retest |
6bf3929 to
1c5679e
Compare
| version: 1.6.0 | ||
| storage: | ||
| files: | ||
| # Toolbox configuration for older streams using https://github.com/coreos/toolbox |
There was a problem hiding this comment.
What do you mean by this exactly?
There was a problem hiding this comment.
Yeah sorry, wasn't sure how to word it. I suppose it's only RHCOS 9.x and older that use the coreos toolbox
There was a problem hiding this comment.
How about
Toolbox configuration for RHCOS variants < 10.x using https://github.com/coreos/toolbox
There was a problem hiding this comment.
ok I see. So this is wholly for the legacy toolbox?
TBH in that case I would just put it in the test itself for the legacy case:
cat <<EOF > /var/home/core/.toolboxrc
REGISTRY=quay.io
IMAGE=fedora/fedora-toolbox:latest
TOOLBOX_NAME=fedora-toolbox-latest
AUTHFILE=/dev/null
EOF
There was a problem hiding this comment.
Unfortunately, RHCOS 10 also use the legacy coreos/toolbox implementation. So it's a RHCOS vs FCOS thing right now.
There was a problem hiding this comment.
When I was testing it was only 9.8 that had it, not even c9s.
Here's the toolbox package metadata from the release browser for each (not sure where else to get a list of packages):
- 9.8:
[ "toolbox", "0", "0.1.2", "1.rhaos4.22.el9", "noarch" ] - 10.0:
[ "toolbox", "0", "0.3", "1.el10", "x86_64" ] - c9s:
[ "toolbox", "0", "0.3", "1.el9", "x86_64" ] - c10s:
[ "toolbox", "0", "0.3", "1.el10", "x86_64" ]
| - path: /var/home/core/config.json | ||
| contents: | ||
| inline: "{}" |
There was a problem hiding this comment.
Not sure I understand the point of this
There was a problem hiding this comment.
I will double check but I believe when I was running it, the coreos toolbox would fail if the file didn't exist
There was a problem hiding this comment.
Yeah, it fails without that. Either:
Error: credential file is not accessible: faccessat /var/lib/kubelet/config.json: no such file or directory
Or, if specifying AUTHFILE:
Error: credential file is not accessible: faccessat /var/home/core/config.json: no such file or directory
There was a problem hiding this comment.
ok so you mean coreos/toolbox fails if a credentials file doesn't exist? Does it work if you use AUTHFILE=/dev/null ?
There was a problem hiding this comment.
Unfortunately not:
Error: unable to copy from source docker://quay.io/fedora/fedora-toolbox:latest: initializing source docker://quay.io/fedora/fedora-toolbox:latest: getting username and password: reading JSON file "/dev/null": unmarshaling JSON at "/dev/null": unexpected end of JSON input
There was a problem hiding this comment.
And yes, this whole butane file is only for coreos/toolbox
tests/kola/toolbox/test.sh
Outdated
|
|
||
| # IMPORTANT: Commands are run indirectly via `su - core` to re-create the | ||
| # user environment needed for unprivileged podman functionality. | ||
| run_as_core() |
There was a problem hiding this comment.
we use a similar function to this in a few other tests (search for runascoreuser). might be a good idea to put a function like this in commonlib.sh and then use that everywhere.
migt be good to point AI at this and it could do it in like 10s.
There was a problem hiding this comment.
Note though.. that machinectl shell is a more proper way to get an environment that more closely models what a user session would have. Maybe we should continue to use machinectl shell here.
cc @travier
There was a problem hiding this comment.
oh - hmm. I now see the comment above that mentions RHCOS doesn't have machinectl. I wonder if we should just add it.
There was a problem hiding this comment.
Yes that was a big part of the complication with getting it to run on rhcos. I did consider adding it to the common lib but I wanted to see if we were gonna keep the machinectl approach for systems that support it first
There was a problem hiding this comment.
yeah, let's see what @travier thinks on the machinectl in RHCOS thing.
There was a problem hiding this comment.
I would prefer we keep using machinectl on FCOS. I think I had tried with su, I don't remember why it did not work.
There was a problem hiding this comment.
I think my question was more: should we add machinectl to RHCOS?
There was a problem hiding this comment.
I don't think so if we don't need it
There was a problem hiding this comment.
if we don't need it for something else. I don't remember why it's in FCOS
There was a problem hiding this comment.
I had confirmed this works on both, which is why I didn't choose to split it here, but I don't mind keeping machinectl for FCOS if we think it's worth doing it the more "proper" way when we can
1c5679e to
4faf68a
Compare
Fixes openshift/os#1035 RHCOS incompatabilities and adjustments made by this commit: 1. No `machinectl` command on RHCOS, so instead use `su - core` to run commands as the core user. Also has the minor benefit of propagating error codes, though I opted to keep the explicit checks anyway. 2. Grep and awk calls were looking only for `fedora-toolbox-`. Simply expand the search to also accept `rhel-toolbox-`. 3. The rhel-9.8 variant is using [this toolbox](https://github.com/coreos/toolbox). This is the one that adds the most complexity to this test, as commands are completely different, the script uses root podman and requires extra configuration.
4faf68a to
66f5ac1
Compare
| # RHEL registry requires credentials - use quay.io registry and a Fedora image for testing | ||
| - path: /var/home/core/.toolboxrc | ||
| contents: | ||
| inline: | | ||
| REGISTRY=quay.io | ||
| IMAGE=fedora/fedora-toolbox:latest | ||
| TOOLBOX_NAME=fedora-toolbox-latest | ||
| AUTHFILE=/var/home/core/config.json |
There was a problem hiding this comment.
Unfortunately that will not test the same thing as the legacy coreos/toolbox behaves differently when there are special labels on the image and I think I remember those being there in the RHEL one but not the upstream containers/toolbox one.
There was a problem hiding this comment.
Would you mind explaining, I don't really understand. Would I need to get it to run the RHEL images here then? How do I get the required authentication?
I assume this is what you're referring to with the labels
|
As this will never be used in FCOS, I think you should keep the current FCOS test as is and duplicate it in RHCOS instead. That way we will not have to undo the change here when we move away from the legacy one in RHCOS and we'll only have to delete it there. |
But then any changes/additions will need to be made in 2 places even though they'll eventually both be using containers/toolbox right? For example, I was thinking we should add at least a line in this test checking toolbox works when |
Closes openshift/os#1035
This will should enable running the basic toolbox test for RHCOS.
To explain a couple of the reasons it would not work in its current state in RHCOS, and the changes this PR makes to get around them:
machinectlcommand on RHCOS. I've instead usedsu - coreto run commands as the core user. Also has the minor benefit of propagating error codes, though I opted to keep the explicit checks anyway.fedora-toolbox-. I simply expanded the search to also acceptrhel-toolbox-.I would love feedback, as I'm not sure all the decisions I made were the best ones. For example, I could see a case for keeping the
machinectlpath for systems which have it available, as it actually simulates an interactive login. Also, do we even need to test that older toolbox script? Either way, I'm happy to make any changes.