
update pattern to use more recent common subtree #59


Closed
wants to merge 365 commits

Conversation

dminnear-rh
Contributor

The two main changes in this commit are:

  1. Update the common subtree to our modern, slimmed-down common (the Ansible bits and Helm charts, like the one for the cluster group, now live outside common).
  2. Make the operand image for NFD use a tag based on the cluster version (see the sketch below). This is necessary to run on more recent versions of OpenShift such as 4.17 and 4.18: the existing hardcoded image does not include the nfd-gc binary that the operator expects, which leaves the NFD operator unable to fully roll out all its resources. This, in turn, prevents the NVIDIA GPU operator from becoming completely ready, and the ibm-granite-instruct-predictor deployment fails to roll out because GPU nodes are not properly labeled.
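A minimal sketch of what change 2 might look like, assuming the pattern templates the NodeFeatureDiscovery CR and that the framework exposes the cluster version to Helm (the `.Values.global.clusterVersion` name and the tag format are assumptions):

```yaml
# Illustrative sketch only: pin the NFD operand image tag to the cluster
# version instead of hardcoding one. `.Values.global.clusterVersion` is
# assumed to be injected by the framework (e.g. "4.17").
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
  namespace: openshift-nfd
spec:
  operand:
    image: "registry.redhat.io/openshift4/ose-node-feature-discovery:v{{ .Values.global.clusterVersion }}"
```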

I was able to successfully deploy onto an OpenShift cluster with 3 control-plane and 3 compute nodes, all in us-west-2a.


mbaldessari and others added 30 commits May 14, 2024 00:11
Switch to registry.redhat.io for the initContainer image
- The if statement was checking for .Values.global.extraValueFiles.
- We now check the .extraValueFiles key in the managedClusterGroups section, as in the example and sketch below:

  managedClusterGroups:
    aro-prod:
      name: innovation
      acmlabels:
        - name: clusterGroup
          value: innovation
      extraValueFiles:
        - '/overrides/values-common-capabilities.yaml'
      helmOverrides:
        - name: clusterGroup.isHubCluster
          value: "false"
Update the ACM chart's application-policies.yaml
- Problem Statement
  The current **clustergroup** schema does not allow the definition of **extraParameters** under the **main** section of a values file.

- Caveat
  The user-defined variables in the **extraParameters** section are only applied if the user deploys the pattern via the command line, using `./pattern.sh make install` or `./pattern.sh make operator-deploy`, and not via the OpenShift Validated Patterns Operator UI.

- Fix Description
  Add the **extraParameters** to the definition of **Main.properties** in the values.schema.json:

        "extraParameters": {
          "type": "array",
          "description": "Pass in extra Helm parameters to all ArgoCD Applications and the framework."
        },

- This will allow users to define extra parameters that will be added by the framework to the ArgoCD applications it creates.

- For more information see validatedpatterns/common#510
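As an illustration, a values file could then include something like the following (a sketch; the parameter name shown is hypothetical):

```yaml
main:
  clusterGroupName: hub
  extraParameters:
    - name: global.deployTarget   # hypothetical extra Helm parameter
      value: "aws"
```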

- The operator adds these extraParameters to the extraParametersNested section as key/value pairs in the Cluster Wide ArgoCD Application created by the Validated Patterns operator.
- This update adds the user-defined extra parameters to the ArgoCD Applications on the spoke clusters.
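For context, a sketch of how such a parameter might surface on a generated Application on a spoke (illustrative only; the Application name here is hypothetical and the actual spec the operator produces may differ):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: spoke-clustergroup        # hypothetical name
  namespace: openshift-gitops
spec:
  source:
    helm:
      parameters:
        - name: global.deployTarget   # the user-defined extra parameter
          value: "aws"
```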

We'd like to make the imperative namespace optional, so let's use the
golang-external-secrets one, which is probably more correct anyway
since the ACM hub CA is tied to ESO.
The ACM hub CA is needed for ESO on spokes to connect to the Vault on
the hub; there is no need for it when Vault is not used, so let's
drop it in that case.
Feat: Followup to definition of extraParameters under the main section of a values file.

Co-authored-by: Michele Baldessari <[email protected]>
Co-authored-by: Alejandro Villegas <[email protected]>
Signed-off-by: Tomer Figenblat <[email protected]>
This is important because in some situations (we've observed this on the
clusterwide Argo instance on spokes) the permissions are not there yet
when Argo tries to create the service accounts for the imperative SAs.

This means that the very first sync works up to the service account
creation, which then fails due to missing RBACs. This triggers a GitOps
issue where self-heal never retries because the previous run failed, so
the app is stuck in a loop forever.

Co-Authored-By: Jonny Rickard <[email protected]>

Closes: GITOPS-4677
Force rolebindings as early as possible
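One way to force this ordering, assuming the fix relies on Argo CD sync waves (the resource names, namespace, and wave value below are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: imperative-admin-rolebinding   # hypothetical name
  namespace: imperative
  annotations:
    # A negative sync wave makes Argo CD apply this binding before
    # wave-0 resources, so the RBAC exists before the SAs are created.
    argocd.argoproj.io/sync-wave: "-10"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
  - kind: ServiceAccount
    name: imperative-sa                # hypothetical
    namespace: imperative
```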
Problem Statement:
When setting a namespace like this:
    - openshift-distributed-tracing:
        operatorGroup: true
        targetNamespaces: []

The chart generates the following yaml:
```yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-distributed-tracing-operator-group
  namespace: openshift-distributed-tracing
spec:
  targetNamespaces:
```

Kubernetes rejects the targetNamespaces key as invalid when it attempts to apply the manifest and drops the key since it has no value. This just so happens to produce the desired result of not setting targetNamespaces (or a selector), enabling the operator for All Namespaces, but the generated manifest is still invalid.
bug: Invalid OperatorGroup generated when omitting targetNamespaces
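A sketch of a template guard that avoids emitting the invalid empty key (illustrative; the chart's actual fix may be structured differently):

```yaml
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-distributed-tracing-operator-group
  namespace: openshift-distributed-tracing
{{- /* Only render spec.targetNamespaces when the list is non-empty;
       omitting it selects All Namespaces cleanly. */}}
{{- if .targetNamespaces }}
spec:
  targetNamespaces:
{{- range .targetNamespaces }}
    - {{ . }}
{{- end }}
{{- end }}
```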
feat: use hive clusterdeployment for creating spoke clusters

Added support to control the scheduler/cluster spec
Actually use adminServiceAccountName for the auto approve job
This should fix the fact that jobs are triggered on unrelated changes
Make sure that the if condition on chart split is not always true
darkdoc and others added 28 commits October 22, 2024 10:04
Record the exit code at the right time
Fix path when invoking the qe run_test.sh script
Since the slimming of common this won't work anymore:

❯ make validate-schema
make -f common/Makefile validate-schema
make[1]: Entering directory '/home/michele/Engineering/cloud-patterns/multicloud-gitops'
Validating clustergroup schema of:  ./values-global.yamlError: repo common not found
make[1]: *** [common/Makefile:162: validate-schema] Error 1
make[1]: Leaving directory '/home/michele/Engineering/cloud-patterns/multicloud-gitops'
make: *** [Makefile:12: validate-schema] Error 2

Fix this by using the OCI Helm chart:
❯ make validate-schema
make -f common/Makefile validate-schema
make[1]: Entering directory '/home/michele/Engineering/cloud-patterns/multicloud-gitops'
Validating clustergroup schema of:  ./values-global.yamlPulled: quay.io/hybridcloudpatterns/clustergroup:0.9.13
Digest: sha256:725af54c0a5ad8c2235676bbff2785ece62c9929ab58aaf33837aa3f19708ce6
 ./values-group-one.yamlPulled: quay.io/hybridcloudpatterns/clustergroup:0.9.13
Digest: sha256:725af54c0a5ad8c2235676bbff2785ece62c9929ab58aaf33837aa3f19708ce6
 ./values-hub.yamlPulled: quay.io/hybridcloudpatterns/clustergroup:0.9.13
Digest: sha256:725af54c0a5ad8c2235676bbff2785ece62c9929ab58aaf33837aa3f19708ce6

make[1]: Leaving directory '/home/michele/Engineering/cloud-patterns/multicloud-gitops'
Yukin observed a case on a baremetal server where the install failed
with:

    make -f common/Makefile operator-deploy
    make[1]: Entering directory '/home/fedora/validated_patterns/multicloud-gitops'
    Checking repository:
      https://github.com/validatedpatterns-workspace/multicloud-gitops - branch 'qe_test-18760': OK
    Checking cluster:
      cluster-info: OK
      storageclass: OK
    Installing pattern: ....Installation failed [5/5]. Error:
    WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/fedora/rhvpsno2-intel-18760/auth/kubeconfig
    WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/fedora/rhvpsno2-intel-18760/auth/kubeconfig
    Pulled: quay.io/hybridcloudpatterns/pattern-install:0.0.7
    Digest: sha256:b845f86c735478cfd44b0b43842697851cec64737c737bd18a872fa86bb0484d
    customresourcedefinition.apiextensions.k8s.io/patterns.gitops.hybrid-cloud-patterns.io unchanged
    configmap/patterns-operator-config unchanged
    pattern.gitops.hybrid-cloud-patterns.io/multicloud-gitops created
    subscription.operators.coreos.com/patterns-operator unchanged
    make[1]: *** [common/Makefile:71: operator-deploy] Error 1
    make[1]: Leaving directory '/home/fedora/validated_patterns/multicloud-gitops'
    make: *** [Makefile:12: operator-deploy] Error 2

In fact the install proceeded just fine; we simply gave up too early.
Let's double the number of times we wait for this and also increase the
wait between tries by 5 seconds. Hopefully this covers these edge
cases.
If ACM is installed, the search for `applications` matches the ACM one
and not the Argo one.
This way we can override the TARGET_SITE when invoking pattern.sh
Add TARGET_SITE as an env variable
Since Ubuntu sometimes has /etc/pki/fwupd with little else in there,
let's only bind mount /etc/pki when /etc/pki/tls exists.
This keeps Fedora-based distros working and should fix this specific
corner case observed on Ubuntu.

Co-Authored-By: Akos Eros <[email protected]>

Closes: validatedpatterns/medical-diagnosis#130
Do not bind mount /etc/pki blindly

Currently, we pass the env var EXTRA_PLAYBOOK_OPTS into our utility container when running
the `pattern-util.sh` script; however, we do not use it anywhere. This commit propagates
the env var to the `ansible-playbook` commands that can make use of it.

As an example, you could set
```sh
export EXTRA_PLAYBOOK_OPTS="-vvv"
```
which would enable verbose logging for any of the ansible playbooks when we run `./pattern.sh make <make_target>`
in any of our pattern repos.
propagate the env var EXTRA_PLAYBOOK_OPTS to our ansible-playbook commands
A few small changes in this commit:
* Update the README to reference the `make-common-subtree` script in common rather than the MCG repo.
* Update the README and the `make-common-subtree` script to use the same default remote name for the common subtree that we use in our `update-common-everywhere` script.
* Rename the script file to use dashes rather than underscores for consistency.
* Update the name of our GH org to `validatedpatterns`.
git-subtree-dir: common
git-subtree-mainline: 2350a52
git-subtree-split: 7d184fb
@dminnear-rh
Contributor Author

I didn't realize we were already waiting on #55 to be merged. I'll close this for now and update common after Michele's PR is merged.
