Skip to content

add NetworkPolicy support to operator installation#151

Closed
rporres wants to merge 13 commits intomainfrom
MGMT-23668-network-policy
Closed

add NetworkPolicy support to operator installation#151
rporres wants to merge 13 commits intomainfrom
MGMT-23668-network-policy

Conversation

@rporres
Copy link
Copy Markdown
Contributor

@rporres rporres commented Mar 26, 2026

Apply three default NetworkPolicies to every operator namespace
(except openshift-operators): allow intra-namespace traffic, allow
ingress from cluster nodes for admission webhook access, and allow
ingress from the OpenShift monitoring stack.

Operators can opt out of all default policies via the new
disableDefaultNetworkPolicies flag, and opt in to additional
cross-namespace ingress policies via allowIngressFrom (a list of
name/namespaceSelector pairs). The schema forbids combining
allowIngressFrom with disableDefaultNetworkPolicies: true, since
disabling defaults while selectively re-allowing traffic is
semantically inconsistent.

Operators with exposed routes (quay-operator, ACM, cincinnati,
pipelines, ODF, LVMS) are configured with allowIngressFrom for the
OpenShift ingress namespace. MCE and ACM are configured with
bidirectional allowIngressFrom to preserve east-west traffic between
the two operators that collaborate at runtime.

cert-manager and openshift-gitops require policies in secondary
namespaces (cert-manager and openshift-gitops) that are created by
the operators themselves rather than by the generic installation
task; these are handled in each operator's own tasks.yaml.

The kube-apiserver NetworkPolicy uses ipBlock with machineNetwork
rather than a namespaceSelector for openshift-kube-apiserver.
kube-apiserver pods run with hostNetwork: true in OpenShift, meaning
their traffic originates from node IPs rather than pod-network
addresses. A namespaceSelector only matches pod-network traffic, so
it silently failed to allow webhook calls, leaving the implicit deny
in place and causing context deadline exceeded errors when creating
resources with admission webhooks (observed with the OCM webhook at
ocm-webhook.multicluster-engine.svc:443). Using ipBlock with
machineNetwork correctly covers all cluster nodes including the
control plane.

https://redhat.atlassian.net/browse/MGMT-23668

Summary by CodeRabbit

  • New Features

    • Default ingress NetworkPolicies are created for operators to restrict cross-namespace access (including allow-from-same-namespace, monitoring, ingress label groups, and cluster CIDR).
    • Storage operator entries added with restrictive ingress allowlists.
  • Configuration

    • New per-operator allowIngressFrom setting to declare permitted inbound namespaces via label selectors.
    • New global disableDefaultNetworkPolicies option to disable default operator policies; when true, per-operator allowlists are disallowed.

@github-actions github-actions Bot added deployment Deployment-related changes infrastructure Infrastructure setup (VMs, networks) validation Validation and testing operators Operator installation/config labels Mar 26, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds operator ingress allowlists and a new storage operators defaults file, extends operator/plugin schemas with disableDefaultNetworkPolicies and allowIngressFrom, updates LVMS plugin to disable default policies, and adds Ansible tasks to create namespace-scoped NetworkPolicy resources when default policies are enabled and operator namespace ≠ openshift-operators.

Changes

Cohort / File(s) Summary
Default operator configs
defaults/operators.yaml
Added allowIngressFrom entries to several operator defaults (quay-operator, multicluster-engine, advanced-cluster-management, cincinnati-operator, openshift-pipelines-operator-rh) specifying ingress sources via name and namespaceSelector.matchLabels.
Storage operator defaults (new file)
defaults/storage_operators.yaml
New file adding storage_operators definitions (odf, lvms) with operator metadata, mirror settings, csvNames, extraMirrorPackages (ODF) and allowIngressFrom selectors for storage operators.
Operator-specific NetworkPolicy tasks
operators/openshift-cert-manager-operator/tasks.yaml, operators/openshift-gitops-operator/tasks.yaml
Added conditional Ansible kubernetes.core.k8s tasks to create networking.k8s.io/v1 NetworkPolicy resources in cert-manager and openshift-gitops namespaces (policies: same-namespace, monitoring selector, ingress selector(s), and kube-apiserver ipBlock). GitOps removed a debug and now ensures policies.
Generic operator configure task
playbooks/tasks/configure_operator.yaml
Added conditional tasks (gated by operator.namespace != "openshift-operators" and operator.disableDefaultNetworkPolicies false) to upsert default ingress NetworkPolicy resources, including looped creation from operator.allowIngressFrom and retry/until checks.
Schema changes (operator/plugin)
schemas/operators.yaml, schemas/plugin.yaml
Added global.disableDefaultNetworkPolicies: boolean and global.allowIngressFrom: array (items require name and namespaceSelector.matchLabels, additionalProperties: false), plus conditional rules disallowing allowIngressFrom when disableDefaultNetworkPolicies is true.
Plugin configuration update
plugins/lvms/plugin.yaml
Set disableDefaultNetworkPolicies: true for the lvms-operator entry.

Sequence Diagram

sequenceDiagram
    participant Playbook
    participant ConfigureTask as Configure Operator Task
    participant K8sAPI as Kubernetes API
    participant OpNS as Operator Namespace

    Playbook->>ConfigureTask: invoke with operator config
    ConfigureTask->>ConfigureTask: check operator.namespace and disableDefaultNetworkPolicies

    alt create default policies
        ConfigureTask->>K8sAPI: Apply NetworkPolicy (same-namespace) to OpNS
        K8sAPI-->>OpNS: NetworkPolicy created
        ConfigureTask->>K8sAPI: Apply NetworkPolicy (monitoring selector) to OpNS
        K8sAPI-->>OpNS: NetworkPolicy created
        ConfigureTask->>K8sAPI: Apply NetworkPolicy (kube-apiserver ipBlock) to OpNS
        K8sAPI-->>OpNS: NetworkPolicy created
        loop for each operator.allowIngressFrom entry
            ConfigureTask->>K8sAPI: Apply NetworkPolicy (namespaceSelector matchLabels) to OpNS
            K8sAPI-->>OpNS: NetworkPolicy created
        end
    end

    ConfigureTask-->>Playbook: return results
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately summarizes the main change: adding NetworkPolicy support to operator installation. It clearly reflects the primary objective of enabling default NetworkPolicies and optional ingress controls for operators.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch MGMT-23668-network-policy

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
operators/openshift-cert-manager-operator/tasks.yaml (1)

2-55: Add retry logic for consistency and reliability.

These tasks lack the retries/delay/until pattern used in configure_operator.yaml. Since these tasks target the cert-manager namespace (which may not exist immediately when the operator is first deployed), transient failures are possible.

♻️ Proposed fix to add retry logic
 - name: "Ensure same-namespace NetworkPolicy exists in cert-manager"
   kubernetes.core.k8s:
     state: present
     definition:
       apiVersion: networking.k8s.io/v1
       kind: NetworkPolicy
       metadata:
         name: allow-from-same-namespace
         namespace: cert-manager
       spec:
         podSelector: {}
         ingress:
           - from:
               - podSelector: {}
         policyTypes:
           - Ingress
+  register: r_cert_manager_same_ns_netpol
+  retries: "{{ k8s_retries }}"
+  delay: "{{ k8s_delay }}"
+  until: r_cert_manager_same_ns_netpol is success

Apply the same pattern to the other two NetworkPolicy tasks.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@operators/openshift-cert-manager-operator/tasks.yaml` around lines 2 - 55,
The three NetworkPolicy tasks ("Ensure same-namespace NetworkPolicy exists in
cert-manager", "Ensure monitoring can reach cert-manager", and "Ensure
kube-apiserver can reach cert-manager webhook") need the same retry/backoff
pattern used in configure_operator.yaml to tolerate a missing cert-manager
namespace; update each kubernetes.core.k8s task (the three NetworkPolicy
definitions) to include retries, delay and an until condition that retries the
task until it succeeds (match the exact retries/delay/until values and success
check used in configure_operator.yaml) so transient namespace-not-found errors
are retried automatically.
operators/openshift-gitops-operator/tasks.yaml (1)

6-78: Add retry logic for consistency and reliability.

Similar to operators/openshift-cert-manager-operator/tasks.yaml, these tasks lack retry logic. The openshift-gitops namespace may not exist immediately when the operator tasks are first invoked.

♻️ Proposed fix to add retry logic (example for first task)
 - name: "Ensure same-namespace NetworkPolicy exists in openshift-gitops"
   kubernetes.core.k8s:
     state: present
     definition:
       apiVersion: networking.k8s.io/v1
       kind: NetworkPolicy
       metadata:
         name: allow-from-same-namespace
         namespace: openshift-gitops
       spec:
         podSelector: {}
         ingress:
           - from:
               - podSelector: {}
         policyTypes:
           - Ingress
+  register: r_gitops_same_ns_netpol
+  retries: "{{ k8s_retries }}"
+  delay: "{{ k8s_delay }}"
+  until: r_gitops_same_ns_netpol is success

Apply the same pattern to all four NetworkPolicy tasks.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@operators/openshift-gitops-operator/tasks.yaml` around lines 6 - 78, The four
NetworkPolicy tasks ("Ensure same-namespace NetworkPolicy exists in
openshift-gitops", "Ensure kube-apiserver can reach openshift-gitops webhooks",
"Ensure monitoring can reach openshift-gitops", "Ensure ingress can reach
openshift-gitops") should include retry logic like the cert-manager tasks: for
each kubernetes.core.k8s task register the result (e.g., np_result_<name>), add
retries and delay (for example retries: 10 delay: 6) and an until clause that
asserts the operation succeeded (e.g., until: not np_result_<name>.failed or
until: np_result_<name>.status is defined depending on the module output), so
the task will retry if the openshift-gitops namespace isn't present yet; apply
this pattern to all four NetworkPolicy tasks and use distinct register names for
each.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@schemas/operators.yaml`:
- Around line 44-46: The field disableDefaultNetworkPolicies currently claims it
opts out of only the "allow-from-same-namespace" policy; update its description
to accurately reflect that setting disables all three default policies
(same-namespace, kube-apiserver, and monitoring) as implemented in
configure_operator.yaml—replace the description text for
disableDefaultNetworkPolicies to explicitly list those three policies and the
resulting behavior.

---

Nitpick comments:
In `@operators/openshift-cert-manager-operator/tasks.yaml`:
- Around line 2-55: The three NetworkPolicy tasks ("Ensure same-namespace
NetworkPolicy exists in cert-manager", "Ensure monitoring can reach
cert-manager", and "Ensure kube-apiserver can reach cert-manager webhook") need
the same retry/backoff pattern used in configure_operator.yaml to tolerate a
missing cert-manager namespace; update each kubernetes.core.k8s task (the three
NetworkPolicy definitions) to include retries, delay and an until condition that
retries the task until it succeeds (match the exact retries/delay/until values
and success check used in configure_operator.yaml) so transient
namespace-not-found errors are retried automatically.

In `@operators/openshift-gitops-operator/tasks.yaml`:
- Around line 6-78: The four NetworkPolicy tasks ("Ensure same-namespace
NetworkPolicy exists in openshift-gitops", "Ensure kube-apiserver can reach
openshift-gitops webhooks", "Ensure monitoring can reach openshift-gitops",
"Ensure ingress can reach openshift-gitops") should include retry logic like the
cert-manager tasks: for each kubernetes.core.k8s task register the result (e.g.,
np_result_<name>), add retries and delay (for example retries: 10 delay: 6) and
an until clause that asserts the operation succeeded (e.g., until: not
np_result_<name>.failed or until: np_result_<name>.status is defined depending
on the module output), so the task will retry if the openshift-gitops namespace
isn't present yet; apply this pattern to all four NetworkPolicy tasks and use
distinct register names for each.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8999f0e6-d624-4478-8085-d23cf7f19f87

📥 Commits

Reviewing files that changed from the base of the PR and between 9c4e89e and 573d2b7.

📒 Files selected for processing (6)
  • defaults/operators.yaml
  • defaults/storage_operators.yaml
  • operators/openshift-cert-manager-operator/tasks.yaml
  • operators/openshift-gitops-operator/tasks.yaml
  • playbooks/tasks/configure_operator.yaml
  • schemas/operators.yaml

Comment thread schemas/operators.yaml Outdated
@rporres rporres marked this pull request as draft March 26, 2026 15:48
@rporres rporres force-pushed the MGMT-23668-network-policy branch from 407817b to 3e697e1 Compare March 26, 2026 20:13
@rporres rporres marked this pull request as ready for review March 26, 2026 20:15
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
operators/openshift-cert-manager-operator/tasks.yaml (3)

38-54: Add retry logic for consistency.

Same recommendation - add retry logic for reliability parity with the generic tasks.

♻️ Proposed fix
 - name: "Ensure kube-apiserver can reach cert-manager webhook"
   kubernetes.core.k8s:
     state: present
     definition:
       apiVersion: networking.k8s.io/v1
       kind: NetworkPolicy
       metadata:
         name: allow-ingress-from-kube-apiserver
         namespace: cert-manager
       spec:
         podSelector: {}
         ingress:
           - from:
               - ipBlock:
                   cidr: "{{ machineNetwork }}"
         policyTypes:
           - Ingress
+  register: r_cert_manager_apiserver_netpol
+  retries: "{{ k8s_retries }}"
+  delay: "{{ k8s_delay }}"
+  until: r_cert_manager_apiserver_netpol is success
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@operators/openshift-cert-manager-operator/tasks.yaml` around lines 38 - 54,
The NetworkPolicy creation task named "Ensure kube-apiserver can reach
cert-manager webhook" (kubernetes.core.k8s creating NetworkPolicy
allow-ingress-from-kube-apiserver) lacks retry handling and can fail
intermittently; update this task to add Ansible retry logic (use until with a
success condition, retries and delay) around the kubernetes.core.k8s call
referencing the existing cidr "{{ machineNetwork }}" so the module will be
retried until the resource is present/created before proceeding. Ensure the
until condition checks the task result (e.g., changed or result success), and
set sensible retries and delay values consistent with the project's other
generic tasks for reliability parity.

19-36: Add retry logic for consistency.

Same recommendation as above - add register, retries, delay, and until for reliability.

♻️ Proposed fix
 - name: "Ensure monitoring can reach cert-manager"
   kubernetes.core.k8s:
     state: present
     definition:
       apiVersion: networking.k8s.io/v1
       kind: NetworkPolicy
       metadata:
         name: allow-ingress-from-monitoring
         namespace: cert-manager
       spec:
         podSelector: {}
         ingress:
           - from:
               - namespaceSelector:
                   matchLabels:
                     network.openshift.io/policy-group: monitoring
         policyTypes:
           - Ingress
+  register: r_cert_manager_monitoring_netpol
+  retries: "{{ k8s_retries }}"
+  delay: "{{ k8s_delay }}"
+  until: r_cert_manager_monitoring_netpol is success
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@operators/openshift-cert-manager-operator/tasks.yaml` around lines 19 - 36,
The NetworkPolicy task "Ensure monitoring can reach cert-manager" using the
kubernetes.core.k8s module needs retry logic for consistency: add a register
(e.g., register: np_result), retries (e.g., retries: 5), delay (e.g., delay: 3)
and an until that checks the task result (e.g., until: np_result is succeeded or
np_result.failed == false) to the task that defines metadata.name:
allow-ingress-from-monitoring so the play will retry creating the NetworkPolicy
until it succeeds.

2-17: Add retry logic for consistency and reliability.

This task lacks the retry logic present in the generic NetworkPolicy tasks in playbooks/tasks/configure_operator.yaml. This could cause transient failures if the Kubernetes API is temporarily unavailable.

♻️ Proposed fix to add retry logic
 - name: "Ensure same-namespace NetworkPolicy exists in cert-manager"
   kubernetes.core.k8s:
     state: present
     definition:
       apiVersion: networking.k8s.io/v1
       kind: NetworkPolicy
       metadata:
         name: allow-from-same-namespace
         namespace: cert-manager
       spec:
         podSelector: {}
         ingress:
           - from:
               - podSelector: {}
         policyTypes:
           - Ingress
+  register: r_cert_manager_same_ns_netpol
+  retries: "{{ k8s_retries }}"
+  delay: "{{ k8s_delay }}"
+  until: r_cert_manager_same_ns_netpol is success
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@operators/openshift-cert-manager-operator/tasks.yaml` around lines 2 - 17,
The NetworkPolicy task "Ensure same-namespace NetworkPolicy exists in
cert-manager" (kubernetes.core.k8s) needs the same retry pattern as the generic
NetworkPolicy tasks: register the module result (e.g., register: netpol_result)
and add retries/delay/until task attributes (for example retries: 5, delay: 3,
until: netpol_result is succeeded or (netpol_result is defined and
netpol_result.failed is not defined/false)) so transient Kubernetes API errors
are retried and the task only fails after repeated attempts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@operators/openshift-cert-manager-operator/tasks.yaml`:
- Around line 38-54: The NetworkPolicy creation task named "Ensure
kube-apiserver can reach cert-manager webhook" (kubernetes.core.k8s creating
NetworkPolicy allow-ingress-from-kube-apiserver) lacks retry handling and can
fail intermittently; update this task to add Ansible retry logic (use until with
a success condition, retries and delay) around the kubernetes.core.k8s call
referencing the existing cidr "{{ machineNetwork }}" so the module will be
retried until the resource is present/created before proceeding. Ensure the
until condition checks the task result (e.g., changed or result success), and
set sensible retries and delay values consistent with the project's other
generic tasks for reliability parity.
- Around line 19-36: The NetworkPolicy task "Ensure monitoring can reach
cert-manager" using the kubernetes.core.k8s module needs retry logic for
consistency: add a register (e.g., register: np_result), retries (e.g., retries:
5), delay (e.g., delay: 3) and an until that checks the task result (e.g.,
until: np_result is succeeded or np_result.failed == false) to the task that
defines metadata.name: allow-ingress-from-monitoring so the play will retry
creating the NetworkPolicy until it succeeds.
- Around line 2-17: The NetworkPolicy task "Ensure same-namespace NetworkPolicy
exists in cert-manager" (kubernetes.core.k8s) needs the same retry pattern as
the generic NetworkPolicy tasks: register the module result (e.g., register:
netpol_result) and add retries/delay/until task attributes (for example retries:
5, delay: 3, until: netpol_result is succeeded or (netpol_result is defined and
netpol_result.failed is not defined/false)) so transient Kubernetes API errors
are retried and the task only fails after repeated attempts.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8c0865d6-3eca-4d0f-b3bc-c8776160d320

📥 Commits

Reviewing files that changed from the base of the PR and between 573d2b7 and 3e697e1.

📒 Files selected for processing (6)
  • defaults/operators.yaml
  • defaults/storage_operators.yaml
  • operators/openshift-cert-manager-operator/tasks.yaml
  • operators/openshift-gitops-operator/tasks.yaml
  • playbooks/tasks/configure_operator.yaml
  • schemas/operators.yaml
✅ Files skipped from review due to trivial changes (1)
  • defaults/storage_operators.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • defaults/operators.yaml
  • operators/openshift-gitops-operator/tasks.yaml
  • schemas/operators.yaml

@rporres rporres force-pushed the MGMT-23668-network-policy branch 2 times, most recently from 12b632b to 36b316d Compare March 29, 2026 17:58
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@operators/openshift-cert-manager-operator/tasks.yaml`:
- Around line 2-54: The three NetworkPolicy tasks (metadata names
allow-from-same-namespace, allow-ingress-from-monitoring,
allow-ingress-from-kube-apiserver) currently create policies unconditionally;
update each task to respect the operator opt-out by adding a conditional that
skips these tasks when disableDefaultNetworkPolicies is true (e.g., use a when
or conditional that checks the cert-manager config flag like
disableDefaultNetworkPolicies | default(false) and only run the tasks when that
flag is false). Ensure the condition references the same variable used by the
operator config so the policies are not created when
disableDefaultNetworkPolicies is enabled.

In `@operators/openshift-gitops-operator/tasks.yaml`:
- Around line 6-77: The four NetworkPolicy tasks (named "Ensure same-namespace
NetworkPolicy exists in openshift-gitops", "Ensure kube-apiserver can reach
openshift-gitops", "Ensure monitoring can reach openshift-gitops", and "Ensure
ingress can reach openshift-gitops" which create NetworkPolicy resources named
allow-from-same-namespace, allow-ingress-from-kube-apiserver,
allow-ingress-from-monitoring, and allow-ingress-from-ingress) must be skipped
when the operator flag disableDefaultNetworkPolicies is true; add an Ansible
conditional (when: not disableDefaultNetworkPolicies or equivalent) to each of
these task definitions so they only run when default network policies are
enabled. Ensure the condition references the same variable name used elsewhere
in the operator playbook/role to avoid mismatches.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b8fc78e9-3873-4efd-b368-c2a72b26cc79

📥 Commits

Reviewing files that changed from the base of the PR and between 3e697e1 and 36b316d.

📒 Files selected for processing (6)
  • defaults/operators.yaml
  • defaults/storage_operators.yaml
  • operators/openshift-cert-manager-operator/tasks.yaml
  • operators/openshift-gitops-operator/tasks.yaml
  • playbooks/tasks/configure_operator.yaml
  • schemas/operators.yaml
✅ Files skipped from review due to trivial changes (1)
  • playbooks/tasks/configure_operator.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • schemas/operators.yaml
  • defaults/storage_operators.yaml

Comment thread operators/openshift-cert-manager-operator/tasks.yaml
Comment thread operators/openshift-gitops-operator/tasks.yaml
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
operators/openshift-cert-manager-operator/tasks.yaml (1)

2-57: Consider consolidating default NetworkPolicy definitions to one source

These three manifests mirror the generic defaults in playbooks/tasks/configure_operator.yaml (same policy names/spec shape). If both flows execute for cert-manager, this creates duplicated management paths and future drift risk. Consider centralizing these defaults in one reusable task and only parameterizing the target namespace(s).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@operators/openshift-cert-manager-operator/tasks.yaml` around lines 2 - 57,
The three NetworkPolicy tasks ("Ensure same-namespace NetworkPolicy exists in
cert-manager", "Ensure monitoring can reach cert-manager", "Ensure
kube-apiserver can reach cert-manager webhook") duplicate generic defaults
already defined in playbooks/tasks/configure_operator.yaml; consolidate by
extracting the NetworkPolicy definitions into a single reusable task or role
(e.g., a parameterized task file or include_role) that accepts a target
namespace variable, then replace these three tasks with calls to that reusable
task passing namespace: cert-manager (and machineNetwork/monitoring selectors as
parameters) so the same canonical policy definitions are managed from one
source.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@operators/openshift-cert-manager-operator/tasks.yaml`:
- Around line 2-57: The three NetworkPolicy tasks ("Ensure same-namespace
NetworkPolicy exists in cert-manager", "Ensure monitoring can reach
cert-manager", "Ensure kube-apiserver can reach cert-manager webhook") duplicate
generic defaults already defined in playbooks/tasks/configure_operator.yaml;
consolidate by extracting the NetworkPolicy definitions into a single reusable
task or role (e.g., a parameterized task file or include_role) that accepts a
target namespace variable, then replace these three tasks with calls to that
reusable task passing namespace: cert-manager (and machineNetwork/monitoring
selectors as parameters) so the same canonical policy definitions are managed
from one source.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6448448e-8cee-405a-9e95-a974fd69da5f

📥 Commits

Reviewing files that changed from the base of the PR and between 36b316d and 22afe1c.

📒 Files selected for processing (3)
  • defaults/storage_operators.yaml
  • operators/openshift-cert-manager-operator/tasks.yaml
  • operators/openshift-gitops-operator/tasks.yaml
✅ Files skipped from review due to trivial changes (2)
  • defaults/storage_operators.yaml
  • operators/openshift-gitops-operator/tasks.yaml

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@schemas/plugin.yaml`:
- Around line 61-72: Tighten validation for the name and
namespaceSelector.matchLabels schema: add DNS-1123 constraints to the name field
by adding a pattern that enforces DNS-1123 subdomain syntax and a maxLength of
253, and strengthen namespaceSelector.matchLabels by requiring minProperties: 1
and restricting additionalProperties values to strings (additionalProperties: {
type: "string" }); apply these exact changes to the same name /
namespaceSelector.matchLabels schema block in the operators schema as well so
both schemas validate like Kubernetes NetworkPolicy metadata and label
selectors.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1cc463f2-f817-4430-8c8f-e2ec6cc639b6

📥 Commits

Reviewing files that changed from the base of the PR and between 22afe1c and 56f4a5f.

📒 Files selected for processing (3)
  • defaults/storage_operators.yaml
  • plugins/lvms/plugin.yaml
  • schemas/plugin.yaml
✅ Files skipped from review due to trivial changes (2)
  • plugins/lvms/plugin.yaml
  • defaults/storage_operators.yaml

Comment thread schemas/plugin.yaml
@rporres rporres force-pushed the MGMT-23668-network-policy branch from 1a80fb6 to 8440175 Compare April 6, 2026 08:47
rporres added 13 commits April 7, 2026 10:49
Apply three default NetworkPolicies to every operator namespace
(except openshift-operators): allow intra-namespace traffic, allow
ingress from cluster nodes for admission webhook access, and allow
ingress from the OpenShift monitoring stack.

Operators can opt out of all default policies via the new
disableDefaultNetworkPolicies flag, and opt in to additional
cross-namespace ingress policies via allowIngressFrom (a list of
name/namespaceSelector pairs). The schema forbids combining
allowIngressFrom with disableDefaultNetworkPolicies: true, since
disabling defaults while selectively re-allowing traffic is
semantically inconsistent.

Operators with exposed routes (quay-operator, ACM, cincinnati,
pipelines, ODF, LVMS) are configured with allowIngressFrom for the
OpenShift ingress namespace. MCE and ACM are configured with
bidirectional allowIngressFrom to preserve east-west traffic between
the two operators that collaborate at runtime.

cert-manager and openshift-gitops require policies in secondary
namespaces (cert-manager and openshift-gitops) that are created by
the operators themselves rather than by the generic installation
task; these are handled in each operator's own tasks.yaml.

The kube-apiserver NetworkPolicy uses ipBlock with machineNetwork
rather than a namespaceSelector for openshift-kube-apiserver.
kube-apiserver pods run with hostNetwork: true in OpenShift, meaning
their traffic originates from node IPs rather than pod-network
addresses. A namespaceSelector only matches pod-network traffic, so
it silently failed to allow webhook calls, leaving the implicit deny
in place and causing context deadline exceeded errors when creating
resources with admission webhooks (observed with the OCM webhook at
ocm-webhook.multicluster-engine.svc:443). Using ipBlock with
machineNetwork correctly covers all cluster nodes including the
control plane.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
… failure

The LVMCluster admission webhook (vlvmcluster.kb.io) is timing out with
context deadline exceeded. Disabling the default NetworkPolicies for the
lvms-operator to confirm the kube-apiserver ipBlock policy is what is
blocking webhook traffic in the CI environment.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
The three NetworkPolicy tasks in the cert-manager operator tasks.yaml
were created unconditionally. Add the same disableDefaultNetworkPolicies
guard used by configure_operator.yaml so they are skipped when the
operator opts out of default network policies.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
The four NetworkPolicy tasks in the openshift-gitops operator tasks.yaml
were created unconditionally. Add the same disableDefaultNetworkPolicies
guard used by configure_operator.yaml so they are skipped when the
operator opts out of default network policies.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
The previous diagnostic commit set disableDefaultNetworkPolicies in
defaults/storage_operators.yaml, which is dead code - configure_operators.yaml
no longer loops over storage_operators after PR #170 moved LVMS into a plugin.
The operator is now defined in plugins/lvms/plugin.yaml and processed by
deploy_plugin.yaml via configure_operator.yaml.

Move the diagnostic flag to the correct location so NetworkPolicies are
actually skipped for openshift-storage during this test run.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
The plugin.yaml schema operator definition was missing
disableDefaultNetworkPolicies and allowIngressFrom, causing schema
validation to reject plugin operator entries that use these fields.
Mirror the same definitions and if/then constraint from schemas/operators.yaml.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
This file was removed from main by commit 18ca91c (PR #170) which
extracted LVMS and ODF into plugins. It was left in this branch as
a rebase artifact and would have been re-introduced on merge.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
NooBaa management console and Rook-Ceph dashboard are exposed via
Routes through the OCP ingress controller. Without allowIngressFrom,
the default NetworkPolicies would block ingress traffic to
openshift-storage.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
Add DNS-1123 pattern and maxLength: 253 to the name field so values
used as NetworkPolicy name suffixes are valid Kubernetes resource
names. Require minProperties: 1 on matchLabels and restrict values to
strings to match Kubernetes label selector semantics.

Applied to both schemas/plugin.yaml and schemas/operators.yaml.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
Both multicluster-engine and netobserv-operator expose console plugins.
The OCP console backend (in openshift-console) proxies plugin requests
directly to the plugin service via the pod network, bypassing the ingress
controller. Without an allowIngressFrom for openshift-console, the default
NetworkPolicies block this traffic and the console plugin pages are broken.

Add allowIngressFrom for kubernetes.io/metadata.name: openshift-console
to both operators.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
Add a Local Development Environment section noting that ansible and all
required tools are available via `source .venv/bin/activate`. This must
be activated before running validation or deployment commands locally.

Also remove the stale reference to defaults/storage_operators.yaml,
which was deleted in commit 7426ee6.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
@rporres rporres force-pushed the MGMT-23668-network-policy branch from a524ea6 to d110abc Compare April 7, 2026 09:09
@rporres rporres closed this Apr 22, 2026
@rporres rporres deleted the MGMT-23668-network-policy branch April 22, 2026 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deployment Deployment-related changes infrastructure Infrastructure setup (VMs, networks) operators Operator installation/config plugins validation Validation and testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant