Commit 6fae98f

TELCODOCS-1874: Adding hub cluster RDS - Tech Preview
1 parent 10220ca commit 6fae98f

79 files changed

+1377
-0
lines changed

_topic_maps/_topic_map.yml

+2
Original file line numberDiff line numberDiff line change
@@ -3332,6 +3332,8 @@ Topics:
33323332
File: telco-core-rds
33333333
- Name: Telco RAN DU reference design specifications
33343334
File: telco-ran-du-rds
3335+
- Name: Telco hub reference design specifications
3336+
File: telco-hub-rds
33353337
- Name: Comparing cluster configurations
33363338
Dir: cluster-compare
33373339
Distros: openshift-origin,openshift-enterprise
+85
@@ -0,0 +1,85 @@
1+
[id="telco-hub-acm-observability"]
2+
= {rh-rhacm} Observability
3+
4+
Cluster observability is provided by the multicluster engine and {rh-rhacm}.
5+
6+
* Observability storage needs several `PV` resources and S3-compatible bucket storage for long-term retention of the metrics.
7+
* Storage requirements calculation is complex and dependent on the specific workloads and characteristics of managed clusters.
8+
Requirements for `PV` resources and the S3 bucket depend on many aspects including data retention, the number of managed clusters, managed cluster workloads, and so on.
9+
* Estimate the required storage for observability by using the observability sizing calculator in the {rh-rhacm} capacity planning repository.
10+
See the Red Hat Knowledgebase article link:https://access.redhat.com/articles/7103886[Calculating storage need for MultiClusterHub Observability on telco environments] for an explanation of using the calculator to estimate observability storage requirements.
11+
The following table uses inputs derived from the telco RAN DU RDS and the hub cluster RDS as representative values.
12+
13+
[NOTE]
14+
====
15+
The following numbers are estimated.
16+
Tune the values for more accurate results.
17+
Add an engineering margin, for example +20%, to the results to account for potential estimation inaccuracies.
18+
====
19+
20+
.Cluster requirements
21+
[cols="42%,42%,16%",options="header"]
22+
|====
23+
|Capacity planner input
24+
|Data source
25+
|Example value
26+
27+
|Number of control plane nodes
28+
|Hub cluster RDS (scale) and telco RAN DU RDS (topology)
29+
|3500
30+
31+
|Number of additional worker nodes
32+
|Hub cluster RDS (scale) and telco RAN DU RDS (topology)
33+
|0
34+
35+
|Days for storage of data
36+
|Hub cluster RDS
37+
|15
38+
39+
|Total Number of pods per cluster
40+
|Telco RAN DU RDS
41+
|120
42+
43+
|Number of namespaces (excl OCP)
44+
|Telco RAN DU RDS
45+
|4
46+
47+
|Number of metric samples per hour
48+
|Default value
49+
|12
50+
51+
|Number of hours of retention in Receiver PV
52+
|Default value
53+
|24
54+
|====
55+
56+
With these input values, the sizing calculator as described in the Red Hat Knowledgebase article link:https://access.redhat.com/articles/7103886[Calculating storage need for MultiClusterHub Observability on telco environments] indicates the following storage needs:
57+
58+
.Storage requirements
59+
[options="header"]
60+
|====
61+
2+|alertmanager PV 2+|thanos-receive PV 2+|thanos-compactor PV
62+
63+
|*Per replica* |*Total* |*Per replica* |*Total* 2+|*Total*
64+
65+
|10 GiB |30 GiB |10 GiB |30 GiB 2+|100 GiB
66+
|====
67+
68+
.Storage requirements
69+
[options="header"]
70+
|====
71+
2+|thanos-rule PV 2+|thanos-store PV 2+|Object bucket^[1]^
72+
73+
|*Per replica* |*Total* |*Per replica* |*Total* |*Per day* |*Total*
74+
75+
|30 GiB |90 GiB |100 GiB |300 GiB |15 GiB |101 GiB
76+
|====
77+
[1] For the object bucket, downsampling is assumed to be disabled, so storage is calculated for raw data only.
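
The engineering margin recommended in the note can be applied to the table totals with a few lines of arithmetic. The following is a minimal sketch, not part of the sizing calculator: the dictionary simply copies the "Total" columns from the two storage tables (in the tables' own units), and the flat 20% margin is the example value from the note. The names `PV_TOTALS`, `OBJECT_BUCKET_TOTAL`, and `with_margin` are hypothetical.

```python
# "Total" values copied from the two "Storage requirements" tables,
# expressed in the same units that the tables use.
PV_TOTALS = {
    "alertmanager": 30,
    "thanos-receive": 30,
    "thanos-compactor": 100,
    "thanos-rule": 90,
    "thanos-store": 300,
}
OBJECT_BUCKET_TOTAL = 101

# Assumption: a flat 20% engineering margin, the example given in the note.
MARGIN = 0.20


def with_margin(size: float, margin: float = MARGIN) -> float:
    """Return size increased by the fractional margin, rounded to one decimal."""
    return round(size * (1 + margin), 1)


pv_total = sum(PV_TOTALS.values())
print(f"PV total: {pv_total}, with margin: {with_margin(pv_total)}")
print(f"Object bucket: {OBJECT_BUCKET_TOTAL}, with margin: {with_margin(OBJECT_BUCKET_TOTAL)}")
```

Treat the output as a starting point only. Rerun the sizing calculator with your own inputs before you provision storage.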
78+
79+
[role="_additional-resources"]
80+
.Additional resources
81+
82+
* link:https://github.com/stolostron/capacity-planning/blob/main/calculation/ObsSizingTemplate-Rev1.ipynb[Observability sizing calculator]
83+
* link:https://github.com/stolostron/capacity-planning[{rh-rhacm} capacity planning repository]
84+
85+
@@ -0,0 +1,7 @@
1+
[id="telco-hub-acmAgentServiceConfig-yaml"]
2+
.acmAgentServiceConfig.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/acm/acmAgentServiceConfig.yaml[role=include]
6+
----
7+

modules/telco-hub-acmMCH-yaml.adoc

+7
@@ -0,0 +1,7 @@
1+
[id="telco-hub-acmMCH-yaml"]
2+
.acmMCH.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/acm/acmMCH.yaml[role=include]
6+
----
7+
@@ -0,0 +1,7 @@
1+
[id="telco-hub-acmMirrorRegistryCM-yaml"]
2+
.acmMirrorRegistryCM.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/acm/acmMirrorRegistryCM.yaml[role=include]
6+
----
7+

modules/telco-hub-acmNS-yaml.adoc

+7
@@ -0,0 +1,7 @@
1+
[id="telco-hub-acmNS-yaml"]
2+
.acmNS.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/acm/acmNS.yaml[role=include]
6+
----
7+
+7
@@ -0,0 +1,7 @@
1+
[id="telco-hub-acmOperGroup-yaml"]
2+
.acmOperGroup.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/acm/acmOperGroup.yaml[role=include]
6+
----
7+
@@ -0,0 +1,7 @@
1+
[id="telco-hub-acmPerfSearch-yaml"]
2+
.acmPerfSearch.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/acm/acmPerfSearch.yaml[role=include]
6+
----
7+
@@ -0,0 +1,7 @@
1+
[id="telco-hub-acmProvisioning-yaml"]
2+
.acmProvisioning.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/acm/acmProvisioning.yaml[role=include]
6+
----
7+
@@ -0,0 +1,7 @@
1+
[id="telco-hub-acmSubscription-yaml"]
2+
.acmSubscription.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/acm/acmSubscription.yaml[role=include]
6+
----
7+
+7
@@ -0,0 +1,7 @@
1+
[id="telco-hub-agent-config-yaml"]
2+
.agent-config.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/install/openshift/agent-config.yaml[role=include]
6+
----
7+
+7
@@ -0,0 +1,7 @@
1+
[id="telco-hub-app-project-yaml"]
2+
.app-project.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/gitops/ztp-installation/app-project.yaml[role=include]
6+
----
7+
@@ -0,0 +1,33 @@
1+
[id="telco-hub-architecture-overview"]
2+
= Hub cluster architecture overview
3+
4+
Use the features and components running on the management hub cluster to manage many other clusters in a hub-and-spoke topology.
5+
The hub cluster provides a highly-available and centralized interface for managing the configuration, lifecycle, and observability of the fleet of deployed clusters.
6+
7+
[NOTE]
8+
====
9+
All management hub functionality can be deployed on a dedicated {product-title} cluster or as applications that are co-resident on an existing cluster.
10+
====
11+
12+
Managed cluster lifecycle::
13+
Using a combination of Day 2 Operators, the hub cluster provides the necessary infrastructure to deploy and configure the fleet of clusters by using a GitOps methodology.
14+
Over the lifetime of the deployed clusters, further management of upgrades, scaling the number of clusters, node replacement, and other lifecycle management functions can be declaratively defined and rolled out.
15+
You can control the timing and progression of the rollout across the fleet.
16+
17+
Monitoring::
18+
+
19+
--
20+
The hub cluster provides monitoring and status reporting for the managed clusters through the Observability pillar of the {rh-rhacm} Operator.
21+
This includes aggregated metrics, alerts, and compliance monitoring through the Governance policy framework.
22+
--
23+
24+
The telco management hub reference design specifications (RDS) and the associated reference CRs describe the method, validated by telco engineering and QE, for deploying, configuring, and managing the lifecycle of telco managed cluster infrastructure.
25+
The reference configuration includes the installation and configuration of the hub cluster components on top of {product-title}.
26+
27+
28+
.Hub cluster reference design components
29+
image::telco-hub-cluster-reference-design-components.png[]
30+
31+
.Hub cluster reference design architecture
32+
image::telco-hub-cluster-rds-architecture.png[]
33+
@@ -0,0 +1,7 @@
1+
[id="telco-hub-argocd-openshift-gitops-patch-yaml"]
2+
.argocd-openshift-gitops-patch.json
3+
[source,json]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/gitops/ztp-installation/argocd-openshift-gitops-patch.json[role=include]
6+
----
7+
@@ -0,0 +1,7 @@
1+
[id="telco-hub-argocd-ssh-known-hosts-cm-yaml"]
2+
.argocd-ssh-known-hosts-cm.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/gitops/argocd-ssh-known-hosts-cm.yaml[role=include]
6+
----
7+
+27
@@ -0,0 +1,27 @@
1+
[id="telco-hub-assisted-service"]
2+
= Assisted Service
3+
4+
The Assisted Service is deployed with the multicluster engine and {rh-rhacm}.
5+
6+
.Assisted Service storage requirements
7+
[cols="1,2", options="header"]
8+
|====
9+
|Persistent volume resource
10+
|Size (GB)
11+
12+
|`imageStorage`
13+
|50
14+
15+
|`filesystemStorage`
16+
|700
17+
18+
|`databaseStorage`
19+
|20
20+
|====
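
As an illustration of how the sizes in this table map onto storage requests, the following sketch builds a storage stanza for each persistent volume and sums the total capacity. It is an assumption-laden example, not the reference CR: the nested `accessModes` and `resources.requests.storage` layout, the `ReadWriteOnce` access mode, and the use of the decimal `G` suffix are all assumptions, and the `storage_spec` helper is hypothetical. Check the result against `acmAgentServiceConfig.yaml` before use.

```python
import json

# Sizes in GB, copied from the "Assisted Service storage requirements" table.
STORAGE_GB = {
    "imageStorage": 50,
    "filesystemStorage": 700,
    "databaseStorage": 20,
}


def storage_spec(sizes_gb: dict) -> dict:
    """Build one storage request stanza per persistent volume.

    Assumption: each stanza uses the usual Kubernetes PVC layout
    (accessModes plus resources.requests.storage).
    """
    return {
        name: {
            "accessModes": ["ReadWriteOnce"],  # assumed access mode
            "resources": {"requests": {"storage": f"{size}G"}},
        }
        for name, size in sizes_gb.items()
    }


spec = storage_spec(STORAGE_GB)
total_gb = sum(STORAGE_GB.values())
print(json.dumps(spec, indent=2))
print(f"Total persistent storage: {total_gb} GB")
```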
21+
22+
23+
[role="_additional-resources"]
24+
.Additional resources
25+
26+
* link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12/html/clusters/cluster_mce_overview#enable-cim-disconnected[Enabling central infrastructure management in disconnected environments]
27+
+23
@@ -0,0 +1,23 @@
1+
[id="telco-hub-cluster-topology"]
2+
= Cluster topology
3+
4+
In production settings, the {product-title} hub cluster must be highly available so that the management functions remain available.
5+
6+
Limits and requirements::
7+
Use a highly available cluster topology for the hub cluster, for example:
8+
* Compact (3 nodes combined control plane and compute nodes)
9+
* Standard (3 control plane nodes + N compute nodes)
10+
11+
Engineering considerations::
12+
* In non-production settings, a {sno} cluster can be used for limited hub cluster functionality.
13+
* Certain capabilities, for example {rh-storage}, are not supported on {sno}.
14+
In this configuration some hub cluster features might not be available.
15+
* The number of optional compute nodes can vary depending on the scale of the specific use case.
16+
* Compute nodes can be added later as required.
17+
18+
[role="_additional-resources"]
19+
.Additional resources
20+
21+
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/architecture/architecture[{product-title} architecture]
22+
* link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html/postinstallation_configuration/post-install-node-tasks[Postinstallation node tasks]
23+
+7
@@ -0,0 +1,7 @@
1+
[id="telco-hub-clusterLogNS-yaml"]
2+
.clusterLogNS.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/optional/logging/clusterLogNS.yaml[role=include]
6+
----
7+
@@ -0,0 +1,7 @@
1+
[id="telco-hub-clusterLogOperGroup-yaml"]
2+
.clusterLogOperGroup.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/optional/logging/clusterLogOperGroup.yaml[role=include]
6+
----
7+
@@ -0,0 +1,8 @@
1+
[id="telco-hub-clusterLogSubscription-yaml"]
2+
.clusterLogSubscription.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/optional/logging/clusterLogSubscription.yaml[role=include]
6+
----
7+
8+
+7
@@ -0,0 +1,7 @@
1+
[id="telco-hub-clusters-app-yaml"]
2+
.clusters-app.yaml
3+
[source,yaml]
4+
----
5+
link:https://raw.githubusercontent.com/openshift-kni/telco-reference/release-4.19/telco-hub/configuration/reference-crs/required/gitops/ztp-installation/clusters-app.yaml[role=include]
6+
----
7+
@@ -0,0 +1,59 @@
1+
[id="telco-hub-engineering-considerations"]
2+
= Hub cluster engineering considerations
3+
4+
The following sections describe the engineering considerations for hub cluster resource scaling targets and utilization.
5+
6+
Reference configuration scaling target::
7+
+
8+
--
9+
The resource requirements for the hub cluster are directly dependent on the number of clusters being managed by the hub, the number of policies used for each managed cluster, and the set of features that are configured in {rh-rhacm}.
10+
11+
The hub cluster reference configuration can support up to 3500 managed {sno} clusters under the following conditions:
12+
13+
* 5 policies for each cluster, with hub-side templating, configured with a 10-minute evaluation interval.
14+
15+
* Only the following {rh-rhacm} add-ons are enabled:
16+
17+
** Policy controller
18+
** Observability with the default configuration
19+
20+
* You deploy managed clusters by using {ztp} in batches of up to 500 clusters at a time.
21+
22+
The reference configuration is also validated for deployment and management of a mix of managed cluster topologies.
23+
The specific limits depend on the mix of cluster topologies, enabled {rh-rhacm} features, and so on.
24+
In a mixed topology scenario, the reference hub configuration is validated with a combination of 1200 {sno} clusters, 400 compact clusters (3 nodes combined control plane and compute nodes), and 230 standard clusters (3 control plane and 2 worker nodes).
25+
26+
[NOTE]
27+
====
28+
Specific dimensioning requirements are highly dependent on the cluster topology and workload.
29+
See "Storage requirements" for details.
30+
Adjust cluster dimensions for the specific characteristics of your fleet of managed clusters.
31+
====
32+
--
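
The scale figures quoted above can be turned into two quick checks: the minimum number of {ztp} deployment batches for the {sno} scenario, and the cluster and node counts in the mixed-topology scenario. This is a minimal sketch using only numbers from the text. The helper names are hypothetical, and the node count is an illustrative figure, not a validated sizing metric.

```python
import math

# Figures quoted in the reference configuration scaling target.
SNO_ONLY_CLUSTERS = 3500
MAX_BATCH_SIZE = 500

# Mixed-topology scenario: name -> (cluster count, nodes per cluster).
MIXED_FLEET = {
    "single-node": (1200, 1),
    "compact": (400, 3),
    "standard": (230, 3 + 2),  # 3 control plane nodes and 2 worker nodes
}


def min_batches(clusters: int, batch_size: int) -> int:
    """Smallest number of batches needed to deploy all clusters."""
    return math.ceil(clusters / batch_size)


def total_clusters(fleet: dict) -> int:
    """Number of managed clusters in the fleet."""
    return sum(count for count, _ in fleet.values())


def total_nodes(fleet: dict) -> int:
    """Number of nodes across all managed clusters in the fleet."""
    return sum(count * nodes for count, nodes in fleet.values())


print(min_batches(SNO_ONLY_CLUSTERS, MAX_BATCH_SIZE))
print(total_clusters(MIXED_FLEET), total_nodes(MIXED_FLEET))
```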
33+
34+
Resource utilization::
35+
+
36+
--
37+
Resource utilization was measured for hub clusters in the following scenario:
38+
39+
* Under reference load managing 3500 {sno} clusters
40+
* 3-node compact cluster for the management hub, running on dual-socket bare-metal servers
41+
* Network impairment of 50 ms round-trip latency, 100 Mbps bandwidth limit, and 0.02% packet loss
42+
43+
.Resource utilization values
44+
[options="header"]
45+
|====
46+
|Metric |Peak Measurement
47+
|OpenShift Platform CPU |106 cores (52 cores per node)
48+
|OpenShift Platform memory |504G (168G per node)
49+
|Persistent storage |<pending data from scale test>
50+
|====
51+
--
52+
53+
54+
[role="_additional-resources"]
55+
.Additional resources
56+
57+
* link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12/html-single/governance/index#template-comparison-table[Comparison of hub cluster and managed cluster templates]
58+
59+

modules/telco-hub-git-repository.adoc

+26
@@ -0,0 +1,26 @@
1+
[id="telco-hub-git-repository"]
2+
= Git repository
3+
4+
The telco management hub cluster supports a GitOps-driven methodology for installing and managing the configuration of OpenShift clusters for various telco applications.
5+
This methodology requires an accessible Git repository that serves as the authoritative source of truth for cluster definitions and configuration artifacts.
6+
7+
Red Hat does not offer a commercially supported Git server.
8+
An existing Git server provided in the production environment can be used.
9+
Gitea and Gogs are examples of self-hosted Git servers that you can use.
10+
11+
The Git repository is typically provided in the production network external to the hub cluster.
12+
In a large-scale deployment, multiple hub clusters can use the same Git repository for maintaining the definitions of managed clusters. Using this approach, you can easily review the state of the complete network.
13+
As the source of truth for cluster definitions, the Git repository should be highly available and recoverable in disaster scenarios.
14+
15+
[NOTE]
16+
====
17+
For disaster recovery and multi-hub considerations, run the Git repository separately from the hub cluster.
18+
====
19+
20+
Limits and requirements::
21+
* A Git repository is required to support the {ztp} functions of the hub cluster, including installation, configuration, and lifecycle management of the managed clusters.
22+
* The Git repository must be accessible from the management cluster.
23+
24+
Engineering considerations::
25+
* The Git repository is used by the GitOps Operator to ensure continuous deployment and a single source of truth for the applied configuration.
26+
