diff --git a/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/_index.md b/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/_index.md index f818d59e1a80..fa4ee62e9e45 100644 --- a/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/_index.md +++ b/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/_index.md @@ -42,6 +42,12 @@ For information on configuring alerts, health checks, and diagnostics reporting href="high-availability/" icon="fa-thin fa-clone">}} + {{}} + {{ High Availability > Replication Configuration** and click **Make Active**. + +1. Select the backup from which you want to restore (in most cases, you should choose the most recent backup) and enable **Confirm promotion**. + + {{< warning title="Do not promote an old backup" >}} + +If you recently upgraded the active instance, ensure the backup you select is up to date. See [Promotion and old backups](#promotion-and-old-backups). + + {{< /warning >}} + +1. If you are performing a failover, where the previous active instance is unavailable or unreachable during promotion, select the **Force promotion** option. + + This will promote the standby without demoting the active. + +1. Click **Continue**. The restore takes a few seconds, after which expect to be signed out. + +1. Sign in using the credentials that you had configured on the previously active instance. If you are performing failover, you must sign in using your Super Admin account. + + You should be able to see that all of the data has been restored into the instance, including universes, users, metrics, alerts, task history, provider configurations, and so on. + +1. In the case of failover, follow the steps in [Failover](#failover) to ensure that the old active does not come back up or that it goes into standby mode when it does come up. 
+ +## Verify promotion + +After switching or failing over to the standby, verify that the old active YBA instance is in standby mode (switchover), or is no longer available (failover). + +If both YBA instances were to attempt to perform actions on a universe, it could have unpredictable side effects. It is critical to ensure that the old active instance is taken out of service or re-imaged as soon as possible if it is unavailable. + +YugabyteDB release archives are not synchronized between the active and standby instances. If any custom releases were added to the old active instance, you will need to add them to the new active instance again. The _Universe Release Files Missing_ alert will fire on any universes that are missing their corresponding release archives. If this alert fires, follow the steps in [How to Configure YugabyteDB Anywhere to provide Older, Hotfix, or Debug Builds](https://support.yugabyte.com/hc/en-us/articles/360054421952-How-to-configure-YugabyteDB-Anywhere-to-provide-Older-Hotfix-or-Debug-Builds). + +### Switchover + +After a switchover, do the following: + +- [Verify that HA is functioning properly](../high-availability/#verify-ha). +- If the old active instance is not in standby mode, there could be a communication issue from the new active to the old active instance. Follow the [setup instructions](../high-availability/#set-up-high-availability) to verify that certificates and ports are set up correctly. + +### Failover + +After a failover, do the following: + +- If the old active instance is hard down, verify that there is no chance that it can come back and run YBA at a later point. It is recommended to re-image the server hosting the active instance. +- If the old active instance does come back up, it should automatically go into standby mode. If it does not go into standby mode, you should manually demote it using the YBA API. 
Refer to [High Availability Workflows](https://github.com/yugabyte/yugabyte-db/blob/master/managed/api-examples/python-simple/high-availability.ipynb) for an example. + +- If the old active instance has successfully switched to standby, [verify that HA is functioning properly](../high-availability/#verify-ha). diff --git a/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/high-availability.md b/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/high-availability.md index f0f09fe9dedf..49d3aa9aaaef 100644 --- a/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/high-availability.md +++ b/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/high-availability.md @@ -1,9 +1,9 @@ --- title: High availability of YugabyteDB Anywhere -headerTitle: Enable high availability +headerTitle: Enable High Availability description: Make YugabyteDB Anywhere highly available headcontent: Configure standby instances of YugabyteDB Anywhere -linkTitle: Enable high availability +linkTitle: High Availability aliases: - /stable/yugabyte-platform/manage-deployments/platform-high-availability/ menu: @@ -11,14 +11,16 @@ menu: identifier: platform-high-availability parent: administer-yugabyte-platform weight: 40 +rightNav: + hideH4: true type: docs --- -YugabyteDB Anywhere (YBA) high availability (HA) is an active-standby model for multiple YBA instances. YBA HA uses YugabyteDB's distributed architecture to replicate your YBA data across multiple virtual machines (VM), ensuring that you can recover quickly from a VM failure and continue to manage and monitor your universes, with your configuration and metrics data intact. +YugabyteDB Anywhere (YBA) High Availability (HA) is an active-standby model for multiple YBA instances. 
YBA HA uses YugabyteDB's distributed architecture to replicate your YBA data across multiple virtual machines (VM), ensuring that you can recover quickly from a VM failure and continue to manage and monitor your universes, with your configuration and metrics data intact. Each HA cluster includes a single active YBA instance and at least one standby YBA instance, configured as follows: -- The active instance runs normally, but also pushes out backups of its state to all of the standby instances in the HA cluster at a configurable frequency (no more than once per minute). +- The active instance runs normally, but also pushes backups of its state to all of the standby instances in the HA cluster at a configurable frequency (no more than once per minute). The active instance also creates and sends one-off backups to standby instances whenever a task completes (such as creating a new universe). @@ -28,7 +30,9 @@ Each HA cluster includes a single active YBA instance and at least one standby Y The standby instance's Prometheus instance is federated to the active instance's Prometheus to constantly receive up to date metrics asynchronously. -When you promote a standby instance to active, YBA restores your selected backup, and then attempts to demote the previous active instance to standby mode. If the previous active instance is unavailable, it has to be manually decommissioned. +When you [promote a standby instance](../high-availability-promote/) to active, YBA restores your selected backup, and then attempts to demote the previous active instance to standby mode. If the previous active instance is unavailable, it has to be manually decommissioned. + +If you use the [YugabyteDB Kubernetes Operator](../anywhere-automation/yb-kubernetes-operator/) and deploy YBA across separate Kubernetes clusters, {{}}[Operator HA](../operator-high-availability/) synchronizes operator custom resources and secrets to the standby cluster during promotion. 
## Prerequisites @@ -40,7 +44,7 @@ Before configuring a HA cluster for your YBA instances, ensure that you have the - The YBA instances were installed using the same installation method (YBA Installer or Helm (Kubernetes)). - The YBA instances are configured to use the same path for the installation root. - If you are using custom ports for Prometheus, all YBA instances are using the same custom port. (The default Prometheus port for YugabyteDB Anywhere is 9090.) -- All YBA instances are running the same version of YBA software. (The YBA instances in a HA cluster should always be upgraded at approximately the same time.) +- All YBA instances are running the same version of YBA software. (The YBA instances in a HA cluster should always be [upgraded](#upgrade-instances) at approximately the same time.) - The YBA instances have the same login credentials. {{< tip title="Getting the API key for the standby" >}} @@ -49,7 +53,7 @@ If you are using the API to configure HA, obtain your API key for the standby in {{< /tip >}} -## Configure active and standby instances +## Set up High Availability To set up HA, you first configure the active instance by creating an active HA replication configuration and generating a shared authentication key. @@ -196,57 +200,43 @@ Upload the combined certificate to the trust store and try enabling certificate To set up a single URL for signing in to YBA that points to the current active YBA, even after a switchover or failover, it is recommended to use an application (L7) load balancer. On the load balancer, set the health check URL for each HA instance to `https:///api/v1/ha_leader`. (Specify any custom port configuration if you changed the default 443 configuration.) Note that you may need to set the support origin URL for your YBA instance to the load balancer URL; this can be set during installation, refer to [Install YugabyteDB Anywhere](../../install-yugabyte-platform/install-software/installer/). 
Configure the load balancer to forward ports 443 for the YBA UI and 9090 for Prometheus. -## Promote a standby instance to active - -You can make a standby instance active as follows: - -1. On the standby instance you want to promote, navigate to **Admin > High Availability > Replication Configuration** and click **Make Active**. - -1. Select the backup from which you want to restore (in most cases, you should choose the most recent backup) and enable **Confirm promotion**. +### Remove a standby instance - {{< warning title="Don't promote an old active backup" >}} -Immediately after upgrading the active instance to a new version of YBA, older state backups of the active instance (that is, before it was upgraded) will still be available on the standby. These are not deleted until the standby is promoted at some point, or until they expire. - -Because these old backups are present, you need to be cautious promoting the standby in the time immediately following an upgrade. - -When possible, only promote a standby when both standby and active are on the same version, and use the most recent backup that you are confident was received after the active instance was upgraded. - {{< /warning >}} +To remove a standby instance from a HA cluster, you need to remove it from the active instance's list, and then delete the configuration from the instance to be removed, as follows: -1. Click **Continue**. The restore takes a few seconds, after which expect to be signed out. +1. On the active instance's list, click **Delete Instance** for the standby instance to be removed. -1. Sign in using the credentials that you had configured on the previously active instance. If you are performing failover, you must sign in using your Super Admin account. +1. On the standby instance you wish to remove from the HA cluster, on the **Admin > High Availability** tab, click **Delete Configuration**. 
-In cases of failover, the previous active instance may be unavailable or unreachable during promotion. In this case, you must perform a force promotion that will promote the standby without demoting the active as per the following illustration: +The standby instance is now a standalone instance again. -![Force promotion](/images/yp/high-availability/ha-force-promotion.png) +After you have returned a standby instance to standalone mode, the information on the instance is likely to be out of date, which can lead to incorrect behavior. It is not recommended to continue to use this standby instance for any management operations. Uninstall YBA from this instance and reinstall it to return it to a clean state before using it as a standalone instance. -Afterwards, follow the steps in [Failover](#failover) to ensure that the old active does not come back up or that it goes into standby mode when it does come up. +## Monitoring and alerts -You should be able to see that all of the data has been restored into the instance, including universes, users, metrics, alerts, task history, provider configurations, and so on. +The easiest way to determine the health of your HA configuration is to monitor the overall HA state of your active YBA instance, which is displayed on the **Replication Configuration** tab as per the following illustration: -### Verify promotion +![Monitoring HA](/images/yp/high-availability/ha-monitor.png) -After switching or failing over to the standby, verify that the old active YBA instance is in standby mode (switchover), or is no longer available (failover). +The overall HA state is computed from the individual instance states, which can be viewed on the **Instance Configuration** tab. -If both YBA instances were to attempt to perform actions on a universe, it could have unpredictable side effects. It is critical to ensure that the old active instance is taken out of service or re-imaged as soon as possible if it is unavailable. 
+If some standbys are connected and some are disconnected, the global state will show _Warning_. -YugabyteDB release archives are not synchronized between the active and standby instances. If any custom releases were added to the old active instance, you will need to add them to the new active instance again. The _Universe Release Files Missing_ alert will fire on any universes that are missing their corresponding release archives. If this alert fires, follow the steps in [How to Configure YugabyteDB Anywhere to provide Older, Hotfix, or Debug Builds](https://support.yugabyte.com/hc/en-us/articles/360054421952-How-to-configure-YugabyteDB-Anywhere-to-provide-Older-Hotfix-or-Debug-Builds). +If all of your standby instances are disconnected, the state will show _Error_. -#### Switchover +The following HA-related [alerts](../../alerts-monitoring/alert/) are automatically configured to alert you of issues with your HA configuration: -After a switchover, do the following: +- HA Standby Sync -- [Verify that HA is functioning properly](#verify-ha). -- If the old active instance is not in standby mode, there could be a communication issue from the new active to the old active instance. Follow the [setup instructions](#configure-active-and-standby-instances) to verify that certificates and ports are set up correctly. + This alert fires when backup to a particular standby has failed for a specified amount of time. The default is 15 minutes, and can be changed by editing the HA Standby Sync alert policy. -#### Failover +- HA Version Mismatch -After a failover, do the following: + This alert fires when there is a version mismatch between the active and standby instances, and clears automatically when both instances are upgraded to the same version. -- If the old active instance is hard down, verify that there is no chance that it can come back and run YBA at a later point. It is recommended to re-image the server hosting the active instance. 
-- If the old active instance does come back up, it should automatically go into standby mode. If it does not go into standby mode, you should manually demote it using the YBA API. Refer to [High Availability Workflows](https://github.com/yugabyte/yugabyte-db/blob/master/managed/api-examples/python-simple/high-availability.ipynb) for an example. +- Universe Release Files Missing -- If the old active instance has successfully switched to standby, [verify that HA is functioning properly](#verify-ha). + This alert fires if any of your universes are using a local YugabyteDB release that is not available in YBA. This can happen after a switchover or failover to a YBA instance that doesn't have the same releases. The alert clears after you add the missing releases. ## Upgrade instances @@ -270,47 +260,9 @@ Certificates in the trust store should not require setup again. If you are promoting a YBA standby that is running version 2024.1.0 or later, while the old active instance is running a version earlier than 2024.1.0, see the [Limitations](#limitations). -## Remove a standby instance - -To remove a standby instance from a HA cluster, you need to remove it from the active instance's list, and then delete the configuration from the instance to be removed, as follows: - -1. On the active instance's list, click **Delete Instance** for the standby instance to be removed. - -1. On the standby instance you wish to remove from the HA cluster, on the **Admin > High Availability** tab, click **Delete Configuration**. - -The standby instance is now a standalone instance again. - -After you have returned a standby instance to standalone mode, the information on the instance is likely to be out of date, which can lead to incorrect behavior. It is not recommended to continue to use this standby instance for any management operations. Uninstall YBA from this instance and reinstall it to return it to a clean state before using it as a standalone instance. 
- -## Monitoring - -The easiest way to determine the health of your HA configuration is to monitor the overall HA state of your active YBA instance, which is displayed on the **Replication Configuration** tab as per the following illustration: - -![Monitoring HA](/images/yp/high-availability/ha-monitor.png) - -The overall HA state is computed from the individual instance states, which can be viewed on the **Instance Configuration** tab. - -If some standbys are connected and some are disconnected, the global state will show _Warning_. - -If all of your standby instances are disconnected, the state will show _Error_. - -The following HA-related [alerts](../../alerts-monitoring/alert/) are automatically configured to alert you of issues with your HA configuration: - -- HA Standby Sync - - This alert fires when backup to a particular standby has failed for a specified amount of time. The default is 15 minutes, and can be changed by editing the HA Standby Sync alert policy. - -- HA Version Mismatch - - This alert fires when there is a version mismatch between the active and standby instances, and clears automatically when both instances are upgraded to the same version. - -- Universe Release Files Missing - - This alert fires if any of your universes are using a local YugabyteDB release that is not available in YBA. This can happen after a switchover or failover to a YBA instance that doesn't have the same releases. The alert clears after you add the missing releases. - ## Limitations -- No automatic failover. If the active instance fails, follow the steps in [Promote a standby instance to active](#promote-a-standby-instance-to-active). +- No automatic failover. If the active instance fails, follow the steps in [Promote a standby instance to active](../high-availability-promote/#promote-a-standby-instance-to-active). - When performing failover, the first time you sign in after failover, you must use your Super Admin account. 
- Promotion will fail when HA is configured with an active instance at YBA version earlier than 2024.1, and a standby instance at version 2024.1 or later. It is not recommended to run in this configuration for an extended period. Reach out to {{% support-platform %}} if this is required. - If you are making API calls to YBA through custom automation, note that the [API token](../../anywhere-automation/#authentication) is different on the YBA active and standby until the standby has been promoted at least once to be an active instance. If you are using YBA with an API token, either generate a new token before every request, or perform a switchover after generating the API token (this process will have to be repeated when the API token is regenerated). diff --git a/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/operator-high-availability.md b/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/operator-high-availability.md new file mode 100644 index 000000000000..6282ca1e07ee --- /dev/null +++ b/docs/content/stable/yugabyte-platform/administer-yugabyte-platform/operator-high-availability.md @@ -0,0 +1,81 @@ +--- +title: YugabyteDB Anywhere Operator high availability +headerTitle: Operator High Availability +description: Extend YBA High Availability to synchronize Kubernetes Operator custom resources across clusters +headcontent: Synchronize operator-managed resources for high availability +linkTitle: Operator HA +tags: + feature: early-access +menu: + stable_yugabyte-platform: + identifier: platform-operator-high-availability + parent: platform-high-availability + weight: 45 +type: docs +--- + +{{}}YugabyteDB Anywhere (YBA) Operator high availability (HA) extends [YBA HA](../high-availability/) to synchronize Kubernetes Operator custom resources (CRs) and their associated secrets between active and standby YBA instances. 
This ensures that a standby YBA instance can resume management of operator-controlled universes after a failover, without requiring you to manually recreate CRs or secrets. + +Operator HA uses the same asynchronous backup and restore mechanism as YBA HA. Operator resources are included in the backups and restored automatically when a standby instance is promoted. In addition, improvements to YBA HA are automatically inherited by the operator, providing a unified experience for both platform state and operator-managed Kubernetes resources. + +Assuming you fulfill the prerequisites, Operator HA is _automatically enabled_. + +## Prerequisites + +Before you can use Operator HA, ensure the following: + +- [YBA HA is configured](../high-availability/) between your active and standby instances, both of which are deployed on Kubernetes. +- The [YugabyteDB Kubernetes Operator](../../anywhere-automation/yb-kubernetes-operator/) is enabled on each YBA instance in the HA cluster. +- Each YBA instance in the HA cluster can reach the Kubernetes API server for its local cluster. + +## Overview + +Operator HA is designed for deployments where YBA instances run on _separate Kubernetes clusters_. This is common in [Multi-Cluster Services (MCS)](../../configure-yugabyte-platform/kubernetes/#configure-kubernetes-multi-cluster-environment) environments. + +In a single-cluster deployment, both the active and standby YBA instances typically share access to the same Kubernetes control plane and the same CRs. Operator HA is not required in that scenario as standard [YBA HA](../high-availability/) covers it. + +In a multi-cluster deployment, a failover creates a management blackout: the standby YBA instance receives YBA platform state through HA backups, but the operator CRs and secrets that define and manage universes exist only on the primary cluster's Kubernetes API. Without Operator HA, the standby instance cannot manage those universes after promotion. 
+ +Operator HA addresses this when: + +- YBA instances are deployed on entirely separate Kubernetes clusters. +- The primary cluster goes offline and the standby YBA on a remote cluster must take over management of existing universes. +- The standby instance needs immediate access to the CRs and secrets (such as kubeconfigs and certificates) used to create and maintain those universes. + +## What to expect on failover and failback + +Operator HA provides a streamlined transition of management capabilities when a failover or failback is triggered. + +### During failover + +- Resource synchronization: The standby node automatically imports and applies all necessary YAML definitions for Operator CRs and their associated secrets (such as kubeconfigs and certificates) to its local Kubernetes API. + +- State alignment: The system applies "force/replace" logic to ensure the new active instance's state matches the latest source of truth from the backup, avoiding inconsistencies. + +- Operator activation: After operator resources are successfully applied, the standby YBA service activates its operator thread and resumes management of the infrastructure. You do not need to manually recreate CRs or re-import universes. + +For general failover steps, see [Promote a standby instance to active](../high-availability-promote/#promote-a-standby-instance-to-active). + +### During failback + +When you fail back to the original primary, Operator HA keeps operator resource state consistent across both clusters. + +- Spec consistency: Any edits made while the standby was active are synchronized back to the original primary. This prevents specifications from being rolled back to an outdated state. + +- Lifecycle management: If a CR was deleted during the failover period, the system recognizes this state and ensures the resource is not incorrectly recreated upon failback. 
+ +## Supported resources + +Operator HA tracks and transfers all critical operator resources, including the following: + +- Universes and providers. +- Backup, scheduled backup, and PITR configurations. +- Storage configurations and YugabyteDB certificates. +- Referenced Kubernetes secrets containing credentials and tokens. + +For details on each CR type, see [YugabyteDB Kubernetes Operator CRDs](../../anywhere-automation/yb-kubernetes-operator/#yugabytedb-kubernetes-operator-crds). + +## Learn more + +- [Enable High Availability](../high-availability/) +- [YugabyteDB Kubernetes Operator](../../anywhere-automation/yb-kubernetes-operator/) diff --git a/docs/content/stable/yugabyte-platform/anywhere-automation/yb-kubernetes-operator.md b/docs/content/stable/yugabyte-platform/anywhere-automation/yb-kubernetes-operator.md index 9de3a4d28624..4dbf766edbe6 100644 --- a/docs/content/stable/yugabyte-platform/anywhere-automation/yb-kubernetes-operator.md +++ b/docs/content/stable/yugabyte-platform/anywhere-automation/yb-kubernetes-operator.md @@ -10,6 +10,7 @@ menu: identifier: yb-kubernetes-operator weight: 100 type: docs +hideH4: true --- The YugabyteDB Kubernetes Operator streamlines the deployment and management of YugabyteDB clusters in Kubernetes environments. You can use the Operator to automate provisioning, scaling, and handling lifecycle events of YugabyteDB clusters, and it provides additional capabilities not available via other automation methods (which rely on REST APIs, UIs, and Helm charts). @@ -18,7 +19,7 @@ The Operator establishes `ybuniverse` as a Custom Resource Definition (CRD) in K You can define and update these custom resources to manage your universe's configuration, including granular resource specifications (CPU and memory for Masters and TServers) and precise regional/zonal placement policies to ensure optimal performance and high availability. 
Custom resources support seamless upgrades with no downtime, as well as automated, transparent scaling, and cluster-balanced deployments. -{{}}You can additionally convert Kubernetes universes that are managed via Helm charts to be managed by the YugabyteDB Kubernetes Operator, using the `operator-import` API. See [Import universe](#import-universe). +You can additionally convert Kubernetes universes that are managed via Helm charts to be managed by the YugabyteDB Kubernetes Operator, using the `operator-import` API. See [Import universe](#import-universe). ![YugabyteDB Kubernetes Operator](/images/yb-platform/yb-kubernetes-operator.png) @@ -37,6 +38,8 @@ The following additional CRDs support day 2 operations. | [Backup and RestoreJob](#backup-and-restore) | Take full backups of a universe and restore for data protection. | | [BackupSchedule](#scheduled-backups) | Schedule full and incremental backups of a universe. | | [PitrConfig](#configure-pitr) | Configure point-in-time recovery (PITR) for a universe. | +| [PitrRestore](#restore-from-pitr) | {{}}Restore a universe to a point in time using a PITR configuration. | +| [DrConfig](#configure-xcluster-dr) | {{}}Create and manage [xCluster DR](../../back-up-restore-universes/disaster-recovery/) configurations. | | [YBCertificate](#configure-tls-certificates) | Configure TLS certificates for encryption in transit (self-signed or cert-manager). | For details of each CRD, run `kubectl explain` on the CR. @@ -262,6 +265,12 @@ To use the YugabyteDB Kubernetes Operator with an existing YugabyteDB Anywhere i {{< /tabpane >}} +### Operator High Availability + +{{}}If you deploy YBA across separate Kubernetes clusters with [YBA High Availability](../../administer-yugabyte-platform/high-availability/) enabled, Operator HA synchronizes operator CRs and their associated secrets to the standby cluster during failover and failback. 
This lets the standby YBA instance resume management of operator-controlled universes without manually recreating resources. + +For details, see [Operator High Availability](../../administer-yugabyte-platform/operator-high-availability/). + ## Example workflows ### Create a provider @@ -399,7 +408,7 @@ operator-universe-demo Ready {{< yb-version version="stable" format="build"> To modify the universe, edit the CRD and use `kubectl apply/edit` operations. -### Create a universe with placement information +#### Create a universe with placement information Starting from YugabyteDB Anywhere v2025.2, you can specify `placementInfo` in the YBUniverse CRD to control regional and zonal placement of nodes. Use `defaultRegion` and `regions` with zone-level `numNodes` and optional `preferred` to define where nodes are placed. You need a Kubernetes provider (for example, one created via [YBProvider](#create-a-provider)) and set `spec.providerName` to its name. @@ -453,6 +462,51 @@ spec: memory: 8Gi ``` +#### Create a universe with read replicas + +{{}}Starting from YugabyteDB Anywhere v2026.1, you can specify a [Read Replica](../../../architecture/key-concepts/#read-replica-cluster) cluster in the YBUniverse CR using the `readReplica` field. 
+ +```sh +kubectl apply -f universe-read-replica.yaml -n yb-platform +``` + +```yaml +# universe-read-replica.yaml +apiVersion: operator.yugabyte.io/v1alpha1 +kind: YBUniverse +metadata: + name: yugabyte-read-replica +spec: + numNodes: 3 + replicationFactor: 3 + tserverResourceSpec: + cpu: 3 + memory: 6 + masterResourceSpec: + cpu: 3 + providerName: operator-provider + readReplica: + numNodes: 3 + replicationFactor: 3 + deviceInfo: + numVolumes: 1 + volumeSize: 80 + tserverResourceSpec: + cpu: 4 + memory: 6 + enableYSQL: true + enableNodeToNodeEncrypt: false + enableClientToNodeEncrypt: false + ybSoftwareVersion: 2026.1.0.0-b0 + enableYSQLAuth: false + enableYCQL: false + enableYCQLAuth: false + enableIPV6: false + deviceInfo: + numVolumes: 1 + volumeSize: 80 +``` + ### Add a different software release of YugabyteDB Use the Release CRD to add a different software release of YugabyteDB: @@ -829,9 +883,11 @@ No resources found in schedule-cr namespace. ### Configure PITR -Use the PitrConfig CRD to configure point-in-time recovery (PITR) for a universe. +Use the PitrConfig CRD to configure point-in-time recovery (PITR) for a universe. Declarative operations include creating a PITR configuration, updating the list of databases, and deleting the configuration. + +Starting from YugabyteDB Anywhere v2026.1, you can also trigger a PITR restore using the [PitrRestore CR](#restore-from-pitr). -Currently, only declarative operations are supported, including creating a PITR configuration, updating the list of databases, and deleting the configuration. Imperative operations such as restore from a PITR configuration will be supported in a future release. +#### Create a PITR configuration ```sh kubectl apply pitr-config.yaml -n test-pitr @@ -851,6 +907,251 @@ spec: tableType: 'YSQL' ``` +#### Restore from PITR + +{{}}Starting from YugabyteDB Anywhere v2026.1, use the PitrRestore CRD to restore a universe to an earlier point in time when PITR is enabled for a database. + +1. 
Create a universe: + + ```sh + kubectl apply -f pitr-universe.yaml -n test-pitr + ``` + + ```yaml + # pitr-universe.yaml + apiVersion: operator.yugabyte.io/v1alpha1 + kind: YBUniverse + metadata: + name: pitr-universe + spec: + universeName: "pitr-universe" + numNodes: 1 + replicationFactor: 1 + enableYSQL: true + enableNodeToNodeEncrypt: true + enableClientToNodeEncrypt: true + enableLoadBalancer: false + ybSoftwareVersion: "2026.1.0.0-b0" + enableYSQLAuth: false + enableYCQL: true + enableYCQLAuth: false + gFlags: + tserverGFlags: {} + masterGFlags: {} + deviceInfo: + volumeSize: 400 + numVolumes: 1 + storageClass: "yb-standard" + kubernetesOverrides: + resource: + master: + requests: + cpu: 2 + memory: 8Gi + limits: + cpu: 3 + memory: 8Gi + ``` + +1. Create a PITR configuration: + + ```sh + kubectl apply -f pitr-config.yaml -n test-pitr + ``` + + ```yaml + # pitr-config.yaml + apiVersion: operator.yugabyte.io/v1alpha1 + kind: PitrConfig + metadata: + name: pitr-config + spec: + name: pitr-config + universe: pitr-universe + database: 'yugabyte' + tableType: 'YSQL' + ``` + +1. Trigger a PITR restore: + + ```sh + kubectl apply -f pitr-restore.yaml -n test-pitr + ``` + + ```yaml + # pitr-restore.yaml + apiVersion: operator.yugabyte.io/v1alpha1 + kind: PitrRestore + metadata: + name: my-pitr-restore + spec: + universe: pitr-universe + pitrConfig: pitr-config + restoreTime: "2026-02-27T12:50:00Z" + ``` + +### Configure xCluster DR + +Starting from YugabyteDB Anywhere v2026.1, use the DrConfig CRD to create and manage [xCluster DR](../../back-up-restore-universes/disaster-recovery/) configurations. Both declarative operations (create, update the database list, delete) and imperative operations (switchover, failover, pause/resume, restart, replace replica) are supported. + +Before you create a DrConfig CR, ensure that the source and target universes and the storage configuration referenced in the CR exist. 
The following sections describe the DrConfig CR changes for each supported operation. + +#### Create a DR configuration + +```sh +kubectl apply -f dr-config.yaml -n yb-platform +``` + +```yaml +# dr-config.yaml +apiVersion: operator.yugabyte.io/v1alpha1 +kind: DrConfig +metadata: + name: prod-to-dr-config +spec: + name: prod-to-dr-config + sourceUniverse: dr-source + targetUniverse: dr-target + databases: + - "db1" + storageConfig: trial-backup-config +``` + +#### Edit the database list + +Update the `databases` list in the DrConfig CR and apply the change: + +```yaml +apiVersion: operator.yugabyte.io/v1alpha1 +kind: DrConfig +metadata: + name: prod-to-dr-config +spec: + name: prod-to-dr-config + sourceUniverse: dr-source + targetUniverse: dr-target + databases: + - "db1" + - "db2" + storageConfig: trial-backup-config +``` + +#### Switchover + +Swap `sourceUniverse` and `targetUniverse` in the CR to initiate a switchover: + +```yaml +apiVersion: operator.yugabyte.io/v1alpha1 +kind: DrConfig +metadata: + name: prod-to-dr-config +spec: + name: prod-to-dr-config + sourceUniverse: dr-target + targetUniverse: dr-source + databases: + - "db1" + - "db2" + storageConfig: trial-backup-config +``` + +#### Failover + +Fail over to the replica: + +1. Set the current target as `sourceUniverse` and set `targetUniverse` to an empty string (`""`) to initiate a failover: + + ```yaml + apiVersion: operator.yugabyte.io/v1alpha1 + kind: DrConfig + metadata: + name: prod-to-dr-config + spec: + name: prod-to-dr-config + sourceUniverse: dr-source + targetUniverse: "" + databases: + - "db1" + - "db2" + storageConfig: trial-backup-config + ``` + +1. Restart DR after failover. + + After the universe from which failover was performed (`dr-target`) is in the Ready state, add it back as `targetUniverse`, replacing the empty string. 
This initiates a restart DR operation and resumes replication: + + ```yaml + apiVersion: operator.yugabyte.io/v1alpha1 + kind: DrConfig + metadata: + name: prod-to-dr-config + spec: + name: prod-to-dr-config + sourceUniverse: dr-source + targetUniverse: dr-target + databases: + - "db1" + - "db2" + storageConfig: trial-backup-config + ``` + +#### Replace the DR replica + +To change the DR replica universe, set `targetUniverse` to the new target universe. Replication is re-established to the new target: + +```yaml +apiVersion: operator.yugabyte.io/v1alpha1 +kind: DrConfig +metadata: + name: prod-to-dr-config +spec: + name: prod-to-dr-config + sourceUniverse: dr-source + targetUniverse: dr-third + databases: + - "db1" + - "db2" + storageConfig: trial-backup-config +``` + +#### Pause and resume replication + +Set `paused: true` to pause replication. When a DR config is created, `paused` defaults to `false`. + +```yaml +apiVersion: operator.yugabyte.io/v1alpha1 +kind: DrConfig +metadata: + name: prod-to-dr-config +spec: + name: prod-to-dr-config + sourceUniverse: dr-source + targetUniverse: dr-third + paused: true + databases: + - "db1" + - "db2" + storageConfig: trial-backup-config +``` + +Set `paused: false` to resume a paused replication: + +```yaml +apiVersion: operator.yugabyte.io/v1alpha1 +kind: DrConfig +metadata: + name: prod-to-dr-config +spec: + name: prod-to-dr-config + sourceUniverse: dr-source + targetUniverse: dr-third + paused: false + databases: + - "db1" + - "db2" + storageConfig: trial-backup-config +``` + ### Configure TLS certificates Use the YBCertificate CRD to configure TLS certificates for encryption in transit: @@ -907,7 +1208,7 @@ spec: ## Import universe -{{}} Available in YugabyteDB Anywhere v2025.2.2 and later. +Available in YugabyteDB Anywhere v2025.2.2 and later. Use the operator import universe feature to import existing YugabyteDB Anywhere Kubernetes universes that are managed via Helm charts to be managed by the Kubernetes Operator. 
@@ -919,7 +1220,7 @@ Currently, universes with any of the following configurations are not supported ### Before you begin -- Install the operator. The operator must be enabled on your instance. See [Installing Kubernetes Operator](#installing-kubernetes-operator). +- Install the operator. The operator must be enabled on your instance. See [Install Kubernetes Operator](#install-kubernetes-operator). - Verify namespace configuration. - If the operator is configured to watch a single, specific namespace, the namespace provided in the import payload must match that runtime configuration (for example, `yb.kubernetes.operator.namespace`). - If the operator is not watching a specific namespace, the payload should be the namespace you want the resources to be created in. @@ -992,7 +1293,5 @@ Importing a universe to the operator creates or adopts the following in the targ - YugabyteDB Kubernetes Operator is single cluster only, and does not support multi-cluster universes. - Currently, YugabyteDB Kubernetes Operator does not support the following features: - Software upgrade rollback - - [xCluster](../../../architecture/docdb-replication/async-replication/) - - [Read Replica](../../../architecture/key-concepts/#read-replica-cluster) - [Encryption-At-Rest](../../security/enable-encryption-at-rest/) - Only self-signed [encryption in transit](../../security/enable-encryption-in-transit/) is supported. Editing this later is not supported.