From 31d2a61b2262f1c22eb9d8674e49e58999e7ab72 Mon Sep 17 00:00:00 2001 From: Anastasia Alexadrova Date: Wed, 26 Mar 2025 17:27:31 +0100 Subject: [PATCH 1/9] K8SPXC-1405 Added a note about the retention policy with PITR reworked the doc to improve readability --- __pycache__/main.cpython-311.pyc | Bin 1705 -> 0 bytes docs/backups-pitr.md | 57 ++++++++++++++++--------------- 2 files changed, 29 insertions(+), 28 deletions(-) delete mode 100644 __pycache__/main.cpython-311.pyc diff --git a/__pycache__/main.cpython-311.pyc b/__pycache__/main.cpython-311.pyc deleted file mode 100644 index 7ca444f13de0ee776cf1836f2afdf89010b29c18..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 1705 zcmcgsO>7%Q6rR~#|2C5qq7((S;HGMbY3)^2A+^MWLo38j6_U8&C!wV4ops{G-nC}e zZCIQtha5O?C{j^_1jrWxRW2Mka_n&@O4Vp3q)ME)8Rg~^Z+30xhe*9K)_(Ke``*mF zdGnrsAta4py!rD_^nrxXUqa~((8IBJ5~hcUB8sbM7Ex&(&0>m+C@cRh93~7L`b8ka zB)+{xfQP7wmLfYK#p`Gj=B>VE6(8bZ=Oesyh>ufc9Ut+KX$&Gsc;r3XUe1&omf==x z>rhAvocAJpd(!}L5$qlaxQRAF+v;09@S&2x=X%M)3l3K%yyV=6SBjILUdwseTceAv zTX){lw51Bu^L3LMwx#C{yQYbXmgL|Bu_eUlkbdeXDO< zm716;%c0I}U+gLB@%NYx$7zwyq< zf;*ODDzILvpl2+-X1+b_C5Oignm(SH^W!O*(@?ap6>kNu{B}i5)*0QiW;s%*aCVJBD0zn>M_R%JS7mwq71psBQ4n0-R zZmVZo=UNU|FYKrno~oC&)k|ExyrW)TTl`af;hQ`6tybyR(Vl5HeQINPJAK+upWaQL zyl?R2(ANXIWN70ebWUr96YaMXoQ(bc0VnVN(ct99^dL)dmyO8_!0uyzhka8-hj++I>?JrmjN}mtP2q!dl7*U0 i8&z|hy#~CfVdo-1TgDjgqSRXWYbz=iryY;sZvO(0NQ_qi diff --git a/docs/backups-pitr.md b/docs/backups-pitr.md index 3f27daaa..f325c42f 100644 --- a/docs/backups-pitr.md +++ b/docs/backups-pitr.md @@ -1,28 +1,41 @@ # Store binary logs for point-in-time recovery -Point-in-time recovery functionality allows users to roll back the cluster to a -specific transaction, time (or even skip a transaction in some cases). -Technically, this feature involves continuously saving binary log updates -[to the backup storage](backups-storage.md). 
Point-in-time recovery is off by +Point-in-time recovery allows users to roll back the cluster to a +specific transaction or time. You can even skip a transaction if you don't need it anymore. To do so, the Operator needs a backup and the binary logs (binlogs) of the server. They contain the operations that modified the database from a point in the past. + +Point-in-time recovery is off by default and is supported by the Operator only with Percona XtraDB Cluster versions starting from 8.0.21-12.1. -To be used, it requires setting a number of keys in the `pitr` subsection -under the `backup` section of the [deploy/cr.yaml :octicons-link-external-16:](https://github.com/percona/percona-xtradb-cluster-operator/blob/main/deploy/cr.yaml) file: +After you [enable point-in-time recovery](#enable-point-in-time-recovery), the Operator saves binary log updates +[to the backup storage](backups-storage.md). + +## Considerations + +1. Both binlog and full backup should use the same storage to make the point-in-time recovery work +2. Point-in-time recovery will be done for binlogs without any + cluster-based filtering. Therefore it is recommended to use a separate + storage, bucket, or directory to store binlogs for the cluster. + Also, it is recommended to have empty bucket/directory which holds binlogs + (with no binlogs or files from previous attempts or other clusters) when + you enable point-in-time recovery. +3. Don't [purge binlogs :octicons-link-external-16:](https://dev.mysql.com/doc/refman/8.0/en/purge-binary-logs.html) + before they are transferred to the backup storage. Doing so breaks point-in-time recovery +4. Disable the [retention policy](operator.md#backupschedulekeep) as it is incompatible with the point-in-time recovery. 
To clean up the storage, configure the [Bucket lifecycle :octicons-link-external-16:](https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-set-lifecycle-configuration-intro.html) on the storage + +## Enable point-in-time recovery -* `backup.pitr.enabled` key should be set to `true` +To use point-in-time recovery, set the following keys in the `pitr` subsection +under the `backup` section of the [deploy/cr.yaml :octicons-link-external-16:](https://github.com/percona/percona-xtradb-cluster-operator/blob/main/deploy/cr.yaml) manifest: -* `backup.pitr.storageName` key should point to the name of the storage already - configured in the `storages` subsection +* `backup.pitr.enabled` - set it to `true` - !!! note - Both binlog and full backup should use s3-compatible storage to make - point-in-time recovery work! +* `backup.pitr.storageName` - specify the same storage name that you have defined in the `storages` subsection -* `timeBetweenUploads` key specifies the number of seconds between running the - binlog uploader. +* `timeBetweenUploads` - specify the number of seconds between running the + binlog uploader -The following example shows how the `pitr` subsection looks like: +The following example shows what the `pitr` subsection looks like if you use the S3 storage: ```yaml backup: @@ -33,16 +46,4 @@ backup: timeBetweenUploads: 60 ``` -!!! note - - Point-in-time recovery will be done for binlogs without any - cluster-based filtering. Therefore it is recommended to use a separate - storage, bucket, or directory to store binlogs for the cluster. - Also, it is recommended to have empty bucket/directory which holds binlogs - (with no binlogs or files from previous attempts or other clusters) when - you enable point-in-time recovery. - -!!! note - - [Purging binlogs :octicons-link-external-16:](https://dev.mysql.com/doc/refman/8.0/en/purge-binary-logs.html) - before they are transferred to backup storage will break point-in-time recovery.
+For how to restore a database to a point in time, see [Restore the cluster with point-in-time recovery](backups-restore.md#restore-the-cluster-with-point-in-time-recovery). \ No newline at end of file From 6a8f0efde5777e9f5a2eed158b16ec5a8adae9fd Mon Sep 17 00:00:00 2001 From: Anastasia Alexadrova Date: Thu, 27 Mar 2025 12:21:59 +0100 Subject: [PATCH 2/9] K8SPXC-1544 Added info about Prometheus metrics --- docs/backups-pitr.md | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-) diff --git a/docs/backups-pitr.md b/docs/backups-pitr.md index f325c42f..cc588daa 100644 --- a/docs/backups-pitr.md +++ b/docs/backups-pitr.md @@ -1,15 +1,18 @@ # Store binary logs for point-in-time recovery Point-in-time recovery allows users to roll back the cluster to a -specific transaction or time. You can even skip a transaction if you don't need it anymore. To do so, the Operator needs a backup and the binary logs (binlogs) of the server. They contain the operations that modified the database from a point in the past. +specific transaction or time. You can even skip a transaction if you don't need it anymore. To do so, the Operator needs a backup and binary logs (binlogs) of the server. + +A binary log records all changes made to the database, such as updates, inserts, and deletes. It is used to synchronize data across servers and for point-in-time recovery. Point-in-time recovery is off by -default and is supported by the Operator only with Percona XtraDB Cluster +default and is supported by the Operator with Percona XtraDB Cluster versions starting from 8.0.21-12.1. -After you [enable point-in-time recovery](#enable-point-in-time-recovery), the Operator saves binary log updates +After you [enable point-in-time recovery](#enable-point-in-time-recovery), the Operator spins up a separate point-in-time recovery Pod, which starts saving binary log updates [to the backup storage](backups-storage.md). + ## Considerations 1.
Both binlog and full backup should use the same storage to make the point-in-time recovery work @@ -46,4 +49,19 @@ backup: timeBetweenUploads: 60 ``` -For how to restore a database to a point in time, see [Restore the cluster with point-in-time recovery](backups-restore.md#restore-the-cluster-with-point-in-time-recovery). \ No newline at end of file +For how to restore a database to a specific point in time, see [Restore the cluster with point-in-time recovery](backups-restore.md#restore-the-cluster-with-point-in-time-recovery). + +## Monitoring binary logs + +The point-in-time recovery Pod collects statistics metrics for binlogs. They provide insights into the success and failure rates of binlog operations, timeliness of processing and uploads and potential gaps or inconsistencies in binlog data. + +The available metrics are: + +* `pxc_binlog_collector_success_total` - The total number of successful binlog collection cycles. It helps monitor how often the binlog collector successfully processes and uploads binary logs. +* `pxc_binlog_collector_failure_total` - The total number of failed binlog collection cycles. Indicates issues in the binlog collection process, such as connectivity problems or errors during processing +* `pxc_binlog_collector_gap_detected_total` - Tracks the total number of gaps detected in the binlog sequence during collection. Highlights potential issues with missing or skipped binlogs, which could impact replication or recovery. +* `pxc_binlog_collector_last_processing_timestamp` - Records the timestamp of the last successful binlog processing operation. +* `pxc_binlog_collector_last_upload_timestamp` - Records the timestamp of the last successful binlog upload to the storage +* `pxc_binlog_collector_uploaded_total` - The total number of successfully uploaded binlogs + +You can connect to this Pod using the `:8080/metrics` endpoint to gather these metrics and further analyze them. 
\ No newline at end of file From d1439b5110b8d981f99b5f86fc52ded783cee4f9 Mon Sep 17 00:00:00 2001 From: Anastasia Alexadrova Date: Thu, 27 Mar 2025 14:35:58 +0100 Subject: [PATCH 3/9] Updated considerations --- docs/backups-pitr.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/docs/backups-pitr.md b/docs/backups-pitr.md index cc588daa..a2b3066c 100644 --- a/docs/backups-pitr.md +++ b/docs/backups-pitr.md @@ -1,7 +1,7 @@ # Store binary logs for point-in-time recovery Point-in-time recovery allows users to roll back the cluster to a -specific transaction or time. You can even skip a transaction if you don't need it anymore. To do so, the Operator needs a backup and binary logs (binlogs) of the server. +specific transaction or time. You can even skip a transaction if you don't need it anymore. To perform a point-in-time recovery, the Operator needs a backup and binary logs (binlogs) of the server. A binary log records all changes made to the database, such as updates, inserts, and deletes. It is used to synchronize data across servers and for point-in-time recovery. @@ -15,13 +15,11 @@ After you [enable point-in-time recovery](#enable-point-in-time-recovery), the O ## Considerations -1. Both binlog and full backup should use the same storage to make the point-in-time recovery work -2. Point-in-time recovery will be done for binlogs without any - cluster-based filtering. Therefore it is recommended to use a separate - storage, bucket, or directory to store binlogs for the cluster. - Also, it is recommended to have empty bucket/directory which holds binlogs - (with no binlogs or files from previous attempts or other clusters) when - you enable point-in-time recovery. +1. You must use either S3-compatible or Azure-compatible storage for both binlog and full backup to make the point-in-time recovery work +2. The Operator saves binlogs without any + cluster-based filtering.
Therefore, either use a separate folder per cluster on the same bucket or use different buckets for binlogs. + + Also, we recommend using an empty bucket, or an empty folder on a bucket, for binlogs when you enable point-in-time recovery. This bucket or folder should contain no binlogs or files from previous attempts or other clusters. 3. Don't [purge binlogs :octicons-link-external-16:](https://dev.mysql.com/doc/refman/8.0/en/purge-binary-logs.html) before they are transferred to the backup storage. Doing so breaks point-in-time recovery 4. Disable the [retention policy](operator.md#backupschedulekeep) as it is incompatible with the point-in-time recovery. To clean up the storage, configure the [Bucket lifecycle :octicons-link-external-16:](https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-set-lifecycle-configuration-intro.html) on the storage From aab645674134442e6775e7ea8bbb6d9b13ad7baa Mon Sep 17 00:00:00 2001 From: Anastasia Alexadrova Date: Thu, 27 Mar 2025 16:14:10 +0100 Subject: [PATCH 4/9] Excluded Monitoring section --- docs/backups-pitr.md | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/docs/backups-pitr.md b/docs/backups-pitr.md index a2b3066c..aa93e5e5 100644 --- a/docs/backups-pitr.md +++ b/docs/backups-pitr.md @@ -49,17 +49,3 @@ backup: For how to restore a database to a specific point in time, see [Restore the cluster with point-in-time recovery](backups-restore.md#restore-the-cluster-with-point-in-time-recovery). -## Monitoring binary logs - -The point-in-time recovery Pod collects statistics metrics for binlogs. They provide insights into the success and failure rates of binlog operations, timeliness of processing and uploads and potential gaps or inconsistencies in binlog data. - -The available metrics are: - -* `pxc_binlog_collector_success_total` - The total number of successful binlog collection cycles.
-* `pxc_binlog_collector_failure_total` - The total number of failed binlog collection cycles. Indicates issues in the binlog collection process, such as connectivity problems or errors during processing -* `pxc_binlog_collector_gap_detected_total` - Tracks the total number of gaps detected in the binlog sequence during collection. Highlights potential issues with missing or skipped binlogs, which could impact replication or recovery. -* `pxc_binlog_collector_last_processing_timestamp` - Records the timestamp of the last successful binlog processing operation. -* `pxc_binlog_collector_last_upload_timestamp` - Records the timestamp of the last successful binlog upload to the storage -* `pxc_binlog_collector_uploaded_total` - The total number of successfully uploaded binlogs - -You can connect to this Pod using the `:8080/metrics` endpoint to gather these metrics and further analyze them. \ No newline at end of file From df405a556065af6c108f674b54ec3e30819caa59 Mon Sep 17 00:00:00 2001 From: Anastasia Alexadrova Date: Mon, 31 Mar 2025 19:32:03 +0200 Subject: [PATCH 5/9] K8SPXC-1544 Added info about Prometheus metrics modified: docs/backups-pitr.md --- docs/backups-pitr.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/docs/backups-pitr.md b/docs/backups-pitr.md index aa93e5e5..a2b3066c 100644 --- a/docs/backups-pitr.md +++ b/docs/backups-pitr.md @@ -49,3 +49,17 @@ backup: For how to restore a database to a specific point in time, see [Restore the cluster with point-in-time recovery](backups-restore.md#restore-the-cluster-with-point-in-time-recovery). +## Monitoring binary logs + +The point-in-time recovery Pod collects statistics metrics for binlogs. They provide insights into the success and failure rates of binlog operations, timeliness of processing and uploads and potential gaps or inconsistencies in binlog data. + +The available metrics are: + +* `pxc_binlog_collector_success_total` - The total number of successful binlog collection cycles. 
It helps monitor how often the binlog collector successfully processes and uploads binary logs. +* `pxc_binlog_collector_failure_total` - The total number of failed binlog collection cycles. Indicates issues in the binlog collection process, such as connectivity problems or errors during processing +* `pxc_binlog_collector_gap_detected_total` - Tracks the total number of gaps detected in the binlog sequence during collection. Highlights potential issues with missing or skipped binlogs, which could impact replication or recovery. +* `pxc_binlog_collector_last_processing_timestamp` - Records the timestamp of the last successful binlog processing operation. +* `pxc_binlog_collector_last_upload_timestamp` - Records the timestamp of the last successful binlog upload to the storage +* `pxc_binlog_collector_uploaded_total` - The total number of successfully uploaded binlogs + +You can connect to this Pod using the `:8080/metrics` endpoint to gather these metrics and further analyze them. \ No newline at end of file From 46709f20dda942bf9e6f14fb0b038e3905f71056 Mon Sep 17 00:00:00 2001 From: Anastasia Alexadrova Date: Mon, 31 Mar 2025 19:33:31 +0200 Subject: [PATCH 6/9] Excluded pxc_binlog_collector_failure_total as not ready --- docs/backups-pitr.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/backups-pitr.md b/docs/backups-pitr.md index a2b3066c..86801f66 100644 --- a/docs/backups-pitr.md +++ b/docs/backups-pitr.md @@ -56,7 +56,6 @@ The point-in-time recovery Pod collects statistics metrics for binlogs. They pro The available metrics are: * `pxc_binlog_collector_success_total` - The total number of successful binlog collection cycles. It helps monitor how often the binlog collector successfully processes and uploads binary logs. -* `pxc_binlog_collector_failure_total` - The total number of failed binlog collection cycles. 
Indicates issues in the binlog collection process, such as connectivity problems or errors during processing * `pxc_binlog_collector_gap_detected_total` - Tracks the total number of gaps detected in the binlog sequence during collection. Highlights potential issues with missing or skipped binlogs, which could impact replication or recovery. * `pxc_binlog_collector_last_processing_timestamp` - Records the timestamp of the last successful binlog processing operation. * `pxc_binlog_collector_last_upload_timestamp` - Records the timestamp of the last successful binlog upload to the storage From 42297867754241ce68462689cd2c36256b6a7d2c Mon Sep 17 00:00:00 2001 From: Anastasia Alexadrova Date: Mon, 31 Mar 2025 19:36:31 +0200 Subject: [PATCH 7/9] Updated after the review --- docs/backups-pitr.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/backups-pitr.md b/docs/backups-pitr.md index 86801f66..4000e49f 100644 --- a/docs/backups-pitr.md +++ b/docs/backups-pitr.md @@ -57,7 +57,7 @@ The available metrics are: * `pxc_binlog_collector_success_total` - The total number of successful binlog collection cycles. It helps monitor how often the binlog collector successfully processes and uploads binary logs. * `pxc_binlog_collector_gap_detected_total` - Tracks the total number of gaps detected in the binlog sequence during collection. Highlights potential issues with missing or skipped binlogs, which could impact replication or recovery. -* `pxc_binlog_collector_last_processing_timestamp` - Records the timestamp of the last successful binlog processing operation. +* `pxc_binlog_collector_last_processing_timestamp` - Records the timestamp of the last successful binlog collection operation. 
* `pxc_binlog_collector_last_upload_timestamp` - Records the timestamp of the last successful binlog upload to the storage * `pxc_binlog_collector_uploaded_total` - The total number of successfully uploaded binlogs From c01e847327c430c67b2fae5bf175ce4c8fb700e1 Mon Sep 17 00:00:00 2001 From: Anastasia Alexadrova Date: Mon, 31 Mar 2025 19:47:00 +0200 Subject: [PATCH 8/9] Updated after the review --- docs/backups-pitr.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/backups-pitr.md b/docs/backups-pitr.md index 4000e49f..7adb9791 100644 --- a/docs/backups-pitr.md +++ b/docs/backups-pitr.md @@ -49,9 +49,9 @@ backup: For how to restore a database to a specific point in time, see [Restore the cluster with point-in-time recovery](backups-restore.md#restore-the-cluster-with-point-in-time-recovery). -## Monitoring binary logs +## Binary logs statistics -The point-in-time recovery Pod collects statistics metrics for binlogs. They provide insights into the success and failure rates of binlog operations, timeliness of processing and uploads and potential gaps or inconsistencies in binlog data. +The point-in-time recovery Pod has statistics metrics for binlogs. They provide insights into the success and failure rates of binlog operations, timeliness of processing and uploads and potential gaps or inconsistencies in binlog data. 
The available metrics are: From 828f67afbc15a3e69e7920a70b77c19412a9a03e Mon Sep 17 00:00:00 2001 From: Anastasia Alexadrova Date: Mon, 31 Mar 2025 19:54:36 +0200 Subject: [PATCH 9/9] Added info about growing counters --- docs/backups-pitr.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/backups-pitr.md b/docs/backups-pitr.md index 7adb9791..75b4b6ef 100644 --- a/docs/backups-pitr.md +++ b/docs/backups-pitr.md @@ -61,4 +61,6 @@ The available metrics are: * `pxc_binlog_collector_last_upload_timestamp` - Records the timestamp of the last successful binlog upload to the storage * `pxc_binlog_collector_uploaded_total` - The total number of successfully uploaded binlogs -You can connect to this Pod using the `:8080/metrics` endpoint to gather these metrics and further analyze them. \ No newline at end of file +You can connect to this Pod using the `:8080/metrics` endpoint to gather these metrics and further analyze them. + +Note that the statistics data is not kept when the point-in-time recovery Pod restarts. This means that the counters like `pxc_binlog_collector_success_total` are reset. \ No newline at end of file
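
The metrics documented in this series are served in the Prometheus text format, and the last patch notes that counters reset when the point-in-time recovery Pod restarts. The following is a minimal Python sketch of how a consumer might read those values and tolerate a reset. It is not part of the Operator: the helper names `parse_metrics` and `counter_increase` are hypothetical, the sample values are made up, and it assumes each metric is exposed as a single series (the label handling is only a precaution).

```python
from typing import Dict

PREFIX = "pxc_binlog_collector_"  # metric name prefix used in the documented list


def parse_metrics(text: str) -> Dict[str, float]:
    """Extract binlog collector samples from Prometheus text exposition output."""
    values: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "{" in line and "}" in line:
            # Drop an optional label set such as {pod="x"} before reading the value.
            name = line[: line.index("{")]
            fields = line[line.rindex("}") + 1 :].split()
        else:
            name, *fields = line.split()
        if name.startswith(PREFIX) and fields:
            values[name] = float(fields[0])
    return values


def counter_increase(previous: float, current: float) -> float:
    """Increase between two scrapes; a drop is treated as a restart from zero."""
    if current < previous:
        return current
    return current - previous


# Illustrative scrape output; the numbers are invented for this example.
SAMPLE = """\
# TYPE pxc_binlog_collector_success_total counter
pxc_binlog_collector_success_total 42
pxc_binlog_collector_gap_detected_total 0
pxc_binlog_collector_uploaded_total 17
go_goroutines 12
"""

metrics = parse_metrics(SAMPLE)
print(metrics["pxc_binlog_collector_success_total"])
print(counter_increase(45, 3))  # the counter dropped, so a Pod restart is assumed
```

In practice the text would come from the Pod's `:8080/metrics` endpoint, for instance fetched with `urllib.request.urlopen` after making port 8080 reachable from your machine; how you expose that port is an assumption outside the scope of this patch. Treating a decreasing counter as a reset mirrors how Prometheus rate functions handle restarts, so a monitoring system that already scrapes the endpoint may not need this logic at all.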