Commit f282b65

K8SPXC-1405 Added a note about the retention policy with PITR (#219)
K8SPXC-1544 Added info about Prometheus metrics
1 parent 7e46690 commit f282b65

1 file changed: +44 -26 lines changed

Diff for: docs/backups-pitr.md

@@ -1,28 +1,42 @@
11
# Store binary logs for point-in-time recovery
22

3-
Point-in-time recovery functionality allows users to roll back the cluster to a
4-
specific transaction, time (or even skip a transaction in some cases).
5-
Technically, this feature involves continuously saving binary log updates
6-
[to the backup storage](backups-storage.md). Point-in-time recovery is off by
7-
default and is supported by the Operator only with Percona XtraDB Cluster
3+
Point-in-time recovery allows users to roll back the cluster to a
4+
specific transaction or time. You can even skip a transaction if you don't need it anymore. To perform a point-in-time recovery, the Operator needs a backup and the binary logs (binlogs) of the server.
5+
6+
A binary log records all changes made to the database, such as updates, inserts, and deletes. It is used to synchronize data across servers and for point-in-time recovery.
7+
8+
Point-in-time recovery is off by
9+
default and is supported by the Operator with Percona XtraDB Cluster
810
versions starting from 8.0.21-12.1.
911

10-
To be used, it requires setting a number of keys in the `pitr` subsection
11-
under the `backup` section of the [deploy/cr.yaml :octicons-link-external-16:](https://github.com/percona/percona-xtradb-cluster-operator/blob/main/deploy/cr.yaml) file:
12+
After you [enable point-in-time recovery](#enable-point-in-time-recovery), the Operator spins up a separate point-in-time recovery Pod, which starts saving binary log updates
13+
[to the backup storage](backups-storage.md).
14+
15+
16+
## Considerations
17+
18+
1. You must use either S3-compatible or Azure-compatible storage for both binlogs and full backups to make point-in-time recovery work.
19+
2. The Operator saves binlogs without any
20+
cluster-based filtering. Therefore, either use a separate folder per cluster on the same bucket or use different buckets for binlogs.
1221

13-
* `backup.pitr.enabled` key should be set to `true`
22+
Also, we recommend using an empty bucket, or an empty folder on a bucket, for binlogs when you enable point-in-time recovery. This bucket or folder should contain no binlogs or other files from previous attempts or other clusters.
23+
3. Don't [purge binlogs :octicons-link-external-16:](https://dev.mysql.com/doc/refman/8.0/en/purge-binary-logs.html)
24+
before they are transferred to the backup storage. Doing so breaks point-in-time recovery
25+
4. Disable the [retention policy](operator.md#backupschedulekeep) as it is incompatible with point-in-time recovery. To clean up the storage, configure the [Bucket lifecycle :octicons-link-external-16:](https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-set-lifecycle-configuration-intro.html) on the storage.
1426

15-
* `backup.pitr.storageName` key should point to the name of the storage already
16-
configured in the `storages` subsection
27+
## Enable point-in-time recovery
1728

18-
!!! note
19-
Both binlog and full backup should use s3-compatible storage to make
20-
point-in-time recovery work!
29+
To use point-in-time recovery, set the following keys in the `pitr` subsection
30+
under the `backup` section of the [deploy/cr.yaml :octicons-link-external-16:](https://github.com/percona/percona-xtradb-cluster-operator/blob/main/deploy/cr.yaml) manifest:
2131

22-
* `timeBetweenUploads` key specifies the number of seconds between running the
23-
binlog uploader.
32+
* `backup.pitr.enabled` - set it to `true`
2433

25-
The following example shows how the `pitr` subsection looks like:
34+
* `backup.pitr.storageName` - specify the same storage name that you have defined in the `storages` subsection
35+
36+
* `timeBetweenUploads` - specify the number of seconds between runs of the
37+
binlog uploader
38+
39+
The following example shows what the `pitr` subsection looks like if you use S3 storage:
2640

2741
```yaml
2842
backup:
@@ -33,16 +47,20 @@ backup:
3347
timeBetweenUploads: 60
3448
```
3549
36-
!!! note
50+
For how to restore a database to a specific point in time, see [Restore the cluster with point-in-time recovery](backups-restore.md#restore-the-cluster-with-point-in-time-recovery).
51+
52+
## Binary logs statistics
53+
54+
The point-in-time recovery Pod exposes statistics metrics for binlogs. They provide insights into the success and failure rates of binlog operations, the timeliness of processing and uploads, and potential gaps or inconsistencies in binlog data.
55+
56+
The available metrics are:
3757
38-
Point-in-time recovery will be done for binlogs without any
39-
cluster-based filtering. Therefore it is recommended to use a separate
40-
storage, bucket, or directory to store binlogs for the cluster.
41-
Also, it is recommended to have empty bucket/directory which holds binlogs
42-
(with no binlogs or files from previous attempts or other clusters) when
43-
you enable point-in-time recovery.
58+
* `pxc_binlog_collector_success_total` - The total number of successful binlog collection cycles. It helps monitor how often the binlog collector successfully processes and uploads binary logs.
59+
* `pxc_binlog_collector_gap_detected_total` - Tracks the total number of gaps detected in the binlog sequence during collection. Highlights potential issues with missing or skipped binlogs, which could impact replication or recovery.
60+
* `pxc_binlog_collector_last_processing_timestamp` - Records the timestamp of the last successful binlog collection operation.
61+
* `pxc_binlog_collector_last_upload_timestamp` - Records the timestamp of the last successful binlog upload to the storage
62+
* `pxc_binlog_collector_uploaded_total` - The total number of successfully uploaded binlogs
4463

45-
!!! note
64+
You can connect to this Pod using the `<pitr-pod-service>:8080/metrics` endpoint to gather these metrics and further analyze them.
4665

47-
[Purging binlogs :octicons-link-external-16:](https://dev.mysql.com/doc/refman/8.0/en/purge-binary-logs.html)
48-
before they are transferred to backup storage will break point-in-time recovery.
66+
Note that the statistics data is not kept when the point-in-time recovery Pod restarts. This means that counters such as `pxc_binlog_collector_success_total` are reset to zero.
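
As a rough illustration of how these metrics might be consumed, the Python sketch below reads a metrics payload and computes counter growth in a way that tolerates the reset described above. It is not part of the Operator. It assumes the endpoint returns the common Prometheus text format (`name value` lines, with `#` comment lines), which the documentation does not state explicitly. The helper names `fetch_metrics`, `parse_metrics`, and `counter_increase` and the sample values are hypothetical, and the parser is simplified: it does not handle label values that contain `}`.

```python
import urllib.request


def fetch_metrics(url, timeout=5.0):
    """Download the raw metrics text from the given endpoint URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Parse 'name value' sample lines into a {name: float} dictionary."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labelled sample: keep the label set as part of the key.
            head, _, tail = line.partition("}")
            name = head + "}"
        else:
            name, _, tail = line.partition(" ")
        fields = tail.split()
        if fields:
            samples[name] = float(fields[0])
    return samples


def counter_increase(previous, current):
    """Return how much a counter grew between two scrapes.

    A value lower than the previous one is treated as a reset
    (for example, after a Pod restart), so growth is counted from zero.
    """
    if current < previous:
        return current
    return current - previous


# Hypothetical sample payload; the values are made up for illustration.
SAMPLE = """\
# TYPE pxc_binlog_collector_success_total counter
pxc_binlog_collector_success_total 12
pxc_binlog_collector_uploaded_total 30
"""

if __name__ == "__main__":
    metrics = parse_metrics(SAMPLE)
    print(metrics["pxc_binlog_collector_success_total"])
```

To scrape a live Pod, pass the endpoint URL to `fetch_metrics`, replacing the `<pitr-pod-service>` placeholder with the actual service name, and feed the result to `parse_metrics`.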
