@@ -15,6 +15,7 @@ It collects key metrics about:
1515- Pending modifications
1616- Logs size
1717- RDS quota usage information
18+ - AWS Performance Insights
1819
1920> [!TIP]
2021> Prometheus RDS exporter is part of the [Database Monitoring Framework](https://github.com/qonto/database-monitoring-framework), which provides alerts, along with their handy runbooks, for AWS RDS.
@@ -25,6 +26,8 @@ It collects key metrics about:
2526
2627🧩 AWS Quotas Insights: Stay in control with real-time information about AWS quotas. Ensure you never hit limits unexpectedly.
2728
29+ 📋 AWS Performance Insights: An extended set of metrics covering database internals (available only for PostgreSQL).
30+
2831💡 Hard Limits visibility: Know the hard limits of the EC2 instance used by RDS and manage your resources effectively.
2932
3033🔔 Alerting at Your Fingertips: Easily set up Prometheus alerting rules to stay informed of critical events, ensuring you're always ahead of issues.
@@ -39,6 +42,8 @@ It collects key metrics about:
3942
4043## Metrics
4144
45+ Basic metrics are available:
46+
4247| Name | Labels | Description |
4348| ---- | ------ | ----------- |
4449| rds_allocated_disk_iops_average | `aws_account_id`, `aws_region`, `dbidentifier` | Allocated disk IOPS |
@@ -87,6 +92,56 @@ It collects key metrics about:
8792| rds_write_throughput_bytes | `aws_account_id`, `aws_region`, `dbidentifier` | Average number of bytes written to disk per second |
8893| up | | Was the last scrape of RDS successful |
8994
95+ <details >
96+ <summary >Performance Insights metrics for RDS PostgreSQL are also available</summary >
97+
98+ | Name | Labels | Description |
99+ | -------------------------------------------------------------------| ---------------------------------------------| -----------------------------------------------------------------------------|
100+ | rds_db_cache_blks_hit | | Number of times disk blocks were found already in the Postgres buffer cache (Blocks per second) |
101+ | rds_db_cache_buffers_alloc | | Total number of new buffers allocated by background writer (Blocks per second) |
102+ | rds_db_checkpoint_buffers_checkpoint | | Number of buffers written during checkpoints (Blocks per second) |
103+ | rds_db_checkpoint_checkpoint_sync_time | | Total amount of time spent syncing files during checkpoints (Milliseconds per checkpoint) |
104+ | rds_db_checkpoint_checkpoint_write_time | | Total amount of time spent writing files during checkpoints (Milliseconds per checkpoint) |
105+ | rds_db_checkpoint_checkpoints_req | | Number of requested checkpoints that have been performed (Checkpoints per minute) |
106+ | rds_db_checkpoint_checkpoints_timed | | Number of scheduled checkpoints that have been performed (Checkpoints per minute) |
107+ | rds_db_checkpoint_maxwritten_clean | | Background writer clean stops due to too many buffers (Bgwriter clean stops per minute) |
108+ | rds_db_concurrency_deadlocks | | Deadlocks (Deadlocks per minute) |
109+ | rds_db_io_blk_read_time | | Time spent reading data file blocks by backends (Milliseconds) |
110+ | rds_db_io_blks_read | | Number of disk blocks read (Blocks per second) |
111+ | rds_db_io_buffers_backend | | Number of buffers written directly by a backend (Blocks per second) |
112+ | rds_db_io_buffers_backend_fsync | | Number of times a backend had to execute its own fsync call (Blocks per second) |
113+ | rds_db_io_buffers_clean | | Number of buffers written by the background writer (Blocks per second) |
114+ | rds_db_sql_tup_deleted | | Number of rows deleted by queries (Tuples per second) |
115+ | rds_db_sql_tup_fetched | | Number of rows fetched by queries (Tuples per second) |
116+ | rds_db_sql_tup_inserted | | Number of rows inserted by queries (Tuples per second) |
117+ | rds_db_sql_tup_returned | | Number of rows returned by queries (Tuples per second) |
118+ | rds_db_sql_tup_updated | | Number of rows updated by queries (Tuples per second) |
119+ | rds_db_temp_temp_bytes | | Total amount of data written to temporary files (Bytes per second) |
120+ | rds_db_temp_temp_files | | Number of temporary files created (Files per minute) |
121+ | rds_db_transactions_blocked_transactions | | Number of blocked transactions (Transactions) |
122+ | rds_db_transactions_max_used_xact_ids | | Number of unvacuumed transactions (Transactions) |
123+ | rds_db_transactions_xact_commit | | Number of committed transactions (Commits per second) |
124+ | rds_db_transactions_xact_rollback | | Number of rolled back transactions (Rollbacks per second) |
125+ | rds_db_transactions_oldest_inactive_logical_replication_slot_xid_age | | Oldest xid age held by inactive logical replication slot (Transactions) |
126+ | rds_db_transactions_oldest_active_logical_replication_slot_xid_age | | Oldest xid age held by active logical replication slot (Transactions) |
127+ | rds_db_transactions_oldest_prepared_transaction_xid_age | | Oldest xid age held by prepared transactions (Transactions) |
128+ | rds_db_transactions_oldest_running_transaction_xid_age | | Oldest xid age held by running transaction (Transactions) |
129+ | rds_db_transactions_oldest_hot_standby_feedback_xid_age | | Oldest xid age held on replica with hot_standby_feedback=on (Transactions) |
130+ | rds_db_user_numbackends | | Number of backends currently connected (Connections) |
131+ | rds_db_user_max_connections | | Maximum number of connections allowed by max_connections (Connections) |
132+ | rds_db_wal_archived_count | | Number of WAL files successfully archived (Files per minute) |
133+ | rds_db_wal_archive_failed_count | | Number of failed attempts to archive WAL files (Files per minute) |
134+ | rds_db_state_active_count | | Number of sessions in active state (Sessions) |
135+ | rds_db_state_idle_count | | Number of sessions in idle state (Sessions) |
136+ | rds_db_state_idle_in_transaction_count | | Number of sessions in idle in transaction state (Sessions) |
137+ | rds_db_state_idle_in_transaction_aborted_count | | Number of sessions in idle in transaction (aborted) state (Sessions) |
138+ | rds_db_state_idle_in_transaction_max_time | | Duration of longest running idle-in-transaction (Seconds) |
139+ | rds_db_checkpoint_checkpoint_sync_latency | | Time spent syncing files during checkpoints (Milliseconds per checkpoint) |
140+ | rds_db_checkpoint_checkpoint_write_latency | | Time spent writing files during checkpoints (Milliseconds per checkpoint) |
141+ | rds_db_transactions_active_transactions | | Number of active transactions (Transactions) |
142+
143+ </details >
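
As an illustration of how the Performance Insights metrics listed above can be combined, the sketch below derives a buffer cache hit ratio from `rds_db_cache_blks_hit` and `rds_db_io_blks_read`, and a connection usage ratio from `rds_db_user_numbackends` and `rds_db_user_max_connections`. The function names and the use of Python are assumptions made for this example; neither is part of the exporter, and the inputs are expected to be values already scraped from it.

```python
from typing import Optional


def cache_hit_ratio(blks_hit: float, blks_read: float) -> Optional[float]:
    """Fraction of block requests served from the PostgreSQL buffer cache.

    blks_hit  -> value of rds_db_cache_blks_hit (blocks per second)
    blks_read -> value of rds_db_io_blks_read (blocks per second)
    Returns None when there was no block activity at all.
    """
    total = blks_hit + blks_read
    if total <= 0:
        return None
    return blks_hit / total


def connection_usage(numbackends: float, max_connections: float) -> Optional[float]:
    """Share of max_connections currently in use.

    numbackends     -> value of rds_db_user_numbackends
    max_connections -> value of rds_db_user_max_connections
    Returns None when max_connections is not a positive number.
    """
    if max_connections <= 0:
        return None
    return numbackends / max_connections


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(cache_hit_ratio(990.0, 10.0))
    print(connection_usage(50.0, 200.0))
```

Both helpers return `None` instead of raising on a zero denominator, so a caller can skip instances that reported no activity.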
144+
90145<details >
91146 <summary >Standard Go and Prometheus metrics are also available</summary >
92147
@@ -200,25 +255,26 @@ Prometheus RDS exporter</br>
200255
201256Configuration can be defined in [prometheus-rds-exporter.yaml](https://github.com/qonto/prometheus-rds-exporter/blob/main/configs/prometheus-rds-exporter/prometheus-rds-exporter.yaml) or through environment variables (format `PROMETHEUS_RDS_EXPORTER_<PARAMETER_NAME>`).
202257
203- | Parameter | Description | Default |
204- | ------------------------ | -------------------------------------------------------------------------------------------------------------------------- | ----------------------- |
205- | aws-assume-role-arn | AWS IAM ARN role to assume to fetch metrics | |
206- | aws-assume-role-session | AWS assume role session name | prometheus-rds-exporter |
258+ | Parameter | Description | Default |
259+ | --- | ----------------------------------------------------------------------------------------------------------------------------| -------------------------|
260+ | aws-assume-role-arn | AWS IAM ARN role to assume to fetch metrics | |
261+ | aws-assume-role-session | AWS assume role session name | prometheus-rds-exporter |
207262| collect-instance-metrics | Collect AWS instances metrics (AWS Cloudwatch API) | true |
208- | collect-instance-tags | Collect AWS RDS tags | true |
209- | collect-instance-types | Collect AWS instance types information (AWS EC2 API) | true |
210- | collect-logs-size | Collect AWS instances logs size (AWS RDS API) | true |
211- | collect-maintenances | Collect AWS instances maintenances (AWS RDS API) | true |
212- | collect-quotas | Collect AWS RDS quotas (AWS quotas API) | true |
213- | collect-usages | Collect AWS RDS usages (AWS Cloudwatch API) | true |
214- | tag-selections | Tags to select database instances with. Refer to [ dedicated section on tag configuration] ( #tag-configuration ) | |
215- | debug | Enable debug mode | |
216- | enable-otel-traces | Enable OpenTelemetry traces. See [ configuration] ( https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/ ) | false |
217- | listen-address | Address to listen on for web interface | :9043 |
218- | log-format | Log format (` text ` or ` json ` ) | json |
219- | metrics-path | Path under which to expose metrics | /metrics |
220- | tls-cert-path | Path to TLS certificate | |
221- | tls-key-path | Path to private key for TLS | |
263+ | collect-instance-tags | Collect AWS RDS tags | true |
264+ | collect-instance-types | Collect AWS instance types information (AWS EC2 API) | true |
265+ | collect-logs-size | Collect AWS instances logs size (AWS RDS API) | true |
266+ | collect-maintenances | Collect AWS instances maintenances (AWS RDS API) | true |
267+ | collect-performance-insights | Collect AWS RDS [Performance Insights metrics](https://aws.amazon.com/rds/performance-insights/) (AWS PI API) | false |
268+ | collect-quotas | Collect AWS RDS quotas (AWS quotas API) | true |
269+ | collect-usages | Collect AWS RDS usages (AWS Cloudwatch API) | true |
270+ | tag-selections | Tags to select database instances with. Refer to [dedicated section on tag configuration](#tag-configuration) | |
271+ | debug | Enable debug mode | |
272+ | enable-otel-traces | Enable OpenTelemetry traces. See [configuration](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/) | false |
273+ | listen-address | Address to listen on for web interface | :9043 |
274+ | log-format | Log format (`text` or `json`) | json |
275+ | metrics-path | Path under which to expose metrics | /metrics |
276+ | tls-cert-path | Path to TLS certificate | |
277+ | tls-key-path | Path to private key for TLS | |
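
To make the `PROMETHEUS_RDS_EXPORTER_<PARAMETER_NAME>` format concrete, here is a minimal sketch that maps a parameter name from the table above to an environment variable name. It assumes that `<PARAMETER_NAME>` is the parameter upper-cased with hyphens replaced by underscores; that rule and the helper `env_var_name` are assumptions for illustration, so check the exporter's own documentation before relying on a specific variable name.

```python
ENV_PREFIX = "PROMETHEUS_RDS_EXPORTER_"


def env_var_name(parameter: str) -> str:
    """Build the environment variable name for a configuration parameter.

    Assumption: hyphens become underscores and the name is upper-cased.
    """
    return ENV_PREFIX + parameter.replace("-", "_").upper()


if __name__ == "__main__":
    # Parameter names taken from the configuration table.
    for name in ("collect-performance-insights", "log-format", "debug"):
        print(f"{name} -> {env_var_name(name)}")
```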
222278
223279Configuration parameters priorities:
224280
@@ -316,11 +372,136 @@ If you are running on [AWS EKS](https://aws.amazon.com/eks/), we strongly recomm
316372         "tag:GetResources"
317373       ],
318374       "Resource": "*"
319- }
375+ }
376+
320377 ]
321378}
322379```
323380
381+ <details >
382+ <summary >Minimal required IAM permissions with Performance Insights</summary >
383+
384+ ```json
385+ {
386+   "Version": "2012-10-17",
387+   "Statement": [
388+     {
389+       "Sid": "AllowInstanceAndLogDescriptions",
390+       "Effect": "Allow",
391+       "Action": [
392+         "rds:DescribeDBInstances",
393+         "rds:DescribeDBLogFiles"
394+       ],
395+       "Resource": [
396+         "arn:aws:rds:*:*:db:*"
397+       ]
398+     },
399+     {
400+       "Sid": "AllowMaintenanceDescriptions",
401+       "Effect": "Allow",
402+       "Action": [
403+         "rds:DescribePendingMaintenanceActions"
404+       ],
405+       "Resource": "*"
406+     },
407+     {
408+       "Sid": "AllowGettingCloudWatchMetrics",
409+       "Effect": "Allow",
410+       "Action": [
411+         "cloudwatch:GetMetricData"
412+       ],
413+       "Resource": "*"
414+     },
415+     {
416+       "Sid": "AllowRDSUsageDescriptions",
417+       "Effect": "Allow",
418+       "Action": [
419+         "rds:DescribeAccountAttributes"
420+       ],
421+       "Resource": "*"
422+     },
423+     {
424+       "Sid": "AllowQuotaDescriptions",
425+       "Effect": "Allow",
426+       "Action": [
427+         "servicequotas:GetServiceQuota"
428+       ],
429+       "Resource": "*"
430+     },
431+     {
432+       "Sid": "AllowInstanceTypeDescriptions",
433+       "Effect": "Allow",
434+       "Action": [
435+         "ec2:DescribeInstanceTypes"
436+       ],
437+       "Resource": "*"
438+     },
439+     {
440+       "Sid": "AllowInstanceFilterByTags",
441+       "Effect": "Allow",
442+       "Action": [
443+         "tag:GetResources"
444+       ],
445+       "Resource": "*"
446+     },
447+     {
448+       "Sid": "AmazonRDSPerformanceInsightsDescribeDimensionKeys",
449+       "Effect": "Allow",
450+       "Action": "pi:DescribeDimensionKeys",
451+       "Resource": "arn:aws:pi:*:*:metrics/rds/*"
452+     },
453+     {
454+       "Sid": "AmazonRDSPerformanceInsightsGetDimensionKeyDetails",
455+       "Effect": "Allow",
456+       "Action": "pi:GetDimensionKeyDetails",
457+       "Resource": "arn:aws:pi:*:*:metrics/rds/*"
458+     },
459+     {
460+       "Sid": "AmazonRDSPerformanceInsightsGetResourceMetadata",
461+       "Effect": "Allow",
462+       "Action": "pi:GetResourceMetadata",
463+       "Resource": "arn:aws:pi:*:*:metrics/rds/*"
464+     },
465+     {
466+       "Sid": "AmazonRDSPerformanceInsightsGetResourceMetrics",
467+       "Effect": "Allow",
468+       "Action": "pi:GetResourceMetrics",
469+       "Resource": "arn:aws:pi:*:*:metrics/rds/*"
470+     },
471+     {
472+       "Sid": "AmazonRDSPerformanceInsightsListAvailableResourceDimensions",
473+       "Effect": "Allow",
474+       "Action": "pi:ListAvailableResourceDimensions",
475+       "Resource": "arn:aws:pi:*:*:metrics/rds/*"
476+     },
477+     {
478+       "Sid": "AmazonRDSPerformanceInsightsListAvailableResourceMetrics",
479+       "Effect": "Allow",
480+       "Action": "pi:ListAvailableResourceMetrics",
481+       "Resource": "arn:aws:pi:*:*:metrics/rds/*"
482+     },
483+     {
484+       "Sid": "AmazonRDSPerformanceInsightsGetPerformanceAnalysisReport",
485+       "Effect": "Allow",
486+       "Action": "pi:GetPerformanceAnalysisReport",
487+       "Resource": "arn:aws:pi:*:*:perf-reports/rds/*/*"
488+     },
489+     {
490+       "Sid": "AmazonRDSPerformanceInsightsListPerformanceAnalysisReports",
491+       "Effect": "Allow",
492+       "Action": "pi:ListPerformanceAnalysisReports",
493+       "Resource": "arn:aws:pi:*:*:perf-reports/rds/*/*"
494+     },
495+     {
496+       "Sid": "AmazonRDSPerformanceInsightsListTagsForResource",
497+       "Effect": "Allow",
498+       "Action": "pi:ListTagsForResource",
499+       "Resource": "arn:aws:pi:*:*:*/rds/*"
500+     }
501+   ]
502+ }
503+ ```
504+
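
When reviewing a policy like the one above, it can help to list the allowed actions per AWS service. The following sketch does this for a policy document that has already been loaded into a Python dictionary. The helper `actions_by_service` is hypothetical and not part of the exporter; it only handles the `Action` element written as a single string or as a list, the two forms used in this policy.

```python
from collections import defaultdict
from typing import Dict, List


def actions_by_service(policy: dict) -> Dict[str, List[str]]:
    """Group the actions of all Allow statements by service prefix."""
    grouped = defaultdict(set)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may carry a single statement object instead of a list.
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            # "Action" may be a single string, as in the pi:* statements.
            actions = [actions]
        for action in actions:
            service = action.partition(":")[0]
            grouped[service].add(action)
    return {service: sorted(names) for service, names in grouped.items()}


if __name__ == "__main__":
    # Shortened sample with two statements from the policy above.
    sample = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["rds:DescribeDBInstances", "rds:DescribeDBLogFiles"],
                "Resource": ["arn:aws:rds:*:*:db:*"],
            },
            {
                "Effect": "Allow",
                "Action": "pi:GetResourceMetrics",
                "Resource": "arn:aws:pi:*:*:metrics/rds/*",
            },
        ],
    }
    for service, names in sorted(actions_by_service(sample).items()):
        print(service, names)
```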
324505For convenience, you can download it using:
325506
326507``` bash